I have unstructured documents from which I need to extract information such as buyer name, seller name, expiry date, and purchase date. I planned to use spaCy with custom named entity recognition (following this blog post: https://medium.com/@manivannan_data/how-to-train-ner-with-custom-training-data-using-spacy-188e0e508c6). However, the model sometimes predicts the buyer name as the seller name and vice versa, and when I pass the whole document content it sometimes wrongly merges several values into a single entity. For reference, the documents are roughly 2 to 20 pages long, so each one contains a lot of text.

Is there another package that would give higher accuracy? If not, how should I train the model to improve its accuracy? Thanks in advance.

Rajesh das

1 Answer

Try cleaning your documents first, then use the flair library. It is a user-friendly library from Zalando Research that lets you perform many NLP tasks quickly, especially NER.
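As a rough illustration of what "cleaning" could mean here, the sketch below normalizes whitespace, rejoins words hyphenated across line breaks, and splits the text into paragraphs so the tagger sees short pieces instead of a whole 2-20 page document. The helper names (`clean_text`, `split_paragraphs`, `tag_paragraphs`) are made up for this example, and the cleaning rules are assumptions about typical extracted text, not a fixed recipe. Note also that flair's stock `"ner"` model only produces generic labels such as person and organization; telling buyer from seller would still need a model trained on your own annotated data.

```python
import re
from typing import List, Tuple


def clean_text(raw: str) -> str:
    """Normalize whitespace and rejoin words hyphenated across line breaks."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # "agree-\nment" -> "agreement"
    text = re.sub(r"[ \t]+", " ", text)           # collapse runs of spaces/tabs
    text = re.sub(r" ?\n ?", "\n", text)          # drop spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)        # keep at most one blank line
    return text.strip()


def split_paragraphs(text: str) -> List[str]:
    """Split cleaned text on blank lines; single newlines become spaces."""
    return [" ".join(p.split()) for p in text.split("\n\n") if p.strip()]


def tag_paragraphs(paragraphs: List[str]) -> List[List[Tuple[str, str]]]:
    """Run flair's pretrained NER tagger on each paragraph (hypothetical helper)."""
    # Imported lazily so the cleaning helpers above work without flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load("ner")  # general-purpose model, downloaded on first use
    results = []
    for paragraph in paragraphs:
        sentence = Sentence(paragraph)
        tagger.predict(sentence)
        results.append(
            [(span.text, span.get_label("ner").value) for span in sentence.get_spans("ner")]
        )
    return results
```

Calling `tag_paragraphs(split_paragraphs(clean_text(raw_document)))` would then give one list of `(text, label)` pairs per paragraph. Working paragraph by paragraph may also help with the merged-entity problem described in the question, although that depends on the documents.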

Stephen Rauch