I understand there are many techniques/libraries/packages to extract named entities like people, places etc. from data.

For me, an entity is something like:

first name: john
surname: smith
dob: 1/1/2000
shoesize: 6
address: ...

etc.

So an entity is a class with fields, to use object-oriented terminology.

One would expect these fields/attributes to occur close together in unstructured data (closeness could be defined by word distance). Are there techniques to extract what I would call complete entities? Of course, this would not be very accurate, but anything would be better than nothing. I did a few Google searches without success.
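To make the idea concrete, here is a minimal sketch of the kind of proximity grouping I have in mind. It is not an established technique or a library API, just regex matching plus a word-distance cutoff. The patterns in `FIELD_PATTERNS`, the `max_gap` threshold, and the names `find_mentions` and `group_entities` are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately naive patterns; real documents would need far more robust ones.
FIELD_PATTERNS = {
    "name": re.compile(r"\b([A-Z][a-z]+ [A-Z][a-z]+)\b"),
    "dob": re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b"),
    "shoesize": re.compile(r"shoe size (\d+)"),
}


@dataclass
class Mention:
    field: str     # which attribute was matched
    value: str     # the matched text
    word_pos: int  # index of the word at which the match starts


def find_mentions(text):
    """Find every field mention and record its word position in the text."""
    mentions = []
    for field_name, pattern in FIELD_PATTERNS.items():
        for match in pattern.finditer(text):
            # Word position = number of whitespace-separated tokens before the match.
            word_pos = len(text[:match.start()].split())
            mentions.append(Mention(field_name, match.group(1), word_pos))
    return sorted(mentions, key=lambda m: m.word_pos)


def group_entities(mentions, max_gap=10):
    """Greedily merge mentions into one entity while they stay within max_gap words."""
    entities = []
    current = {}
    last_pos = None
    for mention in mentions:
        too_far = last_pos is not None and mention.word_pos - last_pos > max_gap
        # Start a new entity if the gap is too large or the field is already filled.
        if current and (too_far or mention.field in current):
            entities.append(current)
            current = {}
        current[mention.field] = mention.value
        last_pos = mention.word_pos
    if current:
        entities.append(current)
    return entities


text = (
    "John Smith, born 1/1/2000, wears shoe size 6. "
    "The quarterly report was sent to the whole team after the long budget meeting. "
    "Later, Jane Doe (dob 2/3/1990) joined."
)
entities = group_entities(find_mentions(text))
for entity in entities:
    print(entity)
```

In this toy text the two people end up as separate entities because their mentions are more than `max_gap` words apart. The hard part I am asking about is doing this reliably without hand-written patterns.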

desertnaut
cs0815
  • By unstructured data you mean general text, right? General text would rarely contain all the attributes you're looking for, at best it would only be possible to capture a few characteristics about a person. If you have a particular kind of text in mind, please give more detail, ideally an example of such text. – Erwan Apr 25 '21 at 00:14
  • Thanks. Yes, I mean text. Please note that an entity could also be a company. The text consists of millions of documents, from emails to Word documents. – cs0815 Apr 25 '21 at 08:59

0 Answers