I have a PDF file that contains information. I would like to extract a few key terms/phrases along with their values, for example (current balance : CHF (swiss francs) 1,000).
I can convert the PDF file to text using pdfminer, but how can I extract the above keyword using TensorFlow text classification or other methods?
Can anyone suggest how I can get started? I haven't come across a single example with TensorFlow. There is a question here (Keyword/phrase extraction from Text using Deep Learning libraries), but it doesn't help me much as I am a beginner.
I used NLTK to tokenize the text and count how often each word occurs:
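import re
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# f is assumed to hold the text extracted from the PDF with pdfminer
# Replace punctuation and symbols with spaces
text = re.sub(r'[_"\-;%()|+&=*.!?:#$@\[\]/]', ' ', f)
tokens = word_tokenize(text)
# Drop English stop words, then count the remaining words
stop_words = set(stopwords.words('english'))
words = [w for w in tokens if w not in stop_words]
freq = nltk.FreqDist(words)
for key, val in freq.items():
    frq = str(key) + ':' + str(val)
    print(frq)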
So I have this text, from which I would like to extract the key phrase current balance : CHF (swiss francs) 1,000 along with its value using a neural network. How can I do that? I came across many posts but couldn't find what I need.
PS: Every time I upload a new file, I want my algorithm to find the same keyword and use the value next to it.
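To make the desired behaviour concrete, here is a minimal rule-based sketch of "find the keyword and take the value next to it". It is not a neural network, and it assumes the label always looks like the example above. The names BALANCE_RE and extract_balance and the pattern itself are my own assumptions for illustration. It would have to run on the raw pdfminer text, before the punctuation is stripped as in the NLTK snippet.

```python
import re

# Hypothetical pattern (an assumption, not a general solution):
# the label "current balance", an optional colon, a three-letter currency
# code, an optional parenthesised currency name, then the amount.
BALANCE_RE = re.compile(
    r"current\s+balance\s*:?\s*"
    r"(?P<currency>[A-Z]{3})\s*"
    r"(?:\([^)]*\)\s*)?"
    r"(?P<amount>\d+(?:,\d{3})*(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_balance(text):
    """Return (currency, amount) for the first match, or None if absent."""
    match = BALANCE_RE.search(text)
    if match is None:
        return None
    # Remove thousands separators before converting to a number.
    amount = float(match.group("amount").replace(",", ""))
    return match.group("currency").upper(), amount


sample = "Account summary\ncurrent balance : CHF (swiss francs) 1,000\nThank you"
result = extract_balance(sample)
print(result)
```

A fixed pattern like this only works while every uploaded file uses the same wording and layout, which is why I am asking about a learned approach.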