Highest Voted 'nltk' Questions - Data Science Stack Exchange

33 votes, 5 answers

How can I get a measure of the semantic similarity of words?

What is the best way to figure out the semantic similarity of words? Word2Vec is okay, but not ideal: # Using the 840B word Common Crawl GloVe vectors with gensim: # 'hot' is closer to 'cold' than 'warm' In [7]: model.similarity('hot',…

asked Jul 19 '16 at 21:54

Thomas Johnson

23 votes, 6 answers

Similarity between two words

I'm looking for a Python library that helps me identify the similarity between two words or sentences. I will be doing audio-to-text conversion, which will result in English dictionary or non-dictionary word(s) (this could be a person or company…

nlp nltk

asked Jul 04 '16 at 06:00

gogasca

9 votes, 6 answers

NLP: What are some popular packages for multi-word tokenization?

I intend to tokenize a number of job description texts. I have tried standard tokenization using whitespace as the delimiter. However, I noticed that some multi-word expressions are split on whitespace, which may well cause…

nlp nltk tokenization

asked Mar 02 '17 at 07:04

CyberPlayerOne

8 votes, 1 answer

Complex Chunking with NLTK

I am trying to figure out how to use NLTK's cascading chunker as per Chapter 7 of the NLTK book. Unfortunately, I'm running into a few issues when performing non-trivial chunking measures. Let's start with this phrase: "adventure movies between 2000…

python nlp nltk

asked May 16 '15 at 00:15

grill

8 votes, 2 answers

Is there an alternative to nltk in golang?

Golang is one of my favourite languages and I want to use it for a personal NLP/ML project. Is golang's ecosystem good and rich enough for this? Is there an alternative package for nltk in golang?

nlp nltk software-recommendation

asked Jun 03 '16 at 16:38

Dariush

7 votes, 2 answers

Combining Machine Learning classifier with NLTK Vader for Sentiment Analysis

As a part of my university project, I am researching/developing a sentiment analysis model wherein I am trying to combine NLTK Vader (SentimentIntensityAnalyzer) results with a Machine Learning trained classifier for prediction of Sentiments on…

machine-learning neural-network scikit-learn sentiment-analysis nltk

asked Aug 15 '17 at 12:37

Chetan

6 votes, 1 answer

Is there an NLP corpus that contains common medical terms?

I am trying to use the NLTK library to extract keywords denoting medical symptoms from medical reports of patients. For example, I have a medical report as follows: s:a 33 year old female crystallographer presents with mild spells of vertigo, mild…

python nlp nltk

asked Mar 01 '21 at 12:43

user112647

6 votes, 3 answers

Training NLP with multiple text input features

Question: How can I train an NLP model with discrete labels that is based on multiple text input features? Background: I'm trying to predict the difficulty of a 4-option multiple choice exam question (probability of a test-taker selecting the correct…

machine-learning nlp nltk

asked Feb 28 '19 at 19:38

Carl Molnar

6 votes, 1 answer

How to extract Question/s from document with NLTK?

How can I extract only the questions from a document with NLTK? Can we categorise each question as yes/no or detail-type answerable? Note: I am one week old in NLTK ;-)

python nltk

asked Jan 09 '18 at 06:45

Saurabh Chandra Patel

5 votes, 3 answers

Chunking Sentences with Spacy

I have a lot of sentences (500k) which look like this: "Penalty missed! Bad penalty by Felipe Brisola - Riga FC - shot with right foot is very close to the goal. Felipe Brisola should be disappointed." "Penalty saved! Damir Kojasevic - Sutjeska…

machine-learning nlp nltk spacy

asked Nov 30 '19 at 15:53

senty

5 votes, 3 answers

Machine learning or NLP approach to convert string about month, year into dates

I'm currently developing a program capable of converting the human style of representing a year into actual dates. Example: "last year last month" into "December 2018". The string may be a complete sentence like: what were you doing 5…

machine-learning python nlp nltk regex

asked Feb 20 '19 at 06:30

Bipul

5 votes, 1 answer

Accuracy of word and sent tokenize versus custom tokenizers in nltk

The Natural Language Processing with Python book is a really good resource for understanding the basics of NLP. One of the chapters introduces training 'sentence segmentation' using a Naive Bayes Classifier and provides a method to perform sentence…

python nlp nltk tokenization

asked Dec 30 '17 at 11:22

MrKickass

5 votes, 1 answer

Inferring Relational Hierarchies of Words

I am new to natural language processing and I have not heard of a problem similar to mine yet. I was wondering if anyone could refer me to a method for solving my problem, or tell me how this problem is referred to in the academic literature, so…

nlp unsupervised-learning nltk

asked Feb 01 '16 at 16:05

Pholochtairze

4 votes, 3 answers

TFIDF for very short sentences

I'm trying to build a regression model in which one of the features contains text data. I was thinking of using scikit-learn's sklearn.feature_extraction.text.TfidfVectorizer. The issue, however, is that the actual strings contain very few words.…

machine-learning nltk tfidf

asked Sep 06 '19 at 08:29

yatu

4 votes, 3 answers

Is there a good German Stemmer?

What I tried: # -*- coding: utf-8 -*- from nltk.stem.snowball import GermanStemmer st = GermanStemmer() token_groups = [(["experte", "Experte", "Experten", "Expertin", "Expertinnen"], []), (["geh", "gehe", "gehst", "geht", "gehen",…

nlp nltk stemming

asked Aug 08 '19 at 06:31

Martin Thoma
