Highest Voted Questions - Data Science Stack Exchange

8 votes, 1 answer

Keras Early Stopping: Monitor 'loss' or 'val_loss'?

I often use "early stopping" when I train neural nets, e.g. in Keras: from keras.callbacks import EarlyStopping # Define early stopping as callback early_stopping = EarlyStopping(monitor='loss', patience=5, mode='auto',…

neural-network keras early-stopping

asked May 07 '20 at 11:37

Peter

7,277
5
18
47
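
To make the choice in the question above concrete, here is a minimal, framework-free sketch of patience-based early stopping. It is not Keras code: `stopping_epoch` and both loss histories are hypothetical, chosen only to show that monitoring the training loss and monitoring the validation loss can halt training at different points.

```python
def stopping_epoch(history, patience):
    """Return the 0-based epoch at which patience-based early stopping halts.

    Training halts once the monitored value has failed to decrease for
    `patience` consecutive epochs. Returns None if that never happens.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, value in enumerate(history):
        if value < best:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical per-epoch losses: the training loss keeps falling, while the
# validation loss bottoms out at epoch 3 and then rises (overfitting).
train_loss = [1.00, 0.80, 0.65, 0.55, 0.48, 0.42, 0.37, 0.33]
val_loss = [1.05, 0.90, 0.82, 0.80, 0.83, 0.86, 0.90, 0.95]

stop_on_loss = stopping_epoch(train_loss, patience=2)
stop_on_val_loss = stopping_epoch(val_loss, patience=2)
```

With these made-up numbers the training-loss monitor never triggers, while the validation-loss monitor stops shortly after the validation loss starts to rise.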

8 votes, 2 answers

XGBoost and Random Forest: ntrees vs. number of boosting rounds vs. n_estimators

So I understand the main difference between Random Forests and GB Methods. Random Forests grow parallel trees and GB Methods grow one tree for each iteration. However, I am confused about the vocabulary used with scikit's RF regressor and xgboost's…

python random-forest decision-trees xgboost hyperparameter

asked Apr 22 '20 at 15:06

Jack Armstrong

233
2
6

8 votes, 1 answer

Encoding with OrdinalEncoder : how to give levels as user input?

I am trying to do ordinal encoding using: from sklearn.preprocessing import OrdinalEncoder I will try to explain my problem with a simple dataset. X = pd.DataFrame({'animals':['low','med','low','high','low','high']}) enc =…

machine-learning scikit-learn data-cleaning preprocessing encoding

asked Apr 15 '20 at 00:25

Ayush Ranjan

401
1
4
13
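
The idea of supplying the level order yourself can be sketched in plain Python, as below. The order low < med < high is an assumption about what the asker intends, and `levels`, `level_to_code`, and `encoded` are names made up for this sketch.

```python
# Level order supplied by the user, lowest to highest (assumed for this sketch).
levels = ["low", "med", "high"]
level_to_code = {level: code for code, level in enumerate(levels)}

# The 'animals' column from the question's example, as a plain list.
animals = ["low", "med", "low", "high", "low", "high"]

# Map every value to the integer position of its level in the chosen order.
encoded = [level_to_code[value] for value in animals]
```

In scikit-learn itself, `OrdinalEncoder` takes a `categories` argument that accepts one list of levels per column, so the same order could be passed as `OrdinalEncoder(categories=[levels])`.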

8 votes, 2 answers

Joining tables from different locations in BigQuery

I have been trying to join two tables from different datasets that are in different locations but in the same project. However, I keep getting the error: dataset not found in US location. The datasets' locations are US and us-east1. Here is what I…

google-cloud

asked Apr 10 '20 at 18:43

shivanshu dhawan

178
1
9

8 votes, 2 answers

Optimising for Brier objective function directly gives worse Brier score than optimising with custom objective - what does it tell me?

I am training an XGBoost model and, as I care most about the resulting probabilities rather than the classification itself, I have chosen the Brier score as the metric for my model, so that the probabilities will be well calibrated. I tuned my hyperparameters using…

xgboost machine-learning-model optimization objective-function

asked Apr 06 '20 at 07:27

Xaume

182
2
11
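
For reference, the Brier score mentioned in the question above is the mean squared difference between predicted probabilities and the 0/1 outcomes. A minimal sketch for the binary case follows; the function name and the example predictions are made up for illustration and are not taken from the question.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes and two sets of predicted probabilities.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # close to the true outcomes
hedged = [0.5, 0.5, 0.5, 0.5]     # always predicts 50%

confident_score = brier_score(y_true, confident)
hedged_score = brier_score(y_true, hedged)
```

Lower is better: predictions close to the true outcomes score lower than a constant 0.5.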

8 votes, 3 answers

Difference between Ridge and Linear Regression

From what I have understood, Ridge Regression simply adds a regularization term (the L2 norm, in the case of Ridge) to the loss function of the optimization problem. However, I am not sure if the loss function can be described…

regression linear-regression

asked Mar 13 '20 at 19:09

Panathinaikos

187
1
1
7
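
The two loss functions the question above contrasts can be written side by side. The notation is an assumption, since the excerpt defines none: $x_i$ are feature vectors, $y_i$ targets, $\beta$ the coefficients, and $\lambda \ge 0$ the regularization strength.

```latex
\begin{aligned}
L_{\text{OLS}}(\beta)   &= \sum_{i=1}^{n} \left( y_i - x_i^\top \beta \right)^2 \\
L_{\text{Ridge}}(\beta) &= \sum_{i=1}^{n} \left( y_i - x_i^\top \beta \right)^2
                           + \lambda \lVert \beta \rVert_2^2
\end{aligned}
```

Setting $\lambda = 0$ recovers the ordinary least-squares loss, so in this notation the only difference is the added penalty on the squared L2 norm of the coefficients.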

8 votes, 2 answers

What should be the labels for subword tokens in BERT for NER task?

For any NER task, we need a sequence of words and their corresponding labels. To extract features for these words from BERT, they need to be tokenized into subwords. For example, the word 'infrequent' (with label B-count) will be tokenized into…

python named-entity-recognition bert

asked Mar 13 '20 at 13:32

PinkBanter

374
3
15
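
The alignment problem in the question above can be sketched as follows. This shows one common convention (the word's label goes on its first subword, a placeholder on the remaining pieces), not necessarily what the answers recommend. The tokenizer is a toy stand-in, and the words, splits, `B-drug` label, and `X` placeholder are all hypothetical.

```python
IGNORE_LABEL = "X"  # hypothetical placeholder for continuation subwords

# Toy stand-in for a real subword tokenizer; the split below is made up.
TOY_SPLITS = {"paracetamol": ["para", "##ce", "##tam", "##ol"]}


def toy_tokenize(word):
    """Return the subword pieces for a word (the word itself if unsplit)."""
    return TOY_SPLITS.get(word, [word])


def align_labels(words, labels, tokenize):
    """Expand word-level labels to subword level.

    The first piece of each word keeps the word's label; every further
    piece receives IGNORE_LABEL.
    """
    tokens, token_labels = [], []
    for word, label in zip(words, labels):
        pieces = tokenize(word)
        tokens.extend(pieces)
        token_labels.append(label)
        token_labels.extend([IGNORE_LABEL] * (len(pieces) - 1))
    return tokens, token_labels


tokens, token_labels = align_labels(
    ["takes", "paracetamol", "daily"],
    ["O", "B-drug", "O"],
    toy_tokenize,
)
```

An alternative convention repeats the word's label on every piece; which one works better depends on how the loss and evaluation treat the continuation tokens.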

8 votes, 2 answers

Why does the vanilla transformer have fixed-length input?

I know that in the math on which the transformer is based there is no restriction on the length of input. But I still can’t understand why we should fix it in the frameworks (PyTorch). Because of this problem Transformer-XL has been created. Can you…

nlp transformer

asked Mar 08 '20 at 16:28

Ann

133
7

8 votes, 1 answer

Which of the NIPS 2014 papers are most significant, and why?

As a newcomer to the field, I find many of the NIPS 2014 papers fascinating, but it is difficult for me to evaluate which ones represent real progress over current approaches. Which papers do you think are most significant and are likely to have a…

machine-learning research state-of-the-art

asked Aug 21 '15 at 18:10

Michael R. Bernstein

189
2

8 votes, 2 answers

What are some standard ways of computing the distance between individual search queries?

I asked a similar question about distance between "documents" (Wikipedia articles, news stories, etc.). I made this a separate question because search queries are considerably smaller and noisier than documents. I hence…

machine-learning nlp search

asked Jul 05 '14 at 16:20

Matt

811
1
7
12

8 votes, 1 answer

Why does gradient boosting use sampling without replacement?

In Random Forest, each tree is built by selecting a sample with replacement (bootstrap). And I assumed that Gradient Boosting's trees were selected with the same sampling technique. (@BenReiniger corrected me). Here are the sampling techniques…

machine-learning random-forest decision-trees xgboost sampling

asked Feb 07 '20 at 06:59

Carlos Mougan

6,011
2
15
45
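
The two sampling schemes contrasted in the question above can be sketched with the standard library. The row indices, the seed, and the 50% subsample fraction are arbitrary choices for illustration, not values from the question.

```python
import random

rng = random.Random(0)   # fixed seed so the sketch is repeatable
rows = list(range(10))   # stand-in for the row indices of a training set

# With replacement (bootstrap): the same row may be drawn more than once,
# and the sample is as large as the original data.
bootstrap = rng.choices(rows, k=len(rows))

# Without replacement (subsample): every drawn row is distinct, so the
# sample must be no larger than the data; a 50% fraction is assumed here.
subsample = rng.sample(rows, k=len(rows) // 2)
```

Sampling without replacement at 100% would simply return the full data set, which is why a fraction below 1 is used when subsampling this way.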

8 votes, 2 answers

How can word2vec handle unseen / new words to bypass this for new classifications?

In simple terms, if my classification is based on word2vec features, what am I supposed to do if a new word comes in that does not have a word2vec vector? I am trying to use word2vec or word vectors for classification based on entity. For example: I…

machine-learning nlp deep-learning word-embeddings

asked Aug 11 '15 at 05:04

Sarath

81
1
2

8 votes, 2 answers

Can I use LSTM models to evaluate multiple, independent time series?

Let's say that I would like to predict the temperature tomorrow. I could use the approach whereby I train a model based on a time-series dataset collected from a single location (for example, see this excellent…

machine-learning keras r lstm

asked Jan 28 '20 at 21:26

CharismaticChromoFauna

81
1
5

8 votes, 2 answers

Linearly increasing data with manual reset

I have a linearly increasing time series dataset from a sensor, with values ranging between 50 and 150. I've implemented a simple linear regression algorithm to fit a regression line to this data, and I'm predicting the date when the series would reach…

machine-learning statistics time-series

asked Jul 04 '14 at 05:12

ArunDhaJ

183
6
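
The approach described in the question above (fit a line, then solve for the time at which it reaches a target value) can be sketched as below. The readings, the day-based time axis, and the target of 150 are made-up values; the excerpt does not state the actual threshold.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def time_to_reach(target, slope, intercept):
    """Solve slope * x + intercept = target for x."""
    return (target - intercept) / slope


# Hypothetical sensor readings, one per day, rising by 2 units a day.
days = [0, 1, 2, 3, 4]
readings = [50.0, 52.0, 54.0, 56.0, 58.0]

slope, intercept = fit_line(days, readings)
day_at_target = time_to_reach(150.0, slope, intercept)
```

The sketch assumes a positive slope and ignores the manual resets mentioned in the title; a fit across a reset would mix two separate ramps.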

8 votes, 5 answers

How can we extract fields from images?

I am making a document parser that extracts data fields from documents and stores them in a structured way. Each field in my dataset is horizontal, which is easy to extract. But the model fails on the following type of example - Is there any way…

machine-learning python deep-learning keras object-detection

asked Jan 16 '20 at 12:35

hR 312

81
1
8
