Highest Voted Questions - Data Science Stack Exchange

8

votes

2 answers

How should I use BERT embeddings for clustering (as opposed to fine-tuning BERT model for a supervised task)

First of all, I want to say that I am asking this question because I am interested in using BERT embeddings as document features to do clustering. I am using Transformers from the Hugging Face library. I was thinking of averaging all of the Word…

machine-learning deep-learning nlp word-embeddings bert

asked Aug 21 '20 at 02:00

fractalnature

805
6
19
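
The BERT clustering excerpt above is cut off, so the following is only a minimal sketch of one common reading of it: averaging per-token vectors into a single document vector while skipping padding positions. The toy vectors stand in for model outputs, and `mean_pool`, `tokens`, and `mask` are hypothetical names, not part of the Transformers library.

```python
def mean_pool(token_vectors, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy stand-in for one document's token embeddings: 3 real tokens + 1 padding.
# Real BERT vectors would have hundreds of dimensions; 2 are used for brevity.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
mask = [1, 1, 1, 0]

doc_vector = mean_pool(tokens, mask)  # one fixed-size feature vector per document
```

The resulting fixed-size vectors could then be passed to any clustering routine; whether averaging is the best pooling choice is exactly what the question asks.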

8

votes

4 answers

Understanding how convolutional layers work

After working with a CNN using Keras and the MNIST dataset for the well-known handwritten digit recognition problem, I came up with some questions about how convolutional layers work. I can understand what the convolution process is. My first…

cnn training convolution backpropagation

asked Aug 18 '20 at 11:48

Karampistis Dimitrios

93
4

8

votes

4 answers

Does reinforcement learning require the help of other learning algorithms?

Can't reinforcement learning be used without the help of other learning algorithms like SVM and MLP backpropagation? I consulted two papers, Paper 1 and Paper 2; both used other machine learning methods in the inner loop.

machine-learning reinforcement-learning algorithms

asked Sep 07 '15 at 08:29

girl101

1,161
2
11
25

8

votes

3 answers

Are there any machine learning techniques to identify points on plots/images?

I have data for each vehicle's lateral position over time and lane number as shown in these 3 plots in the image and sample data below.

    > a
      Frame.ID   xcoord Lane
    1      452 27.39400    3
    2      453 27.38331    3
    3      454 27.42999    3
    4 …

machine-learning r

asked Sep 06 '15 at 02:04

umair durrani

344
2
8

8

votes

2 answers

Can a linear regression model without polynomial features overfit?

I've read in some articles on the internet that linear regression can overfit. However, is that possible when we are not using polynomial features? We are just plotting a line through the data points when we have one feature or a plane when we have…

linear-regression overfitting

asked Aug 08 '20 at 20:21

Tim von Känel

361
1
10

8

votes

4 answers

Job title similarity

I'm trying to define a metric between job titles in the IT field. For this I need some metric between words of job titles that do not appear together in the same job title, e.g. a metric between the words senior, primary, lead, head, vp, director,…

machine-learning dataset

asked Jul 21 '14 at 09:00

Mher

181
5

8

votes

1 answer

Anybody know what this type of visualisation is called?

I think this is a pretty cool way to visualise changes in values but I can’t find any name for this type of visualisation. Source: https://www.economist.com/graphic-detail/2020/07/28/americans-are-getting-more-nervous-about-what-they-say-in-public

visualization

asked Jul 29 '20 at 17:53

K G

183
3

8

votes

2 answers

image_dataset_from_directory VS flow_from_directory

What is the main difference between flow_from_directory and image_dataset_from_directory in Keras? Which one should I use?

machine-learning deep-learning keras tensorflow data-science-model

asked Jul 28 '20 at 07:38

Bala venkatesh

361
3
10

8

votes

1 answer

Is it possible to have a stratified train-test split of a set based on two columns?

Consider a dataframe that contains two columns, text and label. I can very easily create a stratified train-test split using sklearn.model_selection.train_test_split. The only thing I have to do is to set the column I want to use for the…

python scikit-learn dataset pandas

asked Jul 23 '20 at 13:09

Aventinus

203
1
3
7
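
One way to approach the two-column question above is to treat each combination of the two column values as a single stratum and split within each stratum. This is a hedged, standard-library sketch and not the asker's or scikit-learn's implementation; `stratified_split` and the `source` column are hypothetical, since the excerpt does not say which second column is meant.

```python
import random
from collections import defaultdict


def stratified_split(rows, key_columns, test_fraction=0.25, seed=0):
    """Split rows so every combination of key_columns is represented in
    train and test in roughly the same proportion."""
    groups = defaultdict(list)
    for row in rows:
        # The stratum is the tuple of values from all key columns.
        groups[tuple(row[c] for c in key_columns)].append(row)

    rng = random.Random(seed)
    train, test = [], []
    for members in groups.values():
        members = members[:]  # copy so the caller's list is not reordered
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Hypothetical data: "source" is an invented second column for illustration.
rows = [
    {"text": f"doc {i}", "label": label, "source": source}
    for label in ("pos", "neg")
    for source in ("web", "print")
    for i in range(8)
]

train, test = stratified_split(rows, ["label", "source"], test_fraction=0.25)
```

Strata with very few rows may end up with no test rows at all, so sparse combinations need separate handling.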

8

votes

2 answers

Finding optimal threshold in multi-class classification task

In a binary classification problem, it is easy to find the optimal threshold (F1) by setting different thresholds, evaluating them and picking the one with the highest F1. Similarly, is there a proper way to find optimal thresholds for all the…

classification

asked Jul 06 '20 at 21:01

saiRegrefree

81
1
2
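
The binary procedure described in the threshold question above (try several thresholds, evaluate each, keep the one with the highest F1) can be sketched as follows. This covers only the binary case from the excerpt, not the multi-class extension being asked about; the labels, scores, and function names are made up for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 with no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(y_true, scores, candidates):
    """Return (threshold, f1) for the candidate with the highest F1.
    Ties keep the first (lowest-index) candidate."""
    best_t, best_f1 = None, -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Made-up validation labels and predicted probabilities.
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.55]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

threshold, f1 = best_threshold(y_true, scores, candidates)
```

The threshold should be chosen on held-out validation data rather than the test set, otherwise the reported F1 is optimistic.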

8

votes

3 answers

BERT Transformer: Why does the BERT transformer use the [CLS] token for classification instead of an average over all tokens?

I am doing experiments on the BERT architecture and found out that most fine-tuning tasks take the final hidden layer as the text representation and later pass it to other models for further downstream tasks. BERT's last layer looks like this…

machine-learning deep-learning tensorflow bert transformer

asked Jul 02 '20 at 21:25

Aaditya ura

415
5
16

8

votes

2 answers

Do I need validation data if my train and test accuracy/loss is consistent?

I am trying to understand the purpose of a 3rd split in the form of a validation dataset. I am not necessarily talking about cross-validation here. In the scenario below, it would appear that the model is overfit to the training dataset. Train…

machine-learning neural-network deep-learning model-evaluations

asked Jun 16 '20 at 01:18

Kermit

519
5
16

8

votes

2 answers

Is overfitting okay if test accuracy is high enough?

I am trying to build a binary classifier. I have tried deep neural networks with various different structures and parameters and I was not able to get anything better than a train set accuracy of 0.70102 and a test set accuracy of 0.70001. Then I tried…

scikit-learn random-forest overfitting

asked May 23 '20 at 04:54

skrrrt

304
2
13

8

votes

1 answer

Which ML approach to choose for the game AI when rewards are delayed?

Question: Which Machine Learning approach should I choose for the AI of my computer game, where the actions of the AI do not lead to immediate rewards, but delayed rewards instead? About me: I am a complete beginner in the area of machine learning.…

machine-learning random-forest decision-trees reinforcement-learning game

asked May 17 '20 at 11:43

Logende

61
4

8

votes

2 answers

Why is leaky ReLU not so common in real practice?

Since leaky ReLU does not set any value to 0, training always continues, and I can't think of any disadvantages it has. Yet leaky ReLU is less popular than ReLU in real practice. Can someone tell me why?

machine-learning neural-network deep-learning activation-function

asked May 14 '20 at 02:30

Prashant Gupta

181
1
3
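
To make the comparison in the leaky ReLU question above concrete, here is a minimal sketch of both activations and their gradients for scalar inputs. The slope value 0.01 is a common default but is an assumption here, as the question does not specify one.

```python
def relu(x):
    """Standard ReLU: negative inputs are clamped to 0."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope instead of clamped."""
    return x if x > 0 else negative_slope * x


def relu_grad(x):
    """ReLU gradient: exactly 0 for non-positive inputs, so no update flows back."""
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x, negative_slope=0.01):
    """Leaky ReLU gradient: small but nonzero for non-positive inputs."""
    return 1.0 if x > 0 else negative_slope
```

For a negative input such as -2.0, `relu_grad` returns 0.0 while `leaky_relu_grad` returns the slope, which is the "training always continues" property the question refers to.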
