Most Popular

1500 questions
8
votes
2 answers

How should I use BERT embeddings for clustering (as opposed to fine-tuning BERT model for a supervised task)

First of all, I want to say that I am asking this question because I am interested in using BERT embeddings as document features to do clustering. I am using Transformers from the Hugging Face library. I was thinking of averaging all of the Word…
8
votes
4 answers

Understanding how convolutional layers work

After working with a CNN using Keras and the MNIST dataset for the well-known handwritten digit recognition problem, I came up with some questions about how convolutional layers work. I can understand what the convolution process is. My first…
8
votes
4 answers

Does reinforcement learning require the help of other learning algorithms?

Can't reinforcement learning be used without the help of other learning algorithms like SVM and MLP backpropagation? I consulted two papers, Paper 1 and Paper 2, and both used other machine learning methods in the inner loop.
girl101
8
votes
3 answers

Are there any machine learning techniques to identify points on plots/images?

I have data for each vehicle's lateral position over time and lane number as shown in these 3 plots in the image and sample data below. > a Frame.ID xcoord Lane 1 452 27.39400 3 2 453 27.38331 3 3 454 27.42999 3 4 …
umair durrani
8
votes
2 answers

Can a linear regression model without polynomial features overfit?

I've read in some articles on the internet that linear regression can overfit. However, is that possible when we are not using polynomial features? We are just plotting a line through the data points when we have one feature or a plane when we have…
Tim von Känel
8
votes
4 answers

Job title similarity

I'm trying to define a metric between job titles in the IT field. For this I need some metric between words that do not appear together in the same job title, e.g. a metric between the words senior, primary, lead, head, vp, director,…
Mher
8
votes
1 answer

Anybody know what this type of visualisation is called?

I think this is a pretty cool way to visualise changes in values but I can't find any name for this type of visualisation. Source: https://www.economist.com/graphic-detail/2020/07/28/americans-are-getting-more-nervous-about-what-they-say-in-public
K G
8
votes
2 answers

image_dataset_from_directory vs flow_from_directory

What is the main difference between flow_from_directory and image_dataset_from_directory in Keras? Which one should I use?
8
votes
1 answer

Is it possible to have stratified train-test split of a set based on two columns?

Consider a dataframe that contains two columns, text and label. I can very easily create a stratified train-test split using sklearn.model_selection.train_test_split. The only thing I have to do is to set the column I want to use for the…
Aventinus
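
One common way to stratify on two columns at once is to treat each combination of their values as a single stratum. The sketch below is only an illustration in plain Python, not the sklearn call from the question; the function name `stratified_split` and the column names `label` and `source` are hypothetical, since the excerpt is truncated before it names the second column.

```python
import random
from collections import defaultdict

def stratified_split(rows, keys, test_fraction=0.25, seed=0):
    """Split rows so every combination of `keys` keeps roughly the same test share."""
    # Group rows by the tuple of values in the stratification columns.
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)

    rng = random.Random(seed)
    train, test = [], []
    for key in sorted(groups):
        members = groups[key][:]
        rng.shuffle(members)
        # Send a fixed fraction of each group to the test set.
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Hypothetical data: two labels, two sources, four rows per combination.
rows = [
    {"text": f"doc-{label}-{source}-{i}", "label": label, "source": source}
    for label in ("a", "b")
    for source in ("x", "y")
    for i in range(4)
]
train, test = stratified_split(rows, keys=("label", "source"), test_fraction=0.25)
```

With very small groups the rounding can send zero rows to the test set, so rare combinations may need to be merged or handled separately.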
8
votes
2 answers

Finding optimal threshold in multi-class classification task

In a binary classification problem, it is easy to find the optimal threshold (F1) by setting different thresholds, evaluating them and picking the one with the highest F1. Similarly, is there a proper way to find optimal thresholds for all the…
saiRegrefree
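
The binary procedure described in the question (try several thresholds, score each, keep the best F1) can be sketched as follows. This is a minimal illustration in plain Python covering only the binary case; the function names, the toy data and the threshold grid are assumptions, not taken from the question.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_f1_threshold(y_true, scores, thresholds):
    """Return the (threshold, F1) pair with the highest F1; ties keep the first."""
    best_t, best_f1 = None, -1.0
    for t in thresholds:
        # Predict the positive class when the score reaches the threshold.
        y_pred = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, y_pred)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Toy data (made up for illustration).
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.6]
thresholds = [i / 10 for i in range(1, 10)]

best_t, best_f1 = best_f1_threshold(y_true, scores, thresholds)
```

The threshold should be chosen on held-out validation data rather than on the test set, otherwise the reported F1 is optimistic.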
8
votes
3 answers

BERT: Why does BERT use the [CLS] token for classification instead of an average over all tokens?

I am doing experiments on the BERT architecture and found out that most fine-tuning tasks take the final hidden layer as the text representation and then pass it to other models for the downstream task. BERT's last layer looks like this…
8
votes
2 answers

Do I need validation data if my train and test accuracy/loss is consistent?

I am trying to understand the purpose of a 3rd split in the form of a validation dataset. I am not necessarily talking about cross-validation here. In the scenario below, it would appear that the model is overfit to the training dataset. Train…
8
votes
2 answers

Is overfitting okay if test accuracy is high enough?

I am trying to build a binary classifier. I have tried deep neural networks with various different structures and parameters and I was not able to get anything better than Train set accuracy : 0.70102 Test set accuracy : 0.70001 Then I tried…
skrrrt
8
votes
1 answer

Which ML approach to choose for the game AI when rewards are delayed?

Question: Which Machine Learning approach should I choose for the AI of my computer game, where the actions of the AI do not lead to immediate rewards, but delayed rewards instead? About me: I am a complete beginner in the area of machine learning.…
8
votes
2 answers

Why is leaky ReLU not so common in real practice?

Leaky ReLU does not set any negative input to 0, so training always continues, and I can't think of any disadvantages it has. Yet leaky ReLU is less popular than ReLU in real practice. Can someone tell me why?
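
For reference, the two activations being compared can be sketched as below, assuming the usual definitions. The slope `alpha = 0.01` is an assumed default and is not given in the question.

```python
def relu(x):
    """Standard ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)

def relu_grad(x):
    """Gradient of ReLU: zero for negative inputs, so those units stop updating."""
    return 1.0 if x > 0 else 0.0

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope instead of zeroed."""
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.01):
    """Gradient of leaky ReLU: small but nonzero for negative inputs."""
    return 1.0 if x > 0 else alpha

for x in (-2.0, 0.5):
    print(x, relu(x), leaky_relu(x), relu_grad(x), leaky_relu_grad(x))
```

For a negative input, ReLU returns 0 with a zero gradient, while leaky ReLU returns a small negative value with gradient `alpha`, which is the behaviour the question refers to.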