Most Popular

1500 questions
48 votes · 5 answers

Opening a 20GB file for analysis with pandas

I am trying to open a file with pandas and Python for machine learning purposes, and ideally I would have all of the data in a single DataFrame. The file is 18 GB and I have 32 GB of RAM, but I keep getting memory errors. From your experience…
Hari Prasad
47 votes · 3 answers

What does "logits" mean in machine learning?

"One common mistake that I would make is adding a non-linearity to my logits output." What does the term "logit" mean here, or what does it represent?
Rajat
47 votes · 2 answers

What exactly is bootstrapping in reinforcement learning?

Apparently, in reinforcement learning, the temporal-difference (TD) method is a bootstrapping method, whereas Monte Carlo methods are not. What exactly is bootstrapping in RL? What is a bootstrapping method in RL?
user10640
47 votes · 3 answers

What is ground truth?

In the context of Machine Learning, I have seen the term Ground Truth used a lot. I have searched a lot and found the following definition in Wikipedia: In machine learning, the term "ground truth" refers to the accuracy of the training set's…
Green Falcon
46 votes · 6 answers

Calculating KL Divergence in Python

I am rather new to this and can't say I have a complete understanding of the theoretical concepts behind this. I am trying to calculate the KL Divergence between several lists of points in Python. I am using this to try and do this. The problem that…
Nanda
46 votes · 12 answers

Data Science in C (or C++)

I'm an R language programmer. I'm also in the group of people who are considered Data Scientists but who come from academic disciplines other than CS. This works out well in my role as a Data Scientist; however, by starting my career in R and only…
Hack-R
46 votes · 9 answers

How much of data wrangling is a data scientist's job?

I'm currently working as a data scientist at a large company (my first job as a DS, so this question may be a result of my lack of experience). They have a huge backlog of really important data science projects that would have a great positive…
Victor Valente
46 votes · 2 answers

How does the validation_split parameter of Keras' fit function work?

The validation_split parameter of the Keras Sequential model's fit function is documented as follows on https://keras.io/models/sequential/ : validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set…
rnso
46 votes · 2 answers

Merging two different models in Keras

I am trying to merge two Keras models into a single model and I am unable to accomplish this. For example in the attached Figure, I would like to fetch the middle layer $A2$ of dimension 8, and use this as input to the layer $B1$ (of dimension 8…
Rkz
46 votes · 5 answers

Does gradient descent always converge to an optimum?

I am wondering whether there is any scenario in which gradient descent does not converge to a minimum. I am aware that gradient descent is not always guaranteed to converge to a global optimum. I am also aware that it might diverge from an optimum…
46 votes · 5 answers

How to force weights to be non-negative in Linear regression

I am using a standard linear regression from scikit-learn in Python. However, I would like to force the weights to be non-negative for every feature. Is there any way I can accomplish that? I was looking in the documentation but could not find…
user
45 votes · 4 answers

Early stopping on validation loss or on accuracy?

I am currently training a neural network and I cannot decide which to use as my early stopping criterion: validation loss, or a metric like accuracy/f1score/auc/whatever calculated on the validation set. In my research, I came upon articles…
qmeeus
45 votes · 4 answers

Why is ReLU used as an activation function?

Activation functions are used to introduce non-linearities in the linear output of the type w * x + b in a neural network. I can understand this intuitively for activation functions like sigmoid. I understand the advantages of ReLU,…
45 votes · 3 answers

What does the notation mAP@[.5:.95] mean?

For detection, a common way to determine if one object proposal was right is Intersection over Union (IoU, IU). This takes the set $A$ of proposed object pixels and the set of true object pixels $B$ and calculates: $$IoU(A, B) = \frac{A \cap B}{A…
Martin Thoma
44 votes · 2 answers

What does from_logits=True do in the SparseCategoricalCrossentropy loss function?

In the documentation it has been mentioned that y_pred needs to be in the range of [-inf to inf] when from_logits=True. I truly didn't understand what this means, since the probabilities need to be in the range of 0 to 1! Can someone please explain…