Questions tagged [validation]

61 questions
3
votes
1 answer

Does validation data have any effect on training, or does it act passively without affecting the training?

When using the Keras library in Python, we pass validation data along with the training data while training our model. At the end of every epoch, we get a validation accuracy. Does this validation accuracy have any effect on training in the next epoch?
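For reference, a minimal sketch of how validation data is typically passed in Keras; the data and architecture here are made-up toys:

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy data: 1000 training and 200 validation samples
x_train, y_train = np.random.rand(1000, 20), np.random.randint(0, 2, 1000)
x_val, y_val = np.random.rand(200, 20), np.random.randint(0, 2, 200)

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# validation_data is only evaluated at the end of each epoch; gradients
# come from (x_train, y_train) alone, so the validation metrics do not
# influence the weight updates by themselves.
history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=10)
```

By default the validation set is only monitored; it feeds back into training only if you attach a callback such as EarlyStopping or ReduceLROnPlateau that acts on val_loss.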
3
votes
1 answer

Why do machine learning engineers insist on training with more data than they validate on?

Among my colleagues I have noticed a curious insistence on training with, say, 70% or 80% of the data and validating on the remainder. It is curious to me because of the lack of any theoretical reasoning, and it smacks of influence from a five-fold…
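The convention being questioned is usually implemented as a plain random hold-out; a sketch with scikit-learn, where the 80/20 ratio is exactly the unexamined default under discussion:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.random.rand(500, 10), np.random.randint(0, 2, 500)  # toy data

# Hold out 20% for validation; random_state fixes the shuffle
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42)
```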
2
votes
0 answers

ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). Any advice?

The error below presented itself when attempting to assemble a PCA. My code:
is_float = X.dtype.kind in 'fc'
if is_float and (np.isfinite(_safe_accumulator_op(np.sum, X))):
    pass
elif is_float:
    msg_err = "Input contains {}…
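A common way to get past this error is to locate the non-finite values before fitting; a sketch assuming X is a NumPy float array (the injected NaN is just for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 5)
X[3, 2] = np.nan  # inject a bad value for illustration

# np.isfinite is False for NaN, +inf and -inf alike
bad_rows = ~np.isfinite(X).all(axis=1)
print(f"{bad_rows.sum()} rows contain NaN or infinity")

# One simple remedy: drop the offending rows (imputation is another option)
PCA(n_components=2).fit(X[~bad_rows])
```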
2
votes
0 answers

Validation set after hyperparameter tuning

Let's say I'm comparing a few models, and for my dataset I'm using a train/validation/test split rather than cross-validation. Let's say I'm completely done with hyperparameter tuning for one of them and want to evaluate it on the test set. Will I train a new…
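One common convention, which this question is probing, is to refit the chosen model on train + validation with the tuned hyperparameters and then touch the test set exactly once; a sketch with stand-in data and a stand-in tuned value:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = np.random.rand(600, 8), np.random.randint(0, 2, 600)  # toy data
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

# ... hyperparameter tuning on (X_val, y_val) happens here ...
best_C = 1.0  # stand-in for the tuned value

# Refit on train + validation, then evaluate once on the test set
final_model = LogisticRegression(C=best_C).fit(
    np.vstack([X_train, X_val]), np.concatenate([y_train, y_val]))
print("test accuracy:", final_model.score(X_test, y_test))
```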
2
votes
1 answer

Using the whole dataset for testing (not validation) in the case of small datasets

For an object detection task I created a small dataset to train an object detector. The class frequencies are more or less balanced; however, I defined some additional attributes with environmental conditions for each image, which results in a rather…
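When the dataset is this small, k-fold cross-validation is a common way to use every sample for both training and evaluation; a generic sketch (not object-detection-specific, with stand-in data):

```python
import numpy as np
from sklearn.model_selection import KFold

X, y = np.random.rand(100, 16), np.random.randint(0, 2, 100)  # stand-in data
kf = KFold(n_splits=5, shuffle=True, random_state=0)

for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    # Train on X[train_idx] and evaluate on X[test_idx] in each fold
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test samples")
```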
2
votes
1 answer

Validation and training loss of a model are not stable

Below I have a trained model, and the losses on both the training dataset (blue) and the validation dataset (orange) are shown. From my understanding, the ideal case is that both validation and training loss should converge and stabilize in order to tell…
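Curves like the ones described are usually inspected by plotting the per-epoch losses; a sketch in which the history dict stands in for the log returned by model.fit:

```python
import matplotlib.pyplot as plt

# Stand-in for history.history from model.fit(..., validation_data=...)
history = {"loss": [0.90, 0.60, 0.50, 0.45, 0.44],
           "val_loss": [0.95, 0.70, 0.65, 0.66, 0.68]}

plt.plot(history["loss"], label="training loss")        # blue by default
plt.plot(history["val_loss"], label="validation loss")  # orange by default
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```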
2
votes
1 answer

How to build a model when we have three separate train, validation, and test sets?

I have a data set which should be divided into train, test, and validation sets.
set.seed(98274)  # Creating example data
y <- sample(c(0,1), replace=TRUE, size=500)
x1 <- rnorm(500) + 0.2 * y
x2 <- rnorm(500) + 0.2 * x1 +…
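The snippet above is R; for comparison, a Python sketch of one way to cut a dataset into the three sets (the 60/20/20 ratios are illustrative):

```python
import numpy as np

rng = np.random.default_rng(98274)
n = 500
X, y = rng.normal(size=(n, 2)), rng.integers(0, 2, size=n)  # toy data

# Shuffle once, then cut into 60% train / 20% validation / 20% test
idx = rng.permutation(n)
train_idx, val_idx, test_idx = np.split(idx, [int(0.6 * n), int(0.8 * n)])
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
X_test, y_test = X[test_idx], y[test_idx]
```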
2
votes
3 answers

What exactly is the difference between validation data and testing data?

I asked this question on Stack Overflow and was told that this is a better place for it. I am confused by the terms validation and testing: is validating the model the same as testing it? Is it possible to use testing data for validation? What even…
2
votes
2 answers

Dataset splits and why use evaluate()?

I am starting out in machine learning, and I have doubts about some concepts. I've read that we need to split our dataset into training, validation, and test sets. I'll ask four questions related to them. 1 - Training set: it is used in .fit() for our model…
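A minimal Keras sketch of where each set enters the workflow (data shapes and the architecture are hypothetical): .fit() consumes the training set and monitors the validation set, while .evaluate() gives the final estimate on held-out test data:

```python
import numpy as np
from tensorflow import keras

x_train, y_train = np.random.rand(800, 10), np.random.randint(0, 2, 800)
x_val, y_val = np.random.rand(100, 10), np.random.randint(0, 2, 100)
x_test, y_test = np.random.rand(100, 10), np.random.randint(0, 2, 100)

model = keras.Sequential([keras.Input(shape=(10,)),
                          keras.layers.Dense(8, activation="relu"),
                          keras.layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
loss, acc = model.evaluate(x_test, y_test)  # unbiased estimate on unseen data
```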
1
vote
3 answers

Splitting Training, Test, and Validation Sets for an Image Dataset

I have 600 images in the training folder, 200 images in the validation folder, and 200 images in the test folder. Suppose I fit the training data generator and validation data generator for some epochs for learning purposes -…
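A sketch of the typical generator setup for that folder layout; the paths are assumptions, and each folder is expected to contain one subdirectory per class:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(rescale=1.0 / 255)

train_gen = gen.flow_from_directory("data/train", target_size=(128, 128), batch_size=32)
val_gen = gen.flow_from_directory("data/validation", target_size=(128, 128), batch_size=32)
test_gen = gen.flow_from_directory("data/test", target_size=(128, 128), shuffle=False)

# model.fit(train_gen, validation_data=val_gen, epochs=10)
# model.evaluate(test_gen)  # the test generator is touched only once, at the end
```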
1
vote
1 answer

Measure performance of classification model for training on different snapshots

I am trying to do binary classification on some chronological data. Let's assume we have weekly data from the first week of 2017 through the last week of 2020. Now we have found out that 26 weeks of training data might be sufficient for doing…
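For chronological data like this, the usual pattern is a rolling (walk-forward) evaluation: train on a fixed-length window, test on the weeks that follow, then slide forward. A generic sketch using the question's 26-week window, with stand-in data and an assumed 4-week test horizon:

```python
import numpy as np

n_weeks = 208            # roughly 4 years of weekly snapshots
window, horizon = 26, 4  # train on 26 weeks, test on the next 4

X = np.random.rand(n_weeks, 5)        # stand-in features
y = np.random.randint(0, 2, n_weeks)  # stand-in binary labels

for start in range(0, n_weeks - window - horizon + 1, horizon):
    train = slice(start, start + window)
    test = slice(start + window, start + window + horizon)
    # Fit a classifier on X[train], y[train]; score it on X[test], y[test]
    print(f"train weeks {train.start}-{train.stop - 1}, "
          f"test weeks {test.start}-{test.stop - 1}")
```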
1
vote
1 answer

Using Z-test score to evaluate model performance

I think I know the answer to this question, but I am looking for a sanity check here: is it appropriate to use z-test scores to evaluate the performance of my model? I have a binary model that I developed with a neural network in Keras. I know the…
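One way a z-test shows up in this setting is checking whether the model's test-set accuracy differs significantly from a fixed baseline proportion; a sketch of a one-sample proportion z-test in which the counts and the baseline are made up:

```python
import math
from scipy.stats import norm

n, correct = 1000, 745         # hypothetical test-set size and correct predictions
p_hat, p0 = correct / n, 0.70  # observed accuracy vs. baseline to beat

# Standard one-sample z-test for a proportion
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - norm.cdf(abs(z)))  # two-sided
print(f"z = {z:.2f}, p = {p_value:.4f}")
```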
1
vote
2 answers

Dataset split for image classification

I am trying to do image classification with 14 categories (around 1000 images for each category). I initially created two folders for training and validation. In this case, do I still need to set a validation split or a subset in the code, or can I use…
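With the images already sorted into separate training and validation folders, two generators pointed at the two paths suffice and no validation_split is needed; validation_split is the alternative for when everything lives in one folder. A sketch of that single-folder variant (the path is an assumption):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Single-folder variant: carve 20% out of "data/all" for validation
gen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)
train_gen = gen.flow_from_directory("data/all", target_size=(128, 128),
                                    subset="training")
val_gen = gen.flow_from_directory("data/all", target_size=(128, 128),
                                  subset="validation")
```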
1
vote
1 answer

Does overfitting depend only on validation loss or on both training and validation loss?

There are several scenarios that can occur while training and validating: Both training loss and validation loss are decreasing, with the training loss lower than the validation loss. Both training loss and validation loss are decreasing, with the…
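The scenario usually labelled overfitting is the one where validation loss turns upward while training loss keeps falling; a common guard is an early-stopping callback keyed to validation loss, sketched here with Keras (x_train, x_val, etc. are assumed to exist):

```python
from tensorflow import keras

# Stop when validation loss has not improved for 5 epochs,
# and roll back to the best weights seen so far.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss",
                                           patience=5,
                                           restore_best_weights=True)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop])
```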
1
vote
1 answer

Is it right to maintain the train distribution in the test set for unbalanced data?

If the training set is unbalanced, the chances are the model will be biased. But if the class distribution in the test set is the same as in the train set, this kind of bias is not going to affect validation accuracy. But my question is if…
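Keeping the class distribution identical across splits is exactly what stratified sampling does; a sketch with scikit-learn's stratify argument on imbalanced toy labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Imbalanced toy labels: 90% class 0, 10% class 1
y = np.array([0] * 900 + [1] * 100)
X = np.random.rand(1000, 4)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
print(y_train.mean(), y_test.mean())  # both close to 0.10
```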