I am getting 100% accuracy for the models below. Can you please advise me on what I am doing wrong? If this is due to overfitting, why do the other models shown below not overfit?

enter image description here

Here is the code I use for the random forest:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# X_train, y_train and seed are defined earlier (not shown here)
rf = RandomForestClassifier(random_state=1)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
# Score of the model on each of the 10 stratified folds
cv_score = cross_val_score(rf, X_train, y_train.values.ravel(), cv=kfold)
rf_score = cv_score.mean()
print('Random Forest K-fold Scores:')
print(cv_score)
print()
print('Random Forest Average Score:')
print(rf_score)

Data sample: [image]

Deep
  • Can you please also add some of your data? – ksohan Feb 07 '22 at 09:21
  • Added data sample – Deep Feb 07 '22 at 11:38
  • If these results are obtained with cross-validation, as in the code, then this is not overfitting. Overfitting would cause low performance on the test set. It is possible that the results are correct, assuming that there is no error in the data. In particular, duplicates in the data would bias the performance estimate. – Erwan Feb 07 '22 at 16:32
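
The duplicate check suggested in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `duplicate_report`, the `_target` column name, and the toy data are hypothetical, and it assumes the features are in a pandas DataFrame and the labels in a Series, as `y_train.values.ravel()` in the question suggests.

```python
import pandas as pd


def duplicate_report(X: pd.DataFrame, y: pd.Series) -> dict:
    """Count exact duplicate rows, with and without the label (hypothetical helper)."""
    data = X.copy()
    # "_target" is an assumed, illustrative column name for the label
    data["_target"] = y.to_numpy()
    return {
        "n_rows": len(data),
        # rows whose features AND label repeat an earlier row
        "duplicate_rows": int(data.duplicated().sum()),
        # rows whose features repeat an earlier row, label ignored
        "duplicate_feature_rows": int(X.duplicated().sum()),
    }


# Toy data standing in for X_train / y_train from the question
X_demo = pd.DataFrame({"a": [1, 1, 2, 3], "b": [0, 0, 5, 7]})
y_demo = pd.Series([1, 1, 0, 1])

report = duplicate_report(X_demo, y_demo)
print(report)
```

If a large share of the rows turn out to be duplicates, copies of the same row are likely to land in both the training and validation folds, which would make the cross-validation scores look better than they are.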

0 Answers