Clustering with k-means

Question

I have a dataset with a two-class label column (good and bad). I want to apply k-means to this dataset using Python. Should I keep the label column, or do I have to delete it first?

K-means clustering is done to give labels to data. You already have those, so why are you applying k-means? What is the problem statement? — bkshi, Feb 09 '19 at 08:58
I think the OP actually meant the dataset contains a `binary feature`. — Louis T, Feb 09 '19 at 09:49
Possible duplicate of [K-Means clustering for mixed numeric and categorical data](https://datascience.stackexchange.com/questions/22/k-means-clustering-for-mixed-numeric-and-categorical-data) — Louis T, Feb 09 '19 at 09:50
Thanks all for your answers. Yes, my dataset consists of binary features, and I want to use clustering to test new samples — lona, Feb 11 '19 at 03:23

score 1 · Answer 1 · answered Feb 09 '19 at 17:31

Delete the label column.

Assuming that you want to compare the clusters to the labels later, then the labels must not be part of the data passed to k-means.

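Also note that k-means only works well on continuous variables, because it minimizes squared Euclidean distances to cluster means. Binary features are a poor fit for that objective.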
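A minimal sketch of that workflow, assuming scikit-learn and pandas are available. The data is synthetic, and the column names `f1`, `f2`, and `label` are hypothetical stand-ins for the asker's dataset. It drops the label column before clustering, then uses the held-out labels only to evaluate the clusters:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for the real data (hypothetical column names):
# two continuous features and a good/bad label column.
rng = np.random.default_rng(0)
good = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
bad = rng.normal(loc=10.0, scale=1.0, size=(50, 2))
df = pd.DataFrame(np.vstack([good, bad]), columns=["f1", "f2"])
df["label"] = ["good"] * 50 + ["bad"] * 50

# Delete the label column: k-means must only see the features.
X = df.drop(columns=["label"])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X)

# Compare clusters to the held-out labels afterwards.
# Cluster ids are arbitrary, so use a table and a permutation-invariant score.
print(pd.crosstab(df["label"], clusters, rownames=["label"], colnames=["cluster"]))
ari = adjusted_rand_score(df["label"], clusters)
print(f"adjusted Rand index: {ari:.3f}")

# Assign new, unlabeled samples to the nearest learned centroid.
new_samples = pd.DataFrame({"f1": [0.2, 9.8], "f2": [-0.1, 10.3]})
new_clusters = kmeans.predict(new_samples)
print(new_clusters)
```

The cluster numbers returned by `fit_predict` and `predict` carry no meaning of their own. Mapping a cluster to "good" or "bad" is a separate step, for example by reading the majority label per cluster off the cross-tabulation.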

Has QUIT--Anony-Mousse
