Highest Voted 'k-means' Questions - Data Science Stack Exchange

200

votes

13 answers

K-Means clustering for mixed numeric and categorical data

My data set contains a number of numeric attributes and one categorical. Say, NumericAttr1, NumericAttr2, ..., NumericAttrN, CategoricalAttr, where CategoricalAttr takes one of three possible values: CategoricalAttrValue1, CategoricalAttrValue2 or…

asked May 14 '14 at 05:58

IgorS

5,444
11
31
43

67

votes

9 answers

Clustering geo location coordinates (lat,long pairs)

What is the right approach and clustering algorithm for geolocation clustering? I'm using the following code to cluster geolocation coordinates: import numpy as np import matplotlib.pyplot as plt from scipy.cluster.vq import kmeans2,…

machine-learning python clustering k-means geospatial

asked Jul 17 '14 at 09:50

rok

813
1
7
6

25

votes

3 answers

K-means incoherent behaviour choosing K with Elbow method, BIC, variance explained and silhouette

I'm trying to cluster some vectors with 90 features with K-means. Since this algorithm asks me for the number of clusters, I want to validate my choice with some nice math. I expect to have from 8 to 10 clusters. The features are Z-score scaled. Elbow…

clustering k-means

asked Jul 20 '15 at 08:03

marcodena

1,667
4
14
17

20

votes

4 answers

K-means: What are some good ways to choose an efficient set of initial centroids?

When a random initialization of centroids is used, different runs of K-means produce different total SSEs, so the choice of initial centroids is crucial to the performance of the algorithm. What are some effective approaches toward solving this problem? Recent approaches are…

data-mining clustering k-means

asked Apr 30 '15 at 13:42

ngub05

333
1
2
8

17

votes

2 answers

K-means vs. online K-means

K-means is a well-known algorithm for clustering, but there is also an online variant of this algorithm (online K-means). What are the pros and cons of these approaches, and when should each be preferred?

clustering algorithms k-means

asked Jun 18 '14 at 19:48

Rubens

4,097
5
23
42

14

votes

2 answers

Fast k-means like algorithm for $10^{10}$ points?

I am looking to do k-means clustering on a set of 10-dimensional points. The catch: there are $10^{10}$ points. I am looking for just the center and size of the largest clusters (let's say 10 to 100 clusters); I don't care about what cluster each…

clustering k-means

asked May 11 '15 at 06:53

Alex I

3,142
1
21
27

12

votes

1 answer

What are practical differences between kernel k-means and spectral clustering?

Lately I've been wondering about kernel k-means and spectral clustering algorithms and their differences. I know that spectral clustering is a broader term and different settings can affect the way it works, but one popular variant is using…

clustering k-means kernel spectral-clustering

asked Jan 09 '20 at 07:40

Kuba_

264
1
10

12

votes

1 answer

How to measure the similarity between two images?

I have two groups of images, one of cats and one of dogs, and each group contains 2000 images. My goal is to cluster the images using k-means. Assume image1 is x and image2 is y. Here we need to measure the similarity between any…

machine-learning k-means similarity image

asked Apr 05 '19 at 00:36

jason

309
2
4
9

12

votes

2 answers

Clustering high dimensional data

TL;DR: Given a big image dataset (around 36 GiB of raw pixels) of unlabeled data, how can I cluster the images (based on the pixel values) without knowing the number of clusters K to begin with? I am currently working on an unsupervised learning…

clustering tensorflow k-means unsupervised-learning tsne

asked Jan 25 '17 at 17:55

sunside

223
1
2
8

12

votes

3 answers

How to get the probability of belonging to clusters for k-means?

I need to get the probability for each point in my data set. The idea is to compute a distance matrix (the first column contains distances to the first cluster, the second column contains distances to the second cluster, etc.). The closest point has probability =…

python clustering k-means

asked Oct 10 '16 at 06:17

Толкачёв Иван

317
1
3
7

11

votes

4 answers

Clustering for mixed numeric and nominal discrete data

My data includes survey responses that are binary (numeric) and nominal / categorical. All responses are discrete and at the individual level. The data is of shape (n=7219, p=105). A couple of things: I am trying to identify a clustering technique with a…

clustering k-means scikit-learn categorical-data

asked Nov 02 '15 at 04:12

kms

310
1
4
14

10

votes

1 answer

Convergence in Hartigan-Wong k-means method and other algorithms

I have been trying to understand the different k-means clustering algorithms, mainly those implemented in the stats package of the R language. I understand Lloyd's algorithm and MacQueen's online algorithm. The way I understand them is as…

r clustering k-means

asked Jan 19 '16 at 20:59

Sid

101
1
5

10

votes

1 answer

Confused about how to apply KMeans on my dataset with features extracted

I am trying to make basic use of the scikit-learn KMeans clustering package to create different clusters that I could use to identify a certain activity. For example, in my dataset below, I have different usage events (0,...,11), and each event…

python clustering k-means unsupervised-learning

asked Feb 02 '17 at 14:27

Gary

529
2
5
12

8

votes

2 answers

Image clustering by similarity measurement (CW-SSIM)

I'm trying to use scikit-learn and pyssim for clustering a set of images - less than 100. The end goal is to place the images into several buckets (clusters) according to the calculated similarity measures - CW-SSIM. The task seems to be trivial,…

machine-learning r python scikit-learn k-means

asked Jan 10 '16 at 19:44

Oleg Puzanov

111
1
4

8

votes

1 answer

Bag of Visual Words

What I am trying to do: I am trying to classify some images using local and global features. What I have done so far: I have extracted SIFT descriptors for each image and I am using these as my input for k-means to create my vocabulary from all of…

python clustering image-classification k-means

asked May 02 '18 at 22:30

Kevin

261
3
7
