Mixed geospatial and categorical clustering

Asked Jan 15 '17 at 22:47

Active Apr 07 '19 at 12:43

Viewed 684 times

I'm working on a project that seeks to identify clusters in urban development based on location (in lat/lon) and a categorical variable (what the particular site is zoned for). Ideally, the analysis would identify clusters of sites that are 1) near each other and 2) zoned the same. Below is a sample of what my data looks like:

 lat       lon      zone
 33.22320 -112.6741 R-43      
 33.45324 -113.0888 R-43      
 33.71800 -112.3885 R-43      
 33.45626 -111.9408 AG        
 33.45746 -111.9313 R-6       
 33.45747 -111.9309 R-6

I've seen methods that define distance on lat/lon pairs alone using great-circle distances, but I haven't seen any mixed clustering of the sort I'm trying to implement. I'm fairly new to cluster analysis, so any guidance on implementing something like this in R or Python would be greatly appreciated!

edited Apr 07 '19 at 12:43

Tasos


asked Jan 15 '17 at 22:47

user27974


This is just a special case of mixed continuous/categorical clustering. If you search for that you should find something, e.g., [this](http://datascience.stackexchange.com/questions/22/k-means-clustering-for-mixed-numeric-and-categorical-data) question. – Emre Jan 16 '17 at 03:41
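
One way to read the mixed continuous/categorical idea for this data is a single pairwise distance: the great-circle (haversine) distance between two sites, plus a fixed penalty when their zones differ. The sketch below is only an illustration under that assumption. The function names and the `zone_penalty_km` value are hypothetical and would need tuning; the rows are the sample from the question.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mixed_distance(a, b, zone_penalty_km=50.0):
    """Haversine distance plus a flat penalty if the zones differ.

    zone_penalty_km is an assumed tuning knob, not a recommended value.
    """
    d = haversine_km(a[0], a[1], b[0], b[1])
    return d + (zone_penalty_km if a[2] != b[2] else 0.0)


# Sample rows from the question: (lat, lon, zone)
sites = [
    (33.22320, -112.6741, "R-43"),
    (33.45324, -113.0888, "R-43"),
    (33.71800, -112.3885, "R-43"),
    (33.45626, -111.9408, "AG"),
    (33.45746, -111.9313, "R-6"),
    (33.45747, -111.9309, "R-6"),
]

# Full pairwise distance matrix as a list of lists
dist = [[mixed_distance(a, b) for b in sites] for a in sites]
```

A matrix like `dist` can be handed to any clustering method that accepts precomputed distances, for example hierarchical clustering or DBSCAN with `metric="precomputed"`. A larger penalty makes cross-zone clusters less likely; a very large one effectively forces clusters to be single-zone.
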
I'd **split the data** per category, cluster each, then compare. – Has QUIT--Anony-Mousse Jan 16 '17 at 07:00
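
The split-per-category suggestion can be sketched as follows: group the sites by zone, then cluster the coordinates within each group on great-circle distance. To keep the example dependency-free, the clustering step here is a simple distance-threshold grouping (connected components, equivalent to cutting single-linkage at a fixed distance), not any specific algorithm named in the thread. The `max_km` value and the helper names are assumptions for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlam = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def threshold_clusters(points, max_km):
    """Label points so that any two points within max_km share a cluster (transitively)."""
    labels = [-1] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue  # already assigned to an earlier cluster
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and haversine_km(points[i], points[j]) <= max_km:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Sample rows from the question: (lat, lon, zone)
sites = [
    (33.22320, -112.6741, "R-43"),
    (33.45324, -113.0888, "R-43"),
    (33.71800, -112.3885, "R-43"),
    (33.45626, -111.9408, "AG"),
    (33.45746, -111.9313, "R-6"),
    (33.45747, -111.9309, "R-6"),
]

# Step 1: split the coordinates by zone
by_zone = defaultdict(list)
for lat, lon, zone in sites:
    by_zone[zone].append((lat, lon))

# Step 2: cluster each zone separately (max_km=5.0 is an assumed threshold)
clusters = {zone: threshold_clusters(pts, max_km=5.0) for zone, pts in by_zone.items()}
```

Because each zone is clustered on its own, every resulting cluster is single-zone by construction; the per-zone results can then be compared or mapped side by side.
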
