
I have a dataset of houses like this:

HouseID  Latitude Longitude PriceIndex
  1          1.4     103.120    1.21
  2          1.42    103.112    2.01 

I want to find houses that are similar to each other based on both their position and their price index. (I would also need to rank houses in order of similarity to a given house.) I tried the hclust function in R and was able to extract 9 classes. However, the groups don't seem to have any interpretable similarity; for example, the points in a group are spread all across the city. I haven't done clustering projects before, so any pointers in the right direction would be helpful. Thanks!

Edit: Removing the price index column from the clustering data set gives spatial clusters. But adding the price back gives clusters based only on price.

UD1989

1 Answer


Check the ranges of your dimensions and consider scaling if you see a large difference.

I would attribute the behaviour you describe to the price index having a much larger range than the other two dimensions, so it dominates the distance calculation. See also the question.
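As a rough illustration of the scaling step, here is a minimal sketch in base R. The house values are invented toy data in the same shape as the question, and the choices of `k = 2`, complete linkage, and plain Euclidean distance on latitude/longitude are assumptions made for the example, not something taken from the original data. `rank_similar` is a hypothetical helper name.

```r
# Toy data shaped like the question's table; the values are invented.
houses <- data.frame(
  HouseID    = 1:6,
  Latitude   = c(1.40, 1.42, 1.41, 1.30, 1.31, 1.29),
  Longitude  = c(103.120, 103.112, 103.118, 103.850, 103.846, 103.852),
  PriceIndex = c(1.21, 2.01, 1.30, 1.25, 1.95, 2.05)
)

features <- houses[, c("Latitude", "Longitude", "PriceIndex")]

# Compare the ranges first: a column with a much larger spread
# dominates the Euclidean distance.
print(sapply(features, function(x) diff(range(x))))

# Standardise every column to mean 0 and standard deviation 1,
# so all three dimensions contribute on a comparable scale.
scaled <- scale(features)

# Hierarchical clustering on the standardised features.
# k = 2 and complete linkage are arbitrary choices for this sketch.
d <- dist(scaled)
hc <- hclust(d, method = "complete")
houses$cluster <- cutree(hc, k = 2)

# Hypothetical helper: rank the other houses by distance to a given
# house in the scaled space (smaller distance = more similar).
rank_similar <- function(house_id) {
  dm <- as.matrix(d)
  rownames(dm) <- colnames(dm) <- houses$HouseID
  dists <- dm[as.character(house_id), ]
  dists <- sort(dists[names(dists) != as.character(house_id)])
  data.frame(HouseID = as.integer(names(dists)), distance = unname(dists))
}

print(rank_similar(1))
```

The same distance matrix serves both the clustering and the "rank by similarity to one house" part of the question. If position should count for more or less than price, one option is to multiply the scaled columns by weights before calling `dist`.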

Marmite Bomber
  • Yes! Scaling solved the problem. I had scaled the variables individually but not with respect to each other. How stupid of me. Thanks a lot! – UD1989 Dec 01 '15 at 09:43