I have one-hot-encoded my features and want to calculate the similarity between users with the Jaccard index. However, I am certain that the features differ in importance for my clustering (i.e. some features should contribute more than others to the distance between my users).

Thinking about how Jaccard is calculated, the most obvious approach is to duplicate the important features, so that they count multiple times in the similarity. Is there any other way?
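
To make the duplication idea concrete, here is a minimal sketch of what I mean. It assumes binary (one-hot) vectors and positive per-feature weights; the function names `weighted_jaccard` and `duplicate_columns`, the example vectors, and the weights are all made up for illustration. With integer weights, summing the weights over the intersection and the union should give the same value as physically repeating the columns, so the duplication may not need to be materialized at all.

```python
def weighted_jaccard(a, b, weights):
    """Jaccard index on binary vectors where feature i counts weights[i] times."""
    # Total weight of features present in both vectors.
    inter = sum(w for x, y, w in zip(a, b, weights) if x and y)
    # Total weight of features present in at least one vector.
    union = sum(w for x, y, w in zip(a, b, weights) if x or y)
    # Assumption: two all-zero vectors are treated as identical.
    return inter / union if union else 1.0


def duplicate_columns(x, counts):
    """Repeat feature i exactly counts[i] times (the naive duplication approach)."""
    return [v for v, c in zip(x, counts) for _ in range(c)]


# Hypothetical users and weights, not real data.
user_1 = [1, 0, 1, 1]
user_2 = [1, 1, 0, 1]
counts = [3, 1, 1, 2]

plain = weighted_jaccard(user_1, user_2, [1] * len(user_1))
weighted = weighted_jaccard(user_1, user_2, counts)
duplicated = weighted_jaccard(
    duplicate_columns(user_1, counts),
    duplicate_columns(user_2, counts),
    [1] * sum(counts),
)

print(plain, weighted, duplicated)
```

In this toy case the weighted version and the duplicated version agree, and the weights could also be non-integer, which duplication cannot express.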

alexryder
