I am interested in knowing what really happens in the Hellinger distance (in simple terms). Furthermore, what types of problems can we use the Hellinger distance for, and what are the benefits of using it?
-
The Hellinger distance is a probabilistic analog of the Euclidean distance. A salient property is its symmetry, as a metric. Such mathematical properties are useful if you are writing a paper and you need a distance function that possesses certain properties to make your proof possible. In application, someone might discover that one metric produces nicer or better results than another for a certain task; e.g., the [Wasserstein distance](https://en.wikipedia.org/wiki/Wasserstein_metric) is all the rage in [generative adversarial networks](https://www.quora.com/What-is-new-with-Wasserstein-GAN) – Emre Aug 31 '17 at 05:43
-
Thank you for the comment. I came across this question, which is quite similar to mine: https://datascience.stackexchange.com/questions/22324/combine-two-sets-of-clusters Please let me know why the answer says the Hellinger distance is suitable. – Smith Volka Aug 31 '17 at 05:59
-
Probably to [visualize](https://blakeboswell.github.io/2016/gensim-hellinger/) the topics in a metric space. Another nice property is that the Hellinger distance is finite for distributions with different support. It is good that you are asking these questions. I suggest trying different metrics for yourself and observing the results. – Emre Aug 31 '17 at 06:06
-
Thanks, it's a good link and helps a lot. But is the Hellinger distance limited to topics derived from Latent Dirichlet Allocation (LDA), as mentioned in the link? – Smith Volka Aug 31 '17 at 06:11
-
No, it has no inherent connection to LDA. – Emre Aug 31 '17 at 06:12
-
Thanks. The points you mentioned helped a lot to understand. – Smith Volka Aug 31 '17 at 06:17
-
Have you seen the [Wikipedia entry](https://en.wikipedia.org/wiki/Hellinger_distance)? – Martin Thoma Sep 02 '17 at 07:32
-
@Emre Please let me know if you know an answer for this https://datascience.stackexchange.com/questions/22828/clustering-with-cosine-similarity – Smith Volka Sep 05 '17 at 06:22
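The comment above notes that the Hellinger distance stays finite for distributions with different support, unlike, say, the KL divergence. A minimal sketch illustrating this (NumPy only; `p` and `q` are toy distributions chosen for this example):

```python
import numpy as np

# Two distributions whose supports differ: p puts mass where q has none.
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])

# Hellinger distance: (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2
hellinger = np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2)

# KL divergence D(p||q) diverges wherever q is 0 but p is not.
with np.errstate(divide="ignore"):
    kl = np.sum(np.where(p > 0, p * np.log(p / q), 0.0))

print(hellinger)  # finite (about 0.707)
print(kl)         # inf
```

So even when one distribution assigns zero probability to outcomes the other considers possible, the Hellinger distance remains a usable, bounded quantity.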
1 Answer
Hellinger distance is a metric to measure the difference between two probability distributions. It is the probabilistic analog of Euclidean distance.
Given two probability distributions, $P$ and $Q$, Hellinger distance is defined as:
$$h(P,Q) = \frac1{\sqrt2}\cdot \|\sqrt{P}-\sqrt{Q}\|_2$$
It is useful for quantifying the difference between two probability distributions. For example, suppose you estimate feature distributions for users and non-users of a service. If the Hellinger distance between the two groups is small for some features, then those features are not useful for separating users from non-users.
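The definition above translates directly into code. A minimal sketch for discrete distributions (NumPy only; the `hellinger` helper name and the example vectors are mine):

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions,
    h(P, Q) = (1/sqrt(2)) * ||sqrt(P) - sqrt(Q)||_2."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2)

p = [0.5, 0.5, 0.0]
q = [0.0, 0.0, 1.0]
print(hellinger(p, p))  # identical distributions -> 0.0
print(hellinger(p, q))  # disjoint supports -> maximum distance, ~1.0
```

The 1/sqrt(2) factor normalizes the distance to lie in [0, 1], with 0 for identical distributions and 1 for distributions with disjoint support.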
timleathart
Brian Spiering
-
(also for @Emre) Where does the claim that the Hellinger distance is the probabilistic analog of the Euclidean distance come from? Why is that? – carlo May 21 '20 at 16:06