I am trying to plot a distribution of non-negative integers which contains a lot of variance. I opted to use the log of the y-values, but that causes issues due to the inclusion of zeros. I thought of plotting log10(n+1), but it seems a bit janky.

Is this solution used more often?
Does it have a name?
Is there a better/more common method?

iHnR
  • Why do any transformation at all? In other words, what do you want your transformation to accomplish? – Dave Aug 25 '22 at 19:05
  • @Dave Log(0) amounts to minus infinity, which is hard to plot. By making all values at least 1, I ensure that the log is always non-negative without ignoring the zeros outright. – iHnR Aug 25 '22 at 21:38
  • But why transform at all? Why not plot the original values? – Dave Aug 25 '22 at 21:38
  • @Dave because 10 and 1000 are hard to distinguish while 1 and 3 aren't. I'll probably not use this method as I expect it to cause confusion, but it seemed like an obvious solution so I wondered if it had a name. – iHnR Aug 25 '22 at 21:42

1 Answer

I'm not aware of a standard name, but it does appear to be fairly common. Programmatically, it's often implemented as log1p ("log of 1 plus").
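As a rough illustration of the log10(n+1) idea from the question, here is a minimal sketch using Python's standard `math.log1p`. The `counts` list is made-up example data, and dividing by ln(10) is just one way to get base 10 out of the natural-log `log1p`:

```python
import math

# Made-up example counts, including a zero (an assumption for illustration)
counts = [0, 1, 3, 10, 1000]

# log1p(n) is the natural log of (1 + n); dividing by ln(10) gives log10(n + 1)
transformed = [math.log1p(n) / math.log(10) for n in counts]

for n, t in zip(counts, transformed):
    print(f"{n:>5} -> {t:.3f}")
```

Zero maps to zero, and the order of the values is preserved, so the zeros stay on the plot instead of going to minus infinity.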

There are also the "symlog" and "log-modulus" transformations, for when you need to handle negative values as well.
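The log-modulus transformation is usually written as sign(x) * log(|x| + 1). A minimal sketch, where the function name `log_modulus` and the choice of base 10 are my own assumptions for illustration:

```python
import math

def log_modulus(x):
    """Return sign(x) * log10(|x| + 1); hypothetical helper for illustration."""
    # log1p(|x|) / ln(10) is log10(|x| + 1); copysign restores the sign of x
    return math.copysign(math.log1p(abs(x)) / math.log(10), x)

# Made-up example values spanning negative, zero, and positive
values = [-1000, -3, 0, 3, 1000]
for v in values:
    print(f"{v:>6} -> {log_modulus(v):.3f}")
```

The result is symmetric about zero and maps 0 to 0. Note that matplotlib's "symlog" axis scale is a related but different construction: it is linear near zero and logarithmic beyond a threshold.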

Ben Reiniger