Your suggestion is a valid one: encode the zeros as a value whose outcome is known once the scaling is applied. Note that log(1) = 0, so any value set to 1 becomes zero after the transform; keep that in mind for your next stage. You can use clip or replace for this:
df.clip(lower=1)
or replace the zeros with NaN, which the log transform leaves as NaN:
df.replace(0, np.nan)
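As a rough sketch of how the two calls behave once a log is applied, the snippet below uses a made-up single-column frame (the column name and values are assumptions for illustration, as is the choice of log10). Note that clipping at 1 also raises any values between 0 and 1, not just the zeros.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one column containing a zero
df = pd.DataFrame({"column": [0.0, 1.0, 10.0, 100.0]})

# Option 1: raise everything below 1 up to 1, so the log gives 0 there
clipped = df.clip(lower=1)
log_clipped = np.log10(clipped)

# Option 2: mark zeros as missing; the log of NaN stays NaN
masked = df.replace(0, np.nan)
log_masked = np.log10(masked)

print(log_clipped)
print(log_masked)
```

With clip the former zeros are indistinguishable from genuine ones after the transform, while with NaN they stay visibly missing, which many plotting functions simply skip.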
Alternatively you could do one of the following:
- Drop the zero-value rows, but then you lose some data, e.g.
df = df[df['column'] != 0]
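- Fill the zero values with a statistically representative value (e.g. by interpolation). The pandas interpolate method only fills NaN values, so replace the zeros with NaN first; see the pandas DataFrame.interpolate documentation for the available methods.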
Which method you choose depends on your use case and on compatibility with the plotting function you use.
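The two alternatives listed above could look roughly like this. The frame, the column name and the default linear interpolation are assumptions chosen for illustration; a different interpolation method may suit your data better.

```python
import numpy as np
import pandas as pd

# Hypothetical data with zeros standing in for missing measurements
df = pd.DataFrame({"column": [1.0, 0.0, 3.0, 0.0, 5.0]})

# Alternative 1: drop the rows where the value is zero (loses data)
dropped = df[df["column"] != 0]

# Alternative 2: turn zeros into NaN, then fill them by linear interpolation
filled = df.copy()
filled["column"] = filled["column"].replace(0, np.nan).interpolate()

print(dropped)
print(filled)
```

Here the interpolated frame keeps all five rows, with the former zeros filled from their neighbours, whereas the dropped frame keeps only the three non-zero rows.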