
I'm plotting a dataset with a trendline, but using ols vs lowess gives me significantly different line shapes. I'm sure this is an expected and perfectly normal result, but I'm afraid I don't understand the significance of the difference. Could someone explain what's going on? Here's my code using ols:

import plotly.express as px

# Scatter plot of Ratio with an OLS trendline drawn in red
fig = px.scatter(df_small, y="Ratio",
                 trendline="ols",
                 trendline_color_override="red")
fig.show()

And here's the plot it produced: [image: scatter plot with OLS trendline]

Using lowess gave me this plot: [image: scatter plot with LOWESS trendline]

For context, my dataset represents the ratio of daily Covid deaths to new infections in the Province of Ontario over the previous 18 months.

dlanced
  • Welcome to DataScienceSE. I don't know the details, but essentially these are two different methods of fitting the data. What I notice on your graphs, though, is the logarithmic X axis. I'm not sure it's a good choice here, since it makes most of the points bunch together on the right side. If X represents time, it's definitely not a good idea imho. – Erwan Aug 13 '21 at 18:21
  • Thanks. I did notice the X-axis problem and was going to work on that next. – dlanced Aug 13 '21 at 18:27

1 Answer


I was curious so I looked it up:

For the record, I ran a small test in R with ggplot2, which seems to have a better implementation:

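library(ggplot2)

# Noisy cosine sampled on [0, 12] in steps of 0.1
t <- seq(0, 12, 0.1)
noise <- rnorm(length(t), mean = 0, sd = 0.25)
v <- cos(t) + noise
d <- data.frame(x = t, y = v)

# Scatter plot with a LOESS smoother on top
ggplot(d, aes(x, y)) + geom_point() + geom_smooth(method = "loess")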

[image: noisy curve with loess regression in ggplot2]
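To make the difference between the two trendlines concrete: OLS fits one straight line to all of the points at once, while LOWESS fits a separate weighted line around each point using only its nearest neighbours, so it can bend to follow the data. Below is a minimal pure-Python sketch of both on the same kind of noisy cosine used in the R test. It is a simplified illustration, not the code Plotly or ggplot2 actually run: the function names `ols_line` and `lowess`, the `frac=0.2` setting, and the fixed random seed are all assumptions made for this example, and the robustness iterations of full LOWESS are left out.

```python
import math
import random


def ols_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lowess(xs, ys, frac=0.2):
    """Simplified LOWESS (hypothetical helper, no robustness passes).

    For each x0, fit a tricube-weighted straight line to the nearest
    frac * n points and evaluate that local line at x0.
    """
    n = len(xs)
    k = min(n, max(2, math.ceil(frac * n)))
    fitted = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        # Bandwidth: distance to the k-th nearest point (guard against 0).
        h = sorted(dists)[k - 1] or 1.0
        weights = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]
        sw = sum(weights)
        mean_x = sum(w * x for w, x in zip(weights, xs)) / sw
        mean_y = sum(w * y for w, y in zip(weights, ys)) / sw
        sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mean_x) * (y - mean_y)
                  for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(mean_y + slope * (x0 - mean_x))
    return fitted


def rmse(a, b):
    """Root mean squared difference between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


# Noisy cosine on [0, 12], mirroring the R test (seed chosen arbitrarily).
random.seed(0)
t = [i * 0.1 for i in range(121)]
truth = [math.cos(x) for x in t]
y = [c + random.gauss(0, 0.25) for c in truth]

intercept, slope = ols_line(t, y)
ols_fit = [intercept + slope * x for x in t]
lowess_fit = lowess(t, y, frac=0.2)

rmse_ols = rmse(ols_fit, truth)
rmse_lowess = rmse(lowess_fit, truth)
print(f"OLS    RMSE vs. true curve: {rmse_ols:.3f}")
print(f"LOWESS RMSE vs. true curve: {rmse_lowess:.3f}")
```

On data that really is a straight line, the two methods agree. On curved data like this, the single OLS line cannot follow the oscillation, whereas the local fits can. In this sketch, larger values of `frac` smooth more heavily.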

Erwan