[BUG] px.scatter "ols" not producing linear trend line #3683
Comments
I updated to the latest dash (2.3.1) and the problem still persists... |
@nicolaskruchten I can't quite tell what we're falling back on here but I'm guessing this just means |
Possibly. It worked as expected when I ran the same code on a Mac. This is currently being run on Windows 10. I can provide my entire codebase if that might help? |
A full reproducible example would be great, yes. Simplified to the minimal case if you can. My hunch about what's happening here: we're not able to use dates as the x data in the curve fitting algorithm, so it's using row indices as the x data during fitting, but somehow the indices used are out of order on Windows. If that's the case, then even on Mac where the indices are ordered correctly, the fit looks right only because your dates happen to be evenly spaced. |
This is meant to work even on non-evenly-spaced dates: dates are converted to floats and the regression happens there, then the |
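The approach described in the comment above (convert dates to floats, then regress on those floats) can be sketched roughly as follows. This is only an illustration using the standard library, not Plotly's actual implementation; the helper name `ols_fit` and the sample dates are assumptions made up for this example.

```python
from datetime import datetime, timezone


def ols_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centering on the means keeps the sums well conditioned for large x values.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Deliberately unevenly spaced dates (made-up sample data).
dates = [
    datetime(2022, 1, 1, tzinfo=timezone.utc),
    datetime(2022, 1, 3, tzinfo=timezone.utc),
    datetime(2022, 2, 1, tzinfo=timezone.utc),
    datetime(2022, 5, 1, tzinfo=timezone.utc),
    datetime(2022, 5, 2, tzinfo=timezone.utc),
]

# Convert each date to a float (seconds since the Unix epoch).
x = [d.timestamp() for d in dates]

# y grows by exactly 0.5 per elapsed day, so it is linear in time.
y = [2.0 + 0.5 * (xi - x[0]) / 86_400 for xi in x]

slope, intercept = ols_fit(x, y)
trend = [intercept + slope * xi for xi in x]
```

Because the fit uses the real time values rather than row indices, the trend line stays straight on a date axis even when the dates are not evenly spaced.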
Hi @nicolaskruchten... was there any update on this? |
No update yet, no. Can you provide a fully runnable example including data please? The standard test case I use for OLS with dates on the X axis is this |
Also can you confirm the version of Plotly you are using? The latest is 5.7.0 |
Hello, I'm not sure if this has been resolved but I am seeing a similar issue when trying to plot data that only has date values on the first of each month (though this is in a Jupyter environment and not Dash so I'm not sure if it is exactly the same case). I've included the code below. This code does work properly when I run it in Google Colab, and there is a specific difference in the data that I don't understand (more below). OS: Windows 10 v 20H2 build 19042.1826
This produces two plots. The first uses the Datetime column and has a non-linear trendline. For the second plot, I converted the dates into a serialized format, and the trendline is now linear. But as I noted, the plots render as expected when I run them in Google Colab. The major difference between the results I get in my environment and what I get in Google Colab is the values in the DateValue field.
I have no idea why the DateValue numbers are so different, but I imagine the values being out of order is part of (or all of) the issue. EDIT -- If I convert to int64 instead of int, I get the same values as I see in Colab. It looks like this line in the Plotly code linked above converts to int, which I suspect produces the negative values for me:
Colab environment: My current environment: I also replicated my error on an older environment: Let me know if any other details would be helpful. |
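The int versus int64 difference described in the comment above is consistent with 32-bit truncation: nanosecond timestamps do not fit in a signed 32-bit integer, so a C-style cast keeps only the low 32 bits. The sketch below emulates that effect with plain Python integers. It assumes the platform's default integer is 32 bits wide, as the EDIT suggests, and it does not reproduce the actual pandas or Plotly code path; `to_epoch_ns` and `wrap_int32` are hypothetical helpers written for this example.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ns(dt):
    """Nanoseconds since the Unix epoch, as an exact Python int (hypothetical helper)."""
    delta = dt - EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 1_000_000_000 + delta.microseconds * 1_000


def wrap_int32(value):
    """Emulate a C-style cast to a signed 32-bit integer (hypothetical helper)."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


# First-of-month dates, similar in shape to the data described in the comment.
dates = [datetime(2022, month, 1, tzinfo=timezone.utc) for month in range(1, 7)]

as_int64 = [to_epoch_ns(d) for d in dates]    # what a 64-bit conversion preserves
as_int32 = [wrap_int32(v) for v in as_int64]  # what survives a 32-bit cast

for d, wide, narrow in zip(dates, as_int64, as_int32):
    print(d.date(), wide, narrow)
```

The 64-bit values increase with the dates, while the truncated values bear no useful relation to them and can be negative, which would hand the regression a scrambled x axis.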
I can replicate this with current plotly 5.11.0, pandas 1.4.1, statsmodels 0.13.5 and Python 3.8.15 on Win10 Enterprise 64bit 22H2. I ran the example code from @sdelu and got the same plots. Likewise, the example |
By the way, this issue is not limited to "ols", so maybe the issue can be renamed to something along the lines of "broken trendlines with datetime x-axis". Here is how the second "lowess" example from the documentation looks on my system:
import plotly.express as px
df = px.data.stocks(datetimes=True)
fig = px.scatter(df, x="date", y="GOOG", trendline="lowess", trendline_options=dict(frac=0.1))
fig.show() |
Fix trendlines for datetimes (#3683)
Fixed in 5.12! |
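Since the fix is reported as landing in 5.12, a quick way to tell whether an environment is affected is to compare the installed Plotly version against that release. The following is a minimal sketch; `has_trendline_fix` and `installed_plotly_has_fix` are hypothetical helper names, and the parsing only looks at the major and minor version components.

```python
from importlib import metadata

# First release reported in this thread as containing the fix.
FIXED_IN = (5, 12)


def has_trendline_fix(version):
    """Return True if a version string such as '5.11.0' is at least 5.12."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= FIXED_IN


def installed_plotly_has_fix():
    """Check the installed plotly package; returns None if it is not installed."""
    try:
        return has_trendline_fix(metadata.version("plotly"))
    except metadata.PackageNotFoundError:
        return None


print(installed_plotly_has_fix())
```

The versions mentioned earlier in the thread (5.7.0 and 5.11.0) both fall below this threshold.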
Describe your context
Please provide us your environment, so we can easily reproduce the issue.
pip list | grep dash
If frontend related, tell us your Browser, Version and OS
Describe the bug
The "ols" (ordinary least squares) option for adding a linear trend line is not producing a straight regression line. The result instead looks closer to a polynomial curve.
The code I'm using to create the graph is as follows:
Instead of a linear regression line per the example here - https://plotly.com/python/linear-fits/
I'm getting this:
The x and y values are just floating point numbers and date values respectively.
The Plotly version is 5.7.0
Expected behavior
Linear regression line.