Improve tests for P2P stable ordering #8458
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
26 files +8, 26 suites +8, 9h 20m 55s ⏱️ +4h 4m 50s
For more details on these failures, see this check.
Results for commit bba2b06. ± Comparison against base commit a31744f.
This pull request removes 2 and adds 9 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results.
@pytest.mark.parametrize("disk", [True, False])
@pytest.mark.parametrize("keep", ["first", "last"])
@gen_cluster(client=True)
async def test_shuffle_stable_ordering(c, s, a, b, keep, disk):
    """Ensures that shuffling guarantees ordering for individual entries
    belonging to the same shuffle key"""
    df = dask.datasets.timeseries(
        start="2000-01-01",
        end="2000-02-01",
        dtypes={"x": int, "y": int, "z": int},
        freq="1 s",
    )
    df["x"] = df["x"] % 23
    df["y"] = df["y"] % 19
    df["z"] = df["z"] % 17
Maybe it is easier to construct a dataframe very explicitly and assert explicitly on the output group instead of testing this indirectly via `drop_duplicates`.
I'm not sure if it's easier to write the test, but I see your point that it might be easier to argue about it.
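For illustration, the explicit style discussed above could look roughly like the following. This is only a sketch in plain Python, not code from this PR: `partition_by_key` is a hypothetical stand-in for the shuffle, and `is_stable` is a hypothetical helper. Each input row carries its original position (`seq`), so the assertion can check directly that rows sharing a key keep their relative order in the output.

```python
from collections import defaultdict


def partition_by_key(rows, npartitions):
    """Hypothetical stand-in for a shuffle: route each (key, seq) row to an
    output partition chosen by its key, appending rows in arrival order."""
    out = defaultdict(list)
    for key, seq in rows:
        out[key % npartitions].append((key, seq))
    return dict(out)


def is_stable(partitions):
    """Return True if, for every key, the seq values appear in strictly
    increasing order, i.e. in the order the rows had in the input."""
    last_seen = {}
    for rows in partitions.values():
        for key, seq in rows:
            if key in last_seen and seq <= last_seen[key]:
                return False
            last_seen[key] = seq
    return True


# Explicit input: keys repeat, and seq records each row's original position.
rows = [(key, seq) for seq, key in enumerate([3, 1, 3, 2, 1, 3, 2, 1])]
partitions = partition_by_key(rows, npartitions=2)

# Assert directly on the output groups rather than via deduplication.
assert is_stable(partitions)
```

In an actual test, the rows would presumably come from the computed output partitions of the shuffled dataframe; only the assertion pattern is meant to carry over.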
Follow-up to #8453
Closes dask/dask#10708 by including regression test.
pre-commit run --all-files