Update distributed CI test dependencies #445

Merged

6 commits merged into cloudpipe:master from distributed-ci-deps on Oct 11, 2021

Conversation

jrbourbeau
Member

This PR adds pytest-asyncio and pytest-rerunfailures as dependencies to the distributed downstream CI build.

pytest-asyncio is needed to run some distributed tests; otherwise they will be skipped. For example, looking at the CI builds in #432, we see this warning about lots of tests being skipped:

```
...
distributed/tests/test_utils.py: 1 warning
distributed/tests/test_utils_test.py: 4 warnings
distributed/tests/test_worker.py: 7 warnings
  /opt/hostedtoolcache/Python/3.7.11/x64/lib/python3.7/site-packages/_pytest/python.py:172: PytestUnhandledCoroutineWarning: async def functions are not natively supported and have been skipped.
  You need to install a suitable plugin for your async framework, for example:
    - anyio
    - pytest-asyncio
    - pytest-tornasync
    - pytest-trio
    - pytest-twisted
    warnings.warn(PytestUnhandledCoroutineWarning(msg.format(nodeid)))
```
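For context, this is the kind of coroutine test the warning refers to. The test below is a minimal illustrative sketch, not a test taken from distributed:

```python
# Illustrative only: a coroutine test of the kind pytest skips (with the
# warning above) unless an async plugin such as pytest-asyncio is installed.
import asyncio

import pytest


@pytest.mark.asyncio
async def test_sleep_returns_none():
    # With pytest-asyncio installed, this coroutine is actually awaited and run
    # instead of being skipped.
    assert await asyncio.sleep(0) is None
```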

pytest-rerunfailures was recently added as a test dependency for distributed to automatically re-run some known flaky tests.
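For reference, this is roughly how pytest-rerunfailures is used; the marker arguments below are illustrative and not necessarily the values distributed uses:

```python
# Sketch of pytest-rerunfailures usage; the reruns/reruns_delay values are
# illustrative, not copied from distributed's test suite.
import random

import pytest


@pytest.mark.flaky(reruns=3, reruns_delay=1)
def test_occasionally_fails():
    # pytest-rerunfailures retries this test up to three times (waiting one
    # second between attempts) before reporting it as a failure.
    assert random.random() < 0.9
```

The plugin also accepts a `--reruns N` command-line option to apply reruns globally rather than per test.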

cc @jakirkham

@codecov

codecov bot commented Sep 11, 2021

Codecov Report

Merging #445 (ac52c58) into master (fca7b22) will increase coverage by 0.16%.
The diff coverage is n/a.

```diff
@@            Coverage Diff             @@
##           master     #445      +/-   ##
==========================================
+ Coverage   92.36%   92.52%   +0.16%
==========================================
  Files           4        4
  Lines         720      709      -11
  Branches      150      155       +5
==========================================
- Hits          665      656       -9
+ Misses         34       32       -2
  Partials       21       21
```
| Impacted Files | Coverage Δ |
|---|---|
| cloudpickle/__init__.py | 100.00% <0.00%> (ø) |
| cloudpickle/cloudpickle.py | 88.53% <0.00%> (+0.22%) ⬆️ |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jakirkham added the `ci distributed` label (Signal the CI to run the test suite of distributed, a downstream project of cloudpickle) on Sep 11, 2021
@jakirkham
Member

Thanks James! 😀

Should we start running these skipped tests again?

Also added a label so the next commit pushed will run the Distributed test suite.

@jrbourbeau
Member Author

jrbourbeau commented Sep 11, 2021

> Also added a label so the next commit pushed will run the Distributed test suite.

👍

> Should we start running these skipped tests again?

Hrm, good question. It should be safe to run those tests again since they've both been marked as flaky in distributed and should be automatically rerun. I'll uncomment the current skips to confirm they pass here. I'm also going to add the --runslow flag used in distributed to unskip a bunch of other tests. Let's see how much longer that makes the distributed CI build take to run.
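For readers unfamiliar with the flag: a --runslow option is typically implemented with a small conftest.py hook like the sketch below, which follows the standard pattern from the pytest documentation; distributed's actual implementation may differ in its details.

```python
# conftest.py sketch of a --runslow option that unskips tests marked "slow".
# This is the pattern from the pytest docs, not code copied from distributed.
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--runslow", action="store_true", default=False, help="run tests marked as slow"
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        # --runslow given on the command line: do not skip slow tests.
        return
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```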

@jrbourbeau (Member Author) left a comment


I decided to punt on running all slow tests in distributed for now, since there are still a few flaky tests causing issues here. However, we can start running test_dont_steal_unknown_function and test_target_duration, as they've been taken care of in distributed.

@jakirkham
Member

Thanks for the update James! 😄 Yeah that makes sense

It looks like CI is encountering a couple of test failures, in particular test_profile_server and test_log_exception_on_failed_task. Do we know what is going on with those?

@jrbourbeau
Member Author

Marking test_profile_server as flaky over in dask/distributed#5375. test_log_exception_on_failed_task is marked with avoid_ci in distributed, so I pushed a commit here to avoid all tests marked with avoid_ci (TIL this is what we do in distributed). Let's see if that gives us a green check mark 🤞
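For context, deselecting tests that carry a marker like avoid_ci is usually done with pytest's `-m` expression. The test and invocation below are a hypothetical sketch, not distributed's or cloudpickle's actual CI configuration:

```python
# Hypothetical sketch: a test carrying an `avoid_ci` marker and a programmatic
# invocation that deselects such tests (equivalent to `pytest -m "not avoid_ci"`).
# In a real project the marker would also be registered in the pytest config
# to silence the unknown-marker warning.
import pytest


@pytest.mark.avoid_ci
def test_unfriendly_to_ci():
    assert True


if __name__ == "__main__":
    # pytest.main returns an exit code, which we propagate to the shell.
    raise SystemExit(pytest.main(["-m", "not avoid_ci"]))
```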

@jrbourbeau
Member Author

Marking test_worker_reconnects_mid_compute_multiple_states_on_scheduler as flaky in dask/distributed#5378

@jakirkham
Member

Just realized we hadn't re-run after merging that. Trying to rerun CI

@jrbourbeau
Member Author

Woo, all green ✅

Unfortunately distributed has several flaky tests right now. I'm going to push a couple of empty commits to see if repeated CI runs trigger another failure or not. FWIW I'm not sure how often downstream CI is run here, but if an unexpected failure pops up feel free to ping me. I'm happy to help out

@jakirkham
Member

Looks like the last commit also worked. Was there anything else you wanted to do here James? Or are we good to merge?

I think it is run mainly before releases, but it may be run on other changes if there are concerns that the changes will affect downstream projects.

@jakirkham
Member

Hmm...actually now seeing test_scheduler_address_config fail

@jrbourbeau
Member Author

That's unfortunate :/ I think we'll continue to have some flaky tests here in the distributed build until we're able to mark tests as flaky, xfail them, or fix them upstream in distributed. There's an ongoing effort to run the distributed test suite many times to get a better sense of which tests are flaky (xref dask/distributed#5378 (comment)). Based on those results we can then go through and start targeting specific tests. My sense is the changes here are a net benefit and this PR is probably good to merge.

@jakirkham
Member

@pierreglaser @ogrisel , do you have thoughts here?

@pierreglaser (Member) left a comment


LGTM, thanks a lot @jrbourbeau @jakirkham. Not many thoughts on this, apart from appreciating your work on making the distributed build more robust to random failures.

Will wait a few days (if @ogrisel wants to comment), and then feel free to merge @jakirkham

@ogrisel (Contributor) left a comment


> LGTM, thanks a lot @jrbourbeau @jakirkham. Not many thoughts on this, apart from appreciating your work on making the distributed build more robust to random failures.

Same feeling. Let's merge.

@ogrisel merged commit 7ddb089 into cloudpipe:master on Oct 11, 2021
@ogrisel
Contributor

ogrisel commented Oct 11, 2021

Thanks again @jrbourbeau!

@jakirkham
Member

Thanks all! 😄

@jrbourbeau
Member Author

Thanks all 🚀

@jrbourbeau deleted the distributed-ci-deps branch on October 12, 2021