
Limit threads usage in numpy during test to avoid time-out #4584

Open
wants to merge 8 commits into base: develop

Conversation

yuxuanzhuang
Contributor

@yuxuanzhuang yuxuanzhuang commented May 1, 2024

fix #4209

Changes made in this Pull Request:

  • make numpy use only one thread during tests
  • reduce the number of pytest workers
  • make tests lighter

Similar to #2950, allowing numpy to make full use of the available resources clogs the machine and likely leads to the occasional time-outs in multiprocessing-related tests.
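
For reference, a minimal sketch (illustrative only; this PR sets the limits via environment variables in the CI workflow files, not in Python) of how the same thread caps can be applied programmatically before numpy is imported:

import os

# Cap the BLAS/OpenMP thread pools before numpy (and its BLAS backend) is imported.
# These are the same variables exported in the CI changes reviewed below.
os.environ.setdefault("OMP_NUM_THREADS", "1")
os.environ.setdefault("MKL_NUM_THREADS", "1")
os.environ.setdefault("OPENBLAS_NUM_THREADS", "1")

import numpy as np  # noqa: E402  (import after the environment is configured)

# Heavy linear algebra now stays on a single thread instead of grabbing every core.
a = np.random.random((500, 500))
_ = a @ a.T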

See the comparison of test durations below; previously, the numpy calculations in test_encore took much longer.

New test durations

============================= slowest 50 durations =============================
11.82s call     MDAnalysisTests/analysis/test_contacts.py::TestContacts::test_n_initial_contacts[datafiles1-41814]
10.93s call     MDAnalysisTests/coordinates/test_gro.py::TestGROLargeWriter::test_writer_large
10.70s call     MDAnalysisTests/coordinates/test_pdb.py::test_conect_bonds_all
9.34s call     MDAnalysisTests/parallelism/test_multiprocessing.py::test_multiprocess_COG[u11]
9.02s call     MDAnalysisTests/utils/test_pickleio.py::test_fileio_pickle
8.53s call     MDAnalysisTests/analysis/test_gnm.py::test_closeContactGNMAnalysis
8.31s call     MDAnalysisTests/analysis/test_gnm.py::test_closeContactGNMAnalysis_weights_None
8.27s call     MDAnalysisTests/analysis/test_encore.py::TestEncore::test_ces_error_estimation_ensemble_bootstrap
7.62s call     MDAnalysisTests/coordinates/test_gro.py::TestGROLargeWriter::test_write_trajectory_universe
7.52s call     MDAnalysisTests/coordinates/test_gro.py::TestGROLargeWriter::test_write_trajectory_atomgroup

Old test durations (https://github.com/MDAnalysis/mdanalysis/actions/runs/8869719078/job/24350703650)

20.06s call     MDAnalysisTests/analysis/test_encore.py::TestEncore::test_hes_to_self
18.93s call     MDAnalysisTests/analysis/test_encore.py::TestEncore::test_hes
13.07s call     MDAnalysisTests/analysis/test_encore.py::TestEncore::test_ces_error_estimation_ensemble_bootstrap
12.93s call     MDAnalysisTests/analysis/test_encore.py::TestEncore::test_hes_custom_weights
12.80s call     MDAnalysisTests/coordinates/test_pdb.py::test_conect_bonds_all
12.12s call     MDAnalysisTests/analysis/test_contacts.py::TestContacts::test_n_initial_contacts[datafiles1-41814]
11.40s call     MDAnalysisTests/analysis/test_gnm.py::test_closeContactGNMAnalysis
10.79s call     MDAnalysisTests/coordinates/test_gro.py::TestGROLargeWriter::test_writer_large

I am going to run the CI three times.

Pass once: [screenshot of passing CI run]

Pass twice: [screenshot of passing CI run]

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developer's Certificate of Origin


📚 Documentation preview 📚: https://mdanalysis--4584.org.readthedocs.build/en/4584/

@yuxuanzhuang yuxuanzhuang marked this pull request as draft May 1, 2024 05:35

github-actions bot commented May 1, 2024

Linter Bot Results:

Hi @yuxuanzhuang! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location | Outcome
main package  | ✅ Passed
testsuite     | ⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/8936078582/job/24545744435


Please note: the black linter is purely informational; you can safely ignore these outcomes if there are no flake8 failures!


codecov bot commented May 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.23%. Comparing base (45fb664) to head (747bfb8).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4584      +/-   ##
===========================================
- Coverage    93.64%   93.23%   -0.42%     
===========================================
  Files          168       12     -156     
  Lines        21254     1079   -20175     
  Branches      3918        0    -3918     
===========================================
- Hits         19904     1006   -18898     
+ Misses         892       73     -819     
+ Partials       458        0     -458     


@pep8speaks

pep8speaks commented May 1, 2024

Hello @yuxuanzhuang! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 128:53: E231 missing whitespace after ','

Comment last updated at 2024-05-03 08:14:41 UTC

@yuxuanzhuang yuxuanzhuang changed the title Test to fix test time-out Limit threads usage in numpy during test to avoid time-out May 1, 2024
@yuxuanzhuang yuxuanzhuang marked this pull request as ready for review May 1, 2024 15:10
@yuxuanzhuang yuxuanzhuang marked this pull request as draft May 1, 2024 16:15
@yuxuanzhuang
Contributor Author

@orbeckst @IAlibay I now believe that these three adjustments can reduce the chance of timeouts:

  1. Limiting the number of threads used by numpy during testing. This also decreases the duration of tests that are heavily dependent on numpy and computationally intensive, e.g. in encore.

  2. Reducing the number of workers in pytest to prevent overloading the entire test runner.

  3. Lightening the tests. I have already modified some, but we can certainly further optimize the other time-consuming tests.

I have re-run the GitHub CI three times consecutively, and they all pass. I have also applied the patch on top of #4162, which is more likely to time out, in yuxuanzhuang#6, and it also seems to work well.

@yuxuanzhuang yuxuanzhuang marked this pull request as ready for review May 3, 2024 16:20
Member

@orbeckst orbeckst left a comment


Thank you for looking into these pesky issues. If your changes improve our CI, then I am all in favor.

I have one comment on how to set the number of used processes for pytest-xdist (see comments) but I am not going to block over it.

I am also shocked at how insensitive the ENCORE tests are: you changed them dramatically and they still pass without changing the reference values. Maybe a thing to raise on the mdaencore repo....

export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
# limit to 2 workers to avoid overloading the CI
export PYTEST_XDIST_AUTO_NUM_WORKERS=2
Member


Could this be set to auto, since that's what we are already using? Does auto not work correctly?

If auto does not work, then I'd prefer we define a variable GITHUB_CI_MAXCORES or similar and then use it when invoking pytest -n $GITHUB_CI_MAXCORES. I prefer command-line arguments over env vars for determining code behavior because you immediately see what affects the command itself.

Contributor Author


I was planning to find a way to determine the default number of workers pytest would use and reduce it by one, because the Ubuntu runner has 4 cores and the Mac runner has 3 cores, but I ended up setting it to 2 and found the performance acceptable.

Comment on lines +135 to +136
# limit to 2 workers to avoid overloading the CI
$env:PYTEST_XDIST_AUTO_NUM_WORKERS=1
Member


The comment is confusing: you set the workers to 1, but the comment says the limit is 2 workers. Please change the comment; otherwise, same as above: perhaps just add it to the command-line arguments?

Contributor Author


I forgot to correct the comment line when I realized the Azure runner has only 2 cores, so I changed the number of workers to 1.

$env:MKL_NUM_THREADS=1
# limit to 2 workers to avoid overloading the CI
$env:PYTEST_XDIST_AUTO_NUM_WORKERS=1

pytest MDAnalysisTests --disable-pytest-warnings -n auto --timeout=200 -rsx --cov=MDAnalysis
Member


Here we already have -n auto – how does this interact with PYTEST_XDIST_AUTO_NUM_WORKERS?

Contributor Author


pytest will read PYTEST_XDIST_AUTO_NUM_WORKERS when -n auto is given; it is equivalent to pytest -n $PYTEST_XDIST_AUTO_NUM_WORKERS.
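
To illustrate the lookup order described here, a rough sketch (not pytest-xdist's actual code; the function name resolve_num_workers is made up for illustration):

import os

def resolve_num_workers(n_option, available_cpus):
    # With "-n auto", pytest-xdist honours PYTEST_XDIST_AUTO_NUM_WORKERS if set,
    # otherwise it falls back to the number of available CPUs.
    # An explicit "-n <N>" bypasses the environment variable entirely.
    if n_option == "auto":
        env_value = os.environ.get("PYTEST_XDIST_AUTO_NUM_WORKERS")
        if env_value is not None:
            return int(env_value)  # e.g. 2 on the GitHub runners, 1 on Azure in this PR
        return available_cpus
    return int(n_option)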

@@ -142,12 +142,12 @@ def test_rmsd_matrix_with_superimposition(self, ens1):
conf_dist_matrix = encore.confdistmatrix.conformational_distance_matrix(
ens1,
encore.confdistmatrix.set_rmsd_matrix_elements,
select="name CA",
select="name CA and resnum 1:3",
Member


The final output is insensitive to just using resnum 1:3 as opposed to using the full protein??!?!? What are we even testing here? Are the encore tests so insensitive/random?

Contributor Author


These tests are not comparing against a reference array but against the RMSD calculated with rms.RMSD (see l150).

For the tests that do have a set of reference results, the outcome is indeed different if I use resnum 1:3.
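
For context, a rough sketch of the comparison pattern in that test (reconstructed from this discussion, not copied from test_encore.py; the file names, tolerances, and exact keyword arguments are illustrative assumptions):

from numpy.testing import assert_allclose
import MDAnalysis as mda
from MDAnalysis.analysis import encore, rms

# Hypothetical universe standing in for the ens1 fixture used in the test.
ens1 = mda.Universe("topology.pdb", "trajectory.xtc")

sel = "name CA and resnum 1:3"
conf_dist_matrix = encore.confdistmatrix.conformational_distance_matrix(
    ens1,
    encore.confdistmatrix.set_rmsd_matrix_elements,
    select=sel,
)

# The reference is computed on the fly with rms.RMSD on the same selection,
# so changing the selection changes both sides of the comparison.
reference = rms.RMSD(ens1, select=sel).run()
for i, (frame, time, rmsd) in enumerate(reference.results.rmsd):
    assert_allclose(conf_dist_matrix[0, i], rmsd, rtol=0, atol=1.5e-3)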


parallel_calculation = encore.utils.ParallelCalculation(function=function,
-                n_jobs=4,
+                n_jobs=2,
Member


good catch!!

Member


I assume that this change alone (together with restricting threads) accounts for a lot of improvement.

@orbeckst
Member

orbeckst commented May 3, 2024

@IAlibay would you please shepherd the PR to completion?

Member

@IAlibay IAlibay left a comment


Thanks for doing this @yuxuanzhuang

I'm unfortunately pretty swamped this weekend, so I can only very briefly look at this today, but I'll try to get back to you ASAP with a longer review and a better explanation of where I think those env exports should go so they apply globally.

@@ -108,6 +108,13 @@ jobs:
- name: run_tests
if: contains(matrix.name, 'asv_check') != true
run: |
export OPENBLAS_NUM_THREADS=1
Member


todo: there's another place to put this so it applies everywhere

@hmacdope
Member

@IAlibay with my PR shepherding hat on, just pinging here, no rush though 😄

@IAlibay IAlibay self-requested a review May 27, 2024 07:16

Successfully merging this pull request may close these issues.

[CI] Tests timing out occasionally