
unittest failure for three encore.hes test (test_hes_to_self, test_hes, test_hes_align) #34

Open
archxlith opened this issue Nov 12, 2019 · 4 comments
Labels
tests Improvements or fixes to the tests.

Comments

@archxlith

Expected behavior

All tests pass.

Actual behavior

From test_encore.py the following tests fail: test_hes_to_self, test_hes, test_hes_align.

RHEL 7.7
py-hypothesis/4.7.2
py-mock/2.0.0
py-pbr/3.1.1
py-pytest/4.3.0
py-py/1.5.4
py-attrs/19.2.0
py-more-itertools/4.3.0
py-atomicwrites/1.1.5
py-pluggy/0.7.1
py-mpld3/0.3
py-jinja2/2.10
py-markupsafe/1.0
py-babel/2.6.0
py-joblib/0.11
python/3.6.8
py-numpy/1.16.2
py-six/1.12.0
py-biopython/1.73
py-networkx/2.2
py-decorator/4.3.0
py-griddataformats/0.5.0
py-scipy/1.2.1
py-gsd/1.9.3
py-mmtf-python/1.1.2
py-msgpack/0.6.2
py-matplotlib/3.0.2
py-setuptools/40.8.0
py-dateutil/2.5.2
py-pyparsing/2.3.1
py-pytz/2017.2
py-cycler/0.10.0
py-kiwisolver/1.0.1
py-pillow/5.4.1
py-seaborn/0.9.0
py-pandas/0.24.1
py-numexpr/2.6.9
py-bottleneck/1.2.1

Three tests from analysis/test_encore.py fail.

Code to reproduce the behavior

pytest --disable-pytest-warnings test_encore.py
==================================================================================== test session starts =====================================================================================
platform linux -- Python 3.6.8, pytest-4.3.0, py-1.5.4, pluggy-0.7.1
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/MDAnalysisTests-0.20.1/MDAnalysisTests/analysis/.hypothesis/examples')
rootdir: /MDAnalysisTests-0.20.1, inifile: setup.cfg
plugins: hypothesis-4.7.2
collected 46 items

test_encore.py .......FF.F....X..X.s......sssssss......ss.s.. [100%]

========================================================================================== FAILURES ==========================================================================================
________________________________________________________________________________ TestEncore.test_hes_to_self _________________________________________________________________________________

self = <MDAnalysisTests.analysis.test_encore.TestEncore object at 0x7f9ced8914a8>, ens1 = <Universe with 3341 atoms>

def test_hes_to_self(self, ens1):
    results, details = encore.hes([ens1, ens1])
    result_value = results[0, 1]
    expected_value = 0.
    assert_almost_equal(result_value, expected_value,
                      err_msg="Harmonic Ensemble Similarity to itself not zero: {0:f}".format(result_value))

E AssertionError:
E Arrays are not almost equal to 7 decimals
E Harmonic Ensemble Similarity to itself not zero: -1209691.728006
E ACTUAL: -1209691.728006141
E DESIRED: 0.0

test_encore.py:237: AssertionError
____________________________________________________________________________________ TestEncore.test_hes _____________________________________________________________________________________

self = <MDAnalysisTests.analysis.test_encore.TestEncore object at 0x7f9ce918c4e0>, ens1 = <Universe with 3341 atoms>, ens2 = <Universe with 3341 atoms>

def test_hes(self, ens1, ens2):
    results, details = encore.hes([ens1, ens2], weights='mass')
    result_value = results[0, 1]
    min_bound = 1E5
  assert result_value > min_bound, "Unexpected value for Harmonic " \
                                      "Ensemble Similarity: {0:f}. Expected {1:f}.".format(result_value, min_bound)

E AssertionError: Unexpected value for Harmonic Ensemble Similarity: -112355881.255052. Expected 100000.000000.
E assert -112355881.25505194 > 100000.0

test_encore.py:243: AssertionError
_________________________________________________________________________________ TestEncore.test_hes_align __________________________________________________________________________________

self = <MDAnalysisTests.analysis.test_encore.TestEncore object at 0x7f9cef6ea7f0>, ens1 = <Universe with 3341 atoms>, ens2 = <Universe with 3341 atoms>

def test_hes_align(self, ens1, ens2):
    # This test is massively sensitive!
    # Get 5260 when masses were float32?
    results, details = encore.hes([ens1, ens2], align=True)
    result_value = results[0,1]
    expected_value = 2047.05
    assert_almost_equal(result_value, expected_value, decimal=-3,
                      err_msg="Unexpected value for Harmonic Ensemble Similarity: {0:f}. Expected {1:f}.".format(result_value, expected_value))

E AssertionError:
E Arrays are not almost equal to -3 decimals
E Unexpected value for Harmonic Ensemble Similarity: 543454.295111. Expected 2047.050000.
E ACTUAL: 543454.2951113285
E DESIRED: 2047.05

test_encore.py:262: AssertionError
========================================================== 3 failed, 30 passed, 11 skipped, 2 xpassed, 4 warnings in 18.43 seconds ===========================================================
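For reference, the failing calls can also be reproduced outside pytest with a short script. This is only a sketch: it assumes the PSF/DCD/DCD2 files from MDAnalysisTests.datafiles are the trajectories loaded by the ens1/ens2 fixtures in test_encore.py.

import MDAnalysis as mda
from MDAnalysis.analysis import encore
# Assumption: these are the data files used by the ens1/ens2 fixtures.
from MDAnalysisTests.datafiles import PSF, DCD, DCD2

ens1 = mda.Universe(PSF, DCD)
ens2 = mda.Universe(PSF, DCD2)

# HES of an ensemble with itself should be close to zero (cf. test_hes_to_self).
results, details = encore.hes([ens1, ens1])
print("hes(ens1, ens1) =", results[0, 1])

# HES between two different ensembles should be large and positive (cf. test_hes).
results, details = encore.hes([ens1, ens2], weights='mass')
print("hes(ens1, ens2) =", results[0, 1])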

I tested with versions 0.20.1, 0.19.2, and 0.17.0 - the actual failing values are the same in all three.

Current version of MDAnalysis

  • Which version are you using? 0.20.1 (same issue with 0.19.2 and 0.17.0 - the only others I tested)
  • Which version of Python (python -V)? 3.6.8
  • Which operating system? RHEL 7.7
@archxlith archxlith changed the title unitest failure for three encore.hes test (test_hes_to_self, test_hes, test_hes_align) unittest failure for three encore.hes test (test_hes_to_self, test_hes, test_hes_align) Nov 12, 2019
@orbeckst
Member

The short answer is that failures in these tests are not uncommon and not something that we have been worrying too much about.

The long answer:

The encore tests are very brittle – see #38, #36, #35. This is because the algorithms only work properly with large data sets but seem to have high variance on small data sets. The algorithms also use random numbers (#37), so their behavior is not fully predictable (MDAnalysis/mdanalysis#1933). They mostly give consistent results on Travis CI, but even there we have occasional failures, which then get "fixed" by restarting the job.

We have not really come up with a good solution – suggestions welcome.
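In case the RNG non-determinism is part of the problem, one low-effort option – only a sketch, and it assumes the stochastic parts of encore draw from NumPy's legacy global random state, which would need to be verified – would be an autouse fixture in conftest.py that pins and restores the seed around each test:

import numpy as np
import pytest

# Sketch: pin NumPy's global RNG around each test so repeated runs are
# deterministic. Assumes encore's stochastic code draws from np.random's
# global state (to be verified against the encore sources).
@pytest.fixture(autouse=True)
def fixed_random_seed():
    state = np.random.get_state()
    np.random.seed(20191112)    # arbitrary fixed seed
    yield
    np.random.set_state(state)  # restore the global state afterwards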

@richardjgowers
Member

Tbh I'd happily make encore an external package once we've released 1.0. I think we originally kept it inside the package to maintain compatibility, but it has caused a lot of grief with the test failures.

@mtiberti

Hi, thanks for the report. I don't think these failures are specifically related to the RNG, but I can't reproduce them on my system (Ubuntu 18.04) using the same numpy and scipy versions. Is there any way I can test them without installing a RHEL VM from scratch?

About the tests: I can try to improve the random number generation set-up for DRES, as has been suggested in the past, and that should help with the occasional failures we get because of the small dataset. Would that help?
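For illustration only, that set-up could pass an explicit seeded Generator through the stochastic steps instead of relying on global state; the helper and its rng parameter below are hypothetical, not existing encore API:

import numpy as np

# Hypothetical helper, for illustration only (not existing encore API):
# draw bootstrap frame indices from an explicit Generator so that a test
# can make the resampling deterministic by fixing the seed.
def resample_frames(n_frames, n_samples, rng=None):
    rng = np.random.default_rng(rng)  # accepts None, an int seed, or a Generator
    return rng.integers(0, n_frames, size=(n_samples, n_frames))

# A test could then pin the seed explicitly:
indices = resample_frames(n_frames=100, n_samples=10, rng=20191112)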

@mtiberti

mtiberti commented Nov 18, 2019

@archxlith I tried installing a CentOS 7.7 64-bit VM and matched your environment as closely as I could in terms of Python and package versions, but I still can't reproduce your failures. Is there any special set-up or environment you're using? Would you expect any difference from trying this on RHEL proper?

@IAlibay IAlibay transferred this issue from MDAnalysis/mdanalysis Sep 6, 2023
@orbeckst orbeckst added the tests Improvements or fixes to the tests. label Nov 8, 2023