Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests have flakey coverage? #88

Open
belm0 opened this issue Nov 10, 2018 · 5 comments
Open

tests have flakey coverage? #88

belm0 opened this issue Nov 10, 2018 · 5 comments
Labels
bug Something isn't working
Milestone

Comments

@belm0
Copy link
Member

belm0 commented Nov 10, 2018

e.g. coverage tool reports regression on unrelated change:

https://coveralls.io/builds/20028763/source?filename=trio_websocket%2F_impl.py#L842

it's the except clause in _write_pending

if there are timing issues in the tests, that could explain intermittent errors like #68

@mehaase
Copy link
Contributor

mehaase commented Nov 12, 2018

Thank you for pointing this out, although I'm not sure how to interpret the coverage report.

screenshot from 2018-11-12 12-15-26

It says "uncov" on the side but the lines are green?

Anyway, you are right that the coverage of these lines depends on a race. Here's a build where the 3.6 tests covered these lines but the 3.7 lines did not. There are other builds where both 3.6 and 3.7 cover.

@mehaase
Copy link
Contributor

mehaase commented Nov 12, 2018

Here's an example where test_server_does_not_close_handshake coverage is flakey. The first time it covers the lines in question (842-846), then the next time it does not.

(venv) mhaase@MEH-MBP:/V/C/trio-websocket:master$
rm .coverage; and pytest -k test_server_does_not_close_handshake --cov=trio_websocket; and coverage report -m
=============== test session starts ===============
platform darwin -- Python 3.7.0, pytest-3.10.0, py-1.7.0, pluggy-0.8.0
rootdir: /Volumes/Case_Sensitive/trio-websocket, inifile: pytest.ini
plugins: trio-0.5.1, cov-2.6.0
collected 31 items / 30 deselected

tests/test_connection.py .                                                                                   [100%]

---------- coverage: platform darwin, python 3.7.0-final-0 -----------
Name                         Stmts   Miss  Cover
------------------------------------------------
trio_websocket/__init__.py       1      0   100%
trio_websocket/_impl.py        365    160    56%
trio_websocket/version.py        1      0   100%
------------------------------------------------
TOTAL                          367    160    56%


=============== 1 passed, 30 deselected in 0.24 seconds ===============
Name                         Stmts   Miss  Cover   Missing
----------------------------------------------------------
trio_websocket/__init__.py       1      0   100%
trio_websocket/_impl.py        365    160    56%   [SNIP] 808-809, 842-846, 852-854 [SNIP]
trio_websocket/version.py        1      0   100%
----------------------------------------------------------
TOTAL                          367    160    56%
(venv) mhaase@MEH-MBP:/V/C/trio-websocket:master$
rm .coverage; and pytest -k test_server_does_not_close_handshake --cov=trio_websocket; and coverage report -m
=============== test session starts ===============
platform darwin -- Python 3.7.0, pytest-3.10.0, py-1.7.0, pluggy-0.8.0
rootdir: /Volumes/Case_Sensitive/trio-websocket, inifile: pytest.ini
plugins: trio-0.5.1, cov-2.6.0
collected 31 items / 30 deselected

tests/test_connection.py .                                                                                   [100%]

---------- coverage: platform darwin, python 3.7.0-final-0 -----------
Name                         Stmts   Miss  Cover
------------------------------------------------
trio_websocket/__init__.py       1      0   100%
trio_websocket/_impl.py        365    156    57%
trio_websocket/version.py        1      0   100%
------------------------------------------------
TOTAL                          367    156    57%


=============== 1 passed, 30 deselected in 0.23 seconds ===============
Name                         Stmts   Miss  Cover   Missing
----------------------------------------------------------
trio_websocket/__init__.py       1      0   100%
trio_websocket/_impl.py        365    156    57%   [SNIP] 808-809, 846, 852-854 [SNIP]
trio_websocket/version.py        1      0   100%
----------------------------------------------------------
TOTAL                          367    156    57%

I checked to see if there's any other difference in coverage between these two runs:

trio_websocket/_impl.py        365    160    56%   605-606, 842-846
trio_websocket/_impl.py        365    156    57%   604, 846

Notice that when the _write_pending() exception lines are skipped, so are 605 and 606. Here are the relevant source lines.

595: async def send_message(self, message):
596-602: <docstring>
603:     if self._close_reason:
604:         raise ConnectionClosed(self._close_reason)
605:     self._wsproto.send_data(message)
606:     await self._write_pending()

So the race seems to be related to sending at the same time a connection is closed. Sometimes it raise in line 604, and sometimes it raises inside _write_pending(). I don't necessarily think either code path is wrong, but each test needs to have a deterministic shutdown.

@mehaase
Copy link
Contributor

mehaase commented Nov 14, 2018

I messed around with this a bit on its own, and also in conjunction with implementing timeouts. I have a few ideas about how to improve this situation:

  • Many of the WebSocketConnection methods have a guard condition like if self._close_reason: raise ConnectionClosed. This guard condition is unnecessary and just adds a second code path that is prone to race conditions, as seen in the example at the end of my previous post. Most of these guards can safely be removed.
  • A lot of the tests have race conditions themselves, e.g. a test creates a client and a server and relies on the automatically closing behavior, so both ends race to initiating the closing handshake. Each test needs to have a single, deterministic order for tearing down connections.
    • As an example, we can force the client to initiate closing by having the server call get_message() when no message is pending, which will raise ConnectionClosed when the client has sent the closing handshake.
    • Alternatively, tests can use Trio events to force an intended ordering of events.
  • I don't think it's possible to make network tests 100% determinstic, so a last resort we can add tests that force cover of problematic areas like lines 842-844 shown above. I.e. some tests may still have slightly different execution paths based on races, but at least one test is guaranteed to deterministically cover those lines. This doesn't prevent intermittent test failures, but it does ensure consistent coverage.

@belm0
Copy link
Member Author

belm0 commented Nov 14, 2018

trio.testing may help

https://trio.readthedocs.io/en/latest/reference-testing.html#inter-task-ordering

Essentially the same as using events.

@njsmith
Copy link
Member

njsmith commented Nov 14, 2018

I don't think it's possible to make network tests 100% determinstic, so a last resort we can add tests that force cover of problematic areas like lines 842-844 shown above.

This is what I usually do...

@mehaase mehaase added the bug Something isn't working label Nov 17, 2018
@mehaase mehaase added this to the 1.0 milestone Nov 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants