Account for time spent in garbage collection #3979

jobh · 2024-05-10T11:11:31Z

Avoid flaky DeadlineExceeded due to garbage collection.
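
For readers unfamiliar with the mechanism, here is a minimal sketch of how collection time can be measured and subtracted from a test's runtime, assuming CPython's `gc.callbacks` hook. The names `GCTimer` and `timed_call` are hypothetical and are not taken from this PR; the actual implementation may differ.

```python
import gc
import time


class GCTimer:
    """Accumulate wall-clock time spent inside garbage collections (hypothetical helper)."""

    def __init__(self):
        self.total = 0.0      # seconds spent in completed collections
        self._started = None  # start time of the collection in progress, if any

    def __call__(self, phase, info):
        # CPython calls each entry of gc.callbacks with phase "start" just
        # before a collection and "stop" just after it.
        if phase == "start":
            self._started = time.perf_counter()
        elif phase == "stop" and self._started is not None:
            self.total += time.perf_counter() - self._started
            self._started = None


def timed_call(fn, timer):
    """Return (result, runtime excluding GC time, GC time) for one call of fn."""
    gc_before = timer.total
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    gc_time = timer.total - gc_before
    return result, max(elapsed - gc_time, 0.0), gc_time


def make_garbage():
    # Build reference cycles, then collect explicitly so the demo is deterministic.
    junk = []
    for _ in range(1000):
        cycle = []
        cycle.append(cycle)
        junk.append(cycle)
    del junk
    gc.collect()
    return "done"


timer = GCTimer()
gc.callbacks.append(timer)
try:
    result, runtime, gc_time = timed_call(make_garbage, timer)
finally:
    # Always unregister, so the hook does not outlive the measurement.
    gc.callbacks.remove(timer)
```

A deadline check would then compare `runtime` rather than the raw elapsed time, so that a collection which happens to land inside one example does not count against it.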

hypothesis-python/src/hypothesis/core.py

Zac-HD

OK, I really like this approach! You've already handled most of the tricky double-counting concerns too, so I have fewer nitpicks than I feared 🙂

hypothesis-python/src/hypothesis/statistics.py

hypothesis-python/tests/cover/test_deadline.py

hypothesis-python/src/hypothesis/stateful.py

hypothesis-python/src/hypothesis/core.py

hypothesis-python/src/hypothesis/internal/conjecture/junkdrawer.py

hypothesis-python/docs/schema_observations.json

hypothesis-python/src/hypothesis/statistics.py

jobh · 2024-05-14T07:22:04Z

hypothesis-python/src/hypothesis/internal/scrutineer.py

@@ -114,6 +114,9 @@ def __exit__(self, *args, **kwargs):
    f"{sep}_pytest{sep}assertion{sep}rewrite.py",
    f"{sep}_pytest{sep}_io{sep}saferepr.py",
    f"{sep}pluggy{sep}_result.py",
+    # These are triggered by gc callbacks, for some reason
+    f"{sep}reprlib.py",
+    f"{sep}_pytest{sep}assertion{sep}util.py",


This makes the tests pass (fingers crossed), but is it just papering over something that should be fixed?

Note, the failures can be reproduced on master by adding

import gc; gc.callbacks.append(lambda *_: None)

at top level of conftest.py, so it's nothing to do with the other code changes.

Nah, this is totally fine - the whole purpose of this list is to suppress reports of unhelpful locations, and stdlib or pytest internals are almost always in that category. (at some point maybe we should generalize this further with an is_pytest_file helper?)

Would you support something like this: jobh@5ddfd2e

It's just adding some glob wildcards
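
As an illustration of the glob idea only (the linked commit is not reproduced here), a sketch using the standard library's `fnmatch`. The pattern list and the helper name `is_unhelpful_location` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns, loosely modelled on the suffix list in the diff above.
UNHELPFUL_GLOBS = (
    "*/reprlib.py",
    "*/_pytest/*",
    "*/pluggy/*",
)


def is_unhelpful_location(filename: str) -> bool:
    # Normalise Windows separators so one set of patterns covers both platforms.
    normalised = filename.replace("\\", "/")
    # fnmatchcase's "*" also matches "/", so "*/_pytest/*" covers nested files.
    return any(fnmatchcase(normalised, pattern) for pattern in UNHELPFUL_GLOBS)
```

With wildcards, a single entry such as `*/_pytest/*` would replace the growing list of individual pytest files, at the cost of also hiding any pytest location that might occasionally be informative.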

hypothesis-python/tests/conjecture/test_engine.py

jobh · 2024-05-14T11:55:07Z

All tests passed, finally. There were more complications than I expected; the two major ones are commented on above.

The scrutineer part is probably just due to more variation in traces, which can be seen as a good thing, as the variation covers more of real-world usage (although some globbing would be nice to have in that list).

The recursion depth thing ~~may or may not be a problem in practice, I just don't know. I set out to kill one source of flakiness, only to create another. Not happy with that one~~ is ok actually.

Anyway: Ready for re-review :-)

Zac-HD · 2024-05-14T17:12:36Z

(jumping on a flight to PyCon now, but the Scrutineer thing is not a problem, and I'll aim to review the unraisable problem later)

jobh · 2024-05-14T18:37:40Z

> (jumping on a flight to PyCon now, but the Scrutineer thing is not a problem, and I'll aim to review the unraisable problem later)

Thanks @Zac-HD! Have fun at PyCon!

Zac-HD · 2024-05-20T18:29:59Z

(just want to note that I haven't forgotten about this; it's just tricky enough that I need a dedicated chunk of time to think about it and make sure I've tested all the edge cases. thanks for your patience 🙏)

hypothesis-python/src/hypothesis/internal/conjecture/junkdrawer.py

and also handle it in scrutineer trace, since that can happen.

Zac-HD

Thanks @jobh, I've added a last test based on your repro and I think we're ready to merge!

Zac-HD · 2024-05-15T03:51:36Z

hypothesis-python/RELEASE.rst

+
+Account for time spent in garbage collection during tests, to avoid
+flaky DeadlineExceeded errors. Also fixes overcounting of stateful
+run times, introduced in PR #3890.


For the changelog, refer to version numbers rather than PRs - the former are much more meaningful for downstream users, even if we think in terms of PR references 🙂

Noted! I will put some prose in the RELEASE example to this effect.

hypothesis-python/tests/conjecture/test_engine.py

Zac-HD · 2024-05-15T16:01:19Z

hypothesis-python/RELEASE-sample.rst

-    (``https://hypothesis.readthedocs.io/en/latest/<chapter>.html#<anchor>``)
+    be ``package.function``, or :func:`~package.function` to show ``function``.
+  - :class:`package.class` for link to classes (abbreviated as above).
+  - :issue:`issue-number` for referencing issues, :pr:`pr-number` for pull requests.


oops, we only support :pull: at the moment. Happy to either change the docs, or add an alias in conf.py.

I will change the docs 👍

But either I or GitHub is confused. I didn't see this comment (or several others) before now, but they are timestamped "last week". I'll have a look through again to make sure everything is covered!

Zac-HD · 2024-05-15T16:04:24Z

hypothesis-python/RELEASE.rst

+flaky `DeadlineExceeded` errors as seen in :issue:`3975`. Also fixes
+overcounting of stateful run times resulting from :pr:`3890`.


Suggested change

-flaky `DeadlineExceeded` errors as seen in :issue:`3975`. Also fixes
-overcounting of stateful run times resulting from :pr:`3890`.
+flaky ``DeadlineExceeded`` errors as seen in :issue:`3975`.
+Also fixes double-counting of runtime for stateful rules, towards overall test execution runtime,
+a minor observability bug dating to :ref:`version 6.98.9 <v6.98.9>`.

hypothesis-python/src/hypothesis/internal/conjecture/junkdrawer.py

Zac-HD · 2024-05-23T17:11:50Z

Apparently GitHub didn't send my review correctly last week??? Sorry about that, I should have spotted it.

jobh · 2024-05-23T17:15:02Z

Aha, I guess the comments are batched up until the review is completed. No worries from my side, just a bit of extra work for you to fix the things I never did 😁

Zac-HD · 2024-05-23T17:18:38Z

I'm also somewhat concerned by the number of failures I'm seeing on e5a2390, which looked like a very small change - crashed worker on Windows and Linux, this weird error on several builds, prints-on-healthcheck tests on several builds...

It looks like several of these would be fixable, but I'm wondering whether the benefits of measuring GC time are actually worth the fragility vs e.g. running gc.collect() before each 100th test execution.
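
For comparison, the alternative mentioned here could look roughly like the sketch below. The runner `run_with_periodic_collection` and its parameters are hypothetical stand-ins, not Hypothesis internals.

```python
import gc


def run_with_periodic_collection(test_fn, n_examples, collect_every=100):
    """Call test_fn(i) for each example, forcing a full collection
    before every collect_every-th call. Returns the number of forced collections."""
    collections = 0
    for i in range(n_examples):
        if i % collect_every == 0:
            # Collect outside the region that would be timed for the deadline.
            gc.collect()
            collections += 1
        test_fn(i)
    return collections
```

This keeps the timing code untouched, but it only makes a collection inside a timed example less likely: automatic collections can still be triggered by allocations between the forced ones.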

jobh · 2024-05-23T17:26:36Z

> It looks like several of these would be fixable, but I'm wondering whether the benefits of measuring GC time are actually worth the fragility vs e.g. running gc.collect() before each 100th test execution.

Yep. Let's wait a bit with this, and then when I have more time I'll try to understand what happens and why. The actual execution-time change seems so small, but sometimes it just doesn't work.

jobh · 2024-05-24T08:04:17Z

I think it's ok actually.

The NaN infects every timing result afterwards, which is why deadlines fail. That's fixable.
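
To show why a single NaN is contagious, a small sketch follows. It is only an illustration of float arithmetic; the guard `accumulate` is a hypothetical helper, and the actual fix in this PR may take a different form.

```python
import math


def accumulate(total, sample):
    """Add a timing sample to a running total, skipping NaN samples."""
    if math.isnan(sample):
        return total
    return total + sample


samples = [0.25, float("nan"), 0.5]

# Naive summation: once NaN enters, every later total is NaN as well.
naive_total = sum(samples)

# Guarded summation: the NaN sample is dropped and the total stays usable.
guarded_total = 0.0
for sample in samples:
    guarded_total = accumulate(guarded_total, sample)
```

Because every arithmetic operation involving NaN yields NaN, one bad measurement in a running total affects all results derived from it afterwards.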

While I don't fully understand the crashes, I believe they are likely caused by memory exhaustion due to the new test's deep recursion inside gc. We need to do the recursion outside and then probe gc at each stack level, otherwise we're not testing the shallowest-level callback properly.

[edit: the test still crashes, even with no callback attached (i.e., no gc accounting)]

jobh requested a review from Zac-HD as a code owner May 10, 2024 11:11

jobh commented May 10, 2024

hypothesis-python/src/hypothesis/core.py (outdated, resolved)

jobh force-pushed the gc-accounting branch from 6f8129f to 05d8a51 Compare May 10, 2024 13:22

Zac-HD reviewed May 12, 2024

jobh force-pushed the gc-accounting branch 3 times, most recently from e452a0c to 3f79ceb Compare May 13, 2024 08:06

jobh commented May 13, 2024

hypothesis-python/docs/schema_observations.json (resolved)

jobh commented May 13, 2024

hypothesis-python/src/hypothesis/statistics.py (outdated, resolved)

jobh force-pushed the gc-accounting branch 2 times, most recently from c434f20 to 2e5e9cf Compare May 13, 2024 14:31

jobh commented May 14, 2024

jobh force-pushed the gc-accounting branch 3 times, most recently from 8d23490 to e580dff Compare May 14, 2024 08:15

jobh commented May 14, 2024

hypothesis-python/tests/conjecture/test_engine.py (outdated, resolved)

jobh force-pushed the gc-accounting branch 5 times, most recently from 1395d9d to 68e818a Compare May 14, 2024 10:34

jobh force-pushed the gc-accounting branch from 0d5593e to a3af8e8 Compare May 15, 2024 08:09

jobh added 4 commits May 15, 2024 10:31

Add failing test

a9b5f4d

Account for time in GC

1bd9c56

Add RELEASE.rst

b70a814

Describe observability semantics more precisely

8f295fb

jobh added 9 commits May 15, 2024 10:31

Mark another few "unhelpful locations" in scrutineer

7feb694

Revert statistics printout of gc time, it is nearly always zero

6f4151f

Another scrutineer ignore

b2ef7db

Don't fail on recursion error in gc callback

6b630b6

PyPy support

dfeba6b

Yet another scrutineer ignore

c3c7dee

Yet another scrutineer ignore

2b1df09

Ignore RecursionError in top level callback

91c3ef0

Follow best RELEASE.rst best practice

d3baf60

jobh force-pushed the gc-accounting branch from a3af8e8 to d3baf60 Compare May 15, 2024 08:37

Verify slowness of deadline test

38603f4

Comment and doc tweaks

d07a7ca

jobh commented May 23, 2024

hypothesis-python/src/hypothesis/internal/conjecture/junkdrawer.py (resolved)

Test RecursionError in GC hook

e5a2390

and also handle it in scrutineer trace, since that can happen.

Zac-HD approved these changes May 23, 2024

Zac-HD enabled auto-merge May 23, 2024 16:32

Zac-HD disabled auto-merge May 23, 2024 17:10

jobh force-pushed the gc-accounting branch 3 times, most recently from 2a943be to da3b950 Compare May 24, 2024 09:37

Try to make gc recursion test less fragile

8a688f4

jobh force-pushed the gc-accounting branch from da3b950 to 8a688f4 Compare May 24, 2024 10:00

Rework gc timing to require less stack

2ff5c6c

jobh force-pushed the gc-accounting branch from 7200c3d to 2ff5c6c Compare May 24, 2024 11:57
