Current approach to restarting data collection? #1758

Open
ShaheedHaque opened this issue Mar 27, 2024 · 1 comment
Labels
needs triage, support (A support question from a user)

Comments

@ShaheedHaque

Have you asked elsewhere?

The thread in https://stackoverflow.com/a/40518553/6332554 describes a way to "restart" data collection.

Describe your situation

Celery workers do not honour atexit handlers (see celery/celery#8923). I am trying a couple of approaches to deal with this; one is to use the Coverage API to dump the collected data while the worker runs. Since the worker runs my tasks, I can "easily" add the dumping code to the end of my task function.
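
To illustrate the "dump at the end of the task function" idea, here is a minimal sketch. The decorator name `flush_after_task` and the `flush` callable are hypothetical, not part of Celery or coverage.py; `flush` stands for whatever sequence ends up saving the collected data.

```python
import functools


def flush_after_task(flush):
    """Decorator factory: call `flush()` every time the wrapped task finishes.

    `flush` is any zero-argument callable (hypothetical here); in this
    issue it would be the code that saves the coverage data.
    """
    def decorator(task_func):
        @functools.wraps(task_func)
        def wrapper(*args, **kwargs):
            try:
                return task_func(*args, **kwargs)
            finally:
                # Runs whether the task returned or raised, so the dump
                # does not depend on an atexit handler being called.
                flush()
        return wrapper
    return decorator
```

Applied underneath the Celery task decorator, this would dump after every task invocation. How `flush` restarts collection afterwards is the open question in this issue.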

I therefore need incremental collection. I don't see an obvious solution using the current public API, but I do see the auto_load option on the constructor. Can it or the load() method be used to restart collection?

If not, is the method described in the Stack Overflow answer still current?

ShaheedHaque added the needs triage and support labels on Mar 27, 2024
@ShaheedHaque
Author

ShaheedHaque commented Mar 30, 2024

I was able to close #1454 by implementing ongoing dumping and restarting of data collection. My current approach, in lieu of the internal APIs used in the Stack Overflow answer, is something like this:

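    ctx.coverage.stop()
    ctx.coverage.save()
    # PID-based name with a monotonically increasing count as a suffix (the format is illustrative;
    # needs "import os" and a counter such as dump_count = itertools.count() created once).
    data_file = f".coverage.{os.getpid()}.{next(dump_count)}"
    # config_file=whatever is a placeholder for the actual config file path.
    ctx.coverage = _coverage.Coverage(data_file=data_file, data_suffix='cov', cover_pylib=False, config_file=whatever)
    ctx.coverage.start()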

This relies on the garbage collector cleaning up the old ctx.coverage instance, and it can create a lot of data files. Hence my interest in whether auto_load or load() (or indeed some other API) could be useful. Ideally, I would like something like:

    ctx.coverage.stop()
    ctx.coverage.save()
    ctx.coverage.resume()
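
For reference, the PID-plus-counter naming used in my current approach can be sketched as follows. The helper name `data_file_names` and the `.coverage` prefix are illustrative assumptions, not anything coverage.py requires.

```python
import itertools
import os


def data_file_names(prefix=".coverage"):
    """Yield names of the form '<prefix>.<pid>.<n>' for n = 0, 1, 2, ..."""
    for n in itertools.count():
        # The PID is read on each iteration, so a forked worker that
        # inherits this generator still produces names with its own PID.
        yield f"{prefix}.{os.getpid()}.{n}"
```

Each dump would then take `next(names)` as its `data_file`. The files accumulate, one per dump, which is the drawback I would like to avoid; they can presumably be merged afterwards with `coverage combine`.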
