
fix: Second attempt at fixing trace propagation in Celery 4.2+ #831

Merged
untitaker merged 5 commits into master from fix/celery-trace-propagation-2 on Sep 23, 2020

Conversation

untitaker (Member) commented on Sep 22, 2020:

It turns out the bug is perfectly reproducible with the Redis backend, I just messed up the test.

apply_async was basically patched at the wrong point in time, and as a result the trace propagation didn't work.

Sorry y'all for dropping this on you with so little context; we can go through this tomorrow.

Follow-up to #824 and #825.
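As orientation for the diff below, here is a generic, hedged sketch of header-injecting apply_async instrumentation (illustrative only, not the SDK's actual code; the wrapper name and header value are made up):

# Generic illustration, not the PR's code: the wrapper has to be
# installed on the Task class attribute before Celery binds tasks;
# patch too late and callers keep a reference to the original,
# unpatched method, so no trace headers are ever injected.
from celery.app.task import Task

_original_apply_async = Task.apply_async

def _traced_apply_async(self, args=None, kwargs=None, **options):
    headers = options.get("headers") or {}  # tolerate headers=None
    headers["sentry-trace"] = "<current trace id>"  # placeholder value
    options["headers"] = headers
    return _original_apply_async(self, args, kwargs, **options)

Task.apply_async = _traced_apply_async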

- kwarg_headers = kwargs.setdefault("headers", {})
+ # Note: kwargs can contain headers=None, so no setdefault!
+ # Unsure which backend though.
+ kwarg_headers = kwargs.get("headers") or {}
  kwarg_headers.update(headers)
untitaker (Member, Author):

This line is currently crashing all the time (in capture_internal_exceptions) because the old code could not deal with kwargs == {"headers": None}, only kwargs == {}.
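A minimal reproduction of the difference, using only what the diff above shows:

# Old code: setdefault only inserts the default when the key is absent,
# so an explicit headers=None comes straight back out as None.
kwargs = {"headers": None}
kwarg_headers = kwargs.setdefault("headers", {})   # -> None, not {}
# kwarg_headers.update(...) then raises:
# AttributeError: 'NoneType' object has no attribute 'update'

# New code: `or {}` handles both a missing key and an explicit None.
kwarg_headers = kwargs.get("headers") or {}        # -> {}
kwarg_headers.update({"sentry-trace": "..."})      # fine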

kwarg_headers.update(headers)

# https://github.com/celery/celery/issues/4875
#
# Need to setdefault the inner headers too since other
# tracing tools (dd-trace-py) also employ this exact
# workaround and we don't want to break them.
#
untitaker (Member, Author):

Turns out it is perfectly reproducible. The bug we're working around lives in celery.app.amqp, but it seems that module is used in redis too??? tbh I no longer understand how celery separates concerns.
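The line this comment block introduces is cut off in the snippet above. As a hedged reconstruction based on the comment (not verbatim PR code), the celery/celery#4875 workaround duplicates the trace headers into a nested inner dict:

# Hedged reconstruction, not verbatim PR code: Celery 4.2+'s amqp layer
# can nest user-supplied headers under an inner "headers" key, so the
# trace headers are written into that inner dict as well. setdefault is
# used (instead of overwriting) so that other tools relying on the same
# workaround, like dd-trace-py, keep working.
kwarg_headers = kwargs.get("headers") or {}
kwarg_headers.update(headers)
kwarg_headers.setdefault("headers", {}).update(headers)
kwargs["headers"] = kwarg_headers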

@@ -42,6 +42,7 @@ def inner(propagate_traces=True, backend="always_eager", **kwargs):

      # this backend requires capture_events_forksafe
      celery.conf.worker_max_tasks_per_child = 1
+     celery.conf.worker_concurrency = 1
untitaker (Member, Author):

This speeds up tests; otherwise celery forks 20 times.
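For context (an assumption about Celery's defaults, not something stated in the PR): the prefork pool defaults worker_concurrency to the number of CPUs, and worker_max_tasks_per_child = 1 recycles each child after a single task, so an unpinned run keeps re-forking:

# Assumed defaults, for illustration: without the second line, a machine
# with N CPUs spawns N children, each replaced after every single task.
celery.conf.worker_max_tasks_per_child = 1  # required by capture_events_forksafe
celery.conf.worker_concurrency = 1          # at most one live child at a time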

Contributor:

Sounds good from the perspective of speeding up tests -- won't this have an effect on the bugs and code paths surfaced by tests, though?

untitaker (Member, Author):

So far we haven't had any bugs related to this, and my gut feeling tells me no. There is no direct communication between the forked processes, so I'd say it's unlikely, and the tests will never send high event volumes anyway.

- @request.addfinalizer
- def _():
-     assert not in_process_events
+ capture_events()
untitaker (Member, Author) commented on Sep 22, 2020:

Need to remove this assertion because we actually do have events in the same process now.

We need to call capture_events, as otherwise our transport will raise an error (this fixture setup is overdue for a refactor...)
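A hedged sketch of the resulting fixture shape (names taken from the snippets above; the body is illustrative, not the repo's exact code):

def inner(propagate_traces=True, backend="always_eager", **kwargs):
    ...  # celery app / worker configuration elided
    # Install the test transport up front: events can now arrive
    # in-process, so the old `assert not in_process_events` finalizer
    # is gone, and without this call the transport would raise.
    capture_events()
    ...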

untitaker merged commit 4bf4859 into master on Sep 23, 2020
untitaker deleted the fix/celery-trace-propagation-2 branch on September 23, 2020 at 14:33