
fix(client-reports): Record lost sample_rate events only if tracing is enabled #1268

Merged — 8 commits merged into master on Dec 10, 2021

Conversation
Conversation

sl0thentr0py (Member)

Previously, spurious client-report outcomes were recorded even when traces_sample_rate and traces_sampler were both None in the config.

Fixes https://getsentry.atlassian.net/browse/WEB-279
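
The change, roughly: lost "sample_rate" outcomes are now only recorded when tracing is actually configured. A minimal sketch of the guard, assuming a helper along the lines of sentry-python's has_tracing_enabled (treat the body as approximate, not the PR's exact code):

```python
def has_tracing_enabled(options):
    # Tracing counts as enabled only when at least one of the two
    # sampling options is actually set in the client config.
    return (
        options.get("traces_sample_rate") is not None
        or options.get("traces_sampler") is not None
    )

# ...and the "sample_rate" client report is then recorded only behind
# this check, instead of unconditionally, e.g.:
# if has_tracing_enabled(options):
#     record_lost_event("sample_rate", data_category="transaction")
```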

@sl0thentr0py sl0thentr0py requested review from a team, lobsterkatie and AbhiPrasad and removed request for a team December 6, 2021 14:54
@lobsterkatie (Member) left a comment


LGTM

sentry_sdk/tracing_utils.py — review thread resolved
tests/tracing/test_sampling.py — review thread resolved (outdated)
@rhcarvalho (Contributor) left a comment


Noticed we only look at traces_sampler and traces_sample_rate in isolation; we're missing a test case for when both are set, but I'm okay with leaving it out.

Only two minor points left:

  • Potentially stale comment
  • Potentially use mock.patch instead of monkeypatch (or a good reason to differ from the rest of the file) — see the sketch below for the two styles
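
For reference, the two patching styles in question — the patch target below is hypothetical, purely to show the contrast, not the file's actual code:

```python
from unittest import mock

def test_with_mock_patch():
    # unittest.mock style (presumably what the rest of the file uses,
    # per the comment above):
    with mock.patch("sentry_sdk.tracing.random.random", return_value=0.25):
        ...  # exercise the sampling decision

def test_with_monkeypatch(monkeypatch):
    # pytest's built-in fixture; equivalent effect, different idiom:
    monkeypatch.setattr("sentry_sdk.tracing.random.random", lambda: 0.25)
    ...  # exercise the sampling decision
```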

tests/tracing/test_sampling.py — review thread resolved
tests/tracing/test_sampling.py — review thread resolved (outdated)
sentry_sdk/tracing.py — review thread resolved
@rhcarvalho (Contributor)

We also need to investigate the failing tests:

[screenshot of failing CI tests]

@rhcarvalho (Contributor) left a comment


Thanks for the fix @sl0thentr0py!

An excerpt of the parametrized cases under review in tests/tracing/test_sampling.py:

```python
[
    (None, False, []),
    (0.0, False, [("sample_rate", "transaction")]),
    (1.0, True, []),
```
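
These tuples read as (traces_sample_rate, expected sampling decision, expected lost-event reports). A hedged reconstruction of the test shape they plausibly drive — names approximate, not the file's actual code:

```python
import pytest

@pytest.mark.parametrize(
    "traces_sample_rate,expected_decision,expected_reports",
    [
        (None, False, []),
        (0.0, False, [("sample_rate", "transaction")]),
        (1.0, True, []),
    ],
)
def test_records_lost_event_only_if_tracing_enabled(
    traces_sample_rate, expected_decision, expected_reports
):
    ...  # init the SDK with the given traces_sample_rate, start a
    # transaction, then assert the sampling decision and the recorded
    # client reports match the expectations.
```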
rhcarvalho (Contributor):

Out of curiosity, were there no tests covering this case before?

The key behavior this PR is changing is the first case, when traces_sample_rate is None.

I read the test names again, and while it's better now that each input and output is an explicit parameter, it's not obvious why only the 0.0/False case has a report (one needs to know that client reports exist to report "lost" events).

Maybe a clearer test would be one where there's always some lost event, and the expected output for the None case is that only the transaction-related report is skipped, but not one about errors. Is that covered somewhere else?
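
A hedged sketch of the shape being suggested — every case also drops an error via sample_rate=0.0 (an assumption for illustration), so an error report is always expected, and only the transaction report disappears when tracing is disabled:

```python
import pytest

@pytest.mark.parametrize(
    "traces_sample_rate,expected_reports",
    [
        # Tracing disabled: the error report survives, the transaction one is skipped.
        (None, [("sample_rate", "error")]),
        # Tracing configured but sampled out: both reports appear.
        (0.0, [("sample_rate", "error"), ("sample_rate", "transaction")]),
    ],
)
def test_error_reports_not_affected_by_tracing(traces_sample_rate, expected_reports):
    ...  # init with sample_rate=0.0 plus the given traces_sample_rate,
    # capture one error and one transaction, then compare the recorded
    # lost-event reports against expected_reports.
```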

sl0thentr0py (Member, Author):

No, that's not covered AFAIK; I'll add one.

One high-level comment I want to make here re: my personal testing philosophy is about the impossibility of testing all cases.
If we have 20 config parameters for sentry.init, each taking 2 values (boolean), all possible combinations of those amount to 2^20 = 1,048,576 total cases — even more if they are not booleans. And if any of those values is continuous, you theoretically need an infinite number of tests to cover every possible state of your program/API surface.
Given this theoretical state of affairs, as engineers we need to pick and choose where we write tests; not every test adds value or is worth adding.

I will add this one anyway, but I just wanted to clarify how I think about testing for the future.

rhcarvalho (Contributor):

I see your extrapolation. But we're not trying to test all combinations of inputs, rather behaviors, and there aren't that many interesting behaviors to test here.

For testing arbitrary inputs we can rely on other tools, such as fuzzing, rather than hand-written unit tests.
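
A sketch of that idea as a property-based test, using the hypothesis library (an assumption — nothing in this thread says the repo uses it); the helper mirrors the approximate guard sketched in the PR description above:

```python
from hypothesis import given, strategies as st

def has_tracing_enabled(options):
    # Same approximate helper as sketched earlier.
    return (
        options.get("traces_sample_rate") is not None
        or options.get("traces_sampler") is not None
    )

@given(st.one_of(st.none(), st.floats(min_value=0.0, max_value=1.0)))
def test_tracing_enabled_iff_sample_rate_set(rate):
    # Property: with no traces_sampler, tracing is enabled exactly when
    # a traces_sample_rate is provided, regardless of its value.
    options = {"traces_sample_rate": rate, "traces_sampler": None}
    assert has_tracing_enabled(options) == (rate is not None)
```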

sl0thentr0py (Member, Author):

> Maybe a clearer test would be one where there's always some lost event, and the expected output for the None case is that only the transaction-related report should be skipped,

TBH this is a fairly contrived case that also involves sample_rate, so I don't believe it belongs in the scope these tests are trying to cover.

@sl0thentr0py sl0thentr0py merged commit d09221d into master Dec 10, 2021
@sl0thentr0py sl0thentr0py deleted the fix-sample-client-reports branch December 10, 2021 11:50