Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sentry session tracking #4157

Merged
merged 12 commits into from Sep 6, 2022
Merged

Add sentry session tracking #4157

merged 12 commits into from Sep 6, 2022

Conversation

raubitsj
Copy link
Member

@raubitsj raubitsj commented Aug 22, 2022

Description

Problem:
We have information about failures seen by wandb library users but we don't have a good way of knowing the percentage of users that encounter a given problem. The goal is to calculate a problem rate metric:

ProblemRate = NumberOfUsersExperiencingAProblem / TotalNumberOfUsers

The service sentry.io collects problems, and number of users. but we have a blind spot within sentry about total number of users. We have other data sources which we could use to get this denominator but then the way data is being captured could be inconsistent. The larger project to work on this is/will be: Internal Notion Doc

Sentry has two endpoints we would like to use:

Currently we are only using the store endpoint.
Envelope can be used to send session informaiton.

Caveats:

  • The default thread which saves session information runs with a sleep timer of 60 seconds and might prevent runs shorter than 60 seconds from being captured
  • We might need to be careful how we define a session, is the internal process and user process a single session, is it ok if they are two sessions? The ProblemRate per user should be not impacted by that distinction
  • Should we use the sentry auto session interface?

Partially answered questions:

  • How was this working before? Before June 2022 we were using the global sentry hub so if our user was also using sentry they might have configured it to use a session which would explain why we had data for a while.

Possible future changes:

  • We might change our sentry sdk project so we have cleaner data once this change goes out and is good (we already have sdk-old)
  • We might want to add our own exception endpoint so we can collect data more inline. We will use sentry for digging into details but use our own internal metric to track improvements over time. We should validate that they give similar answers or understand why they dont.
  • If we are adding our own exception endpoint, we might add a telemetry endpoint to keep track of other situations like how often certain features are used in the command line.

TODO:

  • Debug some more
  • Add envelope support to mockserver / yea

Requires:

Testing

Will be tested with yea

Checklist

  • Include reference to internal ticket "Fixes WB-NNNN" (and github issue "Fixes #NNNN" if applicable)

@codecov
Copy link

codecov bot commented Aug 22, 2022

Codecov Report

Merging #4157 (99de28c) into master (54d64ef) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4157      +/-   ##
==========================================
+ Coverage   82.63%   82.66%   +0.02%     
==========================================
  Files         256      256              
  Lines       32608    32610       +2     
==========================================
+ Hits        26945    26956      +11     
+ Misses       5663     5654       -9     
Flag Coverage Δ
functest 55.96% <100.00%> (+0.03%) ⬆️
unittest 73.41% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
wandb/util.py 87.27% <100.00%> (+0.02%) ⬆️
wandb/integration/tensorboard/monkeypatch.py 90.12% <0.00%> (-2.47%) ⬇️
wandb/sdk/wandb_run.py 91.06% <0.00%> (-0.07%) ⬇️
wandb/sdk/lib/git.py 79.01% <0.00%> (ø)
wandb/sdk/wandb_setup.py 88.94% <0.00%> (+0.50%) ⬆️
wandb/sdk/internal/internal_api.py 86.98% <0.00%> (+0.51%) ⬆️
wandb/sdk/launch/agent/agent.py 92.25% <0.00%> (+1.40%) ⬆️
wandb/sdk/internal/meta.py 90.79% <0.00%> (+3.06%) ⬆️

Copy link
Member

@dmitryduev dmitryduev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this easy?? sick!!

@raubitsj raubitsj added this to the sdk-2022-09.1 milestone Sep 1, 2022
@raubitsj raubitsj marked this pull request as ready for review September 6, 2022 20:13
Copy link
Member

@dmitryduev dmitryduev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@raubitsj raubitsj merged commit 8f9177a into master Sep 6, 2022
@raubitsj raubitsj deleted the sentry-session branch September 6, 2022 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants