feat: Track metrics around usage of the GitHub API #3273

Merged
merged 8 commits into from May 20, 2024

Conversation

bryanhuhta
Contributor

Related: https://github.com/grafana/pyroscope-squad/issues/138

We have metrics on how the VCS service itself behaves (latency, errors, etc.), but limited insight into our usage of the GitHub API. This PR tracks our GitHub API usage in two ways:

  • duration of requests
  • how close we are to being rate limited

@bryanhuhta bryanhuhta self-assigned this May 2, 2024
@bryanhuhta bryanhuhta requested a review from a team as a code owner May 2, 2024 22:54
@bryanhuhta bryanhuhta requested review from aleks-p and removed request for aleks-p May 2, 2024 22:54
Contributor

@simonswine simonswine left a comment


Thanks for looking into this.

I do think it might be easier to instrument at the *http.Client level with a round tripper (http.RoundTripper), as this would be far less prone to breaking during refactoring and has access to a lot more data.

},
[]string{"path", "status"},
),
APIRateLimit: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
Contributor


I worry that this metric might not be very helpful or accurate:

My understanding is that we are subject to the rate limits of the GitHub user who authenticated via OAuth2. So if two GitHub users are using the service at the same time, the value would constantly jump between their two remaining-request counts.

I do think this is something important to watch. Maybe this could be a histogram with fitting buckets, which would show us how close we got to being rate limited.

Contributor Author


You're absolutely right here. The rate limit remaining here is per-user (doc) since we're using a user token. If we had an installation token, the rate limit would be per app.

How I'm using the metric here is not helpful at all. Like you said, it's going to jump to whatever remaining rate limit the last token had. I'm going to rethink how to track this, as a gauge is clearly not the right approach.

),
APIRateLimit: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
Namespace: "pyroscope",
Name: "vcs_github_rate_limit",
Contributor


Suggested change
- Name: "vcs_github_rate_limit",
+ Name: "vcs_github_remaining_request_quota",

pkg/querier/vcs/client/github.go Outdated
Contributor

@simonswine simonswine left a comment


LGTM

@simonswine
Contributor

Thanks for this new revision. I have not fully tested it myself, but the code all looks correct.

@bryanhuhta bryanhuhta merged commit 2063655 into main May 20, 2024
16 checks passed
@bryanhuhta bryanhuhta deleted the gh-metrics branch May 20, 2024 15:57