Start drafting timing histogram #109094

Closed


MikeSpreitzer
Member

@MikeSpreitzer MikeSpreitzer commented Mar 29, 2022

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR introduces a new variant of Prometheus histograms, intended to eventually replace the existing sample-and-watermark histograms in k/apiserver/pkg/util/flowcontrol with something simpler to consume and cheaper to update at runtime. Instead of counting discrete observations, this variant tracks how much time the relevant variable spends in each of the ranges defined by the bucket boundaries.

The hope is that this histogram variant will eventually migrate into Prometheus itself. Until then, it resides in k/component-base/metrics/prometheusextension.

Which issue(s) this PR fixes:

Special notes for your reviewer:

This is part of addressing #108272.

Does this PR introduce a user-facing change?

TBD

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note, kind/cleanup, size/XL, cncf-cla: yes, do-not-merge/needs-sig, needs-triage, needs-priority, sig/architecture, and sig/instrumentation labels and removed the do-not-merge/needs-sig label Mar 29, 2022
@MikeSpreitzer
Member Author

/sig api-machinery
/sig instrumentation
/cc @wojtek-t @beorn7 @lavalamp @deads2k @tkashem
/cc @logicalhan

@k8s-ci-robot added the sig/api-machinery label Mar 29, 2022
@MikeSpreitzer
Member Author

This will provide a better alternative to #109066.

limitations under the License.
*/

package prometheusextension
Member


At a high level I like it, but I would like someone more familiar with Prometheus to review it more deeply.

@dgrisonnet @logicalhan

@MikeSpreitzer
Member Author

FYI: I discovered while doing this that the protobuf declares the counter in each bucket to be an integer rather than a float: https://github.com/prometheus/client_model/blob/a863571f36499140c0dbeb5d704144490c6f59e6/go/metrics.pb.go#L403 .

@MikeSpreitzer
Member Author

MikeSpreitzer commented Mar 29, 2022

This is relevant to prometheus/client_golang#796 too.

Considering prometheus/client_golang#796 (comment) from @caibirdme, I am thinking of factoring this into two layers:

  1. a lower, more general layer in which Observe takes a "weight". As noted above, the weight will sadly have to be an integer.
  2. a higher layer that implicitly weights each Observe by the time.Duration to the next Observe (with partial update at Write time).

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: MikeSpreitzer
To complete the pull request process, please assign brancz, sttts after the PR has been reviewed.
You can assign the PR to them by writing /assign @brancz @sttts in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

@MikeSpreitzer: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-gce-ubuntu-containerd | 7d4dc66 | link | true | /test pull-kubernetes-e2e-gce-ubuntu-containerd |
| pull-kubernetes-integration | 7d4dc66 | link | true | /test pull-kubernetes-integration |
| pull-kubernetes-verify-govet-levee | 7d4dc66 | link | true | /test pull-kubernetes-verify-govet-levee |
| pull-kubernetes-node-e2e-containerd | 7d4dc66 | link | true | /test pull-kubernetes-node-e2e-containerd |
| pull-kubernetes-dependencies | 7d4dc66 | link | true | /test pull-kubernetes-dependencies |
| pull-kubernetes-unit | 7d4dc66 | link | true | /test pull-kubernetes-unit |
| pull-kubernetes-e2e-gce-100-performance | 7d4dc66 | link | true | /test pull-kubernetes-e2e-gce-100-performance |
| pull-kubernetes-e2e-kind | 7d4dc66 | link | true | /test pull-kubernetes-e2e-kind |
| pull-kubernetes-e2e-kind-ipv6 | 7d4dc66 | link | true | /test pull-kubernetes-e2e-kind-ipv6 |
| pull-kubernetes-conformance-kind-ga-only-parallel | 7d4dc66 | link | true | /test pull-kubernetes-conformance-kind-ga-only-parallel |
| pull-kubernetes-typecheck | 7d4dc66 | link | true | /test pull-kubernetes-typecheck |
| pull-kubernetes-verify | 7d4dc66 | link | true | /test pull-kubernetes-verify |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@MikeSpreitzer
Member Author

What is going wrong in the CI testing? It is full of errors that I do not understand. For example, https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/109094/pull-kubernetes-verify/1508780663120072704 leads off with

{Script Error ScriptError vendor/k8s.io/component-base/metrics/timing_histogram.go:22:2: cannot find package "github.com/blang/semver" in any of:
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/blang/semver (vendor tree)
	/usr/local/go/src/github.com/blang/semver (from $GOROOT)
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/github.com/blang/semver (from $GOPATH)

That import is not new with component-base/metrics/timing_histogram.go; it also appears in component-base/metrics/histogram.go. I see github.com/blang/semver in vendor/ and in vendor/modules.txt. What is going wrong here?

@cici37
Contributor

cici37 commented Mar 29, 2022

/triage accepted

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label Mar 29, 2022
@MikeSpreitzer
Member Author

See #109277

@logicalhan
Member

@MikeSpreitzer is this PR still alive?

@logicalhan
Member

/triage accepted

@wojtek-t
Member

wojtek-t commented Apr 8, 2022

#109277 is a much better approach. I'm closing this one and we will proceed with the other one.

@wojtek-t wojtek-t closed this Apr 8, 2022

5 participants