Add Changelog section to Glean Dictionary for metrics and pings #1519

Open
Dexterp37 opened this issue Nov 18, 2022 · 8 comments

@Dexterp37
Contributor

Originally filed here by @rebecca-burwei


With Sponsored Tiles, Suggest and Urlbar on Firefox Desktop (Legacy telemetry), the implementing engineers write changelog-style docs for the probes. Here are a few examples that illustrate what the changelog looks like and why this is necessary:

The Glean Dictionary should strongly encourage changelog-style docs. The open-ended commentary section is not sufficiently opinionated.

@Dexterp37
Contributor Author

@rebecca-burwei (or @perrymcmanis144, since you were cc'd on the initial bug), would you kindly expand a bit on what you use the information for and why it's necessary or helpful? Thanks!

@rebecca-burwei

Probes change frequently (for evidence, see any of the examples I linked above). It's impossible to interpret our historical data without knowing how the probes have changed (for evidence, see any of the examples I linked above).

@Dexterp37
Contributor Author

Probes change frequently (for evidence, see any of the examples I linked above). It's impossible to interpret our historical data without knowing how the probes have changed (for evidence, see any of the examples I linked above).

@rebecca-burwei the changelogs mostly report "thing X was added, thing Y was removed in version Z". Is that good enough? I'm trying to get a better understanding of what you are specifically looking for, because there might be an opportunity for us to automate this.

@rebecca-burwei

Yes -- that's exactly what's needed. :) We need to answer questions like:

  • Are the NULLs we see in the data legitimate because we are no longer sending that type of data in the probe? Or are the NULLs an indication of a different problem? "After version V, value Y was removed" will answer questions like this.
  • How do we track values in categorical data over time for consistent analyses? For example, "after version V, value X is now split out into Y and Z instead", or "after version V, Y and Z are combined into X".
  • When default values change: "Before version V, the default value for this probe was True. After version V, the default value is False." or "The conditions under which a default value is sent have changed".
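The questions above could be answered mechanically if changelog entries carried a little structure. The following is a minimal sketch, not an existing Glean Dictionary feature: the `ChangeEntry` class, its `kind` labels, and the example entries are all hypothetical, and `version` is assumed to mean the first version in which a change is in effect.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChangeEntry:
    """One hypothetical changelog entry for a probe."""
    version: int  # first version in which the change is in effect (assumption)
    kind: str     # hypothetical labels: "added", "value_split", "value_removed", ...
    note: str     # free-text explanation written by the implementing engineer


def changes_up_to(changelog: List[ChangeEntry], version: int) -> List[ChangeEntry]:
    """Return the entries in effect at `version`, oldest first."""
    return sorted(
        (entry for entry in changelog if entry.version <= version),
        key=lambda entry: entry.version,
    )


def null_is_expected(changelog: List[ChangeEntry], version: int) -> bool:
    """True if some value was recorded as removed at or before `version`.

    This is deliberately crude: it only says that a removal is on record,
    so NULLs in that version range are not automatically a sign of a bug.
    """
    return any(
        entry.kind == "value_removed"
        for entry in changes_up_to(changelog, version)
    )


# Hypothetical entries, phrased like the examples in this comment.
EXAMPLE_CHANGELOG = [
    ChangeEntry(1, "added", "probe implemented for the first time"),
    ChangeEntry(5, "value_split", "value X is now split out into Y and Z"),
    ChangeEntry(7, "value_removed", "value Y was removed"),
]
```

With these entries, `null_is_expected(EXAMPLE_CHANGELOG, 6)` is false and `null_is_expected(EXAMPLE_CHANGELOG, 7)` is true, which is the distinction the first question asks for.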

It's very easy to develop misleading analyses without this information. @perrymcmanis144 did indicate that some of this might be automate-able.

Even if nothing else is possible, a simple changelog would help:

  • Version 1: probe implemented for the first time
  • Version 5: probe changed
  • Version 7: probe changed

Even a changelog that simple would be an improvement because it would turn "unknown unknowns" into "known unknowns", and we could go digging in other docs as needed.
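As a rough illustration of how even this bare-bones changelog helps, the sketch below flags whether a probe changed between two versions being compared. The `SIMPLE_CHANGELOG` mapping and the `changes_between` helper are hypothetical names chosen for this example, and plain integer versions are an assumption.

```python
from typing import Dict, List, Tuple

# The three-line changelog from this comment, keyed by version.
SIMPLE_CHANGELOG: Dict[int, str] = {
    1: "probe implemented for the first time",
    5: "probe changed",
    7: "probe changed",
}


def changes_between(
    changelog: Dict[int, str], start: int, end: int
) -> List[Tuple[int, str]]:
    """Return changes that took effect after `start`, up to and including `end`."""
    return [(v, changelog[v]) for v in sorted(changelog) if start < v <= end]


# Comparing data from version 4 with data from version 6 crosses one change,
# so the analyst knows to go digging before trusting the comparison.
crossed = changes_between(SIMPLE_CHANGELOG, 4, 6)
```

An empty result means no recorded change separates the two versions; a non-empty one is the "known unknown" to follow up on.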

@rebecca-burwei

Oh one more that would be very helpful:
"After version V, this probe is turned on/off by default for all clients on channels A, B, C."

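A per-channel default like the one described above could be recorded and queried roughly as follows. This is only a sketch under stated assumptions: the `DefaultChange` record, the channel names, and the version numbers are hypothetical, and nothing here reflects how Firefox or Glean actually stores this information.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class DefaultChange(NamedTuple):
    """Hypothetical record: from `version` on, the probe default on `channels` is `enabled`."""
    version: int
    channels: FrozenSet[str]
    enabled: bool


def enabled_by_default(
    changes: List[DefaultChange], channel: str, version: int
) -> Optional[bool]:
    """Return the recorded default for `channel` at `version`, or None if nothing is recorded."""
    state: Optional[bool] = None
    for change in sorted(changes, key=lambda c: c.version):
        if change.version <= version and channel in change.channels:
            state = change.enabled  # later entries override earlier ones
    return state


# Hypothetical history: on for nightly and beta from version 3, off for beta from version 8.
EXAMPLE_DEFAULTS = [
    DefaultChange(3, frozenset({"nightly", "beta"}), True),
    DefaultChange(8, frozenset({"beta"}), False),
]
```

Returning `None` rather than `False` when no entry applies keeps "off by default" distinct from "no information recorded".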
@Dexterp37
Contributor Author

@rebecca-burwei that's great context, thanks!

@Iinh @badboy do you think the above could be achieved via the probe info service data?

@badboy
Member

badboy commented Nov 21, 2022

This partly overlaps with glean-annotations, which already allows adding further (unstructured) data to metrics.

Additionally, metrics can have a version field (example, in use here). The dictionary currently does not show this version. Mapping that metric version to the app's version is currently not (automatically) possible. An app doesn't communicate in its payload which particular metric versions it contains.

To me it's not 100% clear what kind of structured data we'd want to enforce here.
In the end a user will have to digest this information anyway and interpret it when looking at data.
This could therefore become a documented convention for the description field.
Or, if we want to make tooling a little more aware of it, a new changelog field under the metadata of each metric.
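To make the second option concrete, here is a sketch of what tooling might check if such a field existed. No `changelog` field exists under `metadata` today; its shape (`metric_version`, `note`), the example metric, and the `validate_changelog` helper are all assumptions made for illustration. Entries are keyed by the metric's own version field, since mapping that to an app version is not automatically possible.

```python
from typing import Any, Dict, List

# A metric definition as a plain dict, loosely mirroring a YAML entry.
# The `changelog` list under `metadata` is hypothetical.
EXAMPLE_METRIC: Dict[str, Any] = {
    "type": "counter",
    "version": 2,
    "description": "Hypothetical metric used only for this sketch.",
    "metadata": {
        "changelog": [
            {"metric_version": 1, "note": "metric implemented for the first time"},
            {"metric_version": 2, "note": "default value changed from True to False"},
        ]
    },
}


def validate_changelog(metric: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means these checks pass."""
    problems: List[str] = []
    entries = metric.get("metadata", {}).get("changelog", [])
    for i, entry in enumerate(entries):
        if not isinstance(entry.get("metric_version"), int):
            problems.append(f"entry {i}: metric_version must be an integer")
        if not entry.get("note"):
            problems.append(f"entry {i}: note must be non-empty")
    versions = [
        e["metric_version"]
        for e in entries
        if isinstance(e.get("metric_version"), int)
    ]
    if versions != sorted(versions):
        problems.append("entries must be ordered by metric_version")
    return problems
```

A documented convention for the description field would need no such check, which is the trade-off between the two options.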

@badboy
Member

badboy commented Nov 28, 2022

Couple more thoughts I had on this:

  • Do we already know of Glean metrics that changed and where the change was documented (other than my example above)?
  • Frequently changing metrics might indicate that the design of said metric is suboptimal. That's not to say it can't happen, but it might be a good point to take a step back and redesign that metric.
  • Not all metrics are defined in the final app. Documenting their behavior across versions will be more difficult to communicate.
  • Changes like "turned off by default on channel A, B, C after date X" are exactly what we built glean-annotations for: changeable docs that don't require a lengthy pull request workflow and editing a YAML file.
