Parse channel from activity-stream pings #1070

Open
jklukas opened this issue Jan 10, 2020 · 3 comments
Labels
pipeline metadata: Should be solved by capturing new metadata in JSON schemas

Comments

@jklukas
Contributor

jklukas commented Jan 10, 2020

Impression-stats and other docTypes have a top-level release field that the pipeline should use as the input to normalized_channel. Currently, normalized_channel is null for these pings.

@jklukas
Contributor Author

jklukas commented Mar 3, 2021

We should probably encode this in JSON schemas under mozPipelineMetadata as a new normalized_channel_source field and have the pipeline use that to decide where to look.

jklukas added the "pipeline metadata" label (Should be solved by capturing new metadata in JSON schemas) on Mar 3, 2021
@jklukas
Contributor Author

jklukas commented May 24, 2021

We also have cases where we want to use a static value for channel. For Fenix, we codify the value for app_channel in https://github.com/mozilla/probe-scraper/blob/main/repositories.yaml

We could represent that in the generated JSON schemas as a static value.

So perhaps we should have mozPipelineMetadata like the following:

"static_fields": {
  "attribute": "normalized_channel",
  "static_value": "release"
},
"fields_from_payload": {
  "attribute": "normalized_channel",
  "source_path": "#/channel"
}

That would make this more generally applicable than supporting only normalized_channel. We'd have to think carefully about the interface and what to call the fields.
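To make the idea concrete, here is a rough sketch of how a consumer might apply metadata shaped like the above. It is not how the pipeline works today. The function names `resolve_pointer` and `apply_metadata` are made up for illustration, and it assumes `source_path` is a simple `#/`-prefixed path of object keys. The precedence (a payload value overrides a static value for the same attribute) is also just an assumption.

```python
from typing import Any, Dict, Optional


def resolve_pointer(payload: Dict[str, Any], pointer: str) -> Optional[Any]:
    """Follow a simple '#/a/b' style path through nested dicts.

    Hypothetical helper: only object keys are supported, no array
    indices or escape sequences. Returns None if any key is missing.
    """
    if not pointer.startswith("#/"):
        raise ValueError(f"unsupported source_path: {pointer!r}")
    node: Any = payload
    for key in pointer[2:].split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def apply_metadata(metadata: Dict[str, Any], payload: Dict[str, Any]) -> Dict[str, str]:
    """Build message attributes from the proposed mozPipelineMetadata sections."""
    attributes: Dict[str, str] = {}

    static = metadata.get("static_fields")
    if static:
        attributes[static["attribute"]] = str(static["static_value"])

    from_payload = metadata.get("fields_from_payload")
    if from_payload:
        value = resolve_pointer(payload, from_payload["source_path"])
        # Assumption: a value found in the payload wins over a static one.
        if value is not None:
            attributes[from_payload["attribute"]] = str(value)

    return attributes
```

For example, with only the `fields_from_payload` section above and a payload of `{"channel": "beta"}`, `apply_metadata` would set `normalized_channel` to `beta`; with only `static_fields`, it would always be `release`.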

@jklukas
Contributor Author

jklukas commented May 25, 2021

Thinking more about the interface, this could be cast as attribute_mappings, similar to the existing jwe_mappings. Each mapping would have a required attribute field and then either a static_value or a source_path field.

For a value like normalized_channel, though, this isn't quite powerful enough. The source_path would generally point to a "raw" channel identifier; the value still needs to go through the NormalizeAttributes#channel logic. So I suppose we'd populate the app_update_channel attribute via this metadata, and still rely on the pipeline knowing that this is the attribute to use as the source for normalization.
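A sketch of the two-step flow described above, under stated assumptions: the mappings only populate the raw `app_update_channel` attribute, and a separate normalization step derives `normalized_channel` from it. The `normalize_channel` function here is a simplified stand-in, not the actual NormalizeAttributes#channel logic, and `lookup` and `apply_attribute_mappings` are hypothetical names.

```python
from typing import Any, Dict, List, Optional

# Assumption: an illustrative set of known channels, not the pipeline's real list.
KNOWN_CHANNELS = {"release", "esr", "beta", "aurora", "nightly"}


def normalize_channel(raw: Optional[str]) -> Optional[str]:
    """Simplified stand-in for the NormalizeAttributes#channel step."""
    if raw is None:
        return None
    return raw if raw in KNOWN_CHANNELS else "Other"


def lookup(payload: Dict[str, Any], source_path: str) -> Optional[Any]:
    """Resolve a '#/a/b' style path of object keys; None if missing."""
    if not source_path.startswith("#/"):
        raise ValueError(f"unsupported source_path: {source_path!r}")
    node: Any = payload
    for key in source_path[2:].split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def apply_attribute_mappings(
    mappings: List[Dict[str, Any]], payload: Dict[str, Any]
) -> Dict[str, str]:
    """Populate attributes from mappings that each carry 'attribute' plus
    exactly one of 'static_value' or 'source_path'."""
    attributes: Dict[str, str] = {}
    for mapping in mappings:
        has_static = "static_value" in mapping
        has_path = "source_path" in mapping
        if "attribute" not in mapping or has_static == has_path:
            raise ValueError(
                "each mapping needs 'attribute' and exactly one of "
                "'static_value' or 'source_path'"
            )
        if has_static:
            value = mapping["static_value"]
        else:
            value = lookup(payload, mapping["source_path"])
        if value is not None:
            attributes[mapping["attribute"]] = str(value)
    return attributes


# Usage: the mapping fills the raw attribute; normalization happens afterwards.
mappings = [{"attribute": "app_update_channel", "source_path": "#/release"}]
attributes = apply_attribute_mappings(mappings, {"release": "beta"})
normalized = normalize_channel(attributes.get("app_update_channel"))
```

The point of the split is that the schema metadata never writes `normalized_channel` directly; the pipeline still has to know that `app_update_channel` is the input to normalization.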
