Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add exponential moving average to vega-lite #9225

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

julieg18
Copy link

@julieg18 julieg18 commented Jan 12, 2024

PR Description

This PR adds a new parameter, aggregate_param, to aggregated field definitions in view-level transforms and encoding level transforms, allowing exponential moving average.

Closes #9200

Checklist

  • This PR is atomic (i.e., it fixes one issue at a time).
  • The title is a concise semantic commit message (e.g. "fix: correctly handle undefined properties").
  • yarn test runs successfully
  • For new features:
    • Has unit tests.
    • Has documentation under site/docs/ + examples.

Copy link
Author

@julieg18 julieg18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm new to vega/vega-lite and this is my first contribution, any comments to improve are welcome!

meas[field][op] = {aliases: new Set([as ? as : vgField(s, {forAs: true})])};

if (aggregate_param) {
meas[field][op].aggregate_param = aggregate_param;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only added aggregate_param to view-level transforms, but do we want included in encoding as well? Maybe something like

{
  "encoding": {
    "x": {
      "aggregate": {"exponential": 0.5},
    },
  }
}

I based the example off how the code works with argmax/argmin

Copy link
Member

@domoritz domoritz Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. I wonder whether we want to use a similar API for transforms.

"aggregate": [{
  "op": {"exponential": 0.5},
  "as": "exp_0.5"
}]

Current API in this pull request for reference

"aggregate": [{
  "op": "exponential"
  "aggregate_param": 0.5,
  "as": "exp_0.5"
}]

We often have more concise transform APIs than Vega and this would make the API more consistent between encodings and transforms.

However, it would be a bit less constant with the API in https://vega.github.io/vega-lite/docs/aggregate.html#argmax

    "aggregate": [{
      "op": "argmax",
      "field": "US Gross",
      "as": "argmax_US_Gross"
    }]

What do you think?

I'm somewhat leaning towards "op": {"exponential": 0.5} because the parameter is so closely tied to the exponential here and aggregate_param is not as meaningful of a description as field in the argmax case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arvind +1s "op": {"exponential": 0.5}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! I'll update encoding level transforms to have "aggregate": {"exponential": 0.5}, and view level transforms to have "op": {"exponential": 0.5}.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

Btw, one thing @arvind brought up is that aggregate_param (which is the term in Vega) can be confused with the param concept in Vega-Lite (which Vega doesn't have (yet)). Just wanted to note that here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! I'll update encoding level transforms to have "aggregate": {"exponential": 0.5}, and view level transforms to have "op": {"exponential": 0.5}.

Encoding level transforms and view level transforms are now updated.

@@ -27,7 +27,7 @@ import {DataFlowNode} from './dataflow';
import {isRectBasedMark} from '../../mark';
import {OFFSETTED_RECT_END_SUFFIX, OFFSETTED_RECT_START_SUFFIX} from './timeunit';

type Measures = Dict<Partial<Record<AggregateOp, Set<string>>>>;
type Measures = Dict<Partial<Record<AggregateOp, {aliases: Set<string>; aggregate_param?: number}>>>;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated our measures to hold the fieldnames (aliases) and the aggregate param (aggregate_param).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now aggregateParam

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. It's an internal name so it doesn't matter too much.

src/transform.ts Outdated
/**
* A parameter that can be passed to aggregation functions. The aggregation operation `"exponential"` requires it.
*/
aggregate_param?: number;
Copy link
Author

@julieg18 julieg18 Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aggregate_params can also be used in window transforms, but the type WindowTransform is not updated.

If I'm not misunderstanding something, I can open a pr to update WindowTransform so we can add exponential moving average to window transforms in vega-lite either in this pr or in a later one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. Yes, please send a pull request and I can merge it. We can add support for window aggregates in a follow up pull request.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Opened vega/vega#3874

@julieg18 julieg18 marked this pull request as ready for review January 12, 2024 16:06
@julieg18 julieg18 requested a review from a team as a code owner January 12, 2024 16:06
@julieg18 julieg18 marked this pull request as draft January 17, 2024 13:41
@julieg18

This comment was marked as resolved.

@julieg18 julieg18 marked this pull request as ready for review January 18, 2024 13:22
@julieg18
Copy link
Author

@domoritz, sorry for bothering you, but are there any issues with this PR that are stopping it from getting a review? If it's just a matter of time, no problem, just wanted to check if there was anything I needed to do on my end.

Copy link
Member

@domoritz domoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ping and your patience on this. I think we should resolve the discussion about view level transforms but otherwise this is great. Thanks for the pull request and offering to support window transforms as well.

meas[field][op] = {aliases: new Set([as ? as : vgField(s, {forAs: true})])};

if (aggregate_param) {
meas[field][op].aggregate_param = aggregate_param;
Copy link
Member

@domoritz domoritz Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. I wonder whether we want to use a similar API for transforms.

"aggregate": [{
  "op": {"exponential": 0.5},
  "as": "exp_0.5"
}]

Current API in this pull request for reference

"aggregate": [{
  "op": "exponential"
  "aggregate_param": 0.5,
  "as": "exp_0.5"
}]

We often have more concise transform APIs than Vega and this would make the API more consistent between encodings and transforms.

However, it would be a bit less constant with the API in https://vega.github.io/vega-lite/docs/aggregate.html#argmax

    "aggregate": [{
      "op": "argmax",
      "field": "US Gross",
      "as": "argmax_US_Gross"
    }]

What do you think?

I'm somewhat leaning towards "op": {"exponential": 0.5} because the parameter is so closely tied to the exponential here and aggregate_param is not as meaningful of a description as field in the argmax case.

src/transform.ts Outdated
/**
* A parameter that can be passed to aggregation functions. The aggregation operation `"exponential"` requires it.
*/
aggregate_param?: number;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. Yes, please send a pull request and I can merge it. We can add support for window aggregates in a follow up pull request.

{"price": 15.0, "year": 2023},
{"price": 10.19, "year": 2024},
{"price": 8.86, "year": 2024}
]
Copy link
Author

@julieg18 julieg18 Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I was planning on using some advanced data like data/stocks.csv for an example, but exponential was returning NaN whenever I wanted to use timeUnit in the x encoding.

Since the created spec in vega looked correct, could there be an error with how exponential works in vega or am I misunderstanding how to use timeUnit? Here's what I'm seeing on my local editor(vega 5.27.0)

image image

export interface AggregatedFieldDef {
/**
* The aggregation operation to apply to the fields (e.g., `"sum"`, `"average"`, or `"count"`).
* See the [full list of supported aggregation operations](https://vega.github.io/vega-lite/docs/aggregate.html#ops)
* for more information.
*/
op: AggregateOp;
op: AggregateFieldOp;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated AggregatedFieldDef to have its own type that set exponential as an object instead of a string.

This change caused:

  • type adjustments in some files (compositemark/boxplot.ts for example)
  • a change to docs types
  • needing to check for an exponential object whenever we are working with field definitions and handle accordingly (channeldef.ts for example)

@@ -893,7 +893,8 @@ export function verbalTitleFormatter(fieldDef: FieldDefBase<string>, config: Con
} else if (isArgminDef(aggregate)) {
return `${field} for min ${aggregate.argmin}`;
} else {
return `${titleCase(aggregate)} of ${field}`;
const aggregateOp = isExponentialDef(aggregate) ? 'exponential' : aggregate;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up adding a mixture of ternary operators and if statements into the code to handle exponential being an object which I know can increase code complexity. Please let me know if there are more clean ways to handle this :)

Copy link
Member

@domoritz domoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending this. I am on vacation for the next week so my review will take a while.

Maybe someone else from @vega/maintainers can review instead.

@@ -80,7 +93,7 @@
"description": "The data field for which to compute aggregate function. This is required for all aggregation operations except `\"count\"`."
},
"op": {
"$ref": "#/definitions/AggregateOp",
"$ref": "#/definitions/AggregateFieldOp",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test the docs as well? I see we use AggregateOp in https://github.com/julieg18/vega-lite/blob/261728bbeadb674041266dc3276ffd46a4c7510d/site/_data/link.yml#L61 so we need to make sure the links are still correct.

@domoritz domoritz requested a review from a team February 18, 2024 02:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add exponential moving average to vega-lite
2 participants