[plugin-aws-ec2-ebs] fix misuse of period #915
  "ec2.ebs.bandwidth.#.read": cloudWatchSetting{
      MetricName: "VolumeReadBytes", Statistics: "Sum",
-     CalcFunc: func(val float64, period float64) float64 { return val / period },
+     CalcFunc: valuePerSec,
  },
  "ec2.ebs.bandwidth.#.write": cloudWatchSetting{
      MetricName: "VolumeWriteBytes", Statistics: "Sum",
-     CalcFunc: func(val float64, period float64) float64 { return val / period },
+     CalcFunc: valuePerSec,
  },
mackerel-agent-plugins/mackerel-plugin-aws-ec2-ebs/lib/aws-ec2-ebs.go
Lines 105 to 112 in 1be28e2
"ec2.ebs.bandwidth.#": {
    Label: "EBS Bandwidth",
    Unit: "bytes/sec",
    Metrics: []mp.Metrics{
        {Name: "read", Label: "Read", Diff: false},
        {Name: "write", Label: "Write", Diff: false},
    },
},
  "ec2.ebs.throughput.#.read": cloudWatchSetting{
      MetricName: "VolumeReadOps", Statistics: "Sum",
-     CalcFunc: func(val float64, period float64) float64 { return val / period },
+     CalcFunc: valuePerSec,
  },
  "ec2.ebs.throughput.#.write": cloudWatchSetting{
      MetricName: "VolumeWriteOps", Statistics: "Sum",
-     CalcFunc: func(val float64, period float64) float64 { return val / period },
+     CalcFunc: valuePerSec,
  },
mackerel-agent-plugins/mackerel-plugin-aws-ec2-ebs/lib/aws-ec2-ebs.go
Lines 113 to 120 in 1be28e2
"ec2.ebs.throughput.#": {
    Label: "EBS Throughput (op/s)",
    Unit: "iops",
    Metrics: []mp.Metrics{
        {Name: "read", Label: "Read", Diff: false},
        {Name: "write", Label: "Write", Diff: false},
    },
},
  "ec2.ebs.latency.#.read": cloudWatchSetting{
      MetricName: "VolumeTotalReadTime", Statistics: "Average",
-     CalcFunc: func(val float64, period float64) float64 { return val * 1000 },
+     CalcFunc: sec2msec,
  },
  "ec2.ebs.latency.#.write": cloudWatchSetting{
      MetricName: "VolumeTotalWriteTime", Statistics: "Average",
-     CalcFunc: func(val float64, period float64) float64 { return val * 1000 },
+     CalcFunc: sec2msec,
  },
VolumeTotalReadTime and VolumeTotalWriteTime are in seconds, but this plugin's graph definition is labeled (ms/op).
mackerel-agent-plugins/mackerel-plugin-aws-ec2-ebs/lib/aws-ec2-ebs.go
Lines 129 to 136 in 1be28e2
"ec2.ebs.latency.#": {
    Label: "EBS Avg Latency (ms/op)",
    Unit: "float",
    Metrics: []mp.Metrics{
        {Name: "read", Label: "Read", Diff: false},
        {Name: "write", Label: "Write", Diff: false},
    },
},
  "ec2.ebs.idle_time.#.idle_time": cloudWatchSetting{
      MetricName: "VolumeIdleTime", Statistics: "Sum",
-     CalcFunc: func(val float64, period float64) float64 { return val / period * 100 },
+     CalcFunc: func(val float64) float64 { return val / aggregationPeriod * 100.0 },
  },
% Time Spent Idle
Sum(VolumeIdleTime) / Period × 100
https://docs.aws.amazon.com/en_us/AWSEC2/latest/WindowsGuide/using_cloudwatch_ebs.html
@@ -266,7 +285,7 @@ func (p EBSPlugin) getLastPoint(vol *ec2.Volume, metricName string, statType str
        }
    }

    return latestVal, period, nil
[note] Maybe the simplest fix is returning 60 (or aggregationPeriod) here instead of period?
  },
  "ec2.ebs.consumed_ops.#.consumed_ops": cloudWatchSetting{
      MetricName: "VolumeConsumedReadWriteOps", Statistics: "Sum",
-     CalcFunc: func(val float64, period float64) float64 { return val },
+     CalcFunc: value,
[off-topic] I'm not sure, but because the statistic is Sum, this metric value is per aggregationPeriod, right?
If so, to output a per-minute value consistently, CalcFunc should be something like val / aggregationPeriod * 60?
hmm...
@@ -42,62 +46,74 @@ var io1Graphs = append([]string{
  type cloudWatchSetting struct {
      MetricName string
      Statistics string
-     CalcFunc func(float64, float64) float64
+     CalcFunc func(float64) float64
I think the second argument (period) was good, because each CalcFunc could be implemented without knowing how getLastPoint works.
Was there a reason you omitted it?
I'm curious whether there is a good reason to pass other numbers as the period. If not, I think it is sufficient to refer to aggregationPeriod directly in each CalcFunc.
It's OK!
I feel like it's a bit tightly coupled and hard to understand (you can't tell why CalcFunc divides the value by aggregationPeriod unless you read the whole program), but it will be no problem because the program itself is fairly small.
    CloudWatch *cloudwatch.CloudWatch
    Volumes []*ec2.Volume

    // internal states
👍
Thanks for reviewing!
I fixed period-misuse bugs in mackerel-plugin-aws-ec2-ebs.
The plugin retrieves recent values of each metric at one-minute granularity, and then selects only the most recent value for each metric. These metric values are therefore aggregated over one minute.
In addition, the graph definitions of some metrics, such as ec2.ebs.bandwidth.#.read or ec2.ebs.throughput.#.read, are designed with per-second units. The plugin should therefore convert these Sum values to per-second averages, like val / 60.
mackerel-agent-plugins/mackerel-plugin-aws-ec2-ebs/lib/aws-ec2-ebs.go
Lines 230 to 243 in 1be28e2
However, in the current implementation, that formula is wrong.
For instance, the getLastPoint method returns a metric value together with a period, but for volumes that are not io1 this period is metricPeriodDefault = 300, not the 60 seconds over which the value was actually aggregated.
mackerel-agent-plugins/mackerel-plugin-aws-ec2-ebs/lib/aws-ec2-ebs.go
Lines 288 to 298 in 1be28e2