Add documentation around migration from rollup to downsampling #107965

Merged
merged 10 commits into from
May 1, 2024
2 changes: 1 addition & 1 deletion docs/reference/rollup/api-quickref.asciidoc
@@ -5,7 +5,7 @@
<titleabbrev>API quick reference</titleabbrev>
++++

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

Most rollup endpoints have the following base:

4 changes: 3 additions & 1 deletion docs/reference/rollup/index.asciidoc
@@ -2,7 +2,7 @@
[[xpack-rollup]]
== Rolling up historical data

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

Keeping historical data around for analysis is extremely useful but often avoided due to the financial cost of
archiving massive amounts of data. Retention periods are thus driven by financial realities rather than by the
@@ -20,6 +20,7 @@ cost of raw data.
* <<rollup-understanding-groups,Understanding rollup grouping>>
* <<rollup-agg-limitations,Rollup aggregation limitations>>
* <<rollup-search-limitations,Rollup search limitations>>
* <<rollup-migrating-to-downsampling,Migrating to downsampling>>


include::overview.asciidoc[]
@@ -28,3 +29,4 @@ include::rollup-getting-started.asciidoc[]
include::understanding-groups.asciidoc[]
include::rollup-agg-limitations.asciidoc[]
include::rollup-search-limitations.asciidoc[]
include::migrating-to-downsampling.asciidoc[]
119 changes: 119 additions & 0 deletions docs/reference/rollup/migrating-to-downsampling.asciidoc
@@ -0,0 +1,119 @@
[role="xpack"]
[[rollup-migrating-to-downsampling]]
=== {rollup-cap} Migrating to downsampling
++++
<titleabbrev>Migrating to downsampling</titleabbrev>
++++

Rollup and downsampling are two different features that allow historical metrics to be rolled up.
At a high level, rollup offers more functionality than downsampling, but downsampling is a more robust and
easier-to-use way to downsample metrics.

The following features are missing from downsampling:

* Downsampling can only roll up metrics by the `_tsid` (the combination of all dimensions) and the `@timestamp` field. Rollup
can group by other fields, for example `@timestamp` and hostname, or any other combination of fields.
* Downsampling doesn't support calendar time intervals. Only fixed intervals are supported.
* Downsampling can only downsample time series data streams, which means the data source must contain metrics, with
configured dimension fields, metric fields and a `@timestamp` field. The rollup feature can roll up any index with a timestamp field.
* Downsampling has less flexible scheduling control, and it isn't possible to roll up data after short intervals. This
is because downsampling is tied to lifecycle management. A backing index can only be downsampled after it has been rolled
over, because downsampling can only occur once the backing index is marked read-only.

The following aspects of downsampling are easier or more robust:

* No need to schedule jobs. Downsampling is integrated with Index Lifecycle Management (ILM) and Data Stream Lifecycle (DSL).
* No separate search API. Downsampled indices can be accessed via the search API and ES|QL.
* No separate rollup configuration. Downsampling uses the time series dimension and metric configuration from the mapping.
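
For illustration, a downsampled time series data stream can be queried with the regular search API, with no rollup-specific endpoint involved.
The following is a minimal sketch, assuming a data stream that matches the hypothetical pattern `sensor-*` and has a `temperature`
gauge metric downsampled to a `1h` fixed interval. The aggregation names `per_hour` and `max_temperature` are illustrative only.

[source,console]
--------------------------------------------------
GET sensor-*/_search
{
  "size": 0,
  "aggs": {
    "per_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "1h"
      },
      "aggs": {
        "max_temperature": {
          "max": { "field": "temperature" }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:illustrative sketch]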

Because of the differences listed above, it isn't possible to migrate all rollup usages to downsampling. The first requirement
is that the data must be stored in Elasticsearch as a <<tsds,time series data stream (TSDS)>>.
Rollup usages that basically roll the data up by time can be migrated to downsampling.

An example rollup usage that can be migrated to downsampling:

[source,console]
--------------------------------------------------
PUT _rollup/job/sensor
{
"index_pattern": "sensor-*",
"rollup_index": "sensor_rollup",
"cron": "0 0 * * * *",
"page_size": 1000,
"groups": {
"date_histogram": {
"field": "timestamp",
"fixed_interval": "60m"
},
"terms": {
"fields": [ "node" ]
}
},
"metrics": [
{
"field": "temperature",
"metrics": [ "min", "max", "sum" ]
},
{
"field": "voltage",
"metrics": [ "avg" ]
}
]
}
--------------------------------------------------
// TEST[setup:sensor_index]

The equivalent <<tsds,time series data stream (TSDS)>> setup that uses downsampling via DSL:

[source,console]
--------------------------------------------------
PUT _index_template/sensor-template
{
"index_patterns": ["sensor-*"],
"data_stream": { },
"template": {
"lifecycle": {
"downsampling": [
{
"after": "1d",
"fixed_interval": "1h"
}
]
},
"settings": {
"index.mode": "time_series"
},
"mappings": {
"properties": {
"node": {
"type": "keyword",
"time_series_dimension": true
},
"temperature": {
"type": "half_float",
"time_series_metric": "gauge"
},
"voltage": {
"type": "half_float",
"time_series_metric": "gauge"
},
"@timestamp": {
"type": "date"
}
}
}
}
}
--------------------------------------------------
// TEST[continued]

////
[source,console]
----
DELETE _index_template/sensor-template
----
// TEST[continued]
////

The above template for a <<tsds,time series data stream (TSDS)>> includes the downsampling configuration.
Only the `downsampling` section is needed to enable downsampling. It specifies when to downsample and to which fixed interval.
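
Downsampling can also be configured through ILM instead of DSL. As a rough sketch, the following policy downsamples a backing index
to a one-hour fixed interval after rollover. The policy name `sensor-policy` and the `max_age` rollover condition are assumptions
made for this example, and the policy would still need to be referenced from the index template via the `index.lifecycle.name` setting.

[source,console]
--------------------------------------------------
PUT _ilm/policy/sensor-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d"
          },
          "downsample": {
            "fixed_interval": "1h"
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:illustrative sketch]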
2 changes: 1 addition & 1 deletion docs/reference/rollup/overview.asciidoc
@@ -5,7 +5,7 @@
<titleabbrev>Overview</titleabbrev>
++++

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

Time-based data (documents that are predominantly identified by their timestamp) often have associated retention policies
to manage data growth. For example, your system may be generating 500 documents every second. That will generate
4 changes: 2 additions & 2 deletions docs/reference/rollup/rollup-agg-limitations.asciidoc
@@ -2,7 +2,7 @@
[[rollup-agg-limitations]]
=== {rollup-cap} aggregation limitations

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

There are some limitations to how fields can be rolled up / aggregated. This page highlights the major limitations so that
you are aware of them.
@@ -22,4 +22,4 @@ And the following metrics are allowed to be specified for numeric fields:
- Max aggregation
- Sum aggregation
- Average aggregation
- Value Count aggregation
- Value Count aggregation
2 changes: 1 addition & 1 deletion docs/reference/rollup/rollup-apis.asciidoc
@@ -2,7 +2,7 @@
[[rollup-apis]]
== Rollup APIs

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

[discrete]
[[rollup-jobs-endpoint]]
2 changes: 1 addition & 1 deletion docs/reference/rollup/rollup-getting-started.asciidoc
@@ -5,7 +5,7 @@
<titleabbrev>Getting started</titleabbrev>
++++

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

To use the Rollup feature, you need to create one or more "Rollup Jobs". These jobs run continuously in the background
and rollup the index or indices that you specify, placing the rolled documents in a secondary index (also of your choosing).
2 changes: 1 addition & 1 deletion docs/reference/rollup/rollup-search-limitations.asciidoc
@@ -2,7 +2,7 @@
[[rollup-search-limitations]]
=== {rollup-cap} search limitations

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

While we feel the Rollup function is extremely flexible, the nature of summarizing data means there will be some limitations. Once
live data is thrown away, you will always lose some flexibility.
2 changes: 1 addition & 1 deletion docs/reference/rollup/understanding-groups.asciidoc
@@ -2,7 +2,7 @@
[[rollup-understanding-groups]]
=== Understanding groups

deprecated::[8.11.0,"Rollups will be removed in a future version. Use <<downsampling,downsampling>> instead."]
deprecated::[8.11.0,"Rollups will be removed in a future version. Please <<rollup-migrating-to-downsampling,migrate>> to <<downsampling,downsampling>> instead."]

To preserve flexibility, Rollup Jobs are defined based on how future queries may need to use the data. Traditionally, systems force
the admin to make decisions about what metrics to rollup and on what interval. E.g. The average of `cpu_time` on an hourly basis. This