
(Doc+) Flush out Data Tiers #107981

Status: Open. Wants to merge 9 commits into base: main.
140 changes: 109 additions & 31 deletions docs/reference/datatiers.asciidoc
@@ -2,40 +2,78 @@
[[data-tiers]]
== Data tiers

A _data tier_ is a collection of <<modules-node,nodes>> within a cluster that share the same
<<node-roles,data node role>> and a hardware profile that's appropriately sized for that role.
Elastic recommends that nodes in the same tier share the same hardware profile to avoid <<hotspotting,hot spotting>>.

The data tiers that you use, and the way that you use them, depends on the data's <<data-management,category>>.

The following data tiers can be used with each data category:

Content data:

* <<content-tier,Content tier>> nodes handle the indexing and query load for content
indices, such as a <<system-indices,system index>> or a product catalog.
Comment on lines +15 to +16
Member:

System indices and data streams can also be time series data, so I don't think we should use it as an example here. I think we should stick with a timeseries/non-timeseries distinction.

Contributor Author:

This might be another πŸ˜• point for me then if we can discuss:

  1. Lower down on the existing page under Content header already says "System indices and other indices that aren’t part of a data stream are automatically allocated to the content tier." which is why I didn't realize I might be misunderstanding.
  2. Support encourages users to keep all system indices on hot/content. Does Dev agree?
  3. AFAIK (and it's an ongoing discussion / definition-problem) system indices are the indices which report from the snapshot's feature states. So from the unofficial list I wrote for Support we later learned e.g. .ilm-history and .kibana-event-log don't qualify as system indices. So e.g. only (A) qualify as system indices and AFAICT that subset doesn't have time series data (at least no indices which'd rollover. EDIT: other than the ML ones if that's what you were referencing?).
(A)
{
  "feature_states": [
    {
      "feature_name": "security",
      "indices": [".security-tokens-7",".security-7",".security-profile-8"]
    },
    {
      "feature_name": "geoip",
      "indices": [".geoip_databases"]
    },
    {
      "feature_name": "async_search",
      "indices": [".async-search"]
    },
    {
      "feature_name": "machine_learning",
      "indices": [".ml-inference-native-000002",".ml-inference-000005",".ml-config"]
    },
    {
      "feature_name": "transform",
      "indices": [".transform-internal-007"]
    },
    {
      "feature_name": "kibana",
      "indices": [
        ".kibana_analytics_8.12.2_001",
        ".kibana_task_manager_8.12.2_001",
        ".kibana_ingest_8.12.2_001",
        ".apm-custom-link",
        ".apm-agent-configuration",
        ".kibana_8.12.2_001",
        ".kibana_security_session_1",
        ".kibana_security_solution_8.12.2_001",
        ".kibana_alerting_cases_8.12.2_001"
      ]
    },
    {
      "feature_name": "tasks",
      "indices": [".tasks"]
    },
    {
      "feature_name": "fleet",
      "indices": [
        ".fleet-agents-7",
        ".fleet-enrollment-api-keys-7",
        ".fleet-actions-7",
        ".fleet-policies-7",
        ".fleet-servers-7",
        ".fleet-policies-leader-7"
      ]
    }
  ]
}

Member:

Support encourages users to keep all system indices on hot/content.

Users cannot control these at all. System indices cannot be configured apart from specialized APIs. Generally, we shouldn't be talking about system indices with users (if at all), since they are meant to be used only for system usage.

Contributor Author:

Is that a recent change? Because users sending .kibana* or .reporting* system indices or or .alert* if they count as system indices to warm/cold tier is an ongoing lowkey concern.

Member:

What is the concern with them moving the indices?

Inspecting the code here, we differentiate between internal and external system indices, and those managed by ES and unmanaged by ES (I stand corrected, sorry about my confusion here). It looks like we do have a concept of a system index that a user could influence somewhat (through setting an appropriate origin).


Time series data:

* <<hot-tier,Hot tier>> nodes handle the indexing load for time series data,
such as logs or metrics. They hold your most recent, most-frequently-accessed data.
* <<warm-tier,Warm tier>> nodes hold time series data that is accessed less frequently
and rarely needs to be updated.
* <<cold-tier,Cold tier>> nodes hold time series data that is accessed
infrequently and not normally updated. To save space, you can keep
<<fully-mounted,fully mounted indices>> of
<<ilm-searchable-snapshot,{search-snaps}>> on the cold tier. These fully mounted
indices eliminate the need for replicas, reducing required disk space by
approximately 50% compared to regular indices.
* <<frozen-tier,Frozen tier>> nodes hold time series data that is accessed
rarely and never updated. The frozen tier stores <<partially-mounted,partially
mounted indices>> of <<ilm-searchable-snapshot,{search-snaps}>> exclusively.
This extends the storage capacity even further, by up to 20 times compared to
the warm tier.

IMPORTANT: {es} generally expects nodes within a data tier to share the same
hardware profile. Variations that don't follow this recommendation should be
carefully architected to avoid <<hotspotting,hot spotting>>.
The way data tiers are used often depends on the data's category:

- Content data remains on the <<content-tier,content tier>> for its entire
data lifecycle.

- Time series data should progress through the
descending temperature data tiers (hot, warm, cold, and frozen) according to your
performance, resiliency, and data retention requirements.
+
You can automate these lifecycle transitions using the <<data-streams,data stream lifecycle>>, or custom <<index-lifecycle-management,{ilm}>>.
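As a sketch, the lifecycle transitions described above can be expressed as a minimal {ilm-init} policy. The policy name, rollover thresholds, and phase ages below are illustrative placeholders, not recommendations; the <<ilm-migrate,migrate>> action is injected automatically in the warm and cold phases to move indices between tiers:

[source,console]
--------------------------------------------------
PUT _ilm/policy/my-timeseries-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {}
      },
      "cold": {
        "min_age": "30d",
        "actions": {}
      }
    }
  }
}
--------------------------------------------------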

[TIP]
====
A data tier's performance depends on its backing hardware profile.
For examples of hardware profiles, refer to Elastic Cloud's {cloud}/ec-reference-hardware.html[instance configurations].

Elastic generally assumes that lower temperature data tiers have an increased ratio of data storage to CPU or heap resources. This allows later tiers to gain more space for data storage at the cost of slower response times.

Using the principle that lower-temperature tiers should hold less frequently updated and accessed data, your requests should be distributed to data tiers in the following approximate proportions. These proportions keep your clusters stable and highly responsive.

- Search: 85% hot, 10% warm, 5% cold, and 1% frozen
- Ingest: 95% hot, 4% warm, 1% cold, and 0% frozen

Contributor Author:

πŸ‘‹πŸ½ @dakrone will you kindly review these proportional percentages per data tier for Dev sign-off? I believe the rest of this PR consolidates content from existing doc pages for clarity, but this call out uniquely makes a new claim.

Member:

Where did we get these numbers? I don't think we can make generalizations for these kinds of percentages, for example, it's perfectly valid to have a "search" load that's hot and frozen, where the searches hit each tier 50% of the time (again, the performance requirements aren't something we can supply, they have to come from the user).

On the ingestion side, I wouldn't expect any indexing at all on the warm and cold tiers, how did we arrive at the 4% and 1% numbers respectively?

Contributor Author:

Where did we get these numbers?

In PR description I highlighted that I guesstimated/made-up these numbers. Please only consider them placeholders.

for example, it's perfectly valid to have a "search" load that's hot and frozen, where the searches hit each tier 50% of the time (again, the performance requirements aren't something we can supply, they have to come from the user).

From Support, I may only deal with the situations where searches 50% hitting frozen breaks the cluster. The age-old example is Frozen tier having future dates takes down the entire cluster. I do want to highlight though that the existing doc does already say "Frozen tier nodes hold time series data that is accessed rarely and never updated.". I may be missing the intended interpretation, but "accessed rarely" does not sound like 50% to me but a lot more like the 1% I guesstimated.

On the ingestion side, I wouldn't expect any indexing at all on the warm and cold tiers, how did we arrive at the 4% and 1% numbers respectively?

Again guesstimated from the existing doc saying " Warm tier nodes hold time series data that is accessed less-frequently and rarely needs to be updated. ... Cold tier nodes hold time series data that is accessed infrequently and not normally updated.". I don't know what these numbers should be which is why I requested your feedback πŸ™‚ .

I'm on board if in general we're concerned about explicit percentages, but at least from what I see users feel unguided and don't realize for desiring performance that they haven't architected in a way that'd get themselves there. That's the need I'm hoping to fill in better, but I'm not tied on how we do that. So if wording needs to change or we need to have an "it depends" blog instead and just link to it from here, all that's fine by me. But I would like to advocate for something more concrete to point users to for base level architecture / expectation setting.

You can check how your access requests are distributed among your data tiers using the <<cat-thread-pool,CAT thread pools>> API. If your lower temperature tiers are being accessed at higher proportions, then your cluster performance might be impacted.
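For example, a request such as the following (the column selection is illustrative) summarizes active, queued, and rejected search and write threads per node, which you can group by tier to approximate this distribution:

[source,console]
--------------------------------------------------
GET /_cat/thread_pool/search,write?v=true&h=node_name,name,active,queue,rejected
--------------------------------------------------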
Member:

I think looking through the cat threadpool API is a big request for an end user. It would be fairly easy to misunderstand, and since it's non-persistent it may give a very skewed view of a workload.

Contributor Author:

That's fair! I'm curious what alternative investigation you'd recommend since it's a current user need?

(I again may be ignorant of better ways. For the limited view I have: IME there are only hodge-podge answers like this outlined API currently; that would be a design-improvement takeaway, but it shouldn't stop us from telling users the best they can introspect right now. A possible alternative might be enabling Monitoring and then comparing node ingest rates; would that be better?)

Member:

A possible alternative would might be enabling Monitoring and then comparing node ingest rates; would that be better?

Doing it through monitoring would be better (not just ingest rates but also node load). We can think of threadpools as more of an "implementation detail" of the load/throughput rather than something we want them to rely upon.


These proportions are intended to serve as a general baseline that you can apply to your specific
use case, hardware profiles, and architecture.
Comment on lines +61 to +62
Member:

How would these actually be applied? You mention above "your requests should be distributed to data tiers in the following approximate proportions", but that's not something prescriptive a user can actually do.

We don't want them to try and route queries to different tiers based on ratios, but rather to size things accordingly. Again, I'm worried that we simplify the problem here, it's not only a performance trade-off but also one of cost (for which this does not account).

Contributor Author:

This is fair πŸ€”.

I did not list (my miss) but expected the answer to line up to Support's (A) hold data in higher tiers longer probably by updating an ILM policy, (B) where possible filter searches by time range to avoid load on lower tiers, or (C) review performance vs billing needs via the currently listed "apply to your specific use case, hardware profiles, and architecture". +(D) we recommend Searchable Snapshots to reduce billing while extending data retention.

====

[discrete]
[[available-tier]]
=== Available data tiers

Learn more about each data tier, including when and how it should be used.

[discrete]
[[content-tier]]
==== Content tier

// tag::content-tier[]
Data stored in the content tier is generally a collection of items such as a product catalog or article archive.
@@ -50,13 +88,14 @@
While they are also responsible for indexing, content data is generally not ingested at high rates
as time series data such as logs and metrics. From a resiliency perspective the indices in this
tier should be configured to use one or more replicas.

The content tier is required and is often deployed within the same node
grouping as the hot tier. System indices and other indices that aren't part
of a data stream are automatically allocated to the content tier.
// end::content-tier[]

[discrete]
[[hot-tier]]
==== Hot tier

// tag::hot-tier[]
The hot tier is the {es} entry point for time series data and holds your most-recent,
@@ -71,7 +110,7 @@
data stream>> are automatically allocated to the hot tier.

[discrete]
[[warm-tier]]
==== Warm tier

// tag::warm-tier[]
Time series data can move to the warm tier once it is being queried less frequently
@@ -84,7 +123,7 @@
For resiliency, indices in the warm tier should be configured to use one or more

[discrete]
[[cold-tier]]
==== Cold tier

// tag::cold-tier[]
When you no longer need to search time series data regularly, it can move from
@@ -106,7 +145,7 @@
but doesn't reduce required disk space compared to the warm tier.

[discrete]
[[frozen-tier]]
==== Frozen tier

// tag::frozen-tier[]
Once data is no longer being queried, or being queried rarely, it may move from
@@ -120,9 +159,15 @@
sometimes fetch frozen data from the snapshot repository, searches on the frozen
tier are typically slower than on the cold tier.
// end::frozen-tier[]

[discrete]
[[configure-data-tiers]]
=== Configure data tiers

Follow the instructions for your deployment type to configure data tiers.

[discrete]
[[configure-data-tiers-cloud]]
==== {ess} or {ece}

The default configuration for an {ecloud} deployment includes a shared tier for
hot and content data. This tier is required and can't be removed.
@@ -156,7 +201,7 @@
tier].

[discrete]
[[configure-data-tiers-on-premise]]
==== Self-managed deployments

For self-managed deployments, each node's <<data-node,data role>> is configured
in `elasticsearch.yml`. For example, the highest-performance nodes in a cluster
@@ -174,25 +219,58 @@
tier.
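As a sketch of the role assignment described above, the highest-performance nodes might be configured in `elasticsearch.yml` like this (the role combination is illustrative):

[source,yaml]
--------------------------------------------------
node.roles: [ "data_hot", "data_content" ]
--------------------------------------------------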
[[data-tier-allocation]]
=== Data tier index allocation

The <<tier-preference-allocation-filter, `index.routing.allocation.include._tier_preference`>> setting determines the tier to which index shards should be allocated.

When you create an index, by default {es} sets the `_tier_preference`
to `data_content` to automatically allocate the index shards to the content tier.

When {es} creates an index as part of a <<data-streams, data stream>>,
by default {es} sets the `_tier_preference`
to `data_hot` to automatically allocate the index shards to the hot tier.

At the time of index creation, you can override the default setting by explicitly setting
the preferred value in one of two ways:

- By using an <<index-templates,index template>>. Refer to <<getting-started-index-lifecycle-management,Automate rollover with ILM>> for details.
- From within the <<indices-create-index,create index>> request body.
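For example, the tier preference can be set in the create index request body (the index name and tier order here are illustrative):

[source,console]
--------------------------------------------------
PUT /my-index-000001
{
  "settings": {
    "index.routing.allocation.include._tier_preference": "data_warm,data_hot"
  }
}
--------------------------------------------------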

You can override this
setting after index creation by <<indices-update-settings,updating the index setting>> to the preferred
value.
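For example (the index name and tier order are illustrative):

[source,console]
--------------------------------------------------
PUT /my-index-000001/_settings
{
  "index.routing.allocation.include._tier_preference": "data_cold,data_warm,data_hot"
}
--------------------------------------------------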

In this setting, you can provide multiple tiers in order of preference to prevent indices from remaining unallocated if no nodes are available in the preferred tier.

To remove the data tier preference
setting, set the `_tier_preference` value to `null`. This allows the index to allocate to any data node within the cluster. Setting the `_tier_preference` to `null` does not restore the default value. Note that, in the case of managed indices, a <<ilm-migrate,migrate>> action might apply a new value in its place.
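For example (the index name is illustrative):

[source,console]
--------------------------------------------------
PUT /my-index-000001/_settings
{
  "index.routing.allocation.include._tier_preference": null
}
--------------------------------------------------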

[discrete]
[[data-tier-allocation-value]]
==== Determine the current data tier preference

You can check an existing index's data tier preference by <<indices-get-settings,polling its
settings>> for `index.routing.allocation.include._tier_preference`:

[source,console]
--------------------------------------------------
GET /my-index-000001/_settings?filter_path=*.settings.index.routing.allocation.include._tier_preference
--------------------------------------------------

[discrete]
[[data-tier-allocation-troubleshooting]]
==== Troubleshooting

The `_tier_preference` setting might conflict with other allocation settings. This conflict might prevent the shard from allocating. A conflict might occur when a cluster has not yet been completely <<troubleshoot-migrate-to-tiers,migrated
to data tiers>>.

This setting will not unallocate a currently allocated shard, but might prevent it from migrating from its current location to its designated data tier. To troubleshoot, call the <<cluster-allocation-explain,cluster allocation explain API>> and specify the suspected problematic shard.
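For example, to explain the allocation of a suspect primary shard (the index name and shard number are illustrative):

[source,console]
--------------------------------------------------
GET _cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": true
}
--------------------------------------------------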

[discrete]
[[data-tier-migration]]
==== Automatic data tier migration

{ilm-init} automatically transitions managed
indices through the available data tiers using the <<ilm-migrate, migrate>> action.
By default, this action is automatically injected in every phase.
You can explicitly specify the migrate action with `"enabled": false` to <<ilm-disable-migrate-ex,disable automatic migration>>,
for example, if you're using the <<ilm-allocate, allocate action>> to manually
specify allocation rules.