Implement data retention #14

narqo · 2019-03-28T12:17:43Z

There is little point in keeping profiling data for more than N days.

The exact implementation is TBD. Open questions:

Should this be a general part of the cmd/profefe daemon or a dedicated tool for a particular storage?
TODO

narqo · 2019-05-27T22:33:51Z

"Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers" describes an analytical approach for comparing several profiles, based on "profile's variation (entropy)" (see "Stability of profiles"):

We use a single metric, entropy, to measure a given profile’s variation. In short, entropy is a measure of the uncertainty associated with a random variable, which in this case is profile samples.
[..] When we need to identify the changes on the same entries between profiles, we calculate the Manhattan distance of two profiles...

The question is: can a similar approach be used to identify profiles that are no longer needed because the data they contain is redundant compared to the data in other profiles?

narqo · 2020-04-25T18:52:13Z

Following #92, for storage/clickhouse this can be done via ClickHouse's TTL directive in the database DDL. Refer to the docs for the MergeTree table engine.
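
As an illustration only, a table-level TTL could be generated like this. The table name `profiles_example` and the columns `profile_key` and `created_at` are made up for the sketch and are not the schema from #92. The `TTL <expr> + INTERVAL n DAY` clause itself is standard MergeTree syntax.

```go
package main

import (
	"fmt"
)

// TTLClause builds a MergeTree table-level TTL clause that expires rows
// the given number of days after the value in column.
func TTLClause(column string, days int) (string, error) {
	if days <= 0 {
		return "", fmt.Errorf("retention must be positive, got %d days", days)
	}
	return fmt.Sprintf("TTL %s + INTERVAL %d DAY", column, days), nil
}

// ExampleDDL returns a CREATE TABLE statement for a hypothetical table.
// The table and column names are assumptions made for this sketch.
func ExampleDDL(days int) (string, error) {
	ttl, err := TTLClause("created_at", days)
	if err != nil {
		return "", err
	}
	ddl := "CREATE TABLE IF NOT EXISTS profiles_example (\n" +
		"    profile_key String,\n" +
		"    created_at DateTime\n" +
		") ENGINE = MergeTree()\n" +
		"ORDER BY (created_at, profile_key)\n" +
		ttl
	return ddl, nil
}

func main() {
	ddl, err := ExampleDDL(30)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ddl)
}
```

With this approach ClickHouse removes expired rows during background merges, so no separate retention job is needed for that storage.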

narqo mentioned this issue Aug 10, 2019

Storage badger #28

Merged
