This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

Attempt to reduce memory footprint of compaction #627

Closed · wants to merge 49 commits

Conversation

@codesome (Contributor) commented on Jun 11, 2019:

What does this PR do

  • Removes building of the entire new postings list from the postings of the other blocks.
  • Builds an ([old series id] -> [new series id]) map for each block, and uses it to write postings one by one for each label-value pair (see the sketch below).
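
A minimal sketch of the remapping idea, under assumed names (remapPostings and seriesMap are illustrative, not the PR's actual identifiers):

package compactsketch

import "sort"

// remapPostings translates one block's postings list for a single
// label-value pair into the new block's series IDs, using the
// (old series id -> new series id) map built while writing the series.
// Series absent from the map (e.g. deleted ones) are dropped.
func remapPostings(oldPostings []uint64, seriesMap map[uint64]uint64) []uint64 {
	newPostings := make([]uint64, 0, len(oldPostings))
	for _, oldID := range oldPostings {
		if newID, ok := seriesMap[oldID]; ok {
			newPostings = append(newPostings, newID)
		}
	}
	// Postings must be written sorted by the new series IDs.
	sort.Slice(newPostings, func(i, j int) bool { return newPostings[i] < newPostings[j] })
	return newPostings
}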

I will do any remaining cleanup and post the benchmark results tomorrow. Hoping for some good news.

Update: This PR ended up producing memory-allocation savings in many more places than just the above. This comment summarizes it all (hopefully).

UPDATE 2: As this PR got very complicated to review, it has been broken down into multiple PRs (more to come soon). For now, these are the child PRs originating from this PR:

  1. #643 - Break down generic writeOffsetTable
  2. #644 - Check error before wrapping
  3. #645 - Reuse 'keys' in ReadOffsetTable
  4. #653 - Reuse byte buffer in WriteChunks and writeHash
  5. #654 - Reuse string buffer in stringTuples.Swap
  6. #656 - NumSeries() method for BlockReader interface

@codesome (Contributor, Author):

The results are a little disappointing:

benchmark                                                                                old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     22137952284     37095788828     +67.57%

benchmark                                                                                old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     17821797       23269082       +30.57%

benchmark                                                                                old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     2111300240     2192055400     +3.82%

Memory profile for this PR: compact_opt_PR.zip

Memory profile for master: compact_opt_master.zip

@codesome (Contributor, Author) commented on Jun 12, 2019:

I managed to make it better, but there is a failing test in vertical compaction. I will push after I fix it.
UPDATE: The benchmark scores below were updated after the fix.

benchmark                                                                                 old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     22137952284     25780080396     +16.45%

benchmark                                                                                 old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     17821797       20869162       +17.10%

benchmark                                                                                 old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     2111300240     1629204160     -22.83%

@codesome (Contributor, Author):

This is the profile after the latest commit (getting rid of postingMap). Finally some reduced memory usage, but allocations are still higher.
compact_opt_noPostingMap.zip

@codesome (Contributor, Author):

A minor improvement after the last commit:

benchmark                                                                                 old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     22627970234     24963923703     +10.32%

benchmark                                                                                 old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     17837934       20792238       +16.56%

benchmark                                                                                 old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     2095948320     1549497480     -26.07%

@codesome marked this pull request as ready for review on June 12, 2019 13:02.
@codesome (Contributor, Author):

This zip contains memory and CPU profiles for both master and this PR while running the above benchmarks (the block creation was done independently, and BenchmarkCompaction was modified to do only the compaction):
mem_cpu_profiles.zip

@codesome (Contributor, Author):

While trying to reduce allocations, I instead managed to bring the B/op below the baseline in the last few commits.

benchmark                                                                                 old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     15926839269     21870626474     +37.32%

benchmark                                                                                 old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     17854969       20803998       +16.52%

benchmark                                                                                 old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     2111793184     1377519288     -34.77%

Still digging more to reduce allocs.

@codesome (Contributor, Author):

Found the culprit for increased allocs: https://github.com/prometheus/tsdb/blob/c1c39ed2e7900fb056445c9818754e3254518a49/compact.go#L870 (this line alone)

@codesome (Contributor, Author):

With the latest commit (reusing the bigEndianPostings object instead of creating a new one every time), allocations reduced significantly (with ~17.8M allocs as the 0% baseline, we are at +5.11% now, compared to +16.x% before).

With that came more B/op reduction. CPU usage is a little higher, as expected.
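
The reuse pattern, roughly, as a sketch with illustrative types (the Reset method here is an assumption for illustration, not necessarily the exact API this PR adds to the index package):

package compactsketch

// bigEndianPostingsBuf stands in for the index package's big-endian
// postings iterator; the real type decodes big-endian series IDs.
type bigEndianPostingsBuf struct {
	list []byte
	pos  int
}

// Reset points the iterator at a new encoded list without allocating.
func (p *bigEndianPostingsBuf) Reset(list []byte) {
	p.list = list
	p.pos = 0
}

// writeAllPostings reuses a single iterator across all label-value pairs
// instead of allocating a fresh one per pair, which was the source of the
// extra allocations identified earlier (compact.go#L870).
func writeAllPostings(encodedLists [][]byte, write func(*bigEndianPostingsBuf) error) error {
	p := &bigEndianPostingsBuf{} // allocated once, reused for every list
	for _, l := range encodedLists {
		p.Reset(l)
		if err := write(p); err != nil {
			return err
		}
	}
	return nil
}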

benchmark                                                                                 old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     15923268078     21262455380     +33.53%

benchmark                                                                                 old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     17853104       18765444       +5.11%

benchmark                                                                                 old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     2112769168     1245923288     -41.03%

I will be looking into reducing more allocs if possible.

Commit: This reduces the allocs of WriteLabelIndex significantly.

@codesome (Contributor, Author) commented on Jun 18, 2019:

Good news in the last 2 commits: allocations reduced significantly, and the delta for allocs is negative now.

EDIT: Numbers updated after fixing the race.

benchmark                                                                                 old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     16858323771     20695705390     +22.76%

benchmark                                                                                 old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     17847227       13111051       -26.54%

benchmark                                                                                 old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     2112914256     1218299784     -42.34%

@codesome (Contributor, Author):

More reduction in allocs in the last few commits!

benchmark                                                                                 old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     16858323771     20462639317     +21.38%

benchmark                                                                                 old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     17847227       10110743       -43.35%

benchmark                                                                                 old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     2112914256     1151946488     -45.48%

@codesome (Contributor, Author):

This last commit f8ddc03 shaved about 2-3s off the compaction time, from a total of 15-17s before. The change was to avoid repeated binary searches on the postingBuf, replacing them with a single linear scan.
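
The idea in sketch form (assuming, as compaction guarantees here, that both postingBuf and the IDs being looked up are sorted; names are illustrative):

package compactsketch

// lookupAllLinear finds each of sortedIDs in the sorted postingBuf using a
// single forward scan instead of one binary search per ID. The cursor i
// never moves backwards, so the whole batch costs
// O(len(postingBuf) + len(sortedIDs)) rather than
// O(len(sortedIDs) * log(len(postingBuf))).
func lookupAllLinear(postingBuf, sortedIDs []uint64) []bool {
	found := make([]bool, len(sortedIDs))
	i := 0
	for j, id := range sortedIDs {
		for i < len(postingBuf) && postingBuf[i] < id {
			i++
		}
		found[j] = i < len(postingBuf) && postingBuf[i] == id
	}
	return found
}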

Now it compares like this with the current master.

benchmark                                                                                 old ns/op       new ns/op       delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     13061950871     15038790382     +15.13%

benchmark                                                                                 old allocs     new allocs     delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     9149767        7111218        -22.28%

benchmark                                                                                 old bytes      new bytes      delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     1980380560     1031404000     -47.92%

These benchmark results are on top of the recently merged optimizations and do not relate directly to the numbers in this comment: #627 (comment) (those were w.r.t. master without any of the optimizations).

@gouthamve (Collaborator) left a comment:

First pass, makes sense overall. I want to see if we can simplify things further.

compact.go (outdated):
return errors.Wrap(err, "write postings")
postingBuf := make([]uint64, 0, 1000000)
var bigEndianPost index.Postings = index.NewBigEndianPostings(nil)
var listPost = index.NewListPostings(nil).(*index.ListPostings)
@gouthamve (Collaborator):

Can we make NewListPostings return a ListPostings?
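
That is, something like the following sketch (field names illustrative), so the caller above can drop the .(*index.ListPostings) assertion:

package index

// ListPostings iterates over an in-memory list of series references.
type ListPostings struct {
	list []uint64 // field names are illustrative
	cur  uint64
}

// NewListPostings returns the concrete type instead of the Postings
// interface; it still satisfies Postings at any call site that needs it.
func NewListPostings(list []uint64) *ListPostings {
	return &ListPostings{list: list}
}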

head.go (outdated):
@@ -1349,6 +1361,7 @@ func (h *Head) getOrCreateWithID(id, hash uint64, lset labels.Labels) (*memSerie

h.metrics.series.Inc()
@gouthamve (Collaborator):

We're tracking this through NumSeries(). Can we just drop the gauge and use a GaugeFunc?
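
For reference, a minimal sketch of that suggestion using client_golang (the metric name and the NumSeries() uint64 signature are assumed from context):

package tsdb

import "github.com/prometheus/client_golang/prometheus"

// seriesCounter abstracts the head block for this sketch.
type seriesCounter interface {
	NumSeries() uint64
}

// registerHeadSeriesGauge replaces a manually Inc()/Dec()'d gauge with a
// GaugeFunc that reads the count from the source of truth on every scrape.
func registerHeadSeriesGauge(r prometheus.Registerer, h seriesCounter) {
	r.MustRegister(prometheus.NewGaugeFunc(
		prometheus.GaugeOpts{
			Name: "prometheus_tsdb_head_series", // name illustrative
			Help: "Total number of series in the head block.",
		},
		func() float64 { return float64(h.NumSeries()) },
	))
}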

@codesome (Contributor, Author):

Will do. Also, I have another PR for NumSeries() (#656) if that helps with reviewing. Do you want it in a separate PR or inside this PR itself?

@gouthamve (Collaborator) left a comment:

Almost LGTM :) I will take another pass after the suggestions are addressed before I approve, though.

@codesome (Contributor, Author):

While running prombench with this PR, compaction of the head block was failing with the following error:

level=error ts=2019-07-16T09:58:13.238Z caller=db.go:377 component=tsdb msg="compaction failed" err="persist head block: write compaction: write postings: 0xc36a049540 series for reference 17138146 not found"

I am not able to replicate it yet; compaction from the head works fine in synthetic tests.

@codesome (Contributor, Author):

I was able to reproduce it by just running Prometheus on my local machine! Will dig in more.

Commit: This fixes a series out-of-order bug when compacting from an on-disk block: postings used to contain invalid series references, because even deleted series were present in seriesMap with bogus series references.

@codesome (Contributor, Author) commented on Aug 1, 2019:

The previous way of merging remapped postings did not scale well: I was keeping a sorted list of postings and inserting new postings into it while keeping it sorted. This was slow. During the prombench test, compaction of the on-disk blocks got stuck at this point and allocations also shot up (there were ~13-14M series in total across all the blocks being compacted).

Now I gather the postings list from one index reader at a time and merge the sorted lists the old-fashioned way. I tested this on the indices that I had downloaded from prombench, and they finally don't get stuck here.
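
In sketch form (illustrative names), the per-reader lists are already sorted, so they can be merged pairwise in linear time rather than inserting IDs one at a time into a sorted slice:

package compactsketch

// mergeSorted merges two sorted postings lists in O(len(a)+len(b)).
// Repeatedly inserting single IDs into a sorted slice is O(n^2) overall,
// because every insert shifts the tail of the slice.
func mergeSorted(a, b []uint64) []uint64 {
	out := make([]uint64, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if a[i] <= b[j] {
			out = append(out, a[i])
			i++
		} else {
			out = append(out, b[j])
			j++
		}
	}
	out = append(out, a[i:]...)
	return append(out, b[j:]...)
}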

Below are the updated (synthetic) benchmark results.

2 blocks, same as all the benchmarks in this PR thread.

benchmark                                                                                  old ns/op       new ns/op       delta
BenchmarkCompaction2/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     12804627758     14663944235     +14.52%

benchmark                                                                                  old allocs     new allocs     delta
BenchmarkCompaction2/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     6149618        4111065        -33.15%

benchmark                                                                                  old bytes      new bytes      delta
BenchmarkCompaction2/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8     1973256560     1042666800     -47.16%

This is for 3 blocks, an indication that the time gets even worse with more blocks:

benchmark                                                                                  old ns/op       new ns/op       delta
BenchmarkCompaction2/type=normal,blocks=3,series=1000000,samplesPerSeriesPerBlock=51-8     15223759663     19132151137     +25.67%

benchmark                                                                                  old allocs     new allocs     delta
BenchmarkCompaction2/type=normal,blocks=3,series=1000000,samplesPerSeriesPerBlock=51-8     7135520        5096165        -28.58%

benchmark                                                                                  old bytes      new bytes      delta
BenchmarkCompaction2/type=normal,blocks=3,series=1000000,samplesPerSeriesPerBlock=51-8     2137628704     1276175792     -40.30%

I will test this on prombench now.

@codesome (Contributor, Author) commented on Aug 5, 2019:

Closing this for now. An analysis of this PR vs master is in this doc: https://docs.google.com/document/d/1ZzZ8sslkA5LPshr9gqz-MmRpPmyY19jFDz-AKrkwpVw/edit?usp=sharing

@codesome closed this on Aug 5, 2019.
@bwplotka (Contributor) commented on Aug 5, 2019:

Thanks for that work, guys. I think we learnt a lot from this despite the negative outcome of this particular experiment!
