perf: avoid initializing huge buffers #365

Merged
Merged 1 commit into Marwes:master on Apr 10, 2024

Conversation

@artemrakov (Contributor) commented on Apr 10, 2024

Addresses issue #364.

Avoid initializing continuously growing buffers by limiting the zero-initialized size to 8 * 1024 bytes.

  • Adds the temporary fix from @Marwes
  • Also adds benchmarks for the buffers

One thing I noticed: with bigger data, around 200 MB, the new code behaves significantly slower, roughly 10x. Maybe we should put this code behind a feature flag?
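For reference, here is a minimal standalone sketch of the idea (not the actual combine code; the helper name and structure are illustrative): instead of zero-initializing the buffer up to its full, ever-growing length on each read, only an 8 * 1024-byte window is zeroed per call, so the per-read cost stays bounded.

```rust
use std::io::{self, Read};

/// Cap on how many extra bytes get zero-initialized per read.
const MAX_CHUNK: usize = 8 * 1024;

/// Read more data into `buf`, zero-extending it by at most `MAX_CHUNK`
/// bytes instead of by the full (ever-growing) capacity.
fn read_more<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    let filled = buf.len();
    // Only the newly exposed window is zeroed, so the cost per call is
    // bounded no matter how large `buf` has already grown.
    buf.resize(filled + MAX_CHUNK, 0);
    let n = reader.read(&mut buf[filled..])?;
    // Shrink back to the bytes that actually hold data.
    buf.truncate(filled + n);
    Ok(n)
}

fn main() -> io::Result<()> {
    let mut src: &[u8] = b"hello world";
    let mut buf = Vec::new();
    while read_more(&mut src, &mut buf)? > 0 {}
    assert_eq!(buf, b"hello world");
    Ok(())
}
```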

Before:

Gnuplot not found, using plotters backend
buffers_small         time:   [1.6066 µs 1.7055 µs 1.8335 µs]
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

buffers_big           time:   [39.428 ns 39.802 ns 40.332 ns]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

After:

buffers_small         time:   [160.24 ns 170.23 ns 182.88 ns]
                        change: [-95.166% -91.191% -84.012%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

buffers_big           time:   [41.895 ns 42.547 ns 43.386 ns]
                        change: [+1.6991% +26.314% +58.727%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe
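The numbers above come from criterion. For anyone reproducing them, this is roughly the shape of such a bench file; it is a sketch reusing the illustrative `read_more` helper from above rather than the actual benchmark added in this PR, and the payload sizes are assumptions.

```rust
// benches/buffers.rs -- illustrative only.
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use std::io::{self, Read};

const MAX_CHUNK: usize = 8 * 1024;

// Same illustrative helper as in the sketch above.
fn read_more<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    let filled = buf.len();
    buf.resize(filled + MAX_CHUNK, 0);
    let n = reader.read(&mut buf[filled..])?;
    buf.truncate(filled + n);
    Ok(n)
}

// Drain a source into a buffer using the chunked reads.
fn read_all(data: &[u8]) -> Vec<u8> {
    let mut src = data;
    let mut buf = Vec::new();
    while read_more(&mut src, &mut buf).unwrap() > 0 {}
    buf
}

fn bench_buffers(c: &mut Criterion) {
    let small = vec![1u8; 1024];          // small payload (size is an assumption)
    let big = vec![1u8; 8 * 1024 * 1024]; // large payload (size is an assumption)

    c.bench_function("buffers_small", |b| b.iter(|| read_all(black_box(&small))));
    c.bench_function("buffers_big", |b| b.iter(|| read_all(black_box(&big))));
}

criterion_group!(benches, bench_buffers);
criterion_main!(benches);
```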

I also ran some tests with the redis-rs lib:
Before:

Connected to redis. Starting the loop
LoopNumber: 1, RedisGet latency_ms=79, value.is_some()=true
LoopNumber: 2, RedisGet latency_ms=135, value.is_some()=true
LoopNumber: 3, RedisGet latency_ms=130, value.is_some()=true
LoopNumber: 4, RedisGet latency_ms=129, value.is_some()=true
LoopNumber: 5, RedisGet latency_ms=128, value.is_some()=true
LoopNumber: 6, RedisGet latency_ms=133, value.is_some()=true
LoopNumber: 7, RedisGet latency_ms=128, value.is_some()=true
LoopNumber: 8, RedisGet latency_ms=129, value.is_some()=true
LoopNumber: 9, RedisGet latency_ms=129, value.is_some()=true

After:

Connecting to redis
Connected to redis. Starting the loop
LoopNumber: 1, RedisGet latency_ms=19, value.is_some()=true
LoopNumber: 2, RedisGet latency_ms=14, value.is_some()=true
LoopNumber: 3, RedisGet latency_ms=16, value.is_some()=true
LoopNumber: 4, RedisGet latency_ms=16, value.is_some()=true
LoopNumber: 5, RedisGet latency_ms=17, value.is_some()=true
LoopNumber: 6, RedisGet latency_ms=16, value.is_some()=true
LoopNumber: 7, RedisGet latency_ms=15, value.is_some()=true
LoopNumber: 8, RedisGet latency_ms=15, value.is_some()=true
LoopNumber: 9, RedisGet latency_ms=16, value.is_some()=true
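The timing loop was roughly of this shape (a reconstruction for illustration, not the exact program used above; the key name, payload size, and connection URL are made up):

```rust
use redis::Commands;
use std::time::Instant;

fn main() -> redis::RedisResult<()> {
    println!("Connecting to redis");
    let client = redis::Client::open("redis://127.0.0.1/")?;
    let mut con = client.get_connection()?;
    println!("Connected to redis. Starting the loop");

    // A large value makes the buffer-initialization cost visible.
    let payload = vec![b'x'; 8 * 1024 * 1024];
    let _: () = con.set("big_key", &payload)?;

    for i in 1..=9 {
        let start = Instant::now();
        let value: Option<Vec<u8>> = con.get("big_key")?;
        println!(
            "LoopNumber: {}, RedisGet latency_ms={}, value.is_some()={}",
            i,
            start.elapsed().as_millis(),
            value.is_some()
        );
    }
    Ok(())
}
```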

@Marwes (Owner) commented on Apr 10, 2024

> One thing I noticed: with bigger data, around 200 MB, the new code behaves significantly slower, roughly 10x. Maybe we should put this code behind a feature flag?

I think it is fine: it is not that big a regression compared to the status quo, and it only happens in a less realistic case. (Realistically it only occurs when you are using the stream decoder on an in-memory buffer; with any kind of IO you are unlikely to get that much data at a time.)

I will try to make a better, more holistic fix soon-ish anyway.

@Marwes merged commit ca74a14 into Marwes:master on Apr 10, 2024. 4 checks passed.
@Marwes (Owner) commented on Apr 10, 2024

Thanks. Released as 4.6.7!
