CRubyTraceBuffer can exceed its max size and that leads to flaky tests #1704
Comments
I think the main issue here is that there's a way for the internal array to grow, but never to shrink, meaning over time it can slowly grow (if there are consistently more than 1000 traces/sec). Adding I suggest we add a way to shrink the array during
I guess that would work, but I'm not entirely convinced that a E.g. a simple benchmark on my machine shows:

    require 'bundler/inline'

    gemfile do
      source 'https://rubygems.org'
      gem 'benchmark-ips'
    end

    require 'benchmark/ips'

    THE_ARRAY = [1, 2, 3, 4, 5]

    Benchmark.ips do |x|
      x.report(RUBY_DESCRIPTION) { THE_ARRAY.slice!(10...) }
      x.compare!
    end
On the "slow path" (i.e. when the slice does need to happen) it would probably be a bit slower, BUT if the code had correctly gone for a replace rather than an add, the slice would not have been needed in the first place, so I claim that accepting this trade-off seems reasonable.
I was thinking about this and Because we concatenate at the end of the internal buffer, this means we are always dropping the most recent elements on There are other options (do not slice at the end of buffer, or insert new elements not at the end of buffer), but they make the operations much more costly on a non-full buffer. I'm thinking that we need to address this on
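To make the concern about dropping the most recent elements concrete, here is a minimal sketch. It uses a plain Array and a hypothetical `MAX_SIZE` and `push_then_trim` helper, not the actual buffer class, and only illustrates what trimming the tail does when new items are appended at the end.

```ruby
# Hypothetical stand-in for the buffer's internal array; not the dd-trace-rb code.
MAX_SIZE = 3

def push_then_trim(items, item, max_size = MAX_SIZE)
  items << item             # new items are appended at the end
  items.slice!(max_size..)  # removes everything at index >= max_size (returns nil if nothing to remove)
  items
end

buffer = [:a, :b, :c]       # already at MAX_SIZE
push_then_trim(buffer, :d)  # :d is appended, then immediately trimmed away
p buffer                    # the older items survive; the newest one is gone
```

In other words, with append-at-end plus trim-at-end, a full buffer always sacrifices the incoming item rather than an existing one.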
Good point, indeed
In #1172, when we added CRubyTraceBuffer, we documented that:
dd-trace-rb/lib/ddtrace/buffer.rb, lines 179 to 180 in 148dfae
I've found that this class can exceed the 4% after it was pointed out as a flaky test in CI.
On a closer look, the issue with the current implementation is that, while it's safe to operate on an array concurrently on MRI, the check that decides whether to add or replace an item in the buffer is not atomic with the operation that follows it.
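As a rough illustration of that check-then-act gap, here is a simplified sketch. The class name `CheckThenActBuffer`, the random-replacement policy, and the `Thread.pass` call are all assumptions made for demonstration; this is not the actual CRubyTraceBuffer code.

```ruby
# Hypothetical, simplified buffer; NOT the actual CRubyTraceBuffer implementation.
class CheckThenActBuffer
  attr_reader :items

  def initialize(max_size)
    @max_size = max_size
    @items = []
  end

  def push(item)
    if @items.length < @max_size      # (1) check
      Thread.pass                     # demo only: invite a thread switch between check and act
      @items << item                  # (2) act: other threads may have appended since (1)
    else
      @items[rand(@max_size)] = item  # assumed policy: overwrite a random slot when full
    end
  end
end

buffer = CheckThenActBuffer.new(10)
threads = Array.new(8) { Thread.new { 1_000.times { |i| buffer.push(i) } } }
threads.each(&:join)
puts "max_size=10, actual length=#{buffer.items.length}"
```

Each individual array operation is safe on MRI, but several threads can all pass the check at (1) before any of them reaches (2), so the final length may end up above `max_size`. Whether it does on a given run depends on thread scheduling.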
This means that this simple change:
easily triggers test failures and goes over the expected 4%.
In terms of correctness, I think we could fix this by executing a slice! operation after we add an item, to ensure that the array did not grow, something along the lines of...
...but I decided to open this issue instead of a PR because we may want to take a different approach.
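The snippet originally attached to this description is not reproduced here. As an independent sketch of the general idea (trim right after every add), something like the following could work. `TrimmingBuffer` and the random-replacement branch are hypothetical, and the reasoning assumes MRI, where individual Array operations are not interleaved.

```ruby
# Hypothetical bounded buffer sketch; class and method names are illustrative only.
class TrimmingBuffer
  attr_reader :items

  def initialize(max_size)
    @max_size = max_size
    @items = []
  end

  def push(item)
    if @items.length < @max_size
      @items << item
      # Trim right after the add, in case concurrent adds pushed the array past max_size.
      # On the common path there is nothing to remove and slice! returns nil.
      @items.slice!(@max_size..)
    else
      @items[rand(@max_size)] = item  # assumed policy: overwrite a random slot when full
    end
  end
end

buffer = TrimmingBuffer.new(10)
threads = Array.new(8) { Thread.new { 1_000.times { |i| buffer.push(i) } } }
threads.each(&:join)
puts "max_size=10, actual length=#{buffer.items.length}"
```

The array can still overshoot briefly between an append and the trim that follows it, but every append is followed by its own trim, so once all writers have finished the length is back within `max_size`.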