
Memory Mode: Adding first part support for synchronous instruments - storage #5998

Merged

Conversation

Contributor

@asafm asafm commented Nov 19, 2023

Epic

This is the second PR out of several that implement #5105.

What was done here?

Synchronous storage now reuses the list of points used to return the result in collect(). Also, in DELTA mode, we keep the Attributes-to-handle map in memory across collect cycles instead of starting from an empty map, to avoid memory allocations. This means higher memory usage in DELTA aggregation temporality, and potentially more frequent recording to the overflow Attributes when max cardinality is reached frequently. To alleviate that, we remove unused Attributes from memory during collect(), which can in turn increase memory allocations if those Attributes are recorded again; this should be solved by increasing the max cardinality configuration, and a WARN log message gives you an indication of that.

Synchronous Storage (DefaultSynchronousMetricStorage)

Added support for MemoryMode:

  • The list of points returned by collect() is now allocated once and reused across calls to collect() in REUSABLE_DATA memory mode.
  • In DELTA aggregation temporality, each time collect() starts we switch the active AggregatorHolder with a new one, which contains a new empty aggregatorHandles map. This meant allocating a new map entry and AggregatorHandle for each unique Attributes recorded. Hence, we've made the following changes for REUSABLE_DATA memory mode:
    • When collect() ends, we keep the non-active aggregatorHandles map (which we just finished reading the data from), so the next time collect() starts and switches the active AggregatorHolder, it switches to one that has that map as its aggregatorHandles map. This prevents any memory allocations for the map and aggregator handles for Attributes that were recorded before.
      • The penalty is higher memory usage for DELTA, as we accumulate Attributes in memory instead of starting from an empty map on each collect cycle.
    • The above change required collect() to differentiate between handles that were actually recorded to since the last collect() and ones that were not. Hence a new boolean flag, valuesRecorded, was added to AggregatorHandle. It's basically a "dirty" flag which tells us whether a recording was made to this handle; of course it is reset along with the values when doing a reset. (This explanation is in the context of DELTA temporality.)
    • When the aggregatorHandles map reaches full capacity, as defined by maxCardinality (user-defined or default), we've decided to sacrifice some memory allocation in favor of continuing to record new unique Attributes. We do that by removing any AggregatorHandle which was not recorded to since the last collect() (using our new boolean flag). REUSABLE_DATA users sensitive to memory allocations will see the improved WARN message and can head over to MemoryMode.java, where we added a detailed explanation of the memory allocations and how it all behaves.
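The cycle described above can be sketched roughly as follows. This is a hypothetical, simplified illustration, not the SDK's actual classes: Handle stands in for AggregatorHandle, a String key stands in for Attributes, and concurrency is ignored so the swap/reset/evict logic stays visible.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the REUSABLE_DATA delta-collect cycle (assumed names).
class ReusableDeltaStorage {
  static final class Handle {
    long sum;
    boolean valuesRecorded; // "dirty" flag: was a recording made since the last collect()?
  }

  private final int maxCardinality;
  private Map<String, Handle> active = new HashMap<>();   // receives recordings
  private Map<String, Handle> standby = new HashMap<>();  // kept from an earlier cycle

  ReusableDeltaStorage(int maxCardinality) {
    this.maxCardinality = maxCardinality;
  }

  void record(String attributes, long value) {
    Handle handle = active.get(attributes);
    if (handle == null) {
      if (active.size() >= maxCardinality) {
        // Map is full: fold the measurement into the overflow series.
        attributes = "CARDINALITY_OVERFLOW";
      }
      handle = active.computeIfAbsent(attributes, k -> new Handle());
    }
    handle.sum += value;
    handle.valuesRecorded = true;
  }

  /** Returns delta points recorded since the previous collect(). */
  Map<String, Long> collect() {
    Map<String, Handle> collected = active;
    active = standby; // swap in the map (with its handles) kept from an earlier cycle

    Map<String, Long> points = new HashMap<>();
    collected.entrySet().removeIf(entry -> {
      Handle handle = entry.getValue();
      if (!handle.valuesRecorded) {
        return true; // evict: frees cardinality slots at the cost of future re-allocation
      }
      points.put(entry.getKey(), handle.sum);
      handle.sum = 0;                // reset the delta value in place
      handle.valuesRecorded = false; // reset the dirty flag
      return false;                  // keep the entry and handle for reuse
    });

    standby = collected; // reuse this map (and its surviving handles) later
    return points;
  }
}
```

Note how eviction happens only inside collect(), so a map that filled up with stale handles can push new Attributes to the overflow series until the next collection runs.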

MemoryMode

  • Added an explanation of the memory behavior noted above

Other alternatives

The initial idea was to create a PooledConcurrentHashMap similar to what was done in AsynchronousMetricStorage, but unfortunately it was discovered that no such implementation exists in open source, nor is the JDK implementation simple enough to imitate with the small adjustment of pooling the map entries. With such a pool, we could have used it as the data structure for aggregatorHandles and avoided allocating map entries. For AggregatorHandle we could have changed the existing pool to a PooledConcurrentArrayStack. Again, none was to be found, and since we couldn't implement the pooled ConcurrentHashMap, we saw no advantage in continuing down this path.

Next?

This is just the first part. We estimate 3-6 parts, in an effort to keep the PRs small.

@asafm asafm requested a review from a team as a code owner November 19, 2023 15:18
Contributor Author

asafm commented Nov 19, 2023

@jack-berg First PR. This is the most minimal PR I could come up with that still has value.


codecov bot commented Nov 22, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (531898c) 91.18% compared to head (3444c2c) 91.20%.
Report is 13 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5998      +/-   ##
============================================
+ Coverage     91.18%   91.20%   +0.02%     
- Complexity     5613     5627      +14     
============================================
  Files           616      616              
  Lines         16566    16593      +27     
  Branches       1642     1647       +5     
============================================
+ Hits          15105    15133      +28     
+ Misses         1013     1012       -1     
  Partials        448      448              


* the first 1999 Attributes would be recorded, and the remaining 1001 unique Attributes values
* would be recorded under the CARDINALITY_OVERFLOW Attributes. If after several collect operations
* the user records values to only 500 unique attributes, then during the collect operation
* the unused 1500 Attributes would be cleared from memory.
Member


Want to call out this paragraph to other @open-telemetry/java-approvers / @open-telemetry/java-maintainers - @asafm and I have talked about this, but others should be aware of / comment on this behavior.

Essentially, when temporality is DELTA and memory mode is REUSABLE_DATA, you might have a cardinality limit of 2000 for an instrument but only get back 1000 series. This is the result of REUSABLE_DATA memory mode prioritizing low allocations, which is implemented by holding onto the aggregators for series even when they don't have measurements. The behavior of IMMUTABLE mode with delta is to remove the aggregators from the ConcurrentHashMap and put them in an object pool for reuse, but re-adding the entries to the map causes memory allocations.
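For contrast, the IMMUTABLE-mode behavior described here can be sketched like this. Again a hypothetical, simplified, single-threaded illustration with assumed names (Handle for AggregatorHandle, String keys for Attributes), not the SDK's code: the handle object is recycled through a pool, but the map entry itself is re-created on the next record.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Sketch of the IMMUTABLE delta behavior: collect() drains the map and pools
// the handles, so no stale series linger but map entries are re-allocated.
class ImmutableDeltaStorage {
  static final class Handle { long sum; }

  private final Map<String, Handle> handles = new HashMap<>();
  private final Queue<Handle> pool = new ArrayDeque<>();

  void record(String attributes, long value) {
    // A miss creates a new map entry even when the handle comes from the pool.
    Handle handle = handles.computeIfAbsent(attributes, k -> {
      Handle pooled = pool.poll();
      return pooled != null ? pooled : new Handle();
    });
    handle.sum += value;
  }

  Map<String, Long> collect() {
    Map<String, Long> points = new HashMap<>();
    for (Map.Entry<String, Handle> entry : handles.entrySet()) {
      Handle handle = entry.getValue();
      points.put(entry.getKey(), handle.sum);
      handle.sum = 0;
      pool.offer(handle); // recycle the handle object itself
    }
    handles.clear(); // entries removed: no stale series, but re-adding allocates
    return points;
  }
}
```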

Member


(I haven't looked at any of the code, so could be way off base)

would it work to set a boolean field on the reusable objects back to false after collect, and then keep a running count of how many reusable objects flipped back to true during the interval, so we can "degrade" to creating new objects right away if needed, for "correctness"?

Contributor Author


@trask I believe (correct me if I'm wrong) @jack-berg meant a different problem. Say before the 1st collection you recorded to attributes A1-A1000, and the same before the 2nd collection.
Then before the 3rd collection, you recorded only to attributes A1001-A2000. In this case, the response you get in the 3rd collection is a list of 1000 points (for this instrument) + 1 point for the OVERFLOW Attributes. You also get a warning in the log saying you crossed the allowed number of unique attributes.
The user might be confused: they set a cardinality limit of 2000, got 1000 points in the last collection plus this warning about crossing the limit, yet 1000 < 2000 (the limit). Before the 5th collection, the map would already be cleared of attributes A1-A1000 and only filled with attributes A1001-A1999 (with value = 0).

This only happens in MemoryMode.REUSABLE_DATA, which has to be set explicitly, so you are expected to read the Javadoc of MemoryMode, which explains this.

What you are suggesting is to try to optimize by clearing the map of unused handles during recording, rather than waiting for collect(). We do have the flag you suggest on the handles (hasRecordedValues). The only issue is that we'd have to keep another data structure to know which elements in the handles map are free to evict (hasRecordedValues == false); otherwise we'd have an O(N) search to find map values with this flag set to false. This would run per recording(), which is the hot path. Not sure it's a good thing.
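The extra data structure weighed in this comment could look roughly like the following. This is a hypothetical sketch of the discussed alternative, not anything in the PR: the class name, the evictable queue, and the String keys standing in for Attributes are all assumptions, and concurrency is ignored.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Sketch: a secondary queue tracks keys whose handles have
// hasRecordedValues == false, so record() can free a slot without an O(N)
// scan of the map. The cost is extra bookkeeping on the recording hot path.
class EagerEvictStorage {
  static final class Handle {
    long sum;
    boolean hasRecordedValues;
  }

  private final int maxCardinality;
  final Map<String, Handle> handles = new HashMap<>();
  private final Queue<String> evictable = new ArrayDeque<>();

  EagerEvictStorage(int maxCardinality) {
    this.maxCardinality = maxCardinality;
  }

  void record(String attributes, long value) {
    Handle handle = handles.get(attributes);
    if (handle == null) {
      if (handles.size() >= maxCardinality && !evictStale()) {
        attributes = "CARDINALITY_OVERFLOW"; // nothing stale to evict: overflow
      }
      handle = handles.computeIfAbsent(attributes, k -> new Handle());
    }
    handle.sum += value;
    handle.hasRecordedValues = true;
  }

  // Pops queued keys until one is still stale; skips keys recorded since queuing.
  private boolean evictStale() {
    String key;
    while ((key = evictable.poll()) != null) {
      Handle handle = handles.get(key);
      if (handle != null && !handle.hasRecordedValues) {
        handles.remove(key);
        return true;
      }
    }
    return false;
  }

  // After reading points in collect(), mark every handle stale until it records again.
  void endCollect() {
    handles.forEach((key, handle) -> {
      handle.hasRecordedValues = false;
      evictable.add(key);
    });
  }
}
```

Even in this form, every cache miss on a full map touches the queue, which illustrates the per-recording overhead the comment is wary of.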

asafm and others added 5 commits November 22, 2023 20:10
…moryMode.java

Co-authored-by: jack-berg <34418638+jack-berg@users.noreply.github.com>
…lder.java

Co-authored-by: jack-berg <34418638+jack-berg@users.noreply.github.com>
Contributor Author

asafm commented Nov 22, 2023

@jack-berg Fixed everything

Contributor Author

asafm commented Nov 23, 2023

@jack-berg I can't think of a scenario where we have a pool larger than maxCardinality:

    // Trim pool down if needed. pool.size() will only exceed maxCardinality if new handles are
    // created during collection.
    int toDelete = aggregatorHandlePool.size() - (maxCardinality + 1);
    for (int i = 0; i < toDelete; i++) {
      aggregatorHandlePool.poll();
    }

Contributor Author

asafm commented Nov 26, 2023

@jack-berg All conversations have been resolved

Member

@jack-berg jack-berg left a comment


Looks good to me. I'm going to hold off on merging until some of the other @open-telemetry/java-approvers have a chance to weigh in on the proposed behavior highlighted here.

Contributor Author

asafm commented Dec 7, 2023

@jack-berg Who shall we ping from java-approvers? :)

Contributor Author

asafm commented Dec 7, 2023

@open-telemetry/java-approvers

@jack-berg
Member

Planning on merging this tomorrow if there are no additional comments.

@jack-berg jack-berg merged commit ffd53c7 into open-telemetry:main Dec 14, 2023
18 checks passed
@asafm asafm deleted the memory-mode-sync-instruments-part1 branch December 24, 2023 14:59