fix: Properly handle records with multiple rows in batching #1647
Co-authored-by: Kemal <223029+disq@users.noreply.github.com>
TODO: Consider making a slicer to handle batching by rows & size. This is complicated by the fact that different batchers use different messages, so an overhaul might be required for this.
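For illustration only, the row half of such a slicer could start from a helper that splits a record's row count into ranges no larger than the batch limit. This is a minimal sketch, not code from this PR: `rowRange` and `splitRows` are hypothetical names, and slicing by size is not covered because byte sizes do not divide evenly across rows.

```go
package main

// rowRange is a hypothetical half-open [start, end) range of row indices.
type rowRange struct{ start, end int64 }

// splitRows splits numRows rows into consecutive ranges of at most maxRows
// rows each. A non-positive maxRows is treated as "no limit" (an assumption
// of this sketch) and yields a single range covering all rows.
func splitRows(numRows, maxRows int64) []rowRange {
	if numRows <= 0 {
		return nil
	}
	if maxRows <= 0 {
		return []rowRange{{0, numRows}}
	}
	ranges := make([]rowRange, 0, (numRows+maxRows-1)/maxRows)
	for start := int64(0); start < numRows; start += maxRows {
		end := start + maxRows
		if end > numRows {
			end = numRows // the last range may be shorter
		}
		ranges = append(ranges, rowRange{start, end})
	}
	return ranges
}
```

Each range could then be used to cut a slice out of the original record. How the resulting pieces map onto the different message types used by each batcher is the open part of the TODO.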
Added a few comments, the main ones are:
- Don't think we need the abstraction over int64 as it's basically wrapping `+=`, `>` and `Get`
- This seems to change the logic to flush after the limit has been reached instead of before. I don't think that will be an issue for limiting by rows (it works the same for single-row records, and is broken for multi-row records anyway), but it can be an issue for limiting by bytes. Regardless, it's probably best to flush before going over the limit
Also, per your comment in #1647 (comment), if we roll out multi-row records this will become an issue for batching both by rows and by bytes, whereas with single-row records we're less likely to exceed the batch size.
Not sure what to do about that, or how slicing would affect the performance improvements we get from multi-row records.
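To make the flush-before suggestion concrete, here is a minimal sketch of the ordering I have in mind. It is not the code in this PR: `item`, `batcher` and its methods are hypothetical stand-ins, and a limit of 0 is assumed to mean "no limit".

```go
package main

// item is a hypothetical stand-in for a record; it only carries the
// counts the batcher needs.
type item struct {
	rows  int64
	bytes int64
}

// batcher accumulates items and flushes BEFORE a limit would be exceeded.
type batcher struct {
	maxRows, maxBytes int64 // 0 means "no limit" (assumption of this sketch)
	rows, bytes       int64 // totals for the pending batch
	pending           []item
	flushed           [][]item // completed batches, kept here for inspection
}

// exceeds reports whether adding it to the pending batch would cross a limit.
func (b *batcher) exceeds(it item) bool {
	overRows := b.maxRows > 0 && b.rows+it.rows > b.maxRows
	overBytes := b.maxBytes > 0 && b.bytes+it.bytes > b.maxBytes
	return overRows || overBytes
}

// flush moves the pending batch to flushed and resets the counters.
func (b *batcher) flush() {
	if len(b.pending) == 0 {
		return
	}
	b.flushed = append(b.flushed, b.pending)
	b.pending, b.rows, b.bytes = nil, 0, 0
}

// add flushes first if the new item would push the batch over a limit,
// then appends the item to the (possibly fresh) pending batch.
func (b *batcher) add(it item) {
	if len(b.pending) > 0 && b.exceeds(it) {
		b.flush()
	}
	b.pending = append(b.pending, it)
	b.rows += it.rows
	b.bytes += it.bytes
}
```

Note that a single item larger than the limit still ends up in a batch of its own that is over the limit. That is the multi-row case where slicing would be needed.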
6ba3a90 to 1f44030
🤖 I have created a release *beep* *boop*

---

## [4.42.1](v4.42.0...v4.42.1) (2024-05-15)

### Bug Fixes

* Correct error message on Read failure ([#1680](#1680)) ([dc31c3a](dc31c3a))
* Properly handle records with multiple rows in batching ([#1647](#1647)) ([926a7fc](926a7fc))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Fixes https://github.com/cloudquery/cloudquery-issues/issues/1567
Out of scope: https://github.com/cloudquery/cloudquery-issues/issues/1655