DB: Truncate timestamps to microseconds #7075
Closed
Historically, go-sql-driver/mysql rounded all timestamps to the nearest microsecond before forwarding them to the database connector. MySQL only supports microsecond precision for timestamps, so the driver did the rounding first to avoid sending unnecessary bits on the wire.
However, this led to a double-rounding bug, detailed in go-sql-driver/mysql#1121. The fix was for the driver to stop rounding timestamps entirely, favoring correctness over optimization and letting the database itself handle rounding and truncation however it is configured to do so. This fix landed in go-sql-driver/mysql#1172, which was included in the v1.6.0 release of that package (https://github.com/go-sql-driver/mysql/releases/tag/v1.6.0).
When we attempted to upgrade our version of go-sql-driver/mysql to v1.6.0, we ran into significant performance problems which we failed to diagnose. The same happened over a year later, when we attempted to upgrade to v1.7.1. In the course of these investigations we found and fixed numerous other bugs, including upstreaming fixes to ProxySQL, but none of those seemed to alleviate the original issue. See the following bugs and PRs for the full timeline and context:
Finally, with the help of new deployment infrastructure, we bisected the whole set of go-sql-driver/mysql commits between v1.5.0 and v1.7.1 to try to find the culprit. This revealed the timestamp-rounding commit described above as the first commit to cause performance issues, and finally made the explanation clear.
Our database has a significant quantity of timestamp data, all of which has been written to the database with truncated timestamps, thanks to go-sql-driver's historical behavior. When switching to a newer version, all of our SELECT queries which used time ranges were now querying for those time ranges with untruncated precision, and were therefore not using our existing indexes.
This obviously led to significant query performance issues, particularly on the `fqdnSets`, `orderFqdnSets`, and `authz2` tables. These tables are very heavily queried with time ranges both for order and authorization reuse, and for rate limits. The fact that this issue only arises in the presence of significant historical data also explains why we could never reproduce it in the development environment or integration tests.

This change causes Boulder to do the time truncation that go-sql-driver used to do. This should allow our queries to continue using the existing indexes, and prevent performance regressions.