Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing preparations #1479

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Conversation

pitdicker
Copy link
Collaborator

@pitdicker pitdicker commented Mar 2, 2024

These issues came up when trying to convert our parsing code to the new error types (only 750 errors to go... 😆).

I would like to restrict the visibility of ParseError to the format module on 0.5, and make all parsing go through format::parsed.
That means the FromStr implementations of DateTime<FixedOffset> and FixedOffset should use the standard parsing methods instead of reaching directly for the functions in format::{parse, scan}.

In timezone_offset I simplified digit parsing a bit because it did not have quite the places to hook up new error values as I wanted. It remains a pretty broken parser though.

Further I noticed a 0i64.checked_sub(v).ok_or(OUT_OF_RANGE) line. Subtracting a positive i64 from 0 can never fail. Then I noticed we don't support parsing negative timestamps at all! That is fixed in the last commit. Parsing i64::MIN was a little tricky.

Copy link

codecov bot commented Mar 2, 2024

Codecov Report

Attention: Patch coverage is 98.07692% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 91.83%. Comparing base (391187f) to head (3b7fa22).

Files Patch % Lines
src/format/scan.rs 96.77% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1479   +/-   ##
=======================================
  Coverage   91.82%   91.83%           
=======================================
  Files          37       37           
  Lines       18173    18166    -7     
=======================================
- Hits        16687    16682    -5     
+ Misses       1486     1484    -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -14,7 +14,7 @@ use crate::Weekday;
/// More than `max` digits are consumed up to the first `max` digits.
/// Any number that does not fit in `i64` is an error.
#[inline]
pub(super) fn number(s: &str, min: usize, max: usize) -> ParseResult<(&str, i64)> {
pub(super) fn number(s: &str, min: usize, max: usize, positive: bool) -> ParseResult<(&str, i64)> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we trying to do here? Why do we need to support negative timestamps? Having to pass in true for positive all over the place doesn't look great.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to support negative timestamps?

We support negative timestamps everywhere, it seems like an oversight to not support them with the %s formatting specifier when parsing.

Having to pass in true for positive all over the place doesn't look great.

My problem was that I wanted to do a complete fix instead of partial. I.e. to also support parsing i64::MIN (which can be stored in Parsed even though it is out of range for NaiveDateTime, and is something I expect users may test against). Because -i64::MIN can't be represented in an i64 this function should either return an u64, or it must return a value with the correct sign. Returning with the correct sign was the minimal change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think it would be better to turn number() into a helper wrapper of an underlying parsing function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll do that.

@pitdicker
Copy link
Collaborator Author

Added another commit to remove some duplicated length checks. They where just before scan::nanosecond_fixed, which uses scan::number that does the same checks.

@@ -15,6 +15,16 @@ use crate::Weekday;
/// Any number that does not fit in `i64` is an error.
#[inline]
pub(super) fn number(s: &str, min: usize, max: usize) -> ParseResult<(&str, i64)> {
let (s, n) = negative_number(s, min, max)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very confused by the logic of parsing a non-negative number by first parsing a negative number and then negating it...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was quite nice. Positive numbers are not 'better' than negative numbers, and in this case negative numbers had the one integer larger range we needed.

Updated to parse to an u64 and convert with a tricky cast instead.

@@ -424,25 +424,16 @@ where
}

&Internal(InternalFixed { val: InternalInternal::Nanosecond3NoDot }) => {
if s.len() < 3 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain in the commit message why they're duplicate (that is, how these invariants are upheld)? It's not clear in the diff.

/// More than `max` digits are consumed up to the first `max` digits.
/// Any number that does not fit in `i64` is an error.
#[inline]
pub(super) fn negative_number(s: &str, min: usize, max: usize) -> ParseResult<(&str, i64)> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to inline this in the caller (more like before), since there aren't any other users, anyway.

Is there a way to more directly express the precondition for negating the number? This feels kind of roundabout -- even the previous 0i64.checked_sub(v).ok_or() feels clearer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants