Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect iso week-years (w) #887

Closed
wants to merge 5 commits into from
Closed

Fix incorrect iso week-years (w) #887

wants to merge 5 commits into from

Conversation

xmo-odoo
Copy link
Contributor

@xmo-odoo xmo-odoo commented Jun 15, 2022

This is related to #885 but doesn't fix the non-ISO issues, probably (not actually tested).

The issue this fixes is that while #621 fixed some cases it broke others, specifically the case where a day of year A is part of a week of year A-1, but A-1 has 53 weeks and A has 52.

See last commit for a complete explanation of the process.

  • The first commit is just a convenience because the current tox.ini doesn't forward parameters to pytest which is frustrating.
  • The second commit adds an exhaustive-ish parametrized test to the test suite: using a different method of computing week numbers (which I assume is correct but might not be), it checks every date within 10 days of January 1st, from year 1900 to 2100, to verify that babel formats the correct iso-week-year and iso-week.
  • The last commit fixes the issue.

Note that this is only a fix (somewhat dubious) for incorrect ISO weeks. It does not fix:

I figure it might be useful to expand the exhaustive test to other known locales, or even to all combinations of (min_week_days, first_week_day) though ensuring the oracle is correct under all conditions would be less trivial, here I was able to check no a normal calendar to see if test failures were genuine (aka the oracle was correct and babel not). I don't think there's a calendar out there where the first DOW is wednesday and the first week of the year must have 5 days or some such nonsense.

I also just realised that I could just have used isocalendar() directly as my ISO oracle instead of computing it by hand... Tough I guess my excuse is that I was thinking of non-ISO calendars where I don't think there's a built-in oracle.

@codecov
Copy link

codecov bot commented Jun 15, 2022

Codecov Report

Merging #887 (9245914) into master (76677ea) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #887      +/-   ##
==========================================
+ Coverage   90.98%   91.00%   +0.02%     
==========================================
  Files          25       25              
  Lines        4393     4393              
==========================================
+ Hits         3997     3998       +1     
+ Misses        396      395       -1     
Files Coverage Δ
babel/dates.py 87.39% <100.00%> (+0.14%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xmo-odoo
Copy link
Contributor Author

@akx any issue?

@akx
Copy link
Member

akx commented Oct 31, 2022

Hi @xmo-odoo, so sorry for the delay.

I took a look at this originally and left it to simmer in a background thread and eventually forgot. I'm pretty hesitant about a calendar-related change (because times are hard and this could probably affect someone's business).

I'd love to see a reference to some other system implementing things the same way – in fact using isocalendar for the tests would be a great way to do that, since I do trust Python's stdlib to have gotten ISO weeks right..?

@xmo-odoo
Copy link
Contributor Author

xmo-odoo commented Nov 2, 2022

I'd love to see a reference to some other system implementing things the same way – in fact using isocalendar for the tests would be a great way to do that, since I do trust Python's stdlib to have gotten ISO weeks right..?

The second commit should be a pretty exhaustive test for the range [1900, 2100]. Using a known and perfectly reliable computation method (first iso week is the one which contains Jan 4th) seemed like an ideal oracle for tests. Unless it's not a proper definition of ISO week and has caveats?

Furthermore since the actual code uses isocalendar(), I figured it would be a bad idea to reuse that for the test oracle.

@xmo-odoo
Copy link
Contributor Author

@akx rebased the code because there were new conflicts, however there seem to be unrelated test failures from problems with some datetime names or something along those lines: 3 tests fail because they assume US datetimes are standard timezones (resp. US/Eastern and America/Los_Angeles) but they get the daylight version.

It's convenient to be able to invoke maxfail or request a specific
test even through tox, but because of the setenv it didn't seem to
work (or I did something wrong).
200 years of tests might be more than necessary, but this doesnt seem
to actually add a completely unreasonable amount of runtime: before
this, on my machine, `-epy310` lists 4010 tests running in 25.65s with
a total `time` of 44.55 (43 user, 1.52 sys); with these it lists 8231
tests running in 40s with a total `time` of 59.18 (56.83 user, 2.28
system).

For each year, the first week of year is computed through the Jan 4th
method: the first week of a year contains Jan 4th, then week numbers
are computed based on their offset from that.

Uses parametrized test with an explicit id for precise yet readable
selection e.g. a given instance shows up as

    test_iso_week_exhaustive[1904-01-01]

where the id between brackets is the date being checked in ISO format.

Also remove ad-hoc existing ISO tests for `w` since they're
redundant (though they're not actively harmful, probably).
PR #621 had added an adjustment for overflowing week
numbers (e.g. days of calendar year A living in weak-year A+1),
however this adjustment would mis-trigger on *underflow*: in that case
the year of `day_of_period` and `self.value` differ (`self.value` is
the next year), meaning the number of weeks in the year differs, so
when underflowing a week from a 52-weeks year to a 53-weeks year, the
weeks count would then be overflowed back, and associated with the
wrong week-year.

Example: in 2010-01-04 is on a Monday, so 2010-01-03 is part of the
last week of 2009, which is 2009 W53.

But because of the way `get_week_number` works we first compute the
week number of "3, sunday", find that it's 0, compute the week number
of December 31st, 2009, get 53, then check the number of weeks
against *2010* because `isocalendar` is called with 2010-01-03
adjusted to *2010*-12-28, thus we end up with a week number of 1,
associated with a correct week-year of 2009.

Hence being told that 2010-01-03 is in 2009-W01.

Fix by moving the modulo operation to `format_week` and only to the
case where we may have overflowed the year.

Notes:

- the check is incredibly gross
- the issue of overflowing the year occurs in every week numbering
  system, however it means we need to compute the number of weeks in
  non-ISO locales
- also it's *under*flowing which is only an issue in ISO and
  technically it's not even ISO itself, but a `min_week_days` which is
  not 1, otherwise by definition the week-year of a date can only be
  equal to or higher than the calendar year
- similar issues almost certainly occurs with months as well, but I'm
  not convinced it matters to anyone given week-months are not a thing
  so non-split month-weeks are unusable
Simplifies the generation of subtests sufficiently that using a single comprehension is kinda feasible.
@xmo-odoo xmo-odoo closed this by deleting the head repository Oct 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants