incorrect date without leading zero in .get('2016-1-17') #292

Closed
bencharb opened this issue Jan 4, 2016 · 9 comments

Comments

@bencharb
bencharb commented Jan 4, 2016

This is misleading. '2016-01-17' != '2016-1-17'

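import arrow
import dateutil.parser  # import the parser submodule explicitly

withzero = '2016-01-17'
withoutzero = '2016-1-17'
# dateutil parses both strings to the same date
assert dateutil.parser.parse(withzero) == dateutil.parser.parse(withoutzero)
# arrow parses them to different dates
assert arrow.get(withzero) != arrow.get(withoutzero)
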
@mattalytics

Even more disturbingly:

arrow.get('2016/1/1').datetime == arrow.get('2016/1/10').datetime

!!!

@bencharb
Author

bencharb commented Jan 6, 2016

I'm startled that this basic parsing fails.

@philiptzou
Contributor

According to the documentation, arrow.get supports the format "2016-01-17" because it is an ISO-8601-formatted str [doc:ArrowFactory]. According to RFC 3339, which follows ISO 8601, date-month, date-mday, time-hour, time-minute and time-second are all strictly two-digit fields [rfc3339:sec5.6].

So I'm afraid this might be a wontfix; perhaps you could use a more flexible way to parse that date string. For example:

import arrow
arrow.parser.DateTimeParser().parse('2016-1-10', 'YYYY-M-D')

Another solution would be to add a method to DateTimeParser that loosely parses strings that look like ISO-8601 but are not strictly conformant. Actually, you can add this to your own software easily: copy the code of DateTimeParser.parse_iso and replace every MM with M, DD with D, HH with H, mm with m, and ss with s. Don't forget the ones in DateTimeParser.MARKERS. That gives you your own parse_loose_iso function.
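
For illustration, the loose behavior described above might look roughly like the sketch below. It uses only the standard library and is not Arrow's code: parse_loose_iso is a hypothetical helper (the name is borrowed from the suggestion), it handles only the date portion, and it assumes '-' or '/' as the separator.

```python
import re
from datetime import datetime

# Hypothetical helper, not part of Arrow: month and day may have 1 or 2 digits.
# The backreference \2 requires the same separator in both positions.
_LOOSE_DATE = re.compile(r"^(\d{4})([-/])(\d{1,2})\2(\d{1,2})$")


def parse_loose_iso(text):
    """Parse 'YYYY-M-D' or 'YYYY/M/D', with or without leading zeros."""
    match = _LOOSE_DATE.match(text.strip())
    if match is None:
        raise ValueError("not an ISO-like date: %r" % (text,))
    year, _sep, month, day = match.groups()
    # datetime() itself rejects impossible dates such as 2016-09-31.
    return datetime(int(year), int(month), int(day))
```

With this sketch, parse_loose_iso('2016-1-17') and parse_loose_iso('2016-01-17') give the same datetime, and anything that does not match the pattern raises ValueError instead of being silently altered.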

@mattalytics

Hi Philip,

I see. That is not unreasonable. However, if Arrow is to strictly
enforce the 2 digit month/day standard, perhaps a 1 digit day/month (e.g.
2010-1-1) should throw an error. Quietly transforming the date is a good
way to cause bugs in users' implementations.

Personally, I like the idea of arrow being a bit more flexible. Ease of
use is why I decided to try using arrow. Just my personal perspective.

Matt
On Tuesday, January 5, 2016, Philip Tzou notifications@github.com wrote:

According to the document, arrow.get support the format "2016-01-17" is
because it is an ISO-8601-formatted str [doc:ArrowFactory]
http://arrow.readthedocs.org/en/latest/#arrow.factory.ArrowFactory.
According to RFC3339 which follows ISO-8601, the date-month, date-mday,
time-hour, time-minute and time-second are all strict 2 digit chars
[rfc3339:sec5.6] http://tools.ietf.org/html/rfc3339#section-5.6.

So I'm afraid this might be wontfix and perhaps you could use more
flexible way to parse that date string. For example:

import arrow
arrow.parser.DateTimeParser().parse('2016-1-10', 'YYYY-M-D')

Or another solution here is we add another method to DateTimeParser which
loosely parse string looks like ISO-8601 but not exactly is? Actually you
can add such thing to your software easily. Just copy all the code of
DateTimeParser.parse_iso and replace all MM to M, DD to D, HH to H, mm to
m, and ss to s. Also don't forget the ones in DateTimeParser.MARKERS. And
you got your own parse_loose_iso function.



@bencharb
Author

bencharb commented Jan 6, 2016

I'm with Matt: if the date string is invalid or ambiguous, it ought to raise an exception. Thanks for your attention to this.
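
As a rough sketch of the strict alternative discussed here (reject anything that is not exactly YYYY-MM-DD rather than quietly transforming it), a standard-library version could look like the following. parse_strict_date is a hypothetical name, not an Arrow function, and it covers only the date portion.

```python
import re
from datetime import date

# Hypothetical helper, not part of Arrow: exactly 4-2-2 digits, as in RFC 3339.
_STRICT_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_strict_date(text):
    """Return a date for 'YYYY-MM-DD'; raise ValueError for anything else."""
    match = _STRICT_DATE.fullmatch(text)
    if match is None:
        raise ValueError("expected YYYY-MM-DD, got %r" % (text,))
    year, month, day = (int(part) for part in match.groups())
    # date() raises ValueError for out-of-range values such as day 31 in September.
    return date(year, month, day)
```

Here '2016-1-17' and '2016-09-31' both raise ValueError, so a malformed or impossible date cannot silently become a different one.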

@philiptzou
Contributor

I also support raising an exception; I think it is feasible. By the way, @bencharb, I think you still need to parse the non-standard string. Arrow is great at solving the timezone headache and producing beautiful human-readable strings, but it may not be the best fit for your needs. AFAIK parsedatetime is good at parsing human-readable datetime strings, so you may want to try that instead.

@bencharb
Author

bencharb commented Jan 6, 2016

That works, thanks, @philiptzou.

@balihoo-gens

I'd like to reference issue #267 here as it points out a similar case of defaulting to 1 when parsing fails (instead of an exception). Example:

>>> arrow.get("2016-09-31")
<Arrow [2016-09-01T00:00:00+00:00]>

One would hope for an exception like the one dateutil.parser.parse gives:

>>> dateutil.parser.parse("2016-09-31")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "dateutil/parser.py", line 1164, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "dateutil/parser.py", line 577, in parse
    ret = default.replace(**repl)
ValueError: day is out of range for month

Currently, my workaround is to not have arrow.get do any parsing, using dateutil instead:
arrow.get(dateutil.parser.parse(date_string))

@andrewelkins
Contributor

Will be handled by #91
