Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add backwards compatible support for fmt::Write #407

Closed
wants to merge 12 commits into from

Conversation

Mathspy
Copy link

@Mathspy Mathspy commented Dec 6, 2021

Hello, first thank you for all the work on this lovely library! ❤️

I am submitting this PR for discussion and learning more than I am trying to get it actually merged

Implementation

This resolves #375 by adding a compatibility layer between fmt::Write and io::Write. And switching the formatting machinery to use fmt::Write.

Rationale

The reason for picking fmt::Write as the canonical implementation is because of this quote from Rust fmt::Write doc page:

This trait only accepts UTF-8–encoded data and is not flushable. If you only want to accept Unicode and you don’t need flushing, you should implement this trait; otherwise you should implement std::io::Write.

Since we only want to accept UTF-8 (Is this assumption correct?) and we don't need flushing (couldn't find any calls to .flush() in the codebase) it felt more correct to use fmt::Write.
Also since fmt::Write has more guarantees (i.e UTF-8) it's easier to implement fmt::Write for io::Write than the other way around.

Uncertainties

Possible follow-ups

  • We can add a SafeLiteral(&str) to FormatItem. We should also be able to update the macros and runtime parsers to use that instead of Literal(&[u8]).
  • If you think this is PR is a good idea, it's up to you to decide whether you want to keep support for both Writes for foreseeable future or to deprecate the support for io::Write to be removed in a future breaking change. (Could also deprecate Literal in favor of SafeLiteral in parallel)

Tell me what do you think and if you think this is taking a completely wrong dimension please don't hesitate to close it!

Thank you ❤️

@codecov
Copy link

codecov bot commented Dec 6, 2021

Codecov Report

Merging #407 (959bb73) into main (2b9e193) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #407   +/-   ##
=======================================
  Coverage   99.71%   99.71%           
=======================================
  Files          60       60           
  Lines        5615     5696   +81     
=======================================
+ Hits         5599     5680   +81     
  Misses         16       16           
Impacted Files Coverage Δ
src/date.rs 100.00% <100.00%> (ø)
src/error/format.rs 100.00% <100.00%> (ø)
src/formatting/formattable.rs 100.00% <100.00%> (ø)
src/formatting/mod.rs 100.00% <100.00%> (ø)
src/offset_date_time.rs 100.00% <100.00%> (ø)
src/primitive_date_time.rs 100.00% <100.00%> (ø)
src/time.rs 100.00% <100.00%> (ø)
src/utc_offset.rs 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b9e193...959bb73. Read the comment docs.

@jhpratt jhpratt added A-formatting Area: formatting C-feature-request Category: a new feature (not already implemented) C-keep-open Category: should not be closed due to inactivity labels Dec 9, 2021
@jhpratt jhpratt self-requested a review December 9, 2021 05:56
Mathspy added a commit to Mathspy/diary-generator that referenced this pull request Dec 22, 2021
For now `notion-generator` can't be published to crates.io because of
relying on the fork of time crate
Might be able to published it after time-rs/time#407
Mathspy added a commit to Mathspy/diary-generator that referenced this pull request Dec 22, 2021
For now `notion-generator` can't be published to crates.io because of
relying on the fork of time crate
Might be able to published it after time-rs/time#407
@jhpratt
Copy link
Member

jhpratt commented Jan 2, 2022

Hey there. Sorry about the delay in getting to this. Would you mind rebasing? The conflict is just due to a dependency upgrade and is trivial. The other thing I immediately see is that the error branches aren't covered; could you do this? std::fmt::Error is publicly a ZST so it shouldn't be difficult to construct a case that covers this.

Once these two things are done I'll do an in-depth review and merge if appropriate.

@jhpratt
Copy link
Member

jhpratt commented Jan 2, 2022

Oh, and assuming that the bytes are UTF-8 is not an assumption that should be made. It is currently possible to construct a FormatDescription manually with raw bytes, and I believe that there are tests that rely on this. I'm 99% certain there are real-world users that need this behavior too.

@Mathspy
Copy link
Author

Mathspy commented Jan 3, 2022

Hey there. Sorry about the delay in getting to this. Would you mind rebasing? The conflict is just due to a dependency upgrade and is trivial. The other thing I immediately see is that the error branches aren't covered; could you do this? std::fmt::Error is publicly a ZST so it shouldn't be difficult to construct a case that covers this.

Once these two things are done I'll do an in-depth review and merge if appropriate.

No worries! Happy new year ❤️
Fixed conflict and coverage

Oh, and assuming that the bytes are UTF-8 is not an assumption that should be made. It is currently possible to construct a FormatDescription manually with raw bytes, and I believe that there are tests that rely on this. I'm 99% certain there are real-world users that need this behavior too.

Yeah it looks like my assumption was indeed in incorrect, I was able to observe different behaviors using this test case

use time::format_description::FormatItem;
use time::macros::date;

const NON_UTF8: &[FormatItem<'_>] = &[
    FormatItem::Literal(&[195, 40]),
];

fn main() {
    let today = date!(2022 - 01 - 03);

    let mut bytes = vec![195, 40];

    dbg!(today.format_into(&mut bytes, NON_UTF8));
    dbg!(bytes);
}

On main branch this produced [195, 40, 195, 40] but on this branch it produces [195, 40, 239, 191, 189, 40] so this isn't mergable as-is.

One possible solution is to reverse the roles and implement io::Write for fmt::Write like this:

struct Compat<'a, W: fmt::Write> {
    writer: &'a mut W,
}

impl<'a, W> io::Write for Compat<'a, W>
where
    W: fmt::Write,
{
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let string = String::from_utf8_lossy(buf);

        self.writer
            .write_str(&string)
            .map(|_| string.len())
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "Internal fmt error"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

But this can be done in userland so I am not really sure if there's any benefit to including it here.
One other possible solution is for the Sealed trait to require a manual implementation for format_into and format_into_fmt_write where they'd be pretty much identical implementations except for behavior around FormatItem::Literal but that seems like a lot of duplicate code to keep around.

So I am not sure if there's anything here that's actionable for time crate anymore, should I close this?

@jhpratt
Copy link
Member

jhpratt commented Jan 3, 2022

But this can be done in userland so I am not really sure if there's any benefit to including it here.

Well, it's been requested a few times so there's some benefit in just doing it once so others don't have to. It's not like it's out of scope or anything.

One other possible solution is for the Sealed trait to require a manual implementation for format_into and format_into_fmt_write where they'd be pretty much identical implementations except for behavior around FormatItem::Literal but that seems like a lot of duplicate code to keep around.

Wouldn't the only place that would need to be changed be impl<'a> sealed::Sealed for FormatItem<'a>? That is the implementation that everything ultimately falls back on. If this is the case it's not that much code to duplicate: just a dozen lines or so. Well-known types can be passed through String::from_utf8 (matching and using unreachable!("…") on the result), as the formats are known to be UTF-8 in their entirety.

My thought right now (having largely not reviewed the approach you actually took) is to create a private trait that any type implementing io::Write or fmt::Write implements. All internal APIs can then be made generic over that trait. Right off the top of my head I think this might run into a currently unsolvable problem, though: a type can implement both io::Write and fmt::Write and specialization is not stable. If this is in fact the case, then altering the aforementioned implementation is probably the way to go unless I'm missing something?

@Mathspy
Copy link
Author

Mathspy commented Jan 3, 2022

Wouldn't the only place that would need to be changed be impl<'a> sealed::Sealed for FormatItem<'a>? That is the implementation that everything ultimately falls back on. If this is the case it's not that much code to duplicate: just a dozen lines or so. Well-known types can be passed through String::from_utf8 (matching and using unreachable!("…") on the result), as the formats are known to be UTF-8 in their entirety.

This made me realize that there was actually a path forward within the current implementation, I pushed it in the latest commits alongside a test to verify that it works (and that it indeed did fail before the changes)

My thought right now (having largely not reviewed the approach you actually took) is to create a private trait that any type implementing io::Write or fmt::Write implements. All internal APIs can then be made generic over that trait. Right off the top of my head I think this might run into a currently unsolvable problem, though: a type can implement both io::Write and fmt::Write and specialization is not stable.

I am not entirely sure how would this look, could you elaborate on it? I agree that this sounds like something that would quickly hit specialization though

@jhpratt
Copy link
Member

jhpratt commented Jan 3, 2022

I am not entirely sure how would this look, could you elaborate on it?

Basically it would require all internal methods like format_number to be generic over a trait that shims the two writers into a desired API. I'm quite certain it's not currently possibly, though.

I'll take a look at the code soon™

@jhpratt
Copy link
Member

jhpratt commented Mar 10, 2022

Finally coming back around to this. After some more thought, I don't think that this is the best way forward. Duplicating methods has some overhead from the user's perspective, as you need to know which to call — and they're split for no apparent reason to someone new to or unfamiliar with Rust. In addition, the compatibility is ultimately a hack. While possible to avoid, doing so would lead to significant duplication as noted earlier in this thread.

For these reasons (along with minor nits), I'm going to close this. I sincerely thank you for your work on this; it's not trivial to do. Your work is appreciated even though this PR is not ultimately being merged. Know that this code may still be used, in whole or in part, in the future.

Looking forward, I think specialization will lead to the ideal solution here. Being able to accept either io::Write or fmt::Write is clearly preferable, and having the ability to prefer io::Write when a parameter implements both would be great. This would be wholly transparent to the user, and would avoid some of the concerns introduced here. To my knowledge min_specialization is sound and may be stabilized later this year. Full specialization is not needed here.

@jhpratt jhpratt closed this Mar 10, 2022
@jhpratt jhpratt removed the C-keep-open Category: should not be closed due to inactivity label Mar 10, 2022
@Mathspy Mathspy deleted the fmt-write branch March 10, 2022 11:33
@Mathspy
Copy link
Author

Mathspy commented Mar 10, 2022

Sounds good! And thank you for your time ❤️ Looking forward to specialization stabilization as well

@Mathspy Mathspy restored the fmt-write branch November 8, 2022 00:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-formatting Area: formatting C-feature-request Category: a new feature (not already implemented)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

format_into() requires io::Write; what about fmt::Write in Debug/Display impls?
2 participants