Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for PHF in EnumString #220

Merged
merged 8 commits into from Apr 30, 2022
Merged

Conversation

Ten0
Copy link
Contributor

@Ten0 Ten0 commented Apr 8, 2022

Resolves #218

Via an additional use_phf attribute at the enum metadata level.

@Ten0
Copy link
Contributor Author

Ten0 commented Apr 8, 2022

I'm using this fork on an enum with ~250 variants (all the same length) and observe extremely significant performance improvements.

@Ten0
Copy link
Contributor Author

Ten0 commented Apr 8, 2022

Tests fail due to #219 (unrelated to this PR)

@Peternator7
Copy link
Owner

Hello @Ten0, thanks for the PR. I think the general idea makes perfect sense. It should be a nice performance win for some cases. Let me clarify a few things so I understand all the ramifications:

  • PHF won't work for case-insensitive comparisons, correct? That's fine, I just want to make sure it's documented correctly.
  • We probably need PHF as an optional dependency + feature in the cargo.toml
  • The MSRV for PHF is higher than strum's so let's make sure it's documented that it's only available in new-ish versions.

@Peternator7
Copy link
Owner

And I'll work on getting the doc tests fixed.

@Ten0
Copy link
Contributor Author

Ten0 commented Apr 11, 2022

We probably need PHF as an optional dependency + feature in the cargo.toml

I opted for not having that and letting the user bring the version of PHF they would want in scope, but that would also be an option. Although less flexible, that would probably be more stable. If doing this, that raises another question: if the phf feature is enabled, do we want to always enable the phf feature in the derive?

PHF won't work for case-insensitive comparisons, correct? That's fine, I just want to make sure it's documented correctly.

That is correct, I have neglected this part. It actually probably wouldn't be too hard to implement: we could lowercase both the arms and the string as it comes in I guess.
In that case though, since that would require allocating, the performance may be lower depending on the number of variants in the enum -> probably we want to keep the attribute flag in that case.

The MSRV for PHF is higher than strum's so let's make sure it's documented that it's only available in new-ish versions.

That's part of the reason why I didn't include it as a feature, but yes if we do we should definitely do that :)

I'll move forward on this feature+ opt dep today or tomorrow.

@Ten0
Copy link
Contributor Author

Ten0 commented Apr 11, 2022

Comments applied :)

@Peternator7
Copy link
Owner

Thanks for the updates!

I triggered a new build. I think there are a few things failing in the 1.32 build.

  • We'll need to disable the phf feature in the test matrix. AppVeyor. If you confirm things work locally, I'm happy to figure this out since it's not very interesting.
  • I don't think nested "or" patterns were added until recently-ish which affects the phf + case-insensitive code. :(

I see that adding code to make phf with compatible complicated quite a few code paths. I wonder if it's best to in the initial version of this feature say phf is not compatible with case insensitive comparisons.
It's clearly workable for this case; however, I get periodic requests from people who want other ways of checking equality, and the compatibility matrix could get complicated quickly. What are your thoughts on that?

@Peternator7
Copy link
Owner

Follow up thought on the phf + equality testing. I think the right compromise between performance and flexibility may be to do something like this:

pub fn from_str(s: &str) -> MyEnum {
    static PHF = map ! { .. };

    if let Some(value) = PHF.get(s) {
        return value;
    }

    match s {
         s if s.ascii_eq_caseinsensitive("my_variant") => MyEnum::MyVariant,
         _ => Err(NotFound),
     }
}

@Ten0
Copy link
Contributor Author

Ten0 commented Apr 17, 2022

I think the right compromise between performance and flexibility may be to do something like this

That is OK for my use-case at the moment as I don't need case insensitive parsing. In that case perhaps we should just use the feature flag to determine whether to use phf? This would make the interface significantly simpler.

@Peternator7
Copy link
Owner

That is OK for my use-case at the moment as I don't need case insensitive parsing. In that case perhaps we should just use the feature flag to determine whether to use phf? This would make the interface significantly simpler.

I think that makes sense. We'll just enable it automatically for anyone who enables the feature flag. In the future, there's a reasonable path for case-insensitive parsing where we add both the lowercase and uppercase versions of the text to the map as an optimization and then more complicated cases still fallback to the match statement.

@Ten0
Copy link
Contributor Author

Ten0 commented Apr 21, 2022

In that case perhaps we should just use the feature flag to determine whether to use phf?

Turns out the requirement on Clone makes that bothersome: because of the way features work in Rust, enabling it in one project may break another if they derived EnumString but not Clone.

So I think this is generally ready.

We'll need to disable the phf feature in the test matrix. AppVeyor. If you confirm things work locally, I'm happy to figure this out since it's not very interesting.

That would be the only thing left to do I think - I've prepared the feature in the tests crate, so it looks like we only need to do the (almost) equivalent of https://github.com/diesel-rs/diesel/blob/50c2c85032857a65e4a8d68dc851fa7ab91d65b2/.github/workflows/ci.yml#L234 here.

You should have push access, as I have checked the "Allow edits by maintainers" box. :)

@Peternator7 Peternator7 merged commit 832dd86 into Peternator7:master Apr 30, 2022
@Peternator7
Copy link
Owner

Thanks for the last round of changes. I went ahead and did the merge, and I'll just open a second PR to fix the build. I'm also going to benchmark the lowercase logic a little bit and potentially tweak/update it. Thanks for your help on this feature!!

@Peternator7
Copy link
Owner

Published to crates.io as version 0.24.1

@Peternator7
Copy link
Owner

@Ten0, I had to temporarily yank 0.24.1 from crates :(. It was released along with #217 which caused more breaking changes than I was happy with. I'm sorry for the inconvenience, and I'm working on getting it re-released, but I wanted to get the incompatible version down before additional people took a dependency on it. thank you for understanding.

@Ten0
Copy link
Contributor Author

Ten0 commented Jun 16, 2022

@Ten0, I had to temporarily yank 0.24.1 from crates :(. It was released along with #217 which caused more breaking changes than I was happy with. I'm sorry for the inconvenience, and I'm working on getting it re-released, but I wanted to get the incompatible version down before additional people took a dependency on it. thank you for understanding.

Thanks for the warning, we reverted to the fork on our system

@Peternator7
Copy link
Owner

Okay 0.24.2 is published which contains phf support and reverted the other change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EnumString would be faster by using phf
2 participants