Treat emoji presentation sequences as fullwidth #35
In terms of UTS 51 conformance, with this PR, this crate will give the correct widths for:
However, it may overestimate (though never underestimate) the rendered widths of:
3cc64ff to 5525e7d
I've replaced the binary search with a better data structure, and also added a section in the rustdoc documenting the full width rules.
7ca0bd6 to 75be2e9
a3d39f4 to 4aa5fb8
The not-yet-released Unicode 16 adds 8 new non-emoji standardized variation sequences that affect width: https://unicode.org/alloc/Pipeline.html#variation_sequences, https://www.unicode.org/L2/L2023/23212r-quotes-svs-proposal.pdf. In time, we'll need to support those as well.
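For readers skimming the thread, here is a rough sketch of how a variation selector can change the width of the character before it, using the emoji presentation case from the PR title (a base character followed by U+FE0F). It is not the crate's code: `can_start_emoji_presentation_seq`, `base_width`, and `str_width` are hypothetical names, the list of base characters is a tiny illustrative subset, and every other character is assumed to be one column wide.

```rust
/// Hypothetical, deliberately incomplete subset of the characters that can
/// start an emoji presentation sequence (assumption: a real table would be
/// generated from the Unicode data files).
fn can_start_emoji_presentation_seq(c: char) -> bool {
    matches!(
        c,
        '#' | '*' | '0'..='9' | '\u{A9}' | '\u{AE}' | '\u{263A}' | '\u{2764}'
    )
}

/// Placeholder per-character width, for illustration only: the variation
/// selector takes no columns by itself, everything else counts as one.
fn base_width(c: char) -> usize {
    match c {
        '\u{FE0F}' => 0,
        _ => 1,
    }
}

/// Sums widths over a string, counting `base + U+FE0F` as two columns.
fn str_width(s: &str) -> usize {
    let mut total = 0;
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if can_start_emoji_presentation_seq(c) && chars.peek() == Some(&'\u{FE0F}') {
            chars.next(); // consume the variation selector
            total += 2; // the whole sequence is treated as fullwidth
        } else {
            total += base_width(c);
        }
    }
    total
}
```

Under these assumptions, `str_width("\u{2764}")` is 1 and `str_width("\u{2764}\u{FE0F}")` is 2. A selector after a character outside the subset adds nothing, so a sequence-aware lookup like this could in principle be extended to other standardized variation sequences.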
4aa5fb8 to 9bb0575
Faster and smaller!
Ensure rows don't cross cache lines; this makes a small difference in the benchmarks
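One way to keep table rows from crossing cache lines in Rust is an alignment attribute on the row type. The sketch below assumes 64-byte cache lines and 64-byte rows; `Row`, `TABLE`, and `row_is_cache_aligned` are made-up names, and the actual row size and layout in this PR may differ.

```rust
/// One table row, forced to start on a 64-byte boundary
/// (assumption: the target's cache line is 64 bytes).
#[repr(align(64))]
#[derive(Clone, Copy)]
struct Row([u8; 64]);

/// Hypothetical table of four all-zero rows.
static TABLE: [Row; 4] = [Row([0; 64]); 4];

/// Returns true if the row at `index` starts on a 64-byte boundary.
fn row_is_cache_aligned(index: usize) -> bool {
    (&TABLE[index] as *const Row as usize) % 64 == 0
}
```

Because each row is exactly as large as its alignment, every row occupies one cache line, so a lookup within a row touches a single line.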
9bb0575 to 5e8bf9b
This PR is bloated; I'm splitting it up.
UAX11 says:
Lookup is done with a 2-level trie.
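A minimal sketch of what a 2-level trie lookup can look like, to make the idea concrete. It is not the layout used in this PR: the 256-entry block size, the `TwoLevelTrie` type, the default width of 1 for out-of-range code points, and the `toy_width` function are all assumptions for illustration.

```rust
const BLOCK_SIZE: usize = 256;

/// Toy width function standing in for real Unicode data (assumption).
fn toy_width(cp: u32) -> u8 {
    match cp {
        0x1100..=0x115F | 0x4E00..=0x9FFF => 2,
        _ => 1,
    }
}

struct TwoLevelTrie {
    /// Level 1: indexed by the high bits of the code point, gives a leaf number.
    root: Vec<u16>,
    /// Level 2: deduplicated blocks of per-code-point widths.
    leaves: Vec<[u8; BLOCK_SIZE]>,
}

impl TwoLevelTrie {
    /// Builds the trie for code points `0..=max_cp`, sharing identical leaves.
    fn build(width_of: impl Fn(u32) -> u8, max_cp: u32) -> Self {
        let mut root = Vec::new();
        let mut leaves: Vec<[u8; BLOCK_SIZE]> = Vec::new();
        for block in 0..=(max_cp >> 8) {
            let mut leaf = [0u8; BLOCK_SIZE];
            for (lo, slot) in leaf.iter_mut().enumerate() {
                *slot = width_of((block << 8) | lo as u32);
            }
            // Reuse an existing identical leaf if there is one.
            let idx = match leaves.iter().position(|l| *l == leaf) {
                Some(i) => i,
                None => {
                    leaves.push(leaf);
                    leaves.len() - 1
                }
            };
            root.push(idx as u16);
        }
        Self { root, leaves }
    }

    /// Two array reads: high bits pick the leaf, low bits pick the entry.
    fn lookup(&self, cp: u32) -> u8 {
        let hi = (cp >> 8) as usize;
        let lo = (cp & 0xFF) as usize;
        match self.root.get(hi) {
            Some(&leaf) => self.leaves[leaf as usize][lo],
            None => 1, // outside the table: assumed default width
        }
    }
}
```

With `TwoLevelTrie::build(toy_width, 0xFFFF)`, the 256 root entries share only 3 distinct leaves, which is where the size saving over a flat table comes from. Each lookup is two indexed reads with no branching on the data, unlike a binary search over ranges.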