utf8 case folding and/or comparisons? #135

Open
liquidaty opened this issue Jun 30, 2022 · 4 comments

Comments

@liquidaty

Hi,

Thank you for your work on this. Would it be possible to use this (or any other SIMD-based) approach for fast case folding and/or case-insensitive comparison?

@clausecker
Collaborator

Case folding and case-insensitive comparison are complex and locale-dependent. They will be very tricky to vectorise, even if normalised input can be assumed.
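
To make the difficulty concrete, here is a minimal sketch using only Python's built-in, locale-independent Unicode mappings. It is an illustration, not anything from this library: the helper names `utf8_len` and `equal_ignoring_case` are made up for the example. It shows that a mapping can produce more than one code point, that the UTF-8 length can shrink or grow, and that the default result for `I` differs from what Turkish tailoring expects.

```python
# Illustration only: Python's built-in case mappings are locale-independent
# and follow the Unicode default rules.

def utf8_len(s: str) -> int:
    """Number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))

# 1. One code point can fold to two: U+00DF (sharp s) folds to "ss".
assert "\u00df".casefold() == "ss"

# 2. The encoded length can shrink: U+212A KELVIN SIGN (3 bytes) -> "k" (1 byte).
kelvin = "\u212a"
assert kelvin.lower() == "k"
assert (utf8_len(kelvin), utf8_len(kelvin.lower())) == (3, 1)

# 3. ... or grow: U+0130 (2 bytes) -> "i" + U+0307 combining dot above (3 bytes).
dotted_capital_i = "\u0130"
assert dotted_capital_i.lower() == "i\u0307"
assert (utf8_len(dotted_capital_i), utf8_len(dotted_capital_i.lower())) == (2, 3)

# 4. Locale dependence: the default mapping gives "I" -> "i", whereas Turkish
#    and Azerbaijani tailoring expects "I" -> U+0131 (dotless i).
assert "I".lower() == "i"


def equal_ignoring_case(a: str, b: str) -> bool:
    """Caseless match via default full case folding (no normalisation step)."""
    return a.casefold() == b.casefold()


assert equal_ignoring_case("STRASSE", "stra\u00dfe")
```

Because the output length is not fixed per input character, a vectorised routine cannot simply rewrite bytes in place, which is one reason the problem is harder than ASCII-only case conversion.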

@WojciechMula
Collaborator

At Sneller (https://github.com/SnellerInc/sneller) we have Unicode support, and even for upper/lower-case conversion we ended up with huge lookup tables. Here we might consider using gathers and measuring the outcome, although I'm quite sure performance will be bounded mainly by cache-miss penalties.
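
As a rough scalar model of the table-plus-gather idea (a sketch under stated assumptions, not the Sneller implementation): assume the input has already been decoded to code points, restrict the table to the Basic Multilingual Plane, and keep only mappings that yield a single BMP code point. The names `build_lower_table` and `gather_lower` are hypothetical. A real SIMD gather would issue the per-lane loads in one instruction, but each lane still touches its own table entry.

```python
from array import array

BMP_SIZE = 0x10000  # assumption: only BMP code points are handled


def build_lower_table() -> array:
    """table[cp] = lowercase of cp when that is a single BMP code point,
    otherwise cp itself (identity)."""
    table = array("H", range(BMP_SIZE))  # 16-bit entries, identity by default
    for cp in range(BMP_SIZE):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates have no case mapping
        low = chr(cp).lower()
        if len(low) == 1 and ord(low) < BMP_SIZE:
            table[cp] = ord(low)
    return table


def gather_lower(table: array, lanes: list) -> list:
    """Scalar model of a SIMD gather: one independent table load per lane."""
    return [table[cp] for cp in lanes]


LOWER_TABLE = build_lower_table()

# Eight "lanes" of code points drawn from several scripts.
lanes = [ord(c) for c in "ŻÓŁW ΩЖK"]
lowered = "".join(chr(cp) for cp in gather_lower(LOWER_TABLE, lanes))
assert lowered == "żółw ωжk"

# Multi-code-point results (such as U+0130) are left untouched by this table.
assert LOWER_TABLE[0x0130] == 0x0130
```

With 65,536 two-byte entries the table occupies 128 KiB, which is larger than a typical L1 data cache, so lanes that hit widely separated code points are likely to miss. That is consistent with the cache-miss concern raised above, though only measurement would confirm it.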

@lin72h

lin72h commented Apr 11, 2024

@WojciechMula I didn't know you were one of the Sneller developers. Very impressive project, BTW!

@WojciechMula
Collaborator

> @WojciechMula I didn't know you were one of the Sneller developers. Very impressive project, BTW!

Yeah, a lot of great stuff is there. :) Getting back to the main topic: there was an attempt to express the lookup table as a huge vectorized if-ladder, and it was very slow.
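
For readers unfamiliar with the term, a hedged sketch of what an if-ladder of this kind could look like, modelled in scalar Python. This is not the Sneller code: `RUNGS` and `ladder_lower` are hypothetical names, and the ladder covers only a handful of uppercase ranges. Each rung corresponds to a range compare producing a per-lane mask, followed by a masked add.

```python
# Hypothetical rungs: (first, last, delta) for a few contiguous uppercase ranges.
RUNGS = [
    (0x0041, 0x005A, 32),  # A-Z
    (0x00C0, 0x00D6, 32),  # Latin-1 capitals before the multiplication sign
    (0x00D8, 0x00DE, 32),  # Latin-1 capitals after the multiplication sign
    (0x0391, 0x03A1, 32),  # Greek Alpha-Rho
    (0x03A3, 0x03A9, 32),  # Greek Sigma-Omega (U+03A2 is unassigned)
    (0x0400, 0x040F, 80),  # Cyrillic extensions
    (0x0410, 0x042F, 32),  # Cyrillic A-Ya
]


def ladder_lower(lanes: list) -> list:
    """Model of a vectorised if-ladder: every rung is applied to every lane."""
    out = list(lanes)
    for first, last, delta in RUNGS:
        # "Compare" step: per-lane mask for this range.
        mask = [first <= cp <= last for cp in lanes]
        # "Blend" step: add delta only where the mask is set.
        out = [o + delta if m else o for o, m in zip(out, mask)]
    return out


lanes = [ord(c) for c in "AÖΣЖz×Ѐ!"]
assert "".join(chr(cp) for cp in ladder_lower(lanes)) == "aöσжz×ѐ!"

# Each rung agrees with Python's own mapping for single characters.
for first, last, delta in RUNGS:
    for cp in range(first, last + 1):
        assert chr(cp).lower() == chr(cp + delta)
```

In this model the work per vector grows with the number of rungs regardless of the data, and full Unicode coverage would need far more rungs than the seven shown here. That may be part of why the attempt was slow, but the thread does not say.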
