Use lookup tables instead of linear search with memchr #625

Closed
lopopolo wants to merge 2 commits

lopopolo
Contributor

Several places use `memchr` with a byteset. This commit refactors these
code paths to construct a lookup table from `u8` -> `bool`, where an
index is set to `true` if the byte is present in the given slice.

This change removes linear scans that occur in loops, which changes
these functions' runtime complexity from `O(m * n)` to `O(m + n)`.
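
A minimal sketch of the idea, not the code from this PR: the function names `build_lookup_table` and `find_first_in_set` are hypothetical, and the reading of `m` as the byteset length and `n` as the input length is an assumption. Building the table costs one pass over the byteset, and each byte of the input is then checked with a single array index instead of a scan over the byteset.

```rust
/// Build a 256-entry membership table: `table[b]` is true when byte `b`
/// appears in `byteset`. Construction is one pass over the byteset.
/// (Hypothetical helper, named for illustration only.)
fn build_lookup_table(byteset: &[u8]) -> [bool; 256] {
    let mut table = [false; 256];
    for &b in byteset {
        table[usize::from(b)] = true;
    }
    table
}

/// Return the index of the first byte in `haystack` that is a member of
/// `byteset`. Each membership check is a single array index, so the scan
/// over `haystack` does constant work per byte.
fn find_first_in_set(haystack: &[u8], byteset: &[u8]) -> Option<usize> {
    let table = build_lookup_table(byteset);
    haystack.iter().position(|&b| table[usize::from(b)])
}
```

For example, `find_first_in_set(b"foo bar", b" \t")` returns `Some(3)`, the position of the space.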

@gwenn
Collaborator

gwenn commented Apr 23, 2022

I guess `break_chars_byteset` can be evaluated once / statically.
Something like https://github.com/sqlite/sqlite/blob/master/src/tokenize.c#L61-L80 but for `DEFAULT_BREAK_CHARS` and `DOUBLE_QUOTES_SPECIAL_CHARS`.
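
One way the static evaluation suggested here could look in Rust is a `const fn` that fills the table at compile time. This is only a sketch under assumptions: `byteset_table`, `EXAMPLE_BREAK_CHARS`, `EXAMPLE_BREAK_TABLE`, and `is_example_break_char` are hypothetical names, and the byte set below is made up for illustration rather than copied from the crate's `DEFAULT_BREAK_CHARS`.

```rust
/// Build a membership table in a const context. `for` loops are not
/// allowed in `const fn`, so a `while` loop with an index is used.
/// (Hypothetical helper, named for illustration only.)
const fn byteset_table(bytes: &[u8]) -> [bool; 256] {
    let mut table = [false; 256];
    let mut i = 0;
    while i < bytes.len() {
        table[bytes[i] as usize] = true;
        i += 1;
    }
    table
}

/// Illustrative byte set: space, tab, newline, double quote, single quote.
/// This is an assumption, not the crate's actual constant.
const EXAMPLE_BREAK_CHARS: &[u8] = b" \t\n\"'";

/// The table is computed once by the compiler and stored in the binary.
static EXAMPLE_BREAK_TABLE: [bool; 256] = byteset_table(EXAMPLE_BREAK_CHARS);

/// Membership check is a single array index at runtime.
fn is_example_break_char(b: u8) -> bool {
    EXAMPLE_BREAK_TABLE[b as usize]
}
```

A table like this only helps when the byteset is known at compile time; functions that accept an arbitrary slice would still need to build their table per call.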

@lopopolo
Contributor Author

lopopolo commented Apr 23, 2022

@gwenn I think so too but I wasn't sure if that change would be accepted since these functions are public APIs that take the byteset as a slice.

I'm not sure how you'd like to make the API breaks.

Do you want to merge this as is or maybe push a PR to my fork?

@gwenn
Collaborator

gwenn commented Jan 29, 2023

See #676

@lopopolo lopopolo closed this Jan 29, 2023
@lopopolo lopopolo deleted the lopopolo/memchr-to-lookup-table branch January 29, 2023 18:47
@lopopolo
Contributor Author

Thanks @gwenn!
