Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: speed up Position::line_col for large inputs using SIMD #707

Merged
merged 1 commit into from Sep 12, 2022

Conversation

tomtau
Copy link
Contributor

@tomtau tomtau commented Sep 10, 2022

note that this may have extra overhead for small inputs and requires two extra dependencies, hence the faster line_col was put under the optional fast-line-col feature flag.

closes #560

note that this may have extra overhead for small inputs and
requires two extra dependencies, hence the faster `line_col`
was put under the optional `fast-line-col` feature flag.

closes pest-parser#560
@tomtau tomtau requested review from CAD97 and a team September 10, 2022 08:14
@tomtau tomtau requested a review from a team as a code owner September 10, 2022 08:14
@tomtau
Copy link
Contributor Author

tomtau commented Sep 10, 2022

@yodalee @huacnlee @matthew-dean @Kilerd I tried to implement the suggestion by @CAD97 : #560 (comment) -- on aarch64, line_col is approximately 50x faster on that large sample input:

default:

line col                time:   [942.44 µs 944.90 µs 947.59 µs]
                        change: [+3537.9% +3547.9% +3558.2%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild                     

with fast-col-line:

line col                time:   [25.940 µs 26.077 µs 26.251 µs]                      
                        change: [-97.266% -97.255% -97.243%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe

Feel free to open a PR to incrementally track line:col in cursors (e.g. Pair) though if this is not enough.

Copy link
Contributor Author

@tomtau tomtau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will merge this for the point release

@tomtau tomtau merged commit bfbdc4b into pest-parser:master Sep 12, 2022
huacnlee added a commit to huacnlee/pest that referenced this pull request Dec 27, 2022
tomtau pushed a commit that referenced this pull request Dec 29, 2022
…terator. (#754)

* Improve line, col calculate performance by use move cursor on Pairs Iterator.

ref: #707, #560

* Add benchmark for pair.line_col vs position.line_cole

* Fix flat_pairs and pairs.next_back to use position.line_col

* Merge line_col method to use `position::line_col`.

* Fix `pair.line_col` for supports skiped characters, and add test for rev iter.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

line_col method N times slow in large file.
1 participant