Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pairs#move_cursor is slow down iter #784

Closed
huacnlee opened this issue Jan 31, 2023 · 5 comments · Fixed by #785
Closed

Pairs#move_cursor is slow down iter #784

huacnlee opened this issue Jan 31, 2023 · 5 comments · Fixed by #785
Labels

Comments

@huacnlee
Copy link
Member

huacnlee commented Jan 31, 2023

Hi @tomtau,

I am very sorry about that. 😥

I have found Pairs#iter performance issue when we merged #754.

Because when we iter pairs, each iter next will called Position#line_col once to move cursor. That will slow down the iter .

Especially that, in sometimes we just need get a little pair's line,col, not all of pairs. So to do this is not value of money.

By my test:

fn bench_pairs_iter(c: &mut Criterion) {
    let data = include_str!("data.json");

    fn iter_all_pairs(pairs: pest::iterators::Pairs<autocorrect::Rule>) {
        for pair in pairs {
            iter_all_pairs(pair.into_inner());
        }
    }

    c.bench_function("nested iter", |b| {
        let pairs = autocorrect::JsonParser::parse(autocorrect::Rule::item, &data).unwrap();

        b.iter(move || iter_all_pairs(pairs.clone()));
    });

    c.bench_function("flatten iter", |b| {
        let pairs = autocorrect::JsonParser::parse(autocorrect::Rule::item, &data).unwrap();

        b.iter(move || {
            for _pair in pairs.clone().flatten().into_iter() {
                // do nothing
            }
        });
    });
}
nested iter                             time:   [258.27 µs 260.05 µs 262.64 µs]
nested iter (fast-line-col)             time:   [14.943 µs 14.963 µs 14.993 µs]
flatten iter                            time:   [2.0367 µs 2.1104 µs 2.2144 µs]

In a large files (or have a lot of pairs, For instance: enabled WHITESPACE, each spaces will a pair item), iterate all pairs may take a long time than before.

So we need decide to how to do, remove it or find some solution to fix that.

@tomtau
Copy link
Contributor

tomtau commented Jan 31, 2023

How about feature-guarding it? It'll be disabled by default -- if someone needs it, they can enable that feature?

@huacnlee
Copy link
Member Author

huacnlee commented Feb 1, 2023

Or change API like peekable or flatten, to make a special iterator?

pairs.locatable().next()

This will make a LocatablePairs, then prepare line, col only work in it.

@tomtau
Copy link
Contributor

tomtau commented Feb 1, 2023

@huacnlee yes, that's perhaps simpler in the sense that the combination of possible feature flags doesn't increase more. Would you like to open a PR with this fix?

@huacnlee
Copy link
Member Author

huacnlee commented Feb 1, 2023

I have trying to find other way to solve it.

huacnlee added a commit to huacnlee/pest that referenced this issue Feb 1, 2023
…e performance issues.

Resolve pest-parser#784
Ref: https://github.com/rust-lang/rust/blob/1.67.0/src/tools/rust-analyzer/crates/ide-db/src/line_index.rs

Benchmark result:

```
pair.line_col           time:   [11.032 µs 11.653 µs 12.461 µs]
position.line_col       time:   [219.32 µs 224.17 µs 229.99 µs]
pairs nested iter             time:   [2.0168 ms 2.0381 ms 2.0725 ms]
pairs flatten iter            time:   [4.5973 µs 4.6132 µs 4.6307 µs]
```
@huacnlee
Copy link
Member Author

huacnlee commented Feb 1, 2023

@tomtau To see my new PR, I find a new way to solve it.

#785

huacnlee added a commit to huacnlee/pest that referenced this issue Feb 1, 2023
…ate and `Pair::line_col` performance.

Resolve pest-parser#784
Ref: https://github.com/rust-lang/rust/blob/1.67.0/src/tools/rust-analyzer/crates/ide-db/src/line_index.rs

Benchmark result:

```
pair.line_col                 time:   [11.032 µs 11.653 µs 12.461 µs]
position.line_col             time:   [219.32 µs 224.17 µs 229.99 µs]
pairs nested iter             time:   [2.0168 ms 2.0381 ms 2.0725 ms]
pairs flatten iter            time:   [4.5973 µs 4.6132 µs 4.6307 µs]
```
huacnlee added a commit to huacnlee/pest that referenced this issue Feb 1, 2023
…ate and `Pair::line_col` performance.

Resolve pest-parser#784
Ref: https://github.com/rust-lang/rust/blob/1.67.0/src/tools/rust-analyzer/crates/ide-db/src/line_index.rs

Benchmark result:

```
pair.line_col                 time:   [11.032 µs 11.653 µs 12.461 µs]
position.line_col             time:   [219.32 µs 224.17 µs 229.99 µs]
pairs nested iter             time:   [2.0168 ms 2.0381 ms 2.0725 ms]
pairs flatten iter            time:   [4.5973 µs 4.6132 µs 4.6307 µs]
```
huacnlee added a commit to huacnlee/pest that referenced this issue Feb 1, 2023
…ate and `Pair::line_col` performance.

Resolve pest-parser#784
Ref: https://github.com/rust-lang/rust/blob/1.67.0/src/tools/rust-analyzer/crates/ide-db/src/line_index.rs

Benchmark result:

```
pair.line_col                 time:   [11.032 µs 11.653 µs 12.461 µs]
position.line_col             time:   [219.32 µs 224.17 µs 229.99 µs]
pairs nested iter             time:   [2.0168 ms 2.0381 ms 2.0725 ms]
pairs flatten iter            time:   [4.5973 µs 4.6132 µs 4.6307 µs]
```
tomtau pushed a commit that referenced this issue Feb 5, 2023
…ext` and `Pair::line_col` performance. (#785)

* Add benchmark test for nested and flatten iterate all pairs.

For issue #784

* Add `LineIndex` instead of `Pairs::move_cursor` to improve Pairs iterate and `Pair::line_col` performance.

Resolve #784
Ref: https://github.com/rust-lang/rust/blob/1.67.0/src/tools/rust-analyzer/crates/ide-db/src/line_index.rs

Benchmark result:

```
pair.line_col                 time:   [11.032 µs 11.653 µs 12.461 µs]
position.line_col             time:   [219.32 µs 224.17 µs 229.99 µs]
pairs nested iter             time:   [2.0168 ms 2.0381 ms 2.0725 ms]
pairs flatten iter            time:   [4.5973 µs 4.6132 µs 4.6307 µs]
```

* Fix `Pair#into_inner` to ref `line_index`.

* Fix LineIndex for support utf8 char. And update benchmark result.

* Add benchmark result for 10K times iter for line_col.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants