Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterators 'blocking' each other #61

Open
jeswr opened this issue Apr 6, 2022 · 2 comments
Open

Iterators 'blocking' each other #61

jeswr opened this issue Apr 6, 2022 · 2 comments

Comments

@jeswr
Copy link
Collaborator

jeswr commented Apr 6, 2022

I suspect that this may be caused by #35 so don't worry for now.


This is a potential issue that I saw when perf testing the current PRs, but also saw in main.

If I run (on the main branch)

import { ArrayIterator, range } from '../dist/asynciterator.js'

for (let i = 0; i < 5; i++) {
  const arr = new range(0, 2000000); // I've tested with both `range` and `ArrayIterator`
  
  const iterator = arr
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item);
  
  const now = performance.now();
  iterator.on('data', () => {}).on('end', () => {
    const end = performance.now();
    console.log('elapsed', end - now);
  });
}

then the result is

elapsed 14108.122984007001
elapsed 14112.1332180053
elapsed 14112.153782993555
elapsed 14112.245641991496
elapsed 14112.30153799057

(as opposed to ~3000ms for one)

I suspect the issue is that the end event is called asynchronously so has to wait for everything else to complete before we get back to it.

@jacoscaz
Copy link
Collaborator

jacoscaz commented Apr 6, 2022

I'm not entirely sure about my assumptions here but this should generate a significant amount of CPU contention between iterators as we're still single-threaded (-ish, I know there's some magic happening in V8) and there's no I/O anywhere.

On my machine, the time per item remains constant and the total time scales linearly with the overall number of items, regardless of the number of (quasi-)parallel chains. I get between 8.8 and 9.3 seconds for 10_000_000 items regardless of whether that's due to 1 chain going through 10_000_000 items or 5 chains going through 2_000_000 items each.

Curiously, distributing 10_000_000 items across 100 chains going through 100_000 items each results in roughly 1.25x faster performance, finishing in ~ 6 seconds. I don't think it's something to do with the end event.

@jeswr
Copy link
Collaborator Author

jeswr commented Jun 1, 2022

Just ran some tests that seem to support what @jacoscaz is saying here

Experiment

import { ArrayIterator, range, AsyncIterator } from 'asynciterator'

for (let i = 0; i < 5; i++) {
  const arr = range(0, 2000000); // I've tested with both `range` and `ArrayIterator`
  
  const iterator = arr
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item)
    .map((item) => item);
  
  console.time(`End event emitted for iterator ${i}`)
  console.time(`First data event emitted for iterator ${i}`)
  console.time(`Middle data event emitted for iterator ${i}`)
  console.time(`Last data event emitted for iterator ${i}`)
  iterator.on('data', (x) => {
    if (x === 0) {
      console.timeEnd(`First data event emitted for iterator ${i}`)
    }
    if (x === 1000000) {
      console.timeEnd(`Middle data event emitted for iterator ${i}`)
    }
    if (x === 2000000 - 1) {
      console.timeEnd(`Last data event emitted for iterator ${i}`)
    }
  }).on('end', () => {
    console.timeEnd(`End event emitted for iterator ${i}`)
  });
}

Results

First data event emitted for iterator 0: 8.585ms
First data event emitted for iterator 1: 9.589ms
First data event emitted for iterator 2: 9.729ms
First data event emitted for iterator 3: 9.942ms
First data event emitted for iterator 4: 10.167ms
Middle data event emitted for iterator 0: 7.531s
Middle data event emitted for iterator 1: 7.531s
Middle data event emitted for iterator 2: 7.530s
Middle data event emitted for iterator 3: 7.531s
Middle data event emitted for iterator 4: 7.531s
Last data event emitted for iterator 0: 14.690s
Last data event emitted for iterator 1: 14.690s
Last data event emitted for iterator 2: 14.690s
Last data event emitted for iterator 3: 14.690s
Last data event emitted for iterator 4: 14.690s
End event emitted for iterator 0: 14.691s
End event emitted for iterator 1: 14.691s
End event emitted for iterator 2: 14.691s
End event emitted for iterator 3: 14.691s
End event emitted for iterator 4: 14.691s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants