OverlapWindowPlugin causes strax.chunk.CannotSplit #490

Open
JoranAngevaare opened this issue Jul 17, 2021 · 1 comment · Fixed by XENONnT/straxen#1209
Labels
bug Something isn't working

Comments

JoranAngevaare commented Jul 17, 2021

Describe the bug
An OverlapWindowPlugin produces chunks that cannot be split, which raises strax.chunk.CannotSplit. See e.g. run '020318'.

To Reproduce
straxer     020318   --target event_info_double     --context xenonnt_v3     --package cutax     --notlazy     --timeout 1200     --context_kwargs '{"use_rucio": 1}'

Expected behavior
Since the OverlapWindowPlugin keeps track of what data it has already seen, it should never split results in a way that makes the next split fail. This could presumably only happen if the overlap is too small and the plugin creates different splits because more data is available in the next iteration.
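
To make the suspected failure mode concrete, here is a minimal toy sketch. It is not strax's implementation: `CannotSplit` below is a local stand-in, and `split_at`, the interval values, and `sent_until = 70` are all made up for illustration. It assumes a split must fail whenever a result straddles the split time.

```python
class CannotSplit(Exception):
    """Toy stand-in for strax.chunk.CannotSplit (not the real class)."""


def split_at(intervals, t):
    """Split (start, end) results at time t.

    Raises CannotSplit if any result starts before t and ends after it.
    """
    left, right = [], []
    for start, end in intervals:
        if end <= t:
            left.append((start, end))
        elif start >= t:
            right.append((start, end))
        else:
            raise CannotSplit(f"({start}, {end}) straddles t={t}")
    return left, right


# Hypothetical numbers: results up to sent_until were already sent.
sent_until = 70

# First iteration: the results happen to leave a clean break at t=70.
first_pass = [(40, 60), (80, 90)]
sent, kept = split_at(first_pass, sent_until)

# Next iteration: with more input visible, the same region is merged
# differently and one result now straddles the old split point.
second_pass = [(40, 60), (65, 95)]
try:
    split_at(second_pass, sent_until)
    second_pass_failed = False
except CannotSplit as e:
    second_pass_failed = True
    print("second pass:", e)
```

If the overlap were large enough, the recomputed results around `sent_until` would be identical in both passes and the second split would succeed.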

Additionally, we should wrap the split in a try/except block that raises a more informative error, since the current traceback is uninformative.
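
A rough sketch of what such a wrapper could look like, again on a toy model rather than strax itself. `split_with_context`, `split_at`, and the `plugin_name` argument are hypothetical names introduced here; the only assumption is that the low-level split raises a bare `CannotSplit()` as in the traceback below.

```python
class CannotSplit(Exception):
    """Toy stand-in for strax.chunk.CannotSplit (not the real class)."""


def split_at(intervals, t):
    """Toy low-level split that raises a bare CannotSplit, without a message."""
    if any(start < t < end for start, end in intervals):
        raise CannotSplit()
    left = [iv for iv in intervals if iv[1] <= t]
    right = [iv for iv in intervals if iv[0] >= t]
    return left, right


def split_with_context(intervals, t, plugin_name):
    """Call split_at and re-raise CannotSplit with details on what failed."""
    try:
        return split_at(intervals, t)
    except CannotSplit as e:
        straddlers = [iv for iv in intervals if iv[0] < t < iv[1]]
        raise CannotSplit(
            f"{plugin_name}: cannot split at t={t}; "
            f"{len(straddlers)} item(s) straddle it, first is {straddlers[0]}"
        ) from e


try:
    split_with_context([(40, 60), (65, 95)], t=70, plugin_name="merged_s2s")
except CannotSplit as e:
    message = str(e)
    print(message)
```

Using `raise ... from e` keeps the original exception attached as `__cause__`, so the low-level frames are still visible in the traceback.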

Versions

Working on dali023.rcc.local with the following versions and installation paths:
python	v3.8.0	(default, Nov  6 2019, 21:49:08) [GCC 7.3.0]
strax	v0.16.1	/home/angevaare/software/dev_strax/strax/strax
straxen	v0.19.3	/home/angevaare/software/dev_strax/straxen/straxen
cutax	v0.1.1	/home/angevaare/software/dev_strax/cutax/cutax

Traceback

Exception in thread build:merged_s2s:
Traceback (most recent call last):
  File "/home/angevaare/software/Miniconda3/envs/strax_dev/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/angevaare/software/Miniconda3/envs/strax_dev/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/angevaare/software/dev_strax/strax/strax/mailbox.py", line 268, in _send_from
    self.kill_from_exception(e)
  File "/home/angevaare/software/dev_strax/strax/strax/mailbox.py", line 187, in kill_from_exception
    raise e
  File "/home/angevaare/software/dev_strax/strax/strax/mailbox.py", line 255, in _send_from
    x = next(iterable)
  File "/home/angevaare/software/dev_strax/strax/strax/plugin.py", line 616, in iter
    yield from super().iter(iters, executor=executor)
  File "/home/angevaare/software/dev_strax/strax/strax/plugin.py", line 427, in iter
    yield self.do_compute(chunk_i=chunk_i, **inputs_merged)
  File "/home/angevaare/software/dev_strax/strax/strax/plugin.py", line 636, in do_compute
    _, result = result.split(t=self.sent_until,
  File "/home/angevaare/software/dev_strax/strax/strax/chunk.py", line 170, in split
    data1, data2, t = split_array(
  File "/home/angevaare/software/dev_strax/strax/strax/chunk.py", line 376, in split_array
    raise CannotSplit()
strax.chunk.CannotSplit
Got 830 items. Now 369.5 sec / 20.5% into the run. Using 32588.0 MB RAM. ETA 1344.09 sec.
Got 839 items. Now 401.7 sec / 22.3% into the run. Using 31875.5 MB RAM. ETA 1285.11 sec.
Got 784 items. Now 433.4 sec / 24.1% into the run. Using 31786.8 MB RAM. ETA 1212.04 sec.
2021-07-17 05:58:32,198 - MainThread - ThreadedMailboxProcessor - CRITICAL - Target Mailbox (event_info) killed, exception <class 'strax.mailbox.MailboxKilled'>, message (<class 'strax.chunk.CannotSplit'>, CannotSplit(), <traceback object at 0x7f4079c1e8c0>)
Traceback (most recent call last):
  File "/home/angevaare/software/Miniconda3/envs/strax_dev/bin/straxer", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/home/angevaare/software/dev_strax/straxen/bin/straxer", line 297, in <module>
    sys.exit(main(args))
  File "/home/angevaare/software/dev_strax/straxen/bin/straxer", line 236, in main
    for i, d in enumerate(get_results()):
  File "/home/angevaare/software/dev_strax/straxen/bin/straxer", line 233, in get_results
    yield from st.get_iter(**kwargs)
  File "/home/angevaare/software/dev_strax/strax/strax/context.py", line 1099, in get_iter
    generator.throw(e)
  File "/home/angevaare/software/dev_strax/strax/strax/context.py", line 1072, in get_iter
    for n_chunks, result in enumerate(strax.continuity_check(generator), 1):
  File "/home/angevaare/software/dev_strax/strax/strax/chunk.py", line 303, in continuity_check
    for chunk in chunk_iter:
  File "/home/angevaare/software/dev_strax/strax/strax/processor.py", line 286, in iter
    raise exc.with_traceback(traceback)
  File "/home/angevaare/software/dev_strax/strax/strax/mailbox.py", line 255, in _send_from
    x = next(iterable)
  File "/home/angevaare/software/dev_strax/strax/strax/plugin.py", line 616, in iter
    yield from super().iter(iters, executor=executor)
  File "/home/angevaare/software/dev_strax/strax/strax/plugin.py", line 427, in iter
    yield self.do_compute(chunk_i=chunk_i, **inputs_merged)
  File "/home/angevaare/software/dev_strax/strax/strax/plugin.py", line 636, in do_compute
    _, result = result.split(t=self.sent_until,
  File "/home/angevaare/software/dev_strax/strax/strax/chunk.py", line 170, in split
    data1, data2, t = split_array(
  File "/home/angevaare/software/dev_strax/strax/strax/chunk.py", line 376, in split_array
    raise CannotSplit()
strax.chunk.CannotSplit
Processing job ended
JoranAngevaare added the bug label Jul 17, 2021
JoranAngevaare changed the title from "OverlapWindowPlugin causes" to "OverlapWindowPlugin causes strax.chunk.CannotSplit" Jul 17, 2021
@JelleAalbers
Member

If you get this despite the generous fudge factor in the MergedS2s window size, MergedS2s would seem to have an unexpectedly large causal horizon, i.e. the time span over which data and decisions at one time can affect earlier and later results.

If so, I'm not sure you can fix the situation in strax. Even if you prevent the crash with some clever chunk juggling or #518, you are still in trouble: an OverlapWindowPlugin whose horizon exceeds its window produces results that depend on where the chunk breaks are, i.e. on whether you process online or from disk, and on how many original chunks were merged into each file.
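
As a toy illustration of that dependence (not the MergedS2s algorithm; `group_by_duration` and all numbers are invented for this sketch), consider a grouping rule whose output depends on where the visible data starts. Two passes over the same late-time data disagree because one of them could not see the first two points.

```python
def group_by_duration(times, max_duration):
    """Greedy left-to-right grouping of sorted times.

    A group keeps growing until adding the next point would make it
    longer than max_duration; then a new group starts at that point.
    """
    if not times:
        return []
    groups = []
    start = prev = times[0]
    for t in times[1:]:
        if t - start <= max_duration:
            prev = t
        else:
            groups.append((start, prev))
            start = prev = t
    groups.append((start, prev))
    return groups


times = list(range(10))

# Processing everything at once.
full = group_by_duration(times, max_duration=3)

# Processing with a window that only reaches back to t=2, as could
# happen after a chunk break with a too-small overlap.
windowed = group_by_duration(times[2:], max_duration=3)

print(full)      # [(0, 3), (4, 7), (8, 9)]
print(windowed)  # [(2, 5), (6, 9)]
```

The groups covering t >= 6 differ between the two passes, even though both saw identical data there. The rule's causal horizon extends all the way back to the first point, so no finite window makes the output independent of chunk breaks.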

I'm obviously far out of the loop, but could this relate to XENONnT/straxen#548? The new merging algorithm seems to consider all gaps in a chunk at once, working from smaller gaps to larger ones. If so, the decision to merge two peaklets could depend on data anywhere in the chunk, in the extreme case where there are no gaps > max_duration between any two peaklets.
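
To show how a smallest-gap-first strategy could propagate decisions along a chain, here is a hedged toy model. It is not the algorithm from XENONnT/straxen#548; `merge_smallest_gap_first` and the peaklet intervals are assumptions made up for this sketch, with a merge allowed only if the merged interval stays within `max_duration`.

```python
def merge_smallest_gap_first(intervals, max_duration):
    """Repeatedly merge the neighbours with the smallest gap.

    A merge is only allowed if the merged (start, end) interval is not
    longer than max_duration. Stops when no allowed merge is left.
    """
    merged = list(intervals)
    while True:
        candidates = [
            (merged[i + 1][0] - merged[i][1], i)
            for i in range(len(merged) - 1)
            if merged[i + 1][1] - merged[i][0] <= max_duration
        ]
        if not candidates:
            return merged
        _, i = min(candidates)  # smallest gap first
        merged[i:i + 2] = [(merged[i][0], merged[i + 1][1])]


# Four toy peaklets of length 10 with gaps 4, 3, 2.
peaklets = [(0, 10), (14, 24), (27, 37), (39, 49)]
without_extra = merge_smallest_gap_first(peaklets, max_duration=25)

# Same peaklets plus one more at the far end, after a gap of 1.
with_extra = merge_smallest_gap_first(peaklets + [(50, 60)], max_duration=25)

print(without_extra)  # [(0, 24), (27, 49)]
print(with_extra)     # [(0, 10), (14, 37), (39, 60)]
```

In this toy, adding a peaklet at t=50 changes whether the peaklet at t=0 gets merged, because each merge changes which neighbouring merges still fit within `max_duration`.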
