data is not made when keep_columns in get_df is used #750

Open
RoBGlaBe opened this issue Aug 14, 2023 · 0 comments
Labels
bug Something isn't working

Describe the bug
When I load data with the keep_columns argument of context.get_df and the data has not been made yet, I get an error (shown at the end of this post). Without the keep_columns argument, or if the data already exists, the data loads without an issue.
Also see this slack post.

To Reproduce
I use the following runs: ['052059', '052058', '052055', '052054', '052053']
The plugin I use is this one: cutax.DeadtimeTPCOnly

Running the following before the data of this plugin has been made raises the error shown at the end of this post.

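import cutax

# The runs listed above
some_runs = ['052059', '052058', '052055', '052054', '052053']

st = cutax.contexts.xenonnt_offline()
st.register(cutax.DeadtimeTPCOnly)

# Fails if "deadtime_tpc" has not been made yet for these runs
st.get_df(
    some_runs,
    "deadtime_tpc",
    keep_columns=['time', 'endtime', 'lifetime_loss'],
)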

However, when the data already exists (it was made and cached previously), or when I don't use the keep_columns argument, the data loads without an issue.
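
Since loading works once the data exists, a possible interim workaround is to make the data first without any column selection and only then load the reduced columns. The sketch below is untested against the real context. The helper name get_df_after_make is hypothetical, and it assumes that context.make accepts the same run list and target as get_df.

```python
def get_df_after_make(context, run_ids, target, keep_columns):
    """Make and cache the full data first, then load only selected columns.

    Hypothetical helper: `context` is assumed to be a strax Context
    (e.g. the `st` object above) providing `make` and `get_df`.
    """
    # Step 1: build and save the complete data, without column selection,
    # so the saver only ever sees chunks with the full dtype.
    context.make(run_ids, target)
    # Step 2: the data now exists, so loading with keep_columns should
    # follow the path that already works according to this report.
    return context.get_df(run_ids, target, keep_columns=keep_columns)
```

With the objects from the snippet above this would be called as get_df_after_make(st, some_runs, "deadtime_tpc", ['time', 'endtime', 'lifetime_loss']).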

Expected behavior
With the keep_columns argument, the data should be made and cached in the same way as without it. The loaded data should then contain only the columns specified in keep_columns. The second part already works if the data exists.

Versions

cutax.print_versions()

Host midway2-0112.rcc.local
 module version                                                                               path  git
 python  3.9.17                          /opt/XENONnT/anaconda/envs/XENONnT_development/bin/python None
  strax   1.5.2   /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax None
straxen   2.1.2 /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/straxen None
  cutax  1.15.3                                 /dali/lgrandi/xenonnt/software/cutax/v1.15.3/cutax None

Error

Target Mailbox (deadtime_tpc) killed, exception <class 'strax.mailbox.MailboxKilled'>, message (<class 'TypeError'>, TypeError('invalid type promotion with structured datatype(s).'), <traceback object at 0x7fb9167bfe00>)
Exception in thread build:deadtime_tpc:
Traceback (most recent call last):
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 917, in run
Exception in thread save_0:deadtime_tpc:
Traceback (most recent call last):
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/storage/common.py", line 664, in save_from
    source.throw(e)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 447, in _read
    self.kill_from_exception(e)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 213, in kill_from_exception
    raise e
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 444, in _read
    yield res
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/storage/common.py", line 633, in save_from
    chunk = strax.Chunk.concatenate([chunk, next_chunk])
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/chunk.py", line 292, in concatenate
    data=np.concatenate([c.data for c in chunks]),
  File "<__array_function__ internals>", line 180, in concatenate
TypeError: invalid type promotion with structured datatype(s).
    self._target(*self._args, **self._kwargs)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 297, in _send_from
    self.close()
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 361, in close
    self.send(StopIteration)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 313, in send
    raise MailboxKilled(self.killed_because)
strax.mailbox.MailboxKilled: (<class 'TypeError'>, TypeError('invalid type promotion with structured datatype(s).'), <traceback object at 0x7fb9167bfe00>)
Target Mailbox (deadtime_tpc) killed, exception <class 'strax.mailbox.MailboxKilled'>, message (<class 'TypeError'>, TypeError('invalid type promotion with structured datatype(s).'), <traceback object at 0x7fb945f15880>)
Exception in thread build:deadtime_tpc:
Traceback (most recent call last):
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 297, in _send_from
    self.close()
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 361, in close
    self.send(StopIteration)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 313, in send
    raise MailboxKilled(self.killed_because)
strax.mailbox.MailboxKilled: (<class 'TypeError'>, TypeError('invalid type promotion with structured datatype(s).'), <traceback object at 0x7fb945f15880>)
Exception in thread save_0:deadtime_tpc:
Traceback (most recent call last):
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/storage/common.py", line 664, in save_from
    source.throw(e)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 447, in _read
    self.kill_from_exception(e)
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 213, in kill_from_exception
    raise e
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py", line 444, in _read
    yield res
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/storage/common.py", line 633, in save_from
    chunk = strax.Chunk.concatenate([chunk, next_chunk])
  File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/chunk.py", line 292, in concatenate
    data=np.concatenate([c.data for c in chunks]),
  File "<__array_function__ internals>", line 180, in concatenate
TypeError: invalid type promotion with structured datatype(s).

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 cal_lt.lifetime(st, runs_available[:5])

File ~/analysis_xenonnt/lifetime_calculation_for_ACs/cal_lt.py:18, in lifetime(context, runs)
      6 """
      7 Calculates the corrected lifetime for a given set of runs.
      8 corrected lifetime = (run length) - (lifetime loss due to the busy and high energy veto deatimes).
   (...)
     14 returns: corrected lifetime [ns]
     15 """
     16 context.register(cutax.DeadtimeTPCOnly)
---> 18 data = context.get_df(
     19     runs,
     20     "deadtime_tpc",
     21     keep_columns='time endtime lifetime_loss'.split()
     22     )
     24 lifetime_loss = data['lifetime_loss'].sum()
     25 lifetime_uncorrected = (data['endtime'] - data['time']).sum()

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/context.py:1564, in Context.get_df(self, run_id, targets, save, max_workers, **kwargs)
   1556 def get_df(self, run_id: ty.Union[str, tuple, list],
   1557            targets, save=tuple(), max_workers=None,
   1558            **kwargs) -> pd.DataFrame:
   1559     """
   1560     Compute target for run_id and return as pandas DataFrame
   1561     
   1562     {get_docs}
   1563     """
-> 1564     df = self.get_array(
   1565         run_id, targets,
   1566         save=save, max_workers=max_workers, **kwargs)
   1567     try:
   1568         return pd.DataFrame.from_records(df)

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/context.py:1432, in Context.get_array(self, run_id, targets, save, max_workers, **kwargs)
   1429     raise RuntimeError('Cannot allow_multiple with get_array/get_df')
   1431 if len(run_ids) > 1:
-> 1432     results = strax.multi_run(
   1433         self.get_array, run_ids, targets=targets,
   1434         log=self.log,
   1435         save=save, max_workers=max_workers, **kwargs)
   1436 else:
   1437     source = self.get_iter(
   1438         run_ids[0],
   1439         targets,
   1440         save=save,
   1441         max_workers=max_workers,
   1442         **kwargs)

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/utils.py:541, in multi_run(exec_function, run_ids, max_workers, throw_away_result, multi_run_progress_bar, ignore_errors, log, *args, **kwargs)
    539         failures.append(_run_id)
    540         continue
--> 541     raise f.exception()
    543 if throw_away_result:
    544     continue

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/concurrent/futures/thread.py:58, in _WorkItem.run(self)
     55     return
     57 try:
---> 58     result = self.fn(*self.args, **self.kwargs)
     59 except BaseException as exc:
     60     self.future.set_exception(exc)

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/context.py:1443, in Context.get_array(self, run_id, targets, save, max_workers, **kwargs)
   1436 else:
   1437     source = self.get_iter(
   1438         run_ids[0],
   1439         targets,
   1440         save=save,
   1441         max_workers=max_workers,
   1442         **kwargs)
-> 1443     results = [x.data for x in source]
   1445 results = np.concatenate(results)
   1446 return results

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/context.py:1443, in <listcomp>(.0)
   1436 else:
   1437     source = self.get_iter(
   1438         run_ids[0],
   1439         targets,
   1440         save=save,
   1441         max_workers=max_workers,
   1442         **kwargs)
-> 1443     results = [x.data for x in source]
   1445 results = np.concatenate(results)
   1446 return results

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/context.py:1324, in Context.get_iter(self, run_id, targets, save, max_workers, time_range, seconds_range, time_within, time_selection, selection_str, keep_columns, drop_columns, allow_multiple, progress_bar, _chunk_number, **kwargs)
   1319     generator.throw(OutsideException(
   1320         "Terminating due to an exception originating from outside "
   1321         "strax's get_iter (which we cannot retrieve)."))
   1323 except Exception as e:
-> 1324     generator.throw(e)
   1325     raise ValueError(f'Failed to process chunk {n_chunks}!')
   1327 if not seen_a_chunk:

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/context.py:1296, in Context.get_iter(self, run_id, targets, save, max_workers, time_range, seconds_range, time_within, time_selection, selection_str, keep_columns, drop_columns, allow_multiple, progress_bar, _chunk_number, **kwargs)
   1294 pbar.last_print_t = time.time()
   1295 pbar.mbs = []
-> 1296 for n_chunks, result in enumerate(strax.continuity_check(generator), 1):
   1297     seen_a_chunk = True
   1298     if not isinstance(result, strax.Chunk):

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/chunk.py:303, in continuity_check(chunk_iter)
    300 last_runid = None
    302 last_subrun = {'run_id': None}
--> 303 for chunk in chunk_iter:
    304     if chunk.run_id != last_runid:
    305         last_end = None

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/processor.py:302, in ThreadedMailboxProcessor.iter(self)
    296 if exc is not None:
    297     # Reraise exception. This is outside the except block
    298     # to avoid the 'during handling of this exception, another
    299     # exception occurred' stuff from confusing the traceback
    300     # which is printed for the user
    301     self.log.debug("Reraising exception")
--> 302     raise exc.with_traceback(traceback)
    304 # Check the savers for any exception that occurred during saving
    305 # These are thrown back to the mailbox, but if that has already closed
    306 # it doesn't trigger a crash...
    307 # TODO: add savers inlined by parallelsourceplugin
    308 # TODO: need to look at plugins too if we ever implement true
    309 # multi-target mode
    310 for k, saver_list in self.components.savers.items():

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/mailbox.py:444, in Mailbox._read(self, subscriber_i)
    441     res = msg
    443 try:
--> 444     yield res
    445 except Exception as e:
    446     # TODO: Should I also handle timeout errors like this?
    447     self.kill_from_exception(e)

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/storage/common.py:633, in Saver.save_from(self, source, rechunk, executor)
    628         if _is_super_run:
    629             # If we are creating a superrun, we load data from subruns
    630             # and the loaded subrun chunk becomes a superun chunk:
    631             next_chunk = strax.transform_chunk_to_superrun_chunk(run_id, 
    632                                                                  next_chunk)  
--> 633         chunk = strax.Chunk.concatenate([chunk, next_chunk])
    634 else:
    635     chunk = next(source)

File /opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/chunk.py:292, in Chunk.concatenate(cls, chunks)
    279         raise ValueError(
    280             "Attempt to concatenate overlapping or "
    281             f"out-of-order chunks: {chunks} ")
    282     prev_end = c.end
    284 return cls(
    285     start=chunks[0].start,
    286     end=chunks[-1].end,
    287     dtype=chunks[0].dtype,
    288     data_type=data_type,
    289     data_kind=chunks[0].data_kind,
    290     run_id=run_id,
    291     subruns=subruns,
--> 292     data=np.concatenate([c.data for c in chunks]),
    293     target_size_mb=max([c.target_size_mb for c in chunks]))

File <__array_function__ internals>:180, in concatenate(*args, **kwargs)

TypeError: invalid type promotion with structured datatype(s).
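
The final TypeError is raised by np.concatenate inside strax.Chunk.concatenate. The sketch below is independent of strax and only shows that NumPy raises a TypeError when structured arrays with different fields are concatenated. That keep_columns leads to chunks with mixed dtypes reaching the saver is my assumption, not something I have confirmed. The field "extra" and the helper try_concatenate are made up for illustration.

```python
import numpy as np

# Hypothetical full dtype of a plugin output ("extra" is an invented field)
full_dtype = np.dtype([
    ("time", "<i8"),
    ("endtime", "<i8"),
    ("lifetime_loss", "<i8"),
    ("extra", "<f4"),
])
# Dtype left over after keeping only three columns
reduced_dtype = np.dtype([
    ("time", "<i8"),
    ("endtime", "<i8"),
    ("lifetime_loss", "<i8"),
])

full = np.zeros(2, dtype=full_dtype)
reduced = np.zeros(2, dtype=reduced_dtype)


def try_concatenate(a, b):
    """Return the concatenated array, or the TypeError NumPy raised."""
    try:
        return np.concatenate([a, b])
    except TypeError as err:
        return err


same = try_concatenate(full, full)      # identical dtypes: works
mixed = try_concatenate(full, reduced)  # differing fields: TypeError
print(type(same).__name__, type(mixed).__name__)
```

The exact error message depends on the NumPy version, so only the exception type is checked here.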
@RoBGlaBe RoBGlaBe added the bug Something isn't working label Aug 14, 2023