Bad magic unexpectedly passes #147

ceball · 2020-06-11T11:25:28Z

If I create a new notebook nb.ipynb with two cells:

In[1]:

%notmagic
x = 0
raise TypeError

In[2]:

x = 1

I would expect python -m pytest -v --nbval-lax nb.ipynb to fail, but it passes.

$ python -m pytest -v --nbval-lax nb.ipynb 
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.12.0 -- /home/sefkw/mc3/envs/celltestsui/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.7.6', 'Platform': 'Linux-5.4.0-7634-generic-x86_64-with-debian-bullseye-sid', 'Packages': {'pytest': '5.4.1', 'py': '1.8.1', 'pluggy': '0.12.0'}, 'Plugins': {'nbval': '0.9.5', 'xdist': '1.32.0', 'html': '2.1.1', 'metadata': '1.8.0', 'cov': '2.8.1', 'forked': '1.1.2'}}
rootdir: /home/sefkw/code/external/nbval
plugins: nbval-0.9.5, xdist-1.32.0, html-2.1.1, metadata-1.8.0, cov-2.8.1, forked-1.1.2
collected 2 items                                                                                                                                                         

nb::ipynb::Cell 0 PASSED                                                                                                                                            [ 50%]
nb::ipynb::Cell 1 PASSED                                                                                                                                            [100%]

============================================================================ warnings summary =============================================================================
nbval/plugin.py:115
  /home/sefkw/code/external/nbval/nbval/plugin.py:115: PytestDeprecationWarning: direct construction of IPyNbFile has been deprecated, please use IPyNbFile.from_parent
    return IPyNbFile(path, parent)

nbval/plugin.py:312
nbval/plugin.py:312
  /home/sefkw/code/external/nbval/nbval/plugin.py:312: PytestDeprecationWarning: direct construction of IPyNbCell has been deprecated, please use IPyNbCell.from_parent
    cell, options)

nb.ipynb::Cell 0
  /home/sefkw/mc3/envs/celltestsui/lib/python3.7/site-packages/jupyter_client/manager.py:63: DeprecationWarning: KernelManager._kernel_spec_manager_changed is deprecated in traitlets 4.1: use @observe and @unobserve instead.
    def _kernel_spec_manager_changed(self):

-- Docs: https://docs.pytest.org/en/latest/warnings.html
====================================================================== 2 passed, 4 warnings in 0.55s ======================================================================

In jupyter lab, "run all" stops execution at the first cell and reports UsageError: Line magic function '%notmagic' not found

I'm replacing an existing "notebook checking tool" with nbval; that tool (based on nbconvert) also correctly reports the same failure.

The text was updated successfully, but these errors were encountered:

ceball · 2020-06-11T11:34:20Z

Replacing %notmagic with e.g. %dirs, I get the expected result from nbval:

$ python -m pytest -v --nbval-lax nb.ipynb 
=============================== test session starts ================================
platform linux -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.12.0 -- /home/sefkw/mc3/envs/celltestsui/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.7.6', 'Platform': 'Linux-5.4.0-7634-generic-x86_64-with-debian-bullseye-sid', 'Packages': {'pytest': '5.4.1', 'py': '1.8.1', 'pluggy': '0.12.0'}, 'Plugins': {'nbval': '0.9.5', 'xdist': '1.32.0', 'html': '2.1.1', 'metadata': '1.8.0', 'cov': '2.8.1', 'forked': '1.1.2'}}
rootdir: /home/sefkw/code/external/nbval
plugins: nbval-0.9.5, xdist-1.32.0, html-2.1.1, metadata-1.8.0, cov-2.8.1, forked-1.1.2
collected 2 items                                                                  

nb::ipynb::Cell 0 FAILED                                                     [ 50%]
nb::ipynb::Cell 1 PASSED                                                     [100%]

===================================== FAILURES =====================================
_________________________________ nb.ipynb::Cell 0 _________________________________
Notebook cell execution failed
Cell 0: Cell execution caused an exception

Input:
%dirs
x = 0
raise TypeError

Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-e269ed07c00d> in <module>
      1 get_ipython().run_line_magic('dirs', '')
      2 x = 0
----> 3 raise TypeError

TypeError: 

================================= warnings summary =================================
nbval/plugin.py:115
  /home/sefkw/code/external/nbval/nbval/plugin.py:115: PytestDeprecationWarning: direct construction of IPyNbFile has been deprecated, please use IPyNbFile.from_parent
    return IPyNbFile(path, parent)

nbval/plugin.py:312
nbval/plugin.py:312
  /home/sefkw/code/external/nbval/nbval/plugin.py:312: PytestDeprecationWarning: direct construction of IPyNbCell has been deprecated, please use IPyNbCell.from_parent
    cell, options)

nb.ipynb::Cell 0
  /home/sefkw/mc3/envs/celltestsui/lib/python3.7/site-packages/jupyter_client/manager.py:63: DeprecationWarning: KernelManager._kernel_spec_manager_changed is deprecated in traitlets 4.1: use @observe and @unobserve instead.
    def _kernel_spec_manager_changed(self):

-- Docs: https://docs.pytest.org/en/latest/warnings.html
============================= short test summary info ==============================
FAILED nb.ipynb::Cell 0
===================== 1 failed, 1 passed, 4 warnings in 0.59s ======================

ceball · 2020-06-12T10:55:57Z

For the "bad magic", you get a msg['content']['status'] of 'error' (while getting stream='shell' messages):

nbval/nbval/kernel.py

Lines 149 to 164 in 3597e0b

    
               def await_reply(self, msg_id, timeout=None): 
        
                   """ 
        
                   Continuously poll the kernel 'shell' stream for messages until: 
        
                    - It receives an 'execute_reply' status for the given message id 
        
                    - The timeout is reached awaiting a message, in which case 
        
                      a `Queue.Empty` exception will be raised. 
        
                   """ 
        
                   while True: 
        
                       msg = self.get_message(stream='shell', timeout=timeout) 
        
                       # Is this the message we are waiting for? 
        
                       if msg['parent_header'].get('msg_id') == msg_id: 
        
                           if msg['content']['status'] == 'aborted': 
        
                               # This should not occur! 
        
                               raise RuntimeError('Kernel aborted execution request') 
        
                           return

(You can see that message in kernelspy in the screenshot above.)

Here, 'error' is treated no differently from any other status (except 'aborted').

Meanwhile, in runtest(), while getting outputs only from the iopub channel, the only related message is a 'stream' message.

nbval/nbval/plugin.py

Lines 586 to 718 in 3597e0b

    
           # Now get the outputs from the iopub channel 
        
           while True: 
        
               # The iopub channel broadcasts a range of messages. We keep reading 
        
               # them until we find the message containing the side-effects of our 
        
               # code execution. 
        
               try: 
        
                   # Get a message from the kernel iopub channel 
        
                   msg = self.parent.get_kernel_message(timeout=self.output_timeout) 
        
               except Empty: 
        
                   # This is not working: ! The code will not be checked 
        
                   # if the time is out (when the cell stops to be executed?) 
        
                   # Halt kernel here! 
        
                   kernel.stop() 
        
                   if timed_out_this_run: 
        
                       self.raise_cell_error( 
        
                           "Timeout of %g seconds exceeded while executing cell." 
        
                           " Failed to interrupt kernel in %d seconds, so " 
        
                           "failing without traceback." % 
        
                               (timeout, self.output_timeout), 
        
                       ) 
        
                   else: 
        
                       self.parent.timed_out = True 
        
                       self.raise_cell_error( 
        
                           "Timeout of %d seconds exceeded waiting for output." % 
        
                               self.output_timeout, 
        
                       ) 
        
               # now we must handle the message by checking the type and reply 
        
               # info and we store the output of the cell in a notebook node object 
        
               msg_type = msg['msg_type'] 
        
               reply = msg['content'] 
        
               out = NotebookNode(output_type=msg_type) 
        
               # Is the iopub message related to this cell execution? 
        
               if msg['parent_header'].get('msg_id') != msg_id: 
        
                   continue 
        
               # When the kernel starts to execute code, it will enter the 'busy' 
        
               # state and when it finishes, it will enter the 'idle' state. 
        
               # The kernel will publish state 'starting' exactly 
        
               # once at process startup. 
        
               if msg_type == 'status': 
        
                   if reply['execution_state'] == 'idle': 
        
                       break 
        
                   else: 
        
                       continue 
        
               # execute_input: To let all frontends know what code is 
        
               # being executed at any given time, these messages contain a 
        
               # re-broadcast of the code portion of an execute_request, 
        
               # along with the execution_count. 
        
               elif msg_type == 'execute_input': 
        
                   continue 
        
               # com? execute reply? 
        
               elif msg_type.startswith('comm'): 
        
                   continue 
        
               elif msg_type == 'execute_reply': 
        
                   continue 
        
               # This message type is used to clear the output that is 
        
               # visible on the frontend 
        
               # elif msg_type == 'clear_output': 
        
               #     outs = [] 
        
               #     continue 
        
               # elif (msg_type == 'clear_output' 
        
               #       and msg_type['execution_state'] == 'idle'): 
        
               #     outs = [] 
        
               #     continue 
        
               # 'execute_result' is equivalent to a display_data message. 
        
               # The object being displayed is passed to the display 
        
               # hook, i.e. the *result* of the execution. 
        
               # The only difference is that 'execute_result' has an 
        
               # 'execution_count' number which does not seems useful 
        
               # (we will filter it in the sanitize function) 
        
               # 
        
               # When the reply is display_data or execute_result, 
        
               # the dictionary contains 
        
               # a 'data' sub-dictionary with the 'text' AND the 'image/png' 
        
               # picture (in hexadecimal). There is also a 'metadata' entry 
        
               # but currently is not of much use, sometimes there is information 
        
               # as height and width of the image (CHECK the documentation) 
        
               # Thus we iterate through the keys (mimes) 'data' sub-dictionary 
        
               # to obtain the 'text' and 'image/png' information 
        
               elif msg_type in ('display_data', 'execute_result'): 
        
                   out['metadata'] = reply['metadata'] 
        
                   out['data'] = reply['data'] 
        
                   outs.append(out) 
        
                   if msg_type == 'execute_result': 
        
                       out.execution_count = reply['execution_count'] 
        
               # if the message is a stream then we store the output 
        
               elif msg_type == 'stream': 
        
                   out.name = reply['name'] 
        
                   out.text = reply['text'] 
        
                   outs.append(out) 
        
               # if the message type is an error then an error has occurred during 
        
               # cell execution. Therefore raise a cell error and pass the 
        
               # traceback information. 
        
               elif msg_type == 'error': 
        
                   # Store error in output first 
        
                   out['ename'] = reply['ename'] 
        
                   out['evalue'] = reply['evalue'] 
        
                   out['traceback'] = reply['traceback'] 
        
                   outs.append(out) 
        
                   if not self.options['check_exception']: 
        
                       # Ensure we flush iopub before raising error 
        
                       try: 
        
                           self.parent.kernel.await_idle(msg_id, self.output_timeout) 
        
                       except Empty: 
        
                           self.stop() 
        
                           raise RuntimeError('Timed out waiting for idle kernel!') 
        
                       traceback = '\n' + '\n'.join(reply['traceback']) 
        
                       if out['ename'] == 'KeyboardInterrupt' and self.parent.timed_out: 
        
                           msg = "Timeout of %g seconds exceeded executing cell" % timeout 
        
                       else: 
        
                           msg = "Cell execution caused an exception" 
        
                       self.raise_cell_error(msg, traceback) 
        
               # any other message type is not expected 
        
               # should this raise an error? 
        
               else: 
        
                   print("unhandled iopub msg:", msg_type)

Here, only messages of type 'error' result in a cell error; 'stream' messages merely contribute to output.

So, no error is detected for bad magics. (Not sure if there could be other things that are missed too.)

This is in contrast to e.g. a regular python exception, which results in an iopub message of type 'error' and is hence detected here in runtest().

I'm not sure what the solution should be. E.g. I don't know if it's safe to be checking only iopub messages for errors, or if something else in the stack should be sending an iopub error message for bad magics (just like for a regular python execution failure).

ceball linked a pull request Jun 11, 2020 that will close this issue

WIP: Fix handling of bad magics #148

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bad magic unexpectedly passes #147

Bad magic unexpectedly passes #147

ceball commented Jun 11, 2020

ceball commented Jun 11, 2020

ceball commented Jun 12, 2020

Bad magic unexpectedly passes #147

Bad magic unexpectedly passes #147

Comments

ceball commented Jun 11, 2020

ceball commented Jun 11, 2020

ceball commented Jun 12, 2020