
Pytest 5.4.0 breaks tests #93

Open
Gallaecio opened this issue Mar 13, 2020 · 20 comments

@Gallaecio

It seems to happen in the cleanup stage after all the tests of a TestCase, and marks the last test of that TestCase as a failure.

===================================================================== test session starts ======================================================================
platform linux -- Python 3.6.7, pytest-5.4.0, py-1.8.0, pluggy-0.13.0
rootdir: /home/adrian/proxectos/scrapy, inifile: pytest.ini
plugins: pylint-0.14.0, twisted-1.12, forked-1.0.1, xdist-1.26.0, cov-2.8.1
collected 4 items                                                                                                                                              

tests/test_command_fetch.py F...                                                                                                                         [100%]

=========================================================================== FAILURES ===========================================================================
____________________________________________________________________ FetchTest.test_headers ____________________________________________________________________

self = <Process pid=None status=0>

    def reapProcess(self):
        """
        Try to reap a process (without blocking) via waitpid.
    
        This is called when sigchild is caught or a Process object loses its
        "connection" (stdout is closed) This ought to result in reaping all
        zombie processes, since it will be called twice as often as it needs
        to be.
    
        (Unfortunately, this is a slightly experimental approach, since
        UNIX has no way to be really sure that your process is going to
        go away w/o blocking.  I don't want to block.)
        """
        try:
            try:
>               pid, status = os.waitpid(self.pid, os.WNOHANG)
E               TypeError: an integer is required (got type NoneType)

/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/process.py:299: TypeError

During handling of the above exception, another exception occurred:

self = <twisted.internet.epollreactor.EPollReactor object at 0x7fe23908a828>, selectable = <twisted.internet.posixbase._SIGCHLDWaker object at 0x7fe2365af358>
fd = 15, event = 1

    def _doReadOrWrite(self, selectable, fd, event):
        """
        fd is available for read or write, do the work and raise errors if
        necessary.
        """
        why = None
        inRead = False
        if event & self._POLL_DISCONNECTED and not (event & self._POLL_IN):
            # Handle disconnection.  But only if we finished processing all
            # the pending input.
            if fd in self._reads:
                # If we were reading from the descriptor then this is a
                # clean shutdown.  We know there are no read events pending
                # because we just checked above.  It also might be a
                # half-close (which is why we have to keep track of inRead).
                inRead = True
                why = CONNECTION_DONE
            else:
                # If we weren't reading, this is an error shutdown of some
                # sort.
                why = CONNECTION_LOST
        else:
            # Any non-disconnect event turns into a doRead or a doWrite.
            try:
                # First check to see if the descriptor is still valid.  This
                # gives fileno() a chance to raise an exception, too.
                # Ideally, disconnection would always be indicated by the
                # return value of doRead or doWrite (or an exception from
                # one of those methods), but calling fileno here helps make
                # buggy applications more transparent.
                if selectable.fileno() == -1:
                    # -1 is sort of a historical Python artifact.  Python
                    # files and sockets used to change their file descriptor
                    # to -1 when they closed.  For the time being, we'll
                    # continue to support this anyway in case applications
                    # replicated it, plus abstract.FileDescriptor.fileno
                    # returns -1.  Eventually it'd be good to deprecate this
                    # case.
                    why = _NO_FILEDESC
                else:
                    if event & self._POLL_IN:
                        # Handle a read event.
>                       why = selectable.doRead()

/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/posixbase.py:614: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/posixbase.py:227: in doRead
    process.reapAllProcesses()
/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/process.py:63: in reapAllProcesses
    process.reapProcess()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Process pid=None status=0>

    def reapProcess(self):
        """
        Try to reap a process (without blocking) via waitpid.
    
        This is called when sigchild is caught or a Process object loses its
        "connection" (stdout is closed) This ought to result in reaping all
        zombie processes, since it will be called twice as often as it needs
        to be.
    
        (Unfortunately, this is a slightly experimental approach, since
        UNIX has no way to be really sure that your process is going to
        go away w/o blocking.  I don't want to block.)
        """
        try:
            try:
                pid, status = os.waitpid(self.pid, os.WNOHANG)
            except OSError as e:
                if e.errno == errno.ECHILD:
                    # no child process
                    pid = None
                else:
                    raise
        except:
>           log.msg('Failed to reap %d:' % self.pid)
E           TypeError: %d format: a number is required, not NoneType

/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/process.py:307: TypeError
--------------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------------
Unhandled Error
Traceback (most recent call last):
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
    why = selectable.doRead()
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/posixbase.py", line 227, in doRead
    process.reapAllProcesses()
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/process.py", line 63, in reapAllProcesses
    process.reapProcess()
  File "/home/adrian/.local/share/virtualenvs/scrapy-R00A9LxV/lib/python3.6/site-packages/twisted/internet/process.py", line 307, in reapProcess
    log.msg('Failed to reap %d:' % self.pid)
builtins.TypeError: %d format: a number is required, not NoneType

=================================================================== short test summary info ====================================================================
FAILED tests/test_command_fetch.py::FetchTest::test_headers - TypeError: %d format: a number is required, not NoneType
================================================================= 1 failed, 3 passed in 4.51s ==================================================================
@altendky

Thanks for reporting. I've set up a nightly build and it passed this morning with pytest 5.4.0, but that's master, not v1.12. I suppose a nightly of the release version would be good too... Anyway, would you please share a pip freeze of your env? I haven't recreated the issue locally either. My local is Ubuntu 19.10 and (amongst others) I have Python 3.6.10, so it's at least close-ish to your setup. Thanks for the help.

@altendky

Alrighty, I see it over in the Scrapy build on Travis.

https://travis-ci.org/github/scrapy/scrapy/jobs/661676387?utm_medium=notification&utm_source=github_status

And of course I was quickly just running pytest-twisted tests. I'll let you know what I can figure out.

@altendky

I'm looking, but have you done any debugging on this? So far it looks like scrapy code is managing the subprocess by way of FetchTest inheriting from ProcessTest? It also looks to be an issue in the first test within the class, and that the process is being reaped before it has even been forked?


@altendky

got it running with pytest directly, my bad

@altendky

From what I can tell it happens in the first test run from FetchTest while yielding to ProcessTest.execute(). For me the first one run is FetchTest.test_headers(), but if I comment that out then the first is FetchTest.test_output(), which then fails.

https://github.com/scrapy/scrapy/blob/ccc4d88779cf2827431ef9e73f976f755c04fe0e/tests/test_command_fetch.py

https://github.com/scrapy/scrapy/blob/ccc4d88779cf2827431ef9e73f976f755c04fe0e/scrapy/utils/testproc.py#L13-L21

    def execute(self, args, check_code=True, settings=None):
        env = os.environ.copy()
        if settings is not None:
            env['SCRAPY_SETTINGS_MODULE'] = settings
        cmd = self.prefix + [self.command] + list(args)
        pp = TestProcessProtocol()
        pp.deferred.addBoth(self._process_finished, cmd, check_code)
        reactor.spawnProcess(pp, cmd[0], cmd, env=env, path=self.cwd)
        return pp.deferred

Git bisect points at pytest-dev/pytest@04f27d4.

git bisect run bash -c 'cd ../scrapy; venv/bin/pytest tests/test_command_fetch.py'

@blueyed, maybe you have some quick insight to share? If not, we can keep digging.

I will note that there's a pretty significant rework over in #91 changing how pytest-twisted uses the pytest plugin hooks (significant relative to the fairly small pytest-twisted, that is). Sadly that didn't seem to change anything.

So I think we can call log.msg('Failed to reap %d:' % self.pid) a bug in Twisted, since it should log for any not-explicitly-evil self.pid, including None. Fixing that removes one layer of distraction, anyway. It also gives us the error for all four tests in FetchTest.

FAILED tests/test_command_fetch.py::FetchTest::test_headers - TypeError: an integer is required (got type NoneType)
FAILED tests/test_command_fetch.py::FetchTest::test_output - TypeError: an integer is required (got type NoneType)
FAILED tests/test_command_fetch.py::FetchTest::test_redirect_default - TypeError: an integer is required (got type NoneType)
FAILED tests/test_command_fetch.py::FetchTest::test_redirect_disabled - TypeError: an integer is required (got type NoneType)
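As a side note, the masking is easy to demonstrate in isolation: %d raises on a None pid, so the log line itself blows up and hides the original waitpid failure, while a format that accepts any value (%r here, purely as an illustration, not necessarily the actual Twisted fix) logs regardless:

```python
# The pid observed in the tracebacks above, after the process was reaped.
pid = None

# 'Failed to reap %d:' % None raises, masking the original waitpid error.
try:
    'Failed to reap %d:' % pid
    d_format_worked = True
except TypeError:
    d_format_worked = False

assert not d_format_worked
# %r formats any value, so the log line survives a None pid.
assert 'Failed to reap %r:' % pid == 'Failed to reap None:'
```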

And if you further avoid calling os.waitpid() on None then all the FetchTest tests pass. Though I'm less confident that is totally proper, and even if it is, it may still indicate a further issue. I'll at least start some discussion over there.

And yes... I see that this has nothing to do with pytest or pytest-twisted. I expect the potential further issue indicated above may lead back to one of them (or perhaps scrapy itself).
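For illustration, here is a sketch of the kind of guard meant by "avoid calling os.waitpid() on None"; the helper name is mine, not Twisted's:

```python
import os

def reap_if_possible(pid):
    """Non-blocking reap: return (pid, status), or None when there is no
    pid to wait on (e.g. the process was already reaped)."""
    if pid is None:
        return None  # the guard: never hand None to os.waitpid()
    return os.waitpid(pid, os.WNOHANG)

# Without the guard, os.waitpid(None, ...) raises the TypeError from the
# traceback above ("an integer is required (got type NoneType)").
assert reap_if_possible(None) is None
```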

@Gallaecio

Running those tests with trial also works, which is why I ruled out Twisted and Scrapy themselves being at fault.

@altendky

So my present analysis is that pytest changed something (I haven't gotten to analyzing what) that now exposes that scrapy calls a deferred's .callback() in TestProcessProtocol.processEnded() while the associated subprocess is being reaped. This callback ends up letting Twisted start reaping all processes while in an invalid state: the process's .pid is None, but the process has not been removed from the reapProcessHandlers list. That triggers a TypeError from os.waitpid(), which expects an integer, and then a second TypeError because the error logging also assumes the pid is an integer.

I still plan to try to figure out, from the Twisted side, whether this is an error in how the pid gets set to None and the process gets removed from the list, or whether there is a documentation improvement to make that would explain why you can't call .callback() in ProcessProtocol.processEnded(). I'm not aware at the moment of any reason we can't just change Twisted to address this, but I'm certainly no expert in this part of Twisted.

scrapy/scrapy#4426 is an exploratory PR in case Twisted decides that the answer is to not allow .callback() in ProcessProtocol.processEnded().

@blueyed

blueyed commented Mar 15, 2020

@altendky
It is the same / similar problem as with pytest-dev/pytest-django#824 (comment): pytest is "raising out of" the unittest.

@altendky

Avoid log failure when reaping of None PID is requested

(not a solution, just a thing to be fixed)

@blueyed, thanks. I'll read up on that when I get back to the core issue. Got another Twisted PR to fit in etc...

@altendky

t.i.process._BaseProcess.reapProcess() sets self.pid to None before unregistering

So... I'm not sure this should be called a fix, but it does seem to alleviate the symptom. @Gallaecio, could you give this a try at some point? The thing I don't like is not being sure why the scrapy tests end up back in the reactor, where it tries to reap all the processes while still in the middle of reaping the one being tested.
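The ordering in the linked change can be modeled in plain Python (FakeProcess and reap_handlers are illustrative names, not Twisted's real APIs): if the pid is cleared before the handler is unregistered, a re-entrant reap-all pass can observe a registered process whose pid is already None, which is the <Process pid=None> state in the tracebacks above.

```python
# Toy model of the ordering window; names here are hypothetical.
reap_handlers = []

class FakeProcess:
    def __init__(self, pid):
        self.pid = pid
        reap_handlers.append(self)

def buggy_reap(proc):
    proc.pid = None              # step 1: pid cleared first
    # ...a re-entrant reap-all pass arriving here still finds proc...
    reap_handlers.remove(proc)   # step 2: unregistered too late

def fixed_reap(proc):
    reap_handlers.remove(proc)   # unregister first
    proc.pid = None              # then clear the pid

proc = FakeProcess(1234)
proc.pid = None  # emulate being interrupted between the two buggy steps
invalid_state = proc in reap_handlers and proc.pid is None
assert invalid_state  # a registered handler with pid=None: waitpid would fail
reap_handlers.remove(proc)
```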

@Gallaecio

I'll try to give it a go this week.

@altendky

Alrighty, so at least there's a test in pytest-twisted itself that shows the same issue... twisted/twisted#1240 'fixes' it. Though I think I want to test more specifically around the 're-entering' into the reactor.

Fun, with the asyncio reactor you get RuntimeError: This event loop is already running. :] Which is really a _much_ more descriptive error...
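That asyncio error is easy to reproduce in isolation; a minimal sketch, with no Twisted involved, showing that re-entering a running event loop (as reactor.iterate(0) effectively does under the asyncio reactor) raises exactly that RuntimeError:

```python
import asyncio

loop = asyncio.new_event_loop()
results = []

def reenter():
    try:
        loop.run_forever()  # second, re-entrant run_forever call
    except RuntimeError as e:
        results.append(str(e))
    finally:
        loop.stop()  # stop the outer run_forever so the script exits

loop.call_soon(reenter)
loop.run_forever()
loop.close()

assert results == ['This event loop is already running']
```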

@altendky

Alrighty, so since asyncio already complains let's maybe start there. I think the reactor.iterate(0) call is probably the place I'll dig when I find some time. I _should_ be familiar with trial, but I'm not. Let alone the interaction between pytest-twisted and trial.

https://github.com/pytest-dev/pytest-twisted/pull/94/checks?check_run_id=546309854

test_scrapy____something.py::Test::test FAILED                           [100%]
test_scrapy____something.py::Test::test ERROR                            [100%]

==================================== ERRORS ====================================
________________________ ERROR at teardown of Test.test ________________________

self = <test_scrapy____something.Test testMethod=test>
result = <TestCaseFunction test>

    def _classCleanUp(self, result):
        try:
>           util._Janitor(self, result).postClassCleanup()

/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/trial/_asynctest.py:226: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/trial/util.py:113: in postClassCleanup
    calls = self._cleanPending()
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/internet/utils.py:238: in warningSuppressingWrapper
    return runWithWarningsSuppressed(suppressedWarnings, f, *a, **kw)
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/internet/utils.py:221: in runWithWarningsSuppressed
    reraise(exc_info[1], exc_info[2])
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/python/compat.py:464: in reraise
    raise exception.with_traceback(traceback)
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/internet/utils.py:217: in runWithWarningsSuppressed
    result = f(*a, **kw)
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/trial/util.py:138: in _cleanPending
    reactor.iterate(0)
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/trial/_asynctest.py:242: in _
    return self._reactorMethods[name](*a, **kw)
/home/runner/work/pytest-twisted/pytest-twisted/.tox/py38-asyncioreactor/lib/python3.8/site-packages/twisted/internet/asyncioreactor.py:262: in iterate
    self._asyncioEventloop.run_forever()
/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/base_events.py:560: in run_forever
    self._check_running()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_UnixSelectorEventLoop running=False closed=False debug=False>

    def _check_running(self):
        if self.is_running():
>           raise RuntimeError('This event loop is already running')
E           RuntimeError: This event loop is already running

/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/base_events.py:552: RuntimeError

@blueyed

blueyed commented Mar 30, 2020

@altendky
I've not really followed up on the details, but make sure not to get fooled by pytest 5.4 breaking unittest teardown. While investigating the follow-up failures might be interesting, it can also be a time sink, and it is an issue with pytest after all.

@altendky

@blueyed, thanks for poking me on this. It sounds like you are confident this is just a pytest bug. Would you expect the (indirectly) linked blueyed/pytest#107 to fix it? It doesn't seem to. If you think I shouldn't bother, I'll stop and wait and see what happens with pytest. If you think I might dig up some useful info... I might find some more time for this.

Thanks for looking out for me; just like everyone else, I've certainly got other open issues to be working on.

@blueyed

blueyed commented Mar 31, 2020

blueyed/pytest#107 was just an attempt to refactor it to make it work, nothing finished.
There might be other issues here, but pytest currently raises out of unittests using a KeyboardInterrupt exception, so e.g. Django will not clean up its databases afterwards.

@altendky

The two related Twisted issues have been completed and are available in trunk. This may be sufficient to alleviate the symptoms in the scrapy tests. I don't plan to dig into this a lot more right now, not because I think these Twisted changes count as a fix, but rather because of the encouragement to let pytest fix it. I've got a few other open source activities I'm trying to get work done on too.

@altendky

@Gallaecio, per the test I added in #94 it seems that pytest==5.4.2 may have fixed this. Checking in scrapy/scrapy#4588.

@altendky

It looks like pytest-dev/pytest#7156 may be the fix (if I did my backwards bisect correctly).

@Gallaecio, does it seem fixed to you also? (PR passed)
