Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python3: behave incorrectly mocks stdout / stderr (Missing buffer property) #533

Closed
asottile opened this issue Feb 18, 2017 · 18 comments
Closed
Labels

Comments

@asottile
Copy link

From an interactive prompt

$ python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.buffer.write(b'foo\n')
foo
4

Demo Setup

$ find features/ -type f | xargs tail -n 999
==> features/steps/example_steps.py <==
import sys

from behave import given


@given('given')
def step_impl(context):
    sys.stdout.buffer.write(b'foo\n')
    sys.stderr.buffer.write(b'foo\n')

==> features/example.feature <==
Feature: Test

    Scenario: Run
       Given given

Demo output

$ behave
Feature: Test # features/example.feature:1

  Scenario: Run  # features/example.feature:3
    Given given  # features/steps/example_steps.py:6 0.000s
      Traceback (most recent call last):
        File "/tmp/foo/venv/lib/python3.5/site-packages/behave/model.py", line 1456, in run
          match.run(runner.context)
        File "/tmp/foo/venv/lib/python3.5/site-packages/behave/model.py", line 1903, in run
          self.func(context, *args, **kwargs)
        File "features/steps/example_steps.py", line 8, in step_impl
          sys.stdout.buffer.write(b'foo\n')
      AttributeError: '_io.StringIO' object has no attribute 'buffer'



Failing scenarios:
  features/example.feature:3  Run

0 features passed, 1 failed, 0 skipped
0 scenarios passed, 1 failed, 0 skipped
0 steps passed, 1 failed, 0 skipped, 0 undefined
Took 0m0.000s

Version information

$ behave --version
behave 1.2.5
$ python --version
Python 3.5.2
$ uname -a
Linux asottile-MacBookPro 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Expected behaviour

$ behave
Feature: Test # features/example.feature:1

  Scenario: Run  # features/example.feature:3
    Given given  # features/steps/example_steps.py:6 0.000s

1 feature passed, 0 failed, 0 skipped
1 scenario passed, 0 failed, 0 skipped
1 step passed, 0 failed, 0 skipped, 0 undefined
Took 0m0.000s
@jenisys
Copy link
Member

jenisys commented Feb 19, 2017

@asottile
Can you explain why you are using the low-level interface (buffer) instead of writing with unicode text to the output stream (sys.stdout) ?

@asottile
Copy link
Author

I'm not sure it's relevant why this is happening (it's really orthogonal given behave's mock is incorrect) but sure I can explain.

With a missing LANG environment variable (or LANG=C) the stdout stream attempts to encode to US-ASCII encoding which is not appropriate (a western bias).

$ LANG= python3 -c 'print("\u2603")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u2603' in position 0: ordinal not in range(128)

The bytestream does not suffer from this fault:

$ LANG= python3 -c 'import sys; sys.stdout.buffer.write("\u2603\n".encode("UTF-8"))'
☃

@asottile
Copy link
Author

In another project I maintain, the only interface to plugins is via subprocess. The framework itself makes no attempt to decode / transform the output of the plugins (to do so would create a nightmare of encoding bugs) and instead forwards the output directly to the stdout bytestream.

@jenisys
Copy link
Member

jenisys commented Feb 20, 2017

Sorry, I disagree. sys.stdout is meant for text and not for bytes transport. The approach that you suggest leads exactly to trouble that the encoding of sys.stdout is violated by you. Therefore, I think that you are shooting yourself in the foot and are complaining about this fact.

@asottile
Copy link
Author

asottile commented Feb 20, 2017

Whether you agree or not stdout and stderr are actually TextIOWrappers in python 3 so representing them as StringIO is incorrect.

>>> import io
>>> import sys
>>> isinstance(sys.stdout, io.TextIOWrapper)
True
>>> isinstance(sys.stderr, io.TextIOWrapper)
True

@bukzor
Copy link

bukzor commented Feb 21, 2017

@jenisys: your opinion is incorrect. Consider the venerable program"tar”. A common usage is to stream its (decidedly binary) output over SSH, from its stdout. Should we not be able to write this program in Python? (No, we should, and we can.) Should behave prevent it from functioning correctly under test?

@jenisys
Copy link
Member

jenisys commented Feb 22, 2017

@bukzor
Sorry, I disagree. We are talking about stdout that is captured in behave and not the ssh console-stream. In addition, if the reason for this to skip the encoding problems, such usage will cause other decoding problems when an exception is raised. Who should ensure then that encoding matches the stream-encoding of stdout ?!?

@asottile
Copy link
Author

@jenisys I think you misunderstood @bukzor, let's see if I can make a simpler example:

Consider I'm writing pycat, a portable version of cat written in the high-level python3 language.

Here's a simple test I might write that would ensure it works correctly:

# test 1: text file
echo 'foo' > f
pycat f
# assert output is 'foo\n'

But you'll notice that cat also supports arbitrarily encoded files, including binary files:

$ (head -c 15 /dev/urandom && echo) > f.bin
$ cat f.bin 
�/�e���Z��]ʉT�

If I were to write this in python, I would write it as follows:

import argparse
import sys

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('filenames', nargs='*')
    args = parser.parse_args(argv)
    # TODO: zero filenames would read from stdin if compliant, but this is a silly example
    for filename in args.filenames:
        with open(filename, 'rb') as f:
            # py2 + py3 compatibility, in python2, sys.stdout is a byte stream, in python3 use the buffer property
            getattr(sys.stdout, 'buffer', sys.stdout).write(f.read())

if __name__ == '__main__':
    exit(main())

My implementation works!

$ python2.7 pycat.py f.txt
hello world
$ python3.6 pycat.py f.txt
hello world
$ python2.7 pycat.py f.bin
�/�e���Z��]ʉT�
$ python3.6 pycat.py f.bin
�/�e���Z��]ʉT�

And I might test it as follows (if behave would allow it! except there's no attribute buffer on the stdio monkeypatch)

# Write 'hello world\n' to a file f
# Call pycat.main(('f',))
# Expect output to be 'hello world\n'

@jenisys
Copy link
Member

jenisys commented Feb 26, 2017

@asottile
Maybe, your implementation works. I am not completely convinced because now you add the following: throw an exception and show/print the traceback. Or read back the stream output with the current stream encoding.

The problem is the following. The stdout/stderr streams have an encoding. Your clearly violate the encoding (at least in your example) by using the low-level API (and the randomly generated data) and do not ensure that the encoding is preserved (in the binary data). Therefore, anybody who tries to read back the binary data from the steam output and tries to display it (therefore needs to decode them) is fucked.

Just because you can do something, is not a reason to really do it and encourage others to do it. Provide a compelling reason why this is needed (and a solution that complies with the encoding of the output streams), then I might consider it as real problem (and accept the pull-request). "Works for me" is something different as "works in all cases".

NOTE: Behave uses textual streams (meaning unicode) for stdout/stderr. Therefore, the streams have an encoding, for example "UTF-8". What you do with your other streams is not relevant here.

@asottile
Copy link
Author

You'll notice that I change nothing about the interface. These are still text streams, but they add an additional interface which conforms to the interface of sys.stdout and sys.stderr. They still enforce writing of text and they still retrieve text. The interface is entirely backwards compatible!

>>> from behave.capture import CaptureIO
>>> io = CaptureIO()
>>> io.write(b'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: write() argument must be str, not bytes
>>> io.write('bar')
3
>>> io.getvalue()
'bar'
>>> type(io.getvalue()) is bytes
False

pytest (a much more popular testing framework) considered this to be a bug and accepted an almost identical pull request.

Like you said "works for me" is your current broken case (you only use write(), whereas others use buffer but cannot use your framework). Whereas adding the buffer property "works in all cases".

What I do with my other streams is absolutely relevant here. behave is a testing framework! It should keep its opinions to itself and enable me to write code however I want, however you seem to be interjecting your opinions and forcing me to fork or deprecate your framework.

If you believe how I do my streams is not relevant, please demonstrate how I can test pycat above with the current master of behave and I'll do that. I'd be happy to take your suggestions in this regard.

@jenisys
Copy link
Member

jenisys commented Feb 26, 2017

@asottile
You did not answer my primary point:
How do you ensure the stream's encoding is not violated when you write binary data to its buffer ?

Behave must display the captured contents of this stream when a failure occurs. When you inject your BAD-BINARY-DATA into it, behave will not deal with this problem.

@asottile
Copy link
Author

I was planning to add a second request if this was accepted to be able to read the binary data. As is it should rightfully crash

@asottile
Copy link
Author

Note that this is already the case in python 2 -- imo not a regression.

@stale
Copy link

stale bot commented Feb 13, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Comment to post when closing a stale issue. Set to false to disable

@stale stale bot added the wontfix label Feb 13, 2018
@stale stale bot closed this as completed Feb 27, 2018
@jenisys jenisys reopened this Mar 4, 2018
@stale stale bot removed the wontfix label Mar 4, 2018
@solarkennedy
Copy link

I can confirm this is still an issue, I'm running pip install git+git://github.com/asottile/behave@issue_533 and using @asottile 's fork as a workaround (thanks!).

@asottile
Copy link
Author

asottile commented Aug 1, 2018

@solarkennedy Not sure if it's helpful for your situation but pytest now has direct support for capfdbinary and capsysbinary fixtures for capturing -- potentially behave could be avoided entirely

@solarkennedy
Copy link

Aw, but I like behave! :)

@stale
Copy link

stale bot commented Jan 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jan 30, 2019
@stale stale bot closed this as completed Feb 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants