Version 5.1 dumps strings differently on different platforms #275

Closed
undera opened this issue Mar 17, 2019 · 10 comments
@undera

undera commented Mar 17, 2019

After 5.1 was released, our CI builds started to fail. The reason is that the output differs between Linux, Mac and Windows, depending on the Python version.
The problem shows up as different ways of dumping a string that contains tabs on different platforms. I'd accept any of the dump variants from 5.1 if it were consistent across platforms, but having it differ creates a mess with automated tests.

I have made a simple unit test to reproduce the problem (Blazemeter/taurus#1076). The code to reproduce it is:

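import unittest

import yaml

class MyTestCase(unittest.TestCase):
    def test_case(self):
        # A string with leading, inner and trailing tabs triggers the difference.
        data = {"str": "\tpart1\tpart2\t"}
        # With encoding set, safe_dump returns bytes on Python 3, hence the decode below.
        res = yaml.safe_dump(data,
                             default_flow_style=False, explicit_start=True, canonical=False,
                             allow_unicode=True, encoding='utf-8', width=float("inf"))
        res = res.decode('utf8')
        self.assertEqual('---\nstr: "\\tpart1\\tpart2\\t"\n', res)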

This test passes on Linux with any Python version and on Windows with Python 3.
It fails on macOS with Python 2.7 and on Windows with Python 2.7.
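
A test that compares parsed data instead of the exact dumped text is less sensitive to quoting differences between platforms. The following is a minimal sketch of that idea, not something proposed in this thread; `roundtrip` is a hypothetical helper name. Judging by the later comment that the affected output is unreadable for PyYAML, a round trip would presumably still fail on affected builds, so it should still catch the underlying bug.

```python
import yaml


def roundtrip(data):
    """Dump data to YAML text and parse it back (hypothetical helper)."""
    text = yaml.safe_dump(data, default_flow_style=False,
                          explicit_start=True, allow_unicode=True)
    return yaml.safe_load(text)


data = {"str": "\tpart1\tpart2\t"}
restored = roundtrip(data)

# The check is on the parsed value, not on how the tabs were quoted.
print(restored == data)
```

This does not replace a test on the exact output when the formatting itself matters, as it does for files that people read or diff.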

I double-checked that the version of PyYAML is 5.1 on all builds.

Relevant CI links:
Linux and Mac: https://travis-ci.org/Blazemeter/taurus/builds/507459846
Windows: https://ci.appveyor.com/project/undera/taurus/builds/23134929

@perlpunk
Member

Thanks, I have a Mac available right now and can reproduce it.
The reason seems to be that has_ucs4 is False (has_ucs4 = sys.maxunicode > 0xffff).
The logic deciding when to quote a string seems to be wrong there.
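
For context, the expression quoted above separates "wide" from "narrow" Unicode builds of Python. A small sketch for checking a given interpreter follows. The comments reflect general Python behaviour as I understand it, not anything taken from the PyYAML source. Narrow Python 2.7 builds would be consistent with the platforms where the test fails.

```python
import sys

# Same expression as quoted above: True on "wide" builds, where a string
# can hold code points above U+FFFF directly.
has_ucs4 = sys.maxunicode > 0xffff

# Python 3.3+ always reports 0x10FFFF; narrow Python 2.7 builds report 0xFFFF.
print("sys.maxunicode = %#x, has_ucs4 = %s" % (sys.maxunicode, has_ucs4))
```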

@perlpunk
Member

perlpunk commented Mar 17, 2019

I created #276
It seems the logic was wrong; I just replaced an "or" with an "and".
The bug was introduced in #63.
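
To illustrate why swapping "or" for "and" matters, here is a simplified sketch of the kind of condition involved. It is an assumption made for illustration, not a copy of the PyYAML source; the function names and the exact character ranges are hypothetical.

```python
def printable_unicode_buggy(ch, has_ucs4):
    # Hypothetical reconstruction: on a narrow build (has_ucs4 is False)
    # the last clause is True for every character, including a tab.
    return (ch == '\x85' or '\xa0' <= ch <= '\ud7ff'
            or '\ue000' <= ch <= '\ufffd'
            or ((not has_ucs4) or '\U00010000' <= ch <= '\U0010ffff'))


def printable_unicode_fixed(ch, has_ucs4):
    # With "and", the astral range only counts on wide builds, so a tab
    # is no longer mistaken for a printable Unicode character.
    return (ch == '\x85' or '\xa0' <= ch <= '\ud7ff'
            or '\ue000' <= ch <= '\ufffd'
            or (has_ucs4 and '\U00010000' <= ch <= '\U0010ffff'))


for has_ucs4 in (False, True):
    print(has_ucs4,
          printable_unicode_buggy('\t', has_ucs4),
          printable_unicode_fixed('\t', has_ucs4))
```

In this sketch, a character wrongly classified as printable is never flagged as special, which would explain why the string ends up unquoted on narrow builds only.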

@undera
Author

undera commented Mar 18, 2019

Thanks for reacting so quickly.
Do you have an estimate of when this fix will be released?

@perlpunk
Member

Hopefully in less than a couple of days. I can't make the release myself.

@perlpunk
Member

Since it's not really clear what the error is from the original post:
Strings with tabs (or other special characters) are dumped as plain scalars.

>>> import yaml
>>> string = "\tpart1\tpart2"
>>> print(yaml.dump(string, allow_unicode=True))

        part1   part2
...

which is unreadable (for humans as well as PyYAML)
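
Until a fixed release is available, one possible workaround is to force a quoting style, so the output no longer depends on the emitter's decision about plain scalars. This is a sketch of that idea, not a recommendation from the maintainers. It uses the `default_style` parameter of `yaml.dump`.

```python
import yaml

string = "\tpart1\tpart2"

# default_style='"' asks the emitter to double-quote every scalar,
# so tabs are written as \t escapes instead of raw whitespace.
quoted = yaml.dump(string, default_style='"', allow_unicode=True)
print(quoted)
```

The trade-off is that the style applies to every scalar in the document, including mapping keys, so the whole output looks different.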

@wvidana

wvidana commented Sep 3, 2019

Not sure if related, but I'm getting AttributeError: 'str' object has no attribute 'decode' in https://github.com/garnaat/kappa/blob/46709b6b790fead13294c2c18ffa5d63ea5133c7/kappa/context.py#L109

But only when I'm using Python 3.7 with PyYAML 5.1. It looks like the return value is already decoded now. This doesn't happen with Python 2.7 and PyYAML 3.12.
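
The error is consistent with how Python 3 separates text from bytes. As far as I know, `yaml.dump` on Python 3 returns a `str` unless an `encoding` argument is passed, and `str` has no `decode` method. Code that has to run on both Python versions could normalise the result first. A minimal sketch, where `to_text` is a hypothetical helper and not part of PyYAML:

```python
def to_text(dumped, encoding="utf-8"):
    """Return dumped YAML as text, whether it arrived as bytes or str.

    Hypothetical helper, not part of PyYAML.
    """
    if isinstance(dumped, bytes):
        return dumped.decode(encoding)
    return dumped


print(to_text(b"key: value\n"))
print(to_text("key: value\n"))
```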

@perlpunk
Member

perlpunk commented Sep 3, 2019

@wvidana this is unrelated to the reported issue.
You might want to create a new issue.
It would be important to know what happens with Python 2.7 + PyYAML 5.1 and with Python 3.7 + PyYAML 3.13.
Based on your information, this could also be a Python 2 vs. 3 issue.

@perlpunk
Member

We just released 5.2b1 https://pypi.org/project/PyYAML/5.2b1/ which should fix this.

@perlpunk
Member

perlpunk commented Dec 2, 2019

We released 5.2: https://pypi.org/project/PyYAML/5.2/

@perlpunk
Member

perlpunk commented Dec 2, 2019

Closing. Please reopen if necessary. Thanks!
