Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yaml serializer fails since pyyaml 4.1 #366

Closed
jbagot opened this issue Jun 27, 2018 · 10 comments
Closed

Yaml serializer fails since pyyaml 4.1 #366

jbagot opened this issue Jun 27, 2018 · 10 comments

Comments

@jbagot
Copy link

jbagot commented Jun 27, 2018

Yesterday pyyaml was updated to version 4.1
And now I'm receiving this error:
yaml.representer.RepresenterError: ('cannot represent an object', URL('<url>')
The problem is in the vcr/serializers/yamlserializer.py file.
When the method serialize method is called.

I found this pull request that was pushed in the version 4.1 of pyyaml:
yaml/pyyaml#74
where they change the behaviour dump, Dumper, CDumper and the same with the loaders. Before the update this methods were "danger" and now are "safe" then we can change the line in the serializer to the "danger" version because is exactly the same that we use before this update.
I'll try to fix it later.

@jbagot jbagot changed the title Yaml representer fails since pyyaml 4.1 Yaml serializer fails since pyyaml 4.1 Jun 27, 2018
@stefangordon
Copy link

stefangordon commented Jun 27, 2018

I'm seeing even scarier things with pyyaml 4.1 -- vcrpy is actually overwriting cassettes during test runs (with record mode=once) on pyyaml 4.1 - haven't figured out why yet....

Update:
I confirmed if I recorded the tests with pyyaml 4.1 installed the problem went away, but if they were recorded with 3.12 then run on a CI machine with 4.1 it would "not see" the cassette and try to make real HTTP calls and overwrite the cassette file. Downgrading CI machine to 3.12 resolved, and upgrading dev box to 4.1 to make new cassettes also resolved.

Somehow new pyyaml wasn't reading the cassettes, so vcr thought they were empty, and its record mode kicked in.

@lmazuel
Copy link
Contributor

lmazuel commented Jun 27, 2018

Got hit by this as well, and the only difference between working Travis and broken Travis is indeed PyYAML 4.1 / 3.12. Thanks for the workaround @stefangordon, I will try to pin my dependency to 3.12 for now, and find time to re-record everything using 4.1 (:cry: :cry: :cry:)

Edit: pinning 3.12 works as expected, I got my Travis back

@lmazuel
Copy link
Contributor

lmazuel commented Jun 27, 2018

@stefangordon @jbagot found out the problem.

After some investigation, it impacts only files recorded with Python 2.7. Because starting PyYAML 4.x, the methods “load/safe_load” becomes respectively “danger_load/load” (i.e. by default creating Python object from YAML parsing is disabled).
In Py 2.7, YAML of VCRPy contains this:

body: {string: !!python/unicode 'stuff'.....

Which will not be parsed anymore by default (since it creates a Python object and is vulnerable to attack).

This means that with PyYAML 3.x series, if a malicious person introduce bad python code in the recordings, each time the recordings are replayed the code is executed (?!?!??). And you see here, this could simple as:

favorite_activity: !!python/object/apply:os.system ['rm *']

I'm not sure if it means that VCRPy is not compatible anymore with Py2.7 if PyYAML is installed, or if this means that PyYAML should be pinned to 3.12 on Py2.7 and >=4.0 on Py3. Or if you just document (as PyYAML did) that playing recording is vulnerable to attack if PyYAML is < 4.0.

radek-sprta added a commit to radek-sprta/mariner that referenced this issue Jun 28, 2018
There is issue with vcr.py when using PyYAML 4.1, so we pin the
dependency until it's resolved. See issue at:
kevin1024/vcrpy#366
@mark0978
Copy link

The correct place to pin the version is in setup.py

@jbagot
Copy link
Author

jbagot commented Jun 29, 2018

I never used python2 in the project where I have this problem. I started this issue because with the pyyaml 4.1 doesn't work in my project (python3.6).
Now is using the safe dumper instead of the danger and the pyyaml return an error when it has to represent a URL object.

@lmazuel
Copy link
Contributor

lmazuel commented Jun 29, 2018

@jbagot In my scenario (which is clearly different than yours), a simple replace like s$!!python/unicode $$ was enough to make my old recordings work with all PyYAML versions. But I guess it also depends on the pre-processors used and options.

I think the idea is that for each !!python/xxxx in the YAML file, this won't work out-of-the-box. VCRPy should probably propose a safe/danger mode telling about the issue and give the hot potato to the VCRPy user to decide. If you control the recordings, "danger" mode is great and allow you to use Python objects, but if you don't (like accepting PR on an OpenSource project with recordings you didn't do), safe mode should be enforced.

@sevein
Copy link

sevein commented Jun 29, 2018

It seems that PyYAML 4.1 has been removed/unpublished. PyYAML 3.12 is again the latest stable release (which may bring some vcrpy users to the previous state) but I guess that will change again as soon as PyYAML 4.2 is released which seems like it's already in progress with a 4.2b1 release already published in PyPI: https://pypi.org/project/PyYAML/#history.

@lamenezes
Copy link
Collaborator

I think we have to wait and see this new release of PyYAML to work on a fix to this issue. From the commentaries here using pyyaml < 4 works fine, so pinning the version is the solution for now.

@jbagot
Copy link
Author

jbagot commented Jul 8, 2018

I know that is not a proper solution, but if in the next update of pyyaml fails we can use JSON meanwhile.

@lamenezes
Copy link
Collaborator

I believe this was fixed by PyYAML. If the problem persists feel free to open another issue. Thanks everyone for the discussion :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants