Use of yaml.load instead of yaml.safe_load #1529

tonybaloney · 2018-07-03T08:46:56Z

Hi,

I noticed that the yaml module is using load instead of safe_load. It's generally recommended to use safe_load as otherwise, attackers can run arbitrary code through configuration files.
See this OpenStack file
https://security.openstack.org/guidelines/dg_avoid-dangerous-input-parsing-libraries.html

Happy to raise a PR and swap those with yaml.safe_load in

mkdocs/mkdocs/utils/__init__.py

Line 78 in 34ef3ca

return yaml.load(source, Loader)

The text was updated successfully, but these errors were encountered:

waylan · 2018-07-03T14:00:52Z

Your reference states:

You always want to use the safe functions to load input.

I have to disagree with that blanket statement. Yes, many (most?) times that is true, But sometimes you might actually want/need the so-called "unsafe" option (otherwise, why would the library offer it).

In any event, there are at least two questions we need to answer:

Is there a legitimate need for not using safe_load with MkDocs?

For some users, yes, See this comment to Non ascii headers do not slugify nicely. #212 as an example. Although, as mentioned elsewhere in that discussion, we may not actually want to allow that. And it is likely possible to get the same functionality using a plugin rather than pointing to a Python object directly from the YAML config file.

So, in the end, no there is no "need." In fact, we don't officially support doing that. But for some advanced users it is "nice to have."
As MkDocs is a static site generator, what danger, if any, really exists?

Again this was discussed in Non ascii headers do not slugify nicely. #212 and we don't see it as a high concern. As a reminder, the "server" that we include with MkDocs is a dev server and not intended for use in a server open to the world. All our users ever upload to a server is a bunch of flat files. What exactly would the attack vector be here?

Worst case, an attacker adds some code calling option into the config file and that code adds some evil JavaScript code to the generated site. But if the attacker can do that, then they can just add the JavaScript code into the source files for the site directly. No need to use the config file. At most, using the config file adds a level of indirection if the attacker is trying to hide their activities, but things are not actually any more or less safe than if we use safe_load.

Yes, I realize that the attacker could simply use this as a method to get arbitrary code to run on the the developer's machine (when she runs mkdocs build). But even if we did use safe_load, we still support plugins. So any plugin referenced in the config will call code which could be from an unknown source. How is that any different that not using safe_load?

tonybaloney · 2018-07-03T21:59:02Z

ok, I thought it worth raising the issue. I see your point with the plugins

I was thinking along the lines of shell access being gained through the configuration file, but as you state you need shell access anyway to compile it.

I was thinking of a scenario like this:

MkDocs being run on a cron/hooks on a server, linked to a CI/CD process. Developers change pages and MkDocs builds the content via source-control. This is a pretty common deployment scenario
Someone in the team had weak credentials (again really common), attacker pushes code in mkdocs.yaml to create a shell account and pops server.

!!python/object/apply:os.system\nargs: ['useradd bad_user ...']

That scenario is completely the fault of the administrator for doing this and for the dev for having bad creds, but they might not be aware what harm can be done simply by deserialising a yaml file.

I'll close the issue anyway.

waylan · 2019-02-28T14:05:14Z

With the upcoming release of PyYAML 5.1, we should update to no longer call the soon-to-be deprecated yaml.load(). See yaml/pyyaml#257 (and https://msg.pyyaml.org/load) for details and the options available. I expect we will want to use FullLoader. But I need to confirm that we don't loose anything we need from UnsafeLoader.

waylan · 2019-03-01T14:12:45Z

I have reviewed the changes to PyYAML and MkDocs use of the library and there are not any changes that we need to make. PyYAML 5.1 raises a deprecation warning only when a loader isn't explicitly selected. However, we use our own custom loader so we always explicitly select a loader. The yaml.load function itself is not deprecated; in fact the loader argument of that function is how you select your Loader of choice.

The only possible change to the code we could make is to have our custom loader subclass yaml.FullLoader rather than yaml.Loader. But that seems unnecessary as the two are aliases of each other in 5.1. And yaml.FullLoader didn't exist in earlier versions. Therefore, such a change would require us to upgrade our minimum dependency requirement to version 5.1.

For the reasons stated in an earlier comment, the benefits of not calling the so-called "safe" version of PyYAML outweigh the risks for MkDocs. Therefore, there is no need or benefit to requiring version 5.1 at this time.

tonybaloney closed this as completed Jul 3, 2018

waylan mentioned this issue Sep 4, 2018

Security Issue on Module PyYAML #1620

Closed

facelessuser mentioned this issue Oct 27, 2018

update pyyaml to fix CVE-2017-18342 #1666

Closed

This was referenced Dec 6, 2018

Default yaml loader is unsafe #1696

Closed

use SafeLoader instead of the default Pyyaml loader #1697

Closed

waylan mentioned this issue Jan 24, 2019

GitHub reports security vulnerability in pyyaml, recommends >=4.2b1 #1730

Closed

waylan reopened this Feb 28, 2019

waylan added this to the 1.1 milestone Feb 28, 2019

waylan added Needs design decision Cleanup labels Feb 28, 2019

waylan closed this as completed Mar 1, 2019

waylan removed Cleanup Needs design decision labels Mar 1, 2019

waylan mentioned this issue May 16, 2021

PyYAML@5.1 is highly vulnerable #2406

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use of yaml.load instead of yaml.safe_load #1529

Use of yaml.load instead of yaml.safe_load #1529

tonybaloney commented Jul 3, 2018

waylan commented Jul 3, 2018

tonybaloney commented Jul 3, 2018

waylan commented Feb 28, 2019 •

edited

waylan commented Mar 1, 2019

Use of yaml.load instead of yaml.safe_load #1529

Use of yaml.load instead of yaml.safe_load #1529

Comments

tonybaloney commented Jul 3, 2018

waylan commented Jul 3, 2018

tonybaloney commented Jul 3, 2018

waylan commented Feb 28, 2019 • edited

waylan commented Mar 1, 2019

waylan commented Feb 28, 2019 •

edited