Adding rule and test case of python PyYaml FULL_LOAD #2352
Hello Team, PyYAML's `full_load` is vulnerable to code execution, as seen in yaml/pyyaml#386. I did not find a rule for this in Semgrep, so I added one.
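The rule file itself is not shown in this conversation. As a rough stand-in, here is a minimal sketch in plain Python (not Semgrep syntax) of the kind of call such a rule targets. The helper name `find_yaml_calls`, its `names` parameter, and the sample snippet are hypothetical, and the sketch only matches the literal form `yaml.<name>(...)`, so aliased imports are missed.

```python
import ast


def find_yaml_calls(source, names=("full_load",)):
    """Return sorted (line, name) pairs for calls written as yaml.<name>(...).

    Hypothetical helper for illustration only; it does not follow import
    aliases such as `import yaml as y` or `from yaml import full_load`.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr in names
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
        ):
            hits.append((node.lineno, func.attr))
    return sorted(hits)


# The sample is only parsed, never executed, so PyYAML need not be installed.
sample = """
import yaml
data = yaml.full_load(stream)
ok = yaml.safe_load(stream)
"""
print(find_yaml_calls(sample))  # [(3, 'full_load')]
```

Passing a different `names` tuple lets the same sketch look for other call names without changing the matching logic.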
Hey @shivankar-madaan! 👋 Thanks for the PR. 😄 This PR actually prompted me to investigate which exploit paths are still relevant today. Judging by the link you provided, the vulnerability looks like it was patched, so I looked around for cases that are still exploitable. I found this blog, which very helpfully spells out many of the still-valid exploit cases. Here's me trying them out:
It looks like things are much safer now, and the exploitable cases include:
I'll have to update the rule accordingly.
Thanks @minusworld for sharing this technique and investigation. Out of curiosity: if somebody is using an old version of PyYAML together with the vulnerable full_load, Semgrep does not take that into account, right?
🤔 It's kinda tricky, because we want to stay up-to-date with the latest implementation of PyYAML so that we don't report false positives. But a quick search on GitHub also shows that many projects are not up-to-date. It looks like the safe default changes were added in version 5.4, and the search shows many projects that are not on 5.4 yet.
Let me talk with some of my team to identify a good path forward. Some options include:
Hmm, maybe it was actually deprecated in version 5.1. I'm curious to do some more testing across versions.
Complicated scenario, haha. Let me know if I can help with the PR. More than happy to contribute back.
Here's my analysis of which APIs are vulnerable in which versions. I'm thinking of writing this up in a short blog post, actually. It will include details on how the tests were done alongside the matrix above. @shivankar-madaan, would you like to be mentioned? Would you like to be a reviewer? 😄 As for this rule, we plan to update it to alert only on the cases that are still vulnerable in PyYAML v5.4 and later. Further analysis of the usage of PyYAML as a dependency showed that most people are on v5.4 or later. In addition, v5.4 was released 1.5 years ago, which is probably enough time for many projects to upgrade. Lastly, the API differences between unsafe and safe versions are pretty minimal. Therefore, this rule would detect
Hello @minusworld, first of all, really nice analysis and a good decision, and big thanks for sharing this. For sure I would love to be mentioned in the blog, and hopefully I can help with reviewing. Let me know how I can be involved in the process. I'll also get back with an updated PR for the variants you mentioned.
Hello @minusworld, could you guide me on how we can release a companion rule that checks for out-of-date PyYAML versions in requirements.txt? Is this possible? I would really like to have it this way. If not, I will go with what was decided in the discussion above.
There are two things I can think of:
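On the requirements.txt question, the version check itself can be illustrated outside of Semgrep. The sketch below is plain Python, not a Semgrep rule, and it only understands exact `==` pins. The function name `outdated_pyyaml_pins`, the regular expression, and the treatment of v5.4 as the minimum acceptable version (taken from this discussion) are all assumptions for illustration.

```python
import re

# Assumed threshold, taken from the discussion in this thread.
MIN_SAFE = (5, 4)

# Matches exact pins such as "PyYAML==5.3.1"; ranges, extras, and
# pre-release suffixes are deliberately not handled in this sketch.
PIN = re.compile(r"^\s*pyyaml\s*==\s*([0-9]+(?:\.[0-9]+)*)", re.IGNORECASE)


def outdated_pyyaml_pins(requirements_text, minimum=MIN_SAFE):
    """Return (line_number, version) for each PyYAML pin below `minimum`."""
    findings = []
    for lineno, line in enumerate(requirements_text.splitlines(), start=1):
        match = PIN.match(line)
        if not match:
            continue
        # Compare numerically as tuples so that 5.10 sorts above 5.4.
        version = tuple(int(part) for part in match.group(1).split("."))
        if version < minimum:
            findings.append((lineno, match.group(1)))
    return findings


example = "requests==2.31.0\nPyYAML==5.3.1\n"
print(outdated_pyyaml_pins(example))  # [(2, '5.3.1')]
```

Unpinned or range-based requirements (for example `pyyaml>=5.1`) are skipped here; a real check would need a proper requirement parser.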
@shivankar-madaan If you email me at my r2c email (it's in the git history), I can send you a draft of the blog for review :D
@minusworld just emailed you :)