Description
Currently we are susceptible to a (slight) vulnerability with nested scripts:
Examples (using scrub_fragment):
Input: <script><script src='malicious.js'></script>
Sanitizer: strip
Output (text): <script src='malicious.js'>
Output (unescaped_text): <script src='malicious.js'>
(Sanitizer prune is immune to this)
Input: <<s>script src='malicious.js'>
Sanitizer: strip, prune
Output (text): <script src='malicious.js'>
Output (raw): <script src='malicious.js'>
Last example using strip or prune:
Input: <<s>script>alert('a')<<s>/script>
Output(raw): <script>alert('a')</script>
Why is this a problem?
I'm happy to discuss this, but I do believe we should try to strip recursively. Even though these outputs are only dangerous if unescaped, the whole purpose of scrubbing is to get the safest possible string back. While <script src='malicious.js'> is not in itself unsafe, it is certainly less safe than it should be after going through a scrubber.
I've attached a PR, #128, with a potential solution using recursive scrubbing (details of the implementation are available on the PR).
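To illustrate the idea of recursive scrubbing, here is a minimal sketch that repeats a strip pass until the output stops changing. It is a toy regex-based stripper, not Loofah's parser-based scrubbing and not the code in PR #128; the names `TAG`, `strip_once` and `strip_recursively` are hypothetical. It reproduces the second and third examples above (the first depends on how an HTML parser treats script contents, which a regex does not model).

```ruby
# Toy tag stripper for illustration only; NOT Loofah's implementation.
# Matches anything that looks like a complete tag with no nested angle brackets.
TAG = /<[^<>]+>/

# One pass: delete everything that currently looks like a tag.
def strip_once(html)
  html.gsub(TAG, "")
end

# Repeat the pass until the string stops changing (a fixed point),
# so tags that only appear after an inner tag is removed are stripped too.
def strip_recursively(html)
  loop do
    stripped = strip_once(html)
    return stripped if stripped == html
    html = stripped
  end
end

input = "<<s>script>alert('a')<<s>/script>"
puts strip_once(input)         # a single pass leaves a script tag behind
puts strip_recursively(input)  # repeated passes remove it
```

With this toy stripper a single pass turns the input into `<script>alert('a')</script>`, while the repeated version returns `alert('a')`.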
Activity
flavorjones commented on Oct 22, 2017
Thank you for reporting this. I'm looking at it now (apologies for the delay).
flavorjones commented on Oct 22, 2017
I'm turning these into executable tests, and although I can easily reproduce the first example (which is a security problem), I cannot reproduce what you're seeing with the second or third.
Here's my code. Can you tell me what you're doing differently?
handle nested script tags
flavorjones commented on Oct 22, 2017
My proposed fix for example 1 is in #132, here it is inline:
What do you think of that? And can you please help me understand whether there's a security vulnerability in either the second or third example you've given?
myxoh commented on Oct 23, 2017
Sorry for the delay in replying. The issue I am reporting occurs when calling the .text method with encode_special_chars: false.
While I am aware this is not meant to be 100% safe, I would expect that after I strip a string the result won't contain a script (even though it might contain fragments of unsafe text).
Without fixing this behaviour for the cases in which encode_special_chars is false, the library ends up with only one layer of security (encoding characters), rendering the scrub effectively useless in these cases. This is particularly troubling as loofah_activerecord uses the text method, meaning that if I want to allow 'safe' tags (like line breaks) I am left with the nested-script vulnerability.
aaronchi commented on Nov 7, 2017
We are seeing this issue as well. The inability of Loofah to handle proper escaping of nested tags is a problem that didn't exist with the deprecated Rails sanitizer and is not addressed by this proposed solution.
To reiterate what OP is pointing out in his second and third examples:
Input:
<<test>script>alert('hi')<<test>/script>
Output in Rails deprecated sanitizer:
alert('hi')
Output using Loofah strip:
<script>alert('hi')</script>
with encode_special_chars false:
<script>alert('hi')</script>
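As a rough illustration of why entity encoding is the remaining layer of protection once a tag survives stripping, the sketch below escapes the leftover string with Ruby's standard `CGI.escapeHTML`. This is a stand-in chosen for the example, an assumption rather than the encoder Loofah itself uses, and the variable names are hypothetical.

```ruby
require "cgi/escape"

# The string left behind after a single strip pass in the example above.
leftover = "<script>alert('hi')</script>"

# Entity-encode it with the stdlib helper (assumed stand-in, not Loofah's encoder).
escaped = CGI.escapeHTML(leftover)

puts leftover  # unescaped: a browser would treat this as markup
puts escaped   # escaped: angle brackets become &lt; and &gt;, so it renders as text
```

When encoding is turned off, only the first form is returned, which is why the quality of the strip pass itself matters.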
flavorjones commented on Nov 12, 2017
@myxoh I've written code above twice to try to understand what you're seeing. Can you please write code back to help me understand?
@aaronchi Can you please write code as well to help me understand?
flavorjones commented on Nov 12, 2017
@kaspth or @rafaelfranca can either of you help me understand?
flavorjones commented on Nov 12, 2017
Is this related to rails/rails-html-sanitizer#48 ?
flavorjones commented on Nov 12, 2017
Or rails/rails#28060 ?
flavorjones commented on Nov 12, 2017
Wait, are you both saying that when you ask Loofah to not-escape entities, and you get them back unescaped, that you think it's a bug? I'm really struggling here to understand. Hopefully someone can help me, in particular with working code.