Prevent a HTML sanitization vulnerability
CVE-2021-42576

A vulnerability was discovered by Tom Anthony (https://github.com/TomAnthony, https://www.tomanthony.co.uk/) which allowed the contents of a `style` tag to be leaked unsanitized by bluemonday into the HTML output. It was further demonstrated that if the form elements `select` and `option` were also allowed by the policy, this could result in a successful XSS.

You would only be vulnerable to this if you allowed `style`, `select` and `option` in your HTML sanitization policy:

```go
p := bluemonday.NewPolicy()
p.AllowElements("style", "select", "option")
html := p.Sanitize(`<select><option><style><script>alert(1)</script>`)
fmt.Println(html)
```

bluemonday very strongly recommends not allowing the `style` element in a policy. It is fundamentally unsafe as we do not have a CSS sanitizer and the content is passed through unmodified.
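
To illustrate why unmodified pass-through matters, the following minimal sketch contrasts escaping a text node with writing it raw. It uses only the standard library's `html.EscapeString`; `writeText` and its `insideStyle` parameter are hypothetical names chosen for illustration and are not bluemonday's code.

```go
package main

import (
	"fmt"
	"html"
)

// writeText is a hypothetical helper (not part of bluemonday). It models the
// two ways a sanitizer can emit a text node: escaped, or raw when the text
// sits inside a <style> element that the policy lets through.
func writeText(text string, insideStyle bool) string {
	if insideStyle {
		// Raw pass-through: any markup in the text survives intact.
		return text
	}
	// Escaped: markup characters are replaced by entities.
	return html.EscapeString(text)
}

func main() {
	payload := `<script>alert(1)</script>`
	fmt.Println(writeText(payload, false)) // &lt;script&gt;alert(1)&lt;/script&gt;
	fmt.Println(writeText(payload, true))  // <script>alert(1)</script>
}
```

In this model, the second call shows the problem: text that would be harmless once escaped is emitted as live markup when it is treated as `style` content.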

bluemonday has been updated to explicitly suppress `style` and `script` elements by default, even if you do allow them by policy, as these are considered unsafe. If you have a use-case for using bluemonday whilst trusting the input, then you can assert this via `p.AllowUnsafe(true)`, which will let `style` and `script` through if the policy also allows them.

Note: the policies shipped with bluemonday are not vulnerable to this.
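
The new default can be modelled in a few lines. This is a minimal sketch of the decision rule described above, using only the standard library; it is not bluemonday's implementation, and `rendered`, `isUnsafeElement` and `allowedByPolicy` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// isUnsafeElement reports whether name is one of the two elements the commit
// message describes as suppressed by default. Hypothetical helper.
func isUnsafeElement(name string) bool {
	switch strings.ToLower(name) {
	case "script", "style":
		return true
	}
	return false
}

// rendered models the rule: an unsafe element is emitted only when the policy
// allows it AND the caller has opted in with allowUnsafe. Every other element
// depends on the policy alone.
func rendered(name string, allowedByPolicy, allowUnsafe bool) bool {
	if isUnsafeElement(name) && !allowUnsafe {
		return false
	}
	return allowedByPolicy
}

func main() {
	fmt.Println(rendered("style", true, false))  // false: suppressed by default
	fmt.Println(rendered("style", true, true))   // true: policy allows and caller opted in
	fmt.Println(rendered("script", false, true)) // false: opting in does not add it to the policy
	fmt.Println(rendered("select", true, false)) // true: unaffected by the flag
}
```

Note that in this model the opt-in never widens a policy by itself; it only stops `style` and `script` from being vetoed when the policy already lists them.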
buro9 committed Oct 18, 2021
1 parent 13d1799 commit c788a2a
Showing 6 changed files with 116 additions and 9 deletions.
3 changes: 2 additions & 1 deletion CREDITS.md
@@ -4,4 +4,5 @@
1. Andrew Krasichkov @buglloc https://github.com/buglloc
1. Mike Samuel mikesamuel@gmail.com
1. Dmitri Shuralyov shurcooL@gmail.com
1. https://github.com/opennota
1. opennota https://github.com/opennota https://gitlab.com/opennota
1. Tom Anthony https://www.tomanthony.co.uk/
6 changes: 4 additions & 2 deletions README.md
@@ -180,7 +180,7 @@ p.AllowElementsMatching(regex.MustCompile(`^my-element-`))

Or add elements by virtue of adding an attribute:
```go
// Not the recommended pattern, see the recommendation on using .Matching() below
// Note the recommended pattern, see the recommendation on using .Matching() below
p.AllowAttrs("nowrap").OnElements("td", "th")
```

@@ -222,7 +222,7 @@ p.AllowElements("fieldset", "select", "option")

Although it's possible to handle inline CSS using `AllowAttrs` with a `Matching` rule, writing a single monolithic regular expression to safely process all inline CSS which you wish to allow is not a trivial task. Instead of attempting to do so, you can allow the `style` attribute on whichever element(s) you desire and use style policies to control and sanitize inline styles.

It is suggested that you use `Matching` (with a suitable regular expression)
It is strongly recommended that you use `Matching` (with a suitable regular expression)
`MatchingEnum`, or `MatchingHandler` to ensure each style matches your needs,
but default handlers are supplied for most widely used styles.

@@ -379,6 +379,8 @@ Both examples exhibit the same issue, they declare attributes but do not then sp

We are not yet including any tools to help allow and sanitize CSS. Which means that unless you wish to do the heavy lifting in a single regular expression (inadvisable), **you should not allow the "style" attribute anywhere**.

In the same vein, both `<script>` and `<style>` are considered harmful. These elements (and their content) will not be rendered by default, and require you to explicitly set `p.AllowUnsafe(true)`. You should be aware that allowing these elements defeats the purpose of using a HTML sanitizer, as you would be explicitly allowing JavaScript (and any plainly written XSS) and CSS (which can modify a DOM to insert JS). Additionally, limitations in this library mean it is not aware of whether HTML is validly structured, and that can allow these elements to bypass some of the safety mechanisms built into the [WhatWG HTML parser standard](https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inselect).

It is not the job of bluemonday to fix your bad HTML, it is merely the job of bluemonday to prevent malicious HTML getting through. If you have mismatched HTML elements, or non-conforming nesting of elements, those will remain. But if you have well-structured HTML bluemonday will not break it.

## TODO
30 changes: 30 additions & 0 deletions policy.go
@@ -134,6 +134,19 @@ type Policy struct {
setOfElementsMatchingAllowedWithoutAttrs []*regexp.Regexp

setOfElementsToSkipContent map[string]struct{}

// Permits fundamentally unsafe elements.
//
// If false (default) then elements such as `style` and `script` will not be
// permitted even if declared in a policy. These elements when combined with
// untrusted input cannot be safely handled by bluemonday at this point in
// time.
//
// If true then `style` and `script` would be permitted by bluemonday if a
// policy declares them. However this is not recommended under any circumstance
// and can lead to XSS being rendered thus defeating the purpose of using a
// HTML sanitizer.
allowUnsafe bool
}

type attrPolicy struct {
@@ -714,6 +727,23 @@ func (p *Policy) AllowElementsContent(names ...string) *Policy {
return p
}

// AllowUnsafe permits fundamentally unsafe elements.
//
// If false (default) then elements such as `style` and `script` will not be
// permitted even if declared in a policy. These elements when combined with
// untrusted input cannot be safely handled by bluemonday at this point in
// time.
//
// If true then `style` and `script` would be permitted by bluemonday if a
// policy declares them. However this is not recommended under any circumstance
// and can lead to XSS being rendered thus defeating the purpose of using a
// HTML sanitizer.
func (p *Policy) AllowUnsafe(allowUnsafe bool) *Policy {
p.init()
p.allowUnsafe = allowUnsafe
return p
}

// addDefaultElementsWithoutAttrs adds the HTML elements that we know are valid
// without any attributes to an internal map.
// i.e. we know that <table> is valid, but <bdo> isn't valid as the "dir" attr
2 changes: 1 addition & 1 deletion policy_test.go
@@ -35,7 +35,7 @@ import (
)

func TestAllowElementsContent(t *testing.T) {
policy := NewPolicy().AllowElementsContent("iframe", "script")
policy := NewPolicy().AllowElementsContent("iframe", "script").AllowUnsafe(true)

tests := []test{
{
49 changes: 45 additions & 4 deletions sanitize.go
@@ -293,6 +293,17 @@ func (p *Policy) sanitize(r io.Reader, w io.Writer) error {

mostRecentlyStartedToken = normaliseElementName(token.Data)

switch normaliseElementName(token.Data) {
case `script`:
if !p.allowUnsafe {
continue
}
case `style`:
if !p.allowUnsafe {
continue
}
}

aps, ok := p.elsAndAttrs[token.Data]
if !ok {
aa, matched := p.matchRegex(token.Data)
@@ -341,6 +352,17 @@ func (p *Policy) sanitize(r io.Reader, w io.Writer) error {
mostRecentlyStartedToken = ""
}

switch normaliseElementName(token.Data) {
case `script`:
if !p.allowUnsafe {
continue
}
case `style`:
if !p.allowUnsafe {
continue
}
}

if skipClosingTag && closingTagToSkipStack[len(closingTagToSkipStack)-1] == token.Data {
closingTagToSkipStack = closingTagToSkipStack[:len(closingTagToSkipStack)-1]
if len(closingTagToSkipStack) == 0 {
@@ -386,6 +408,17 @@ func (p *Policy) sanitize(r io.Reader, w io.Writer) error {

case html.SelfClosingTagToken:

switch normaliseElementName(token.Data) {
case `script`:
if !p.allowUnsafe {
continue
}
case `style`:
if !p.allowUnsafe {
continue
}
}

aps, ok := p.elsAndAttrs[token.Data]
if !ok {
aa, matched := p.matchRegex(token.Data)
@@ -425,14 +458,22 @@ func (p *Policy) sanitize(r io.Reader, w io.Writer) error {
case `script`:
// not encouraged, but if a policy allows JavaScript we
// should not HTML escape it as that would break the output
if _, err := buff.WriteString(token.Data); err != nil {
return err
//
// requires p.AllowUnsafe()
if p.allowUnsafe {
if _, err := buff.WriteString(token.Data); err != nil {
return err
}
}
case "style":
// not encouraged, but if a policy allows CSS styles we
// should not HTML escape it as that would break the output
if _, err := buff.WriteString(token.Data); err != nil {
return err
//
// requires p.AllowUnsafe()
if p.allowUnsafe {
if _, err := buff.WriteString(token.Data); err != nil {
return err
}
}
default:
// HTML escape the text
35 changes: 34 additions & 1 deletion sanitize_test.go
@@ -1721,7 +1721,7 @@ AAAASUVORK5CYII=" alt="">`
func TestIssue55ScriptTags(t *testing.T) {
p1 := NewPolicy()
p2 := UGCPolicy()
p3 := UGCPolicy().AllowElements("script")
p3 := UGCPolicy().AllowElements("script").AllowUnsafe(true)

in := `<SCRIPT>document.write('<h1><header/h1>')</SCRIPT>`
expected := ``
@@ -3660,3 +3660,36 @@ func TestHrefSanitization(t *testing.T) {
}
wg.Wait()
}

func TestInsertionModeSanitization(t *testing.T) {
tests := []test{
{
in: `<select><option><style><script>alert(1)</script>`,
expected: `<select><option>`,
},
}

p := UGCPolicy()
p.AllowElements("select", "option", "style")

// These tests are run concurrently to enable the race detector to pick up
// potential issues
wg := sync.WaitGroup{}
wg.Add(len(tests))
for ii, tt := range tests {
go func(ii int, tt test) {
out := p.Sanitize(tt.in)
if out != tt.expected {
t.Errorf(
"test %d failed;\ninput : %s\noutput : %s\nexpected: %s",
ii,
tt.in,
out,
tt.expected,
)
}
wg.Done()
}(ii, tt)
}
wg.Wait()
}
