Use safelist(s), allowlist(s) where applicable #164

Closed
wants to merge 4 commits into from
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -17,6 +17,13 @@
* CSS hex values are no longer limited to lowercase hex. Previously uppercase hex were scrubbed. [#165] (Thanks, @asok!)


### Deprecations / Name Changes

* Deprecate `Loofah::Helpers::ActionView.white_list_sanitizer`, please use `Loofah::Helpers::ActionView.safe_list_sanitizer` instead.
* Deprecate `Loofah::Helpers::ActionView::WhiteListSanitizer`, please use `Loofah::Helpers::ActionView::SafeListSanitizer` instead.
* Deprecate `Loofah::HTML5::WhiteList`, please use `Loofah::HTML5::SafeList` instead.


## 2.2.3 / 2018-10-30

### Security
2 changes: 1 addition & 1 deletion Manifest.txt
@@ -17,7 +17,7 @@ lib/loofah/html/document.rb
lib/loofah/html/document_fragment.rb
lib/loofah/html5/libxml2_workarounds.rb
lib/loofah/html5/scrub.rb
lib/loofah/html5/whitelist.rb
lib/loofah/html5/safelist.rb
lib/loofah/instance_methods.rb
lib/loofah/metahelpers.rb
lib/loofah/scrubber.rb
6 changes: 3 additions & 3 deletions README.md
@@ -19,7 +19,7 @@ documents and fragments. It's built on top of Nokogiri and libxml2, so
it's fast and has a nice API.

Loofah excels at HTML sanitization (XSS prevention). It includes some
nice HTML sanitizers, which are based on HTML5lib's whitelist, so it
nice HTML sanitizers, which are based on HTML5lib's safelist, so it
most likely won't make your codes less secure. (These statements have
not been evaluated by Netexperts.)

@@ -29,7 +29,7 @@ ActiveRecord extensions for sanitization are available in the

## Features

* Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's whitelists).
* Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's safelists).
* Common HTML sanitizing tasks are built-in:
* _Strip_ unsafe tags, leaving behind only the inner text.
* _Prune_ unsafe tags and their subtrees, removing all traces that they ever existed.
@@ -221,7 +221,7 @@ Loofah.xml_document(File.read('plague.xml')).scrub!(bring_out_your_dead)
=== Built-In HTML Scrubbers

Loofah comes with a set of sanitizing scrubbers that use HTML5lib's
whitelist algorithm:
safelist algorithm:

``` ruby
doc.scrub!(:strip) # replaces unknown/unsafe tags with their inner text
6 changes: 3 additions & 3 deletions Rakefile
@@ -70,9 +70,9 @@ task :doc_upload_to_rubyforge => :docs do
end
end

desc "generate whitelists from W3C specifications"
task :generate_whitelists do
load "tasks/generate-whitelists"
desc "generate safelists from W3C specifications"
task :generate_safelists do
load "tasks/generate-safelists"
end

Concourse.new("loofah", fly_target: "ci") do |c|
2 changes: 1 addition & 1 deletion lib/loofah.rb
@@ -5,7 +5,7 @@
require 'loofah/metahelpers'
require 'loofah/elements'

require 'loofah/html5/whitelist'
require 'loofah/html5/safelist'
require 'loofah/html5/libxml2_workarounds'
require 'loofah/html5/scrub'

16 changes: 13 additions & 3 deletions lib/loofah/helpers.rb
@@ -46,8 +46,13 @@ def full_sanitizer
@full_sanitizer ||= ::Loofah::Helpers::ActionView::FullSanitizer.new
end

def safe_list_sanitizer
@safe_list_sanitizer ||= ::Loofah::Helpers::ActionView::SafeListSanitizer.new
end

def white_list_sanitizer
@white_list_sanitizer ||= ::Loofah::Helpers::ActionView::WhiteListSanitizer.new
warn "warning: white_list_sanitizer is deprecated, please use safe_list_sanitizer instead."
safe_list_sanitizer
end
end

@@ -73,13 +78,13 @@ def sanitize html, *args
#
# To use by default, call this in an application initializer:
#
# ActionView::Helpers::SanitizeHelper.white_list_sanitizer = ::Loofah::Helpers::ActionView::WhiteListSanitizer.new
# ActionView::Helpers::SanitizeHelper.safe_list_sanitizer = ::Loofah::Helpers::ActionView::SafeListSanitizer.new
#
# Or, to generally opt-in to Loofah's view sanitizers:
#
# Loofah::Helpers::ActionView.set_as_default_sanitizer
#
class WhiteListSanitizer
class SafeListSanitizer
def sanitize html, *args
Loofah::Helpers.sanitize html
end
@@ -88,6 +93,11 @@ def sanitize_css style_string, *args
Loofah::Helpers.sanitize_css style_string
end
end

WhiteListSanitizer = SafeListSanitizer
if Object.respond_to?(:deprecate_constant)
deprecate_constant :WhiteListSanitizer
end
end
end
end
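
Review note: the backward-compatibility pattern used in this file (alias the old constant to the new class, mark it deprecated where Ruby supports that, and have the old accessor warn and delegate) can be sketched in isolation. The names `Demo`, `NewSanitizer`, `OldSanitizer`, `new_sanitizer`, and `old_sanitizer` below are hypothetical stand-ins, not Loofah's API, and the `sanitize` body is a placeholder.

```ruby
# A minimal sketch of the aliasing pattern, using hypothetical names.
module Demo
  class NewSanitizer
    def sanitize(html)
      html.to_s.strip # placeholder behavior, not Loofah's sanitizer
    end
  end

  # Keep the old constant pointing at the very same class object.
  OldSanitizer = NewSanitizer
  # Module#deprecate_constant is not available on older rubies, hence the guard.
  deprecate_constant :OldSanitizer if respond_to?(:deprecate_constant)

  class << self
    def new_sanitizer
      @new_sanitizer ||= NewSanitizer.new
    end

    # Deprecated accessor: warn, then delegate so both names share one instance.
    def old_sanitizer
      warn "warning: old_sanitizer is deprecated, please use new_sanitizer instead."
      new_sanitizer
    end
  end
end
```

Because the old constant is an alias rather than a subclass, `is_a?` checks and `rescue`-style constant comparisons written against the old name should keep working while callers migrate.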
11 changes: 8 additions & 3 deletions lib/loofah/html5/whitelist.rb → lib/loofah/html5/safelist.rb
@@ -3,7 +3,7 @@
module Loofah
module HTML5 # :nodoc:
#
# HTML whitelist lifted from HTML5lib sanitizer code:
# HTML safelist lifted from HTML5lib sanitizer code:
#
# http://code.google.com/p/html5lib/
#
@@ -44,7 +44,7 @@ module HTML5 # :nodoc:
# DEALINGS IN THE SOFTWARE.
#
# </html5_license>
module WhiteList
module SafeList

ACCEPTABLE_ELEMENTS = Set.new([
"a",
@@ -790,6 +790,11 @@ module WhiteList
ALLOWED_ELEMENTS_WITH_LIBXML2 = ALLOWED_ELEMENTS + TAGS_SAFE_WITH_LIBXML2
end

::Loofah::MetaHelpers.add_downcased_set_members_to_all_set_constants ::Loofah::HTML5::WhiteList
WhiteList = SafeList
if Object.respond_to?(:deprecate_constant)
deprecate_constant :WhiteList
end

::Loofah::MetaHelpers.add_downcased_set_members_to_all_set_constants ::Loofah::HTML5::SafeList
end
end
26 changes: 13 additions & 13 deletions lib/loofah/html5/scrub.rb
@@ -12,7 +12,7 @@ module Scrub
class << self

def allowed_element? element_name
::Loofah::HTML5::WhiteList::ALLOWED_ELEMENTS_WITH_LIBXML2.include? element_name
::Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2.include? element_name
end

# alternative implementation of the html5lib attribute scrubbing algorithm
@@ -28,31 +28,31 @@ def scrub_attributes node
next
end

unless WhiteList::ALLOWED_ATTRIBUTES.include?(attr_name)
unless SafeList::ALLOWED_ATTRIBUTES.include?(attr_name)
attr_node.remove
next
end

if WhiteList::ATTR_VAL_IS_URI.include?(attr_name)
if SafeList::ATTR_VAL_IS_URI.include?(attr_name)
# this block lifted nearly verbatim from HTML5 sanitization
val_unescaped = CGI.unescapeHTML(attr_node.value).gsub(CONTROL_CHARACTERS,'').downcase
if val_unescaped =~ /^[a-z0-9][-+.a-z0-9]*:/ && ! WhiteList::ALLOWED_PROTOCOLS.include?(val_unescaped.split(WhiteList::PROTOCOL_SEPARATOR)[0])
if val_unescaped =~ /^[a-z0-9][-+.a-z0-9]*:/ && ! SafeList::ALLOWED_PROTOCOLS.include?(val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0])
attr_node.remove
next
elsif val_unescaped.split(WhiteList::PROTOCOL_SEPARATOR)[0] == 'data'
elsif val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0] == 'data'
# permit only allowed data mediatypes
mediatype = val_unescaped.split(WhiteList::PROTOCOL_SEPARATOR)[1]
mediatype = val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[1]
mediatype, _ = mediatype.split(';')[0..1] if mediatype
if mediatype && !WhiteList::ALLOWED_URI_DATA_MEDIATYPES.include?(mediatype)
if mediatype && !SafeList::ALLOWED_URI_DATA_MEDIATYPES.include?(mediatype)
attr_node.remove
next
end
end
end
if WhiteList::SVG_ATTR_VAL_ALLOWS_REF.include?(attr_name)
if SafeList::SVG_ATTR_VAL_ALLOWS_REF.include?(attr_name)
attr_node.value = attr_node.value.gsub(/url\s*\(\s*[^#\s][^)]+?\)/m, ' ') if attr_node.value
end
if WhiteList::SVG_ALLOW_LOCAL_HREF.include?(node.name) && attr_name == 'xlink:href' && attr_node.value =~ /^\s*[^#\s].*/m
if SafeList::SVG_ALLOW_LOCAL_HREF.include?(node.name) && attr_name == 'xlink:href' && attr_node.value =~ /^\s*[^#\s].*/m
attr_node.remove
next
end
@@ -79,14 +79,14 @@ def scrub_css style
style_tree.each do |node|
next unless node[:node] == :property
next if node[:children].any? do |child|
[:url, :bad_url].include?(child[:node]) || (child[:node] == :function && !WhiteList::ALLOWED_CSS_FUNCTIONS.include?(child[:name].downcase))
[:url, :bad_url].include?(child[:node]) || (child[:node] == :function && !SafeList::ALLOWED_CSS_FUNCTIONS.include?(child[:name].downcase))
end
name = node[:name].downcase
if WhiteList::ALLOWED_CSS_PROPERTIES.include?(name) || WhiteList::ALLOWED_SVG_PROPERTIES.include?(name)
if SafeList::ALLOWED_CSS_PROPERTIES.include?(name) || SafeList::ALLOWED_SVG_PROPERTIES.include?(name)
sanitized_tree << node << CRASS_SEMICOLON
elsif WhiteList::SHORTHAND_CSS_PROPERTIES.include?(name.split('-').first)
elsif SafeList::SHORTHAND_CSS_PROPERTIES.include?(name.split('-').first)
value = node[:value].split.map do |keyword|
if WhiteList::ALLOWED_CSS_KEYWORDS.include?(keyword) || keyword =~ CSS_KEYWORDISH
if SafeList::ALLOWED_CSS_KEYWORDS.include?(keyword) || keyword =~ CSS_KEYWORDISH
keyword
end
end.compact
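
Review note: the URI check in `scrub_attributes` above is easier to follow as a standalone predicate. The sketch below mirrors its steps (unescape, strip control characters, downcase, check the scheme, then check the media type of `data:` URIs). All constants are hypothetical, abbreviated stand-ins for the `SafeList` values, `uri_allowed?` is not a Loofah method, and the scheme pattern is anchored with `\A` here rather than `^`.

```ruby
require "cgi"
require "set"

# Hypothetical, abbreviated stand-ins for the SafeList constants.
ALLOWED_PROTOCOLS = Set.new(%w[http https mailto data])
ALLOWED_DATA_MEDIATYPES = Set.new(%w[image/png image/gif text/plain])
PROTOCOL_SEPARATOR = ":"               # assumption: simplified to a literal colon
CONTROL_CHARACTERS = /[\x00-\x20\x7f]/ # assumption: simplified character class

# Returns true when an attribute value holding a URI would be kept.
def uri_allowed?(value)
  unescaped = CGI.unescapeHTML(value).gsub(CONTROL_CHARACTERS, "").downcase
  # No scheme at all (for example a relative path): nothing to reject.
  return true unless unescaped =~ /\A[a-z0-9][-+.a-z0-9]*:/

  scheme, rest = unescaped.split(PROTOCOL_SEPARATOR, 2)
  return false unless ALLOWED_PROTOCOLS.include?(scheme)
  return true unless scheme == "data"

  # For data: URIs, permit only allowed media types; a missing type passes.
  mediatype = rest.to_s.split(";").first
  mediatype.nil? || ALLOWED_DATA_MEDIATYPES.include?(mediatype)
end
```

Stripping control characters before matching the scheme is what keeps a value such as `"java\tscript:alert(1)"` from slipping past the protocol comparison.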
2 changes: 1 addition & 1 deletion lib/loofah/scrubbers.rb
@@ -1,7 +1,7 @@
module Loofah
#
# Loofah provides some built-in scrubbers for sanitizing with
# HTML5lib's whitelist and for accomplishing some common
# HTML5lib's safelist and for accomplishing some common
# transformation tasks.
#
#
4 changes: 2 additions & 2 deletions loofah.gemspec
@@ -9,10 +9,10 @@ Gem::Specification.new do |s|
s.require_paths = ["lib".freeze]
s.authors = ["Mike Dalessio".freeze, "Bryan Helmkamp".freeze]
s.date = "2018-02-12"
s.description = "Loofah is a general library for manipulating and transforming HTML/XML\ndocuments and fragments. It's built on top of Nokogiri and libxml2, so\nit's fast and has a nice API.\n\nLoofah excels at HTML sanitization (XSS prevention). It includes some\nnice HTML sanitizers, which are based on HTML5lib's whitelist, so it\nmost likely won't make your codes less secure. (These statements have\nnot been evaluated by Netexperts.)\n\nActiveRecord extensions for sanitization are available in the\n[`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).".freeze
s.description = "Loofah is a general library for manipulating and transforming HTML/XML\ndocuments and fragments. It's built on top of Nokogiri and libxml2, so\nit's fast and has a nice API.\n\nLoofah excels at HTML sanitization (XSS prevention). It includes some\nnice HTML sanitizers, which are based on HTML5lib's safelist, so it\nmost likely won't make your codes less secure. (These statements have\nnot been evaluated by Netexperts.)\n\nActiveRecord extensions for sanitization are available in the\n[`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).".freeze
s.email = ["mike.dalessio@gmail.com".freeze, "bryan@brynary.com".freeze]
s.extra_rdoc_files = ["CHANGELOG.md".freeze, "MIT-LICENSE.txt".freeze, "Manifest.txt".freeze, "README.md".freeze, "CHANGELOG.md".freeze, "README.md".freeze]
s.files = [".gemtest".freeze, "CHANGELOG.md".freeze, "Gemfile".freeze, "MIT-LICENSE.txt".freeze, "Manifest.txt".freeze, "README.md".freeze, "Rakefile".freeze, "benchmark/benchmark.rb".freeze, "benchmark/fragment.html".freeze, "benchmark/helper.rb".freeze, "benchmark/www.slashdot.com.html".freeze, "lib/loofah.rb".freeze, "lib/loofah/elements.rb".freeze, "lib/loofah/helpers.rb".freeze, "lib/loofah/html/document.rb".freeze, "lib/loofah/html/document_fragment.rb".freeze, "lib/loofah/html5/scrub.rb".freeze, "lib/loofah/html5/whitelist.rb".freeze, "lib/loofah/instance_methods.rb".freeze, "lib/loofah/metahelpers.rb".freeze, "lib/loofah/scrubber.rb".freeze, "lib/loofah/scrubbers.rb".freeze, "lib/loofah/xml/document.rb".freeze, "lib/loofah/xml/document_fragment.rb".freeze, "test/assets/testdata_sanitizer_tests1.dat".freeze, "test/helper.rb".freeze, "test/html5/test_sanitizer.rb".freeze, "test/integration/test_ad_hoc.rb".freeze, "test/integration/test_helpers.rb".freeze, "test/integration/test_html.rb".freeze, "test/integration/test_scrubbers.rb".freeze, "test/integration/test_xml.rb".freeze, "test/unit/test_api.rb".freeze, "test/unit/test_encoding.rb".freeze, "test/unit/test_helpers.rb".freeze, "test/unit/test_scrubber.rb".freeze, "test/unit/test_scrubbers.rb".freeze]
s.files = [".gemtest".freeze, "CHANGELOG.md".freeze, "Gemfile".freeze, "MIT-LICENSE.txt".freeze, "Manifest.txt".freeze, "README.md".freeze, "Rakefile".freeze, "benchmark/benchmark.rb".freeze, "benchmark/fragment.html".freeze, "benchmark/helper.rb".freeze, "benchmark/www.slashdot.com.html".freeze, "lib/loofah.rb".freeze, "lib/loofah/elements.rb".freeze, "lib/loofah/helpers.rb".freeze, "lib/loofah/html/document.rb".freeze, "lib/loofah/html/document_fragment.rb".freeze, "lib/loofah/html5/scrub.rb".freeze, "lib/loofah/html5/safelist.rb".freeze, "lib/loofah/instance_methods.rb".freeze, "lib/loofah/metahelpers.rb".freeze, "lib/loofah/scrubber.rb".freeze, "lib/loofah/scrubbers.rb".freeze, "lib/loofah/xml/document.rb".freeze, "lib/loofah/xml/document_fragment.rb".freeze, "test/assets/testdata_sanitizer_tests1.dat".freeze, "test/helper.rb".freeze, "test/html5/test_sanitizer.rb".freeze, "test/integration/test_ad_hoc.rb".freeze, "test/integration/test_helpers.rb".freeze, "test/integration/test_html.rb".freeze, "test/integration/test_scrubbers.rb".freeze, "test/integration/test_xml.rb".freeze, "test/unit/test_api.rb".freeze, "test/unit/test_encoding.rb".freeze, "test/unit/test_helpers.rb".freeze, "test/unit/test_scrubber.rb".freeze, "test/unit/test_scrubbers.rb".freeze]
s.homepage = "https://github.com/flavorjones/loofah".freeze
s.licenses = ["MIT".freeze]
s.rdoc_options = ["--main".freeze, "README.md".freeze]
14 changes: 7 additions & 7 deletions tasks/generate-whitelists → tasks/generate-safelists
@@ -28,12 +28,12 @@ dompurify_metadata.each { |k, v| puts "#{k}: #{v.keys}" }
require "loofah"

pairs = {
"html:tags" => [Loofah::HTML5::WhiteList::ACCEPTABLE_ELEMENTS, dompurify_metadata["tags"]["html"]],
"mathml:tags" => [Loofah::HTML5::WhiteList::MATHML_ELEMENTS, dompurify_metadata["tags"]["mathMl"]],
"svg:tags" => [Loofah::HTML5::WhiteList::SVG_ELEMENTS, dompurify_metadata["tags"]["svg"]],
"html:attrs" => [Loofah::HTML5::WhiteList::ACCEPTABLE_ATTRIBUTES, dompurify_metadata["attrs"]["html"]],
"mathml:attrs" => [Loofah::HTML5::WhiteList::MATHML_ATTRIBUTES, dompurify_metadata["attrs"]["mathMl"]],
"svg:attrs" => [Loofah::HTML5::WhiteList::SVG_ATTRIBUTES, dompurify_metadata["attrs"]["svg"]],
"html:tags" => [Loofah::HTML5::SafeList::ACCEPTABLE_ELEMENTS, dompurify_metadata["tags"]["html"]],
"mathml:tags" => [Loofah::HTML5::SafeList::MATHML_ELEMENTS, dompurify_metadata["tags"]["mathMl"]],
"svg:tags" => [Loofah::HTML5::SafeList::SVG_ELEMENTS, dompurify_metadata["tags"]["svg"]],
"html:attrs" => [Loofah::HTML5::SafeList::ACCEPTABLE_ATTRIBUTES, dompurify_metadata["attrs"]["html"]],
"mathml:attrs" => [Loofah::HTML5::SafeList::MATHML_ATTRIBUTES, dompurify_metadata["attrs"]["mathMl"]],
"svg:attrs" => [Loofah::HTML5::SafeList::SVG_ATTRIBUTES, dompurify_metadata["attrs"]["svg"]],
}

pairs.each do |name, v|
@@ -53,4 +53,4 @@ pairs.each do |name, v|
puts
end

# TODO actually generate whitelists
# TODO actually generate safelists
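
Review note: the body of the `pairs.each` loop is not visible in this diff, so the following is only one possible way to report how a local safelist differs from a reference list, assuming the goal is to list members that appear on one side but not the other. The input data and the `compare` helper are hypothetical; the real script reads Loofah's constants and DOMPurify metadata instead.

```ruby
require "set"

# Hypothetical, abbreviated inputs standing in for [Loofah set, reference list].
pairs = {
  "html:tags" => [Set.new(%w[a b blink]), %w[a b details]],
}

# Returns the members unique to each side, sorted for stable output.
def compare(local, reference)
  reference = Set.new(reference)
  {
    only_local: (local - reference).to_a.sort,
    only_reference: (reference - local).to_a.sort,
  }
end

report = pairs.transform_values { |(local, reference)| compare(local, reference) }

report.each do |name, diff|
  puts "#{name}: only local #{diff[:only_local]}, only reference #{diff[:only_reference]}"
end
```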