
Ignoring Symbols #221

Answered by flavorjones
aljadepalaran asked this question in Q&A
Oct 27, 2021 · 1 comments · 3 replies

Without knowing what your input string is, I'm going to assume that it contains a bare &. libxml2 and libgumbo (which underlie Nokogiri, which in turn underlies Sanitize) will correct a bare & to the HTML entity &amp;:

#! /usr/bin/env ruby

require "nokogiri"

frag = Nokogiri::HTML5.fragment("<div>this & that</div>")
frag.to_html # => "<div>this &amp; that</div>"

libxml2 does not let you configure how the core HTML entities like &lt;, &gt;, and &amp; are serialized. Always escaping them is the right thing to do if you want to emit well-formed HTML.

Can you tell us a bit more about your use case? Are you trying to sanitize plaintext, or is it truly HTML markup content? If it's plaint…

Answer selected by aljadepalaran