Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot add to AllowedCssClasses after constructor, NullReferenceException #206

Closed
RudeySH opened this issue Jan 27, 2020 · 17 comments
Closed

Comments

@RudeySH
Copy link

RudeySH commented Jan 27, 2020

This throws a NullReferenceException:

var sanitizer = new HtmlSanitizer();
sanitizer.AllowedCssClasses.Add("foo"); // NullReferenceException, the set is null

This is what the constructor of HtmlSanitizer currently does:

AllowedCssClasses = allowedCssClasses != null ? new HashSet<string>(allowedCssClasses) : null;

Also, AllowedCssClasses doesn't have a public setter. Therefore, if you don't pass any allowedCssClasses into the constructor, the set will always be null.

I want to make a PR to fix this, but I'm not sure what the best way is to do it while retaining it's current behavior. When AllowedCssClasses is null, all classes are allowed. When it is empty, no classes are allowed. We can't simply make it fall back to new HashSet<string>() in the constructor, that would be a breaking change.

Perhaps we should make it's setter public?

@mganss
Copy link
Owner

mganss commented Jan 27, 2020

Perhaps we could initialize the HashSet to contain a single element "*" which means all if null was passed in the constructor.

@RudeySH
Copy link
Author

RudeySH commented Jan 27, 2020

Or add boolean property AllowAllCssClasses and set it to true if null was passed in the constructor, otherwise false by default.

Or add a boolean property that turns the whitelist into a blacklist, and set that to true if null was passed in the constructor. An empty blacklist = allow all.

If possible, make the constructor overload that allows passing in a null collection deprecated in favor of a new method.

@RudeySH
Copy link
Author

RudeySH commented Jan 27, 2020

Furthermore, I would like to point out that actually, an option to disallow all classes was never really needed. If someone wanted to do that, wouldn't they simply not add "class" to the AllowedAttributes? And thus ideally, an empty collection for AllowedCssClasses could've meant "all classes allowed", and we wouldn't have had this issue now.

But it's too late for that now, unless you want to make the entire AllowedCssClasses HashSet deprecated and replace it with something that doesn't abuse null. 😄

@tiesont
Copy link

tiesont commented Jan 27, 2020

Suppose, while we're at it, it makes sense to suggest that the HashSet constructors use the override that includes a StringComparer (docs)? At the moment, the hashset is case-sensitive - is that intentional?

@mganss
Copy link
Owner

mganss commented Jan 28, 2020

@tiesont Absolutely.

The current state also means that if you don't pass null in the constructor you can never go back to allowing all classes. So if we adopt the AllowAllCssClasses approach it wouldn't be a breaking change.

@RudeySH
Copy link
Author

RudeySH commented Jan 28, 2020

It's technically still a breaking change. If someone out there is doing checks like sanitizer.AllowedCssClasses == null or != null, they'll run into issues. I'll leave it up to you to decide which approach is best.

@mganss
Copy link
Owner

mganss commented Jan 29, 2020

You're right. Then how about letting the getter return null if AllowAllCssClasses is true?

@mganss mganss mentioned this issue Jan 29, 2020
@RudeySH
Copy link
Author

RudeySH commented Jan 30, 2020

That eliminates any breaking changes, but results in confusing behavior. Devs would still run into NullReferenceExceptions until they fully understand that the value of allowedCssClasses passed into the constructor affects the default value of the new boolean, and how the boolean determines whether or not AllowedCssClasses becomes null / not null.

The way I see it, there are two better options that avoid any breaking changes:

  1. Make the setter of AllowedCssClasses public. Devs would still run into NullReferenceExceptions, but at least they can construct a new HashSet and assign it.

  2. Deprecate AllowedCssClasses and it's constructor overloads in favor of a new property that behaves better. Technically not a breaking change but eventually forces devs to switch, they have to be informed how the null behavior was changed.
    The new property could be named AllowedClasses, and it should never be null. At the very least, it should be a class whitelist that allows all classes when empty. If you want to disallow all classes, simpy ensure that "class" does not get added to AllowedAttributes.
    You could take this new property one step further and make it configurable whether or not it's a whitelist or blacklist. If doing so, an empty whitelist would disallow all classes, an empty blacklist would allow all classes.

@mganss
Copy link
Owner

mganss commented Jan 30, 2020

I like option 2 better, except for the blacklist part which I think is too confusing. Since the next release will be a major version update anyway, how about not deprecating AllowedCssClasses but removing it right away in favor of AllowedClasses?

@RudeySH
Copy link
Author

RudeySH commented Jan 30, 2020

I wonder, if you're considering removing AllowedCssClasses, why are we worrying about breaking changes at all? Then you might as well keep AllowedCssClasses and just fix the null behavior by assigning it a new HashSet<string>() in the constructor, and treat an empty set as "allow all".

@mganss
Copy link
Owner

mganss commented Jan 30, 2020

Then some people (who relied on the current behavior) might not recognize that there was a breaking change (code still compiles and seemingly works, no exceptions thrown etc) when in fact the behavior was changed to the exact opposite.

@RudeySH
Copy link
Author

RudeySH commented Jan 30, 2020

I see, that makes sense. Just keep in mind that it could be tricky to remove the constructor overload that accepted allowedCssClasses and replace it with an overload that accepts the new set. If the signature of the new overload is the same as the old overload, the people who rely on the current behavior will still have their code compile with no warnings.

@mganss
Copy link
Owner

mganss commented Jan 30, 2020

What if we removed the parameter from the constructor? This would introduce a runtime breaking change for everyone and a compile time breaking change for people who actually pass an argument (luckily it's the last one).

@RudeySH
Copy link
Author

RudeySH commented Jan 30, 2020

That works, but wouldn't you want to add an overload that allows people to specify the allowed classes in the constructor, like you already can with attributes / elements ?

@mganss
Copy link
Owner

mganss commented Jan 30, 2020

I don't see a way to do this without running into the problem you described above where code will compile (and/or run without recompilation) but will behave differently.

@tiesont
Copy link

tiesont commented Feb 6, 2020

Option 2 wouldn't really help the constructor problem unless someone was explicitly specifying parameters (e.g. allowedCssClasses: new HashSet<string>()) since (I'm assuming) AllowedClasses would still be a HashSet<string> (unless I missed something).

@mganss
Copy link
Owner

mganss commented Feb 6, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants