Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: Allow to preserve order of attributes #566

Closed
askoretskiy opened this issue Nov 24, 2020 · 6 comments
Closed

feature: Allow to preserve order of attributes #566

askoretskiy opened this issue Nov 24, 2020 · 6 comments
Milestone

Comments

@askoretskiy
Copy link
Contributor

askoretskiy commented Nov 24, 2020

At the moment bleach.clean sorts attributes.

There are several use cases when it is more desired to preserve the order.

Imagine following HTML:

<a href="https:/ya.ru" rel="noopener noreferrer" target="_blank" style="font-size: 24px;">Yandex</a>

The result of bleach.clean would be different than the input:

>>> import bleach
>>> text = """<a href="https:/ya.ru" rel="noopener noreferrer" target="_blank" style="font-size: 24px;">Yandex</a>"""
>>> bleach.clean(text, attributes=["href", "rel", "target", "style"], styles=["font-size"])
'<a href="https:/ya.ru" rel="noopener noreferrer" style="font-size: 24px;" target="_blank">Yandex</a>'
>>> text
'<a href="https:/ya.ru" rel="noopener noreferrer" target="_blank" style="font-size: 24px;">Yandex</a>'

Sometimes it is important to preserve the order of the attributes. For example, to check if the clean version of the HTML is the same as the original.

Technically, attributes are sorted by function bleach.utils.alphabetize_attributes.

The reason for this function was to keep stable order of the attributes. Internally attributes are stored as dict that prior to Python3.6 were storing elements in "random" order. Starting from Python3.6 all the dicts preserve the order of elements, so they are stable by itself.

FYI: Python 3.5 is no longer supported since 13 Sep 2020, so it should be fine to have this feature working for Python3.6+ only.

I suggest to add a parameter sort_attributes=True to BleachSanitizerFilter and if it is set to False -- then do not sort attributes on fly. I would keep True as default value so that behaviour stays the same as now.

@askoretskiy
Copy link
Contributor Author

I could prepare MR if needed

@g-k
Copy link
Collaborator

g-k commented Jan 5, 2021

Hi, thanks for the PR and digging into this!

Bleach doesn't try to prettify output, so it makes sense to have it default to not sorting attrs. I'd rather not add more flags though, so we can change the behavior in the next major release.

@g-k g-k added this to the v4.0.0 milestone Jan 5, 2021
@g-k g-k mentioned this issue Jan 22, 2021
@jvanasco
Copy link
Contributor

Before this change rolls out, please release a warning or some sort of message well in advance, as this is likely to break a lot of test suites and probably some production uses.

IMHO, it would be more proper to keep the current behavior and let uses opt-in. This could even be done with a module or environment variable.

I would have LOVED this feature 8 years ago. I am eager to adopt it when it is hits, however this is likely to break most of our test suites.

@g-k
Copy link
Collaborator

g-k commented Mar 25, 2021

I'm not planning to add this to bleach 3.x. I want to avoid adding more flags to .clean and retain support for the Python versions 3.x currently supports.

I landed a modified version of Artem's PR in the 4.0 branch that makes preserving attr order the default behavior and drops support for old Python versions.

Bleach sorting attributes is surprising behavior (especially considering that prettifying isn't a goal of the library), so this will be nice to release in 4.0.

@askoretskiy
Copy link
Contributor Author

I agree, it is a nice feature of 4.x release to break backward compatibility and also remove older Python support.

@willkg
Copy link
Member

willkg commented Feb 18, 2022

This landed in the next-major-release branch and I cherry-picked it into PR #644.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants