
Left angle bracket '<' with any characters (tag or not) and no closing right angle bracket prevents remaining text from being returned #544

Closed
steffanboodhoo opened this issue Aug 6, 2020 · 2 comments · Fixed by #667

Comments

@steffanboodhoo

When the input text contains a left angle bracket with no closing right angle bracket, bleach.clean drops the left angle bracket and everything after it.

Consider the following example strings:
example_str_1 = 'random prefix text <anything any amount of suffix text'
example_str_2 = '<e any amount of text here is gone'
example_str_3 = '<it works when there is a closing right bracket>'

Expected output:
example_str_1: random prefix text &lt;anything any amount of suffix text
example_str_2: &lt;e any amount of text here is gone
example_str_3: &lt;it works when there is a closing right bracket&gt;

Actual output:
example_str_1: random prefix text
example_str_2: (empty string)
example_str_3: &lt;it works when there is a closing right bracket&gt;

Python version 3.6.8
bleach 3.1.5
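
For these three inputs, the expected output listed above coincides with plain escaping of `<` and `>`, so a small checker can be sketched with the standard library alone. The `examples` dict and the `check` helper are hypothetical names used only for illustration, and the third string uses the wording from the reported output.

```python
from html import escape

# Example inputs from the report (third string worded as in the reported output).
examples = {
    "example_str_1": "random prefix text <anything any amount of suffix text",
    "example_str_2": "<e any amount of text here is gone",
    "example_str_3": "<it works when there is a closing right bracket>",
}

# Assumption: for these inputs the expected result equals escaping "<" and ">"
# while leaving quotes alone.
expected = {name: escape(text, quote=False) for name, text in examples.items()}


def check(clean):
    """Return the names of examples for which clean(text) differs from expected."""
    return [name for name, text in examples.items() if clean(text) != expected[name]]
```

With bleach installed, `check(bleach.clean)` should return an empty list once the bug is fixed; going by the report, version 3.1.5 would be expected to flag the first two examples.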

g-k added the clean label Sep 16, 2020

@remote007

Hey @g-k @steffanboodhoo, is this still open to an MR?

@willkg
Member

willkg commented Feb 10, 2022

I verified that this is still an issue.

@remote007 If you want to try fixing this, that'd be great! Let me know if you have any questions.

willkg added this to the v5.0.1 milestone Jun 1, 2022
willkg added a commit that referenced this issue Jun 2, 2022
…667)

The html5lib tokenizer kicks up a parse error token when there's a <
that isn't the start of a tag. This adds some handling for that case and
treats the < plus whatever is after it as characters data.
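
The idea in this commit message can be illustrated with a toy scanner. This is a minimal sketch and not the code from the commit: it does not use html5lib, and `toy_tokenize` and `toy_clean` are hypothetical names. It only shows the principle of emitting an unclosed `<` and the text after it as characters data instead of discarding it.

```python
from html import escape


def toy_tokenize(text):
    """Yield (kind, data) pairs, where kind is 'Characters' or 'Tag' (toy model)."""
    pos = 0
    while pos < len(text):
        lt = text.find("<", pos)
        if lt == -1:
            # No more "<": the rest is ordinary text.
            yield ("Characters", text[pos:])
            return
        if lt > pos:
            yield ("Characters", text[pos:lt])
        gt = text.find(">", lt)
        if gt == -1:
            # Unclosed "<": keep it and everything after it as characters data.
            yield ("Characters", text[lt:])
            return
        yield ("Tag", text[lt:gt + 1])
        pos = gt + 1


def toy_clean(text):
    """Escape every token; this toy allows no tags at all."""
    return "".join(escape(data, quote=False) for _kind, data in toy_tokenize(text))
```

Under these assumptions, `toy_clean('<e any amount of text here is gone')` keeps the whole string and escapes only the bracket, which is the behavior the issue asks for.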