Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<a> tags in tables not matched correctly #201

Open
matloob opened this issue Dec 1, 2023 · 1 comment
Open

<a> tags in tables not matched correctly #201

matloob opened this issue Dec 1, 2023 · 1 comment

Comments

@matloob
Copy link

matloob commented Dec 1, 2023

Here's a program I ran: https://go.dev/play/p/iT5Rcoe1XRz copied below


import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	text := `
<a name="foo">
<table>
<tr>
<td><a href="page">text</a></td>
</tr>
</table>
</a>
`

	p := bluemonday.UGCPolicy()
	s := string(p.SanitizeBytes([]byte(text)))
	fmt.Println(s)
}

The output is



<table>
<tr>
<td><a href="page" rel="nofollow">text</td>
</tr>
</table>
</a>

But I would expect the output to be



<table>
<tr>
<td><a href="page" rel="nofollow">text</a></td>
</tr>
</table>

That is, the closing a tag in the td should be matched to the starting a tag in the td, but instead, the ending a tag in the text is matched to it.

@buro9
Copy link
Member

buro9 commented Dec 2, 2023

this doesn't feel like it is valid nor should work.

the a element is not allowed to be a descendent of another a element, and that is what you've done by wrapping the table in the outer <a name="foo"> (which btw is also deprecated/obsolete and you should use id instead of name).

by nesting the anchors you've broken the little state machine that keeps track of the tags opened and closed, and the last open anchor is preserved, as is the last close anchor.

to me, whilst yes it breaks the state machine, this isn't valid input... and bluemonday is GIGO (garbage in, garbage out) and so I wouldn't look to fix this.

that said, if the error can be reproduced on valid input then it would be a problem that I should fix.

I used https://validator.w3.org/nu/#textarea to validate the input as I did feel that nesting anchors shouldn't work (in addition to wrapping a table with an anchor)... as the only used of a name or id anchor is to navigate to the top of it, why not just close that immediately?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants