Attribute selectors vs \n in values #233
Compared to actual browser implementations, this appears to be a bug in SoupSieve: https://codepen.io/facelessuser/pen/MWvBoJm. The reason this fails is simply due to the pattern. Our matching pattern uses This was simply a case (dealing with newlines) we did not specifically test. I should probably take a look at all the attribute-related patterns and compare them against browser behavior when including newlines.
Looks like it is a pretty simple fix. We just needed to enable
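For readers following along, here is a minimal sketch of how a regex-based "contains" match can stop at the first line of a value. The pattern shape (`.*?value.*`) and the use of `re.DOTALL` are assumptions made for illustration; they are not taken from Soup Sieve's source or from this thread.

```python
import re

# Hypothetical attribute value spanning two lines.
value = "first line\nsecond line"
needle = "second"

# Assumed pattern shape for a "contains" match; not Soup Sieve's actual pattern.
single_line = re.compile(r".*?%s.*" % re.escape(needle))

# Same pattern, but with DOTALL so that "." also matches newline characters.
dotall = re.compile(r".*?%s.*" % re.escape(needle), re.DOTALL)

# Without DOTALL, "." cannot cross the "\n", so text on the second line is never reached.
misses_second_line = single_line.match(value) is None

# With DOTALL, the leading ".*?" can consume the newline and reach the needle.
finds_second_line = dotall.match(value) is not None

print(misses_second_line, finds_second_line)
```

By default `.` in Python's `re` matches any character except a newline, which is consistent with the symptom reported in this issue: only text on the first line of the attribute is found.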
Wow. I don't know what to say. To handle such a peculiar request so quickly and graciously is absolutely awesome 😍
It's actually not so peculiar. Yes, it is a bit odd to use newlines in attributes, but Soup Sieve's goal is to match real-world CSS selector behavior as much as is practical and possible in the scraping environment. Ideally, we'd like to limit surprises and have things operate as close as possible to what people experience using selectors in real browsers. Real-world browsers handle such cases, so we should too 🙂. Before I wrote Soup Sieve, BeautifulSoup's selector behavior was quite limited and very quirky. Now you can copy in most selectors and they should work pretty much as expected, meaning you don't have to think so hard about what this selector implementation supports, what it doesn't, or what it does differently. I plan on cutting a release later today, so you should be able to pick the fix up soon.
Hi! Thanks for the powerful library.
I use it via BeautifulSoup, and I found this behavior:
I expected this to print both spans, but the actual output is
It seems that *= considers only the first line of a multi-line attribute:
prints this:
Is this a bug, a conscious limitation, or is \n in attribute values against the standard?
Thanks!
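The original snippets from this report were not preserved, so the following is only a hypothetical reproduction along the lines described above. The markup, the `title` attribute, and the selector are invented for illustration; only `BeautifulSoup(...)`, `select()`, and `get_text()` are real BeautifulSoup calls.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second span's attribute value contains a newline.
html = (
    '<span title="foo bar">one</span>'
    '<span title="foo\nbar">two</span>'
)

soup = BeautifulSoup(html, "html.parser")

# "*=" is the CSS substring attribute selector; here "bar" sits on the
# second line of the second span's attribute value.
matches = soup.select('span[title*="bar"]')
texts = [el.get_text() for el in matches]

# Browsers match both spans; a Soup Sieve version without the fix
# discussed in this thread may report only the first one.
print(texts)
```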