New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
debug _find_link_internal #256
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The travis build fails (test failure), so there seems to be an issue somewhere, this obviously needs fixing before merging.
Also, it would be nice to have a test demonstrating the failure to illustrate the issue that is fixed by the commit.
mechanicalsoup/stateful_browser.py
Outdated
'url_regex because url_regex is already ' | ||
'present in keyword arguments') | ||
else: | ||
kwargs['url_regex'] = link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather keep the if ... and ...
as is and turn else:
into else if link: ...
for readability.
Hi, and thanks for the pull request! So if I understand this correctly, the case you want to fix is this: br.find_link(None, url_regex="/something") However, according to the docstring of
It looks like the intention is to not specify a I think your patch makes sense on its own, but I'm not 100% sure what the intended behavior of
|
By the way, the test failure is a separate issue. I just updated all my pip packages locally and I'm now seeing the same issue without this patch. (Looks like it's a list ordering issue...) |
Not exactly. In my code I had a line like this : browser.download_link(url_regex='/admin/https/csr', file='/vagrant/local.csr') which would call self._find_link_internal(None, url_regex='/admin/https/csr') in this case, the code would set up the url_regex keyword to None (the value of link), thus downloading a "random" link from the page instead of the one I wanted. I think that the two following calls should have the same result, which seems coherent to the documentation in my opinion. self.follow_link(url_regex='/admin/https/csr')
self._find_link_internal(None, url_regex='/admin/https/csr')
self.follow_link('/admin/https/csr')
self._find_link_internal('/admin/https/csr') And this is what is fixed with my patch. |
The `test_link_arg_regex` test was accidentally passing. By adding another link to the test html, we can see that it was selecting the first link instead of the link that matched the supplied regex. This new version of the test fails, but it will be fixed by MechanicalSoup#256. Also added a case for no arguments and reduced code duplication by changing it to a parameterized test.
I agree with the change (but to clarify, it would grab the first link, not a random one). This should have been caught by Please go ahead and make any requested changes, and then I think this is ready to be merged. It would be worth checking that the code passes the improved tests in #261, but I wouldn't worry about that yourself. |
The `test_link_arg_regex` test was accidentally passing. By adding another link to the test html, we can see that it was selecting the first link instead of the link that matched the supplied regex. This new version of the test fails, but it will be fixed by MechanicalSoup#256. Also added a case for no arguments and reduced code duplication by changing it to a parameterized test.
@tiboroche #261 has been resolved, so please rebase your branch against master when updating and remove the Thanks! |
I made the requested changes. |
Thanks! Can you rebase on top of master and squash all commits into one? Ideally, the commit message can contain a mention of the commit introducing the If you don't have time, I can do it while merging, no problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for the PR!
CSS selectors in bs4 now return elements in page order, whereas they did not previously. This requires us to re-order some of our expected test output, and to perform an order-independent comparison if tested with a bs4 version before 4.7.0. Tested and passing with bs4 4.6.0 and 4.7.1. Closes MechanicalSoup#257.
The `message` parameter to `pytest.raises` was deprecated in pytest version 4.1. > PytestDeprecationWarning: The 'message' parameter is deprecated. > (did you mean to use `match='some regex'` to check the exception message?) Yes! We did mean to use `match`, and now we do! Thanks pytest.
FAQ: Update on alternatives mechanize is back on track, RoboBrowser seems clearly abandoned.
The tag types were being checked in two ways: 1. tag.get("type", "").lower() == X 2. tag.get("type") is not None and tag.get("type").lower() == X Since these should be identical, change all to the first option, which is simpler and faster. This is a follow-up to MechanicalSoup#246.
The `test_link_arg_regex` test was accidentally passing. By adding another link to the test html, we can see that it was selecting the first link instead of the link that matched the supplied regex. This new version of the test fails, but it will be fixed by MechanicalSoup#256. Also added a case for no arguments and reduced code duplication by changing it to a parameterized test.
Closes MechanicalSoup#248. MechanicalSoup was incorrectly ignoring `disabled` attributes in form elements.
Also, point to the bug in the new GitHub repository, the one we pointed to is obsolete.
If you called the function with url_regex='/something' and not the link argument, it would set the url_regex to None, and thus the result would be impredictable. This is tested by the test added by MechanicalSoup#261
0a3d593
to
d748184
Compare
I messed up the commits squashing, so I opened a new PR #272 with a single "clean" commit. Therefore I close this one. |
If you called the function with url_regex='/something' and not the
link argument, it would set the url_regex to None, and thus the result
would be impredictable.