output control + forms numbering system #11
Comments
Hi @abubelinha, thanks for the feedback. "When called from Python, this function returns a list of the forms." actually refers to the I will fix that in #12 (by making them behave the same and updating the docs). Also note that you don't need to write Python scripts to use twill; you can write Twill scripts instead, which are even more readable and easier to write, and suffice in many cases. I'll try to emphasize this point a bit more in the docs, and give some more examples. You also correctly observed that the form numbers printed by I will fix that via #13. Does this cover all your questions? Again, unfortunately, the documentation is very old and does not always match the behavior of the latest version. If you find more bugs, let me know. I will fix these issues and create a new release then.
Thanks for the prompt answers. I still have an important question though. I can't see the main page after login. I don't know a word about cookies, but maybe I am supposed to somehow collect and propagate them?
Maybe I need to fill in the original |
Hard to say without knowing your website. An important point is that Twill is not a full-fledged web browser, like Selenium or other tools. It can only automate and test simple websites that do not rely on JavaScript. However, cookies, hidden fields and simple redirects should work. I have also tested a website that uses CSRF tokens without doing anything special to take care of that (messing with cookies or tokens). The CSRF token should change every time you request a form; that is ok.
Thanks @Cito. I still think the CSRF token has to be passed into the login form:

    def __login(URL, USER, PWD):
        session = requests.Session()
        session.post(URL)
        data = {"email": USER, "password": PWD, "csrfToken": session.cookies['CSRFtoken'], "login": "login"}

That works. But once logged in, internal forms are hard for me to handle with requests, so I am looking for other Python alternatives like twill.
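The snippet above stops after building the payload. Here is a minimal sketch of the full flow, assuming the next step simply posts that payload back to the same URL (the original does not show this step). The helper name `login_with_csrf` is hypothetical, and `session` can be a `requests.Session()` or any object with a `post` method and a dict-like `cookies` attribute. The field names and the `CSRFtoken` cookie name are the ones used in the snippet above.

```python
def login_with_csrf(session, url, user, password):
    """Log in by echoing the CSRF cookie back as a form field.

    `session` is expected to behave like requests.Session: it needs a
    post(url, data=...) method and a dict-like `cookies` attribute.
    """
    # First request: lets the server set the CSRF cookie on the session.
    session.post(url)
    data = {
        "email": user,
        "password": password,
        # Assumption: the server expects the cookie value in this field.
        "csrfToken": session.cookies["CSRFtoken"],
        "login": "login",
    }
    # Second request (assumed, not shown in the original): send the form.
    return session.post(url, data=data)
```

With requests installed, this would be called as `login_with_csrf(requests.Session(), URL, USER, PWD)`.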
I am not able to do what you said there. For example:

    returned_cookies = show_cookies()
    print(type(returned_cookies), returned_cookies)

output:
How can I get those cookies into a Python dictionary, or at least a string which I can parse and split? Thanks a lot for your help with this!
Just noticed that the value is only returned for forms, history and links, but strangely not for cookies. Will add this to #12 so that this gets fixed as well. With the current version, you can get the cookies as a dict like this:

    from twill.commands import *
    go("https://ipt.gbif-uat.org/login.do")
    cookies = {cookie.name: cookie.value
               for cookie in browser._session.cookies}
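The dict comprehension works because the jar is iterable and yields cookie objects with `name` and `value` attributes. The same pattern can be tried offline with the standard library's `http.cookiejar.CookieJar`, which `RequestsCookieJar` subclasses. In this sketch `make_cookie` and `cookies_as_dict` are hypothetical helpers, and the host and cookie values are made up.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain="ipt.example.org", path="/"):
    """Build a minimal session cookie (hypothetical helper for this demo)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def cookies_as_dict(jar):
    """Map cookie names to values; later cookies win on duplicate names."""
    return {cookie.name: cookie.value for cookie in jar}


jar = CookieJar()
jar.set_cookie(make_cookie("CSRFtoken", "abc123"))
jar.set_cookie(make_cookie("JSESSIONID", "xyz789"))
cookies = cookies_as_dict(jar)
print(cookies)
```

Note that a name-to-value dict drops the domain and path of each cookie, so it is fine for inspection but loses information if two cookies share a name.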
The Python requests module does not send hidden fields automatically; it is not really a stateful browser like the one in Twill (although requests also supports redirection and can keep cookies if you use a session, which is what Twill does under the hood). So the twill browser should be able to handle any "normal" CSRF mechanism out of the box. Note however that your login page has two forms. You need to fill the values for the second form (displayed as #3 in the current version because of the mentioned bug), which has the email and password fields and the hidden CSRF token field, like this:

    fv(2, 'email', 'test@example.org')
    fv(2, 'password', '123456')
    submit()

Normally this should work without further ado. I think the reason why it does not work here is that your site sends a strange cookie path (two quotes instead of an actual path). I guess it only works in some browsers because they silently "correct" the path. So it should work if you correct the path like this:

    go("https://ipt.gbif-uat.org/login.do")
    for cookie in browser._session.cookies:
        if cookie.name == 'CSRFtoken':
            cookie.path = '/'
            break
    fv(2, 'email', 'test@example.org')
    fv(2, 'password', '123456')
    fv(2, 'login', 'login')
    submit()
    # this finds the error message for the wrong password
    find("combination does not exist")  # should be ok
    # to confirm that "find" works, this should raise an error
    find("some garbage")

The "cookie patching" would not be necessary if the website had sent a proper cookie path, so if you fix this on the server, it should work without that.
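The effect of the patch can be tried without the live site. The sketch below uses the standard library's `http.cookiejar.CookieJar` (the base class of the jar used by requests) with a made-up host and token; `patch_cookie_path` and `cookie_header_for` are hypothetical helpers. As an extra precaution for the plain stdlib jar, which indexes cookies by path internally, the helper removes and re-adds the cookie instead of only assigning to `cookie.path`.

```python
from http.cookiejar import Cookie, CookieJar
from urllib.request import Request


def patch_cookie_path(jar, name, new_path="/"):
    """Re-register cookies called `name` under `new_path`; return the count."""
    patched = 0
    for cookie in list(jar):  # copy: the jar is modified inside the loop
        if cookie.name == name and cookie.path != new_path:
            jar.clear(cookie.domain, cookie.path, cookie.name)
            cookie.path = new_path
            jar.set_cookie(cookie)
            patched += 1
    return patched


def cookie_header_for(jar, url):
    """Return the Cookie header the jar would attach to a request for `url`."""
    request = Request(url)
    jar.add_cookie_header(request)
    return request.get_header("Cookie")


# Made-up cookie whose path is two literal quote characters.
bad = Cookie(0, "CSRFtoken", "abc123", None, False,
             "ipt.example.org", False, False, '""', True,
             False, None, True, None, None, {})
jar = CookieJar()
jar.set_cookie(bad)

url = "https://ipt.example.org/login.do"
before = cookie_header_for(jar, url)   # header with the broken path
patch_cookie_path(jar, "CSRFtoken")
after = cookie_header_for(jar, url)    # header after patching the path
print(before, after)
```

Comparing `before` and `after` shows whether the jar is willing to send the cookie for the login URL in each state.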
Thanks a lot for your detailed answers.
If I use the Chrome developer tools Network tab and then select Regarding fixing it on the server, that site is a test installation of this Java webapp (running on Apache Tomcat/7.0.76; it tells you that after login). Thank you so much again!
I think you can do both. You can also pass field or form names instead of numbers (if they are named).
That's optional, if you leave it out it uses the form of the last fieldvalue (fv) command.
I guess it's the ipt web app or its configuration. The cookie path is set in its CsrfLoginInterceptor class, and something probably is not done right there. It also catches and ignores all Exceptions when setting the path, which does not look clean to me. ipt issue #1652 could be related to this.
Yes, I guess because Twill (or rather the RequestsCookieJar which is used under the hood) is more nitpicky about the path. Requests issue #6245 also looks related to this. You can tag me, but currently I do not have the time to look deeper into these issues. The crucial issue here is that cookies can have a domain and a path attribute which specify for which domains and URL paths they shall be valid and sent to the server. If the client (the browser or Twill) thinks the path does not match, it does not send the cookie. The behavior if the server sends an invalid path (as ipt is doing) is undefined. Chrome seems to send the cookie anyway in this case, but the RequestsCookieJar does not. Maybe the RequestsCookieJar should be more sloppy as well.
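The path rule described above can be sketched as a small function. It follows the path-match definition in RFC 6265 (section 5.1.4) and is only an illustration, not the code that requests or Twill actually run; `path_matches` is a hypothetical name.

```python
def path_matches(request_path, cookie_path):
    """Return True if a cookie with `cookie_path` applies to `request_path`."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # A prefix match only counts on a "/" boundary.
        if cookie_path.endswith("/"):
            return True
        if request_path[len(cookie_path):].startswith("/"):
            return True
    return False


print(path_matches("/login.do", "/"))          # True
print(path_matches("/ipt/login.do", "/ipt"))   # True
print(path_matches("/iptx/login.do", "/ipt"))  # False
print(path_matches("/login.do", '""'))         # False
```

A path consisting of two quote characters is not a prefix of any real request path, which is consistent with a strict client withholding the cookie.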
Thanks a lot @Cito
I am beginning with this nice tool so probably these are pretty basic questions.
forms = browser.showforms() produces output, but type(forms) returns NoneType.
The showforms() output looks something like this:
But there is only one form on this page, so I don't understand why it is numbered "2".
Maybe there is a mistake and twill starts numbering at 2 instead of 1.
After looking at the numbers, I understand I should fill in this Form #3 like this:
But that raises an error:
If I use fv("2", ...) instead, then it works correctly, although the target form number is 3:
I would say there is some kind of confusion here.
After submit("2"), I would expect the form to be submitted and me to be logged in, so the browser enters the site.
But nothing happens. I keep seeing the login page.
Thanks in advance for any help you can provide.