A suspended issue when i click to more next page on headless mode #201
Comments
I can reproduce this error, but in my case the script freezes on page 24.
I logged the events in frame_manager.py, as mishaberezi did in a similar Puppeteer issue. Log:

    page 23 =====
    before click count 1
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '4249AAE0943906B495E9D1DDB1AE1D52', 'name': 'init', 'timestamp': 40460.853683}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'init', 'timestamp': 40461.103231}
    execution context destrouyed 70
    frame detached
    execution context destrouyed 69
    frame detached
    execution context destrouyed 68
    execution context cleared
    frame navigated {'id': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'url': 'https://blog.csdn.net/xlgen157387/article/list/24?', 'securityOrigin': 'https://blog.csdn.net', 'mimeType': 'text/html'}
    execution context created {'id': 71, 'origin': 'https://blog.csdn.net', 'name': '', 'auxData': {'isDefault': True, 'frameId': '47F975CF196F8FBBB9423C5D462B7D79'}}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'firstPaint', 'timestamp': 40461.271383}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'firstContentfulPaint', 'timestamp': 40461.271383}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'firstTextPaint', 'timestamp': 40461.271383}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'firstImagePaint', 'timestamp': 40461.271383}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'firstMeaningfulPaintCandidate', 'timestamp': 40461.271383}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'DOMContentLoaded', 'timestamp': 40461.448674}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'firstMeaningfulPaintCandidate', 'timestamp': 40461.37856}
    frame attached 9A96ED59951C6DC44C5CB6E69B220AFD 47F975CF196F8FBBB9423C5D462B7D79 False
    lifecycleevevent {'frameId': '9A96ED59951C6DC44C5CB6E69B220AFD', 'loaderId': '93FA7757E124633729295113644641C6', 'name': 'DOMContentLoaded', 'timestamp': 40461.501684}
    lifecycleevevent {'frameId': '9A96ED59951C6DC44C5CB6E69B220AFD', 'loaderId': 'B0D77E7EDDD9833CF17E70790D6EFE39', 'name': 'init', 'timestamp': 40461.502118}
    frame navigated {'id': '9A96ED59951C6DC44C5CB6E69B220AFD', 'parentId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': 'B0D77E7EDDD9833CF17E70790D6EFE39', 'name': '', 'url': 'about:blank', 'securityOrigin': '://', 'mimeType': 'text/html'}
    execution context created {'id': 72, 'origin': 'https://blog.csdn.net', 'name': '', 'auxData': {'isDefault': True, 'frameId': '9A96ED59951C6DC44C5CB6E69B220AFD'}}
    lifecycleevevent {'frameId': '9A96ED59951C6DC44C5CB6E69B220AFD', 'loaderId': 'B0D77E7EDDD9833CF17E70790D6EFE39', 'name': 'load', 'timestamp': 40461.50941}
    frame stopped loading
    lifecycleevevent {'frameId': '9A96ED59951C6DC44C5CB6E69B220AFD', 'loaderId': 'B0D77E7EDDD9833CF17E70790D6EFE39', 'name': 'DOMContentLoaded', 'timestamp': 40461.512303}
    frame attached 52E46163F744BC16C379A8F3B0E9C76E 47F975CF196F8FBBB9423C5D462B7D79 False
    lifecycleevevent {'frameId': '52E46163F744BC16C379A8F3B0E9C76E', 'loaderId': 'D177D80179268253AF8685D377AC5757', 'name': 'DOMContentLoaded', 'timestamp': 40461.88449}
    lifecycleevevent {'frameId': '52E46163F744BC16C379A8F3B0E9C76E', 'loaderId': '355A57893FF868ADA3D83CD2225D62E7', 'name': 'init', 'timestamp': 40461.884764}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'load', 'timestamp': 40461.888697}
    lifecycleevevent {'frameId': '52E46163F744BC16C379A8F3B0E9C76E', 'loaderId': '47E7F26FD1BD1D3C12F60AAC3988DB54', 'name': 'init', 'timestamp': 40461.894251}
    frame navigated {'id': '52E46163F744BC16C379A8F3B0E9C76E', 'parentId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '47E7F26FD1BD1D3C12F60AAC3988DB54', 'name': 'BAIDU_DUP_fp_iframe', 'url': 'https://pos.baidu.com/wh/o.htm?ltr=', 'securityOrigin': 'https://pos.baidu.com', 'mimeType': 'text/html'}
    execution context created {'id': 73, 'origin': 'https://pos.baidu.com', 'name': '', 'auxData': {'isDefault': True, 'frameId': '52E46163F744BC16C379A8F3B0E9C76E'}}
    lifecycleevevent {'frameId': '52E46163F744BC16C379A8F3B0E9C76E', 'loaderId': '47E7F26FD1BD1D3C12F60AAC3988DB54', 'name': 'load', 'timestamp': 40461.915709}
    frame stopped loading
    frame stopped loading
    lifecycleevevent {'frameId': '52E46163F744BC16C379A8F3B0E9C76E', 'loaderId': '47E7F26FD1BD1D3C12F60AAC3988DB54', 'name': 'DOMContentLoaded', 'timestamp': 40461.916774}
    after click count 1
    lifecycleevevent {'frameId': '9A96ED59951C6DC44C5CB6E69B220AFD', 'loaderId': 'B0D77E7EDDD9833CF17E70790D6EFE39', 'name': 'networkAlmostIdle', 'timestamp': 40461.512309}
    lifecycleevevent {'frameId': '9A96ED59951C6DC44C5CB6E69B220AFD', 'loaderId': 'B0D77E7EDDD9833CF17E70790D6EFE39', 'name': 'networkIdle', 'timestamp': 40461.512309}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'firstMeaningfulPaint', 'timestamp': 40461.37856}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': '3E997CE5D24DF006A1FF1FDC7394F390', 'name': 'networkAlmostIdle', 'timestamp': 40461.818695}
    lifecycleevevent {'frameId': '52E46163F744BC16C379A8F3B0E9C76E', 'loaderId': '47E7F26FD1BD1D3C12F60AAC3988DB54', 'name': 'networkAlmostIdle', 'timestamp': 40461.91678}
    lifecycleevevent {'frameId': '52E46163F744BC16C379A8F3B0E9C76E', 'loaderId': '47E7F26FD1BD1D3C12F60AAC3988DB54', 'name': 'networkIdle', 'timestamp': 40461.916793}
    page 24 =====
    before click count 1
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': 'B513C68D3230A62C47CC3ECF2C3DD0D1', 'name': 'init', 'timestamp': 40462.838782}
    lifecycleevevent {'frameId': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': 'FF9136930CB61A71978C5AD85CBE9F00', 'name': 'init', 'timestamp': 40463.086377}
    execution context destrouyed 73
    frame detached
    execution context destrouyed 72
    frame detached
    execution context destrouyed 71
    execution context cleared
    frame navigated {'id': '47F975CF196F8FBBB9423C5D462B7D79', 'loaderId': 'FF9136930CB61A71978C5AD85CBE9F00', 'url': 'https://blog.csdn.net/xlgen157387/article/list/25?', 'securityOrigin': 'https://blog.csdn.net', 'mimeType': 'text/html'}
    execution context created {'id': 74, 'origin': 'https://blog.csdn.net', 'name': '', 'auxData': {'isDefault': True, 'frameId': '47F975CF196F8FBBB9423C5D462B7D79'}}
    Traceback (most recent call last):
      File "freeze.py", line 38, in <module>
        loop.run_until_complete(task)
      File "/home/nokados/anaconda3/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
        return future.result()
      File "freeze.py", line 21, in main
        await page.waitForNavigation()  # stuck (suspended) here when click to
      File "/home/nokados/.local/lib/python3.7/site-packages/pyppeteer/page.py", line 938, in waitForNavigation
        raise error
    pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms

In addition, if I start from any other page, the freeze occurs again after 24 iterations, right after the execution context with id=74 is created. For example, if I start from page 3, the script gets stuck while loading page 27.
Thank you very much for your feedback!
Hi @nokados.
I dug deeper into this issue and have two findings.

1. A similar freeze can be caused by an exception raised in the _recv_loop coroutine in connection.py. The following example produces the same symptoms:

    async def _recv_loop(self) -> None:
        async with self._ws as connection:
            self._connected = True
            self.connection = connection
            while self._connected:
                try:
                    resp = await self.connection.recv()
                    raise Exception('dummy exception')  # <--- this is what we added
                    if resp:
                        await self._on_message(resp)
                except (websockets.ConnectionClosed, ConnectionResetError):
                    logger.info('connection closed')
                    break
                await asyncio.sleep(0)
        if self._connected:
            self._loop.create_task(self.dispose())

However, this is not what happens in your case, because no exception is thrown here. Still, another coroutine may have the same bug...

2. The last executed line of code I could trace is not in pyppeteer. It is in protocol.py of the websockets library, in the following code:

    async def recv(self) -> Data:
        # skip some code ...
        pop_message_waiter: asyncio.Future[None] = self.loop.create_future()
        self._pop_message_waiter = pop_message_waiter
        try:
            await asyncio.wait(
                [pop_message_waiter, self.transfer_data_task],
                loop=self.loop,
                return_when=asyncio.FIRST_COMPLETED,
            )
        finally:
            self._pop_message_waiter = None

pop_message_waiter is a future that is created here and then does nothing.
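The first finding can be reproduced without pyppeteer or websockets at all. The sketch below is only a toy model: `ToyConnection` and `demo` are hypothetical names, not pyppeteer classes. It shows how an exception inside a background receive loop kills that task silently, so a caller waiting for a response sees nothing but a hang (turned into a timeout here so the demo terminates).

```python
import asyncio


class ToyConnection:
    """Hypothetical stand-in for a request/response connection (not pyppeteer's)."""

    def __init__(self, fail: bool) -> None:
        self._fail = fail
        self._inbox: asyncio.Queue = asyncio.Queue()
        self._pending: dict = {}
        self._next_id = 0
        self.recv_task = None

    def start(self) -> None:
        loop = asyncio.get_running_loop()
        self.recv_task = loop.create_task(self._recv_loop())

    async def _recv_loop(self) -> None:
        while True:
            msg_id, payload = await self._inbox.get()
            if self._fail:
                # Same idea as the injected 'dummy exception': the task dies here
                raise Exception('dummy exception')
            # Normal path: resolve the future of the matching request
            self._pending.pop(msg_id).set_result(payload)

    def send(self, payload: str) -> asyncio.Future:
        self._next_id += 1
        fut = asyncio.get_running_loop().create_future()
        self._pending[self._next_id] = fut
        # Echo the request back as its own "response"
        self._inbox.put_nowait((self._next_id, payload))
        return fut


async def demo(fail: bool) -> str:
    conn = ToyConnection(fail)
    conn.start()
    try:
        return await asyncio.wait_for(conn.send('ping'), timeout=0.2)
    except asyncio.TimeoutError:
        # The caller only sees a timeout; the real error sits in recv_task
        return 'hung: ' + repr(conn.recv_task.exception())
    finally:
        conn.recv_task.cancel()
```

Running `asyncio.run(demo(True))` returns a string that starts with `hung:` and carries the hidden exception, while `asyncio.run(demo(False))` returns the echoed payload. Inspecting the receive task (or attaching a done callback to it) is one way to surface such an error, assuming the real cause is of this kind.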
Besides, I get this error message in the Chrome console on every page:
It may be the cause of some error, but it does not explain why the entire script hangs. We should still receive an exception and be able to continue, but currently that is not possible.
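Until the root cause is found, one way to at least get an exception instead of an indefinite hang is to wrap each awaited call in `asyncio.wait_for`. This is a minimal sketch, not a fix: `guarded`, `hangs_forever`, and `returns_quickly` are hypothetical helpers, and the stand-in coroutines only imitate a call that never returns.

```python
import asyncio


async def guarded(awaitable, timeout: float, label: str):
    """Await `awaitable`; return (True, result) or (False, reason) on timeout."""
    try:
        return True, await asyncio.wait_for(awaitable, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the inner awaitable before raising
        return False, '{} did not finish within {} s'.format(label, timeout)


async def hangs_forever() -> None:
    # Hypothetical stand-in for a call that never returns
    await asyncio.Event().wait()


async def returns_quickly() -> str:
    await asyncio.sleep(0)
    return 'done'


async def demo():
    first = await guarded(hangs_forever(), 0.1, 'screenshot')
    second = await guarded(returns_quickly(), 0.1, 'next step')
    return first, second
```

In a real script the awaitable would be something like `page.screenshot(...)`. If the underlying connection itself has stalled, later calls on the same page will probably time out as well, so the caller may need to recreate the page or the browser; that part is an assumption and is not shown here.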
Hi, just adding my two cents: you could try using element.appendChild() or something similar to tackle this. See this page on the Google Developers site: https://developers.google.com/web/updates/2016/08/removing-document-write
1. Env version
OS: Ubuntu 16.04
Python version: 3.6.2
pyppeteer version: 0.0.25
Chrome version: 575458 (default); I also tried other versions such as 579032 and 609904
2. What happened?
I want to visit this CSDN page, https://blog.csdn.net/xlgen157387, with pyppeteer in headless mode,
then click the next-page button until the last page is reached.
Opening this URL shows that the list has 28 pages in total.
I ran my code for this purpose, but it got stuck on page 24.
I actually catch the Navigation Timeout Exceeded error on page 24 and then retry clicking the next-page button. The script then hangs on this line:
await page.screenshot({"path": "img/exp{}.png".format(i), "fullPage": True})
Important: with headless=False, this issue does not occur!
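For the click-then-wait step described above, a commonly recommended pattern is to start `page.waitForNavigation()` before the click, so that a fast navigation cannot finish before the waiter is registered. The sketch below assumes a pyppeteer `Page` object is passed in; `click_and_wait` and the selector are hypothetical, and this only rules out one known source of navigation timeouts rather than the hang reported here.

```python
import asyncio


async def click_and_wait(page, selector: str, timeout_ms: int = 30000) -> None:
    """Click `selector` and wait for the navigation that the click triggers."""
    # gather() schedules both coroutines, so the navigation waiter is
    # already registered when the click is dispatched
    await asyncio.gather(
        page.waitForNavigation({'timeout': timeout_ms}),
        page.click(selector),
    )
```

A pagination loop would then call `await click_and_wait(page, next_button_selector)` once per page, where `next_button_selector` is whatever CSS selector matches the next-page button on the target site.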
3. Original code
You can reproduce this error with this code.
I suspect the connection is lost. If you can click through to the last page without any issue, could you share your environment versions with me? Thanks!
Simplified version
Detailed version