Pyppeteer session crashes or times out

Problem description

To scrape binance.com, I use the pyppeteer library (through requests_html) to render the page and get the final HTML instead of the unrendered JavaScript.

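My problem: the session works fine the first time on a remote Ubuntu 20.04 server, but when I run the code again I get pyppeteer.errors.PageError: Page crashed! and pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 1000000 ms exceeded. The same code runs fine when I start it from PyCharm on my main Windows machine; the problem occurs only on Ubuntu.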

I think the problem is related to leftover pyppeteer sessions that were never terminated, but I'm not sure.
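
One way to check this guess is to look for browser processes that are still alive after the script has stopped. The following is only a sketch: the helper names matching_pids and leftover_browser_pids are made up for illustration, and it assumes a Linux ps that accepts -eo pid=,comm= and that the Chromium binary started by pyppeteer shows up with "chrome" in its command name.

```python
import subprocess


def matching_pids(ps_output, fragment):
    """Return PIDs from `ps -eo pid=,comm=` output whose command contains fragment."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # "  1234 chrome" -> ["1234", "chrome"]
        if len(parts) == 2 and parts[0].isdigit() and fragment in parts[1]:
            pids.append(int(parts[0]))
    return pids


def leftover_browser_pids(fragment="chrome"):
    """List PIDs of running processes whose command name contains fragment.

    Assumption: the headless browser appears as "chrome" in the process list.
    """
    result = subprocess.run(
        ["ps", "-eo", "pid=,comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return matching_pids(result.stdout, fragment)
```

If leftover_browser_pids() returns a non-empty list while no scraper is running, that would support the idea that an earlier run left its browser behind.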

Here is my code:

from requests_html import HTMLSession
from bs4 import BeautifulSoup
import time
from datetime import datetime
from sql import *


if __name__ == "__main__":
    while True:
        # A new session is created on every iteration of the loop.
        session = HTMLSession()
        r = session.get('https://www.binance.com/ru/Trade/ETH_BTC')
        # Render the JavaScript-driven page; timeout is given in seconds.
        r.html.render(sleep=1, keep_page=True, scrolldown=1, timeout=1000)
        soup = BeautifulSoup(r.html.html, "lxml")

        # The price is in a div whose class name starts with "showPrice".
        price = soup.find("div", class_=lambda value: value and value.startswith("showPrice"))

        now = datetime.now()
        dt_string = now.strftime("%d/%m/%Y %H:%M:%S")
        sql(dt_string, price.text)
        print(dt_string + " ETH/BTC: " + price.text)

        r.close()
        session.close()

This is the crash error log:

Traceback (most recent call last):
  File "binance.py", line 13, in <module>
    r.html.render(sleep = 1, timeout=1000)
  File "/usr/local/lib/python3.8/dist-packages/requests_html.py", line 598, in render
    content, result, page = self.session.loop.run_until_complete(self._async_render(url=self.url, script=script, sleep=sleep, wait=wait, content=self.html, reload=reload, scrolldown=scrolldown, timeout=timeout, keep_page=keep_page))
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/dist-packages/requests_html.py", line 512, in _async_render
    await page.goto(url, options={'timeout': int(timeout * 1000)})
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/page.py", line 885, in goto
    raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 1000000 ms exceeded.
[E:pyppeteer.connection] connection unexpectedly closed
Task exception was never retrieved
future: <Task finished name='Task-105' coro=<Connection._async_send() done, defined at /usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py:69> exception=InvalidStateError('invalid state')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/websockets/protocol.py", line 827, in transfer_data
    message = await self.read_message()
  File "/usr/local/lib/python3.8/dist-packages/websockets/protocol.py", line 895, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
  File "/usr/local/lib/python3.8/dist-packages/websockets/protocol.py", line 971, in read_data_frame
    frame = await self.read_frame(max_size)
  File "/usr/local/lib/python3.8/dist-packages/websockets/protocol.py", line 1047, in read_frame
    frame = await Frame.read(
  File "/usr/local/lib/python3.8/dist-packages/websockets/framing.py", line 105, in read
    data = await reader(2)
  File "/usr/lib/python3.8/asyncio/streams.py", line 721, in readexactly
    raise exceptions.IncompleteReadError(incomplete, n)
asyncio.exceptions.IncompleteReadError: 0 bytes read on a total of 2 expected bytes

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 73, in _async_send
    await self.connection.send(msg)
  File "/usr/local/lib/python3.8/dist-packages/websockets/protocol.py", line 555, in send
    await self.ensure_open()
  File "/usr/local/lib/python3.8/dist-packages/websockets/protocol.py", line 803, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: code = 1006 (connection closed abnormally [internal]), no reason

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 79, in _async_send
    await self.dispose()
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 170, in dispose
    await self._on_close()
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 151, in _on_close
    cb.set_exception(_rewriteError(
asyncio.exceptions.InvalidStateError: invalid state
Task exception was never retrieved
future: <Task finished name='Task-2' coro=<Connection._recv_loop() done, defined at /usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py:53> exception=PageError('Page crashed!')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 61, in _recv_loop
    await self._on_message(resp)
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 143, in _on_message
    self._on_query(msg)
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 123, in _on_query
    session._on_message(params.get('message'))
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/connection.py", line 276, in _on_message
    self.emit(obj.get('method'), obj.get('params'))
  File "/usr/local/lib/python3.8/dist-packages/pyee/_base.py", line 108, in emit
    handled = self._call_handlers(event, args, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pyee/_base.py", line 91, in _call_handlers
    self._emit_run(f, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pyee/_compat.py", line 49, in _emit_run
    coro = f(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/page.py", line 205, in <lambda>
    lambda event: self._onTargetCrashed())
  File "/usr/local/lib/python3.8/dist-packages/pyppeteer/page.py", line 228, in _onTargetCrashed
    self.emit('error', PageError('Page crashed!'))
  File "/usr/local/lib/python3.8/dist-packages/pyee/_base.py", line 111, in emit
    self._emit_handle_potential_error(event, args[0] if args else None)
  File "/usr/local/lib/python3.8/dist-packages/pyee/_base.py", line 83, in _emit_handle_potential_error
    raise error
pyppeteer.errors.PageError: Page crashed!

Solution

If you kill the leftover process, for example with killall -9 python3, the script runs fine again on the second, third, and later runs. Problem solved!
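
Killing processes by hand works, but the script can also try to clean up after itself. The sketch below is an assumption about the cause, not a confirmed diagnosis: in the question's loop, session.close() is skipped whenever render() raises, so a failed run may leave its browser behind. The helper names run_with_cleanup and fetch_html are hypothetical; the HTMLSession calls are the same ones used in the question.

```python
def run_with_cleanup(make_session, job):
    """Create a session, run job(session), and always close the session."""
    session = make_session()
    try:
        return job(session)
    finally:
        # Runs on success and on errors such as TimeoutError or PageError,
        # so the session is closed even when rendering fails.
        session.close()


def fetch_html(session, url="https://www.binance.com/ru/Trade/ETH_BTC"):
    """Fetch and render one page, returning the rendered HTML."""
    r = session.get(url)
    # Same render call as in the question, minus keep_page, so the page
    # is not kept open after rendering.
    r.html.render(sleep=1, scrolldown=1, timeout=1000)
    return r.html.html
```

With requests_html installed, this could be called as run_with_cleanup(HTMLSession, fetch_html) inside the loop. Whether this removes the need for killall on the Ubuntu server has not been verified here.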
