Why does len(self) raise a RecursionError in aiohttp? a.k.a. strange behavior of the Azure Event Hubs library

Problem description

This is weird.

We have a RecursionError raised by aiohttp:

  File "/usr/local/lib/python3.8/site-packages/aiohttp/streams.py", line 169, in on_eof
    callback()
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 943, in _response_eof
    self._connection.release()
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 171, in release
    self._key, self._protocol, should_close=self._protocol.should_close
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client_proto.py", line 53, in should_close
    or len(self) > 0

But we cannot identify any code in our own codebase that recurses.
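One thing worth keeping in mind when reading a traceback like this: the frame that reports a RecursionError only has to be the call that pushes the stack past the limit. It does not have to be recursive itself. The sketch below shows this in plain Python with no aiohttp involved. `Buffer`, `should_close`, `call_at_depth` and the other names are hypothetical stand-ins for illustration, not aiohttp code.

```python
import sys


class Buffer:
    # Hypothetical stand-in for a queue-like object with a Python-level __len__.
    def __init__(self):
        self._items = []

    def __len__(self):
        return len(self._items)


def should_close(buf):
    # Deliberately non-recursive: a single len() call.
    return len(buf) > 0


def call_at_depth(depth, func, arg):
    # Push `depth` extra frames onto the stack, then call func(arg) once.
    if depth == 0:
        return func(arg)
    return call_at_depth(depth - 1, func, arg)


def innermost_function(exc):
    # Name of the function in the last traceback entry of `exc`.
    tb = exc.__traceback__
    while tb.tb_next is not None:
        tb = tb.tb_next
    return tb.tb_frame.f_code.co_name


def first_overflow():
    # Deepen the stack one frame at a time until the limit is first exceeded.
    for depth in range(sys.getrecursionlimit() + 1):
        try:
            call_at_depth(depth, should_close, Buffer())
        except RecursionError as exc:
            return depth, innermost_function(exc)
    return None


if __name__ == "__main__":
    print(first_overflow())
```

At the first depth that overflows, the last traceback entry points into the harmless `should_close` / `__len__` pair rather than into the function that actually built up the deep stack. If something similar is happening here, the `len(self)` line would be where the limit was hit, and the cause of the deep stack would have to be looked for further up the call chain.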

Dependency file:

azure-eventhub-checkpointstoreblob-aio = "*"
azure-eventhub-checkpointstoreblob = "*"

The locked versions are:

azure-eventhub = "==5.5.0"
aiohttp = "==3.7.4.post0"
azure-eventhub-checkpointstoreblob = "==1.1.4"
azure-eventhub-checkpointstoreblob-aio = "==1.1.4"

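The error above occurs in the coroutine that azure-eventhub creates when receiving. From the Event Hubs library itself we get a message wrapping that error, logged at warning level: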

EventProcessor instance '...' of eventhub 'usage-ingress' consumer group '$Default'. An error occurred while load-balancing and claiming ownership. The exception is RecursionError('maximum recursion depth exceeded'). retrying after 11.889686314484639 seconds

A recursion error from this method?? I have looked up the pieces of code named in the stack trace, and I am baffled.

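Details about the infrastructure:

Checkpointing is done on Azure Storage Data Lake Gen2 with a hierarchical filesystem. The library's built-in request logging for storage requests shows status 200 for the ownership and checkpoint requests, along with the message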

"No body was attached to the request"

Without any code change or redeployment, our service started to slow down (receiver throughput of 1-5 events per hour, with 100k events coming into the Event Hub).

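We already suspected that the Event Hubs service was throttling us, but the metrics show no throttled requests. There is no throttling on the data lake either. The one exception is that access times on the checkpoint folder the eventhub library creates on the filesystem are very long. The access tier is hot.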

I have absolutely no idea why this is happening. I hope someone has an idea. Cheers

Edit: related to this:

"level": "ERR","message": "Exception in eof callback","threadid": 140306312525568,"processid": 10,"channel": "aiohttp.internal","exception": "RecursionError","stacktrace": "  File \"/usr/local/lib/python3.8/site-packages/aiohttp/streams.py\",in on_eof\n    callback()\n  File \"/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py\",in _response_eof\n    self._connection.release()\n  File \"/usr/local/lib/python3.8/site-packages/aiohttp/connector.py\",line 167,in release\n    self._notify_release()\n"

Edit #2: in response to Adam:

Jun 24,2021 @ 01:16:21.628 Request URL: 'https://<storage>.blob.core.windows.net/<resGroup>?restype=REDACTED&comp=REDACTED&prefix=REDACTED&marker=REDACTED&include=REDACTED'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629 Request method: 'GET'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629     'Authorization': 'REDACTED' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629 No body was attached to the request INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629 Request headers:    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629     'x-ms-version': 'REDACTED'  INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629     'Accept': 'application/xml' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629     'x-ms-date': 'REDACTED' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629     'x-ms-client-request-id': '05acf018-d479-11eb-af7d-a2f51e4cb2c7'    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:21.629     'User-Agent': 'azsdk-python-storage-blob/12.7.1 Python/3.8.10 (Linux-5.4.0-1047-azure-x86_64-with-glibc2.28)'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.924 Response status: 200    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.924     'transfer-encoding': 'chunked'  INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.924 Exception in eof callback   ERR   File "/usr/local/lib/python3.8/site-packages/aiohttp/streams.py",in should_close
    or len(self) > 0
    RecursionError  aiohttp.internal
    Jun 24,2021 @ 01:16:24.924 Response headers:   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.925     'x-ms-version': 'REDACTED'  INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.925     'Date': 'Wed,23 Jun 2021 23:16:24 GMT' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.925     'Content-Type': 'application/xml'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.925     'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0'    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.925     'x-ms-request-id': '57d01bb6-e01e-0019-7185-68c5ca000000'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24,2021 @ 01:16:24.925     'x-ms-client-request-id': 'f789fce2-d478-11eb-80e0-a2f51e4cb2c7'

The error seems to be triggered when some kind of EOF is hit during the request phase.

Solution

So my problem was resolved by increasing the partition count of the Event Hub and manually deleting the ownership blobs for that Event Hub.

Still strange, but I could not cleanly reproduce the error.
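For anyone who wants to script the manual cleanup of the ownership blobs instead of clicking through the portal, here is a rough sketch. It is not what I ran. The blob layout in `ownership_prefix` (`<namespace>/<eventhub>/<consumer group>/ownership/<partition id>`, lowercased) is an assumption about how the checkpoint store names its blobs, so list the container first and check. Both function names are made up for this example. `container_client` is expected to offer `list_blobs(name_starts_with=...)` and `delete_blob(name)`, as the `ContainerClient` in azure-storage-blob does.

```python
def ownership_prefix(namespace, eventhub, consumer_group):
    # Assumed layout, verify against your container before deleting anything:
    # <namespace>/<eventhub>/<consumer group>/ownership/<partition id>, lowercased.
    return "{}/{}/{}/ownership/".format(namespace, eventhub, consumer_group).lower()


def delete_ownership_blobs(container_client, prefix, dry_run=True):
    # Collect the names first so a dry run can be reviewed before deleting.
    names = [blob.name for blob in container_client.list_blobs(name_starts_with=prefix)]
    if not dry_run:
        for name in names:
            container_client.delete_blob(name)
    return names
```

Run it with `dry_run=True` first and read the returned names. Only switch to `dry_run=False` once the list contains nothing but ownership blobs for the affected Event Hub, and stop the consumers beforehand so they do not re-claim partitions while you delete.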