Snowplow pipeline: Could not commit request due to validation error

Problem description

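I am getting the following error in my enrich step: "Could not commit request due to validation error: INVALID_ARGUMENT: Pubsub publish requests are limited to 10MB, rejecting message to avoid exceeding limit with byte64 request encoding".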

I set up the pipeline by following Simo Ahava's tutorial: Install Snowplow On The Google Cloud Platform | Simo Ahava's blog

The error occurs in the Dataflow step of the beam-enrich worker. It stops all processing, and no data is inserted into BQ.

Error log

{
  "insertId": "7514256621418980731:34459:0:179556",
  "jsonPayload": {
    "line": "active_work_manager.cc:1564",
    "message": "132593 Could not commit request due to validation error: INVALID_ARGUMENT: Pubsub publish requests are limited to 10MB, rejecting message over 7168K (size 7245K) to avoid exceeding limit with byte64 request encoding.",
    "thread": "194"
  },
  "resource": {
    "type": "dataflow_step",
    "labels": {
      "job_name": "beam-enrich",
      "project_id": "XXXXXXXXXX",
      "region": "europe-central2",
      "job_id": "2021-03-30_14_43_03-13952642482494084906",
      "step_id": ""
    }
  },
  "timestamp": "2021-03-31T08:38:58.534286Z",
  "severity": "ERROR",
  "labels": {
    "compute.googleapis.com/resource_name": "beam-enrich-03301443-wkq8-harness-w1zs",
    "dataflow.googleapis.com/log_type": "system",
    "dataflow.googleapis.com/job_id": "2021-03-30_14_43_03-13952642482494084906",
    "dataflow.googleapis.com/region": "europe-central2",
    "compute.googleapis.com/resource_type": "instance",
    "compute.googleapis.com/resource_id": "7514256621418980731",
    "dataflow.googleapis.com/job_name": "beam-enrich"
  },
  "logName": "projects/XXXXXXXXXX/logs/dataflow.googleapis.com%2Fshuffler",
  "receiveTimestamp": "2021-03-31T08:39:21.828671489Z"
}

Solution

Pub/Sub has the following hard resource limits: https://cloud.google.com/pubsub/quotas#resource_limits

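Does your pipeline publish to Pub/Sub? If so, you may need to reduce the size of your messages, either by splitting them or by truncating them.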
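
To illustrate the splitting idea, here is a minimal Python sketch. It is not part of Snowplow, Beam, or the Pub/Sub client library. The names `MAX_MESSAGE_BYTES`, `base64_size`, and `split_records` are hypothetical, and it assumes your payload is a list of JSON-serializable records that can be sent as several smaller messages. The 7168K threshold is copied from the error message in the log, and reading "K" as KiB is an assumption. Whether beam-enrich output can be split this way depends on your setup.

```python
import json

# Threshold quoted in the error message ("rejecting message over 7168K").
# Interpreting "K" as KiB is an assumption.
MAX_MESSAGE_BYTES = 7168 * 1024


def base64_size(n_bytes: int) -> int:
    """Return the size of n_bytes of data after base64 encoding (with padding).

    Base64 turns every 3 input bytes into 4 output characters, which is why
    the raw message must stay well below the 10MB request limit.
    """
    return 4 * ((n_bytes + 2) // 3)


def split_records(records, max_bytes=MAX_MESSAGE_BYTES):
    """Group records into batches whose compact JSON array fits in max_bytes."""
    batches = []
    current = []
    current_size = 2  # the enclosing "[" and "]"
    for record in records:
        encoded = json.dumps(record, separators=(",", ":")).encode("utf-8")
        if len(encoded) + 2 > max_bytes:
            # Splitting cannot help here; the record itself must be truncated.
            raise ValueError("a single record exceeds max_bytes")
        # One extra byte for the comma that separates it from the previous record.
        extra = len(encoded) + (1 if current else 0)
        if current and current_size + extra > max_bytes:
            batches.append(current)
            current = []
            current_size = 2
            extra = len(encoded)
        current.append(record)
        current_size += extra
    if current:
        batches.append(current)
    return batches
```

Each batch returned by `split_records` can then be serialized with `json.dumps(batch, separators=(",", ":"))` and published as its own message. A record that is too large on its own raises `ValueError`, which is the case where truncating is the only option.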