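Why are my tasks failing in Google App Engine?

Problem description

I have two tasks that each run every 12 hours as an ETL from an API endpoint to a Snowflake DB. They fail about 3-4 times a week, and I don't know exactly why.

The Cron Task Manager says the last run was at 6:29 this morning, but when I retrieve the logs there is only a single line: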

This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.

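I'm not sure whether I need warmup requests, dedicated workers, or something else, because a single-line log entry doesn't tell me much. I'm using a fairly large instance class, which I expected to handle most of the workload.

The logs from a successful run look like this: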

https://github.com/markamcgown/GF/blob/main/downloaded-logs-success2.csv

And from a failed run:

https://github.com/markamcgown/GF/blob/main/downloaded-logs-20210104-074656.csv

app.yaml：

service: vetdata-loader
runtime: python38

instance_class: F4_1G

handlers:

- url: /task/loader
  script: auto

Update: this is my latest app.yaml. It fails less often now, but it still fails sometimes:

service: vetdata-loader
runtime: python38

instance_class: B4_1G

handlers:

- url: /task/loader
  script: auto

basic_scaling:
  max_instances: 11
  idle_timeout: 30m

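Solution

I don't think you are using the right instance class. According to the documentation on request timeouts and task calls, a call is limited to 10 minutes under automatic scaling and can run for up to 24 hours under basic and manual scaling.

Looking at your instance_class, the F-type classes are suitable for automatic scaling, so your original configuration falls under the 10-minute limit. Use a B4_1G instance class instead and check whether you still have these problems. You shouldn't.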
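
As a rough illustration of the reasoning above, here is a minimal Python sketch that maps an instance class name to the request deadline quoted in this answer. The helper names (`scaling_family`, `request_deadline_seconds`, `fits_in_deadline`) are hypothetical and not part of any App Engine API. The 45-minute runtime in the usage note is an assumed figure, not something taken from the logs.

```python
# Request deadlines as quoted in the answer above, in seconds.
AUTOMATIC_DEADLINE_S = 10 * 60            # automatic scaling: 10 minutes
BASIC_MANUAL_DEADLINE_S = 24 * 60 * 60    # basic/manual scaling: 24 hours


def scaling_family(instance_class: str) -> str:
    """Return the scaling family an instance class name belongs to.

    Hypothetical helper: it only looks at the first letter of the name
    (F classes -> automatic scaling, B classes -> basic or manual scaling).
    """
    prefix = instance_class.strip().upper()[:1]
    if prefix == "F":
        return "automatic"
    if prefix == "B":
        return "basic_or_manual"
    raise ValueError(f"unrecognized instance class: {instance_class!r}")


def request_deadline_seconds(instance_class: str) -> int:
    """Return the request deadline that applies to the given instance class."""
    if scaling_family(instance_class) == "automatic":
        return AUTOMATIC_DEADLINE_S
    return BASIC_MANUAL_DEADLINE_S


def fits_in_deadline(instance_class: str, expected_runtime_s: float) -> bool:
    """Check whether a job of the expected runtime fits within the deadline."""
    return expected_runtime_s <= request_deadline_seconds(instance_class)
```

With an assumed ETL runtime of 45 minutes, `fits_in_deadline("F4_1G", 45 * 60)` is false while `fits_in_deadline("B4_1G", 45 * 60)` is true, which matches the suggestion to move to a B class.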

app.yaml cron-task google-app-engine google-cloud-platform