Why doesn't Airflow show the Python error in the log?

Problem description

I am trying to retrieve data via an HTTP request and then insert it into Postgres. I have already run it with the development data and it works fine, but when I switch to production it fails after about 15 minutes, and no error is shown in the log.

The test data is 5 MB; the development data is 90 MB.

Locally in plain Python it works; I tried it in Jupyter.

The request code is:

# requests and pandas are assumed to be imported at the top of the DAG file;
# ti is the TaskInstance passed in through the Airflow task context.
import requests
import pandas as pd

# Pull the target URL that the 'settings' task pushed via XCom
URL = ti.xcom_pull(key='url', task_ids='settings')

# Download the payload and normalise the quoting
data_row = requests.get(url=URL)
data_text = data_row.text
data_clean = data_text.replace("'", '"')

my_data = {
    'entity': 'LOCATION',
    'data': data_clean,
}

# Single-row DataFrame that is later inserted into Postgres
dataframe = pd.DataFrame(my_data, index=[0])
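
The step that writes the DataFrame into Postgres is not shown in the question. A minimal sketch of what it could look like with an Airflow PostgresHook (the connection id my_postgres and the table location_raw are assumptions, not taken from the original DAG):

from airflow.hooks.postgres_hook import PostgresHook

# Assumed connection id and target table; replace with the real ones.
hook = PostgresHook(postgres_conn_id='my_postgres')
engine = hook.get_sqlalchemy_engine()

# Append the single-row DataFrame built above to a staging table.
dataframe.to_sql('location_raw', con=engine, if_exists='append', index=False)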

The Airflow log is:

*** Reading remote log from gs://.../load_Location/2020-08-12T06:49:53.933780+00:00/3.log.
[2020-08-12 07:40:02,165] {taskinstance.py:656} INFO - Dependencies all met for <TaskInstance: ODS_batch_load.load_Location 2020-08-12T06:49:53.933780+00:00 [queued]>
[2020-08-12 07:40:02,226] {taskinstance.py:656} INFO - Dependencies all met for <TaskInstance: ODS_batch_load.load_Location 2020-08-12T06:49:53.933780+00:00 [queued]>
[2020-08-12 07:40:02,227] {taskinstance.py:867} INFO -
--------------------------------------------------------------------------------
[2020-08-12 07:40:02,227] {taskinstance.py:868} INFO - Starting attempt 3 of 3
[2020-08-12 07:40:02,228] {taskinstance.py:869} INFO -
--------------------------------------------------------------------------------
[2020-08-12 07:40:02,244] {taskinstance.py:888} INFO - Executing <Task(LoadsqlOperator): load_Location> on 2020-08-12T06:49:53.933780+00:00
[2020-08-12 07:40:02,245] {base_task_runner.py:131} INFO - Running on host: airflow-worker-85b4c6c6-RSScd
[2020-08-12 07:40:02,245] {base_task_runner.py:132} INFO - Running: ['airflow','run','ODS_batch_load','load_Location','2020-08-12T06:49:53.933780+00:00','--job_id','4091','--pool','default_pool','--raw','-sd','DAGS_FOLDER/ODS_Batch_Load.py','--cfg_path','/tmp/tmp8432h19m']


The task is then marked as failed, but no error is displayed. One of my theories is that it is because of the size of the data.
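
One minimal way to test that theory is to log the payload size right after the requests.get call, so the task log records how far the job got even if the worker is killed later (a sketch, assuming the code runs inside the task's Python callable and data_row is the response object from above):

import logging

log = logging.getLogger(__name__)

# Record the size of the production payload before any in-memory processing.
log.info('HTTP %s, payload %.1f MB',
         data_row.status_code, len(data_row.content) / 1024 / 1024)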

Solution

No effective solution to this problem has been found yet.
