Airflow Google Composer TypeError: can't pickle _thread.RLock objects

Problem description

I am using Airflow (Google Composer) and ran into the exception below:

TypeError: can't pickle _thread.RLock objects

Ooops.

                          ____/ (  (    )   )  \___
                         /( (  (  )   _    ))  )   )\
                       ((     (   )(    )  )   (   )  )
                     ((/  ( _(   )   (   _) ) (  () )  )
                    ( (  ( (_)   ((    (   )  .((_ ) .  )_
                   ( (  )    (      (  )    )   ) . ) (   )
                  (  (   (  (   ) (  _  ( _) ).  ) . ) ) ( )
                  ( (  (   ) (  )   (  ))     ) _)(   )  )  )
                 ( (  ( \ ) (    (_  ( ) ( )  )   ) )  )) ( )
                  (  (   (  (   (_ ( ) ( _    )  ) (  )  )   )
                 ( (  ( (  (  )     (_  )  ) )  _)   ) _( ( )
                  ((  (   )(    (     _    )   _) _(_ (  (_ )
                   (_((__(_(__(( ( ( |  ) ) ) )_))__))_)___)
                   ((__)        \\||lll|l||///          \_))
                            (   /(/ (  )  ) )\   )
                          (    ( ( ( | | ) ) )\   )
                           (   /(| / ( )) ) ) )) )
                         (     ( ((((_(|)_)))))     )
                          (      ||\(|(|)|/||     )
                        (        |(||(||)||||        )
                          (     //|/l|||)|\\ \     )
                        (/ / //  /|//||||\\  \ \  \ _)
-------------------------------------------------------------------------------
Node: d93e048dc08a
-------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/python3.6/lib/python3.6/site-packages/flask/app.py",line 2447,in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/python3.6/lib/python3.6/site-packages/flask/app.py",line 1952,in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/python3.6/lib/python3.6/site-packages/flask/app.py",line 1821,in handle_user_exception
    reraise(exc_type,exc_value,tb)
  File "/opt/python3.6/lib/python3.6/site-packages/flask/_compat.py",line 39,in reraise
    raise value
  File "/opt/python3.6/lib/python3.6/site-packages/flask/app.py",line 1950,in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/python3.6/lib/python3.6/site-packages/flask/app.py",line 1936,in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/python3.6/lib/python3.6/site-packages/flask_admin/base.py",line 69,in inner
    return self._run_view(f,*args,**kwargs)
  File "/opt/python3.6/lib/python3.6/site-packages/flask_admin/base.py",line 368,in _run_view
    return fn(self,**kwargs)
  File "/opt/python3.6/lib/python3.6/site-packages/flask_login/utils.py",line 258,in decorated_view
    return func(*args,**kwargs)
  File "/usr/local/lib/airflow/airflow/www/utils.py",line 290,in wrapper
    return f(*args,**kwargs)
  File "/usr/local/lib/airflow/airflow/www/views.py",line 1335,in clear
    include_upstream=upstream)
  File "/usr/local/lib/airflow/airflow/models/dag.py",line 1243,in sub_dag
    for t in regex_match + also_include}
  File "/usr/local/lib/airflow/airflow/models/dag.py",in <dictcomp>
    for t in regex_match + also_include}
  File "/opt/python3.6/lib/python3.6/copy.py",line 161,in deepcopy
    y = copier(memo)
  File "/usr/local/lib/airflow/airflow/models/baseoperator.py",line 678,in __deepcopy__
    setattr(result,k,copy.deepcopy(v,memo))
  File "/opt/python3.6/lib/python3.6/copy.py",line 180,in deepcopy
    y = _reconstruct(x,memo,*rv)
  File "/opt/python3.6/lib/python3.6/copy.py",line 280,in _reconstruct
    state = deepcopy(state,memo)
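  File "/opt/python3.6/lib/python3.6/copy.py",line 150,in deepcopy
    y = copier(x,memo)
  File "/opt/python3.6/lib/python3.6/copy.py",line 240,in _deepcopy_dict
    y[deepcopy(key,memo)] = deepcopy(value,memo)
  File "/opt/python3.6/lib/python3.6/copy.py",line 150,in deepcopy
    y = copier(x,memo)
  File "/opt/python3.6/lib/python3.6/copy.py",line 215,in _deepcopy_list
    append(deepcopy(a,memo))
  File "/opt/python3.6/lib/python3.6/copy.py",line 169,in deepcopy
    rv = reductor(4)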
TypeError: can't pickle _thread.RLock objects

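What have I tried?

  1. Clearing the task from the Airflow UI: did not work
  2. Running it from the command line, e.g. with the backfill command: did not work
  3. Restarting the Airflow web server: did not work
  4. Changing the DAG to retry_delay = timedelta(seconds=5)

Can anyone help with this? Many thanks.

I found similar questions on StackOverflow, but none of them was really resolved: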

Airflow can't pickle _thread._local objects

Airflow 1.9.0 ExternalTaskSensor retry_delay=30 yields TypeError: can't pickle _thread.RLock objects

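Solution

I believe a similar issue has already been reported on the Apache Jira tracker. Going through the discussion threads there, I can point out a few things that may help you overcome this problem:

  • I would suggest going through the specific DAG and checking that the default arguments of its operators have the correct types. Although retry_delay has already been checked, it is worth reviewing the rest of the parameters; the link is already mentioned in the question.

  • To debug further, verify that your DAG operators consume only picklable (serializable) objects, as per the comment posted here.

  • We still receive reports from users about problems when clearing Airflow DAG tasks through the Airflow web UI; just check this thread. To mitigate the issue, you can either clear the failed task with the Airflow command-line tool (example here) or, as a last resort, remove the task_id record from the Airflow metadata database.

    Connect to one of the Composer workers: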

    kubectl exec -it $(kubectl get po -l run=airflow-worker -o jsonpath='{.items[0].metadata.name}' \
        -n $(kubectl get ns | grep composer | awk '{print $1}')) -n $(kubectl get ns | grep composer | awk '{print $1}') \
        -c airflow-worker -- mysql -u root -h airflow-sqlproxy-service.default
    

    Using the mysql client:

    mysql> show databases;
    +-----------------------------------------+
    | Database                                |
    +-----------------------------------------+
    | information_schema                      |
    | composer-1-11-3-airflow-1-10-6-*        |
    | mysql                                   |
    | performance_schema                      |
    | sys                                     |
    +-----------------------------------------+
    5 rows in set (0.01 sec)   
    

    Switch to the composer-1-11-3-airflow-1-10-6-* schema:

    mysql> use composer-1-11-3-airflow-1-10-6-*;

    Delete the failed task_id:

    delete from task_instance where task_id='<task_id>' AND execution_date='<execution_date>';
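
The first two points (argument types and picklable objects) can be checked outside the web UI. The traceback shows the failure coming from copy.deepcopy walking an operator attribute (a dict that holds a list that holds an RLock), so a quick way to locate the culprit is to deep-copy each attribute on its own. Below is a minimal sketch of that idea; FakeOperator and find_uncopyable_attrs are hypothetical names I made up for illustration and are not part of Airflow.

```python
import copy
import threading
from datetime import timedelta


class FakeOperator:
    """Hypothetical stand-in for an operator; NOT a real Airflow class."""

    def __init__(self, task_id, retry_delay, params):
        self.task_id = task_id
        self.retry_delay = retry_delay
        self.params = params


def find_uncopyable_attrs(obj):
    """Return {attribute name: error text} for attributes copy.deepcopy rejects."""
    problems = {}
    for name, value in vars(obj).items():
        try:
            copy.deepcopy(value)
        except Exception as exc:  # locks and similar objects raise TypeError here
            problems[name] = "{}: {}".format(type(exc).__name__, exc)
    return problems


# An operator whose arguments are plain data copies without complaint.
good = FakeOperator("ok_task", timedelta(seconds=5), {"table": "events"})

# An operator carrying a lock inside a nested dict/list, as in the traceback.
bad = FakeOperator("bad_task", timedelta(seconds=5), {"clients": [threading.RLock()]})

print(find_uncopyable_attrs(good))
print(find_uncopyable_attrs(bad))
```

With a real DAG, the same helper could presumably be run over each task in dag.tasks after importing the DAG file in a Python shell; any attribute it reports (a client, a connection, a logger, and so on) is a candidate for the object that holds the lock.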