AttributeError: module 'apache_beam' has no attribute 'options'

Problem description

I get the following error when running an Apache Beam pipeline. The full traceback is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-870f9c2f41e5> in <module>
     39                  file_path_prefix=os.path.join(OUTPUT_DIR,'ptp-dataset.csv'))))
     40 
---> 41 preprocess()

<ipython-input-12-870f9c2f41e5> in preprocess()
     22       'requirements_file': 'requirements.txt'
     23     }
---> 24     opts = beam.options.pipeline_options.PipelineOptions(flags=[],**options)
     25     RUNNER = 'DataflowRunner' # 'DirectRunner'
     26 

AttributeError: module 'apache_beam' has no attribute 'options'

The error is raised when I try to instantiate the PipelineOptions class:

 opts = beam.pipeline.PipelineOptions(flags=[],**options)
 RUNNER = 'DataflowRunner' # 'DirectRunner'

Solution

To fix this, use pip to install the latest version of apache-beam:

pip install --upgrade 'apache-beam[gcp]'

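Restart the kernel so that the newly installed package is loaded, then refer to the class via options.pipeline_options.PipelineOptions. In this case, change the line to: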

opts = beam.options.pipeline_options.PipelineOptions(flags=[],**options)
RUNNER = 'DataflowRunner'
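
If the error persists after restarting, it may help to confirm which apache-beam version the kernel actually sees, and to import the class directly rather than reaching it through the beam alias. The following is a minimal sketch, not a definitive fix. The helper installed_version and the values in the options dictionary are hypothetical placeholders that do not come from the original post.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


beam_version = installed_version("apache-beam")
print("apache-beam version seen by this kernel:", beam_version)

if beam_version is not None:
    # Importing the class directly avoids relying on attribute access
    # through the `beam` alias.
    from apache_beam.options.pipeline_options import PipelineOptions

    # Hypothetical placeholder values; substitute your own settings.
    options = {
        "project": "my-project",
        "job_name": "preprocess-example",
    }
    opts = PipelineOptions(flags=[], **options)
```

If the printed version is still the old one, the notebook kernel is probably using a different Python environment from the one pip installed into.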