Problem description
I get the following error when running an Apache Beam pipeline. The full error output is:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-12-870f9c2f41e5> in <module>
39 file_path_prefix=os.path.join(OUTPUT_DIR,'ptp-dataset.csv'))))
40
---> 41 preprocess()
<ipython-input-12-870f9c2f41e5> in preprocess()
22 'requirements_file': 'requirements.txt'
23 }
---> 24 opts = beam.options.pipeline_options.PipelineOptions(flags=[],**options)
25 RUNNER = 'DataflowRunner' # 'DirectRunner'
26
AttributeError: module 'apache_beam' has no attribute 'options'
The error is raised by the line where I try to call the PipelineOptions class:
opts = beam.pipeline.PipelineOptions(flags=[],**options)
RUNNER = 'DataflowRunner' # 'DirectRunner'
Solution
To resolve this, use pip to install the latest version of apache-beam:
pip install --upgrade "apache-beam[gcp]"
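Restart the kernel so the notebook picks up the newly installed package, then reference the class through options.pipeline_options.PipelineOptions. In this case, change the call to: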
opts = beam.options.pipeline_options.PipelineOptions(flags=[],**options)
RUNNER = 'DataflowRunner'
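
If the error persists after restarting, it may help to confirm that the kernel actually sees the upgraded package. The following is a minimal sketch using only the standard library; the helper names installed_version and module_available are hypothetical and not part of Apache Beam, and it assumes Python 3.8 or later for importlib.metadata.

```python
import importlib.util
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_available(dotted_name):
    """Return True if the current interpreter can locate the dotted module path."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing.
        return False


# Run these in the same kernel that executes the pipeline code.
print("apache-beam version:", installed_version("apache-beam"))
print("pipeline_options found:",
      module_available("apache_beam.options.pipeline_options"))
```

If the version prints as None or the module is not found, pip probably installed into a different environment than the one the kernel uses. Once the module is found, importing the class directly with from apache_beam.options.pipeline_options import PipelineOptions avoids relying on attribute access through the top-level beam module.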