ValueError: root_directory must be an absolute path: error accessing a directory in ADLS from a Synapse workspace

Problem Description

In Apache Spark, the following error occurs when attempting to access an ADLS directory with PySpark:

ValueError: root_directory must be an absolute path. Got abfss://root@adlspretbiukadlsdev.dfs.core.windows.net/RAW/LANDING/ instead.
Traceback (most recent call last):

  File "/home/trusted-service-user/cluster-env/env/lib/python3.6/site-packages/great_expectations/core/usage_statistics/usage_statistics.py", line 262, in usage_statistics_wrapped_method
    result = func(*args, **kwargs)

The code that triggers the error above when accessing the directory is:

data_context_config = DataContextConfig(
    datasources={"my_spark_datasource": my_spark_datasource_config},
    store_backend_defaults=FilesystemStoreBackendDefaults(
        root_directory='abfss://root@adlspretbiukadlsdev.dfs.core.windows.net/RAW/LANDING/'
    ),
)

context = BaseDataContext(project_config=data_context_config)
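The ValueError suggests that FilesystemStoreBackendDefaults expects a local filesystem path, not a URI: a check along the lines of os.path.isabs() (an assumption inferred from the message; the library's internals may differ) treats an abfss:// URI as a relative path. A minimal sketch of that behavior:

```python
import os.path

# abfss:// is a URI, not a local filesystem path. On POSIX, os.path.isabs()
# only checks for a leading "/", so an absolute-path check like the one the
# ValueError suggests (an assumption, not the library's verified internals)
# rejects the URI as-is.
uri = "abfss://root@adlspretbiukadlsdev.dfs.core.windows.net/RAW/LANDING/"
print(os.path.isabs(uri))        # False: no leading "/"
print(os.path.isabs("/" + uri))  # True, but "/abfss:" is not a real directory
```

This also hints at why prefixing the URI with "/" (tried below) changes the error: the path then passes the absolute-path check, but the backend goes on to treat "/abfss:" as a literal directory under the filesystem root.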

When I change the code to

data_context_config = DataContextConfig(
    datasources={"my_spark_datasource": my_spark_datasource_config},
    store_backend_defaults=FilesystemStoreBackendDefaults(
        root_directory='/abfss://root@adlspretbiukadlsdev.dfs.core.windows.net/RAW/LANDING/'
    ),
)

I receive the following error message:

PermissionError: [Errno 13] Permission denied: '/abfss:'
Traceback (most recent call last):

And when I use the following code

data_context_config = DataContextConfig(
    datasources={"my_spark_datasource": my_spark_datasource_config},
    store_backend_defaults=FilesystemStoreBackendDefaults(
        root_directory='/'
    ),
)

context = BaseDataContext(project_config=data_context_config)

I receive the error message:

PermissionError: [Errno 13] Permission denied: '/expectations'
Traceback (most recent call last):

However, I do not have a directory named "/expectations".
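The "/expectations" in the error need not be a pre-existing directory: with root_directory='/', the store backend presumably derives its store locations by joining subdirectory names such as "expectations" onto the root and then tries to create them, which fails at the filesystem root without elevated permissions (an assumption inferred from the error message, not verified against the library's source). A sketch of the path derivation:

```python
import os.path

# With root_directory="/", a store path joined from the root and a store
# subdirectory name resolves to "/expectations"; attempting to create that
# directory at the filesystem root raises PermissionError (Errno 13).
# The exact join logic is an assumption based on the error message.
root_directory = "/"
store_path = os.path.join(root_directory, "expectations")
print(store_path)  # /expectations
```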

For context, I am trying to run Great Expectations.

Solution

The developers of Great Expectations informed me that this error will be fixed in an upcoming release of Great Expectations.