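Problem description
I cannot find a way to use PolyBase to create an external table in Azure SQL Data Warehouse (Synapse SQL pool) when some of the fields contain embedded commas.
Take a CSV file with the following 4 columns: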
myresourcename,myresourcelocation,"""resourceVersion"": ""windows"",""deployedBy"": ""john"",""project_name"": ""test_project""","{ ""ResourceType"": ""Network"",""programName"": ""v1""}"
I tried the following CREATE EXTERNAL FILE FORMAT and CREATE EXTERNAL TABLE statements.
CREATE EXTERNAL FILE FORMAT my_format
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2
    )
);
CREATE EXTERNAL TABLE my_external_table
(
    resourceName VARCHAR,
    resourceLocation VARCHAR,
    resourceTags VARCHAR,
    resourceDetails VARCHAR
)
WITH (
    LOCATION = 'my/location/',
    DATA_SOURCE = my_source,
    FILE_FORMAT = my_format
);
However, querying this table fails with the following error:
Failed to execute query. Error: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Too many columns in the line.
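Any help would be greatly appreciated.
Solution
PolyBase does not currently support this. The input data has to be modified accordingly before the external table can read it.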
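One possible way to modify the input, offered as a sketch rather than a confirmed fix: pre-process the file outside PolyBase and rewrite it with a field terminator that never occurs in the data. The Python snippet below parses the row with the standard `csv` module (which understands quoted fields and doubled quotes) and writes it back out pipe-delimited with no quoting. The function name `rewrite_delimiter`, the choice of `|`, and the `SAMPLE_ROW` constant are assumptions made for illustration. They do not come from the original post.

```python
import csv
import io

# The sample row from the question, copied verbatim.
SAMPLE_ROW = r'myresourcename,myresourcelocation,"""resourceVersion"": ""windows"",""deployedBy"": ""john"",""project_name"": ""test_project""","{ ""ResourceType"": ""Network"",""programName"": ""v1""}"'


def rewrite_delimiter(src_text: str, new_delimiter: str = "|") -> str:
    """Re-emit CSV text using new_delimiter and no quoting.

    Hypothetical helper: it assumes new_delimiter and line breaks never
    appear inside a field, and raises ValueError if they do.
    """
    reader = csv.reader(io.StringIO(src_text, newline=""))
    out_lines = []
    for row_number, row in enumerate(reader, start=1):
        for field in row:
            # Without quoting, these characters would corrupt the output.
            if new_delimiter in field or "\n" in field or "\r" in field:
                raise ValueError(
                    f"row {row_number}: field contains the new delimiter "
                    f"or a line break: {field!r}"
                )
        out_lines.append(new_delimiter.join(row))
    return "\n".join(out_lines) + "\n"


if __name__ == "__main__":
    print(rewrite_delimiter(SAMPLE_ROW + "\n"), end="")
```

With a file rewritten this way, the external file format would presumably use FIELD_TERMINATOR = '|' and omit STRING_DELIMITER, so the embedded commas and double quotes arrive as ordinary characters. This has not been tested against a Synapse SQL pool here, so check it on a small file first.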