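Problem description
In my scenario, each parent element has one "Name" tag. The parent element repeats, and each one holds multiple entries under its "Value" tag. I want to parse every [Name, Value] pair and display one pair per row.
Sample XML and the expected output are shown below: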
<ParentArray>
<ParentFieldArray>
<Name>ABCD</Name>
<Value>
<string>111</string>
<string>222</string>
<string>333</string>
</Value>
</ParentFieldArray>
<ParentFieldArray>
<Name>EFGH</Name>
<Value>
<string>444</string>
<string>555</string>
</Value>
</ParentFieldArray>
</ParentArray>
Name Value
ABCD 111
ABCD 222
ABCD 333
EFGH 444
EFGH 555
Here the "ParentFieldArray" element repeats, and the number of entries under "Value" varies from one element to the next.
Query attempted:
select Name,Value from <table_name> -- "xmlinfo" field in this table includes the above XML content
LATERAL VIEW POSEXPLODE(XPATH(xmlinfo,'ParentArray/ParentFieldArray/Name/text()')) NM as Name_pos,Name
LATERAL VIEW POSEXPLODE(XPATH(xmlinfo,'ParentArray/ParentFieldArray/Value/string/text()')) VL as Value_pos,Value;
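I tried building the query with LATERAL VIEW POSEXPLODE(XPATH(..)), but it does not work. Each XPATH call returns a flat array for the whole document, so the position of a value says nothing about which parent it came from, and I cannot map each value to the correct name. The result is a cross join of all names with all values.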
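Solution
Get each name and pass it into a second XPATH call, so that the call selects only the ParentFieldArray containing that name. This relies on the names being unique across ParentFieldArray elements and not containing a double quote.
Demo: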
with your_data as (
select '<ParentArray>
<ParentFieldArray>
<Name>ABCD</Name>
<Value>
<string>111</string>
<string>222</string>
<string>333</string>
</Value>
</ParentFieldArray>
<ParentFieldArray>
<Name>EFGH</Name>
<Value>
<string>444</string>
<string>555</string>
</Value>
</ParentFieldArray>
</ParentArray>' as xmlinfo
)
select name,value
from your_data d
lateral view outer explode(XPATH(xmlinfo,'ParentArray/ParentFieldArray/Name/text()')) pf as Name
lateral view outer explode(XPATH(xmlinfo,concat('ParentArray/ParentFieldArray[Name="',pf.Name,'"]/Value/string/text()'))) vl as value
Result:
name value
ABCD 111
ABCD 222
ABCD 333
EFGH 444
EFGH 555
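If the same name can appear in more than one ParentFieldArray, filtering by name would merge their values. A possible variant, sketched here and not part of the original answer, selects each parent by position instead: POSEXPLODE yields a 0-based index for every name, and XPath positional predicates are 1-based, so the index plus one is passed to the second XPATH. This assumes every ParentFieldArray has exactly one Name, so that the position of a name in the array matches the position of its parent element.

```sql
-- Sketch: pair names and values by parent position rather than by name.
-- Assumption: each ParentFieldArray contains exactly one <Name>.
with your_data as (
  select '<ParentArray>
<ParentFieldArray><Name>ABCD</Name><Value><string>111</string><string>222</string><string>333</string></Value></ParentFieldArray>
<ParentFieldArray><Name>EFGH</Name><Value><string>444</string><string>555</string></Value></ParentFieldArray>
</ParentArray>' as xmlinfo
)
select pf.name, vl.value
from your_data d
-- pos is 0-based: 0 for the first ParentFieldArray, 1 for the second, ...
lateral view outer posexplode(
  xpath(d.xmlinfo, 'ParentArray/ParentFieldArray/Name/text()')
) pf as pos, name
-- XPath positions are 1-based, hence pos + 1
lateral view outer explode(
  xpath(d.xmlinfo,
        concat('ParentArray/ParentFieldArray[', cast(pf.pos + 1 as string), ']/Value/string/text()'))
) vl as value;
```

For the sample XML this should produce the same five rows as the name-based query above.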