Problem description
My question is similar to this one (Athena/Presto - UNNEST MAP to columns), but in my case I know in advance which columns I need.
My use case is this:
{
  "reqId" : "1234","clientId" : "client","response" : [
    {
      "name" : "Susan","projects" : [
        {
          "name" : "project1","completed" : true
        },{
          "name" : "project2","completed" : false
        }
      ]
    },{
      "name" : "Adams","projects" : [
        {
          "name" : "project1","completed" : true
        },{
          "name" : "project2","completed" : false
        }
      ]
    }
  ]
}
The output I want:
name   | project   | completed
-------+-----------+-----------
Susan  | project1  | true
Susan  | project2  | false
Adams  | project1  | true
Adams  | project2  | false
This is the query I tried:
WITH dataset AS (
  SELECT 'Susan' AS name,
    transform(
      filter(
        CAST(json_extract('{"projects": [{"name":"project1","completed":false},{"name":"project3","completed":false},{"name":"project2","completed":true}]}','$.projects') AS ARRAY<MAP<VARCHAR,VARCHAR>>),
        p -> (p['name'] != 'project1')),
      p -> ROW(map_values(p))) AS projects
)
SELECT * FROM dataset
CROSS JOIN UNNEST(projects)
This is the output I get:
  name   projects                                               _col2
1 Susan  [{field0=[project3,false]},{field0=[project2,true]}]  {field0=[project3,false]}
2 Susan  [{field0=[project3,false]},{field0=[project2,true]}]  {field0=[project2,true]}
I basically want to unnest the key-value pairs of the map into separate columns. How can I do this in Presto/Athena?
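Solution
Your JSON example appears to be invalid: it is missing a comma after "name" : "Susan" and after "name" : "Adams". Apart from that, you can get the expected output with the query below. You need to UNNEST twice (once for the response array, once for each projects array) and do some casting: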
with dataset as
(
select json_parse('{"reqId" : "1234","clientId" : "client","response" : [{"name" : "Susan","projects" : [{"name" : "project1","completed" : true},{"name" : "project2","completed" : false}]},{"name" : "Adams","projects" : [{"name" : "project1","completed" : true},{"name" : "project2","completed" : false}]}]}') as json_col
),unnest_response as
(
select *
from dataset
cross join UNNEST(cast(json_extract(json_col,'$.response') as array<JSON>)) as t (response)
)
select
json_extract_scalar(response,'$.name') name,
json_extract_scalar(project,'$.name') project_name,
json_extract_scalar(project,'$.completed') project_completed
from unnest_response
cross join UNNEST(cast(json_extract(response,'$.projects') as array<JSON>)) as t (project);
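To make the row expansion easier to follow, here is a minimal sketch of the same flattening in plain Python rather than Presto. Each CROSS JOIN UNNEST corresponds to one nested loop. The function name flatten and the sample data are assumptions for illustration; in particular, Adams's projects are taken from the expected output table in the question.

```python
import json

# Sample document. Adams's projects are an assumption based on the
# expected output table in the question.
DOC = json.loads("""
{"reqId": "1234", "clientId": "client", "response": [
  {"name": "Susan", "projects": [
    {"name": "project1", "completed": true},
    {"name": "project2", "completed": false}]},
  {"name": "Adams", "projects": [
    {"name": "project1", "completed": true},
    {"name": "project2", "completed": false}]}
]}
""")


def flatten(doc):
    """Return one (name, project_name, completed) tuple per project."""
    rows = []
    # First UNNEST: one iteration per element of $.response.
    for response in doc.get("response", []):
        # Second UNNEST: one iteration per element of $.projects.
        # As with CROSS JOIN UNNEST, an entry without projects yields no rows.
        for project in response.get("projects", []):
            rows.append((response["name"], project["name"], project["completed"]))
    return rows


if __name__ == "__main__":
    for row in flatten(DOC):
        print(row)
```

One difference from the SQL version: json_extract_scalar returns the completed flag as a varchar ('true' or 'false'), while this sketch keeps it as a Python bool.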