Problem description
I have a column that contains data in the following format:
array(row(action varchar,actor varchar,special_notes varchar,timestamp bigint))
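The array is guaranteed to have one or more elements. The arrays are not guaranteed to have the same length.
Let's call it "my_array_row_column". For example, one row of this column looks like this: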
[{action=cast_role,actor=Morgan.Freeman,special_notes=null,timestamp=1611616961958},{action=note_create,actor=employee@example.com,timestamp=1611617308492},{action=dismissed,actor=newhire@example.com,special_notes=NA,timestamp=1611617308512}]
I tried using CROSS JOIN UNNEST(my_array_row_column), but it ends up returning only the first row() in the array. This is the latest query I tried:
SELECT unnested
FROM athena.movies
CROSS JOIN UNNEST(my_array_row_column) AS t(unnested)
To my frustration, it only returns
unnested
---------------------------------------------------------
{action=cast_role,actor=MorganFreeman,timestamp=1611616961958}
when I want it to return all three elements (Morgan.Freeman, employee, newhire) as separate rows in the result, like this:
unnested
---------------------------------------------------------
{action=cast_role,timestamp=1611616961958}
---------------------------------------------------------
{action=note_create,timestamp=1611617308492}
---------------------------------------------------------
{action=dismissed,timestamp=1611617308512}
Any ideas on how I can achieve this?
Solution
According to the documentation, this is the correct pattern:
WITH dataset AS (
SELECT ARRAY[
CAST(ROW('Bob',38) AS ROW(name VARCHAR,age INTEGER)),
CAST(ROW('Alice',35) AS ROW(name VARCHAR,age INTEGER)),
CAST(ROW('Jane',27) AS ROW(name VARCHAR,age INTEGER))
] AS users
)
SELECT * FROM dataset
which returns
+-----------------------------------------------------------------+
| users |
+-----------------------------------------------------------------+
| [{NAME=Bob,AGE=38},{NAME=Alice,AGE=35},{NAME=Jane,AGE=27}] |
+-----------------------------------------------------------------+
To unnest it, this would be:
WITH dataset AS (
SELECT ARRAY[
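CAST(ROW('Bob',38) AS ROW(name VARCHAR,age INTEGER)),
CAST(ROW('Alice',35) AS ROW(name VARCHAR,age INTEGER)),
CAST(ROW('Jane',27) AS ROW(name VARCHAR,age INTEGER))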
] AS users
)
SELECT unnested
FROM dataset,UNNEST(users) t(unnested)
which should return
+---------------------+
| unnested |
+---------------------+
| {NAME=Bob,AGE=38} |
| {NAME=Alice,AGE=35}|
| {NAME=Jane,AGE=27} |
+---------------------+
In your case:
WITH dataset AS (
SELECT my_array_row_column AS movieDetail
from athena.movies
)
SELECT unnested
FROM dataset,UNNEST(movieDetail) AS t(unnested)
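Or something along those lines. I think your cross join is unnecessary, because you are not bringing in any fields at the same granularity as the key of the athena.movies table, so there is nothing to multiply.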
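For reference, here is a minimal sketch of the same unnest with the join written out explicitly and the row fields pulled out by name. It reuses the Bob/Alice/Jane sample data from this answer rather than your table. It assumes the engine exposes each array element as a single row-valued column, as in the output shown in this answer; on engines that instead expand row fields into separate columns, the alias list would need one name per field, e.g. t(name, age). The aliases t and unnested are arbitrary.

```sql
-- Sample data: a single table row holding an array of three ROW values.
WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
-- Assumption: UNNEST yields one row-typed column (aliased "unnested") per element.
SELECT
  unnested.name AS name,  -- dot notation reads a named field of the ROW
  unnested.age AS age
FROM dataset
CROSS JOIN UNNEST(users) AS t(unnested)
```

In Presto-style SQL the comma form FROM dataset, UNNEST(users) is an implicit cross join, so either spelling should produce one output row per array element.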