Question
I need to create a table in Hive that contains the IDs having 2 or more records with 100 days or more. How can I do this in Hive?
Answer
You can use a window function to count, for each ID, the rows where days is 100 or more:
select t.*
from (select t.*,
             sum(case when days >= 100 then 1 else 0 end) over (partition by id) as cnt_100pl
      from t
     ) t
where cnt_100pl >= 2;
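You can use a window function as Gordon suggested in his answer. You can also do this with a correlated subquery like the one below (assuming the table is named my_table).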
Select t1.*
from my_table t1
where 2 <= (Select count(1) from my_table t2 where t2.id = t1.id and t2.days >= 100);
So the complete query would be:
Create table my_target_table
As
Select t1.*
from my_table t1
where 2 <= (Select count(1) from my_table t2 where t2.id = t1.id and t2.days >= 100);
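If the target table only needs one row per qualifying ID rather than every source row, a plain aggregation may be enough. This is a sketch of an alternative, not something proposed in the answers above; the table name my_target_ids is hypothetical, and my_table is the same assumed source table.

```sql
-- Hypothetical target table holding only the qualifying ids
create table my_target_ids
as
select id
from my_table
group by id
-- keep ids that have at least 2 rows with days >= 100
having sum(case when days >= 100 then 1 else 0 end) >= 2;
```

This avoids both the window function and the correlated subquery, at the cost of dropping the other columns.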