Collect_set with a CASE statement in Hive

Problem description

Is there a way to rewrite the CASE statement below so that, instead of writing collect_set(...)[0] four times, I get the same result with a single collect_set?

    select id,
           collect_set(name)[0] as name,
           sum(salary),
           CASE WHEN month(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0],'dd-MM-yyyy'))))
                     IN (01,02,03)
                THEN CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0],'dd-MM-yyyy'))))-1,'-'),substr(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0],'dd-MM-yyyy')))),3,4))
                ELSE CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0],'dd-MM-yyyy')))),'-'),SUBSTR(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0],'dd-MM-yyyy'))))+1,3,4))
           END as fy
    from testing_1.collect_set_test
    group by id;

I tried writing the query below.

    select collect_set(CASE WHEN month(to_date(from_unixtime(unix_timestamp(date1),'dd-MM-yyyy'))) 
    IN (01,03) THEN CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(date1),'dd-MM-yyyy')))-1,substr(year(to_date(from_unixtime(unix_timestamp(date1),'dd-MM-yyyy'))),4)) 
    ELSE
     CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(date1),SUBSTR(year(to_date(from_unixtime(unix_timestamp(date1),'dd-MM-yyyy')))+1,4))) [0]
     END as fy from testing_1.collect_set_test group by id;

But it fails with the error below:

    Failed: ParseException line 1:446 missing KW_END at ')' near ']' in selection target
    line 1:452 cannot recognize input near 'END' 'as' 'fy' in selection target

Can someone guide me on how to rewrite this?

Solution

Move all the aggregations, together with the GROUP BY and the date conversion, into a subquery, and compute fy in the outer query:

    select id,name,salary,
           CASE WHEN month(date1) IN (01,02,03)
                THEN CONCAT(CONCAT(year(date1)-1,'-'),substr(year(date1),3,4))
                ELSE CONCAT(CONCAT(year(date1),'-'),SUBSTR(year(date1)+1,3,4))
           END as fy
    from
         -- aggregate once per id and convert the 'dd-MM-yyyy' string to a date
         (select to_date(from_unixtime(unix_timestamp(collect_set(date1)[0],'dd-MM-yyyy'))) as date1,
                 collect_set(name)[0] as name,
                 sum(salary) as salary,
                 id
            from testing_1.collect_set_test
           group by id) s
    ;
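
As for the attempt in the question: there the `[0]` sits inside the ELSE branch, before `END`, and the opening parenthesis of `collect_set(` is never closed, which appears to be what the parser is complaining about. The format string is also passed to `from_unixtime` rather than to `unix_timestamp`. If you do want a single `collect_set` around the whole CASE, a rough sketch could look like the one below. It assumes `date1` is a string in `'dd-MM-yyyy'` format, as in the question; the alias `d`, the subquery alias `t`, and the explicit casts are my own additions and not part of the original query.

```sql
-- Sketch only: convert the date per row first, then wrap the entire CASE
-- in one collect_set and apply [0] after the closing parenthesis.
select id,
       collect_set(name)[0] as name,
       sum(salary)          as salary,
       collect_set(
           CASE WHEN month(d) IN (1,2,3)
                -- Jan to Mar belong to the fiscal year that started the previous year
                THEN concat(cast(year(d)-1 as string), '-', substr(cast(year(d)   as string), 3, 2))
                ELSE concat(cast(year(d)   as string), '-', substr(cast(year(d)+1 as string), 3, 2))
           END
       )[0] as fy
from (select id, name, salary,
             -- d is a hypothetical alias for the parsed date
             to_date(from_unixtime(unix_timestamp(date1,'dd-MM-yyyy'))) as d
      from testing_1.collect_set_test) t
group by id;
```

Note that `collect_set(...)[0]` picks an arbitrary element, so if one id has dates in more than one fiscal year, this variant and the subquery version above are not guaranteed to return the same fy.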