Problem description
I want to get the number of unique customers. I have a PostgreSQL query for reference. Could you convert this query to Hive SQL?
SELECT
COUNT(user_id) AS Total_profiles, COUNT(DISTINCT user_id) FILTER (WHERE age BETWEEN 12 AND 18) AS age_less_than_20
FROM
customer_profiles
WHERE
profile_date BETWEEN '2020-01-01' AND '2020-12-31'
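Solution
Hive has no FILTER clause for aggregates, so move the condition into a CASE expression. COUNT ignores NULLs, so rows that fail the condition are not counted: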
SELECT
COUNT(user_id) AS Total_profiles, COUNT(DISTINCT CASE WHEN age BETWEEN 12 AND 18 THEN user_id ELSE NULL END) AS age_less_than_20
FROM
customer_profiles
WHERE
profile_date BETWEEN '2020-01-01' AND '2020-12-31'
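Another way to count distinct values is size(collect_set()). collect_set() returns an array of the unique non-NULL values, and size() returns its length: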
SELECT
COUNT(user_id) AS Total_profiles, size(collect_set(CASE WHEN age BETWEEN 12 AND 18 THEN user_id ELSE NULL END)) AS age_less_than_20
FROM
customer_profiles
WHERE
profile_date BETWEEN '2020-01-01' AND '2020-12-31'
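To check that the two forms agree, here is a minimal sketch that runs both against a few inline rows. The sample data is made up for illustration, and the CTE simply stands in for the real customer_profiles table. It assumes the intended age range is 12 to 18, since BETWEEN expects the lower bound first.

```sql
-- Hypothetical sample rows; the CTE shadows the real customer_profiles table.
WITH customer_profiles AS (
  SELECT 1 AS user_id, 15 AS age, '2020-03-01' AS profile_date
  UNION ALL SELECT 1, 15, '2020-04-01'   -- same user again, inside the age range
  UNION ALL SELECT 2, 30, '2020-05-01'   -- outside the age range
  UNION ALL SELECT 3, 17, '2021-01-01'   -- outside the date range
)
SELECT
  COUNT(user_id) AS Total_profiles,
  -- CASE yields NULL for non-matching rows, and COUNT skips NULLs
  COUNT(DISTINCT CASE WHEN age BETWEEN 12 AND 18 THEN user_id ELSE NULL END) AS distinct_via_count,
  -- collect_set also skips NULLs and keeps each value once
  size(collect_set(CASE WHEN age BETWEEN 12 AND 18 THEN user_id ELSE NULL END)) AS distinct_via_collect_set
FROM
  customer_profiles
WHERE
  profile_date BETWEEN '2020-01-01' AND '2020-12-31';
-- Expected: Total_profiles = 3, distinct_via_count = 1, distinct_via_collect_set = 1
```

Three rows fall in 2020, and only user 1 is in the age range, so both distinct counts come out as 1 even though that user appears twice.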