How to improve ClickHouse query speed

Problem description

I want to run a query in ClickHouse. The query includes GROUP BY QJTD1, but QJTD1 is obtained through a dictionary lookup. The statement is as follows:

`SELECT
    IF(
        sale_mode = 'owner',
        dictGetString('dict.dict_sku', 'dept_id_1', toUInt64OrZero(sku_id)),
        dictGetString('dict.dict_shop', toUInt64OrZero(shop_id))
    ) AS QJTD1,
    brand_cd,
    coalesce(uniq(sd_deal_ord_user_num), 0) AS sd_deal_ord_user_num,
    0 AS item_uv,
    dt
FROM app.test_all
WHERE dt >= '2020-11-01'
  AND dt <= '2020-11-30'
  AND IF(
        sale_mode = 'owner', 'bu_id', toUInt64OrZero(shop_id)
      )
  ) = '1727'
GROUP BY QJTD1, dt
ORDER BY item_pv DESC
LIMIT 0, 100`

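The QJTD1 values are heavily skewed, which makes the query slow. I tried to speed it up by tuning the index (the index is: sku_id, shop_id....), but it had no effect. How can I improve query efficiency?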

Solution

ClickHouse always evaluates both branches of IF (then and else), so both dictionary lookups run for every row.

You can use a two-stage GROUP BY:

select IF( sale_mode ='owner',... as QJTD1
from (  
  select owner,sku_id,dept_id_1,....
  ...
  group by owner,dept_id_1
  )
group by QJTD1
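
One possible reading of the two-stage idea, as a minimal sketch rather than a tested query: the inner query groups by the raw columns and keeps an intermediate `uniqState`, so the outer query only calls the dictionaries once per distinct key combination instead of once per row. The grouping keys, the alias `user_state`, and the `'dept_id_1'` attribute name for `dict.dict_shop` are assumptions (the original query does not show that attribute), and the `bu_id` filter is left out because it is incomplete in the question.

```sql
-- Stage 2: dictionary lookups run on the pre-aggregated rows only.
SELECT
    IF(
        sale_mode = 'owner',
        dictGetString('dict.dict_sku', 'dept_id_1', toUInt64OrZero(sku_id)),
        -- attribute name below is assumed; the question omits it
        dictGetString('dict.dict_shop', 'dept_id_1', toUInt64OrZero(shop_id))
    ) AS QJTD1,
    dt,
    -- merge the partial uniq states produced by stage 1
    uniqMerge(user_state) AS sd_deal_ord_user_num
FROM
(
    -- Stage 1: group by raw columns, no dictionary access here.
    SELECT
        sale_mode,
        sku_id,
        shop_id,
        dt,
        uniqState(sd_deal_ord_user_num) AS user_state
    FROM app.test_all
    WHERE dt >= '2020-11-01'
      AND dt <= '2020-11-30'
    GROUP BY sale_mode, sku_id, shop_id, dt
)
GROUP BY QJTD1, dt
```

`uniqState` and `uniqMerge` are used because a plain `uniq` result cannot be re-aggregated correctly across groups. Whether this helps depends on how many distinct (sale_mode, sku_id, shop_id, dt) combinations exist compared with the number of rows.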

Or define the dictionary attribute with <injective>true</injective>:

https://clickhouse.tech/docs/en/sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure/

Flag that shows whether the id -> attribute image is injective.
If true, ClickHouse can automatically place after the GROUP BY
clause the requests to dictionaries with injection. Usually it 
significantly reduces the amount of such requests.

Default value: false.

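This only applies if the attributes really are injective, that is, if different keys never map to the same attribute value.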
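
For a dictionary created with DDL instead of XML, the same flag can be written with the `INJECTIVE` keyword on the attribute. The following is only a sketch: the source settings (`HOST`, `DB 'dim'`, `TABLE 'sku'`), the layout, and the lifetime are hypothetical placeholders, and the flag should be set only when the mapping from `sku_id` to `dept_id_1` is truly one-to-one.

```sql
-- Hypothetical DDL definition; source, layout and lifetime are placeholders.
CREATE DICTIONARY dict.dict_sku
(
    sku_id    UInt64,
    -- INJECTIVE tells ClickHouse that distinct keys give distinct values,
    -- which lets it move the dictGet call after GROUP BY.
    dept_id_1 String DEFAULT '' INJECTIVE
)
PRIMARY KEY sku_id
SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' PASSWORD '' DB 'dim' TABLE 'sku'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 600);
```

If many SKUs share one department, the attribute is not injective and marking it as such can produce wrong GROUP BY results.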

I would also test UNION ALL, so that each part of the query evaluates only one branch of the IF.
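
A sketch of the UNION ALL variant, under a few assumptions: the `'dept_id_1'` attribute name for `dict.dict_shop` is a guess (the question does not show it), `sale_mode` is assumed to be non-nullable so that the two filters cover all rows, and the incomplete `bu_id` filter from the question is omitted.

```sql
SELECT
    QJTD1,
    dt,
    uniq(sd_deal_ord_user_num) AS sd_deal_ord_user_num
FROM
(
    -- Rows sold by the owner: only the SKU dictionary is queried.
    SELECT
        dictGetString('dict.dict_sku', 'dept_id_1', toUInt64OrZero(sku_id)) AS QJTD1,
        dt,
        sd_deal_ord_user_num
    FROM app.test_all
    WHERE dt >= '2020-11-01'
      AND dt <= '2020-11-30'
      AND sale_mode = 'owner'

    UNION ALL

    -- All other rows: only the shop dictionary is queried.
    SELECT
        -- attribute name below is assumed; the question omits it
        dictGetString('dict.dict_shop', 'dept_id_1', toUInt64OrZero(shop_id)) AS QJTD1,
        dt,
        sd_deal_ord_user_num
    FROM app.test_all
    WHERE dt >= '2020-11-01'
      AND dt <= '2020-11-30'
      AND sale_mode != 'owner'
)
GROUP BY QJTD1, dt
```

Each subquery scans the table separately, so it is worth comparing timings against the original query before adopting this form.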