这是一个
sqlFiddle,显示了我正在尝试做的事情.
这是@ lad2025 sqlFiddle更好地展示它
这是@ lad2025 sqlFiddle更好地展示它
在小提琴中,我展示了一个执行我想要做的事情的查询.但它很慢.
我有一个交叉表请求几乎完全相同,非常快,但几乎没有错误. (它会融合一些线条)
SELECT end_user_id,tms,coalesce(max(IN_VEHICLE),0) as IN_VEHICLE,coalesce(max(ON_BICYCLE),0) as ON_BICYCLE,coalesce(max(ON_FOOT),0) as ON_FOOT,coalesce(max(RUNNING),0) as RUNNING,coalesce(max(STILL),0) as STILL,coalesce(max(TILTING),0) as TILTING,coalesce(max(UNKNowN),0) as UNKNowN,coalesce(max(WALKING),0) as WALKING FROM crosstab ( 'SELECT end_user_id,type,max(confidence) FROM activities group by 1,2,3 ','SELECT disTINCT type FROM activities order by type' )as newtable ( end_user_id text,tms timestamp,IN_VEHICLE float,ON_BICYCLE float,ON_FOOT float,RUNNING float,STILL float,TILTING float,UNKNowN float,WALKING float ) GROUP BY end_user_id,tms ORDER BY end_user_id,tms
我不知道为什么postgres要求我使用GROUP BY end_user_id,最后是tms …它应该是唯一的.
另外我不知道为什么,但如果我不在交叉表查询中分组,我每个end_user_id只会有一行:(
如何更正该交叉表请求?
编辑:
@ lad2025响应是一个比我更好的例子,更优雅,我肯定更快.不过,我想知道如何使用交叉表进行操作.
解决方法
你可以避免在
Fiddle中使用交叉表/多重左连接,并使用简单的条件聚合:
SELECT end_user_id,COALESCE(MAX(CASE WHEN type = 'IN_VEHICLE' THEN confidence END),0) AS IN_VEHICLE,COALESCE(MAX(CASE WHEN type = 'ON_BICYCLE' THEN confidence END),0) AS ON_BICYCLE,COALESCE(MAX(CASE WHEN type = 'ON_FOOT' THEN confidence END),0) AS ON_FOOT,COALESCE(MAX(CASE WHEN type = 'RUNNING' THEN confidence END),0) AS RUNNING,COALESCE(MAX(CASE WHEN type = 'STILL' THEN confidence END),0) AS STILL,COALESCE(MAX(CASE WHEN type = 'TILTING' THEN confidence END),0) AS TILTING,COALESCE(MAX(CASE WHEN type = 'UNKNowN' THEN confidence END),0) AS UNKNowN,COALESCE(MAX(CASE WHEN type = 'WALKING' THEN confidence END),0) AS WALKING FROM activities GROUP BY end_user_id,tms;
输出:
╔═══════════════════╦════════════════════════════╦═════════════╦═════════════╦══════════╦══════════╦════════╦══════════╦══════════╦═════════╗ ║ end_user_id ║ tms ║ in_vehicle ║ on_bicycle ║ on_foot ║ running ║ still ║ tilting ║ unkNown ║ walking ║ ╠═══════════════════╬════════════════════════════╬═════════════╬═════════════╬══════════╬══════════╬════════╬══════════╬══════════╬═════════╣ ║ 64e8394876a5b7f1 ║ October,28 2015 08:24:20 ║ 21 ║ 8 ║ 2 ║ 0 ║ 2 ║ 0 ║ 68 ║ 2 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:24:41 ║ 15 ║ 0 ║ 3 ║ 0 ║ 72 ║ 0 ║ 10 ║ 3 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:25:17 ║ 5 ║ 0 ║ 5 ║ 0 ║ 77 ║ 100 ║ 13 ║ 5 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:25:32 ║ 0 ║ 0 ║ 0 ║ 0 ║ 100 ║ 0 ║ 0 ║ 0 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:25:36 ║ 0 ║ 0 ║ 0 ║ 0 ║ 92 ║ 0 ║ 8 ║ 0 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:27:24 ║ 48 ║ 48 ║ 0 ║ 0 ║ 0 ║ 0 ║ 5 ║ 0 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:27:54 ║ 0 ║ 0 ║ 0 ║ 0 ║ 0 ║ 100 ║ 0 ║ 0 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:28:11 ║ 62 ║ 8 ║ 3 ║ 0 ║ 15 ║ 0 ║ 13 ║ 3 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:28:53 ║ 35 ║ 0 ║ 6 ║ 0 ║ 37 ║ 0 ║ 23 ║ 6 ║ ║ 64e8394876a5b7f1 ║ October,28 2015 08:29:16 ║ 54 ║ 2 ║ 0 ║ 0 ║ 10 ║ 0 ║ 35 ║ 0 ║ ║ e86b0b91546194cc ║ October,28 2015 08:24:41 ║ 13 ║ 13 ║ 69 ║ 3 ║ 0 ║ 100 ║ 5 ║ 67 ║ ║ e86b0b91546194cc ║ October,28 2015 08:33:33 ║ 0 ║ 0 ║ 100 ║ 0 ║ 0 ║ 0 ║ 0 ║ 100 ║ ║ e86b0b91546194cc ║ October,28 2015 08:33:38 ║ 0 ║ 0 ║ 100 ║ 0 ║ 0 ║ 0 ║ 0 ║ 100 ║ ║ e86b0b91546194cc ║ October,28 2015 08:34:06 ║ 19 ║ 6 ║ 31 ║ 2 ║ 29 ║ 0 ║ 16 ║ 29 ║ ║ e86b0b91546194cc ║ October,28 2015 08:34:34 ║ 3 ║ 0 ║ 0 ║ 0 ║ 95 ║ 0 ║ 3 ║ 0 ║ ╚═══════════════════╩════════════════════════════╩═════════════╩═════════════╩══════════╩══════════╩════════╩══════════╩══════════╩═════════╝
COALESCE也是多余的(如果只允许正/零值):
SELECT end_user_id,MAX(CASE WHEN type = 'IN_VEHICLE' THEN confidence ELSE 0 END) AS IN_VEHICLE,MAX(CASE WHEN type = 'ON_BICYCLE' THEN confidence ELSE 0 END) AS ON_BICYCLE,MAX(CASE WHEN type = 'ON_FOOT' THEN confidence ELSE 0 END) AS ON_FOOT,MAX(CASE WHEN type = 'RUNNING' THEN confidence ELSE 0 END) AS RUNNING,MAX(CASE WHEN type = 'STILL' THEN confidence ELSE 0 END) AS STILL,MAX(CASE WHEN type = 'TILTING' THEN confidence ELSE 0 END) AS TILTING,MAX(CASE WHEN type = 'UNKNowN' THEN confidence ELSE 0 END) AS UNKNowN,MAX(CASE WHEN type = 'WALKING' THEN confidence ELSE 0 END) AS WALKING FROM activities GROUP BY end_user_id,tms;
您还可以考虑为类型列创建查找表,如activities_type(type_id,type_name),而不是直接存储在表字符串中(‘IN_VEHICLE’,’ON_BICYCLE’,…).
附录
我不是Postgresql Expert但经过一些游戏之后:
SELECT LEFT(end_user_id,strpos(end_user_id,'_')-1) AS end_user_id,RIGHT(end_user_id,LENGTH(end_user_id) - strpos(end_user_id,'_'))::timestamp AS tms,COALESCE(IN_VEHICLE,COALESCE(ON_BICYCLE,COALESCE(ON_FOOT,0) AS ON_FOOT,COALESCE(RUNNING,0) AS RUNNING,COALESCE(STILL,0) AS STILL,COALESCE(TILTING,0) AS TILTING,COALESCE("UNKNowN",0) AS "UNKNowN",COALESCE(WALKING,0) AS WALKING FROM crosstab( 'SELECT (end_user_id || ''_'' || tms) AS row_id,confidence FROM activities ORDER BY row_id,confidence','SELECT disTINCT type FROM activities order by type' ) AS newtable ( end_user_id text,IN_VEHICLE int,ON_BICYCLE int,ON_FOOT int,RUNNING int,STILL int,TILTING int,"UNKNowN" int,WALKING int) ORDER BY end_user_id,tms;
为什么连接和拆分end_user_id tms?
因为交叉表(文本,文本)需要:
row_id <=> end_user_id + tms category <=> type value <=> confidence
请注意,此版本中没有GROUP BY.
附录2 – 最终版本
基于tablefunc module doc F.37.1.4.交叉表(文本,文本):
这样做要好得多,因为它可以处理row_id,extra_col1,extra_col2,category,value).所以现在:
row_id <=> id extra_col1 <=> end_user_id extra_col2 <=> tms ...
最后查询:
SELECT end_user_id,coalesce(max("UNKNowN"),0) as "UNKNowN",0) as WALKING FROM crosstab( 'SELECT id,end_user_id,confidence FROM activities','SELECT disTINCT type FROM activities order by type' ) AS newtable ( id INT,end_user_id text,WALKING int ) GROUP BY end_user_id,tms ORDER BY end_user_id,tms;
What would be the point of the activities_type table ?
数据库规范化你可以使用:
SELECT disTINCT type FROM activities order by type vs SELECT type_name FROM activities_types ORDER BY type_name;
此版本使用id作为row_id,因此仍需要GROUP BY来压缩多行.
总结一下:条件聚合是最具可读性的解决方案.