How to create a cross-tab / coefficient table with the same columns and rows in SQL / Snowflake?

Problem description

I have a table

      col1 | col2 | col3 | col4 | col5
 id1 |  1     0      0      1      0
 id2 |  1     1      0      0      0
 id3 |  0     1      0      1      0
 id4 |  0     0      1      0      1
 id5 |  1     0      1      0      0
 id6 |  0     0      0      1      0
  .
  .
  .
 idN

How would I write a query that returns a table like this

      col1 | col2 | col3 | col4 | col5
col1 |  3     1      1      1      0
col2 |  1     2      0      1      0
col3 |  1     1      2      0      1
col4 |  1     1      1      2      0
col5 |  0     0      1      0      1

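where each entry in the result is the number of rows in which one column has the value 1 and the other column also has the value 1.

I can get the diagonal values by doing the following: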

SELECT
    SUM(col1), SUM(col2), SUM(col3), SUM(col4), SUM(col5)
FROM (
    SELECT
        col1, col2, col3, col4, col5,
        (col1 + col2 + col3 + col4 + col5) AS total
    FROM (
        -- keep only the earliest row per id
        SELECT
            ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS row_num, *
        FROM (
            SELECT DISTINCT id, date, col1, col2, col3, col4, col5
            FROM db.schema.table
        )
    )
    WHERE row_num = 1 AND total <= 1
    ORDER BY total DESC
);

I think I have to do some kind of pivot or a combination of joins, but I can't seem to figure it out.

Solution

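You can solve this by unioning 5 SELECTs with 25 CASE expressions in total, 5 CASE expressions in each SELECT. I have to admit this is a very ugly solution, and it only works when the number of columns is fixed, but it definitely works.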
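
A minimal sketch of that approach, assuming the table is called my_table (a hypothetical name) and that col1 to col5 hold only 0 or 1. The inline VALUES rows are the six sample rows from the question, included only so the query runs on its own; col_name is an illustrative label column.

```sql
-- Sketch only: my_table, col_name and the inline rows are assumptions for illustration.
WITH my_table (id, col1, col2, col3, col4, col5) AS (
    SELECT * FROM (VALUES
        ('id1', 1, 0, 0, 1, 0),
        ('id2', 1, 1, 0, 0, 0),
        ('id3', 0, 1, 0, 1, 0),
        ('id4', 0, 0, 1, 0, 1),
        ('id5', 1, 0, 1, 0, 0),
        ('id6', 0, 0, 0, 1, 0)
    ) AS v
)
-- One SELECT per output row; each CASE counts the rows where both columns are 1.
SELECT 'col1' AS col_name,
       SUM(CASE WHEN col1 = 1 THEN 1 ELSE 0 END)              AS col1,
       SUM(CASE WHEN col1 = 1 AND col2 = 1 THEN 1 ELSE 0 END) AS col2,
       SUM(CASE WHEN col1 = 1 AND col3 = 1 THEN 1 ELSE 0 END) AS col3,
       SUM(CASE WHEN col1 = 1 AND col4 = 1 THEN 1 ELSE 0 END) AS col4,
       SUM(CASE WHEN col1 = 1 AND col5 = 1 THEN 1 ELSE 0 END) AS col5
FROM my_table
UNION ALL
SELECT 'col2',
       SUM(CASE WHEN col2 = 1 AND col1 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col2 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col2 = 1 AND col3 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col2 = 1 AND col4 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col2 = 1 AND col5 = 1 THEN 1 ELSE 0 END)
FROM my_table
UNION ALL
SELECT 'col3',
       SUM(CASE WHEN col3 = 1 AND col1 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col3 = 1 AND col2 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col3 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col3 = 1 AND col4 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col3 = 1 AND col5 = 1 THEN 1 ELSE 0 END)
FROM my_table
UNION ALL
SELECT 'col4',
       SUM(CASE WHEN col4 = 1 AND col1 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col4 = 1 AND col2 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col4 = 1 AND col3 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col4 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col4 = 1 AND col5 = 1 THEN 1 ELSE 0 END)
FROM my_table
UNION ALL
SELECT 'col5',
       SUM(CASE WHEN col5 = 1 AND col1 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col5 = 1 AND col2 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col5 = 1 AND col3 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col5 = 1 AND col4 = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN col5 = 1 THEN 1 ELSE 0 END)
FROM my_table
ORDER BY col_name;
```

Since the values are only 0 or 1, SUM(colA * colB) would give the same counts with less typing; the CASE form is kept here to match the description above.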


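Since you don't know the exact number of columns, you need to unpivot the table first, do the manipulation, and then pivot the result back. This should work: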

-- identify table columns (used below as the pivot column list)
with table_columns_list as (
    select column_name, ordinal_position
    from information_schema.columns
    where table_schema like 'schema' and table_name like 'table'
),
-- unpivot the table and add a row id
-- (assumes an id column to order by; Snowflake's UNPIVOT takes a literal
--  column list rather than a subquery, so the columns are listed here)
flat_table as (
    select *
    from (select *, row_number() over (order by id) as row_id from my_table)
    unpivot(value for column_name in (col1, col2, col3, col4, col5))
),
-- calculate all matrix values: 1 only when both columns are 1 in the same row
-- (row_id is left out so the pivot groups by b_column_name only)
full_flat_table as (
    select a.column_name as a_column_name,
           b.column_name as b_column_name,
           least(a.value, b.value) as value
    from flat_table as a
    inner join flat_table as b on a.row_id = b.row_id
)
select *
from full_flat_table
pivot(sum(value) for a_column_name in (select column_name from table_columns_list))
as p
order by b_column_name;