Retrieve 3 sample values for each column from a table

Problem description

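I'm trying to write a query that retrieves just a few random (or simply unique) sample values from each column, to give users an overview of what format the values in the table take.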

Table:

project	customer	geography
p1	c2	Europe
p2	c4	USA
p3	c6	Japan
p2	c9	USA
p4	c1	Asia
...	...	...

Expected result (random values):

project	customer	geography
p4	c9	Asia
p2	c1	USA
p3	c6	Japan

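Simply retrieving the top rows is not correct, because it usually returns repeated or correlated values (for example, 3 projects that all share the same geography):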

select * from table limit 3

The table may have 100+ columns.
The columns are distinct from one another.

Solutions

You can try:

select distinct * from table order by customer desc,geography asc limit 3;

Or you can try this (it's a bit long):

-- Parentheses keep each limit attached to its own select.
(select distinct * from table where project='p1' limit 1)
union distinct
(select distinct * from table where project='p2'
   and customer not in (select distinct customer from table where project='p1')
   and geography not in (select distinct geography from table where project='p1')
 limit 1)
union distinct
-- Skip customers and geographies already used by p1 or p2.
(select distinct * from table where project='p3'
   and customer not in (select distinct customer from table where project in ('p1','p2'))
   and geography not in (select distinct geography from table where project in ('p1','p2'))
 limit 1);

If I understand correctly, you want to see three unique values from each column, and they should be independent of each other. Try this:

with project_t as (
  select row_number() over (order by project) rn,project
  from (select distinct project from tab limit 3) t
),customer_t as (
  select row_number() over (order by customer) rn,customer
  from (select distinct customer from tab limit 3) t
),geography_t as (
  select row_number() over (order by geography) rn,geography
  from (select distinct geography from tab limit 3) t
)
select p.project,c.customer,g.geography
from project_t p
join customer_t c on p.rn = c.rn
join geography_t g on c.rn = g.rn

I have only tested this on MySQL, but window functions and CTEs should be available in Netezza as well.

What you are trying to do is not very SQL-like, because it decouples the values in each column from one another.

You will also run into problems if any column does not actually have three values. I suggest using union all and aggregation:

select max(project), max(customer), max(geography)
from ((select project, null as customer, null as geography,
              row_number() over (order by random()) as seqnum
       from t
       group by project
      ) union all
      (select null, customer, null,
              row_number() over (order by random()) as seqnum
       from t
       group by customer
      ) union all
      (select null, null, geography,
              row_number() over (order by random()) as seqnum
       from t
       group by geography
      )
     ) pcg
where seqnum <= 3
group by seqnum;

This returns 3 rows as long as at least one column has at least three distinct values.

If you need the most common values instead, just replace random() with count(*) desc in the order by clause.
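Since the table may have 100+ columns, writing one union all branch per column by hand gets tedious. The following is a minimal sketch, not a tested Netezza solution, of how the most-common-values variant could be generated from a list of column names. It runs against SQLite from Python purely as a stand-in for Netezza (an assumption). The helper names build_query and sample_values are hypothetical. The counts are computed in an inner subquery rather than directly inside the window's order by, and each column's own value is added as a tie-breaker, so the result is deterministic.

```python
import sqlite3

def build_query(table, columns, n=3):
    """Build one query returning the n most frequent values of every column."""
    branches = []
    for i, col in enumerate(columns):
        # This branch fills only its own column; every other slot is NULL.
        slots = ", ".join(
            f"{col if j == i else 'null'} as {name}"
            for j, name in enumerate(columns)
        )
        branches.append(
            f"select {slots}, "
            f"row_number() over (order by cnt desc, {col}) as seqnum "
            f"from (select {col}, count(*) as cnt from {table} group by {col}) s{i}"
        )
    outer = ", ".join(f"max({name}) as {name}" for name in columns)
    # Identifiers are interpolated directly: only pass trusted column names.
    return (
        f"select {outer} from ("
        + " union all ".join(branches)
        + f") pcg where seqnum <= {int(n)} group by seqnum order by seqnum"
    )

def sample_values():
    """Run the generated query on the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (project text, customer text, geography text)")
    conn.executemany(
        "insert into t values (?, ?, ?)",
        [
            ("p1", "c2", "Europe"),
            ("p2", "c4", "USA"),
            ("p3", "c6", "Japan"),
            ("p2", "c9", "USA"),
            ("p4", "c1", "Asia"),
        ],
    )
    rows = conn.execute(build_query("t", ["project", "customer", "geography"])).fetchall()
    conn.close()
    return rows

print(sample_values())
# [('p2', 'c1', 'USA'), ('p1', 'c2', 'Asia'), ('p3', 'c4', 'Europe')]
```

On a real system the column list could come from the database catalog instead of being typed in. Window functions require SQLite 3.25 or later.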

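After you run GENERATE STATISTICS on your table (which you should do every week or so anyway, for performance reasons), you can find two values, HIVAL and LOWAL, in the catalog, alongside the number of distinct values and the number of NULLs in each column.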

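So: if three is a hard requirement, I can't help. In my shop, though, we run a "quick data profiling" query against the catalog when necessary and continue from there with more specific SQL, which only requires keeping the statistics up to date.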

If anyone is interested, I'll share that SQL query.
