Retrieve 3 sample values for each column from a table

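Question

I am trying to write a query that retrieves only random (or at least unique) sample values from each column, so that users get an overview of what format the values in the table have.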

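Table:

project  customer  geography
p1       c2        Europe
p2       c4        USA
p3       c6        Japan
p2       c9        USA
p4       c1        Asia
...      ...       ...

Expected result (random values):

project  customer  geography
p4       c9        Asia
p2       c1        USA
p3       c6        Japan

Retrieving the top rows is not good enough, because it usually returns identical or related values (for example, 3 projects that all share the same geography).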

select * from table limit 3
  • The table may have more than 100 columns
  • The columns are different from one another

Answers

You can try:

select distinct * from table order by customer desc, geography asc limit 3;

Or you can try this (it is a bit longer):

-- Each branch is parenthesized so that its "limit 1" applies to that branch only.
(select distinct * from table where project = 'p1' limit 1)
union distinct
(select distinct * from table
 where project = 'p2'
   and customer not in (select customer from table where project = 'p1')
   and geography not in (select geography from table where project = 'p1')
 limit 1)
union distinct
-- Exclude customers and geographies already used by p1 and p2.
(select distinct * from table
 where project = 'p3'
   and customer not in (select customer from table where project in ('p1', 'p2'))
   and geography not in (select geography from table where project in ('p1', 'p2'))
 limit 1);

If I understand correctly, you want to see three unique values from each column, and they should be independent of one another. Try this:

with project_t as (
  select row_number() over (order by project) rn,project
  from (select distinct project from tab limit 3) t
),customer_t as (
  select row_number() over (order by customer) rn,customer
  from (select distinct customer from tab limit 3) t
),geography_t as (
  select row_number() over (order by geography) rn,geography
  from (select distinct geography from tab limit 3) t
)
select p.project,c.customer,g.geography
from project_t p
join customer_t c on p.rn = c.rn
join geography_t g on c.rn = g.rn

I have only tested this on MySQL, but window functions and CTEs should also be available in Netezza.
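To try the CTE query above without a MySQL or Netezza instance, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) is only a stand-in here, and the `sample_columns` helper and the `ROWS` data are made up for illustration from the question's sample table. Which three values `limit 3` picks per column is not guaranteed.

```python
import sqlite3

# Sample rows taken from the question's table (illustrative data only).
ROWS = [
    ("p1", "c2", "Europe"),
    ("p2", "c4", "USA"),
    ("p3", "c6", "Japan"),
    ("p2", "c9", "USA"),
    ("p4", "c1", "Asia"),
]

# The CTE query from the answer: pick up to three distinct values per
# column, number them, then line the columns up by row number.
QUERY = """
with project_t as (
  select row_number() over (order by project) rn, project
  from (select distinct project from tab limit 3) t
), customer_t as (
  select row_number() over (order by customer) rn, customer
  from (select distinct customer from tab limit 3) t
), geography_t as (
  select row_number() over (order by geography) rn, geography
  from (select distinct geography from tab limit 3) t
)
select p.project, c.customer, g.geography
from project_t p
join customer_t c on p.rn = c.rn
join geography_t g on c.rn = g.rn
order by p.rn
"""


def sample_columns(rows):
    """Load rows into an in-memory table named tab and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table tab (project text, customer text, geography text)"
        )
        conn.executemany("insert into tab values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in sample_columns(ROWS):
        print(row)
```

Because the three CTEs are combined with inner joins, a column with fewer than three distinct values shortens the whole result to that many rows.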


What you are trying to do is not very SQL-like, because it separates the values in each column from one another.

You will also run into a problem if any column does not actually have three values. I would suggest using union all and aggregation:

select max(project), max(customer), max(geography)
from ((select project, null as customer, null as geography,
              row_number() over (order by random()) as seqnum
       from t
       group by project
      ) union all
      (select null, customer, null,
              row_number() over (order by random()) as seqnum
       from t
       group by customer
      ) union all
      (select null, null, geography,
              row_number() over (order by random()) as seqnum
       from t
       group by geography
      )
     ) pcg
where seqnum <= 3
group by seqnum;

This returns 3 rows as long as at least one column has at least three distinct values.

If you want the most common values instead, just replace random() in the order by clause with count(*) desc.
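The row_number() plus union all idea can be sketched in the same way. This is an illustration under assumptions, not the answerer's tested code: it runs on SQLite from Python as a stand-in for Netezza, the parentheses around the union all branches are dropped because SQLite does not accept them, and `sample_values` with its `order_expr` parameter is a hypothetical helper.

```python
import sqlite3

# Per-column sampling: number the distinct values of each column
# separately, then collapse rows that share a sequence number.
# {order} is filled with a trusted constant, never with user input.
QUERY_TEMPLATE = """
select max(project), max(customer), max(geography)
from (select project, null as customer, null as geography,
             row_number() over (order by {order}) as seqnum
      from t group by project
      union all
      select null, customer, null,
             row_number() over (order by {order})
      from t group by customer
      union all
      select null, null, geography,
             row_number() over (order by {order})
      from t group by geography
     ) pcg
where seqnum <= 3
group by seqnum
order by seqnum
"""


def sample_values(rows, order_expr="random()"):
    """Return up to three sample values per column (hypothetical helper).

    order_expr is "random()" for random samples or "count(*) desc"
    for the most common values.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t (project text, customer text, geography text)"
        )
        conn.executemany("insert into t values (?, ?, ?)", rows)
        query = QUERY_TEMPLATE.format(order=order_expr)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    rows = [
        ("p1", "c2", "Europe"),
        ("p2", "c4", "USA"),
        ("p3", "c6", "Japan"),
        ("p2", "c9", "USA"),
        ("p4", "c1", "Asia"),
    ]
    for row in sample_values(rows):
        print(row)
    for row in sample_values(rows, "count(*) desc"):
        print(row)
```

When a column has fewer than three distinct values, its remaining cells come back as NULL (`None` in Python) rather than shortening the result, which is the main difference from the join-based CTE version.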


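After running "generate statistics" on your table (which you should do every week or so anyway, for performance reasons), you can find two values, HIVAL and LOWAL. They are available in the catalog, together with the number of distinct values in the column and the number of NULLs.

So: if three is a hard requirement, I cannot help. In my shop, though, we run a "quick data profiling" query against the catalog when necessary and continue from there with more specific SQL, so the only thing to take care of is keeping the statistics up to date.

If anyone is interested, I will share that SQL query.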
