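Problem description
I am trying to write a query that retrieves only random (or simply unique) sample values from each column, to give users an overview of what format the values in the table have.
Table: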
project | customer | geography |
---|---|---|
p1 | c2 | Europe |
p2 | c4 | USA |
p3 | c6 | Japan |
p2 | c9 | USA |
p4 | c1 | Asia |
... | ... | ... |
Expected result (random values):
project | customer | geography |
---|---|---|
p4 | c9 | Asia |
p2 | c1 | USA |
p3 | c6 | Japan |
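Retrieving the top rows does not work, because it often returns identical or correlated values (for example, 3 projects that all share the same geography).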
select * from table limit 3
- Geography is only one example; the table may have 100+ columns
- The columns are all different
Solution
You could try:
select distinct * from table order by customer desc, geography asc limit 3;
Or you could try this (it is a bit longer):
-- each branch is parenthesized so that its limit applies to that branch only
(select distinct * from table where project='p1' limit 1)
union distinct
(select distinct * from table where project='p2' and customer not in (select distinct customer from table where project='p1') and geography not in (select distinct geography from table where project='p1') limit 1)
union distinct
-- exclude customers and geographies already used by p1 and p2
(select distinct * from table where project='p3' and customer not in (select distinct customer from table where project in ('p1','p2')) and geography not in (select distinct geography from table where project in ('p1','p2')) limit 1);

If I understand correctly, you want to see three unique values from each column, and they should be independent of one another. Try this:
with project_t as (
select row_number() over (order by project) rn,project
from (select distinct project from tab limit 3) t
),customer_t as (
select row_number() over (order by customer) rn,customer
from (select distinct customer from tab limit 3) t
),geography_t as (
select row_number() over (order by geography) rn,geography
from (select distinct geography from tab limit 3) t
)
select p.project,c.customer,g.geography
from project_t p
join customer_t c on p.rn = c.rn
join geography_t g on c.rn = g.rn
I have only tested this on MySQL, but window functions and CTEs should be available in Netezza as well.
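To make the idea above easy to try out, here is a minimal sketch that runs the same CTE query against the sample rows from the question. It rests on a few assumptions that are not part of the answer: Python's built-in `sqlite3` (SQLite 3.25 or later, for window functions) stands in for MySQL/Netezza, an `order by` was added inside each subquery so the output is repeatable, and `sample_per_column` is just an illustrative name.

```python
import sqlite3

# Sample rows taken from the question; `tab` is the table name used in the answer.
ROWS = [
    ("p1", "c2", "Europe"),
    ("p2", "c4", "USA"),
    ("p3", "c6", "Japan"),
    ("p2", "c9", "USA"),
    ("p4", "c1", "Asia"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table tab (project text, customer text, geography text)")
conn.executemany("insert into tab values (?, ?, ?)", ROWS)

# Same shape as the answer's query. The inner `order by` is an addition
# (assumption) that makes `limit` pick the same values on every run.
QUERY = """
with project_t as (
    select row_number() over (order by project) as rn, project
    from (select distinct project from tab order by project limit :n) t
), customer_t as (
    select row_number() over (order by customer) as rn, customer
    from (select distinct customer from tab order by customer limit :n) t
), geography_t as (
    select row_number() over (order by geography) as rn, geography
    from (select distinct geography from tab order by geography limit :n) t
)
select p.project, c.customer, g.geography
from project_t p
join customer_t c on p.rn = c.rn
join geography_t g on c.rn = g.rn
order by p.rn
"""


def sample_per_column(conn, n=3):
    """Return up to n rows whose columns are sampled independently of each other."""
    return conn.execute(QUERY, {"n": n}).fetchall()


for row in sample_per_column(conn):
    print(row)
# ('p1', 'c1', 'Asia')
# ('p2', 'c2', 'Europe')
# ('p3', 'c4', 'Japan')
```

Because the CTEs are combined with inner joins on `rn`, the result has only as many rows as the column with the fewest distinct values.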

What you are trying to do is not very SQL-like, because it decouples the values in each column from one another.
You will also run into a problem if any column does not actually have three values. I would suggest using union all and aggregation:
select max(project), max(customer), max(geography)
from ((select project, null as customer, null as geography, row_number() over (order by random()) as seqnum
from t
group by project
) union all
(select null, customer, null, row_number() over (order by random()) as seqnum
from t
group by customer
) union all
(select null, null, geography, row_number() over (order by random()) as seqnum
from t
group by geography
)
) pcg
where seqnum <= 3
group by seqnum;
This returns 3 rows as long as at least one column has at least three distinct values.
If you want the most common values, just replace random() in the order by clause with count(*) desc.
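Since the question mentions 100+ columns, writing one `union all` branch per column by hand gets tedious. The sketch below generates the query from a list of column names and runs it on the sample rows. It is only an illustration under stated assumptions: SQLite (3.25+) via Python's `sqlite3` replaces Netezza, `build_query` is a hypothetical helper, the counts are computed in a derived table rather than directly inside `over (...)`, and a tie-breaker on the column value is added so the "most common" variant is repeatable.

```python
import sqlite3

COLUMNS = ["project", "customer", "geography"]
ROWS = [
    ("p1", "c2", "Europe"),
    ("p2", "c4", "USA"),
    ("p3", "c6", "Japan"),
    ("p2", "c9", "USA"),
    ("p4", "c1", "Asia"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (project text, customer text, geography text)")
conn.executemany("insert into t values (?, ?, ?)", ROWS)


def build_query(columns, most_common=False):
    """Build the union all + aggregation query for the given column names.

    Column names are interpolated into the SQL text, so they must come
    from a trusted source (for example the catalog), never from user input.
    """
    branches = []
    for col in columns:
        # One branch per column: the column in its own slot, NULL elsewhere.
        slots = ", ".join(c if c == col else f"null as {c}" for c in columns)
        # Frequency order (ties broken by value) or a random order.
        order = f"cnt desc, {col}" if most_common else "random()"
        branches.append(
            f"select {slots}, row_number() over (order by {order}) as seqnum "
            f"from (select {col}, count(*) as cnt from t group by {col})"
        )
    maxes = ", ".join(f"max({c}) as {c}" for c in columns)
    return (
        f"select {maxes} from ({' union all '.join(branches)}) pcg "
        "where seqnum <= :n group by seqnum order by seqnum"
    )


# Most common values first; this variant is deterministic.
for row in conn.execute(build_query(COLUMNS, most_common=True), {"n": 3}):
    print(row)
# ('p2', 'c1', 'USA')
# ('p1', 'c2', 'Asia')
# ('p3', 'c4', 'Europe')

# Random sample; the rows differ from run to run.
print(conn.execute(build_query(COLUMNS), {"n": 3}).fetchall())
```

`max()` ignores NULLs, so grouping by `seqnum` collapses the three branches into one row per sequence number. A column with fewer than three distinct values simply shows NULL in the remaining rows.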
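
After running "generate statistics" on your table (which you should do every week or so anyway, for performance reasons), you can find two values, HIVAL and LOWAL, in the catalog, together with the number of distinct values in each column and the number of NULLs.
So: if three is a hard requirement, I can't help. In my shop, though, we run a "quick data profiling" query against the catalog when necessary and continue from there with more specific SQL; the only thing to take care of is keeping the statistics up to date.
If anyone is interested, I will share that SQL query.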
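The poster's catalog query is not shown, and the snippet below is not it. As a rough stand-in, this sketch computes the same kind of per-column figures (distinct count, NULL count, lowest and highest value) by scanning the table directly. The assumptions: SQLite via Python's `sqlite3` instead of the Netezza catalog, a hypothetical helper named `profile_columns`, and the sample rows from the question.

```python
import sqlite3

ROWS = [
    ("p1", "c2", "Europe"),
    ("p2", "c4", "USA"),
    ("p3", "c6", "Japan"),
    ("p2", "c9", "USA"),
    ("p4", "c1", "Asia"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (project text, customer text, geography text)")
conn.executemany("insert into t values (?, ?, ?)", ROWS)


def profile_columns(conn, table, columns):
    """Return distinct count, NULL count, lowest and highest value per column.

    Table and column names are interpolated into the SQL text, so they
    must come from a trusted source.
    """
    profile = {}
    for col in columns:
        row = conn.execute(
            f"select count(distinct {col}), count(*) - count({col}), "
            f"min({col}), max({col}) from {table}"
        ).fetchone()
        profile[col] = dict(zip(("distinct", "nulls", "low", "high"), row))
    return profile


for col, stats in profile_columns(conn, "t", ["project", "customer", "geography"]).items():
    print(col, stats)
# project {'distinct': 4, 'nulls': 0, 'low': 'p1', 'high': 'p4'}
# customer {'distinct': 5, 'nulls': 0, 'low': 'c1', 'high': 'c9'}
# geography {'distinct': 4, 'nulls': 0, 'low': 'Asia', 'high': 'USA'}
```

Unlike reading precomputed statistics from a catalog, this runs one full scan per column, so on a wide or large table it is noticeably more expensive.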