Using a window function to retrieve a value from one column depending on another column

Problem description

I have a data frame like the one shown below, with the columns id (the user), date, and product.

I want to add another column that gives the product each user used on their earliest date.

I know I need a window function, and I started with one, but it only gives me the earliest date for each user. What I need is the product that each user used earliest. The data:

id    date      product
1   2010-02-01     c
1   2010-02-02     v
1   2010-02-03     d
1   2010-02-04     g
2   2010-02-03     h
2   2010-02-04     w
2   2010-02-05     t
2   2010-02-06     d
3   2010-02-04     x
3   2010-02-05     f
3   2010-02-06     x

Solution

Use FIRST_VALUE as the window function:

CREATE TABLE table1 (
  `id` INTEGER,`date` Date,`product` VARCHAR(1)
);

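INSERT INTO table1
  (`id`,`date`,`product`)
VALUES
  ('1','2010-02-01','c'),
  ('1','2010-02-02','v'),
  ('1','2010-02-03','d'),
  ('1','2010-02-04','g'),
  ('2','2010-02-03','h'),
  ('2','2010-02-04','w'),
  ('2','2010-02-05','t'),
  ('2','2010-02-06','d'),
  ('3','2010-02-04','x'),
  ('3','2010-02-05','f'),
  ('3','2010-02-06','x');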
-- For each id, take the product from the row with the earliest date
SELECT `id`,`date`,`product`,
  FIRST_VALUE(`product`) OVER(PARTITION BY `id` ORDER BY `date` ROWS UNBOUNDED PRECEDING) AS minproduct
FROM table1;
id | date       | product | minproduct
-: | :--------- | :------ | :---------
 1 | 2010-02-01 | c       | c         
 1 | 2010-02-02 | v       | c         
 1 | 2010-02-03 | d       | c         
 1 | 2010-02-04 | g       | c         
 2 | 2010-02-03 | h       | h         
 2 | 2010-02-04 | w       | h         
 2 | 2010-02-05 | t       | h         
 2 | 2010-02-06 | d       | h         
 3 | 2010-02-04 | x       | x         
 3 | 2010-02-05 | f       | x         
 3 | 2010-02-06 | x       | x         
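
To try the query without a MySQL server, here is a minimal sketch that runs it through Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). The dates are stored as ISO text, which sorts the same way as real dates. The extra mindate column and the earliest_products helper are not part of the answer above. They are only there to contrast "earliest date per user" (what the question's attempt returned) with "product on the earliest date".

```python
import sqlite3

# Sample rows from the question: (id, date, product).
ROWS = [
    (1, "2010-02-01", "c"), (1, "2010-02-02", "v"),
    (1, "2010-02-03", "d"), (1, "2010-02-04", "g"),
    (2, "2010-02-03", "h"), (2, "2010-02-04", "w"),
    (2, "2010-02-05", "t"), (2, "2010-02-06", "d"),
    (3, "2010-02-04", "x"), (3, "2010-02-05", "f"),
    (3, "2010-02-06", "x"),
]

# mindate:    MIN over the partition returns the earliest date itself.
# minproduct: FIRST_VALUE over the partition ordered by date returns the
#             product that sits on the row with the earliest date.
QUERY = """
SELECT id, date, product,
       MIN(date) OVER (PARTITION BY id) AS mindate,
       FIRST_VALUE(product) OVER (PARTITION BY id ORDER BY date) AS minproduct
FROM table1
ORDER BY id, date
"""


def earliest_products(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumption: SQLite >= 3.25; dates kept as ISO-8601 text.
        conn.execute("CREATE TABLE table1 (id INTEGER, date TEXT, product TEXT)")
        conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = earliest_products(ROWS)
for row in result:
    print(row)
```

The last column of each printed row should match the minproduct column in the table above, while the fourth column shows the earliest date that a plain MIN window would give.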
