Problem description
I have a data frame like this:
ID <- c("1D01","1D02","1D03","1D04","1D05","1D06","1D07","1D08","1D09")
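# Nine values per column, read off the desired-output table in this post
A <- c("2020-05-29 00:00:13","2020-06-09 00:00:13","2020-06-06 00:00:13","2020-06-03 00:00:13","2020-06-03 00:00:13","2020-06-03 00:00:13","2020-06-03 00:00:13","2020-06-03 00:00:13",NA)
B <- c("2020-06-01 00:00:13","2020-06-08 00:00:13","2020-06-19 00:00:13","2020-06-21 00:00:13","2020-06-03 00:00:13","2020-06-03 00:00:13","2020-06-07 00:00:13","2020-06-07 00:00:13",NA)
C <- c("2020-06-03 00:00:13","2020-06-07 00:00:13","2020-06-01 00:00:13","2020-06-11 00:00:13","2020-06-03 00:00:13","2020-06-03 00:00:13","2020-06-03 00:00:13",NA,"2020-06-07 00:00:13")
D <- c("2020-06-04 00:00:13","2020-06-05 00:00:13","2020-06-08 00:00:13","2020-06-01 00:00:13","2020-06-04 00:00:13","2020-06-03 00:00:13","2020-06-01 00:00:13",NA,"2020-06-03 00:00:13")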
df <- data.frame(ID,A,B,C,D)
df$A <- as.POSIXct(df$A)
df$B <- as.POSIXct(df$B)
df$C <- as.POSIXct(df$C)
df$D <- as.POSIXct(df$D)
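I am trying to create a column named path from the datetimes in the other columns, listed in ascending order, according to these rules:
- Look at the order of the datetimes in the 4 columns (A, B, C, D) and concatenate the column names in ascending datetime order. For example: A_B_C_D if A has the earliest datetime and D has the latest.
- If 2 or more columns have the same datetime, concatenate them without an underscore. For example: A_BC_D if B and C have the same datetime.
- If a column is NA, exclude it from the concatenation. For example: A_B_D if C is NA.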
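One way to read the three rules is as a grouping problem: drop the NA columns, group the column names by timestamp, paste the names within each group together, then join the groups with underscores in ascending order. The sketch below illustrates that for a single row only. It assumes the timestamps have already been converted to numbers, and `path_for_row` is a hypothetical helper name, not something from the post.

```r
# Hypothetical helper: build the path string for one row.
# `times` is a named numeric vector (names = column names, values = timestamps).
path_for_row <- function(times) {
  times <- times[!is.na(times)]        # NA rule: drop columns with no datetime
  # split() groups the names by timestamp; groups come back in ascending order
  groups <- split(names(times), times)
  # tie rule: names sharing a timestamp are pasted with no separator
  tied <- vapply(groups, paste, character(1), collapse = "")
  paste(tied, collapse = "_")          # ordering rule: join groups with "_"
}

path_for_row(c(A = 1, B = 2, C = 2, D = 3))   # "A_BC_D"
path_for_row(c(A = 1, B = 2, C = NA, D = 3))  # "A_B_D"
path_for_row(c(A = 4, B = 3, C = 2, D = 1))   # "D_C_B_A"
```

Because `split()` converts the timestamps to a factor, the groups are ordered by value, so no explicit sort is needed in this sketch.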
My desired output is:
    ID                   A                   B                   C                   D    path
1 1D01 2020-05-29 00:00:13 2020-06-01 00:00:13 2020-06-03 00:00:13 2020-06-04 00:00:13 A_B_C_D
2 1D02 2020-06-09 00:00:13 2020-06-08 00:00:13 2020-06-07 00:00:13 2020-06-05 00:00:13 D_C_B_A
3 1D03 2020-06-06 00:00:13 2020-06-19 00:00:13 2020-06-01 00:00:13 2020-06-08 00:00:13 C_A_D_B
4 1D04 2020-06-03 00:00:13 2020-06-21 00:00:13 2020-06-11 00:00:13 2020-06-01 00:00:13 D_A_C_B
5 1D05 2020-06-03 00:00:13 2020-06-03 00:00:13 2020-06-03 00:00:13 2020-06-04 00:00:13   ABC_D
6 1D06 2020-06-03 00:00:13 2020-06-03 00:00:13 2020-06-03 00:00:13 2020-06-03 00:00:13    ABCD
7 1D07 2020-06-03 00:00:13 2020-06-07 00:00:13 2020-06-03 00:00:13 2020-06-01 00:00:13  D_AC_B
8 1D08 2020-06-03 00:00:13 2020-06-07 00:00:13                <NA>                <NA>     A_B
9 1D09                <NA>                <NA> 2020-06-07 00:00:13 2020-06-03 00:00:13     D_C
I tried doing it this way, but it clearly does not work:
library(dplyr)
df %>%
  mutate(path = case_when(
    A >= B >= C >= D ~ "(A_B_C_D)",
    TRUE ~ "(ABD_C)"))
How can I get the desired output? Can someone point me in the right direction?
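A possible direction, offered as a sketch rather than a definitive answer: `case_when()` would need one branch for every possible ordering and tie pattern, so it may be easier to compute the path row by row instead. The example below uses base R only. The data are rebuilt from the desired-output table in this post, the UTC time zone is an assumption, and `stamp`, `cols`, and `secs` are names introduced here for illustration.

```r
# Illustrative helper (not from the post): month-day string -> POSIXct at 00:00:13
stamp <- function(md) {
  as.POSIXct(ifelse(is.na(md), NA, paste0("2020-", md, " 00:00:13")), tz = "UTC")
}

# Data taken from the desired-output table
df <- data.frame(
  ID = sprintf("1D%02d", 1:9),
  A = stamp(c("05-29", "06-09", "06-06", "06-03", "06-03", "06-03", "06-03", "06-03", NA)),
  B = stamp(c("06-01", "06-08", "06-19", "06-21", "06-03", "06-03", "06-07", "06-07", NA)),
  C = stamp(c("06-03", "06-07", "06-01", "06-11", "06-03", "06-03", "06-03", NA, "06-07")),
  D = stamp(c("06-04", "06-05", "06-08", "06-01", "06-04", "06-03", "06-01", NA, "06-03"))
)

cols <- c("A", "B", "C", "D")
# Numeric matrix of seconds since the epoch, one row per ID
secs <- sapply(df[cols], as.numeric)

# For each row: group column names by timestamp (tapply drops NA and orders
# groups ascending), paste tied names together, then join groups with "_"
df$path <- apply(secs, 1, function(row) {
  paste(tapply(cols, row, paste, collapse = ""), collapse = "_")
})

print(df[c("ID", "path")])
```

Working on a numeric matrix avoids the implicit character conversion that `apply()` would perform on a data frame with POSIXct columns. A dplyr or tidyr version that reshapes to long format would follow the same grouping idea.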