RStudio: Dynamic time warping for time series with more than one variable of interest

Problem description

This question is related to this post: How to apply dtw algorithm on multiple time series in R?

The data frame in the original post contains only 1 variable of interest: speed.kph.ED

# data: 8 observations per car, 3 cars (24 rows in total)
file.ID2 <- rep(c("Cars_03","Cars_04","Cars_05"), each = 8)
speed.kph.ED <- c(129.3802848,129.4022304,129.424176,129.4461216,129.4680672,129.47904,129.5009856,129.5229312,127.8770112,127.8221472,127.7672832,127.7124192,127.6575552,127.6026912,127.5478272,127.4929632,134.1095616,134.1205344,134.1315072,134.1534528,134.1644256,134.1753984,134.1863712,134.197344)

df <- data.frame(file.ID2,speed.kph.ED)
df

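Following the suggestion in the accepted answer, here is the procedure that uses dtw to compute the distances between the 3 cars (3 time series):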

library(dtw)
library(purrr)
library(dplyr)

# Split your data frame into a list by file.ID2
ds <- split(df,df$file.ID2)
ds

# Use expand.grid to make all combinations of your names,file.ID2 and your values
Names <- expand.grid(unique(df$file.ID2),unique(df$file.ID2))
Values <- expand.grid(ds,ds)

# purrr::map_dbl iterates through all row combinations of Values and returns a vector of doubles
Dist <- map_dbl(seq_len(nrow(Values)), ~ dtw(
  x = Values[.x, ]$Var1[[1]]$speed.kph.ED,
  y = Values[.x, ]$Var2[[1]]$speed.kph.ED
)$distance)

# Bind answer to Names
library(dplyr)
ans <- Names %>% 
  mutate(distance = Dist)

ans

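I would like to know what to do if I have two more variables to take into account when computing the distances between the 3 cars (3 time series).

For example, suppose I also have 2 other variables, score.kph.ED and rating.kph.ED: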

score.kph.ED <- c(1:24)
rating.kph.ED <- c(25:48)


df <- data.frame(file.ID2,speed.kph.ED,score.kph.ED,rating.kph.ED)
df

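Now the distances between the 3 cars should be computed based not only on speed.kph.ED but also on score.kph.ED and rating.kph.ED.

How can I modify the existing code to achieve this?

Thank you very much for your help!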

Solution

You can do it like this:

library(dtw)
library(purrr)

df <- data.frame(file.ID2,speed.kph.ED,score.kph.ED,rating.kph.ED)
ds <- split(df,df$file.ID2)
Names <- expand.grid(unique(df$file.ID2),unique(df$file.ID2))
Values <- expand.grid(ds,ds)

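# Every column except the grouping column file.ID2
cols <- names(df)[-1]
# For each variable, compute the DTW distance for every pair of cars
result <- map_dfc(cols, function(col) {
  map_dbl(seq_len(nrow(Values)), ~ dtw(
    x = Values[.x, ]$Var1[[1]][[col]],
    y = Values[.x, ]$Var2[[1]][[col]]
  )$distance)
})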

names(result) <- paste0('dist.',cols)
cbind(Names,result)


#     Var1    Var2 dist.speed.kph.ED dist.score.kph.ED dist.rating.kph.ED
#1 Cars_03 Cars_03           0.00000                 0                  0
#2 Cars_04 Cars_03          25.66538                71                 71
#3 Cars_05 Cars_03          69.72117               191                191
#4 Cars_03 Cars_04          25.66538                71                 71
#5 Cars_04 Cars_04           0.00000                 0                  0
#6 Cars_05 Cars_04          96.00103                71                 71
#7 Cars_03 Cars_05          69.72117               191                191
#8 Cars_04 Cars_05          96.00103                71                 71
#9 Cars_05 Cars_05           0.00000                 0                  0
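
The table above keeps one distance per variable. If a single number per pair of cars is wanted, one option is to add the per-variable distances, which treats each variable as warped independently of the others. That choice is an assumption here, not something the answer prescribes. A minimal base R sketch, re-entering three rows of the printed output by hand; `per_var`, `dist_cols` and `dist.total` are made-up names:

```r
# Three rows copied from the printed output above (illustration only)
per_var <- data.frame(
  Var1 = c("Cars_03", "Cars_04", "Cars_05"),
  Var2 = c("Cars_03", "Cars_03", "Cars_03"),
  dist.speed.kph.ED  = c(0, 25.66538, 69.72117),
  dist.score.kph.ED  = c(0, 71, 191),
  dist.rating.kph.ED = c(0, 71, 191)
)

# Assumption: an unweighted sum is an acceptable way to combine the variables
dist_cols <- grep("^dist\\.", names(per_var), value = TRUE)
per_var$dist.total <- rowSums(per_var[dist_cols])
per_var
```

With the full `result` from the code above, the same idea would be `cbind(Names, result, dist.total = rowSums(result))`. A plain sum lets variables with larger numeric ranges dominate, so rescaling the columns first may be appropriate.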

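What you are trying to do is called multivariate DTW, and you can simplify things by using the proxy package. Check this other answer, but basically you can do what you want like this (using the variables from your example):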

proxy::dist(lapply(ds,function(x) { x[,-1L] }),method = "dtw")
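
To make clearer what "multivariate DTW" means, here is a rough base R sketch of the computation for one pair of series: each time point is a row with several variables, the local cost is the Euclidean distance between rows, and the warping path is found by dynamic programming. `mv_dtw` is a hypothetical helper written for this illustration, not the code of the dtw or proxy packages. It assumes a step rule in which a diagonal step counts twice, which as far as I understand corresponds to dtw's default, so treat any agreement with the call above as unverified for the multivariate case.

```r
# Data from the question
file.ID2 <- rep(c("Cars_03", "Cars_04", "Cars_05"), each = 8)
speed.kph.ED <- c(129.3802848, 129.4022304, 129.424176, 129.4461216,
                  129.4680672, 129.47904, 129.5009856, 129.5229312,
                  127.8770112, 127.8221472, 127.7672832, 127.7124192,
                  127.6575552, 127.6026912, 127.5478272, 127.4929632,
                  134.1095616, 134.1205344, 134.1315072, 134.1534528,
                  134.1644256, 134.1753984, 134.1863712, 134.197344)
score.kph.ED <- 1:24
rating.kph.ED <- 25:48
df <- data.frame(file.ID2, speed.kph.ED, score.kph.ED, rating.kph.ED)
ds <- split(df, df$file.ID2)

# Hypothetical helper: DTW distance between two series whose rows are
# time points and whose columns are variables
mv_dtw <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  n <- nrow(a)
  m <- nrow(b)
  # Local cost: Euclidean distance between every row of a and every row of b
  local <- as.matrix(dist(rbind(a, b)))[seq_len(n), n + seq_len(m), drop = FALSE]
  cost <- matrix(Inf, n, m)
  cost[1, 1] <- local[1, 1]
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      if (i == 1 && j == 1) next
      # Assumed step rule: vertical/horizontal steps weigh 1, diagonal steps weigh 2
      up   <- if (i > 1) cost[i - 1, j] + local[i, j] else Inf
      left <- if (j > 1) cost[i, j - 1] + local[i, j] else Inf
      diag <- if (i > 1 && j > 1) cost[i - 1, j - 1] + 2 * local[i, j] else Inf
      cost[i, j] <- min(up, left, diag)
    }
  }
  cost[n, m]
}

# Drop the ID column, then compute all pairwise distances between cars
vars <- lapply(ds, function(x) x[, -1L])
dmat <- sapply(vars, function(b) sapply(vars, function(a) mv_dtw(a, b)))
dmat
```

Applied to a single column, for example `mv_dtw(1:8, 9:16)`, the sketch gives 71, the same value as `dist.score.kph.ED` for Cars_04 vs Cars_03 in the first answer's table.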
