Problem description
Master
StoreNumber  Online  Pressure  MON  TUE  WED  ...  SUN  ...
1            0.2     50        0    0    0    ...  0
2            0.8     20        0    0    0    ...  0
3            1.2     10        0    0    0    ...  0
...
Hours
BranchNumber  Day  ...  Time
1             MON       7.50
1             TUE       6.00
1             WED       8.50
3             MON       2.00
3             TUE       1.00
3             WED       2.50
...
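NB: Because there are 7 days in a week, there are a lot of branch-number entries, roughly 10,500. A branch number may therefore appear up to 7 times, once for each day of the week, but some branches may only have 6 or 5 entries, and so on.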
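Since some branches have fewer than 7 day rows, it can help to check how many rows each branch has, and whether any BranchNumber/Day pair is repeated, before reshaping. The following is only a sketch in base R: `hours_demo` is a hypothetical miniature of the Hours table, not the real data.

```r
# Hypothetical miniature of the Hours table (illustrative values only)
hours_demo <- data.frame(
  BranchNumber = c(1L, 1L, 1L, 3L, 3L),
  Day          = c("MON", "TUE", "WED", "MON", "TUE"),
  Time         = c(7.5, 6, 8.5, 2, 1)
)

# Number of day rows recorded per branch (at most 7 if each day appears once)
days_per_branch <- table(hours_demo$BranchNumber)

# TRUE if any BranchNumber/Day pair occurs more than once
has_duplicates <- any(duplicated(hours_demo[c("BranchNumber", "Day")]))

print(days_per_branch)
print(has_duplicates)
```

If duplicated pairs do exist, they would need to be aggregated or removed first, because a wide reshape expects one Time value per branch and day.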
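My idea is to populate the day-of-week columns in the Master table ("MON", "TUE", "WED" ... "SUN") from the Time values in the Hours table. The match should first check that StoreNumber equals BranchNumber. Then, for the day of the week, a row with "MON" in the Hours table should go into the "MON" column of the Master table.
So the output would look like this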
StoreNumber  Online  Pressure  MON   TUE   WED   ...  Hours
1            0.2     50        7.50  6.00  8.50  ...  53
2            0.8     20        0     0     0     ...  30
3            1.2     10        2.00  1.00  2.50  ...  20
...
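For example, BranchNumber 2 does not exist in the Hours table, so every day column for that store should be skipped and left at 0.
I tried the R code below but am struggling to get the output. The code runs with no errors, but the Master table stays unchanged and all the days of the week remain 0.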
merged <- Master %>%
  select(-c("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")) %>%
  left_join(
    Hours %>% pivot_wider(names_from = Day, values_from = Time),
    by = c("StoreNumber" = "BranchNumber")
  )
merged <- merged %>% replace(is.na(.), 0)
dput(Master)
Pressure = c(0,0.23,0.19,0.56,0.42,0.37,0.92,0.04,0.15,0.12,0.06,0.02,0.08,0.46,0.25,0.17,0.1,0.48,0.13,0.44,0.81,0.5,0.33,0.27,0.21,0.31,0.29,0.38,0.58,1.15,0.4,0.35,0.6,1.4,0.77,0.85,0.69,0.65,0.54,0.67,0.63,1.35,0.02),Online = c(1.63,2.6,1.9,5.27,1.23,1.87,4.56,3.71,2.4,9.62,1.5,1.96,2.5,10.37,2.62,7.44,2.71,1.48,16.94,3.92,2.9,4.44,2.69,6.04,1.44,0.52,1.81,9.25,3.12,5.56,2.33,4.67,2.54,3.73,1.63,3.4,1.08,6.27,2.23,3.13,2.02,3.31,2.96,3.19,6.21,8.6,0.9,0.94,1.58,3.38,4.04,6.46,3.17,4.79,7.92,6.58,5.88,5.06,6.42,4.4,2.08,2.81,3.23,1.6,3.08,5.77,1.65,2.56,3.81,4.08,3.65,3.77,2.75,4.37,4.92,2.12,4.02,1.29,6.12,4.62,1.98,9.77,7.63,5.37,0.79,1.17,4.23,4.31,6.38,4.77,8.08,2.13,1.75,0.87,2.19,1.38,3.5,1.85,18.52,6.69,1.06,1.71,7.48,11.04,6.1,10.31,4.46,2.27,3.83,4.98,5.63,1.21,9.35,1.79,6.15,6.52,1.37,4.25,0.75,3.56,1.12,4.83,5.04,2.38,5.4,13.38,3.52,5.19,3.62,4.5,10.19,4.73,9.4,11.21,2.29,35.94,7.58,1.54,2.98,4.35,5.85,5,8.75,10.13,6,7.69,1.92,4.29,3.46,7.65,8.35,1,6.96,2.17,14.81,10.65,5.79,6.6,4.94,1.27,4.1,8.65,3.67,5.46,1.69,2.06,5.1,2.87,6.54,0.73,1.33,3.98,9.73,12,3.29,6.06,2.48,5.58,4.85,4.15,2.35,14.35,2.77,19.6,3.35,17.13,2.65,4.58,1.02,7.12,5.25,3.04,7.9,10.67,7.42,2.44,7.1,5.62,4.71,12.6,8.98,19.65,15.56,12.58,7.33,11.06,9.33,12.79,15.29,7.23,14.17,12.06,15.35,10.71,14.69,6.94,5.73,7.17,15.77,7.02,12.75,11.56,8.12,9.02,10.79,10.08,8.21,9.81,8.92,10.73,10.15,8.62,13.23,14.02,14.13,6.25,7.37,21.4,6.63,17.29,8.4,10.77,56.38,11.98,8.9,13.48,14.23,8.17,11.62,12.38,6.88,15.65,13.02,7.71,18.33,5.69,7.96,5.35,21.17,9.17,13.96,8.69,8.5,12.94,19.94,7.54,13.9,11.4,11.31,8.83,11.19,9.94,7.5,8,8.02,11.15,15.13,7.06,10.04,27.02,13.6,15.5,13.15,7.75,9.19,11.48,8.81,10.23,12.1,5.96,8.13,12.48,9.44,9.54,19.9,4.63,23.63,24.31,20.85,15.04,11.27,40.12,8.77,7.79,8.27,5.67,7.6,8.87,17.77,5.9,6.85,12.54,21.08,15.08,11.67,32.79,5.21,21.35,18.06,13.17,16.52,17.04,10.4,21.19,8.71,14.5,8.29,19.87,3.87,6.73,6.75,22.87,16.79,6.33,10.25,9.87,14.33,13.58,11.88,9.12,12.21,10.92,9.65,13.75,
9.46,12.33,10.02,98.77,12.9,10.1,10.56,6.17,2.92,1.42,3.25,3.94,0.96,4.88,7.87,11.35,2.83,6.67,12.46,5.13,6.48,3.44,6.31,6.81,4.19,11.79,3.88,3.6,10.35,16.15,12.31,15.87,7.15,16.58,5.15,4.52,16.31,5.94,22.02,14.63,10.58,3.02,23.46,0),row.names = c(NA,-1432L),class = "data.frame")
Call volume (Hours)
Days = c(8.5,9,4,9.5,10,13,3,8.25,15,7,11,12.5,10.5,11.5,14,9.75,16,13.5,10.75,11.75,11.25,8.5)),-8658L),class = "data.frame")
Any help/guidance would be greatly appreciated.
Many thanks
Solution
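This tidyverse solution may be close to what you want, although I am not sure where the Hours variable in your expected output comes from. You can reshape the second data frame to wide format and then merge. Because the day columns exist in both data frames, the code keeps only StoreNumber, Online and Pressure from the first one before joining: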
library(tidyverse)
#Code
#Reshape df2 to wide format, then join it onto the non-day columns of df1
merged <- df1 %>%
  select(StoreNumber, Online, Pressure) %>%
  left_join(
    df2 %>% pivot_wider(names_from = Day, values_from = Time),
    by = c('StoreNumber' = 'BranchNumber')
  )
#Replace NAs (stores with no match in df2) with zeroes
merged <- merged %>% replace(is.na(.), 0)
Output:
StoreNumber Online Pressure MON TUE WED
1 1 0.2 50 7.5 6 8.5
2 2 0.8 20 0.0 0 0.0
3 3 1.2 10 2.0 1 2.5
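A possible variant of the same idea is sketched below. It fills the gaps in two places: `values_fill = 0` covers branches that are present but lack some days, and `replace_na()` is applied only to the day columns for stores that have no match at all. The name `day_cols` is introduced here purely for illustration, and the sketch assumes a tidyr version in which `values_fill` accepts a single value.

```r
library(dplyr)
library(tidyr)

# Small example data mirroring the tables in the question
df1 <- data.frame(
  StoreNumber = 1:3,
  Online      = c(0.2, 0.8, 1.2),
  Pressure    = c(50L, 20L, 10L)
)
df2 <- data.frame(
  BranchNumber = c(1L, 1L, 1L, 3L, 3L, 3L),
  Day          = c("MON", "TUE", "WED", "MON", "TUE", "WED"),
  Time         = c(7.5, 6, 8.5, 2, 1, 2.5)
)

# Day names that will become columns after reshaping (illustrative helper)
day_cols <- unique(df2$Day)

# One row per branch; days missing for a listed branch become 0
wide <- df2 %>%
  pivot_wider(names_from = Day, values_from = Time, values_fill = 0)

# Stores absent from df2 get NA from the join; zero only the day columns
merged <- df1 %>%
  left_join(wide, by = c("StoreNumber" = "BranchNumber")) %>%
  mutate(across(all_of(day_cols), ~ replace_na(.x, 0)))

print(merged)
```

Restricting the replacement to the day columns avoids silently turning a genuine NA in Online or Pressure into 0.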
Some data used:
#Data 1
df1 <- structure(list(StoreNumber = 1:3, Online = c(0.2, 0.8, 1.2), Pressure = c(50L, 20L, 10L), MON = c(0L, 0L, 0L), TUE = c(0L, 0L, 0L), WED = c(0L, 0L, 0L), SUN = c(0L, 0L, 0L)), class = "data.frame", row.names = c(NA, -3L))
#Data 2
df2 <- structure(list(BranchNumber = c(1L, 1L, 1L, 3L, 3L, 3L), Day = c("MON", "TUE", "WED", "MON", "TUE", "WED"), Time = c(7.5, 6, 8.5, 2, 1, 2.5)), class = "data.frame", row.names = c(NA, -6L))
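If the goal is to fill the existing day columns of the first data frame in place rather than build a new merged table, a base R loop with `match()` is one possible alternative. This is a sketch under the assumption that each BranchNumber/Day pair appears at most once; the data are re-declared so the block runs on its own, and `rows`, `idx` and `ok` are illustrative names.

```r
# Example data with the day columns already present and set to 0
df1 <- data.frame(
  StoreNumber = 1:3,
  Online      = c(0.2, 0.8, 1.2),
  Pressure    = c(50L, 20L, 10L),
  MON = 0, TUE = 0, WED = 0, SUN = 0
)
df2 <- data.frame(
  BranchNumber = c(1L, 1L, 1L, 3L, 3L, 3L),
  Day          = c("MON", "TUE", "WED", "MON", "TUE", "WED"),
  Time         = c(7.5, 6, 8.5, 2, 1, 2.5)
)

# Loop over the days that occur in df2 and also exist as columns in df1
for (d in intersect(unique(df2$Day), names(df1))) {
  rows <- df2[df2$Day == d, ]
  # Position of each branch in df1; NA if the branch is not a known store
  idx <- match(rows$BranchNumber, df1$StoreNumber)
  ok  <- !is.na(idx)
  # Write the times into the matching rows of the day column
  df1[[d]][idx[ok]] <- rows$Time[ok]
}

print(df1)
```

Stores that never appear in df2, such as store 2 here, are simply never touched, so their day columns keep the initial 0.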