Match row values to column names in R using a unique identifier

Problem description

Master

StoreNumber Online Pressure MON TUE WED ... SUN ...
   1        0.2     50     0   0   0  ...  0
   2        0.8     20     0   0   0  ...  0
   3        1.2     10     0   0   0  ...  0
   ...

Hours

BranchNumber Day ... Time
    1     MON     7.50
    1     TUE     6.00
    1     WED     8.50
    3     MON     2.00
    3     TUE     1.00
    3     WED     2.50
    ... 

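NB: There are a lot of branch numbers, about 10,500. Because a week has 7 days, a branch number can appear up to 7 times in Hours, once per day of the week, but some branches may appear only 6 or 5 times, and so on.

My idea is to fill the MON, TUE, WED ... SUN columns of the Master table with the Time values from the Hours table. The match should first check that StoreNumber equals BranchNumber. Then, for the day of the week, if a row in Hours has "MON" in Day, its Time should go into the MON column of Master.

So the output would look like this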

StoreNumber Online Pressure MON TUE WED ... Hours
   1        0.2     50   7.50 6.00 8.50  ...  53
   2        0.8     20     0   0   0  ...  30
   3        1.2     10   2.00 1.00 2.50  ...  20
   ...

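For example, BranchNumber 2 does not exist in the Hours table, so it should be skipped and all of its day columns left at 0.

I have tried the R code below but am struggling to get the output. The code runs with no errors, but the Master table stays unchanged and all the days of the week remain 0.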

    merged <- Master %>%
      select(-c("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")) %>%
      left_join(
        Hours %>% pivot_wider(names_from = Day, values_from = Time),
        by = c("StoreNumber" = "BranchNumber")
      )

    merged <- merged %>% replace(is.na(.), 0)

dput(Master)

Pressure = c(0,0.23,0.19,0.56,0.42,0.37,0.92,0.04,0.15,0.12,0.06,0.02,0.08,0.46,0.25,0.17,0.1,0.48,0.13,0.44,0.81,0.5,0.33,0.27,0.21,0.31,0.29,0.38,0.58,1.15,0.4,0.35,0.6,1.4,0.77,0.85,0.69,0.65,0.54,0.67,0.63,1.35,0.02),Online = c(1.63,2.6,1.9,5.27,1.23,1.87,4.56,3.71,2.4,9.62,1.5,1.96,2.5,10.37,2.62,7.44,2.71,1.48,16.94,3.92,2.9,4.44,2.69,6.04,1.44,0.52,1.81,9.25,3.12,5.56,2.33,4.67,2.54,3.73,1.63,3.4,1.08,6.27,2.23,3.13,2.02,3.31,2.96,3.19,6.21,8.6,0.9,0.94,1.58,3.38,4.04,6.46,3.17,4.79,7.92,6.58,5.88,5.06,6.42,4.4,2.08,2.81,3.23,1.6,3.08,5.77,1.65,2.56,3.81,4.08,3.65,3.77,2.75,4.37,4.92,2.12,4.02,1.29,6.12,4.62,1.98,9.77,7.63,5.37,0.79,1.17,4.23,4.31,6.38,4.77,8.08,2.13,1.75,0.87,2.19,1.38,3.5,1.85,18.52,6.69,1.06,1.71,7.48,11.04,6.1,10.31,4.46,2.27,3.83,4.98,5.63,1.21,9.35,1.79,6.15,6.52,1.37,4.25,0.75,3.56,1.12,4.83,5.04,2.38,5.4,13.38,3.52,5.19,3.62,4.5,10.19,4.73,9.4,11.21,2.29,35.94,7.58,1.54,2.98,4.35,5.85,5,8.75,10.13,6,7.69,1.92,4.29,3.46,7.65,8.35,1,6.96,2.17,14.81,10.65,5.79,6.6,4.94,1.27,4.1,8.65,3.67,5.46,1.69,2.06,5.1,2.87,6.54,0.73,1.33,3.98,9.73,12,3.29,6.06,2.48,5.58,4.85,4.15,2.35,14.35,2.77,19.6,3.35,17.13,2.65,4.58,1.02,7.12,5.25,3.04,7.9,10.67,7.42,2.44,7.1,5.62,4.71,12.6,8.98,19.65,15.56,12.58,7.33,11.06,9.33,12.79,15.29,7.23,14.17,12.06,15.35,10.71,14.69,6.94,5.73,7.17,15.77,7.02,12.75,11.56,8.12,9.02,10.79,10.08,8.21,9.81,8.92,10.73,10.15,8.62,13.23,14.02,14.13,6.25,7.37,21.4,6.63,17.29,8.4,10.77,56.38,11.98,8.9,13.48,14.23,8.17,11.62,12.38,6.88,15.65,13.02,7.71,18.33,5.69,7.96,5.35,21.17,9.17,13.96,8.69,8.5,12.94,19.94,7.54,13.9,11.4,11.31,8.83,11.19,9.94,7.5,8,8.02,11.15,15.13,7.06,10.04,27.02,13.6,15.5,13.15,7.75,9.19,11.48,8.81,10.23,12.1,5.96,8.13,12.48,9.44,9.54,19.9,4.63,23.63,24.31,20.85,15.04,11.27,40.12,8.77,7.79,8.27,5.67,7.6,8.87,17.77,5.9,6.85,12.54,21.08,15.08,11.67,32.79,5.21,21.35,18.06,13.17,16.52,17.04,10.4,21.19,8.71,14.5,8.29,19.87,3.87,6.73,6.75,22.87,16.79,6.33,10.25,9.87,14.33,13.58,11.88,9.12,12.21,10.92,9.65,13.75,
9.46,12.33,10.02,98.77,12.9,10.1,10.56,6.17,2.92,1.42,3.25,3.94,0.96,4.88,7.87,11.35,2.83,6.67,12.46,5.13,6.48,3.44,6.31,6.81,4.19,11.79,3.88,3.6,10.35,16.15,12.31,15.87,7.15,16.58,5.15,4.52,16.31,5.94,22.02,14.63,10.58,3.02,23.46,0),row.names = c(NA,-1432L),class = "data.frame")

dput(Hours)

Days = c(8.5,9,4,9.5,10,13,3,8.25,15,7,11,12.5,10.5,11.5,14,9.75,16,13.5,10.75,11.75,11.25,8.5)),-8658L),class = "data.frame")

Any help or guidance would be greatly appreciated.

Many thanks

Solution

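Maybe this tidyverse solution is close to what you want, although I am not sure where the Hours variable comes from. You can reshape the Hours data to wide format and then merge. Because the day columns exist in both data frames, the code keeps only StoreNumber, Online and Pressure from the first one, so the join does not produce duplicated day columns: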

library(tidyverse)
# Code
# Keep the non-day columns of df1, then join df2 reshaped to wide format
merged <- df1 %>% select(StoreNumber,Online,Pressure) %>%
  left_join(
    df2 %>% pivot_wider(names_from = Day,values_from=Time),by = c('StoreNumber'='BranchNumber'))
# Replace NA (stores with no rows in df2) with 0
merged <- merged %>% replace(is.na(.),0)

Output:

  StoreNumber Online Pressure MON TUE WED
1           1    0.2       50 7.5   6 8.5
2           2    0.8       20 0.0   0 0.0
3           3    1.2       10 2.0   1 2.5
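
The result above only has columns for the days that actually occur in the second data frame. As a minimal sketch, assuming you want all seven day columns in a fixed order even when a day never appears, Day can be turned into a factor and `pivot_wider()` asked to expand its levels. The names `master`, `hours`, `day_levels` and `hours_wide` and the toy values are made up for illustration, and `names_expand` needs a reasonably recent tidyr (1.2.0 or later, to my knowledge).

```r
library(dplyr)
library(tidyr)

# Toy data shaped like the tables in the question (illustrative values only)
master <- data.frame(
  StoreNumber = 1:3,
  Online = c(0.2, 0.8, 1.2),
  Pressure = c(50, 20, 10)
)
hours <- data.frame(
  BranchNumber = c(1L, 1L, 1L, 3L, 3L),
  Day = c("MON", "TUE", "WED", "MON", "WED"),
  Time = c(7.5, 6, 8.5, 2, 2.5)
)

# Assumed full set of day labels, in the order the columns should appear
day_levels <- c("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")

hours_wide <- hours %>%
  mutate(Day = factor(Day, levels = day_levels)) %>%
  pivot_wider(
    names_from = Day,
    values_from = Time,
    names_expand = TRUE, # one column per factor level, even for unseen days
    values_fill = 0      # days a branch has no row for become 0
  )

merged <- master %>%
  left_join(hours_wide, by = c("StoreNumber" = "BranchNumber")) %>%
  # Stores with no rows in hours come back as NA from the join
  mutate(across(all_of(day_levels), ~ replace_na(.x, 0)))

print(merged)
```

Note that the join result is stored in `merged`; `master` itself is not modified, so inspect (or assign back from) `merged` to see the filled day columns.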

Some data used:

#Data 1
df1 <- structure(list(StoreNumber = 1:3,Online = c(0.2,0.8,1.2),Pressure = c(50L,20L,10L),MON = c(0L,0L,0L),TUE = c(0L,0L,0L),WED = c(0L,0L,0L),SUN = c(0L,0L,0L)),class = "data.frame",row.names = c(NA,-3L))

#Data 2
df2 <- structure(list(BranchNumber = c(1L,1L,1L,3L,3L,3L),Day = c("MON","TUE","WED","MON","TUE","WED"),Time = c(7.5,6,8.5,2,1,2.5)),class = "data.frame",row.names = c(NA,-6L))
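
With a real Hours table of roughly 10,500 branches, it may be worth checking the keys before widening. The sketch below is one possible set of checks, not part of the original answer: it looks for repeated (BranchNumber, Day) pairs, which would otherwise make `pivot_wider()` produce list-columns, and for branches that have no matching store. The data and the names `dupes`, `unmatched` and `hours_wide` are hypothetical, and summing duplicates with `values_fn = sum` is an assumption about what repeated rows should mean.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: branch 1 has two MON rows, branch 4 has no store in master
hours <- data.frame(
  BranchNumber = c(1L, 1L, 1L, 3L, 4L),
  Day = c("MON", "MON", "TUE", "MON", "MON"),
  Time = c(7.5, 1, 6, 2, 3)
)
master <- data.frame(StoreNumber = 1:3)

# (BranchNumber, Day) pairs that occur more than once
dupes <- hours %>%
  count(BranchNumber, Day) %>%
  filter(n > 1)

# Branches in hours with no matching StoreNumber in master
unmatched <- hours %>%
  anti_join(master, by = c("BranchNumber" = "StoreNumber")) %>%
  distinct(BranchNumber)

# Assumption: repeated rows for the same branch and day should be summed
hours_wide <- hours %>%
  pivot_wider(
    names_from = Day,
    values_from = Time,
    values_fn = sum,
    values_fill = 0
  )

print(dupes)
print(unmatched)
print(hours_wide)
```

If `dupes` is empty, `values_fn` can be dropped; if `unmatched` is not empty, those branches are silently left out by the `left_join()` from the store table.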