Problem description
I have a data frame containing daily revenue from multiple channels. It looks like this:
orders_dataframe:
Order |Channel | Revenue |
1 |TV | 120 |
2 |Email | 30 |
3 |Retail | 300 |
4 |Shop1 | 50 |
5 |Shop2 | 90 |
6 |Email | 20 |
7 |Retail | 250 |
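What I want to do is split the revenue from Retail and allocate it between Shop1 and Shop2 according to a predefined ratio (for example, a 60%/40% split). That is, every row whose Channel is "Retail" should be attributed 60% to Shop1 and 40% to Shop2. This can be represented by replacing each Retail row with two new rows, as shown for orders 3 and 7 in the final table I want to obtain: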
orders_dataframe:
Order |Channel | Revenue |
1 |TV | 120 |
2 |Email | 30 |
3 |Shop1 | 180 |
3 |Shop2 | 120 |
4 |Shop1 | 50 |
5 |Shop2 | 90 |
6 |Email | 20 |
7 |Shop1 | 150 |
7 |Shop2 | 100 |
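Ideally, since I need to do this for a variety of datasets, I would like to take the percentages from a data frame (split_dataframe) rather than assign 60% and 40% by hand. I would like to use data like the following: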
split_dataframe:
Channel |Percent |
Shop1 |60% |
Shop2 |40% |
Here is a reproducible example of the two data frames:
orders_dataframe <- data.frame(Order = c(1,2,3,4,5,6,7),Channel = c("TV","Email","Retail","Shop1","Shop2","Email","Retail"),Revenue = c(120,30,300,50,90,20,250))
split_dataframe <- data.frame(Channel = c("Shop1","Shop2"),Percent = c(0.6,0.4))
Thank you very much!
Solutions
Using dplyr:
library(dplyr)
split_dataframe %>%
mutate(Index="Retail") %>%
merge(.,orders_dataframe,by.x="Index",by.y="Channel") %>%
mutate(Revenue=Revenue*Percent) %>%
select(Order,Channel,Revenue) %>%
bind_rows(orders_dataframe %>% filter(Channel !="Retail"),.)%>%
arrange(.,Order)
This gives:
Order Channel Revenue
1 1 TV 120
2 2 Email 30
3 3 Shop1 180
4 3 Shop2 120
5 4 Shop1 50
6 5 Shop2 90
7 6 Email 20
8 7 Shop1 150
9 7 Shop2 100
Here is a data.table approach; see the comments in the code for an explanation.
library( data.table )
#make them data.tables
setDT( orders_dataframe ); setDT( split_dataframe )
#split into retail and non-retail orders
orders_retail <- orders_dataframe[ Channel == "Retail",]
orders_no_retail <- orders_dataframe[ Channel != "Retail",]
#divide the retail orders over the two shops (multiple steps)
#create a new column for each shop, initially holding the full retail revenue
shop_cols <- split_dataframe$Channel
orders_retail[,(shop_cols) := Revenue ]
#melt to long format
orders_retail.melt <- melt( orders_retail,id.vars = "Order",measure.vars = (shop_cols),variable.name = "Channel",value.name = "Revenue")
#and update the molten data with the percentages in the split_dataframe
orders_retail.melt[ split_dataframe,Revenue := Revenue * i.Percent,on = .( Channel )]
#merge everything back together and order on Order id
ans <- rbind( orders_no_retail,orders_retail.melt )
setorder( ans,Order )
# Order Channel Revenue
# 1: 1 TV 120
# 2: 2 Email 30
# 3: 3 Shop1 180
# 4: 3 Shop2 120
# 5: 4 Shop1 50
# 6: 5 Shop2 90
# 7: 6 Email 20
# 8: 7 Shop1 150
# 9: 7 Shop2 100
You can also do this in base R.
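One possible base R version is sketched below. It assumes the Percent column holds fractions (0.6, 0.4) and that the channel to split is named "Retail". The helper names is_retail, retail, split_rows and result are illustrative and not part of the question. The idea is to cross join the Retail rows with split_dataframe, scale the revenue, and bind the result back onto the non-Retail rows.

```r
# Inputs as in the question (Channel vector matching the 7-row table)
orders_dataframe <- data.frame(
  Order   = 1:7,
  Channel = c("TV", "Email", "Retail", "Shop1", "Shop2", "Email", "Retail"),
  Revenue = c(120, 30, 300, 50, 90, 20, 250),
  stringsAsFactors = FALSE
)
split_dataframe <- data.frame(
  Channel = c("Shop1", "Shop2"),
  Percent = c(0.6, 0.4),
  stringsAsFactors = FALSE
)

# Separate the rows that need to be split (assumption: channel is "Retail")
is_retail <- orders_dataframe$Channel == "Retail"
retail    <- orders_dataframe[is_retail, ]

# merge() with by = NULL returns the Cartesian product, so every Retail
# order is paired with every shop listed in split_dataframe
split_rows <- merge(retail[c("Order", "Revenue")], split_dataframe, by = NULL)
split_rows$Revenue <- split_rows$Revenue * split_rows$Percent

# Recombine with the untouched rows and restore the order
result <- rbind(
  orders_dataframe[!is_retail, ],
  split_rows[c("Order", "Channel", "Revenue")]
)
result <- result[order(result$Order, result$Channel), ]
rownames(result) <- NULL
print(result)
```

Because the shares come from split_dataframe, adding a third shop or changing the ratio only requires editing that table. The shares should sum to 1 if total revenue is meant to be preserved.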