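Adding totals to a data frame

Problem description

I want to add totals to my data frame, but I'm struggling because the data is very messy (as always!): some columns are text, some are dates, and some are numbers. I can't post the actual data because it is sensitive, but I'll show a representative example with the same structure (image below; the desired columns are in yellow). I've been trying to do this with dplyr and pipes, but because of the mix of text and numbers ...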

Data:

# Eight rows, matching the output shown in the answers
date <- rep(c("17/08/2020", "18/08/2020"), each = 4)
type <- rep(c("type A", "type B"), times = 4)
location <- rep(c("USA", "USA", "India", "India"), times = 2)
value <- c("10", "10", "frak", "frak", "15", "15", "open", "open")

df <- data.frame(date, type, location, value)

Basically, I need totals by date, type, and location.

Solution

I'm not sure whether this is what you want.

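library(dplyr)

df %>%
  # A constant type label collapses each date/location pair into one total row
  group_by(date, type = "total_type", location) %>%
  # Non-numeric text is coerced to NA (with a warning), so those groups total NA
  summarise(value = sum(as.numeric(value), na.rm = FALSE)) %>%
  mutate(value = as.character(value)) %>%
  bind_rows(df)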

# A tibble: 12 x 4
# Groups:   date,type [6]
   date       type       location value
   <chr>      <chr>      <chr>    <chr>
 1 17/08/2020 total_type India    NA   
 2 17/08/2020 total_type USA      20   
 3 18/08/2020 total_type India    NA   
 4 18/08/2020 total_type USA      30   
 5 17/08/2020 type A     USA      10   
 6 17/08/2020 type B     USA      10   
 7 17/08/2020 type A     India    frak 
 8 17/08/2020 type B     India    frak 
 9 18/08/2020 type A     USA      15   
10 18/08/2020 type B     USA      15   
11 18/08/2020 type A     India    open 
12 18/08/2020 type B     India    open 

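Grouping by every column except value would simply reproduce the original table, and in the image the total rows have the type total_type. On the other hand, all of the total rows in the image have the location USA, which does not make sense either, so I left location as it is.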
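The NA totals for India in the output above follow from how as.numeric() and sum() treat non-numeric text. The sketch below uses base R only and the values from the example; the variable names (num, usa_total, india_total, india_total_rm) are made up for illustration.

```r
# Values copied from the example's `value` column.
value <- c("10", "10", "frak", "frak", "15", "15", "open", "open")

# as.numeric() turns anything it cannot parse into NA and warns about it;
# suppressWarnings() only silences that warning.
num <- suppressWarnings(as.numeric(value))

# With na.rm = FALSE, a single NA makes the whole sum NA.
usa_total   <- sum(num[1:2], na.rm = FALSE)  # both entries are numeric
india_total <- sum(num[3:4], na.rm = FALSE)  # both entries are "frak"

# With na.rm = TRUE the NAs are dropped, so an all-text group sums to 0.
india_total_rm <- sum(num[3:4], na.rm = TRUE)
```

Whether NA or 0 is the better total for a group that contains only text depends on what the totals are meant to convey, so na.rm = FALSE is kept here as in the answer.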


I would suggest the following approach, which is similar to the one proposed by @Humpelstielzchen and comes close to what you show in the picture:

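library(dplyr)

# Build one total row per date/location, then append those rows to the original data
df %>% bind_rows(df %>% group_by(date, location) %>%
                   mutate(value = as.numeric(value)) %>%
                   summarise(value = sum(value, na.rm = FALSE)) %>%
                   mutate(type = 'total type', value = as.character(value)))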

Output:

         date       type location value
1  17/08/2020     type A      USA    10
2  17/08/2020     type B      USA    10
3  17/08/2020     type A    India  frak
4  17/08/2020     type B    India  frak
5  18/08/2020     type A      USA    15
6  18/08/2020     type B      USA    15
7  18/08/2020     type A    India  open
8  18/08/2020     type B    India  open
9  17/08/2020 total type    India  <NA>
10 17/08/2020 total type      USA    20
11 18/08/2020 total type    India  <NA>
12 18/08/2020 total type      USA    30

Update: here is an alternative in case the approach above fails because of the package versions the OP has installed:

library(dplyr)
#Data (eight rows, matching the output below)
date <- rep(c("17/08/2020", "18/08/2020"), each = 4)
type <- rep(c("type A", "type B"), times = 4)
location <- rep(c("USA", "USA", "India", "India"), times = 2)
value <- c("10", "10", "frak", "frak", "15", "15", "open", "open")

df <- data.frame(date, type, location, value, stringsAsFactors = FALSE)
#Mutate for summary
df1 <- df %>% group_by(date, location) %>%
  mutate(value = as.numeric(value)) %>%
  summarise(value = sum(value, na.rm = FALSE)) %>%
  mutate(type = 'total type') %>% ungroup()
#Convert the totals back to character so they match df$value
df1$value <- as.character(df1$value)
#Bind (rbind matches data frame columns by name)
df2 <- rbind(df, df1)

Output:

         date       type location value
1  17/08/2020     type A      USA    10
2  17/08/2020     type B      USA    10
3  17/08/2020     type A    India  frak
4  17/08/2020     type B    India  frak
5  18/08/2020     type A      USA    15
6  18/08/2020     type B      USA    15
7  18/08/2020     type A    India  open
8  18/08/2020     type B    India  open
9  17/08/2020 total type    India  <NA>
10 17/08/2020 total type      USA    20
11 18/08/2020 total type    India  <NA>
12 18/08/2020 total type      USA    30
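The update passes stringsAsFactors = F to data.frame(). One plausible reason, and this is an assumption rather than something the answer states, is that before R 4.0.0 data.frame() converted character columns to factors by default, and as.numeric() on a factor returns the internal level codes instead of the numbers. A minimal base R sketch with made-up names (f, codes, values):

```r
# A factor like the one data.frame() would have created by default before R 4.0.0.
f <- factor(c("10", "10", "frak", "15"))

# as.numeric() on a factor returns the level codes, not the original numbers.
# The levels are sorted as "10", "15", "frak", so the codes are 1, 1, 3, 2.
codes <- as.numeric(f)

# Going through as.character() first recovers the numbers;
# text that cannot be parsed becomes NA.
values <- suppressWarnings(as.numeric(as.character(f)))
```

With a factor column, the totals would silently be sums of level codes, which is why keeping value as character before calling as.numeric() matters.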
