Problem description
I am trying to reproduce the following output of dplyr code using the collapse R package.
library(tidyverse)
starwars %>%
  select(name, mass, species) %>%
  group_by(species) %>%
  mutate(mass_norm = mean(mass, na.rm = TRUE))
# A tibble: 87 x 4
# Groups: species [38]
name mass species mass_norm
<chr> <dbl> <chr> <dbl>
1 Luke Skywalker 77 Human 82.8
2 C-3PO 75 Droid 69.8
3 R2-D2 32 Droid 69.8
4 Darth Vader 136 Human 82.8
5 Leia Organa 49 Human 82.8
6 Owen Lars 120 Human 82.8
7 Beru Whitesun lars 75 Human 82.8
8 R5-D4 32 Droid 69.8
9 Biggs Darklighter 84 Human 82.8
10 Obi-Wan Kenobi 77 Human 82.8
# … with 77 more rows
collapse code
library(collapse)
starwars %>%
  fselect(name, mass, species) %>%  # mass must be selected; it appears in the output below
  fgroup_by(species) %>%
  ftransform(mass_norm = fmean(mass, na.rm = TRUE))
# A tibble: 87 x 4
name mass species mass_norm
* <chr> <dbl> <chr> <dbl>
1 Luke Skywalker 77 Human 97.3
2 C-3PO 75 Droid 97.3
3 R2-D2 32 Droid 97.3
4 Darth Vader 136 Human 97.3
5 Leia Organa 49 Human 97.3
6 Owen Lars 120 Human 97.3
7 Beru Whitesun lars 75 Human 97.3
8 R5-D4 32 Droid 97.3
9 Biggs Darklighter 84 Human 97.3
10 Obi-Wan Kenobi 77 Human 97.3
# … with 77 more rows
Grouped by: species [38 | 2 (5.5)]
I am wondering why my collapse code gives the wrong answer. Any hints would be appreciated.
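Solution
fmean uses na.rm = TRUE by default, so that argument is not the problem. In the question's code, ftransform evaluates fmean(mass) without the grouping from fgroup_by, which is why every row shows the overall mean (97.3). fmean has its own argument for specifying the grouping, namely g. By default, TRA is NULL and fmean returns summarised output (one value per group), but we can change it to 'replace_fill' to return a result of full length: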
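library(collapse)
# assumes dplyr (or tidyverse) is attached so that starwars is available
# slt() is the collapse shorthand for fselect()
ftransform(slt(starwars, name, mass, species),
           mass_norm = fmean(mass, species, TRA = 'replace_fill'))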
-output
# A tibble: 87 x 4
# name mass species mass_norm
# * <chr> <dbl> <chr> <dbl>
# 1 Luke Skywalker 77 Human 82.8
# 2 C-3PO 75 Droid 69.8
# 3 R2-D2 32 Droid 69.8
# 4 Darth Vader 136 Human 82.8
# 5 Leia Organa 49 Human 82.8
# 6 Owen Lars 120 Human 82.8
# 7 Beru Whitesun lars 75 Human 82.8
# 8 R5-D4 32 Droid 69.8
# 9 Biggs Darklighter 84 Human 82.8
#10 Obi-Wan Kenobi 77 Human 82.8
# … with 77 more rows
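To see what g and TRA each contribute, here is a minimal sketch on a made-up five-row data frame (df and its values are hypothetical, not taken from starwars). It compares the overall mean, the per-group summary, and the full-length result.

```r
library(collapse)

# Hypothetical toy data standing in for starwars (values are made up)
df <- data.frame(
  species = c("Human", "Droid", "Droid", "Human", "Human"),
  mass    = c(77, 75, 32, NA, 49)
)

# No grouping: a single overall mean (the NA is skipped, na.rm = TRUE by default)
overall <- fmean(df$mass)

# Grouping via g, TRA left as NULL: one value per group
# (Droid = 53.5, Human = 63)
per_group <- fmean(df$mass, g = df$species)

# Grouping via g plus TRA = "replace_fill": one value per row
per_row <- fmean(df$mass, g = df$species, TRA = "replace_fill")

print(overall)
print(per_group)
print(per_row)
```

The first call corresponds to what the question's code computed, the third to what dplyr's group_by() followed by mutate() returns.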
If we want to use a chain, use GRP(.) to pass the grouping of the piped data (.) as g:
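library(dplyr)
starwars %>%
  fselect(name, mass, species) %>%  # keep mass so that fmean() can find it
  fgroup_by(species) %>%
  # 'replace_fill' also fills rows where mass is NA, matching the dplyr output
  ftransform(mass_norm = fmean(mass, GRP(.), TRA = 'replace_fill'))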
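One detail worth checking when choosing the TRA mode: 'replace' and 'replace_fill' differ on rows where the input itself is missing. The sketch below uses made-up vectors (species and mass here are hypothetical toy data) and compares both modes against a base R computation of the per-group mean.

```r
library(collapse)

# Hypothetical toy data: the second Human has a missing mass
species <- c("Human", "Droid", "Droid", "Human", "Human")
mass    <- c(77, 75, 32, NA, 49)

# 'replace' puts the group mean in each row but keeps NA where mass is NA
kept_na <- fmean(mass, g = species, TRA = "replace")

# 'replace_fill' puts the group mean in every row, including the NA row
filled <- fmean(mass, g = species, TRA = "replace_fill")

# Base R counterpart of group_by() + mutate(mean(mass, na.rm = TRUE))
base_r <- ave(mass, species, FUN = function(x) mean(x, na.rm = TRUE))

print(kept_na)
print(filled)
print(base_r)
```

Since mutate() in dplyr assigns the group mean to every row, 'replace_fill' is the mode that reproduces the dplyr result when the column contains missing values.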