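How do I add a column that ranks values?

Question

I asked a similar question a while ago, but later realized that my problem is actually more complicated. Sorry for asking again.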

df <- data.frame(
  comp_name = rep(c("A", "B", "A", "B", "C", "D", "C", "D"), each = 2),
  country   = rep(c("US", "France"), each = 8),
  year      = rep(rep(c("2018", "2019"), each = 4), times = 2),
  type      = rep(c("profit", "revenue"), times = 8),
  value     = c(10, 20, 30, 40, 20, 30, 40, 50, 140, 150, 120, 130, 100, 110, 80, 90)
)

df:

   comp_name country year    type value
1          A      US 2018  profit    10
2          A      US 2018 revenue    20
3          B      US 2018  profit    30
4          B      US 2018 revenue    40
5          A      US 2019  profit    20
6          A      US 2019 revenue    30
7          B      US 2019  profit    40
8          B      US 2019 revenue    50
9          C  France 2018  profit   140
10         C  France 2018 revenue   150
11         D  France 2018  profit   120
12         D  France 2018 revenue   130
13         C  France 2019  profit   100
14         C  France 2019 revenue   110
15         D  France 2019  profit    80
16         D  France 2019 revenue    90

I would like to add a rank column like this:

   comp_name country year    type value rank
1          A      US 2018  profit    10     
2          A      US 2018 revenue    20     
3          B      US 2018  profit    30     
4          B      US 2018 revenue    40     
5          A      US 2019  profit    20    2
6          A      US 2019 revenue    30     
7          B      US 2019  profit    40    1
8          B      US 2019 revenue    50     
9          C  France 2018  profit   140     
10         C  France 2018 revenue   150     
11         D  France 2018  profit   120     
12         D  France 2018 revenue   130     
13         C  France 2019  profit   100    1
14         C  France 2019 revenue   110     
15         D  France 2019  profit    80    2
16         D  France 2019 revenue    90

I only want to consider the 2019 profit, and rank the companies within each country by that profit.

When I asked this before, @KarthikS offered the following solution:

library(dplyr)
df %>% group_by(country) %>% mutate(rank = rank(desc(value)))

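However, I have now added more variables (year and type), and I want to take those into account as well.

Please let me know if the question is unclear. I am new to R, and any help would be much appreciated. Thank you!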

Answer

Compute the rank for every country, every year, and every type, then drop the values you don't need. (Or keep them.)

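library(dplyr)

df %>%
  # Rank within every country/year/type combination; desc() gives the largest value rank 1.
  group_by(country, year, type) %>%
  mutate(rank = rank(desc(value))) %>%
  ungroup() %>%
  # Keep the rank only for 2019 profit rows (year is a character column); set the rest to NA.
  mutate(rank = if_else(year == "2019" & type == "profit", rank, NA_real_))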
# # A tibble: 16 x 6
#    comp_name country year  type    value  rank
#    <chr>     <chr>   <chr> <chr>   <dbl> <dbl>
#  1 A         US      2018  profit     10    NA
#  2 A         US      2018  revenue    20    NA
#  3 B         US      2018  profit     30    NA
#  4 B         US      2018  revenue    40    NA
#  5 A         US      2019  profit     20     2
#  6 A         US      2019  revenue    30    NA
#  7 B         US      2019  profit     40     1
#  8 B         US      2019  revenue    50    NA
#  9 C         France  2018  profit    140    NA
# 10 C         France  2018  revenue   150    NA
# 11 D         France  2018  profit    120    NA
# 12 D         France  2018  revenue   130    NA
# 13 C         France  2019  profit    100     1
# 14 C         France  2019  revenue   110    NA
# 15 D         France  2019  profit     80     2
# 16 D         France  2019  revenue    90    NA
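
If you would rather not depend on dplyr, the same idea can probably be sketched in base R by going the other way round: select the 2019 profit rows first and rank only those. This is an alternative sketch, not the approach above. It assumes the 16-row df from the question (rebuilt here so the snippet runs on its own), and the helper name keep is made up for illustration.

```r
# Rebuild the example data so the snippet is self-contained.
df <- data.frame(
  comp_name = rep(c("A", "B", "A", "B", "C", "D", "C", "D"), each = 2),
  country   = rep(c("US", "France"), each = 8),
  year      = rep(rep(c("2018", "2019"), each = 4), times = 2),
  type      = rep(c("profit", "revenue"), times = 8),
  value     = c(10, 20, 30, 40, 20, 30, 40, 50,
                140, 150, 120, 130, 100, 110, 80, 90)
)

# Hypothetical helper: TRUE for the rows that should receive a rank.
keep <- df$year == "2019" & df$type == "profit"

# Start with NA everywhere, then fill in ranks for the selected rows only.
df$rank <- NA_real_
df$rank[keep] <- ave(
  -df$value[keep],    # negate so the largest value gets rank 1
  df$country[keep],   # rank separately within each country
  FUN = rank
)

df
```

Like rank() in the dplyr version, this gives tied values their average rank. If you need a different tie rule, wrap rank() in a function that passes ties.method (for example "min").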
