How do I add a column that ranks values?

Problem description

I asked a similar question a while ago, but later realized that my problem is actually more complex. Sorry for asking again.

df <- data.frame(
  comp_name = rep(c("A", "B", "A", "B", "C", "D", "C", "D"), each = 2),
  country = rep(c("US", "France"), each = 8),
  year = rep(rep(c("2018", "2019"), each = 4), times = 2),
  type = rep(c("profit", "revenue"), times = 8),
  value = c(10, 20, 30, 40, 20, 30, 40, 50, 140, 150, 120, 130, 100, 110, 80, 90)
)

df:

   comp_name country year    type value
1          A      US 2018  profit    10
2          A      US 2018 revenue    20
3          B      US 2018  profit    30
4          B      US 2018 revenue    40
5          A      US 2019  profit    20
6          A      US 2019 revenue    30
7          B      US 2019  profit    40
8          B      US 2019 revenue    50
9          C  France 2018  profit   140
10         C  France 2018 revenue   150
11         D  France 2018  profit   120
12         D  France 2018 revenue   130
13         C  France 2019  profit   100
14         C  France 2019 revenue   110
15         D  France 2019  profit    80
16         D  France 2019 revenue    90

I would like to add a rank column like this:

   comp_name country year    type value rank
1          A      US 2018  profit    10     
2          A      US 2018 revenue    20     
3          B      US 2018  profit    30     
4          B      US 2018 revenue    40     
5          A      US 2019  profit    20    2
6          A      US 2019 revenue    30     
7          B      US 2019  profit    40    1
8          B      US 2019 revenue    50     
9          C  France 2018  profit   140     
10         C  France 2018 revenue   150     
11         D  France 2018  profit   120     
12         D  France 2018 revenue   130     
13         C  France 2019  profit   100    1
14         C  France 2019 revenue   110     
15         D  France 2019  profit    80    2
16         D  France 2019 revenue    90     

I only want to consider profit in 2019, and rank the companies within each country by that profit.

When I asked this question before, @KarthikS provided the following solution:

library(dplyr)
df %>% group_by(country) %>% mutate(rank = rank(desc(value)))

However, I have now added more variables (year and type), and I want to take those into account as well.

Please let me know if the question is unclear. I am new to R, and any help would be greatly appreciated. Thanks!

Solution

Compute the rank within every combination of country, year, and type, then blank out the values you don't need. (Or keep them.)

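library(dplyr)
df %>%
  # rank within each country/year/type group; highest value gets rank 1
  group_by(country, year, type) %>%
  mutate(rank = rank(desc(value))) %>%
  ungroup() %>%
  # keep the rank only for 2019 profit rows (year is stored as a character column)
  mutate(rank = if_else(year == "2019" & type == "profit", rank, NA_real_))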
# # A tibble: 16 x 6
#    comp_name country year  type    value  rank
#    <chr>     <chr>   <chr> <chr>   <dbl> <dbl>
#  1 A         US      2018  profit     10    NA
#  2 A         US      2018  revenue    20    NA
#  3 B         US      2018  profit     30    NA
#  4 B         US      2018  revenue    40    NA
#  5 A         US      2019  profit     20     2
#  6 A         US      2019  revenue    30    NA
#  7 B         US      2019  profit     40     1
#  8 B         US      2019  revenue    50    NA
#  9 C         France  2018  profit    140    NA
# 10 C         France  2018  revenue   150    NA
# 11 D         France  2018  profit    120    NA
# 12 D         France  2018  revenue   130    NA
# 13 C         France  2019  profit    100     1
# 14 C         France  2019  revenue   110    NA
# 15 D         France  2019  profit     80     2
# 16 D         France  2019  revenue    90    NA
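
If the other rows never need a rank, another option is to rank only the rows of interest. The following is a minimal base R sketch of that idea, not part of the original answer. The variable name `target` is just illustrative, and `df` is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data so this snippet is self-contained
df <- data.frame(
  comp_name = rep(c("A", "B", "A", "B", "C", "D", "C", "D"), each = 2),
  country = rep(c("US", "France"), each = 8),
  year = rep(rep(c("2018", "2019"), each = 4), times = 2),
  type = rep(c("profit", "revenue"), times = 8),
  value = c(10, 20, 30, 40, 20, 30, 40, 50, 140, 150, 120, 130, 100, 110, 80, 90)
)

# Logical mask for the rows that should receive a rank
target <- df$year == "2019" & df$type == "profit"

# Start with NA everywhere, then fill in ranks for the target rows only.
# ave() applies rank() within each country; negating value ranks in descending order.
df$rank <- NA_real_
df$rank[target] <- ave(-df$value[target], df$country[target], FUN = rank)

df$rank[target]
# [1] 2 1 1 2
```

This gives the same `rank` column as the dplyr pipeline above, but it never computes ranks for the 2018 or revenue rows, which may be easier to read when only one subset matters.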
