How to find common elements across columns of different sizes in R?

Problem description

I have a data frame named animals whose columns hold different numbers of values. Some elements appear in every column and some do not, as shown below:

Dog     Cat      Lion     Dog
Cat     Lion     Dog      Shark
Lion    Dog      Shark    Cat
Shark   Shark    Cat      Lion
        Whale    Seal     Moose
        Seal              Whale
                          Deer

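What I want to do is identify the elements that are common to every column, drop the ones that are not, and combine the common elements into a single column, like this: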

Dog
Cat
Lion
Shark

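So far I have tried identifying the repeated elements with duplicated(animals) and then extracting them with animals[duplicated(animals)], but this did not return anything useful. Does anyone have a better approach?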
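A likely reason the duplicated() attempt fails is that, on a data frame, duplicated() compares whole rows rather than individual cells. The sketch below uses a small hypothetical data frame df (not the asker's data) to show the difference between row-wise duplication and a set intersection across columns.

```r
# Hypothetical two-column data frame, for illustration only
df <- data.frame(
  a = c("Dog", "Cat", "Lion"),
  b = c("Cat", "Dog", "Shark"),
  stringsAsFactors = FALSE
)

# On a data frame, duplicated() flags rows that repeat an earlier row.
# No row here repeats another, so every flag is FALSE.
row_flags <- duplicated(df)

# Comparing values across columns calls for a set operation instead
shared <- intersect(df$a, df$b)
print(row_flags)
print(shared)
```

"Dog" and "Cat" occur in both columns, yet no complete row is repeated, so duplicated() reports nothing.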

Solution

We can use intersect, applied across all columns with Reduce:

Reduce(intersect,animals)
#[1] "Dog"   "Cat"   "Lion"  "Shark"
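To make the Reduce call less opaque, here is a sketch that unrolls it by hand. It rebuilds animals from the values shown in the question; the names step1, step2, common and common_safe are just illustrative. The last line is an assumed safeguard, not part of the original answer: it strips the NA padding first, which would matter if every column happened to contain NA.

```r
animals <- data.frame(
  v1 = c("Dog", "Cat", "Lion", "Shark", NA, NA, NA),
  v2 = c("Cat", "Lion", "Dog", "Shark", "Whale", "Seal", NA),
  v3 = c("Lion", "Dog", "Shark", "Cat", "Seal", NA, NA),
  v4 = c("Dog", "Shark", "Cat", "Lion", "Moose", "Whale", "Deer"),
  stringsAsFactors = FALSE
)

# Reduce(intersect, animals) folds the columns from left to right:
step1  <- intersect(animals$v1, animals$v2)  # values shared by v1 and v2
step2  <- intersect(step1, animals$v3)       # ... that also occur in v3
common <- intersect(step2, animals$v4)       # ... and in v4

# Assumed safeguard: drop NA from each column before intersecting,
# so the padding can never be reported as a common element
common_safe <- Reduce(intersect, lapply(animals, function(x) x[!is.na(x)]))
print(common)
```

In this data v4 has no NA, so both versions give the same result; the safeguard only changes the outcome when all columns are padded.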

Alternatively, we can use the tidyverse:

library(dplyr)
library(tidyr)
pivot_longer(animals,cols = everything(),values_drop_na = TRUE) %>% 
     group_by(value) %>% 
     filter(n_distinct(name) == ncol(animals)) %>% 
     ungroup %>% 
     distinct(value)
# A tibble: 4 x 1
#  value
#  <chr>
#1 Dog  
#2 Cat  
#3 Lion 
#4 Shark

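Data

animals <- structure(list(
  v1 = c("Dog", "Cat", "Lion", "Shark", NA, NA, NA),
  v2 = c("Cat", "Lion", "Dog", "Shark", "Whale", "Seal", NA),
  v3 = c("Lion", "Dog", "Shark", "Cat", "Seal", NA, NA),
  v4 = c("Dog", "Shark", "Cat", "Lion", "Moose", "Whale", "Deer")
), class = "data.frame", row.names = c(NA, -7L))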

Another base R option, using stack + table + rowSums:

> names(which(rowSums(table(na.omit(stack(animals)))) == ncol(animals)))
[1] "Cat"   "Dog"   "Lion"  "Shark"

Below, the code is broken down step by step:

> stack(animals)
   values ind
1     Dog  v1
2     Cat  v1
3    Lion  v1
4   Shark  v1
5    <NA>  v1
6    <NA>  v1
7    <NA>  v1
8     Cat  v2
9    Lion  v2
10    Dog  v2
11  Shark  v2
12  Whale  v2
13   Seal  v2
14   <NA>  v2
15   Lion  v3
16    Dog  v3
17  Shark  v3
18    Cat  v3
19   Seal  v3
20   <NA>  v3
21   <NA>  v3
22    Dog  v4
23  Shark  v4
24    Cat  v4
25   Lion  v4
26  Moose  v4
27  Whale  v4
28   Deer  v4

> na.omit(stack(animals))
   values ind
1     Dog  v1
2     Cat  v1
3    Lion  v1
4   Shark  v1
8     Cat  v2
9    Lion  v2
10    Dog  v2
11  Shark  v2
12  Whale  v2
13   Seal  v2
15   Lion  v3
16    Dog  v3
17  Shark  v3
18    Cat  v3
19   Seal  v3
22    Dog  v4
23  Shark  v4
24    Cat  v4
25   Lion  v4
26  Moose  v4
27  Whale  v4
28   Deer  v4

> table(na.omit(stack(animals)))
       ind
values  v1 v2 v3 v4
  Cat    1  1  1  1
  Deer   0  0  0  1
  Dog    1  1  1  1
  Lion   1  1  1  1
  Moose  0  0  0  1
  Seal   0  1  1  0
  Shark  1  1  1  1
  Whale  0  1  0  1

> rowSums(table(na.omit(stack(animals))))
  Cat  Deer   Dog  Lion Moose  Seal Shark Whale
    4     1     4     4     1     2     4     2
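The breakdown above stops just before the final filter, so here is a sketch of the last step, with animals rebuilt from the values in the question. The second variant (presence flags via tab > 0) is an assumption added here, not part of the original answer: it should stay correct even if a value were repeated within one column, whereas the plain row sum counts every occurrence.

```r
animals <- data.frame(
  v1 = c("Dog", "Cat", "Lion", "Shark", NA, NA, NA),
  v2 = c("Cat", "Lion", "Dog", "Shark", "Whale", "Seal", NA),
  v3 = c("Lion", "Dog", "Shark", "Cat", "Seal", NA, NA),
  v4 = c("Dog", "Shark", "Cat", "Lion", "Moose", "Whale", "Deer"),
  stringsAsFactors = FALSE
)

# Count how many times each value occurs across all columns
counts <- rowSums(table(na.omit(stack(animals))))

# Keep the values whose count equals the number of columns
common <- names(which(counts == ncol(animals)))

# Assumed variant: count the columns a value appears in (presence flags),
# so repeats within a single column are not counted twice
tab <- table(na.omit(stack(animals)))
common_presence <- names(which(rowSums(tab > 0) == ncol(animals)))
print(common)
```

Because table() sorts its levels, the result comes back in alphabetical order rather than in the order of the first column.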
