如何创建R中连续出现的元素的同时出现矩阵?

问题描述

我是R语言的新秀,我想基于一行中同时出现的元素创建一个共现矩阵。

理想结果的基本示例

说您有这张桌子:

df <- data.frame(ID = c(1,2,3),V1 = c("England","England","China"),V2 = c("Greece","Greece")
)

也就是说:

   ID      V1      V2
1:  1 England  Greece
2:  2 England England
3:  3   China  Greece

我的同现矩阵应如下所示:

Country   China    England  Greece
China           0       0      1 #China & Greece co-occur in row 3
England         0       1      1 #England & England co-occur in row 2,and England and Greece in row 1
Greece          1       1      0 

但是,如果我遵循此example,我会得到:

library(tidyverse)
df %>%
      pivot_longer(-ID,names_to = "Category",values_to = "Country") %>%
      xtabs(~ID + Country,data = .,sparse = FALSE) %>% 
      crossprod(.,.) 
    
    df_diag <- df %>% 
      pivot_longer(-ID,values_to = "Country") %>%
      mutate(Country2 = Country) %>%
      xtabs(~Country + Country2,sparse = FALSE) %>% 
      diag()
    
    diag(df1) <- df_diag 
    
    df1

Country   China England Greece
  China       1       0      1
  England     0       3      1
  Greece      1       1      2

如果我跟随其他one,我将得到:

library(reshape2)
library(data.table)

melt(setDT(df),id.vars = "ID",measure = patterns("^V"))[nchar(value) > 0 & complete.cases(value)] -> foo

# Get distinct value (country) in each ID group (each row)
unique(foo,by = c("ID","value")) -> foo2

# https://stackoverflow.com/questions/13281303/creating-co-occurrence-matrix
# Seeing this question,you want to create a matrix with crossprod().

crossprod(table(foo2[,c(1,3)])) -> mymat

# Finally,you need to change diagonal values. If a value is equal to one,# change it to zero. Otherwise,keep the original value.

diag(mymat) <- ifelse(diag(mymat) <= 1,mymat)

mymat

value     China England Greece
  China       0       0      1
  England     0       0      1
  Greece      1       1      1

我想我缺少有关xtabs的工作方式或者也许还有跨脚架的东西?

解决方法

您需要的是:

在基数R中,您可以这样做:

a <- table(lapply(df[-1],factor,levels = sort(unique(unlist(df[-1])))))
a[lower.tri(a)] <- t(a)[lower.tri(a)]
 a
         V2
V1        China England Greece
  China       0       0      1
  England     0       1      1
  Greece      1       1      0

其他库:

library(network)
as.matrix(network(df[-1],directed = FALSE))
        China England Greece
China       0       0      1
England     0       1      1
Greece      1       1      0

library(igraph)
as.matrix(as_adj(graph_from_data_frame(df[-1],FALSE)))
        England China Greece
England       1     0      1
China         0     0      1
Greece        1     1      0
,

您可以在底数R中使用outer

unique_vals <- sort(union(df$V1,df$V2))
co_mat <- function(x,y) +(any(df$V1 == x & df$V2 == y | 
                                df$V2 == x & df$V1 == y))
mat <- outer(unique_vals,unique_vals,Vectorize(co_mat))
dimnames(mat) <- list(unique_vals,unique_vals)
mat

#        China England Greece
#China       0       0      1
#England     0       1      1
#Greece      1       1      0