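How do I create a network if I only have the edge names?

Problem description

I am trying to connect authors who are cited in the same process. My nodes are the authors and my edges are the processes, but I don't know how to create an edge list.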

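What I have now ('Doutrina' is the author and 'Numero' is the process number):

[image of data]

I want something like this (here "N" is how many times the link occurs, i.e. how many times the two authors were cited together):

[image of desired output]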


Sample data:

library(dplyr)

df <- tribble(
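  ~Doutrina,          ~Numero,
  "MILARE,2014",      "1009526-53.2015.8.26.0032",
  "SEGUIN,2000",      "0054387-89.2011.8.26.0224",
  "SILVA,2009",       "0054387-89.2011.8.26.0224",
  "MILARE,2015",      "0000351-14.2013.8.26.0326",
  "SILVA,2011",       "0000351-14.2013.8.26.0326",
  "MAXIMILIANO,1961", "0000351-14.2013.8.26.0326",
  "SILVA,2009",       "0000431-26.2013.8.26.0698",
  "SEGUIN,2000",      "0000431-26.2013.8.26.0698",
  "SILVA,2009",       "0054391-29.2011.8.26.0224",
  "SEGUIN,2000",      "0054391-29.2011.8.26.0224",
  "MAXIMILIANO,2015", "0012360-28.2010.8.26.0224",
  "MILARE,2015",      "0012360-28.2010.8.26.0224"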
)

df
#> # A tibble: 12 x 2
#>    Doutrina          Numero                   
#>    <chr>             <chr>                    
#>  1 MILARE,2014      1009526-53.2015.8.26.0032
#>  2 SEGUIN,2000      0054387-89.2011.8.26.0224
#>  3 SILVA,2009       0054387-89.2011.8.26.0224
#>  4 MILARE,2015      0000351-14.2013.8.26.0326
#>  5 SILVA,2011       0000351-14.2013.8.26.0326
#>  6 MAXIMILIANO,1961 0000351-14.2013.8.26.0326
#>  7 SILVA,2009       0000431-26.2013.8.26.0698
#>  8 SEGUIN,2000      0000431-26.2013.8.26.0698
#>  9 SILVA,2009       0054391-29.2011.8.26.0224
#> 10 SEGUIN,2000      0054391-29.2011.8.26.0224
#> 11 MAXIMILIANO,2015 0012360-28.2010.8.26.0224
#> 12 MILARE,2015      0012360-28.2010.8.26.0224

Solution

I modified your sample data so that the result is more interesting.

library(dplyr)

df <- tribble(
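  ~Doutrina,          ~Numero,
  "MILARE,2014",      "1009526-53.2015.8.26.0032",
  "SEGUIN,2000",      "0054387-89.2011.8.26.0224",
  "SILVA,2009",       "0054387-89.2011.8.26.0224",
  "MILARE,2015",      "0000351-14.2013.8.26.0326",
  "SILVA,2011",       "0000351-14.2013.8.26.0326",
  "MAXIMILIANO,1961", "0000351-14.2013.8.26.0326",
  "SILVA,2009",       "0000431-26.2013.8.26.0698",
  "SEGUIN,2000",      "0000431-26.2013.8.26.0698",
  "SILVA,2009",       "0054391-29.2011.8.26.0224",
  "SEGUIN,2000",      "0054391-29.2011.8.26.0224",
  "MAXIMILIANO,2015", "0012360-28.2010.8.26.0224",
  "MILARE,2015",      "0012360-28.2010.8.26.0224"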
)

df %>% 
  mutate(Doutrina = sub(",[0-9]{4}", "", Doutrina)) %>%  # remove the year
  # note: dplyr >= 1.1.0 warns about this many-to-many self-join;
  # adding relationship = "many-to-many" to full_join() silences the warning
  full_join(x = ., y = ., by = "Numero") %>%  # join data to itself by Numero
  select(Doutrina = Doutrina.x, Doutrina2 = Doutrina.y) %>%  # keep only name columns
  filter(Doutrina != Doutrina2) %>%  # remove self-reference rows
  filter(Doutrina < Doutrina2) %>%  # only keep rows for one direction of edge/link
  group_by(Doutrina, Doutrina2) %>% 
  summarise(N = n(), .groups = "drop")
#> # A tibble: 4 x 3
#>   Doutrina    Doutrina2     N
#>   <chr>       <chr>     <int>
#> 1 MAXIMILIANO MILARE        2
#> 2 MAXIMILIANO SILVA         1
#> 3 MILARE      SILVA         1
#> 4 SEGUIN      SILVA         3
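
As a cross-check, the same edge list can probably also be obtained without a self-join, by treating the data as an author-by-process incidence matrix and multiplying it by its own transpose. This is only a sketch in base R, not part of the answer above; the names `author`, `inc`, `co`, `pairs` and `edges` are my own. One assumption to be aware of: an author cited several times in the same process is counted once per process here, which can differ from the join-based count if such repeats exist.

```r
# Sample data from the question, as a base R data frame (no packages needed).
df <- data.frame(
  Doutrina = c("MILARE,2014", "SEGUIN,2000", "SILVA,2009", "MILARE,2015",
               "SILVA,2011", "MAXIMILIANO,1961", "SILVA,2009", "SEGUIN,2000",
               "SILVA,2009", "SEGUIN,2000", "MAXIMILIANO,2015", "MILARE,2015"),
  Numero = c("1009526-53.2015.8.26.0032", "0054387-89.2011.8.26.0224",
             "0054387-89.2011.8.26.0224", "0000351-14.2013.8.26.0326",
             "0000351-14.2013.8.26.0326", "0000351-14.2013.8.26.0326",
             "0000431-26.2013.8.26.0698", "0000431-26.2013.8.26.0698",
             "0054391-29.2011.8.26.0224", "0054391-29.2011.8.26.0224",
             "0012360-28.2010.8.26.0224", "0012360-28.2010.8.26.0224"),
  stringsAsFactors = FALSE
)

# Remove the year, as in the answer above.
author <- sub(",[0-9]{4}", "", df$Doutrina)

# Incidence matrix: rows are authors, columns are processes,
# 1 if the author is cited in that process, 0 otherwise.
inc <- 1 * (unclass(table(author, df$Numero)) > 0)

# Co-citation matrix: entry [i, j] is the number of processes citing both i and j.
co <- tcrossprod(inc)

# Keep the upper triangle only, so each undirected pair appears once.
pairs <- which(upper.tri(co) & co > 0, arr.ind = TRUE)
authors <- rownames(inc)

edges <- data.frame(
  Doutrina  = authors[pairs[, 1]],
  Doutrina2 = authors[pairs[, 2]],
  N         = co[pairs],
  stringsAsFactors = FALSE
)

print(edges)
```

On the sample data this should give the same four author pairs and counts as the dplyr pipeline above.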
