How to draw a custom network in Python

Problem description

I have a Pandas data frame containing the following information:

Year   NodeName   NodeSize
1990   A          50
1990   B          10
1990   C          100
1995   A          90
1995   B          70
1995   C          60
2000   A          150
2000   B          90
2000   C          100
2005   A          55
2005   B          90
2005   C          130

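I want to lay the nodes out in a grid: each year is a column, each node name is a row, and the size of each node reflects its NodeSize value.

I also have the following edges in a second data frame: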

FromYear ToYear  FromNode    ToNode   EdgeWidth
1990     1995    A           B        60   
1990     1995    A           C        20   
1990     1995    B           A        10   
1990     1995    C           B        10   
1995     2000    A           B        60   
1995     2000    B           A        30   
1995     2000    C           A        10   
1995     2000    C           B        10   
1995     2000    B           C        70   
2000     2005    A           B        10
2000     2005    A           C        60
2000     2005    B           A        60
2000     2005    B           C        25
2000     2005    C           B        44
2000     2005    C           A        10

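Each row of this second data frame describes one edge. The first row, for example, is an arrow from node A in the 1990 column to node B in the 1995 column, and the width of each edge scales linearly with the number in the EdgeWidth column.

There seem to be many tutorials on networkx, and I would appreciate some guidance.

Here is a rough sketch of what I would like it to look like. If possible, each row of nodes should also be a different color. I am hoping for an infographic of sorts, rather than a typical network, showing the flow between the nodes over the years.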

[Image: rough sketch of the desired layout]

Here is the code that generates the two data frames:

import pandas as pd

nodes = pd.DataFrame(
    [(1990,'A',50),(1990,'B',10),(1990,'C',100),
     (1995,'A',90),(1995,'B',70),(1995,'C',60),
     (2000,'A',150),(2000,'B',90),(2000,'C',100),
     (2005,'A',55),(2005,'B',90),(2005,'C',130)],
    columns=['Year','NodeName','NodeSize'])

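edges = pd.DataFrame(
    [(1990,1995,'A','B',60),(1990,1995,'A','C',20),(1990,1995,'B','A',10),(1990,1995,'C','B',10),
     (1995,2000,'A','B',60),(1995,2000,'B','A',30),(1995,2000,'C','A',10),(1995,2000,'C','B',10),(1995,2000,'B','C',70),
     (2000,2005,'A','B',10),(2000,2005,'A','C',60),(2000,2005,'B','A',60),(2000,2005,'B','C',25),(2000,2005,'C','B',44),(2000,2005,'C','A',10)],
    columns=['FromYear','ToYear','FromNode','ToNode','EdgeWidth'])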

Solution

This is fairly straightforward: convert NodeName to a y coordinate and Year to an x coordinate, then draw a bunch of Circle and FancyArrow patches.

[Image: the resulting plot]
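The coordinate mapping relies on the second return value of np.unique. As a minimal sketch of just that step, on a small made-up sample rather than the full data frame (the arrays `years` and `names` are illustrative only):

```python
import numpy as np

# Illustrative sample values, not the full data set from the question.
years = np.array([1990, 1990, 1995, 2000, 2005, 1995])
names = np.array(["A", "B", "C", "A", "B", "C"])

# return_inverse=True also returns, for every element, the index of that
# element within the sorted array of unique values.
unique_years, x = np.unique(years, return_inverse=True)
unique_names, y = np.unique(names, return_inverse=True)

# Flip the y axis so that the first name ends up in the top row.
y = y.max() - y

print(x)  # column index per element
print(y)  # row index per element, counted from the bottom
```

Each year therefore becomes an integer column index and each node name an integer row index, which is all the plotting code needs.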

#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

from matplotlib.patches import Circle,FancyArrow

nodes = pd.DataFrame(
    [(1990,'A',50),(1990,'B',10),(1990,'C',100),
     (1995,'A',90),(1995,'B',70),(1995,'C',60),
     (2000,'A',150),(2000,'B',90),(2000,'C',100),
     (2005,'A',55),(2005,'B',90),(2005,'C',130)],
    columns=['Year','NodeName','NodeSize'])

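edges = pd.DataFrame(
    [(1990,1995,'A','B',60),(1990,1995,'A','C',20),(1990,1995,'B','A',10),(1990,1995,'C','B',10),
     (1995,2000,'A','B',60),(1995,2000,'B','A',30),(1995,2000,'C','A',10),(1995,2000,'C','B',10),(1995,2000,'B','C',70),
     (2000,2005,'A','B',10),(2000,2005,'A','C',60),(2000,2005,'B','A',60),(2000,2005,'B','C',25),(2000,2005,'C','B',44),(2000,2005,'C','A',10)],
    columns=['FromYear','ToYear','FromNode','ToNode','EdgeWidth'])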

# compute node coordinates: year -> x, letter -> y;
# np.unique(z, return_inverse=True) maps the unique and alphanumerically
# ordered elements in z to consecutive integers,
# and returns the result as a second output argument
nodes['x'] = np.unique(nodes['Year'],return_inverse=True)[1]
nodes['y'] = np.unique(nodes['NodeName'],return_inverse=True)[1]

# A should be on top,C on bottom
nodes['y'] = np.max(nodes['y']) - nodes['y']

#     Year NodeName  NodeSize  x  y
# 0   1990        A        50  0  2
# 1   1990        B        10  0  1
# 2   1990        C       100  0  0
# 3   1995        A        90  1  2
# 4   1995        B        70  1  1
# 5   1995        C        60  1  0
# 6   2000        A       150  2  2
# 7   2000        B        90  2  1
# 8   2000        C       100  2  0
# 9   2005        A        55  3  2
# 10  2005        B        90  3  1
# 11  2005        C       130  3  0


# compute edge paths
edges = pd.merge(edges,nodes,how='inner',left_on=['FromYear','FromNode'],right_on=['Year','NodeName'])
edges = pd.merge(edges,nodes,how='inner',left_on=['ToYear','ToNode'],right_on=['Year','NodeName'],suffixes=['_start','_stop'])

#     FromYear  ToYear FromNode ToNode  EdgeWidth  Year_start NodeName_start  NodeSize_start  x_start  y_start  Year_stop NodeName_stop  NodeSize_stop  x_stop  y_stop
# 0       1990    1995        A      B         60        1990              A              50        0        2       1995             B             70       1       1
# 1       1990    1995        C      B         10        1990              C             100        0        0       1995             B             70       1       1
# 2       1990    1995        A      C         20        1990              A              50        0        2       1995             C             60       1       0
# 3       1990    1995        B      A         10        1990              B              10        0        1       1995             A             90       1       2
# 4       1995    2000        A      B         60        1995              A              90        1        2       2000             B             90       2       1
# 5       1995    2000        C      B         10        1995              C              60        1        0       2000             B             90       2       1
# 6       1995    2000        B      A         30        1995              B              70        1        1       2000             A            150       2       2
# 7       1995    2000        C      A         10        1995              C              60        1        0       2000             A            150       2       2
# 8       1995    2000        B      C         70        1995              B              70        1        1       2000             C            100       2       0
# 9       2000    2005        A      B         10        2000              A             150        2        2       2005             B             90       3       1
# 10      2000    2005        C      B         44        2000              C             100        2        0       2005             B             90       3       1
# 11      2000    2005        A      C         60        2000              A             150        2        2       2005             C            130       3       0
# 12      2000    2005        B      C         25        2000              B              90        2        1       2005             C            130       3       0
# 13      2000    2005        B      A         60        2000              B              90        2        1       2005             A             55       3       2
# 14      2000    2005        C      A         10        2000              C             100        2        0       2005             A             55       3       2

fig,ax = plt.subplots()

rescale_by = 1./600 # trial and error

# draw edges first
for _,edge in edges.iterrows():
    x,y = edge[['x_start','y_start']]
    dx,dy = edge[['x_stop','y_stop']].values - edge[['x_start','y_start']].values
    ax.add_patch(FancyArrow(x,y,dx,dy,width=rescale_by*edge['EdgeWidth'],length_includes_head=True,color='orange'))

# draw nodes second such that they are plotted on top of edges
for _,node in nodes.iterrows():
    ax.add_patch(Circle((node['x'],node['y']),rescale_by*node['NodeSize'],facecolor='w',edgecolor='k'))
    ax.text(node['x'],node['y'],node['NodeSize'],ha='center',va='center')

# annotate nodes
for _,node in nodes[['NodeName','y']].drop_duplicates().iterrows():
    ax.text(-0.5,node['y'],node['NodeName'],fontsize=15,fontweight='bold',va='center')

for _,node in nodes[['Year','x']].drop_duplicates().iterrows():
    ax.text(node['x'],-0.5,node['Year'],ha='center',va='center')

# adjust axis limits to include labels
ax.autoscale_view()
_,xmax = ax.get_xlim()
ax.set_xlim(-1,xmax)

# style axis
ax.set_aspect('equal')
ax.axis('off')

plt.show()
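
The question also asks for each row of nodes to have its own color, which the script above does not do (it draws every node with a white face). A minimal sketch of one way to get there, assuming matplotlib's built-in "tab10" colormap is acceptable; the helper name `row_colors` is hypothetical and not part of the original answer:

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

def row_colors(node_names, cmap_name="tab10"):
    """Map each distinct node name to one color (hypothetical helper)."""
    cmap = plt.get_cmap(cmap_name)
    unique_names = sorted(set(node_names))
    # Wrap around if there are more names than colors in the colormap.
    return {name: to_hex(cmap(i % cmap.N)) for i, name in enumerate(unique_names)}

# Stand-in for nodes['NodeName'] from the script above.
colors = row_colors(["A", "B", "C", "A", "B", "C"])
print(colors)
```

In the node loop, passing facecolor=colors[node['NodeName']] instead of facecolor='w' should then give every row a consistent color.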