Problem description
I want to assign an attribute to my nodes. Currently I am creating the network from data like the following example:
Attribute  Source      Target    Weight  Label
87.5       Heisenberg  Pauli     66.3    1
12.5       Beckham     Messi     38.1    0
12.5       Beckham     Maradona  12      0
43.5       water       melon     33.6    1
Label should determine the node color (1 = yellow, 0 = blue).
Network code:
G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr='Weight')
# I need this for the lines below because I want to draw the nodes with a size based on their degree
collist = df.drop('Weight', axis=1).melt('Label').dropna()
degrees = []
for x in collist['value']:
    deg = G.degree[x]
    degrees.append(100 * deg)
pos = nx.spring_layout(G)
nx.draw_networkx_labels(G, pos, font_size=10)
nx.draw_networkx_nodes(G, pos, nodelist=collist['value'], node_size=degrees, node_color=collist['Label'])
nx.draw_networkx_edges(G, pos)
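What this code should do: the size of each node should reflect its degree (which explains degrees and collist in my code). The thickness of each edge should equal Weight. Attribute should be assigned (and updated) as in this link: (Changing attributes of nodes). Currently my code does not include the assignment from that link, which adds and updates the values as follows: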
G = nx.Graph()
G.add_node(0,weight=8)
G.add_node(1,weight=5)
G.add_node(2,weight=3)
G.add_node(3,weight=2)
nx.add_path(G,[2,5])
nx.add_path(G,[2,3])
labels = {
    n: str(n) + '\nweight=' + str(G.nodes[n]['weight']) if 'weight' in G.nodes[n] else str(n)
    for n in G.nodes
}
newWeights = \
    [
        sum(  # sum for averaging
            [G.nodes[neighbor]['weight'] for neighbor in G.neighbors(node)]  # weight of every neighbor
            + [G.nodes[i]['weight']]  # adds the node itself to the average
        ) / (len(list(G.neighbors(node))) + 1)  # average over number of neighbors + 1
        if len(list(G.neighbors(node))) > 0  # only if the node has neighbors
        else G.nodes[i]['weight']  # weight stays the same if there are no neighbors
        for i, node in enumerate(G.nodes)  # do the above for every node
    ]
print(newWeights)
for i, node in enumerate(G.nodes):
    G.nodes[i]['weight'] = newWeights[i]  # writes the new weights after all of them have been calculated
Note that I have more than 100 nodes, so I cannot do this manually. I tried to include the attribute in my code as follows:
G = nx.from_pandas_edgelist(df_net,edge_attr=['Weight'])
nx.set_node_attributes(G,pd.Series(nodes.Attribute,index=nodes.node).to_dict(),'Attribute')
However, I got this error:
----> 1 network(df)
<ipython-input-72-f68985d20046> in network(dataset)
24 degrees=[]
25 for x in collist['value']:
---> 26 deg=G.degree[x]
27 degrees.append(100*deg)
28
~/opt/anaconda3/lib/python3.8/site-packages/networkx/classes/reportviews.py in __getitem__(self,n)
445 def __getitem__(self,n):
446 weight = self._weight
--> 447 nbrs = self._succ[n]
448 if weight is None:
449 return len(nbrs) + (n in nbrs)
KeyError: 87.5
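The expected output is a network whose nodes are the entries of the Source column, with their neighbors taken from the Target column. Edges have a thickness based on Weight. Label gives the color of the source nodes, while the Attribute value should be added as a node label and updated as in the question/answer at this link: Changing attributes of nodes.
Please see below a visual example of the kind of network I am trying to build. The attribute values in the figure are the values before the update (newWeights), which explains why some nodes have missing values. Attribute relates only to Source, which is colored according to Label. The thickness of the edges is given by Weight.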
Solution
Based on your sample dataframe and the desired output image, I created the following solution:
import pandas as pd
import networkx as nx
import matplotlib.pylab as pl
df = pd.DataFrame(
    data=[
        [87.5, "Heisenberg", "Pauli", 66.3, 1],
        [12.5, "Beckham", "Messi", 38.1, 0],
        [12.5, "Beckham", "Maradona", 12, 0],
        [43.5, "water", "melon", 33.6, 1],
    ],
    columns=["Attribute", "Source", "Target", "Weight", "Label"],
)
# 1 Creating the graph
G = nx.from_pandas_edgelist(df,source='Source',target='Target',edge_attr='Weight')
# 2 Adding the node attributes for the source nodes
nx.set_node_attributes(G,{node: df.Attribute[i] for i,node in enumerate(df.Source)},'Attribute')
nx.set_node_attributes(G,{node: df.Label[i] for i,node in enumerate(df.Source)},'Label')
# (optional) checking the created data
print(G.nodes(data=True))
# [('Heisenberg',{'Attribute': 87.5,'Label': 1}),('Pauli',{}),('Beckham',{'Attribute': 12.5,'Label': 0}),('Messi',{}),('Maradona',{}),('water',{'Attribute': 43.5,'Label': 1}),('melon',{})]
print(G.edges(data=True))
# [('Heisenberg','Pauli',{'Weight': 66.3}),('Beckham','Messi',{'Weight': 38.1}),('Beckham','Maradona',{'Weight': 12.0}),('water','melon',{'Weight': 33.6})]
# 3 fine tuning the visualisation
degrees = [100 * G.degree[i] for i in G.nodes()]
# not sure what should be the color if no label is available
color_dict = {0: "blue",1: "yellow","default": "yellow"}
node_colors = []
labels = {}
for node in G:
    label = node + "\n Attribute="
    if "Attribute" in G.nodes[node]:
        label += str(G.nodes[node]["Attribute"])
    labels[node] = label
    if "Label" in G.nodes[node]:
        node_colors.append(color_dict[G.nodes[node]["Label"]])
    else:
        node_colors.append(color_dict["default"])
# you can use any other layout e.g spring_layout
pos = nx.circular_layout(G)
nx.draw_networkx(G,pos,node_color=node_colors,node_size=degrees,width=[edge_info[2]/10 for edge_info in G.edges(data="Weight")],labels=labels,)
# 4 Adjustments for node labels partially cut
axis = pl.gca()
# zoom out
# maybe smaller factors work as well,but 1.3 works fine for this minimal example
axis.set_xlim([1.3*x for x in axis.get_xlim()])
axis.set_ylim([1.3*y for y in axis.get_ylim()])
# turn off frame
pl.axis("off")
pl.show()
The result looks like this:
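Explanation
The main steps to create the network are:
- The simplest part is creating the basic network with nx.from_pandas_edgelist, which already adds the edge weights.
- After that, the node attributes are added with nx.set_node_attributes.
- At this point the graph is complete, and all following code should operate only on the graph, e.g. G.nodes(). Therefore, to adjust the visualization, I loop only over G.
- Finally, I fine-tuned the resulting matplotlib figure so that the node labels are not cut off.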
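The question also asks for the Attribute values to be updated as in the linked "Changing attributes of nodes" post, which the steps above do not cover. Here is a minimal sketch, assuming the update means averaging each node's Attribute with those of its neighbors, and that nodes without an Attribute are skipped and stay without one. Both points are assumptions. The helper name average_with_neighbors and the toy graph H are hypothetical and used only for illustration.

```python
import networkx as nx


def average_with_neighbors(graph, key="Attribute"):
    """Hypothetical helper: map each node that has `key` to the mean of
    its own value and the values of those neighbors that also have `key`."""
    new_values = {}
    for node, data in graph.nodes(data=True):
        if key not in data:
            continue  # assumption: nodes without a value keep no value
        values = [data[key]]
        values += [
            graph.nodes[nbr][key]
            for nbr in graph.neighbors(node)
            if key in graph.nodes[nbr]
        ]
        new_values[node] = sum(values) / len(values)
    return new_values


# Toy graph (hypothetical data): "c" has no Attribute, like the Target nodes above.
H = nx.Graph()
H.add_node("a", Attribute=8.0)
H.add_node("b", Attribute=4.0)
H.add_edge("a", "b")
H.add_edge("b", "c")

# Compute every new value first, then write them back in one step,
# so that no node is averaged against an already updated neighbor.
new_attributes = average_with_neighbors(H)
nx.set_node_attributes(H, new_attributes, "Attribute")
print(new_attributes)  # {'a': 6.0, 'b': 6.0}
```

Because the helper looks nodes up by name rather than by position, it also works when the nodes are strings such as "Heisenberg". With the sample data, where only Source nodes carry an Attribute and none of them are adjacent to each other, the values would stay unchanged.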
The edge_attr parameter can accept a list:
G = nx.from_pandas_edgelist(df,source='Source',target='Target',edge_attr=['Weight','Attribute'])
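To make the effect of passing a list to edge_attr concrete, here is a small sketch using the sample rows from the question. The source and target column names are passed explicitly because the sample columns are capitalized, whereas the function defaults are lowercase.

```python
import networkx as nx
import pandas as pd

# Sample rows from the question
df = pd.DataFrame(
    data=[
        [87.5, "Heisenberg", "Pauli", 66.3, 1],
        [12.5, "Beckham", "Messi", 38.1, 0],
        [12.5, "Beckham", "Maradona", 12, 0],
        [43.5, "water", "melon", 33.6, 1],
    ],
    columns=["Attribute", "Source", "Target", "Weight", "Label"],
)

# Every column listed in edge_attr becomes a key in the edge data dict.
G = nx.from_pandas_edgelist(
    df, source="Source", target="Target", edge_attr=["Weight", "Attribute"]
)

edge_data = G.edges["Heisenberg", "Pauli"]
print(sorted(edge_data))           # names of the attributes stored on this edge
print(G.nodes["Heisenberg"])       # the node itself carries no attributes
```

Note that this stores Attribute on each edge, not on the Source node, so node-level values still have to be set separately, for example with nx.set_node_attributes.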
What is the purpose of your melt? If you want to see the label of each node, you can use:
df['Node'] = df['Source'].str.cat(df['Target'],sep=' ').str.split(' ')
df = df.explode('Node')
print(df)
   Attribute      Source    Target  Weight  Label        Node
0       87.5  Heisenberg     Pauli    66.3      1  Heisenberg
0       87.5  Heisenberg     Pauli    66.3      1       Pauli
1       12.5     Beckham     Messi    38.1      1     Beckham
1       12.5     Beckham     Messi    38.1      1       Messi
2       23.5     Beckham  Maradona    12.0      0     Beckham
2       23.5     Beckham  Maradona    12.0      0    Maradona
3       43.5       water     melon    33.6      1       water
3       43.5       water     melon    33.6      1       melon
However, there are duplicate nodes with different labels, so you need to choose which one to keep.
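One possible way to resolve the duplicates is sketched below, under the assumption that the first label seen for a node should win. That policy is only an illustration, and the variable names nodes and node_labels are hypothetical. The data are the rows shown in the printed output above.

```python
import pandas as pd

# Rows as shown in the printed output above
df = pd.DataFrame(
    data=[
        [87.5, "Heisenberg", "Pauli", 66.3, 1],
        [12.5, "Beckham", "Messi", 38.1, 1],
        [23.5, "Beckham", "Maradona", 12.0, 0],
        [43.5, "water", "melon", 33.6, 1],
    ],
    columns=["Attribute", "Source", "Target", "Weight", "Label"],
)

# Stack only the Source and Target names into a single "Node" column.
nodes = df.melt(id_vars=["Label"], value_vars=["Source", "Target"], value_name="Node")

# Assumed policy: keep the first label encountered for each node.
node_labels = (
    nodes.drop_duplicates(subset="Node", keep="first")
    .set_index("Node")["Label"]
)
print(node_labels.to_dict())
```

Restricting value_vars to Source and Target also keeps Attribute values such as 87.5 out of the node column. In the question's melt('Label') call those values end up in the value column, which is why G.degree[x] raises KeyError: 87.5.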