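Problem description
I have the following undirected, unweighted graph, and I want to measure the quality of a clustering algorithm. For this measurement, I need the answer to one question:
How many unique edges are there between the vertices of a single cluster?
For example: cluster red has 6 edges, cluster blue has 4 edges, and cluster green has 4 edges.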
import networkx as nx
# nx.Graph is already undirected, so no extra arguments or conversion are needed
G = nx.Graph()
G.add_edges_from([
("peter","missy"),("peter","longfellow"),("missy","rhinehardt"),"vivian"),("brandon","zoe"),("longfellow","flash"),"ox"),"heather"),("rhinehardt","zostra"),("ox","jenny"),("vivian","Sarah"),("flash",("zoe","mathilda"),("heather","caitlyn"),("zostra",("Sarah",("caitlyn","jenny")
])
Solution
Example for the green cluster
from itertools import combinations
# Original cluster
cluster = {"caitlyn", "jenny", "zostra", "ox", "flash"}
# Search for external vertices that lie on a shortest path between two of
# the cluster's vertices. combinations() visits each unordered pair once and
# works on a snapshot of the members, so growing the set inside the loop
# does not raise "Set changed size during iteration".
for u, v in combinations(sorted(cluster), 2):
    cluster.update(nx.shortest_path(G, u, v))
# Create the induced subgraph and count its edges
subgraph = G.subgraph(cluster)
print(subgraph.number_of_edges())
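If you only need the edges whose two endpoints both belong to the same cluster, without pulling in any in-between vertices, counting the edges of each induced subgraph is enough. The following is a minimal sketch under assumptions: the helper name `intra_cluster_edge_counts`, the toy graph `H`, and the cluster names are hypothetical and are not the graph from the question.

```python
import networkx as nx


def intra_cluster_edge_counts(graph, clusters):
    """Map each cluster name to the number of edges with both endpoints inside it."""
    return {
        name: graph.subgraph(members).number_of_edges()
        for name, members in clusters.items()
    }


# Hypothetical toy graph, used only for illustration
H = nx.Graph()
H.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),  # triangle inside cluster "red"
    ("c", "d"),                          # edge between the two clusters
    ("d", "e"),                          # edge inside cluster "blue"
])
clusters = {"red": {"a", "b", "c"}, "blue": {"d", "e"}}

counts = intra_cluster_edge_counts(H, clusters)
print(counts)
```

Because `nx.Graph` stores each undirected edge once, every edge inside a cluster is counted exactly once, and the edge ("c", "d") is ignored since its endpoints are in different clusters.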
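You may also want to consider the quality measures that networkx provides: Measuring Partitions. They include coverage, modularity, and performance.
If you look at the code, you will also find the functions intra_community_edges and inter_community_edges.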
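As a rough sketch of how these measures can be computed, the example below assumes a networkx release that exposes `modularity` and `partition_quality` in `networkx.algorithms.community` (the latter returns coverage and performance as a pair). The toy graph `H` and the partition are hypothetical, and the intra- and inter-community edge counts are computed by hand here rather than by calling the library's internal helpers.

```python
import networkx as nx
from networkx.algorithms.community import modularity, partition_quality

# Hypothetical toy graph: two triangles joined by a single bridge edge
H = nx.Graph()
H.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
])
# The partition must cover every node exactly once
partition = [{"a", "b", "c"}, {"d", "e", "f"}]

q = modularity(H, partition)
coverage, performance = partition_quality(H, partition)

# Edges with both endpoints in the same block
intra = sum(H.subgraph(block).number_of_edges() for block in partition)
# For a full partition, every remaining edge crosses between blocks
inter = H.number_of_edges() - intra

print(q, coverage, performance, intra, inter)
```

Coverage is the fraction of edges that fall inside a block, so it is directly related to the per-cluster edge count asked about in the question.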