将有向图划分为 n 个块 编辑

问题描述

我有一个表示简单网络的有向图,示例如下:

DiGraph

我想将此网络划分为至少 5 个节点的 n 个块。每个块或分区都需要密集连接,不能遗漏任何节点。网络可能会采用不同的配置,因此需要通用算法。

我已经实现了一个简单的chunker算法,当网络在节点索引列表中没有度数高于2的节点时该算法有效:

import itertools

def chunks(items,cutpoints):
    return [items[i:j] for i,j in zip([0] + cutpoints,cutpoints + [len(items)])]

# this function will partition the line into segments,based on the number of breakpoints n-1
def generate_chunks(items,n,min_size):
    indices = range(1,len(items))
    for cutpoints in itertools.combinations(indices,n-1):
        chunk = chunks(items,list(cutpoints))
        if len(min(chunk,key=len)) < min_size:
            continue
        yield chunk

由于 chunker 函数在列表上的工作方式,当应用于全图时,不会生成块:

candidates = []
for candidate in generate_chunks(list(Graph.nodes()),3,5):
    suitable_candidate = True
    for segment in candidate:
        subgraph = Graph.subgraph(segment)
        if not nx.is_strongly_connected(subgraph):
            suitable_candidate = False
            break
        # we want subgraphs where no nodes are connecting to more than 2 nodes
        if max(subgraph.degree(),key = lambda x: x[1])[1] > 4:
            suitable_candidate = False
            break
    if suitable_candidate:
        candidates.append(candidate)

我无法修改它,以便它考虑到更复杂的图拓扑。我在 networkx 中找不到将图形划分为 n 个随机连接部分的函数。我也考虑过随机选取 n-1 个节点并计算它们之间的最短路径,但似乎是一种浪费的操作。

所表示的图是更复杂网络的简化版本。

感谢有人能指出我正确的方向。

解决方法

AFAIK,没有什么现成的东西完全符合要求。 有 Kernighan-Lin 算法将图分割成两个,所以如果期望的分割数是 2 的幂,这是一个合理的选择。但是,不能保证 w.r.t.最小块大小(除了 > 1)。

partition_1,partition_2 = nx.algorithms.community.kernighan_lin_bisection(g)
subgraph_1,subgraph_2 = g.subgraph(partition_1),g.subgraph(partition_2)
# and recurse on each subgraph until the desired number of partitions is achieved

由于您的示例看起来非常接近一棵树,因此在将其转换为树时可能会保留其大部分结构。然后你可以使用 Lukes 的算法,但是,它是由最大块大小而不是块总数来参数化的。

max_size = g.size() / n # will probably result in more than n chunks 
mst = nx.minimum_spanning_tree(g)
partitions = nx.algorithms.community.lukes_partitioning(mst,max_size)

当转换为树时,在这种情况下使用最小生成树可能并不理想。你真正想要的是一个最佳平衡的树,这样最大的分支是最小的。我认为 networkx 没有实现任何可用的启发式方法,但我不确定。话虽如此,MST 通常工作得很好。

最后是networkx-metis。但是,由于我从未使用过它,因此我无法评论它在您的情况下的用处。

编辑

这是一个分块算法的粗略实现,它的灵感来自于 Lukes 的算法,但通过最小大小进行参数化,因为这在您的情况下似乎是更相关的参数。

它适用于树状图或树状图,但对于密集连接的图会产生非常糟糕的结果。密集图的问题在于最小生成树算法通常会导致具有许多叶子的高度节点。据推测,使用平衡生成树而不是最小生成树会改善结果。

enter image description here

#!/usr/bin/env python
"""
Partition a graph into connected subgraphs of minimum size.
"""
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx


def nx_chunk(graph,chunk_size):
    """
    Chunk a graph into subgraphs with the specified minimum chunk size.

    Inspired by Lukes algorithm.
    """

    # convert to a tree;
    # a balanced spanning tree would be preferable to a minimum spanning tree,# but networkx does not seem to have an implementation of that
    tree = nx.minimum_spanning_tree(graph)

    # select a root that is maximally far away from all leaves
    leaves = [node for node,degree in tree.degree() if degree == 1]
    minimum_distance_to_leaf = {node : tree.size() for node in tree.nodes()}
    for leaf in leaves:
        distances = nx.single_source_shortest_path_length(tree,leaf)
        for node,distance in distances.items():
            if distance < minimum_distance_to_leaf[node]:
                minimum_distance_to_leaf[node] = distance
    root = max(minimum_distance_to_leaf,key=minimum_distance_to_leaf.get)

    # make the tree directed and compute the total descendants of each node
    tree = nx.dfs_tree(tree,root)
    total_descendants = get_total_descendants(tree)

    # prune chunks,starting from the leaves
    chunks = []
    max_descendants = np.max(list(total_descendants.values()))
    while (max_descendants + 1 > chunk_size) & (tree.size() >= 2 * chunk_size):
        for node in list(nx.topological_sort(tree))[::-1]: # i.e. from leaf to root
            if (total_descendants[node] + 1) >= chunk_size:
                chunk = list(nx.descendants(tree,node))
                chunk.append(node)
                chunks.append(chunk)

                # update relevant data structures
                tree.remove_nodes_from(chunk)
                total_descendants = get_total_descendants(tree)
                max_descendants = np.max(list(total_descendants.values()))

                break

    # handle remainder
    chunks.append(list(tree.nodes()))

    return chunks


def get_total_descendants(dag):
    return {node : len(nx.descendants(dag,node)) for node in dag.nodes()}


if __name__ == '__main__':

    fig,axes = plt.subplots(1,3,figsize=(12,4))

    examples = nx.balanced_tree(2,6),nx.random_powerlaw_tree(64),nx.karate_club_graph()

    for g,ax in zip(examples,axes):
        chunks = nx_chunk(g,5)
        node_to_color = dict()
        for ii,chunk in enumerate(chunks):
            for node in chunk:
                node_to_color[node] = ii
        nx.draw(g,node_color=[node_to_color[node] for node in g.nodes()],cmap='tab20',ax=ax)

    plt.show()

    for ii,chunk in enumerate(chunks):
        for node in chunk:
            node_to_color[node] = ii

    nx.draw(g,cmap='tab20')
    plt.show()