Given edges, how can I find routes that consist of two edges in a vectorised way?

Problem description

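I have an array of towns and their neighbours. I want to get the set of all pairs of towns that have at least one route consisting of exactly two different edges. Is there a vectorised way to do this? If not, why? For example, the edges [3,0], [0,4] and [5,0] share the incident node 0, so [3,4], [4,5] and [3,5] are pairs of towns that can be connected by the routes 3-0-4, 4-0-5 and 3-0-5. Each of these routes consists of two edges.

Sample input: np.array([[3,0],[0,4],[5,0],[2,1],[1,4],[2,3],[5,2]])

Expected output: array([[4,3],[4,5],[3,5],[4,2],[1,3],[1,5],[3,5],[0,2],[0,1],[0,2]]) (it does not matter if the order differs, if edges point in the opposite direction, or if duplicates occur)

What I have done so far is shown in the code that follows. The last two functions give the expected output, but they require a list comprehension. Can this kind of computation be vectorised?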

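import numpy as np
from itertools import chain,combinations

def get_incidences(roads):
    # Add the reversed edges so that every node appears in column 0
    roads = np.vstack([roads,roads[:,::-1]])
    # Group the edges by source node; a stable sort keeps the input order inside each group
    roads_sorted = roads[np.argsort(roads[:,0],kind='stable')]
    # Positions where a new source node starts
    marker_idx = np.flatnonzero(np.diff(roads_sorted[:,0]))+1
    source = roads_sorted[np.r_[marker_idx-1,-1],0]
    # One array of neighbours per source node
    target = np.split(roads_sorted[:,1],marker_idx)
    return source,target

def get_combinations_chain(target):
    # I know this could be improved with `np.fromiter`
    return np.array(list(chain(*[combinations(n,2) for n in target])))

def get_combinations_triu(target):
    def combs(t):
        # Index pairs (x,y) with x < y, i.e. every unordered pair of neighbours
        x,y = np.triu_indices(len(t),1)
        return np.transpose(np.array([t[x],t[y]]))
    return np.concatenate([combs(n) for n in target])

roads = np.array([[3,0],[0,4],[5,0],[2,1],[1,4],[2,3],[5,2]])

>>> get_incidences(roads)
(array([0,1,2,3,4,5]),[array([4,3,5]),array([4,2]),array([1,3,5]),array([0,2]),array([0,1]),array([0,2])])
>>> get_combinations_chain(get_incidences(roads)[1])
array([[4,3],[4,5],[3,5],[4,2],[1,3],[1,5],[3,5],[0,2],[0,1],[0,2]])
>>> get_combinations_triu(get_incidences(roads)[1])
array([[4,3],[4,5],[3,5],[4,2],[1,3],[1,5],[3,5],[0,2],[0,1],[0,2]])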
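
As a cross-check for the outputs above, here is a minimal plain-Python sketch of the problem definition itself. It is not vectorised and is not part of the original post. The function name `two_edge_pairs` is hypothetical, and the sketch assumes an undirected graph without self-loops. It returns each pair once, sorted, so duplicates and orientation are normalised.

```python
from collections import defaultdict
from itertools import combinations

def two_edge_pairs(edges):
    """Return the sorted set of pairs (a, b), a < b, that share at least one neighbour."""
    neighbours = defaultdict(set)
    for u, v in edges:
        # Undirected graph: register the edge in both directions
        neighbours[u].add(v)
        neighbours[v].add(u)
    pairs = set()
    for adjacent in neighbours.values():
        # Any two neighbours of the same middle node are joined by a two-edge route
        for a, b in combinations(sorted(adjacent), 2):
            pairs.add((a, b))
    return sorted(pairs)

edges = [[3, 0], [0, 4], [5, 0], [2, 1], [1, 4], [2, 3], [5, 2]]
print(two_edge_pairs(edges))
# [(0, 1), (0, 2), (1, 3), (1, 5), (2, 4), (3, 4), (3, 5), (4, 5)]
```

Comparing a vectorised result against this set (after sorting each pair and dropping duplicates) is one way to confirm that it is correct.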

Update: I ended up with one possible vectorised approach, but I needed to reorganise my input data (the output of get_incidences):

INPUT:
target: [array([4,3,5]),array([4,2]),array([1,3,5]),array([0,2]),array([0,1]),array([0,2])]
stream: [4 3 5 4 2 1 3 5 0 2 0 1 0 2]
lengths: [3 2 3 2 2 2]
OUTPUT:
array([[3,4],[4,5],[3,5],[2,4],[1,3],[1,5],[3,5],[0,2],[0,1],[0,2]])

It seems to be faster than straightforwardly combining all the combinations:

import numpy as np
from itertools import chain,combinations

def get_incidences(roads):
    roads = np.vstack([roads,roads[:,::-1]])
    roads_sorted = roads[np.argsort(roads[:,0])]
    marker_idx = np.flatnonzero(np.diff(roads_sorted[:,0]))+1
    # Number of neighbours of each node
    lengths = np.diff(marker_idx,prepend=0,append=len(roads_sorted))
    # All neighbours, flattened in node order
    stream = roads_sorted[:,1]
    target = np.split(stream,marker_idx)
    return target,stream,lengths

def get_combinations_vectorized(data):
    target,stream,lengths = data
    # Ragged list -> object array (a plain ragged list raises on NumPy >= 1.24)
    idx1 = np.concatenate(np.repeat(np.array(target,dtype=object),lengths))
    idx2 = np.repeat(stream,np.repeat(lengths,lengths))
    # Keep each unordered pair once and drop the (a,a) pairs
    return np.array([idx1,idx2]).T[idx1 < idx2]

def get_combinations_triu(data):
    target,stream,lengths = data
    def combs(t):
        x,y = np.triu_indices(len(t),1)
        return np.transpose(np.array([t[x],t[y]]))
    return np.concatenate([combs(n) for n in target])

def get_combinations_chain(data):
    target,stream,lengths = data
    return np.array(list(chain(*[combinations(n,2) for n in target])))

def get_combinations_scott(data):
    target,stream,lengths = data
    return np.array([x for i in target for x in combinations(i,2)])

def get_combinations_index(data):
    target,stream,lengths = data
    index = np.fromiter(chain.from_iterable(chain.from_iterable(combinations(n,2) for n in target)),dtype=int,count=np.sum(lengths*(lengths-1)))
    return index.reshape(-1,2)

# The edge list is reproduced as it appears in the source; some entries are incomplete
roads = np.array([[64,53],[94,90],[24,60],[45,44],[83,17],[10,88],[14,6],[56,93],[98,[86,77],[12,85],[58,[19,80],[48,26],[11,51],[16,83],96],[35,54],[47,23],[81,57],[52,34],[88,11],[18,[41,45],7],68],[46,38],[32,[44,41],[26,39],[20,58],[8,[74,71],[34,35],[91,72],[28,[53,73],[66,[84,97],29],[43,63],[96,74],89],22]])
data = get_incidences(roads)

%timeit get_combinations_vectorized(data)
%timeit get_combinations_chain(data)
%timeit get_combinations_triu(data)
%timeit get_combinations_scott(data)
%timeit get_combinations_index(data)

92 µs ± 18.3 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
123 µs ± 3.67 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
1.8 ms ± 9.44 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
126 µs ± 2.45 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
140 µs ± 4.48 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

However, the result depends heavily on the data; compare the timings for roads = np.array(list(combinations(range(100),2))).
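
The stream/lengths layout described in the update can also be expanded into pairs with index arithmetic alone, without the ragged `target` list. The following is only a sketch under assumptions, not the asker's code. The name `pairs_from_stream` is hypothetical, `lengths` is assumed to sum to `len(stream)`, and each group is assumed to contain distinct labels (no duplicate edges or self-loops).

```python
import numpy as np

def pairs_from_stream(stream, lengths):
    """All within-group pairs (a, b) with a < b, built without a Python loop."""
    stream = np.asarray(stream)
    lengths = np.asarray(lengths)
    # Group size seen by each stream element
    per_elem = np.repeat(lengths, lengths)
    # First stream index of the group each element belongs to
    group_start = np.repeat(np.cumsum(lengths) - lengths, lengths)
    # Each element is paired with every element of its own group
    left = np.repeat(np.arange(stream.size), per_elem)
    block_start = np.cumsum(per_elem) - per_elem
    offset = np.arange(left.size) - np.repeat(block_start, per_elem)
    right = np.repeat(group_start, per_elem) + offset
    a, b = stream[left], stream[right]
    keep = a < b  # one orientation per pair, no (a, a)
    return np.column_stack([a[keep], b[keep]])

stream = [4, 3, 5, 4, 2, 1, 3, 5, 0, 2, 0, 1, 0, 2]
lengths = [3, 2, 3, 2, 2, 2]
print(pairs_from_stream(stream, lengths))
```

The rows come out in a different order from the OUTPUT listing above, but they form the same multiset of pairs. Like the original approach, this builds all n*n candidates per node before filtering, so memory use grows quadratically with node degree.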

Solution

You can use the networkx library:

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from itertools import combinations

a = np.array([[3,0],[0,4],[5,0],[2,1],[1,4],[2,3],[5,2]])

G = nx.Graph()

G.add_edges_from(a)

# Creates this network
nx.draw_networkx(G)

# Create pairs of all nodes in network
c = combinations(G.nodes,2)

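# Find all routes of at most two edges between each pair in the network
# (collecting every path avoids an IndexError for pairs with no such route)
routes = [p for i,j in c for p in nx.all_simple_paths(G,i,j,cutoff=2)]

# Select only routes with three nodes/two edges, then keep the first and last node once per pair
paths_2_edges = list(dict.fromkeys((p[0],p[-1]) for p in routes if len(p) == 3))
print(paths_2_edges)

Output:

[(3,4),(3,5),(3,1),(0,2),(0,1),(4,5),(4,2),(5,1)]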
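
The same set of pairs can also be obtained with array operations only, using the fact that two towns have a two-edge route exactly when they share a neighbour. The following is a minimal sketch, not part of the original answer. The name `two_edge_pairs_dense` is hypothetical, and it assumes that nodes are labelled with small non-negative integers, since it builds a dense n-by-n adjacency matrix.

```python
import numpy as np

def two_edge_pairs_dense(roads):
    """Pairs (i, j), i < j, that have at least one common neighbour."""
    roads = np.asarray(roads)
    n = roads.max() + 1
    adj = np.zeros((n, n), dtype=np.int64)
    # Undirected graph: mark both directions of every edge
    adj[roads[:, 0], roads[:, 1]] = 1
    adj[roads[:, 1], roads[:, 0]] = 1
    np.fill_diagonal(adj, 0)  # ignore self-loops
    # common[i, j] counts the common neighbours of i and j
    common = adj @ adj
    # The strict upper triangle keeps each unordered pair once and drops i == j
    i, j = np.nonzero(np.triu(common, k=1))
    return np.column_stack([i, j])

a = np.array([[3, 0], [0, 4], [5, 0], [2, 1], [1, 4], [2, 3], [5, 2]])
print(two_edge_pairs_dense(a).tolist())
# [[0, 1], [0, 2], [1, 3], [1, 5], [2, 4], [3, 4], [3, 5], [4, 5]]
```

These are the same eight pairs as in the networkx output, with each pair written in ascending order. The dense matrix costs O(n²) memory, so for large sparse graphs a sparse matrix product would probably be the better fit.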

Per comment

Vectorise this statement: np.concatenate([combs(n) for n in target])

For t = get_incidences(roads)[1]

s2 = get_combinations_triu(t)

Output s2:

array([[4,3],[4,5],[3,5],[4,2],[1,3],[1,5],[3,5],[0,2],[0,1],[0,2]])

%timeit get_combinations_triu(t)

96.9 µs ± 3.44 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Then

s1 = np.array([x for i in t for x in combinations(i,2)])

Output s1:

array([[4,3],[4,5],[3,5],[4,2],[1,3],[1,5],[3,5],[0,2],[0,1],[0,2]])

And (s1 == s2).all()

True

Timeit:

%timeit np.array([x for i in t for x in list(combinations(i,2))])

14.7 µs ± 577 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
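
The `%timeit` magic only works inside IPython or Jupyter. To repeat the comparison from a plain script, a sketch along the following lines could be used. It relies on the standard `timeit` module; the function names `with_triu` and `with_comprehension` are hypothetical stand-ins for the two variants above, and `t` is written out by hand from the sample data. Absolute numbers will differ between machines.

```python
import timeit
from itertools import combinations

import numpy as np

# Neighbour groups for the sample graph, as returned by get_incidences(roads)[1]
t = [np.array(g) for g in ([4, 3, 5], [4, 2], [1, 3, 5], [0, 2], [0, 1], [0, 2])]

def with_triu(target):
    def combs(group):
        x, y = np.triu_indices(len(group), 1)
        return np.transpose(np.array([group[x], group[y]]))
    return np.concatenate([combs(n) for n in target])

def with_comprehension(target):
    return np.array([x for i in target for x in combinations(i, 2)])

# Both variants should agree before they are timed
assert (with_triu(t) == with_comprehension(t)).all()

runs = 1000
for fn in (with_triu, with_comprehension):
    seconds = timeit.timeit(lambda: fn(t), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} µs per call")
```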
