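Problem description
Is there a way to find the induced subgraph around a given center node in a GraphFrame graph in PySpark? I tried to build the induced subgraph from motifs, but without success.
I also tried NetworkX's ego graph, which works as expected, but for a large graph (12 million edges) loading the whole graph takes a long time.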
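To make the target concrete, here is a minimal plain-Python sketch of what an ego graph is: all nodes within `radius` hops of the center, plus every edge whose two endpoints are both in that node set. The function name `ego_edges` and the edge list are hypothetical and only for illustration, and the sketch ignores edge direction when counting hops (an assumption; NetworkX's `ego_graph` follows direction unless `undirected=True` is passed).

```python
from collections import deque


def ego_edges(edges, center, radius=1):
    """Edges of the subgraph induced by nodes within `radius` hops of `center`."""
    # Build an undirected adjacency map, used only for counting hops.
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set()).add(src)

    # Breadth-first search out to `radius` hops from the center.
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)

    # Keep every edge whose two endpoints were both reached.
    return [(s, d) for s, d in edges if s in dist and d in dist]


# Hypothetical edge list, for illustration only.
edges = [("a", "b"), ("a", "e"), ("b", "e"), ("e", "f"), ("f", "g")]
print(ego_edges(edges, "a", radius=1))  # [('a', 'b'), ('a', 'e'), ('b', 'e')]
```

Note that the edge `("b", "e")` is kept even though it does not touch the center. A single-edge motif starting at the center cannot return such edges, which is one reason a motif alone may not produce the induced subgraph.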
Here is an example with center node 'a':
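from graphframes import GraphFrame
from pyspark.sql import functions as f

# Vertex DataFrame (`sqlc` is assumed to be an existing SQLContext)
v = sqlc.createDataFrame([
    ("a", "Alice", 34), ("b", "Bob", 36), ("c", "Charlie", 30), ("d", "David", 29),
    ("e", "Esther", 32), ("f", "Fanny", None),  # Fanny's age is missing in the source
    ("g", "Gabby", 60),
], ["id", "name", "age"])
# Edge DataFrame
e = sqlc.createDataFrame([
    ("a", "b", "friend"),
    # the rows in between are garbled in the source (fragments: "c", "f", "d", "a")
    ("a", "e", "friend"),
], ["src", "dst", "relationship"])
# Create a GraphFrame
g = GraphFrame(v, e)

def create_motif(length: int) -> str:
    """Create a motif string.

    Args:
        length (int): number of edges in the path.
    """
    motif_path = "(start)-[edge0]->"
    for i in range(1, length):
        motif_path += "(n%s);(n%s)-[edge%s]->" % (i - 1, i - 1, i)
    motif_path += "(end)"
    return motif_path

def get_community(G, depth):
    motif_path = create_motif(depth)
    # The source has a dangling line continuation after find(); a chained call seems to be lost.
    current_motif = G.find(motif_path)
    current_motif.select(f.col("start.*"), "*").show()

get_community(g, 1)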
Returns:
+---+-----+---+--------------+--------------+---------------+
| id| name|age|         start|         edge0|            end|
+---+-----+---+--------------+--------------+---------------+
|  a|Alice| 34|[a, Alice, 34]|[a, e, friend]|[e, Esther, 32]|
|  a|Alice| 34|[a, Alice, 34]|[a, b, friend]|   [b, Bob, 36]|
+---+-----+---+--------------+--------------+---------------+
It should return:
+---+-----+---+--------------+--------------+---------------+
| id| name|age|         start|         edge0|            end|
+---+-----+---+--------------+--------------+---------------+
|  a|Alice| 34|[a, Alice, 34]|        [a, …]|        […, 36]|
|  a|Alice| 34|[a, Alice, 34]|[a, d, friend]| [d, David, 29]|
|  b|  Bob| 36|  [b, Bob, 36]|        [b, …]|        […, 29]|
+---+-----+---+--------------+--------------+---------------+
Solution
No working solution for this problem has been found yet.