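Problem description
How can I extract the path of dependency relations from ROOT to a given token in spaCy? I have code that prints the dependency information for every token: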
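import spacy

sentence = "I saw the man with a telescop"
# The 'en' shortcut is no longer available in recent spaCy releases; load the model by its full name
nlp = spacy.load('en_core_web_sm')
doc = nlp(sentence)
for sent in doc.sents:
    for token in sent:
        # index, text, head, dependency label
        print("{}\t{}\t{}\t{}".format(token.i, token.text, token.head, token.dep_))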
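Solution
A dependency tree is basically a graph, so if you want to find the (shortest) path to ROOT, you can use a graph library such as networkx. Suppose you want to extract the path from the token "telescop" to the root. Then you could try something like this: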
import spacy
import networkx as nx

sentence = "I saw the man with a telescop"
nlp = spacy.load('en_core_web_sm')
doc = nlp(sentence)

# Collect one (head, child) edge per dependency arc
edges = []
for sent in doc.sents:
    for token in sent:
        print("{}\t{}\t{}\t{}".format(token.i, token.text, token.head, token.dep_))
        if token.dep_ == "ROOT":
            # Nodes are lowercased token texts, so use the same form for the target
            target = token.lower_
        for child in token.children:
            edges.append((token.lower_, child.lower_))

# Treat the tree as an undirected graph and search from the token to ROOT
graph = nx.Graph(edges)
print(nx.shortest_path(graph, source="telescop", target=target))
Result:
0 I saw nsubj
1 saw saw ROOT
2 the man det
3 man saw dobj
4 with saw prep
5 a telescop det
6 telescop with pobj
['telescop', 'with', 'saw']
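Since every token in a dependency tree has exactly one head, the same path can probably also be found without a graph library by following `head` links upward until the root is reached (in spaCy the root token is its own head). Below is a minimal sketch of that idea. To keep it runnable without a model, it uses a hypothetical stand-in class `Tok` that only mimics the `i`, `text` and `head` attributes of a spaCy token, and it rebuilds the tree from the output printed above; `path_to_root` is an illustrative helper name, not part of spaCy.

```python
class Tok:
    """Hypothetical stand-in with the three spaCy Token attributes used below."""

    def __init__(self, i, text):
        self.i = i
        self.text = text
        self.head = self  # spaCy convention: the root token is its own head


def path_to_root(token):
    """Return the tokens from `token` up to the root by following .head links."""
    path = [token]
    while token.head.i != token.i:  # stop once a token is its own head (the root)
        token = token.head
        path.append(token)
    return path


# Tree rebuilt from the printed result: (index, text, head index)
rows = [(0, "I", 1), (1, "saw", 1), (2, "the", 3), (3, "man", 1),
        (4, "with", 1), (5, "a", 6), (6, "telescop", 4)]
toks = [Tok(i, text) for i, text, _ in rows]
for i, _, head in rows:
    toks[i].head = toks[head]

print([t.text for t in path_to_root(toks[6])])  # ['telescop', 'with', 'saw']
```

With a real spaCy `doc`, the same function should work when called as `path_to_root(doc[6])`, since it relies only on `token.i` and `token.head`. Working with token objects rather than lowercased strings also avoids merging two different tokens that happen to share the same text into one graph node.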