Getting the words in a cluster using a distance in the dendrogram

Problem Description

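I am trying to get the words from a hierarchical clustering dendrogram at a given distance, so that for any given distance I end up with a list of words. [My dendrogram][1]

[1]: https://i.stack.imgur.com/hFLnQ.png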

For example, at a distance of 0.4, all of the following words should be accessible as one cluster:

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

# labels2 (the list of words) and model are defined earlier (not shown)
# print(labels2)
p = len(labels2)
matrix = model.wv.vectors  # one vector per word
# print(matrix)

# Build the hierarchy with Ward linkage
linked = linkage(matrix, 'ward')

plt.figure(figsize=(12, 9))
plt.title('Hierarchical Clustering Dendrogram', fontsize=20)
plt.xlabel('words', fontsize=16)
plt.ylabel('distance', fontsize=16)

# Call dendrogram to get the returned dictionary
# (plotting parameters can be ignored at this point)
R = dendrogram(
    linked,
    truncate_mode='lastp',  # show only the last p merged clusters
    p=p,
    no_plot=True,
)

print(labels2)

# This gives you a label AND the count.
# Note: labels2[ii] is indexed by plot position; use labels2[R["leaves"][ii]]
# instead if labels2 is ordered like the rows of matrix.
temp = {R["leaves"][ii]: (labels2[ii], R["ivl"][ii]) for ii in range(len(R["leaves"]))}

def llf(xx):
    return "{} - {}".format(*temp[xx])

print("values passed to leaf_label_func\nleaves : ", R["leaves"])

dendrogram(
    linked,
    p=p,  # only has an effect when truncate_mode is set
    leaf_label_func=llf,
    leaf_rotation=60.,
    leaf_font_size=12.,
    show_contracted=True,  # to get a distribution impression in truncated branches
)
plt.show()
```

So I should end up with lists that look like this:
"flood,bridg" as one list,
"leed,tadcast" as another,
and so on.
Any help would be appreciated.
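
One possible way to get such lists, offered as a minimal sketch rather than a confirmed answer, is to cut the linkage matrix at the chosen height with `scipy.cluster.hierarchy.fcluster` and `criterion='distance'`, then group the labels by the returned cluster id. The helper name `words_at_distance` and the demo vectors are hypothetical stand-ins, and the sketch assumes that the i-th label belongs to the i-th row of the matrix passed to `linkage`.

```python
from collections import defaultdict

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def words_at_distance(linked, labels, threshold):
    """Group labels by the flat cluster they fall into when the tree is cut at `threshold`.

    Assumes labels[i] corresponds to row i of the data given to linkage().
    """
    # criterion='distance': observations in one flat cluster have a
    # cophenetic distance of at most `threshold`
    cluster_ids = fcluster(linked, t=threshold, criterion='distance')
    groups = defaultdict(list)
    for label, cid in zip(labels, cluster_ids):
        groups[cid].append(label)
    return [groups[cid] for cid in sorted(groups)]


# Hypothetical stand-in data: four words with made-up 2-D vectors
demo_labels = ["flood", "bridg", "leed", "tadcast"]
demo_matrix = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
demo_linked = linkage(demo_matrix, 'ward')

clusters = words_at_distance(demo_linked, demo_labels, threshold=0.4)
print(clusters)
```

With these stand-in vectors, cutting at 0.4 puts "flood" and "bridg" in one group and "leed" and "tadcast" in another. For the code in the question, the equivalent call would be `words_at_distance(linked, labels2, 0.4)`, provided `labels2` is ordered like the rows of `matrix`.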
