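Problem description
I am trying to get the words from a hierarchical clustering dendrogram at a certain distance. So, at a given distance, I should have a list of words. [My dendrogram][1]

[1]: https://i.stack.imgur.com/hFLnQ.png

For example, at distance 0.4, all the words joined below that height should be accessible as one cluster.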
```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

# `model` (a trained word-vector model) and `labels2` (the list of words)
# are defined earlier and are not shown in this snippet.
matrix = model.wv.vectors
linked = linkage(matrix, 'ward')
p = len(labels2)

plt.figure(figsize=(12, 9))
plt.title('Hierarchical Clustering Dendrogram', fontsize=20)
plt.xlabel('words', fontsize=16)
plt.ylabel('distance', fontsize=16)

# call dendrogram to get the returned dictionary
# (plotting parameters can be ignored at this point)
R = dendrogram(
    linked,
    truncate_mode='lastp',  # show only the last p merged clusters
    p=p,
    no_plot=True,
)
print(labels2)

# This gives you a label AND the count
temp = {R["leaves"][ii]: (labels2[ii], R["ivl"][ii]) for ii in range(len(R["leaves"]))}

def llf(xx):
    return "{} - {}".format(*temp[xx])

print("values passed to leaf_label_func\nleaves : ", R["leaves"])
dendrogram(
    linked,
    p=p,  # ignored unless truncate_mode is set
    leaf_label_func=llf,
    leaf_rotation=60.,
    leaf_font_size=12.,
    show_contracted=True,  # to get a distribution impression in truncated branches
)
plt.show()
```
So I should produce lists that look like this:
"flood,bridg" as one list,
"leed,tadcast" as another,
and so on.
Any help would be appreciated.
Solution
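One possible approach, offered as a minimal sketch rather than a confirmed fix: SciPy's `fcluster` with `criterion="distance"` cuts a linkage matrix at a given height and returns one flat cluster id per row, and those ids can then be grouped back into word lists. The helper name `words_at_distance` and the toy vectors below are made up for illustration. The sketch assumes that `words[i]` is the word for row `i` of the vector matrix (with gensim 4.x, `model.wv.index_to_key` is in that order; whether `labels2` is ordered the same way is not shown in the question).

```python
from collections import defaultdict

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def words_at_distance(vectors, words, threshold, method="ward"):
    """Return lists of words that fall into the same cluster when the
    tree is cut at `threshold` (hypothetical helper, not a SciPy API)."""
    linked = linkage(vectors, method)
    # One flat cluster id per row; rows sharing an id are joined
    # at or below `threshold` in the dendrogram.
    cluster_ids = fcluster(linked, t=threshold, criterion="distance")
    groups = defaultdict(list)
    for word, cluster_id in zip(words, cluster_ids):
        groups[cluster_id].append(word)
    return list(groups.values())


# Toy data standing in for model.wv.vectors and labels2 (invented values).
toy_words = ["flood", "bridg", "leed", "tadcast"]
toy_vectors = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])

clusters = words_at_distance(toy_vectors, toy_words, threshold=0.4)
print(clusters)
```

With the real data this would be called as something like `words_at_distance(model.wv.vectors, labels2, 0.4)`, provided the row-order assumption above holds. The threshold is in the same units as the dendrogram's distance axis, so it should be read off the plot produced with the same linkage method.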