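Problem description
I am trying to plot a dendrogram to cluster my data, but this error is stopping me. My data is here: "https://assets.datacamp.com/production/repositories/655/datasets/2a1f3ab7bcc76eef1b8e1eb29afbd54c4ebf86f2/eurovision-2016.csv"
I first selected the columns to use: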
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

target_col = df_euro["To country"]
feat = df_euro[["Jury A", "Jury B", "Jury C", "Jury D", "Jury E"]]
# Convert them into ndarrays
x = feat.to_numpy(dtype='float32')
y = target_col.to_numpy()
# Calculate the linkage: mergings
mergings = linkage(x, method='complete')
# Plot the dendrogram
dendrogram(
    mergings, labels=y, leaf_rotation=90, leaf_font_size=6
)
plt.show()
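However, I get an error that I cannot make sense of. I searched online and checked that the shapes are consistent: (1066, 5) for the features and (1066,) for the labels. Neither feat nor target_col contains any NA values.
I know the problem is with the labels, but I cannot find a fix. Any help would be much appreciated :)
Edit: here is the full traceback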
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-113-7fffdc847e5e> in <module>
4 mergings = linkage(feat,method = 'complete')
5 # Plot the dendrogram
----> 6 dendrogram(
7 mergings,
8 labels = target_col,

C:\ProgramData\Anaconda3\lib\site-packages\scipy\cluster\hierarchy.py in dendrogram(Z, p, truncate_mode, color_threshold, get_leaves, orientation, labels, count_sort, distance_sort, show_leaf_counts, no_plot, no_labels, leaf_font_size, leaf_rotation, leaf_label_func, show_contracted, link_color_func, ax, above_threshold_color)
3275 "'bottom', or 'right'")
3276
-> 3277 if labels and Z.shape[0] + 1 != len(labels):
3278 raise ValueError("Dimensions of Z and labels must be consistent.")
3279
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1476
1477 def __nonzero__(self):
-> 1478 raise ValueError(
1479 f"The truth value of a {type(self).__name__} is ambiguous. "
1480 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Solution
In case anyone else is searching for the same problem: converting the labels to a list makes it work.
samples = df_euro.iloc[:, 2:7].values[:42]
country_names = list(df_euro.iloc[:, 1].values[:42])
mergings = linkage(samples, method='single')
# Plot the dendrogram
fig, ax = plt.subplots(figsize=(15, 10))
fig = dendrogram(mergings, labels=country_names)
plt.show()
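As a quick sanity check of the list conversion, here is a minimal sketch on synthetic data. The random samples and the single-letter names are placeholders, not the Eurovision data, and no_plot=True is used only so that nothing has to be drawn.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Synthetic stand-in for the five jury columns: 6 samples, 5 features.
rng = np.random.default_rng(0)
samples = rng.random((6, 5))

# Placeholder labels held in a NumPy array, as y was in the question.
names = np.array(["A", "B", "C", "D", "E", "F"])

mergings = linkage(samples, method="complete")

# linkage returns n - 1 merge rows for n samples, so labels needs n entries.
assert mergings.shape[0] + 1 == len(names)

# Passing a plain list avoids the ambiguous truth-value check on the labels.
# no_plot=True computes the leaf order without drawing the figure.
result = dendrogram(mergings, labels=list(names), no_plot=True)
leaf_labels = result["ivl"]  # leaf labels in dendrogram order
print(leaf_labels)
```

Every label should appear exactly once in leaf_labels; only the order depends on the data.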
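The problem is that dendrogram evaluates its labels keyword argument in a boolean context (the "if labels and ..." line in the traceback), so labels must be an object whose __bool__ (or __len__) gives a well-defined truth value, as a list does: a list is truthy whenever it contains items. A pandas Series raises a ValueError there instead, and so does a NumPy array with more than one element. So the only change you need is to convert to a list when passing the argument: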
dendrogram(
    mergings, labels=list(y), leaf_rotation=90, leaf_font_size=6
)
All other lines can stay the same.
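To make the mechanism concrete, here is a small pure-Python sketch. SeriesLike and labels_consistent are hypothetical names invented for illustration: the first only imitates how a pandas Series refuses to be used as a boolean, and the second copies the shape of the check shown in the traceback. Neither is actual pandas or SciPy code.

```python
class SeriesLike:
    """Hypothetical stand-in for pandas.Series: sized, but with no truth value."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __bool__(self):
        # Like a Series, refuse to be collapsed into a single True/False.
        raise ValueError("The truth value of a SeriesLike is ambiguous.")


def labels_consistent(n_merges, labels):
    """Imitate the check from the traceback (illustrative, not SciPy's code)."""
    if labels and n_merges + 1 != len(labels):
        raise ValueError("Dimensions of Z and labels must be consistent.")
    return True


# Illustrative label values only.
countries = SeriesLike(["Albania", "Armenia", "Australia"])

# Passing the Series-like object fails inside `if labels and ...`.
try:
    labels_consistent(2, countries)
    series_ok = True
except ValueError as err:
    series_ok = False
    series_error = str(err)

# Converting to a list first gives an object with ordinary truthiness.
list_ok = labels_consistent(2, list(countries))

print(series_ok, list_ok)
```

The lengths match in both calls, so the only difference is whether the labels object can be evaluated as a boolean.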