我有一个熊猫系列的布尔值,我想标记连续的True值组.怎么可能这样做?是否有可能以矢量化方式执行此操作?任何帮助将非常感谢!
数据:
A
0 False
1 True
2 True
3 True
4 False
5 False
6 True
7 False
8 False
9 True
10 True
期望:
A Label
0 False 0
1 True 1
2 True 1
3 True 1
4 False 0
5 False 0
6 True 2
7 False 0
8 False 0
9 True 3
10 True 3
解决方法:
import scipy.ndimage.measurements as mnts
labeled, clusters = mnts.label(df.A.values)
# labeled is what you want, cluster is the number of clusters.
df.Labels = labeled # puts it into df
测试为:
a = array([False, False, True, True, True, False, True, False, False,
True, False, True, True, True, True, True, True, True,
False, True], dtype=bool)
labeled, clusters = mnts.label(a)
>>> labeled
array([0, 0, 1, 1, 1, 0, 2, 0, 0, 3, 0, 4, 4, 4, 4, 4, 4, 4, 0, 5], dtype=int32)
>>> clusters
5