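How to count the IDs that have multiple values in different rows of another column

Problem description

Is there a way to find the IDs that have both Apple and StrawBerry, and then get the total number of such IDs? And likewise for the IDs that have only Apple and the IDs that have only StrawBerry?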

df:

        ID           Fruit
0       ABC          Apple        <-ABC has Apple and StrawBerry
1       ABC          StrawBerry   <-ABC has Apple and StrawBerry
2       EFG          Apple        <-EFG has Apple only
3       XYZ          Apple        <-XYZ has Apple and StrawBerry
4       XYZ          StrawBerry   <-XYZ has Apple and StrawBerry 
5       CDF          StrawBerry   <-CDF has StrawBerry
6       AAA          Apple        <-AAA has Apple only

Desired output

Length of IDs that has Apple and StrawBerry: 2
Length of IDs that has Apple only: 2
Length of IDs that has StrawBerry: 1

Thanks!

Solution

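If the values in the Fruit column are always either Apple or StrawBerry, you can compare each ID's set of values against {'Apple', 'StrawBerry'} and then count the matching IDs by summing the resulting True values.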
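A minimal sketch of that set comparison, assuming the sample data from the question. The names `wanted`, `has_both` and `both_count` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "EFG", "XYZ", "XYZ", "CDF", "AAA"],
    "Fruit": ["Apple", "StrawBerry", "Apple", "Apple",
              "StrawBerry", "StrawBerry", "Apple"],
})

# The combination we are looking for (illustrative name).
wanted = {"Apple", "StrawBerry"}

# One boolean per ID: True when that ID's fruits are exactly the wanted set.
has_both = df.groupby("ID")["Fruit"].apply(lambda s: set(s) == wanted)

# Summing booleans counts the True values.
both_count = int(has_both.sum())
print("Length of IDs that has Apple and StrawBerry:", both_count)
```

Changing `wanted` to `{"Apple"}` or `{"StrawBerry"}` should give the two "only" counts the same way.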

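Edit: if there are many different values, the same idea extends to counting how many IDs share each distinct combination of values.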
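One possible way to do this for any number of values is sketched below. It is an assumption about how the idea could be implemented, not the original answer's code: each ID's fruits are collapsed into a sorted, comma-joined label (the name `combos` and the label format are illustrative), and the labels are then counted.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "EFG", "XYZ", "XYZ", "CDF", "AAA"],
    "Fruit": ["Apple", "StrawBerry", "Apple", "Apple",
              "StrawBerry", "StrawBerry", "Apple"],
})

# Collapse each ID's fruits into one label such as "Apple, StrawBerry".
# Sorting the unique values makes the label independent of row order.
combos = df.groupby("ID")["Fruit"].agg(lambda s: ", ".join(sorted(set(s))))

# Count how many IDs share each combination.
counts = combos.value_counts()
print(counts)
```

A plain string label is used here so that a single combination is easy to look up afterwards, for example `counts["Apple, StrawBerry"]`.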

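You can also use pivot_table together with value_counts for DataFrames (DataFrame.value_counts requires pandas 1.1.0 or later):

df.pivot_table(index='ID', columns='Fruit', aggfunc='size', fill_value=0)\
  .value_counts()

Each row of the result is one combination of per-fruit counts, together with the number of IDs that have that combination.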
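To get the three numbers from the question directly, the pivoted table can also be turned into boolean masks. This is a sketch under the assumption that the data matches the question's sample. The variable names are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "EFG", "XYZ", "XYZ", "CDF", "AAA"],
    "Fruit": ["Apple", "StrawBerry", "Apple", "Apple",
              "StrawBerry", "StrawBerry", "Apple"],
})

# Rows are IDs, columns are fruits, cells hold how often each pair occurs.
table = df.pivot_table(index="ID", columns="Fruit", aggfunc="size", fill_value=0)

# Boolean masks: does the ID have at least one row for that fruit?
has_apple = table["Apple"] > 0
has_strawberry = table["StrawBerry"] > 0

both = int((has_apple & has_strawberry).sum())
apple_only = int((has_apple & ~has_strawberry).sum())
strawberry_only = int((~has_apple & has_strawberry).sum())

print("Length of IDs that has Apple and StrawBerry:", both)
print("Length of IDs that has Apple only:", apple_only)
print("Length of IDs that has StrawBerry:", strawberry_only)
```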