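Problem description
Is there a way to find the IDs that have both Apple and StrawBerry and get their total count? And likewise the IDs that have only Apple, and the IDs that have only StrawBerry?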
df:
ID Fruit
0 ABC Apple <-ABC has Apple and StrawBerry
1 ABC StrawBerry <-ABC has Apple and StrawBerry
2 EFG Apple <-EFG has Apple only
3 XYZ Apple <-XYZ has Apple and StrawBerry
4 XYZ StrawBerry <-XYZ has Apple and StrawBerry
5 CDF StrawBerry <-CDF has StrawBerry
6 AAA Apple <-AAA has Apple only
Desired output:
Length of IDs that has Apple and StrawBerry: 2
Length of IDs that has Apple only: 2
Length of IDs that has StrawBerry: 1
Thanks!
Solution
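If every value in the Fruit column is always either Apple or StrawBerry, you can compare each ID's set of fruits with the target set, then count the matching IDs by taking the sum of the True values.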
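The original snippet for this step did not survive, so the following is only a minimal sketch of the set comparison described above. The helper name `count_ids_with_exactly` is hypothetical, and the DataFrame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "EFG", "XYZ", "XYZ", "CDF", "AAA"],
    "Fruit": ["Apple", "StrawBerry", "Apple", "Apple",
              "StrawBerry", "StrawBerry", "Apple"],
})

def count_ids_with_exactly(frame, fruits):
    """Hypothetical helper: count IDs whose set of Fruit values equals `fruits`."""
    target = set(fruits)
    # One boolean per ID: True when that ID's fruits match the target set exactly.
    matches = frame.groupby("ID")["Fruit"].apply(lambda s: set(s) == target)
    # Summing booleans counts the True values.
    return int(matches.sum())

both = count_ids_with_exactly(df, ["Apple", "StrawBerry"])
apple_only = count_ids_with_exactly(df, ["Apple"])
strawberry_only = count_ids_with_exactly(df, ["StrawBerry"])

print("Length of IDs that has Apple and StrawBerry:", both)
print("Length of IDs that has Apple only:", apple_only)
print("Length of IDs that has StrawBerry:", strawberry_only)
```

Because the comparison is on sets, repeated rows for the same ID and fruit do not change the result.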
Edit: if the column can contain many different values, the same idea needs a more general form.
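The code that followed this edit is missing from the post. One possible generalization, offered as an assumption and not as the original answer, is to turn each ID's fruits into a `frozenset` and count how often each combination occurs. The variable names below are illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "EFG", "XYZ", "XYZ", "CDF", "AAA"],
    "Fruit": ["Apple", "StrawBerry", "Apple", "Apple",
              "StrawBerry", "StrawBerry", "Apple"],
})

# One frozenset of fruits per ID; frozensets are hashable, so they can be counted.
combos = df.groupby("ID")["Fruit"].agg(frozenset)

# Number of IDs for each distinct combination of fruits.
combo_counts = combos.value_counts()

# Convert to a plain dict so a combination can be looked up directly.
by_combo = combo_counts.to_dict()
both = by_combo.get(frozenset({"Apple", "StrawBerry"}), 0)
apple_only = by_combo.get(frozenset({"Apple"}), 0)
strawberry_only = by_combo.get(frozenset({"StrawBerry"}), 0)

print(combo_counts)
```

This works for any number of distinct values in Fruit, since every combination that occurs gets its own row in `combo_counts`.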
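You can also use pivot_table together with value_counts on the resulting DataFrame (DataFrame.value_counts requires pandas 1.1.0 or later). The result is a Series that counts how many IDs share each combination of Apple and StrawBerry counts.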
Alternatively, you can use:
df.pivot_table(index='ID',columns='Fruit',aggfunc='size',fill_value=0)\
.value_counts()
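To get from that result to the three numbers asked for in the question, a sketch along these lines may help. It rebuilds the sample DataFrame and reads the counts off the pivoted table with boolean masks. The variable names are illustrative, and the masks assume the Fruit column contains both Apple and StrawBerry.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["ABC", "ABC", "EFG", "XYZ", "XYZ", "CDF", "AAA"],
    "Fruit": ["Apple", "StrawBerry", "Apple", "Apple",
              "StrawBerry", "StrawBerry", "Apple"],
})

# One row per ID, one column per fruit, each cell the number of matching rows.
table = df.pivot_table(index="ID", columns="Fruit", aggfunc="size", fill_value=0)

# Count of IDs per (Apple, StrawBerry) pattern; needs pandas >= 1.1.0.
pattern_counts = table.value_counts()

# Presence masks, so repeated rows for one ID do not affect the totals.
has_apple = table["Apple"] > 0
has_strawberry = table["StrawBerry"] > 0

both = int((has_apple & has_strawberry).sum())
apple_only = int((has_apple & ~has_strawberry).sum())
strawberry_only = int((~has_apple & has_strawberry).sum())

print(pattern_counts)
print("Length of IDs that has Apple and StrawBerry:", both)
print("Length of IDs that has Apple only:", apple_only)
print("Length of IDs that has StrawBerry:", strawberry_only)
```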