Group by applied to multiple columns?

Problem description

I have the following df:

Name | Factory | Restaurant | Store | Building
Brian    True    False        True     False
Mike     True    True         True     True
Brian    True    False        False    True
Sam      False   False        False    False
Sam      True    False        True     True
Mike     True    False        False    False

我下面的代码为我提供了每个名称如Factory的列为True的次数，如何添加其余或更多列以使每列的所有值都为True，例如Restaurant and Store and Building还有更多的列？谢谢！

df.groupby(['Name'])['Factory'].apply(sum).reset_index()

Current output:

Name | Factory
Brian    2
Mike     2
Sam      1

Expected output:

Name | Factory | Restaurant | Store | Building
Brian    2          0           1        1
Mike     2          1           1        1
Sam      1          0           1        1

Thanks!

Solution

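A couple of suggestions: 1) avoid apply where you can; 2) the built-in sum is a non-vectorized Python operation, so prefer the Pandas sum method on the groupby object instead.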

Just try:

cols = ['Factory','Restaurant','Store','Building']

df.groupby('Name',as_index=False)[cols].sum()

Output:

    Name  Factory  Restaurant  Store  Building
0  Brian        2           0      1         1
1   Mike        2           1      1         1
2    Sam        1           0      1         1
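
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the variable name result is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "Name":       ["Brian", "Mike", "Brian", "Sam", "Sam", "Mike"],
    "Factory":    [True, True, True, False, True, True],
    "Restaurant": [False, True, False, False, False, False],
    "Store":      [True, True, False, False, True, False],
    "Building":   [False, True, True, False, True, False],
})

cols = ["Factory", "Restaurant", "Store", "Building"]

# Selecting a list of columns makes sum() run over each of them;
# as_index=False keeps Name as an ordinary column instead of the index.
result = df.groupby("Name", as_index=False)[cols].sum()
print(result)
```

To count more columns, it should be enough to append their names to cols.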

Once you write df.groupby(['Name'])['Factory'], only the Factory column is selected, so only that column is returned.
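
A small sketch of that difference, using a cut-down copy of the question's data (only two of the boolean columns, to keep it short). The names single and several are just for illustration.

```python
import pandas as pd

# Cut-down copy of the question's data
df = pd.DataFrame({
    "Name":       ["Brian", "Mike", "Brian", "Sam", "Sam", "Mike"],
    "Factory":    [True, True, True, False, True, True],
    "Restaurant": [False, True, False, False, False, False],
})

# A single column label selects one column, so the result is a Series
single = df.groupby("Name")["Factory"].sum()

# A list of labels selects several columns, so the result is a DataFrame
several = df.groupby("Name")[["Factory", "Restaurant"]].sum()

print(type(single).__name__)
print(type(several).__name__)
```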

Assuming Name is not your index, try:

df.groupby('Name').sum().astype('int')
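
A minimal sketch of what this variant returns, again on data rebuilt from the question. It assumes every column other than Name is boolean or numeric; with other column types the behavior of sum() may differ between Pandas versions. The names totals and flat are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "Name":       ["Brian", "Mike", "Brian", "Sam", "Sam", "Mike"],
    "Factory":    [True, True, True, False, True, True],
    "Restaurant": [False, True, False, False, False, False],
    "Store":      [True, True, False, False, True, False],
    "Building":   [False, True, True, False, True, False],
})

# Without a column selection, sum() covers every non-key column
# and Name becomes the index of the result.
totals = df.groupby("Name").sum().astype("int")

# reset_index() turns Name back into a regular column if that is preferred
flat = totals.reset_index()
print(flat)
```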

