Problem description
|innings | bowler |
|--------|---------------|
|1 | P Kumar |
|1 | P Kumar |
|1 | P Kumar |
|1 | P Kumar |
|1 | Z Khan |
|1 | Z Khan |
|1 | Z Khan |
|2 | AB dinda |
|2 | AB dinda |
|2 | I Sharma |
Expected output
|innings | bowler |
|--------|----------------------|
|1 | P Kumar,Z Khan |
|2 | AB dinda,I Sharma |
The code I tried:
df.groupby(['innings']).bowler.sum().drop_duplicates(subset="bowler",keep='first',inplace=True)
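But for some reason it gives me this error: TypeError: drop_duplicates() got an unexpected keyword argument 'subset'
Then I tried it without subset: drop_duplicates("bowler",inplace=True) and now I get this error: TypeError: drop_duplicates() got multiple values for argument 'keep'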
Solution
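Before the fix, it may help to see why both errors appear. The sketch below uses a small made-up frame (not the full data from the question) and suggests that `groupby(...).bowler.sum()` returns a Series, and that `Series.drop_duplicates` has no `subset` parameter while `DataFrame.drop_duplicates` does. In older pandas releases the first positional parameter of `Series.drop_duplicates` was `keep`, which would likely explain the second error if `"bowler"` was passed positionally together with `keep='first'`; that part is an assumption about the pandas version in use.

```python
import inspect

import pandas as pd

# Small illustrative frame (hypothetical subset of the question's data).
df = pd.DataFrame({
    "innings": [1, 1, 2],
    "bowler": ["P Kumar", "Z Khan", "I Sharma"],
})

# Summing strings per group concatenates them and yields a Series,
# not a DataFrame.
summed = df.groupby("innings").bowler.sum()
is_series = isinstance(summed, pd.Series)

# Compare the parameters accepted by the two drop_duplicates methods.
series_params = inspect.signature(pd.Series.drop_duplicates).parameters
frame_params = inspect.signature(pd.DataFrame.drop_duplicates).parameters
series_has_subset = "subset" in series_params
frame_has_subset = "subset" in frame_params

print(is_series, series_has_subset, frame_has_subset)
```

Because the duplicates have to be judged on both columns, removing them on the DataFrame before grouping avoids the problem entirely.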
First use DataFrame.drop_duplicates on both columns, then aggregate with join:
df = (df.drop_duplicates(subset=['bowler', 'innings'])
        .groupby('innings')
        .bowler.agg(','.join)
        .reset_index())
print(df)
   innings             bowler
0        1     P Kumar,Z Khan
1        2  AB dinda,I Sharma
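For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the variable names `result` and `alt` are only illustrative. It also shows one possible alternative that skips `drop_duplicates` and joins each group's unique values instead; both approaches should keep the bowlers in order of first appearance.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "innings": [1, 1, 1, 1, 1, 1, 1, 2, 2, 2],
    "bowler": ["P Kumar"] * 4 + ["Z Khan"] * 3 + ["AB dinda"] * 2 + ["I Sharma"],
})

# Approach from the answer: drop duplicate (innings, bowler) pairs,
# then join the remaining bowler names per innings.
result = (
    df.drop_duplicates(subset=["innings", "bowler"])
      .groupby("innings")["bowler"]
      .agg(",".join)
      .reset_index()
)

# Alternative sketch: join the unique bowlers of each group directly.
alt = (
    df.groupby("innings")["bowler"]
      .agg(lambda s: ",".join(s.unique()))
      .reset_index()
)

print(result)
print(result.equals(alt))
```

The first form is probably the easier one to read; the second can be handy when the original frame should stay untouched and only one column needs deduplicating.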