For each row, return the column name of the minimum value - pandas

Problem description

Suppose I have a DataFrame with the following values:

id    product1sold   product2sold   product3sold
1     2              3              3
2     0              0              5
3     3              2              1

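How can I add "most_sold" and "least_sold" columns that contain, for each id, a list of all the products with the highest sales and all the products with the lowest sales? The result should look like this.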

id    product1   product2   product3    most_sold                least_sold
1        2          3          3        [product2,product3]      [product1]
2        0          0          5        [product3]               [product1,product2]
3        3          2          1        [product1]               [product3]

Solution

Use a list comprehension over the product columns, comparing each value with the row maximum and the row minimum:

# select all columns except the first one (id)
df1 = df.iloc[:,1:]
cols = df1.columns.to_numpy()

df['most_sold'] = [cols[x].tolist() for x in df1.eq(df1.max(axis=1),axis=0).to_numpy()]
df['least_sold'] = [cols[x].tolist() for x in df1.eq(df1.min(axis=1),axis=0).to_numpy()]
print (df)
   id  product1sold  product2sold  product3sold                     most_sold  \
0   1             2             3             3  [product2sold,product3sold]   
1   2             0             0             5                [product3sold]   
2   3             3             2             1                [product1sold]   

                     least_sold  
0                [product1sold]  
1  [product1sold,product2sold]  
2                [product3sold]  
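
For reference, here is a minimal self-contained sketch of the same approach that also builds the sample data from the question. The names `is_max` and `is_min` are introduced here only for readability and are not part of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "id": [1, 2, 3],
    "product1sold": [2, 0, 3],
    "product2sold": [3, 0, 2],
    "product3sold": [3, 5, 1],
})

# Product columns only (everything except the first column, id)
df1 = df.iloc[:, 1:]
cols = df1.columns.to_numpy()

# Boolean masks: True where a value equals its row's max / min
is_max = df1.eq(df1.max(axis=1), axis=0).to_numpy()
is_min = df1.eq(df1.min(axis=1), axis=0).to_numpy()

# Pick the column names selected by each row's mask
df["most_sold"] = [cols[mask].tolist() for mask in is_max]
df["least_sold"] = [cols[mask].tolist() for mask in is_min]
```

Comparing with `eq(..., axis=0)` aligns the per-row maximum (or minimum) with the row index, so every tied column in a row is marked `True` and ends up in that row's list.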

If performance is not a concern, you can use DataFrame.apply instead:

df1 = df.iloc[:,1:]

f = lambda x: x.index[x].tolist()
df['most_sold'] = df1.eq(df1.max(axis=1),axis=0).apply(f,axis=1)
df['least_sold'] = df1.eq(df1.min(axis=1),axis=0).apply(f,axis=1)

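You can also do the following. Note that idxmin and idxmax return only the first matching column for each row, so tied products are not returned as a list.

minValueCol = yourDataFrame.idxmin(axis=1)
maxValueCol = yourDataFrame.idxmax(axis=1)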
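
The sketch below runs both calls on the question's sample data to show what they return when a row contains tied values. It assumes `id` is used as the index so that only the product columns are compared; the names `sales`, `min_value_col` and `max_value_col` are illustrative.

```python
import pandas as pd

# Sample data from the question, with id as the index (an assumption
# made here so that idxmin/idxmax only look at product columns)
sales = pd.DataFrame(
    {
        "product1sold": [2, 0, 3],
        "product2sold": [3, 0, 2],
        "product3sold": [3, 5, 1],
    },
    index=pd.Index([1, 2, 3], name="id"),
)

# One column label per row: the first column holding the min / max
min_value_col = sales.idxmin(axis=1)
max_value_col = sales.idxmax(axis=1)

print(min_value_col)
print(max_value_col)
```

For id 1 the maximum is shared by product2sold and product3sold, but only the first of them is reported, so this variant fits when a single label per row is enough.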
