Can I create a reference index column that resets to 0 each time a cumulative threshold is reached?

Problem description

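I am trying to add a running-sum column and a new index column, n_index. Using an existing answer, I can add the cumsum column, but I don't get the reference index column I want.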

import pandas as pd

# Amounts match the ten rows shown in the output below.
df = pd.DataFrame({'amount': [4, 3, 7, 8, 2, 1, 5, 3, 5, 8]})

ls = []
n_index = []
cumsum = 0
last_reset = 0
threshold = 16

for i,row in df.iterrows():
    if cumsum + row.amount <= threshold:        
        cumsum = cumsum + row.amount
        n_index.append(i)
    else:        
        last_reset = cumsum        
        cumsum = row.amount
        n_index.append(0)
        
    ls.append(cumsum)    

df['cumsum'] = ls
df['n_index'] = n_index

The result is:

df

   amount  cumsum  n_index
0       4       4        0
1       3       7        1
2       7      14        2
3       8       8        0
4       2      10        4
5       1      11        5
6       5      16        6
7       3       3        0
8       5       8        8
9       8      16        9

I would like n_index to restart from zero (0) each time the threshold would be exceeded, like this:

   amount  cumsum  n_index
0       4       4        0
1       3       7        1
2       7      14        2
3       8       8        0
4       2      10        1
5       1      11        2
6       5      16        3
7       3       3        0
8       5       8        1
9       8      16        2

Please help, thanks.

Solution

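Hopefully this gives you the expected result and removes the error. Instead of appending the DataFrame index i, keep a separate counter (assign_indx) that increases by one on each row and starts over whenever the cumulative sum would exceed the threshold.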

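import pandas as pd

# Amounts match the ten rows shown in the output below.
df = pd.DataFrame({'amount': [4, 3, 7, 8, 2, 1, 5, 3, 5, 8]})

ls = []          # running sum for each row
n_index = []     # position of each row within its current block
cumsum = 0
threshold = 16

assign_indx = 0  # next position to assign within the current block
for i, row in df.iterrows():
    if cumsum + row.amount <= threshold:
        # Still within the threshold: keep accumulating.
        cumsum = cumsum + row.amount
        n_index.append(assign_indx)
        assign_indx += 1
    else:
        # Adding this row would exceed the threshold: restart the sum
        # and the position counter at this row.
        cumsum = row.amount
        n_index.append(0)
        assign_indx = 1
    ls.append(cumsum)

df['cumsum'] = ls
df['n_index'] = n_index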

Output:

   amount  cumsum  n_index
0       4       4        0
1       3       7        1
2       7      14        2
3       8       8        0
4       2      10        1
5       1      11        2
6       5      16        3
7       3       3        0
8       5       8        1
9       8      16        2
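
As a possible variation (a sketch, not part of the answer above), the same reset logic could be wrapped in a small helper that walks over plain values instead of using df.iterrows(). The function name cumsum_with_reset and its parameters are hypothetical names chosen for illustration, and the amounts are taken from the output table shown above.

```python
import pandas as pd


def cumsum_with_reset(amounts, threshold):
    """Return (running_sums, positions) for a running sum that restarts
    whenever adding the next amount would exceed `threshold`.

    Hypothetical helper for illustration; not a pandas function.
    """
    running_sums = []
    positions = []
    running = 0
    position = 0
    for amount in amounts:
        if running + amount <= threshold:
            running += amount
        else:
            # Adding this amount would exceed the threshold: start a new block.
            running = amount
            position = 0
        running_sums.append(running)
        positions.append(position)
        position += 1
    return running_sums, positions


df = pd.DataFrame({'amount': [4, 3, 7, 8, 2, 1, 5, 3, 5, 8]})
df['cumsum'], df['n_index'] = cumsum_with_reset(df['amount'].tolist(), threshold=16)
print(df)
```

Keeping the loop in a function makes the threshold a parameter and keeps the counters out of the surrounding scope. Like the loop above, it restarts on the row that would push the sum past the threshold, so a single amount larger than the threshold still gets its own block starting at 0.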