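Conditionally copy one element of a column to other rows in Python

Problem description

In the DataFrame below, I want to create a new column "reference" that holds the code_num of the store's primary_fruit row. If a row is not associated with primary_fruit, the column should be left blank.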

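import pandas as pd

# All three tuples need nine entries to match the table printed below
dct = {
    'Store': ('A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'),
    'code_num': (101, 102, 103, 104, 105, 106, 201, 202, 203),
    'fruits': ('apple', 'cherry', 'cherry,apple', 'banana', 'cherry',
               'rambo', 'apple,cherry', 'banana', 'toy'),
}

df = pd.DataFrame(dct)

fruit_list = ["apple", "banana", "cherry"]
primary_fruit = 'banana'

print(df)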

Store code_num     fruits
A     101          apple  
A     102          cherry 
A     103          cherry,apple 
A     104          banana 
A     105          cherry 
A     106          rambo  
B     201          apple,cherry
B     202          banana
B     203          toy

Expected DataFrame:

Store code_num    fruits       reference
A     101          apple         104
A     102          cherry        104
A     103          cherry,apple  104
A     104          banana        104
A     105          cherry        104
A     106          rambo       
B     201          apple,cherry  202
B     202          banana        202
B     203          toy

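In my current problem, I do not want values for 106 and 203, because their fruits are not in "fruit_list".

I tried the code below, but it only fills in the reference number on the primary_fruit rows themselves (104 and 202) and leaves every other row blank.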


unique_store_id = df.Store.unique()

for store_id in unique_store_id:
    s = (df.Store == store_id) & df['fruits'].isin(fruit_list)
    primary_code = df[df['fruits'] == primary_fruit]['code_num']
    df.loc[s, 'reference'] = primary_code
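
A likely explanation for that result, sketched on a three-row toy frame rather than the data above: when a Series is assigned through `.loc`, pandas aligns it on index labels, so only the row whose label matches the primary-fruit row gets a value. Pulling out a scalar first lets the value broadcast to every selected row. The column names `aligned` and `scalar` are made up for this illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    'code_num': [101, 102, 104],
    'fruits': ['apple', 'cherry', 'banana'],
})
mask = toy['fruits'].isin(['apple', 'cherry', 'banana'])  # all rows selected

# One-element Series that still carries the row label 2
primary_code = toy.loc[toy['fruits'] == 'banana', 'code_num']

# Series assignment aligns on labels: rows 0 and 1 stay NaN
toy.loc[mask, 'aligned'] = primary_code

# A scalar has no index, so it is broadcast to every selected row
toy.loc[mask, 'scalar'] = primary_code.iloc[0]

print(toy)
```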

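Thanks for your help :)

Update: @Scott Boston's suggestion works well on the full dataset. On a sliced subset, however, it raises [KeyError: 'None']. I need to apply this logic to a sliced DataFrame for each store, and "fruit_list" and "primary_fruit" change from store to store. (I apologize for not stating this in the original question.) The idea: the reference column should give the code number of each store's primary fruit.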

Solution

Try this:

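import pandas as pd

dct = {
    'Store': ('A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'),
    'code_num': (101, 102, 103, 104, 105, 106, 201, 202, 203),
    'fruits': ('apple', 'cherry', 'cherry,apple', 'banana', 'cherry',
               'rambo', 'apple,cherry', 'banana', 'toy'),
}

df = pd.DataFrame(dct)

fruit_list = ["apple", "banana", "cherry"]
primary_fruit = 'banana'

# True for each row whose comma-separated fruits include an entry of fruit_list.
# groupby(level=...).max() replaces Series.max(level=...), removed in pandas 2.0;
# sort=False keeps the rows in their original order.
m = (df.set_index(['Store', 'code_num'])['fruits']
       .str.split(',').explode().isin(fruit_list)
       .groupby(level=[0, 1], sort=False).max().to_numpy())

# code_num on the primary_fruit rows, NaN everywhere else
df['primary_code'] = df.loc[df['fruits'] == primary_fruit, 'code_num']

# Changed this line
# Note: raises KeyError if a store has no primary_fruit row (first_valid_index() is None)
df['reference'] = df.groupby('Store')['primary_code'].transform(lambda x: x.loc[x.first_valid_index()]).where(m, '')

df_out = df.drop('primary_code', axis=1)
print(df_out)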
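
For the sliced case described in the update, one possible variant is sketched below. It is not part of the original answer: the helper name `add_reference` is hypothetical, and it swaps the `first_valid_index()` lookup for `transform('first')`, which skips missing values and so returns a missing value (instead of raising KeyError) for a store that has no primary-fruit row. Missing references come out as `<NA>` in a nullable integer column rather than as empty strings.

```python
import pandas as pd

def add_reference(df, fruit_list, primary_fruit):
    """Hypothetical helper: return a copy of df with a 'reference' column."""
    out = df.copy()
    # True where at least one comma-separated fruit is in fruit_list
    in_list = out['fruits'].str.split(',').apply(
        lambda parts: any(p.strip() in fruit_list for p in parts))
    # code_num on the primary-fruit rows, NaN elsewhere
    primary_code = out['code_num'].where(out['fruits'] == primary_fruit)
    # 'first' ignores NaN, so a store without the primary fruit yields NaN
    per_store = primary_code.groupby(out['Store']).transform('first')
    # Blank out rows outside fruit_list and keep whole-number codes
    out['reference'] = per_store.where(in_list).astype('Int64')
    return out

df = pd.DataFrame({
    'Store': ['A'] * 6 + ['B'] * 3,
    'code_num': [101, 102, 103, 104, 105, 106, 201, 202, 203],
    'fruits': ['apple', 'cherry', 'cherry,apple', 'banana', 'cherry',
               'rambo', 'apple,cherry', 'banana', 'toy'],
})

full = add_reference(df, ["apple", "banana", "cherry"], 'banana')
print(full)

# A slice whose primary fruit is absent ('kiwi' is a made-up value here)
store_b = add_reference(df[df['Store'] == 'B'], ["apple", "cherry"], 'kiwi')
print(store_b)
```

Because the function takes `fruit_list` and `primary_fruit` as arguments, it can be called once per store slice with different settings and the results combined with `pd.concat`.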

pandas pandas-groupby python python-3.x series