Why do I get an error when applying a function to a DataFrame?

Problem description

def function(s):
    if (s['col1'] == 'something1')|(s['col1'] == 'smth2')|(s['col1'] == 'smth3'):
        return 'A'
    elif (s['col1'] == 'smth4')|(s['col1'] == 'smth5'):
        return 'B'
    elif (s['col1'] == 'smth6')|(s['col1'] == 'smth7'):
        return 'C'
    else:
        return 'D'

The function above works on its own. But when I apply it to the DataFrame:

df['new_col'] = df.apply(function,axis = 1)

I get:

TypeError: ("'bool' object is not callable",'occurred at index 0') 
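The message itself means that a Python bool was called as if it were a function somewhere while the row was being processed. As a minimal sketch of one way that can happen, assume a typo that is not present in the code posted above: a missing `|` between two parenthesized comparisons. The `broken` and `error_message` helpers are hypothetical, and a plain dict stands in for a DataFrame row.

```python
def broken(row):
    # Hypothetical typo: the `|` between the two comparisons is missing,
    # so the first comparison evaluates to a bool and Python then tries to call it.
    if (row['col1'] == 'something1')(row['col1'] == 'smth2'):
        return 'A'
    return 'D'


def error_message(func, row):
    """Call func(row) and return the TypeError message, or None if it succeeds."""
    try:
        func(row)
    except TypeError as exc:
        return str(exc)
    return None


# A plain dict stands in for one row; row['col1'] is a str here,
# just as it would be for a row of an object-dtype column.
row = {'col1': 'something1'}
message = error_message(broken, row)
print(message)
```

If the real code differs from the posted snippet, comparing the two for a slip of this kind (or for a name such as `function` that was rebound to a boolean) may be a reasonable first check.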

Solution

It works fine for me. Here is an alternative solution using Series.isin and numpy.select:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['something1', 'jeff bridges', 'smth7', 'billy boy', 'smth5']})

print(df)

def function(s):
    # s is a single row, so each comparison yields one bool
    if (s['col1'] == 'something1')|(s['col1'] == 'smth2')|(s['col1'] == 'smth3'):
        return 'A'
    elif (s['col1'] == 'smth4')|(s['col1'] == 'smth5'):
        return 'B'
    elif (s['col1'] == 'smth6')|(s['col1'] == 'smth7'):
        return 'C'
    else:
        return 'D'

# Row-wise version from the question
df['new_col'] = df.apply(function, axis=1)

# Vectorized version: one boolean mask per category
m1 = df['col1'].isin(['something1', 'smth2', 'smth3'])
m2 = df['col1'].isin(['smth4', 'smth5'])
m3 = df['col1'].isin(['smth6', 'smth7'])

# np.select takes the choice of the first matching mask, otherwise the default
df['new_col1'] = np.select([m1, m2, m3], ['A', 'B', 'C'], default='D')
print(df)
           col1 new_col new_col1
0    something1       A        A
1  jeff bridges       D        D
2         smth7       C        C
3     billy boy       D        D
4         smth5       B        B
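Since every branch is an exact match on a single value, the same grouping can also be written as a lookup table. This is only a sketch of a third option, not part of the answer above: `CATEGORY_BY_VALUE` and `categorize` are hypothetical names, and the table is copied from the values used in the question.

```python
# Hypothetical lookup table built from the groups used in the question.
CATEGORY_BY_VALUE = {
    'something1': 'A', 'smth2': 'A', 'smth3': 'A',
    'smth4': 'B', 'smth5': 'B',
    'smth6': 'C', 'smth7': 'C',
}


def categorize(value, default='D'):
    """Return the category for value, or default when it is not listed."""
    return CATEGORY_BY_VALUE.get(value, default)


values = ['something1', 'jeff bridges', 'smth7', 'billy boy', 'smth5']
print([categorize(v) for v in values])  # ['A', 'D', 'C', 'D', 'B']
```

With pandas, the table could be applied to the column as df['col1'].map(CATEGORY_BY_VALUE).fillna('D'), or the function as df['col1'].map(categorize). Adding a new value then only requires a new dictionary entry.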