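I have a dataframe, users, with several columns. My goal is to add a column [uses_name] that is True when a user's password is the same as their first name or last name.
For example, in row twelve [user_name] contains milford.hubbard. [uses_name] should then be True, because [password] and [last_name] are the same.
To do this, I used regular expressions to create two columns, [first_name] and [last_name]. When creating [uses_name], I ran into a problem with the | operator. I read up on boolean indexing in the pandas docs but did not find an answer.
My code: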
import pandas as pd
users = pd.read_csv('datasets/users.csv')
# Extracting first and last names into their own columns
users['first_name'] = users['user_name'].str.extract(r'(^\w+)', expand=False)
users['last_name'] = users['user_name'].str.extract(r'(\w+$)', expand=False)
# Flagging the users with passwords that matches their names
users['uses_name'] = users['password'].isin(users['first_name'] | users['last_name'])
# Counting and printing the number of users using names as passwords
print(users['uses_name'].count())
# Taking a look at the 12 first rows
print(users.head(12))
TypeError: unsupported operand type(s) for |: 'str' and 'bool'
The first 12 rows of the users dataframe, with the first_name and last_name columns created:
id user_name password first_name last_name
0 1 vance.jennings joobheco vance jennings
1 2 consuelo.eaton 0869347314 consuelo eaton
2 3 mitchel.perkins fabypotter mitchel perkins
3 4 odessa.vaughan aharney88 odessa vaughan
4 5 araceli.wilder acecdn3000 araceli wilder
5 6 shawn.harrington 5278049 shawn harrington
6 7 evelyn.gay master evelyn gay
7 8 noreen.hale murphy noreen hale
8 9 gladys.ward lwsves2 gladys ward
9 10 brant.zimmerman 1190KAREN5572497 brant zimmerman
10 11 leanna.abbott aivlys24 leanna abbott
11 12 milford.hubbard hubbard milford hubbard
Solution:
This works:
users['uses_name'] = (users['password'] == users['first_name']) | (users['password'] == users['last_name'])
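
Here is a minimal sketch of why the element-wise comparison works. It assumes a small hand-built frame in place of datasets/users.csv (the three rows are copied from the sample in the question). Each == yields a boolean Series, and | then combines the two row by row. The parentheses matter because | binds more tightly than == in Python. Note also that Series.count() counts non-null entries rather than True values, so the sketch uses .sum() to count the flagged users.

```python
import pandas as pd

# Assumption: a tiny stand-in for datasets/users.csv, using rows from the question.
users = pd.DataFrame({
    "user_name": ["vance.jennings", "noreen.hale", "milford.hubbard"],
    "password": ["joobheco", "murphy", "hubbard"],
})

# Same extraction as in the question: leading and trailing word of user_name.
users["first_name"] = users["user_name"].str.extract(r"(^\w+)", expand=False)
users["last_name"] = users["user_name"].str.extract(r"(\w+$)", expand=False)

# Compare row by row; each comparison is a boolean Series, combined with |.
users["uses_name"] = (
    (users["password"] == users["first_name"])
    | (users["password"] == users["last_name"])
)

# True counts as 1, so sum() gives the number of flagged users.
n_flagged = int(users["uses_name"].sum())
print(n_flagged)
print(users[["user_name", "password", "uses_name"]])
```

By contrast, isin() tests whether each password appears anywhere in a collection of values, not whether it matches the name on the same row, so it would not be the right tool here even if the | expression inside it were valid.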