Compare two columns in Python and return the matches in the first table

Problem description

I need help with a loop that compares two columns from different tables and writes the matches back to the first table.

data1:
|name   | revenue |
|-------|---------|
|Alice  | 700     |
|Bob    | 1000    |
|Gerry  | 300     |
|Alex   | 600     |
|Kyle   | 800     |
data2:
|Name   | revenue |
|-------|---------|
|Bob    | 900     |
|Gerry  | 400     |
Expected result (data1):
|name   | revenue  |  name_result |
|-------|----------|--------------|
|Alice  | 700      |              |
|Bob    | 1000     |  Bob         |
|Gerry  | 300      |  Gerry       |
|Alex   | 600      |              |
|Kyle   | 800      |              |

I tried this code, but I get all empty values:

import pandas as pd
import numpy as np

def group_category(category):
    for name in data['name']: 
        if name in data2['Name']:
            return name
        else: name = ''
        return name 
data['name_result'] = data['name'].apply(group_category)

Solution

Use:

def group_category(category):
    # Keep the name only if it appears among the values of df2['Name'].
    if category in df2['Name'].unique():
        return category
    else:
        return ''

# Finally, apply the function to the column.
# It operates on a Series, so map() is used here in place of apply().
df1['name_result'] = df1['name'].map(group_category)
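
A likely reason the attempt in the question returns only empty values is that `in` applied to a pandas Series tests the index labels, not the values. The short sketch below illustrates the difference using the `data2` table from the question (named `df2` here, as in this answer); the variable names `in_series`, `in_values` and `label_in_series` are made up for the illustration.

```python
import pandas as pd

# Sample data copied from the question's data2 table.
df2 = pd.DataFrame({"Name": ["Bob", "Gerry"], "revenue": [900, 400]})

# `in` on a Series checks the index labels (0 and 1 here), not the values.
in_series = "Bob" in df2["Name"]

# .unique() returns a NumPy array, so `in` checks the values instead.
in_values = "Bob" in df2["Name"].unique()

# The label 0 is in the index even though no name equals 0.
label_in_series = 0 in df2["Name"]

print(in_series, in_values, label_in_series)
```

This is why the function above tests membership against `df2['Name'].unique()` and not against the Series itself.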

Or via isin() and where():

df1['name_result'] = df1['name'].where(df1['name'].isin(df2['Name']), '')
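
For reference, here is a self-contained sketch of the isin()/where() approach run against the sample tables from the question. The DataFrames are rebuilt by hand from those tables, and `is_match` is just an illustrative name for the intermediate boolean mask.

```python
import pandas as pd

# Sample tables rebuilt from the question.
df1 = pd.DataFrame({
    "name": ["Alice", "Bob", "Gerry", "Alex", "Kyle"],
    "revenue": [700, 1000, 300, 600, 800],
})
df2 = pd.DataFrame({"Name": ["Bob", "Gerry"], "revenue": [900, 400]})

# True for each row of df1 whose name also appears in df2["Name"].
is_match = df1["name"].isin(df2["Name"])

# Keep the name where the mask is True; use an empty string elsewhere.
df1["name_result"] = df1["name"].where(is_match, "")

print(df1)
```

Because isin() and where() are vectorized, no Python-level loop or helper function is needed.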

I found a solution:

df1.loc[df1['name'].isin(df2['name_result'].unique()),'brand'] = 'Adidas Collection'
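
The line above refers to columns (`name_result` in `df2`, `brand`) that do not appear in the example tables, so it presumably comes from the poster's real data. As a rough sketch only, the same `.loc` with `isin()` pattern adapted to the question's sample tables might look like this; the choice to copy the matching name into `name_result` is an assumption based on the expected result shown in the question.

```python
import pandas as pd

# Sample tables rebuilt from the question.
df1 = pd.DataFrame({
    "name": ["Alice", "Bob", "Gerry", "Alex", "Kyle"],
    "revenue": [700, 1000, 300, 600, 800],
})
df2 = pd.DataFrame({"Name": ["Bob", "Gerry"], "revenue": [900, 400]})

# Boolean mask: rows of df1 whose name appears in df2["Name"].
matches = df1["name"].isin(df2["Name"])

# Start with an empty result column, then fill only the matching rows.
df1["name_result"] = ""
df1.loc[matches, "name_result"] = df1.loc[matches, "name"]

print(df1)
```

Assigning through `.loc` with a boolean mask changes only the selected rows, which is also how the original line sets a constant label on the matching rows.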