Problem description
import pandas as pd
# making data frame from csv file
data = pd.read_csv("C:/Users/gvsph/Downloads/employees.csv")
# sorting by first name
data.sort_values("First Name", inplace=True)
# dropping ALL duplicate values (keep=False removes every row whose first name repeats)
data.drop_duplicates(subset="First Name", keep=False, inplace=True)
# displaying data
print(data)
Solution
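You can call "drop_duplicates" with no arguments to remove duplicate records from the dataset. In that form, rows are compared on all columns and the first occurrence of each duplicated row is kept.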
See the pandas documentation for DataFrame.drop_duplicates.
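To make the difference between the two calls concrete, here is a minimal sketch. The small DataFrame and its "Team" column are made-up stand-ins for employees.csv, whose real contents are not shown in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for employees.csv (assumption)
data = pd.DataFrame({
    "First Name": ["Ann", "Bob", "Ann", "Ann"],
    "Team": ["Sales", "IT", "Sales", "HR"],
})

# No arguments: rows are compared on every column and the first copy is kept
deduped = data.drop_duplicates()

# subset + keep=False: every row whose "First Name" occurs more than once is dropped
unique_names = data.drop_duplicates(subset="First Name", keep=False)

print(deduped)
print(unique_names)
```

With this sample, the first call removes only the second "Ann / Sales" row, while the second call leaves only the "Bob" row. Which one you want depends on whether "duplicate" means an identical record or a repeated first name.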