Error when dropping duplicates in a Python AzureML classification problem

Problem description

I get this error when calling the drop_duplicates function:

Traceback (most recent call last):
  File "train.py", line 159, in <module>
    orders_dfx = preprocess_orders(orders_df)
  File "train.py", line 20, in preprocess_orders
    ao = ao.drop_duplicates(subset=['order_id'],keep='last')
AttributeError: 'TabularDataset' object has no attribute 'drop_duplicates'

This is part of the train.py code:

def preprocess_orders(ao):
  ao = ao.drop_duplicates(subset=['order_id'],keep='last')
  ao['order_id'] = ao['order_id'].astype('str')
  ao['class'] = ao['class'].astype('int')
  ao['age'] = ao['age'].astype('float').fillna(ao['age'].mean()).round(2)
  return ao

orders_df = Dataset.get_by_name(ws,name='class_cancelled_orders')
orders_df.to_pandas_dataframe()
# Doing processing
orders_dfx = preprocess_orders(orders_df)

I am fetching the data from a dataset in Azure ML studio. The job.py file is used to run the experiment:

# submit job
run = Experiment(ws,experiment_name).submit(src)
run.wait_for_completion(show_output=True)

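Solution

The to_pandas_dataframe() method returns a Pandas DataFrame; it does not convert the TabularDataset in place. In your code the return value is discarded, so orders_df is still a TabularDataset when it reaches preprocess_orders. You need to assign the result back to your variable:

orders_df = orders_df.to_pandas_dataframe()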
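
Once the result is assigned back, the preprocessing itself should work on the DataFrame. The following is a minimal sketch of that step using only pandas, not the asker's actual pipeline. The sample data is hypothetical and stands in for the output of to_pandas_dataframe(); the isinstance check is an optional addition (an assumption, not part of the original code) that makes the mistake easier to spot.

```python
import pandas as pd


def preprocess_orders(ao):
    # Optional guard (an addition for illustration): fail with a clear message
    # if something other than a DataFrame, e.g. a TabularDataset, is passed in.
    if not isinstance(ao, pd.DataFrame):
        raise TypeError(
            "preprocess_orders expects a pandas DataFrame; "
            "call to_pandas_dataframe() and assign the result first"
        )
    # Keep only the last row for each order_id; copy so later column
    # assignments modify an independent frame.
    ao = ao.drop_duplicates(subset=['order_id'], keep='last').copy()
    ao['order_id'] = ao['order_id'].astype('str')
    ao['class'] = ao['class'].astype('int')
    # Fill missing ages with the column mean, then round to 2 decimals.
    ao['age'] = ao['age'].astype('float')
    ao['age'] = ao['age'].fillna(ao['age'].mean()).round(2)
    return ao


# Hypothetical sample data standing in for orders_df.to_pandas_dataframe()
sample = pd.DataFrame({
    'order_id': [1, 1, 2, 3],
    'class': [0, 1, 0, 1],
    'age': [30.0, 31.0, None, 40.0],
})
result = preprocess_orders(sample)
print(result)
```

With the guard in place, passing the un-converted dataset raises a TypeError that names the missing conversion instead of the less obvious AttributeError.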