Azure ML inference schema: "list index out of range" error

Problem description

I have deployed an ML model on Azure ML Studio, and I am updating it with an inference schema to make it compatible with Power BI, as described here.

When sending data to the model through the REST API (before adding this inference schema), everything worked fine and I got results back. However, once I added the schema as described in the instructions linked above and personalized it for my data, the same data sent through the REST API only returns the error "list index out of range". The deployment goes through fine and is reported as "Healthy", with no error message.

Any help would be much appreciated. Thanks.

Edit:

Entry script:

 import numpy as np
 import pandas as pd
 import joblib
 from azureml.core.model import Model
    
 from inference_schema.schema_decorators import input_schema,output_schema
 from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
 from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
 from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType
    
 def init():
     global model
     #Model name is the name of the model registered under the workspace
     model_path = Model.get_model_path(model_name = 'databricksmodelpowerbi2')
     model = joblib.load(model_path)
    
 #Provide 3 sample inputs for schema generation for 2 rows of data
 numpy_sample_input = NumpyParameterType(np.array([[2400.0,78.26086956521739,11100.0,3.612565445026178,3.0,0.0],[368.55,96.88311688311687,709681.1600000012,73.88059701492537,44.0,0.0]],dtype = 'float64'))
 pandas_sample_input = PandasParameterType(pd.DataFrame({'value': [2400.0,368.55],'delayed_percent': [78.26086956521739,96.88311688311687],'total_value_delayed': [11100.0,709681.1600000012],'num_invoices_per30_dealing_days': [3.612565445026178,73.88059701492537],'delayed_streak': [3.0,44.0],'prompt_streak': [0.0,0.0]}))
 standard_sample_input = StandardPythonParameterType(0.0)
    
 # This is a nested input sample; any item wrapped by `ParameterType` will be described by the schema
 sample_input = StandardPythonParameterType({'input1': numpy_sample_input,'input2': pandas_sample_input,'input3': standard_sample_input})
    
 sample_global_parameters = StandardPythonParameterType(1.0) #this is optional
 sample_output = StandardPythonParameterType([1.0,1.0])
    
 @input_schema('inputs',sample_input)
 @input_schema('global_parameters',sample_global_parameters) #this is optional
 @output_schema(sample_output)
    
 def run(inputs,global_parameters):
     try:
         data = inputs['input1']
         # data will be converted to the target format
         assert isinstance(data,np.ndarray)
         result = model.predict(data)
         return result.tolist()
     except Exception as e:
         error = str(e)
         return error

Prediction script:

 import requests
 import json
 from ast import literal_eval
    
 # URL for the web service
 scoring_uri = ''
 ## If the service is authenticated,set the key or token
 #key = '<your key or token>'
    
 # Data to score
 data = {"data": [[2400.0,0.0]]}
 # Convert to JSON string
 input_data = json.dumps(data)
    
 # Set the content type
 headers = {'Content-Type': 'application/json'}
 ## If authentication is enabled,set the authorization header
 #headers['Authorization'] = f'Bearer {key}'
    
 # Make the request and display the response
 resp = requests.post(scoring_uri, data=input_data, headers=headers)
 print(resp.text)
    
 result = literal_eval(resp.text)
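
One likely cause of the "list index out of range" error is the request body: the prediction script above still posts `{"data": [...]}`, but once the inference schema is added, the service expects the top-level key to match the decorated parameter name and to contain every nested key. A sketch of a body that matches the entry script's schema (the feature values are just the samples from the schema; exact accepted formats may depend on the inference-schema version):

```python
import json

# Top-level key must match @input_schema('inputs', ...); nested keys
# 'input1', 'input2', 'input3' must all be present.
payload = {
    "inputs": {
        "input1": [[2400.0, 78.26086956521739, 11100.0,
                    3.612565445026178, 3.0, 0.0]],
        "input2": {
            "value": [2400.0],
            "delayed_percent": [78.26086956521739],
            "total_value_delayed": [11100.0],
            "num_invoices_per30_dealing_days": [3.612565445026178],
            "delayed_streak": [3.0],
            "prompt_streak": [0.0],
        },
        "input3": 0.0,
    },
    "global_parameters": 1.0,
}
input_data = json.dumps(payload)
```

With a body shaped like this, the schema validation can locate each named input instead of indexing into a list that is not there.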

Solution

I'm not sure whether you have solved this already, but I ran into a similar problem and could not get Power BI to see my ML model. In the end I created a service specifically for Power BI (pandas df type) using the schema shown in the script below.


The Microsoft documentation says: "In order to generate conforming swagger for automated web service consumption, the scoring script run() function must have the API shape of:

A first parameter of type "StandardPythonParameterType", named Inputs and nested.

An optional second parameter of type "StandardPythonParameterType", named GlobalParameters.

Returns a dictionary of type "StandardPythonParameterType", named Results and nested."

I have tested this and it is indeed case sensitive, so it would be something like this:

import numpy as np
import pandas as pd
import joblib

from azureml.core.model import Model
from inference_schema.schema_decorators import input_schema,output_schema
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

def init():
    global model
    # Model name is the name of the model registered under the workspace
    model_path = Model.get_model_path(model_name = 'databricksmodelpowerbi2')
    model = joblib.load(model_path)

# Provide 3 sample inputs for schema generation for 2 rows of data
numpy_sample_input = NumpyParameterType(np.array([[2400.0,78.26086956521739,11100.0,3.612565445026178,3.0,0.0],[368.55,96.88311688311687,709681.1600000012,73.88059701492537,44.0,0.0]],dtype = 'float64'))

pandas_sample_input = PandasParameterType(pd.DataFrame({'value': [2400.0,368.55],'delayed_percent': [78.26086956521739,96.88311688311687],'total_value_delayed': [11100.0,709681.1600000012],'num_invoices_per30_dealing_days': [3.612565445026178,73.88059701492537],'delayed_streak': [3.0,44.0],'prompt_streak': [0.0,0.0]}))

standard_sample_input = StandardPythonParameterType(0.0)

# This is a nested input sample; any item wrapped by `ParameterType` will be described by the schema
sample_input = StandardPythonParameterType({'input1': numpy_sample_input,'input2': pandas_sample_input,'input3': standard_sample_input})

sample_global_parameters = StandardPythonParameterType(1.0) #this is optional

numpy_sample_output = NumpyParameterType(np.array([1.0,2.0]))

# 'Results' is case sensitive
sample_output = StandardPythonParameterType({'Results': numpy_sample_output})

# 'Inputs' is case sensitive
@input_schema('Inputs',sample_input)
# 'GlobalParameters' is also case sensitive
@input_schema('GlobalParameters', sample_global_parameters) # this is optional
@output_schema(sample_output)
def run(Inputs, GlobalParameters):
    try:
        data = Inputs['input1']
        # data will be converted to the target format
        assert isinstance(data, np.ndarray)
        result = model.predict(data)
        # the return value must be a dict keyed 'Results' to match the output schema
        return {'Results': result.tolist()}
    except Exception as e:
        error = str(e)
        return error
