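Problem description
I have deployed an ML model on Azure ML Studio, and I am updating it with an inference schema so that it is compatible with Power BI, as described here.
When I sent data to the model through the REST API before adding the inference schema, everything worked and I got results back. However, once I added the schema following the instructions linked above and adapted it to my data, the same data sent through the REST API only returns the error "list index out of range". The deployment itself goes fine and is reported as "Healthy", with no error messages.
Any help would be greatly appreciated. Thanks.
Edit:
Entry script: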
    import numpy as np
    import pandas as pd
    import joblib
    from azureml.core.model import Model
    from inference_schema.schema_decorators import input_schema, output_schema
    from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
    from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
    from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

    def init():
        global model
        # Model name is the name of the model registered under the workspace
        model_path = Model.get_model_path(model_name='databricksmodelpowerbi2')
        model = joblib.load(model_path)

    # Provide 3 sample inputs for schema generation for 2 rows of data
    numpy_sample_input = NumpyParameterType(np.array([
        [2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0],
        [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0],
    ], dtype='float64'))
    pandas_sample_input = PandasParameterType(pd.DataFrame({
        'value': [2400.0, 368.55],
        'delayed_percent': [78.26086956521739, 96.88311688311687],
        'total_value_delayed': [11100.0, 709681.1600000012],
        'num_invoices_per30_dealing_days': [3.612565445026178, 73.88059701492537],
        'delayed_streak': [3.0, 44.0],
        'prompt_streak': [0.0, 0.0],
    }))
    standard_sample_input = StandardPythonParameterType(0.0)

    # This is a nested input sample, any item wrapped by `ParameterType` will be described by schema
    sample_input = StandardPythonParameterType({
        'input1': numpy_sample_input,
        'input2': pandas_sample_input,
        'input3': standard_sample_input,
    })
    sample_global_parameters = StandardPythonParameterType(1.0)  # this is optional
    sample_output = StandardPythonParameterType([1.0, 1.0])

    @input_schema('inputs', sample_input)
    @input_schema('global_parameters', sample_global_parameters)  # this is optional
    @output_schema(sample_output)
    def run(inputs, global_parameters):
        try:
            data = inputs['input1']
            # data will be converted to the target format
            assert isinstance(data, np.ndarray)
            result = model.predict(data)
            return result.tolist()
        except Exception as e:
            error = str(e)
            return error
Prediction script:
import requests
import json
from ast import literal_eval
# URL for the web service
scoring_uri = ''
## If the service is authenticated,set the key or token
#key = '<your key or token>'
# Two sets of data to score,so we get two results back
data = {"data": [[2400.0,0.0]]}
# Convert to JSON string
input_data = json.dumps(data)
# Set the content type
headers = {'Content-Type': 'application/json'}
## If authentication is enabled,set the authorization header
#headers['Authorization'] = f'Bearer {key}'
# Make the request and display the response
resp = requests.post(scoring_uri,input_data,headers=headers)
print(resp.text)
result = literal_eval(resp.text)
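One thing worth noting about the two scripts above: run() catches every exception and returns str(e), so a message such as "list index out of range" arrives as an ordinary response body rather than as an HTTP failure. The following is a minimal sketch of a client-side check for that case, assuming the service JSON-encodes whatever run() returns (not verified against the deployed endpoint). The helper name parse_scoring_response is hypothetical and not part of any Azure library.

```python
import json


def parse_scoring_response(text):
    """Return the predictions as a list, or raise if the body is an error message.

    Assumption: the service JSON-encodes the value returned by run().
    """
    try:
        body = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON at all: surface the raw body instead of guessing.
        raise RuntimeError(f"Unparseable response: {text!r}")
    if isinstance(body, str):
        # run() returned str(e), i.e. the entry script caught an exception.
        raise RuntimeError(f"Scoring script error: {body}")
    if isinstance(body, list):
        return body
    raise RuntimeError(f"Unexpected response shape: {body!r}")


# Example with literal bodies instead of a live request.
print(parse_scoring_response("[1.0, 1.0]"))
try:
    parse_scoring_response('"list index out of range"')
except RuntimeError as err:
    print(err)
```

In the prediction script this would replace the literal_eval call, for example predictions = parse_scoring_response(resp.text), so that an error string raises instead of being treated as a result.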
Solution
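I'm not sure whether you have solved this yet, but I ran into a similar problem and could not get Power BI to see my ML model. In the end I created a service specifically for Power BI, using a schema based on the pandas DataFrame type.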
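The Microsoft documentation says: "To generate conforming swagger for automated web service consumption, the scoring script run() function must have the following API shape:
A first parameter of type "StandardPythonParameterType", named Inputs and nested.
An optional second parameter of type "StandardPythonParameterType", named GlobalParameters.
Return a dictionary of type "StandardPythonParameterType", named Results and nested."
I have tested this and the names are case-sensitive, so it would look like this: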
    import numpy as np
    import pandas as pd
    import joblib
    from azureml.core.model import Model
    from inference_schema.schema_decorators import input_schema, output_schema
    from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
    from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
    from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

    def init():
        global model
        # Model name is the name of the model registered under the workspace
        model_path = Model.get_model_path(model_name='databricksmodelpowerbi2')
        model = joblib.load(model_path)

    # Provide 3 sample inputs for schema generation for 2 rows of data
    numpy_sample_input = NumpyParameterType(np.array([
        [2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0],
        [368.55, 96.88311688311687, 709681.1600000012, 73.88059701492537, 44.0, 0.0],
    ], dtype='float64'))
    pandas_sample_input = PandasParameterType(pd.DataFrame({
        'value': [2400.0, 368.55],
        'delayed_percent': [78.26086956521739, 96.88311688311687],
        'total_value_delayed': [11100.0, 709681.1600000012],
        'num_invoices_per30_dealing_days': [3.612565445026178, 73.88059701492537],
        'delayed_streak': [3.0, 44.0],
        'prompt_streak': [0.0, 0.0],
    }))
    standard_sample_input = StandardPythonParameterType(0.0)

    # This is a nested input sample, any item wrapped by `ParameterType` will be described by schema
    sample_input = StandardPythonParameterType({
        'input1': numpy_sample_input,
        'input2': pandas_sample_input,
        'input3': standard_sample_input,
    })
    sample_global_parameters = StandardPythonParameterType(1.0)  # this is optional
    numpy_sample_output = NumpyParameterType(np.array([1.0, 2.0]))
    # 'Results' is case sensitive
    sample_output = StandardPythonParameterType({'Results': numpy_sample_output})

    # 'Inputs' is case sensitive
    @input_schema('Inputs', sample_input)
    # Optional; note that the documentation quoted above names this parameter GlobalParameters
    @input_schema('global_parameters', sample_global_parameters)
    @output_schema(sample_output)
    def run(Inputs, global_parameters):
        try:
            # Must match the parameter name exactly: 'Inputs', not 'inputs'
            data = Inputs['input1']
            # data will be converted to the target format
            assert isinstance(data, np.ndarray)
            result = model.predict(data)
            return result.tolist()
        except Exception as e:
            error = str(e)
            return error
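With a nested schema like the one above, the request body presumably has to use the same case-sensitive names as the run() parameters. Below is a rough sketch of building such a body with only the standard library. It rests on several assumptions that I have not checked against a live endpoint: that the top-level JSON keys map onto the run() argument names, that the NumPy input is sent as a nested list, and that the pandas input is sent as a list of records. The helper build_payload is hypothetical; the column names are copied from the sample DataFrame.

```python
import json

# Column names copied from the pandas sample in the entry script.
COLUMNS = [
    "value",
    "delayed_percent",
    "total_value_delayed",
    "num_invoices_per30_dealing_days",
    "delayed_streak",
    "prompt_streak",
]


def build_payload(rows, inputs_key="Inputs", global_key="global_parameters"):
    """Build a request body mirroring the nested sample_input (assumed wire format)."""
    for row in rows:
        if len(row) != len(COLUMNS):
            # A short row is one plausible source of index errors downstream.
            raise ValueError(f"expected {len(COLUMNS)} values per row, got {len(row)}")
    return {
        inputs_key: {
            # Assumed: NumPy input as a nested list of floats.
            "input1": [[float(v) for v in row] for row in rows],
            # Assumed: pandas input as a list of records (one dict per row).
            "input2": [dict(zip(COLUMNS, (float(v) for v in row))) for row in rows],
            "input3": 0.0,
        },
        global_key: 1.0,
    }


rows = [[2400.0, 78.26086956521739, 11100.0, 3.612565445026178, 3.0, 0.0]]
input_data = json.dumps(build_payload(rows))
print(input_data)
```

The resulting input_data string can be posted the same way as in the prediction script. The key names are parameters so that they can be switched to whatever the deployed run() signature actually uses.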