Problem description
I have trained a Keras model (not an Estimator) with the following serving signature:
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['examples'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: serving_default_examples:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['mu'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall_1:0
    outputs['sigma'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall_1:1
  Method name is: tensorflow/serving/predict
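Before the model is exported as a saved_model, its weights are updated using a custom training loop with a gradient tape rather than the model.fit method. Because I could not get TFMA to work without first compiling the model, I compile the model while specifying a set of custom Keras metrics: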
model.compile(metrics=custom_keras_metrics)  # each custom metric subclasses tf.keras.metrics.Metric
custom_training_loop(model)
model.save("path/to/saved_model", save_format="tf")
I want to evaluate this model with TFMA, so I first initialize the eval shared model as follows:
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="my_label_key")],
    slicing_specs=[tfma.SlicingSpec()],  # empty slice refers to the entire dataset
)
eval_shared_model = tfma.default_eval_shared_model("path/to/saved_model", eval_config=eval_config)
However, when I try to run the model analysis:
eval_results = tfma.run_model_analysis(
    eval_shared_model=eval_shared_model,
    data_location="path/to/test/tfrecords*",
    file_format="tfrecords",
)
I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-156-f9a9684a6797> in <module>
2 eval_shared_model=eval_shared_model,
3 data_location="tfma/test_raw-*",
----> 4 file_format="tfrecords"
5 )
~/.pyenv/versions/miniconda3-4.3.30/envs/tensorflow/lib/python3.7/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in run_model_analysis(eval_shared_model,eval_config,data_location,file_format,output_path,extractors,evaluators,writers,pipeline_options,slice_spec,write_config,compute_confidence_intervals,min_slice_size,random_seed_for_testing,schema)
1204
1205 if len(eval_config.model_specs) <= 1:
-> 1206 return load_eval_result(output_path)
1207 else:
1208 results = []
~/.pyenv/versions/miniconda3-4.3.30/envs/tensorflow/lib/python3.7/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in load_eval_result(output_path,model_name)
383 metrics_and_plots_serialization.load_and_deserialize_metrics(
384 path=os.path.join(output_path,constants.METRICS_KEY),
--> 385 model_name=model_name))
386 plots_proto_list = (
387 metrics_and_plots_serialization.load_and_deserialize_plots(
~/.pyenv/versions/miniconda3-4.3.30/envs/tensorflow/lib/python3.7/site-packages/tensorflow_model_analysis/writers/metrics_and_plots_serialization.py in load_and_deserialize_metrics(path,model_name)
180 raise ValueError('Fail to find metrics for model name: %s . '
181 'Available model names are [%s]' %
--> 182 (model_name,','.join(keys)))
183
184 result.append((
ValueError: Fail to find metrics for model name: None . Available model names are []
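Why does TFMA raise this exception, and where should I start debugging it? I tried specifying the model name manually (which should not be necessary, since I am only using one model), but that did not help either. I also tried tracing the source code, and the error seems to occur when TFMA tries to load the evaluation results produced by the PTransform.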
I am using tensorflow==2.3.0 and tensorflow-model-analysis==0.22.1.
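The empty list in "Available model names are []" suggests that no metrics were read back from the output location, so one possible first check is whether any input files were matched at all and what was actually written. The following is a minimal sketch using only the standard library. It assumes the data and output paths are on the local filesystem, and the helper names list_matching_files and list_output_files are hypothetical, not part of TFMA. Python's glob may also match patterns slightly differently from the file matching Beam uses internally.

```python
import glob
import os


def list_matching_files(data_location):
    """Return (path, size_in_bytes) pairs for files matching a local glob pattern."""
    return [
        (path, os.path.getsize(path))
        for path in sorted(glob.glob(data_location))
        if os.path.isfile(path)
    ]


def list_output_files(output_path):
    """Return {relative_path: size_in_bytes} for every file under output_path.

    A missing directory yields an empty dict, because os.walk produces nothing.
    """
    found = {}
    for root, _dirs, files in os.walk(output_path):
        for name in files:
            full_path = os.path.join(root, name)
            found[os.path.relpath(full_path, output_path)] = os.path.getsize(full_path)
    return found
```

If list_matching_files("tfma/test_raw-*") returns an empty list, or only zero-byte files, then no examples would reach the evaluation. To use list_output_files, an explicit output_path would have to be passed to tfma.run_model_analysis so that the written files can be inspected afterwards.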