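Transpose error from TFMA when using the Evaluator with TF Serving

Problem description

I am deploying a semantic image segmentation pipeline with TFX. I get an error at context.run(evaluator) for the Evaluator component.

The Evaluator component looks like this: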

import tensorflow_model_analysis as tfma
from tfx.components import Evaluator

# transform, trainer, model_resolver and context come from earlier pipeline steps.

# Value and change thresholds applied to the AUC metric.
accuracy_threshold = tfma.MetricThreshold(
    value_threshold=tfma.GenericValueThreshold(
        lower_bound={'value': 0.001},
        upper_bound={'value': 0.99}),
    change_threshold=tfma.GenericChangeThreshold(
        absolute={'value': 0.0001},
        direction=tfma.MetricDirection.HIGHER_IS_BETTER),
)

metrics_specs = tfma.MetricsSpec(
    metrics=[
        tfma.MetricConfig(class_name='AUC', threshold=accuracy_threshold)
    ]
)

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='mask/raw_xf')],
    metrics_specs=[metrics_specs],
    slicing_specs=[tfma.SlicingSpec()],
)

evaluator = Evaluator(
    examples=transform.outputs['transformed_examples'],
    # examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    baseline_model=model_resolver.outputs['model'],
    eval_config=eval_config)

context.run(evaluator)

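The stack trace of the error is shown below. It is long, but the main error is a transpose shape error. I have been suspecting that this is related to TF Serving. In the step before this one, I ran context.run() on the Trainer component, which trained the model successfully and saved it with an inference signature.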


INFO:absl:Evaluating model.
INFO:absl:Using 1 process(es) for Beam pipeline execution.
WARNING:tensorflow:Large batch_size 1 failed with error  transpose expects a vector of size 2. But input(1) is a vector of size 4
     [[{{node StatefulPartitionedCall/functional_1/stem_conv/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer}}]] [Op:__inference_signature_wrapper_222737]

Function call stack:
signature_wrapper
. Attempting to run batch through serially.
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow_model_analysis/model_util.py in process(self,elements)
    421     try:
--> 422       result = self._batch_reducible_process(elements)
    423       self._batch_size.update(batch_size)

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow_model_analysis/extractors/predict_extractor_v2.py in _batch_reducible_process(self,batch_of_extracts)
    146       if isinstance(inputs,dict):
--> 147         outputs = signature(**{k: tf.constant(v) for k,v in inputs.items()})
    148       else:

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self,*args,**kwargs)
   1654     """
-> 1655     return self._call_impl(args,kwargs)
   1656 

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_impl(self,args,kwargs,cancellation_manager)
   1672 
-> 1673       return self._call_with_flat_signature(args,cancellation_manager)
   1674 

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_with_flat_signature(self,cancellation_manager)
   1721                                              type(arg).__name__,str(arg)))
-> 1722     return self._call_flat(args,self.captured_inputs,cancellation_manager)
   1723 

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py in _call_flat(self,captured_inputs,cancellation_manager)
    105     return super(_WrapperFunction,self)._call_flat(args,--> 106                                                     cancellation_manager)
    107 

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_flat(self,cancellation_manager)
   1923       return self._build_call_outputs(self._inference_function.call(
-> 1924           ctx,cancellation_manager=cancellation_manager))
   1925     forward_backward = self._select_forward_and_backward_functions(

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in call(self,ctx,cancellation_manager)
    549               attrs=attrs,--> 550               ctx=ctx)
    551         else:

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name,num_outputs,inputs,attrs,name)
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle,device_name,op_name,---> 60                                         inputs,num_outputs)
     61   except core._NotOkStatusException as e:

InvalidArgumentError:  transpose expects a vector of size 2. But input(1) is a vector of size 4
     [[{{node StatefulPartitionedCall/functional_1/stem_conv/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer}}]] [Op:__inference_signature_wrapper_222737]

Function call stack:
signature_wrapper


During handling of the above exception,another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/common.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/common.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow_model_analysis/model_util.py in process(self,elements)
    433         self._batch_size.update(1)
--> 434         result.extend(self._batch_reducible_process([element]))
    435       self._num_instances.inc(len(result))

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow_model_analysis/extractors/predict_extractor_v2.py in _batch_reducible_process(self,another exception occurred:

RuntimeError                              Traceback (most recent call last)
 in 
     36     eval_config=eval_config)
     37 
---> 38 context.run(evaluator)

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tfx/orchestration/experimental/interactive/interactive_context.py in run_if_ipython(*args,**kwargs)
     64       # __IPYTHON__ variable is set by IPython,see
     65       # https://ipython.org/ipython-doc/rel-0.10.2/html/interactive/reference.html#embedding-ipython.
---> 66       return fn(*args,**kwargs)
     67     else:
     68       absl.logging.warning(

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tfx/orchestration/experimental/interactive/interactive_context.py in run(self,component,enable_cache,beam_pipeline_args)
    166         component,pipeline_info,driver_args,metadata_connection,167         beam_pipeline_args,additional_pipeline_args)
--> 168     execution_id = launcher.launch().execution_id
    169 
    170     return execution_result.ExecutionResult(

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tfx/orchestration/launcher/base_component_launcher.py in launch(self)
    203                          execution_decision.input_dict,204                          execution_decision.output_dict,--> 205                          execution_decision.exec_properties)
    206 
    207     absl.logging.info('Running publisher for %s',~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tfx/orchestration/launcher/in_process_component_launcher.py in _run_executor(self,execution_id,input_dict,output_dict,exec_properties)
     65         executor_context)  # type: ignore
     66 
---> 67     executor.Do(input_dict,exec_properties)

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tfx/components/evaluator/executor.py in Do(self,exec_properties)
    222              eval_config=eval_config,223              output_path=output_uri,--> 224              slice_spec=slice_spec))
    225     absl.logging.info(
    226         'Evaluation complete. Results written to {}.'.format(output_uri))

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/pipeline.py in __exit__(self,exc_type,exc_val,exc_tb)
    522 
    523     if not exc_type:
--> 524       self.run().wait_until_finish()
    525 
    526   def visit(self,visitor):

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/pipeline.py in run(self,test_runner_api)
    508       finally:
    509         shutil.rmtree(tmpdir)
--> 510     return self.runner.run_pipeline(self,self._options)
    511 
    512   def __enter__(self):

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_pipeline(self,pipeline,options)
    177 
    178     self._latest_run_result = self.run_via_runner_api(
--> 179         pipeline.to_runner_api(default_environment=self._default_environment))
    180     return self._latest_run_result
    181 

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_via_runner_api(self,pipeline_proto)
    187     # TODO(pabloem,BEAM-7514): Create a watermark manager (that has access to
    188     #   the teststream (if any),and all the stages).
--> 189     return self.run_stages(stage_context,stages)
    190 
    191   @contextlib.contextmanager

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_stages(self,stage_context,stages)
    333           stage_results = self._run_stage(
    334               runner_execution_context,--> 335               bundle_context_manager,336           )
    337           monitoring_infos_by_stage[stage.name] = (

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in _run_stage(self,runner_execution_context,bundle_context_manager)
    543                                                    data_output,544                                                    {},--> 545                                                    expected_timer_output)
    546 
    547     last_result = result

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in process_bundle(self,expected_outputs,fired_timers,expected_output_timers)
   1049 
   1050     with UnboundedThreadPoolExecutor() as executor:
-> 1051       for result,split_result in executor.map(execute,part_inputs):
   1052 
   1053         split_result_list += split_result

/usr/lib/python3.7/concurrent/futures/_base.py in result_iterator()
    596                     # Careful not to keep a reference to the popped future
    597                     if timeout is None:
--> 598                         yield fs.pop().result()
    599                     else:
    600                         yield fs.pop().result(end_time - time.monotonic())

/usr/lib/python3.7/concurrent/futures/_base.py in result(self,timeout)
    433                 raise CancelledError()
    434             elif self._state == FINISHED:
--> 435                 return self.__get_result()
    436             else:
    437                 raise TimeoutError()

/usr/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/utils/thread_pool_executor.py in run(self)
     42       # If the future wasn't cancelled,then attempt to execute it.
     43       try:
---> 44         self._future.set_result(self._fn(*self._fn_args,**self._fn_kwargs))
     45       except BaseException as exc:
     46         # Even though Python 2 futures library has #set_exection(),~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in execute(part_map)
   1046           cache_token_generator=self._cache_token_generator)
   1047       return bundle_manager.process_bundle(
-> 1048           part_map,expected_output_timers)
   1049 
   1050     with UnboundedThreadPoolExecutor() as executor:

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in process_bundle(self,expected_output_timers)
    945             process_bundle_descriptor_id=self._bundle_descriptor.id,946             cache_tokens=[next(self._cache_token_generator)]))
--> 947     result_future = self._worker_handler.control_conn.push(process_bundle_req)
    948 
    949     split_results = []  # type: List[beam_fn_api_pb2.ProcessBundleSplitResponse]

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/worker_handlers.py in push(self,request)
    347       self._uid_counter += 1
    348       request.instruction_id = 'control_%s' % self._uid_counter
--> 349     response = self.worker.do_instruction(request)
    350     return ControlFuture(request.instruction_id,response)
    351 

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py in do_instruction(self,request)
    469       # E.g. if register is set,this will call self.register(request.register))
    470       return getattr(self,request_type)(
--> 471           getattr(request,request_type),request.instruction_id)
    472     else:
    473       raise NotImplementedError

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py in process_bundle(self,request,instruction_id)
    504         with self.maybe_profile(instruction_id):
    505           delayed_applications,requests_finalization = (
--> 506               bundle_processor.process_bundle(instruction_id))
    507           monitoring_infos = bundle_processor.monitoring_infos()
    508           monitoring_infos.extend(self.state_cache_metrics_fn())

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py in process_bundle(self,instruction_id)
    970           elif isinstance(element,beam_fn_api_pb2.Elements.Data):
    971             input_op_by_transform_id[element.transform_id].process_encoded(
--> 972                 element.data)
    973 
    974       # Finish all operations.

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py in process_encoded(self,encoded_windowed_values)
    216       decoded_value = self.windowed_coder_impl.decode_from_stream(
    217           input_stream,True)
--> 218       self.output(decoded_value)
    219 
    220   def monitoring_infos(self,transform_id,tag_to_pcollection_id):

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/worker/operations.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()
 
~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/common.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/common.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/common.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/.vscode/extensions/ms-python.python-2020.8.103604/pythonFiles/lib/python/future/utils/__init__.py in raise_with_traceback(exc,traceback)
    444         if traceback == Ellipsis:
    445             _,_,traceback = sys.exc_info()
--> 446         raise exc.with_traceback(traceback)
    447 
    448 else:

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/common.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/apache_beam/runners/common.cpython-37m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow_model_analysis/model_util.py in process(self,elements)
    432       for element in elements:
    433         self._batch_size.update(1)
--> 434         result.extend(self._batch_reducible_process([element]))
    435       self._num_instances.inc(len(result))
    436       return result

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow_model_analysis/extractors/predict_extractor_v2.py in _batch_reducible_process(self,batch_of_extracts)
    145 
    146       if isinstance(inputs,v in inputs.items()})
    148       else:
    149         outputs = signature(tf.constant(inputs,dtype=tf.string))

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self,**kwargs)
   1653       TypeError: If the arguments do not match the function's signature.
   1654     """
-> 1655     return self._call_impl(args,kwargs)
   1656 
   1657   def _call_impl(self,cancellation_manager=None):

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_impl(self,cancellation_manager)
   1671             raise structured_err
   1672 
-> 1673       return self._call_with_flat_signature(args,cancellation_manager)
   1674 
   1675   def _call_with_flat_signature(self,cancellation_manager):

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_with_flat_signature(self,cancellation_manager)
   1720                         "got {} ({})".format(self._flat_signature_summary(),i,1721                                              type(arg).__name__,cancellation_manager)
   1723 
   1724   def _call_with_structured_signature(self,cancellation_manager):

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py in _call_flat(self,cancellation_manager)
    104           map(get_cross_replica_handle,captured_inputs))
    105     return super(_WrapperFunction,--> 106                                                     cancellation_manager)
    107 
    108 

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_flat(self,cancellation_manager)
   1922       # No tape is watching; skip to running the function.
   1923       return self._build_call_outputs(self._inference_function.call(
-> 1924           ctx,cancellation_manager=cancellation_manager))
   1925     forward_backward = self._select_forward_and_backward_functions(
   1926         args,~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py in call(self,cancellation_manager)
    548               inputs=args,549               attrs=attrs,--> 550               ctx=ctx)
    551         else:
    552           outputs = execute.execute_with_cancellation(

~/git/kubeflow-pipelines/venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name,name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle,num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

RuntimeError: tensorflow.python.framework.errors_impl.InvalidArgumentError:  transpose expects a vector of size 2. But input(1) is a vector of size 4
     [[{{node StatefulPartitionedCall/functional_1/stem_conv/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer}}]] [Op:__inference_signature_wrapper_222737]

Function call stack:
signature_wrapper [while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/ExtractPredictions/Predict']
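
A note on reading the final error message above. In TensorFlow's Transpose op, the first number in "transpose expects a vector of size N" is the rank of the tensor being transposed, and the second number is the length of the permutation vector passed as input(1). The failing node is the NHWC-to-NCHW transpose that the layout optimizer placed in front of stem_conv, so the permutation has 4 entries while the tensor that reached it has rank 2. The sketch below only pulls those two numbers out of the message; parse_transpose_error is a hypothetical helper written for illustration, not part of TensorFlow or TFMA.

```python
import re

# Message copied from the RuntimeError at the end of the trace.
MESSAGE = ("transpose expects a vector of size 2. "
           "But input(1) is a vector of size 4")


def parse_transpose_error(message):
    """Return (tensor_rank, perm_length) from a transpose size-mismatch message.

    Hypothetical helper for illustration only.
    """
    match = re.search(
        r"transpose expects a vector of size (\d+)\. "
        r"But input\(1\) is a vector of size (\d+)",
        message)
    if match is None:
        raise ValueError("not a transpose size-mismatch message")
    return int(match.group(1)), int(match.group(2))


tensor_rank, perm_length = parse_transpose_error(MESSAGE)
print(f"tensor rank: {tensor_rank}, permutation length: {perm_length}")
```

One possible reading, stated as an assumption rather than a confirmed diagnosis: the image feature that the Evaluator feeds to the serving signature is not shaped [batch, height, width, channels] when it reaches the first convolution. If so, a reasonable next step might be to compare the shape of the image feature in transformed_examples with the input shape declared by the saved model's serving signature.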


tensorflow-model-analysis tfx
