Problem description
We are running 2 models in production with TensorFlow Serving in Docker. We generated a warmup dataset for one of the models. This is the Docker log:
tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:117] Starting to read warmup data for model at /models/model2/5/assets.extra/tf_serving_warmup_requests with model-warmup-options
2020-08-12 02:27:51.998915: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:166] Finished reading warmup data for model at /models/model2/5/assets.extra/tf_serving_warmup_requests. Number of warmup records read: 1000. Elapsed time (microseconds): 1293140.
2020-08-12 02:27:52.003127: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: model2 version: 5}
2020-08-12 02:27:52.005692: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
2020-08-12 02:27:52.005711: I tensorflow_serving/model_servers/server_core.cc:573] (Re-)adding model: model1
2020-08-12 02:27:52.005734: I tensorflow_serving/model_servers/server_core.cc:573] (Re-)adding model: model2
2020-08-12 02:27:52.008151: I tensorflow_serving/model_servers/server.cc:353] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2020-08-12 02:27:52.012942: I tensorflow_serving/model_servers/server.cc:373] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 238] NET_LOG: Entering the event loop ...
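The log above reports "Number of warmup records read: 1000". One way to cross-check that figure independently of the server is to count the records in the warmup file directly. The following is a minimal sketch, assuming the file is a plain uncompressed TFRecord file (each record stored as an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload). The function name `count_tfrecords` is hypothetical, and the sketch does not verify the checksums.

```python
import struct


def count_tfrecords(path):
    """Count records in an uncompressed TFRecord file (checksums are not verified)."""
    count = 0
    with open(path, "rb") as f:
        while True:
            # Header: 8-byte little-endian payload length + 4-byte CRC of that length.
            header = f.read(12)
            if not header:
                return count  # clean end of file
            if len(header) < 12:
                raise ValueError(f"truncated header after {count} records")
            (length,) = struct.unpack("<Q", header[:8])
            # Body: the payload itself + 4-byte CRC of the payload.
            body = f.read(length + 4)
            if len(body) < length + 4:
                raise ValueError(f"truncated record after {count} records")
            count += 1
```

For the setup described here, it would be called on a local copy of `tf_serving_warmup_requests` from the model's `assets.extra` directory; a result of 1000 would agree with the log line.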
However, when I look at the Prometheus metrics (http://localhost:8501/monitoring/prometheus/metrics), I see the following:
# TYPE :tensorflow:serving:model_warmup_latency histogram
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",status="Out of range: Read less bytes than requested",le="10"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="18"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="32.4"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="58.32"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="104.976"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="188.957"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="340.122"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="612.22"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="1102"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="1983.59"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="3570.47"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="6426.84"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="11568.3"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="20823"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="37481.3"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="67466.4"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="121440"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="218591"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="393464"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="708235"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="1.27482e+06"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="2.29468e+06"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="4.13043e+06"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="7.43477e+06"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="1.33826e+07"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="2.40887e+07"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="4.33596e+07"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="7.80473e+07"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="1.40485e+08"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="2.52873e+08"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="4.55172e+08"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="8.19309e+08"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="1.47476e+09"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="+Inf"} 1
:tensorflow:serving:model_warmup_latency_sum{model_path="/models/model2/5",status="Out of range: Read less bytes than requested"} 1.29314e+06
:tensorflow:serving:model_warmup_latency_count{model_path="/models/model2/5",status="Out of range: Read less bytes than requested"} 1
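The question is: can I conclude that the warmup dataset was successfully consumed by TensorFlow Serving? Basically, I am not sure how to interpret the Prometheus metrics.
Thanks!
TF version 2.2, TF Serving version 2.2.0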
Solution
No verified solution has been found for this problem yet.
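Pending an answer, the histogram itself can at least be read mechanically. The sketch below parses a few of the metric lines quoted in the question and extracts the observation count, the latency sum, the first non-empty bucket, and the `status` label. The helper names `parse_metrics` and `summarize_warmup` are hypothetical, and the parser is a simplified regex that only handles lines shaped like the ones shown, not the full Prometheus text format.

```python
import re

# A few lines copied from the metrics output in the question (abbreviated).
SAMPLE = """\
# TYPE :tensorflow:serving:model_warmup_latency histogram
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="1.27482e+06"} 0
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="2.29468e+06"} 1
:tensorflow:serving:model_warmup_latency_bucket{model_path="/models/model2/5",le="+Inf"} 1
:tensorflow:serving:model_warmup_latency_sum{model_path="/models/model2/5",status="Out of range: Read less bytes than requested"} 1.29314e+06
:tensorflow:serving:model_warmup_latency_count{model_path="/models/model2/5",status="Out of range: Read less bytes than requested"} 1
"""

LINE_RE = re.compile(r'^(?P<name>[^{\s]+)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')
PREFIX = ":tensorflow:serving:model_warmup_latency"


def parse_metrics(text):
    """Return (name, labels, value) tuples for each labeled sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# TYPE" / "# HELP" comments
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def summarize_warmup(samples):
    """Collect count, sum, status label and the first bucket holding an observation."""
    summary = {"count": None, "sum_us": None, "status": None,
               "first_nonempty_bucket": None}
    for name, labels, value in samples:
        if name == PREFIX + "_count":
            summary["count"] = value
            summary["status"] = labels.get("status")
        elif name == PREFIX + "_sum":
            summary["sum_us"] = value
        elif (name == PREFIX + "_bucket" and value > 0
              and summary["first_nonempty_bucket"] is None):
            summary["first_nonempty_bucket"] = labels["le"]
    return summary


if __name__ == "__main__":
    print(summarize_warmup(parse_metrics(SAMPLE)))
```

Read this way, the histogram holds a single observation whose value, 1.29314e+06, matches the "Elapsed time (microseconds): 1293140" in the Docker log, and it falls in the bucket with upper bound 2.29468e+06. That suggests the metric records one warmup run for the model rather than one entry per warmup request. The `status` label is reported as-is; whether "Out of range" here simply marks reaching the end of the warmup file is an assumption that nothing in the question confirms.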