Problem description
I am using a StaticHashTable in a Lambda layer that sits after the output layer of my tf.keras model. It is actually quite simple: I have a text classification model, and I add a small Lambda layer that takes model.output and converts the model's label id into a more generic label. I can save this version of the model with model.save(... in H5 format ...) without any problem, and I can load it back and use it without any issue.
The problem is that when I try to export the TF 2.2.0 model for TF Serving, I cannot find a way to do it. Here is what works with TF 1.X, or with TF 2.X plus tf.compat.v1.disable_eager_execution():
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

version = 1
name = 'tmp_model'
export_path = f'/opt/tf_serving/{name}/{version}'

builder = tf.compat.v1.saved_model.Builder(export_path)

model_signature = tf.compat.v1.saved_model.predict_signature_def(
    inputs={'input': model.input},
    outputs={'output': model.output}
)

with tf.compat.v1.keras.backend.get_session() as sess:
    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[tf.compat.v1.saved_model.tag_constants.SERVING],
        signature_def_map={'predict': model_signature},
        # For initializing HashTables
        main_op=tf.compat.v1.tables_initializer()
    )
    builder.save()
This saves my model in the TF 1.X format for serving, and I can use it without any problem. The thing is, I am using LSTM layers and I want to run my model on GPU. According to the documentation, the GPU-accelerated (cuDNN) version of LSTM is not available in TF 2.2 when eager mode is disabled. And without going through the code above, I cannot save a model that uses StaticHashTables for serving in the TF 2.2 style.
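As an illustration of the constraint (a minimal sketch with hypothetical layer sizes): with eager execution enabled, a tf.keras.layers.LSTM built with its default arguments is eligible for the fused cuDNN kernel on GPU, while changing e.g. the activation falls back to the generic implementation.

```python
import tensorflow as tf

# Hypothetical sizes, for illustration only
inputs = tf.keras.Input(shape=(16,), dtype=tf.int32)
x = tf.keras.layers.Embedding(input_dim=1000, output_dim=32)(inputs)
# Default arguments (tanh activation, sigmoid recurrent activation,
# unroll=False, use_bias=True) keep this layer cuDNN-compatible on GPU
x = tf.keras.layers.LSTM(64)(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(inputs, outputs)
print(model.output_shape)  # (None, 1)
```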
Here is how I tried to export a TF 2.2 model that uses a StaticHashTable in its last layer; it gives the error shown below:
class MyModule(tf.Module):
    def __init__(self, model):
        super(MyModule, self).__init__()
        self.model = model

    @tf.function(input_signature=[tf.TensorSpec(shape=(None, 16), dtype=tf.int32, name='input')])
    def predict(self, input):
        result = self.model(input)
        return {"output": result}

version = 1
name = 'tmp_model'
export_path = f'/opt/tf_serving/{name}/{version}'

module = MyModule(model)
tf.saved_model.save(module, export_path, signatures={"predict": module.predict.get_concrete_function()})
Error:
AssertionError: Tried to export a function which references untracked object Tensor("2907:0",shape=(),dtype=resource).
TensorFlow objects (e.g. tf.Variable) captured by functions must be tracked by assigning them to an attribute of a tracked object or assigned to an attribute of the main object directly.
Any suggestions, or am I missing something when exporting a TF 2.2 model that uses a StaticHashTable in its final Lambda layer for TensorFlow Serving?
Thanks!
Solution
I ran into the same problem, and I found the answer: create a custom layer that does the lookup transformation and add it to the model. Someone had posted this answer on Stack Overflow, but I cannot find it anymore, so I will describe it here. The reason for the error is that variables and other resources created outside the model must be trackable. I did not find any other way to make them trackable, so I created a custom layer, since layers are trackable and do not need extra assets to be added at export time.
Here is the code.
This is the specific custom layer that performs the transformation ahead of the model (including tokenization as a lookup in the static table, followed by padding):
class VocabLookup(tf.keras.layers.Layer):
    def __init__(self, word_index, **kwargs):
        self.word_index = word_index
        self.vocab = list(word_index.keys())
        self.indices = tf.convert_to_tensor(list(word_index.values()), dtype=tf.int64)
        vocab_initializer = tf.lookup.KeyValueTensorInitializer(self.vocab, self.indices)
        # Out-of-vocabulary words map to the default value 1
        self.table = tf.lookup.StaticHashTable(vocab_initializer, default_value=1)
        super(VocabLookup, self).__init__(**kwargs)

    def build(self, input_shape):
        self.built = True

    def sentences_transform(self, tx):
        x = tf.strings.lower(tx)
        x = tf.strings.regex_replace(x, "[,.:;]", " ")
        x = tf.strings.regex_replace(x, "á", "a")
        x = tf.strings.regex_replace(x, "é", "e")
        x = tf.strings.regex_replace(x, "í", "i")
        x = tf.strings.regex_replace(x, "ó", "o")
        x = tf.strings.regex_replace(x, "ú", "u")
        x = tf.strings.regex_replace(x, "ü", "u")
        x = tf.strings.regex_replace(x, "[?¿¡!@#$-_\\?+¿{}*/]", "")
        x = tf.strings.regex_replace(x, " +", " ")
        x = tf.strings.strip(x)
        x = tf.strings.split(x)
        x = self.table.lookup(x)
        x_as_vector = tf.reshape(x, [-1])
        # Pad every sequence to a fixed length of 191 tokens
        zero_padding = tf.zeros([191] - tf.shape(x_as_vector), dtype=x.dtype)
        x = tf.concat([x_as_vector, zero_padding], 0)
        return x

    def call(self, inputs):
        return tf.map_fn(lambda tx: self.sentences_transform(tx), elems=inputs, dtype=tf.int64)

    def get_config(self):
        return {'word_index': self.word_index}
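The table lookup at the heart of the layer can be exercised on its own (a minimal sketch with a hypothetical two-word vocabulary):

```python
import tensorflow as tf

# Hypothetical tiny vocabulary; out-of-vocabulary words map to default_value=1
word_index = {"hello": 2, "world": 3}
init = tf.lookup.KeyValueTensorInitializer(
    list(word_index.keys()),
    tf.convert_to_tensor(list(word_index.values()), dtype=tf.int64))
table = tf.lookup.StaticHashTable(init, default_value=1)

# Lowercase, tokenize, and look up each token id
tokens = tf.strings.split(tf.strings.lower(tf.constant("Hello world unknown")))
ids = table.lookup(tokens)
print(ids.numpy())  # [2 3 1]
```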
In my case, I created the layer so that it receives the tokenizer's word_index as input. You can then use the layer in your model like this:
import json

from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.text import tokenizer_from_json

with open(<tokenizer_path>) as f:
    data = json.load(f)
tokenizer = tokenizer_from_json(data)
moderator = load_model(<final model path ('.h5')>)
word_index = tokenizer.word_index

text_bytes = tf.keras.Input(shape=(), name='image_bytes', dtype=tf.string)
x = VocabLookup(word_index)(text_bytes)
output = moderator(x)
model = tf.keras.models.Model(text_bytes, output)
Calling summary() then shows the following:
model.summary()

Model: "functional_57"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
image_bytes (InputLayer)     [(None,)]                 0
_________________________________________________________________
vocab_lookup_60 (VocabLookup (None, None)              0
_________________________________________________________________
sequential_1 (Sequential)    (None, 1)                 1354369
=================================================================
Total params: 1,354,369
Trainable params: 1,354,369
Non-trainable params: 0
With this step, you can finally save it as a TF2 serving model:
save_path = <your_serving_model_path>
tf.saved_model.save(model, save_path)
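To sanity-check an export, the SavedModel can be loaded back and its exported signature called, which is roughly what TF Serving does. A sketch using a hypothetical stand-in tf.Module (mirroring the MyModule pattern from the question) and a temporary directory:

```python
import tempfile

import tensorflow as tf

# Hypothetical stand-in for the real model, exported the same way
class Half(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec(shape=(None,), dtype=tf.float32)])
    def predict(self, x):
        return {"output": x * 0.5}

save_path = tempfile.mkdtemp()  # stands in for <your_serving_model_path>
module = Half()
tf.saved_model.save(module, save_path,
                    signatures={"predict": module.predict.get_concrete_function()})

# Load it back and call the exported signature, as TF Serving would;
# signature functions are called with keyword arguments
loaded = tf.saved_model.load(save_path)
out = loaded.signatures["predict"](x=tf.constant([2.0, 4.0]))
print(out["output"].numpy())  # [1. 2.]
```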