How do I remove the first N layers from a Keras model?

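Problem description

I want to remove the first N layers from a pretrained Keras model. For example, an EfficientNetB0, whose first 3 layers are only responsible for preprocessing: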

import tensorflow as tf

efinet = tf.keras.applications.EfficientNetB0(weights=None, include_top=True)

print(efinet.layers[:3])
# [<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7fa9a870e4d0>,
#  <tensorflow.python.keras.layers.preprocessing.image_preprocessing.Rescaling at 0x7fa9a61343d0>,
#  <tensorflow.python.keras.layers.preprocessing.normalization.Normalization at 0x7fa9a60d21d0>]

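As M.Innat mentioned, the first layer is an Input Layer, which should either be kept or re-attached. I would like to remove these layers, but a simple approach like this one throws an error: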

cut_input_model = tf.keras.Model(
    inputs=[efinet.layers[3].input], outputs=efinet.outputs
)

This results in:

ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(...)

What is the recommended way to do this?

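Solution

You get the Graph disconnected error because you did not specify the Input layer. But that is not the main issue here. Removing intermediate layers from a Keras model is not always straightforward, whether it was built with the Sequential or the Functional API.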

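For a Sequential model it should be comparatively easy, whereas in a Functional model you need to take care of the multi-input blocks (e.g. multiply, add, etc.). For example, if you want to remove some intermediate layers from a Sequential model, you can easily adapt this solution. But for a Functional model (EfficientNet) you can't, because of the multi-input internal blocks, and you will run into this error: ValueError: A merged layer should be called on a list of inputs. So it needs a bit more work AFAIK; here is a possible approach to overcome it.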
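To make the multi-input problem concrete, here is a minimal sketch on a toy Functional model with a single residual `Add`. The model, layer names, and sizes are made up for illustration and are not taken from EfficientNet. Re-applying the layers one after another, as you would for a Sequential model, breaks at the merge layer because it expects a list of tensors. The exact exception type and message may differ between Keras versions, so the sketch catches a generic exception.

```python
import tensorflow as tf

# Toy Functional model with one multi-input (merge) layer; all names are hypothetical.
inputs = tf.keras.Input(shape=(4,))
h = tf.keras.layers.Dense(4, name="dense_a")(inputs)
merged = tf.keras.layers.Add(name="residual_add")([inputs, h])
outputs = tf.keras.layers.Dense(1, name="head")(merged)
func_model = tf.keras.Model(inputs, outputs)

# Naive rebuild: feed each layer the single output of the previous one.
new_input = tf.keras.Input(shape=(4,))
x = new_input
failed_at = None
try:
    for layer in func_model.layers[1:]:  # skip the original InputLayer
        x = layer(x)
except Exception as err:  # type and message vary across Keras versions
    failed_at = layer.name
    print(f"Rebuild failed at '{failed_at}': {type(err).__name__}")
```

The chain works for `dense_a` and stops at `residual_add`, which is why a layer-by-layer loop is only safe when every layer has exactly one input.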

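Here I will show a simple workaround for your case, but it is probably not general and may also be unsafe in some cases. It is based on this approach: using the pop method. See also: Why it can be unsafe to use! OK, let's load the model first.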

func_model = tf.keras.applications.EfficientNetB0()

for i,l in enumerate(func_model.layers):
    print(l.name,l.output_shape)
    if i == 8: break

input_19 [(None, 224, 224, 3)]
rescaling_13 (None, 224, 224, 3)
normalization_13 (None, 224, 224, 3)
stem_conv_pad (None, 225, 225, 3)
stem_conv (None, 112, 112, 32)
stem_bn (None, 112, 112, 32)
stem_activation (None, 112, 112, 32)
block1a_dwconv (None, 112, 112, 32)
block1a_bn (None, 112, 112, 32)

Next, use the .pop method:

func_model._layers.pop(1) # remove rescaling
func_model._layers.pop(1) # remove normalization

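for i, l in enumerate(func_model.layers):
    print(l.name, l.output_shape)
    if i == 8: break

input_22 [(None, 224, 224, 3)]
stem_conv_pad (None, 225, 225, 3)
stem_conv (None, 112, 112, 32)
stem_bn (None, 112, 112, 32)
stem_activation (None, 112, 112, 32)
block1a_dwconv (None, 112, 112, 32)
block1a_bn (None, 112, 112, 32)
block1a_activation (None, 112, 112, 32)
block1a_se_squeeze (None, 32)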

The following code loads a model, removes the last layer, and adds a new layer as the last layer:

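from tensorflow import keras

old_model = keras.models.load_model("old_model.h5")
# Reuse every layer except the last one, then append a new output layer
new_model = keras.models.Sequential(old_model.layers[:-1])
new_model.add(keras.layers.Dense(5, activation="sigmoid"))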

You can cut off the first layers in a similar way and select only the later ones:

old_model.layers[N:]
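A slice on its own is just a list of layers, so it still has to be wired to a new input. Here is a rough sketch of how that might look for a plain layer stack. It assumes a toy model in place of `old_model.h5`; the layer names, sizes, and `N = 2` are hypothetical.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the loaded model; names and sizes are made up.
old_model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu", name="front_1"),
    tf.keras.layers.Dense(12, activation="relu", name="front_2"),
    tf.keras.layers.Dense(4, activation="relu", name="back_1"),
    tf.keras.layers.Dense(2, name="back_2"),
])

N = 2  # number of leading layers to drop
kept_layers = old_model.layers[N:]

# The new input must match what the first kept layer originally received.
new_input_shape = tuple(kept_layers[0].input.shape[1:])
new_input = tf.keras.Input(shape=new_input_shape)

x = new_input
for layer in kept_layers:
    x = layer(x)  # same layer objects, so weights are shared with old_model

trimmed_model = tf.keras.Model(new_input, x, name="trimmed")

# Quick smoke test with a dummy batch of 3 samples.
sample = np.zeros((3,) + new_input_shape, dtype="float32")
print(trimmed_model(sample).shape)
```

Because the kept layers are reused rather than copied, training `trimmed_model` also updates the corresponding layers of `old_model`.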

There is a better way than this. Create a clone of the model, copy its weights, then add the new layers and train your network:

clone_of_old_model = keras.models.clone_model(old_model)
clone_of_old_model.set_weights(old_model.get_weights())

The reason I say this approach is better is that, if you use the first snippet, training the new model may also affect the old model.
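A small sketch of that difference, using a hypothetical toy model: a model built from the original layer objects shares their weights, while a clone made with `clone_model` plus `set_weights` is independent. Zeroing the weights stands in for a training update here.

```python
import numpy as np
import tensorflow as tf

# Toy model standing in for the pretrained one; names and sizes are made up.
old_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, name="hidden"),
    tf.keras.layers.Dense(1, name="out"),
])
before = [w.copy() for w in old_model.get_weights()]

# Independent copy: new layer objects, same weight values.
clone = tf.keras.models.clone_model(old_model)
clone.set_weights(old_model.get_weights())

# Changing the clone leaves the old model untouched.
clone.set_weights([np.zeros_like(w) for w in clone.get_weights()])
old_unchanged = all(
    np.array_equal(a, b) for a, b in zip(before, old_model.get_weights())
)

# A model built from the original layer objects shares their weights.
shared = tf.keras.Sequential(
    [tf.keras.Input(shape=(4,))] + old_model.layers[:-1]
)
shared.set_weights([np.zeros_like(w) for w in shared.get_weights()])
old_hidden_zeroed = not np.any(old_model.get_layer("hidden").get_weights()[0])

print(old_unchanged, old_hidden_zeroed)
```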

Now you can combine these two parts to create a new model.

Remember to compile your model, since you will be freezing and unfreezing layers.
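For the freeze/unfreeze step, something like the following could work. The two-layer model and its layer names are placeholders; the point is only that `trainable` is changed first and `compile` is called afterwards.

```python
import tensorflow as tf

# Placeholder model: one "pretrained" layer plus a new output layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu", name="pretrained_body"),
    tf.keras.layers.Dense(5, activation="sigmoid", name="new_head"),
])

# Phase 1: freeze the pretrained part, then compile.
model.get_layer("pretrained_body").trainable = False
model.compile(optimizer="adam", loss="binary_crossentropy")
frozen_count = len(model.trainable_weights)  # kernel and bias of new_head only

# Phase 2: unfreeze for fine-tuning, then compile again.
model.get_layer("pretrained_body").trainable = True
model.compile(optimizer="adam", loss="binary_crossentropy")
unfrozen_count = len(model.trainable_weights)  # kernels and biases of both layers

print(frozen_count, unfrozen_count)
```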

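I have been trying to do the same thing with the keras tensorflow VGGFace model. After a lot of experimenting, I found that this approach works. In this case the whole model is used except for the last layer, which is replaced with a custom embedding layer: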

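# Assumed imports; the original answer does not show them
from keras_vggface.vggface import VGGFace
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

vgg_model = VGGFace(include_top=True, input_shape=(224, 224, 3))  # full VGG16 model
inputs = Input(shape=(224, 224, 3))
x = inputs
# Assemble all layers except for the last layer
# (the slice skips index 0, the input layer, and stops before the final two entries)
for layer in vgg_model.layers[1:-2]:
    x = vgg_model.get_layer(layer.name)(x)

# Now add a new last layer that provides the 128 embeddings output
x = Dense(128, activation='softmax', use_bias=False, name='fc8x')(x)
# Create the custom model
custom_vgg_model = Model(inputs, x, name='custom_vggface')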

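Unlike layers[x] or pop(), get_layer gets the actual layer, which allows the layers to be assembled into a new set of output layers. You can then create a new model from that. The 'for' statement starts at 1 rather than 0 because the input layer is already defined by 'inputs'.

This method works for sequential models. It is unclear whether it also works for more complex models.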
