Is wrapping a pretrained model from tf.keras.applications in a tf.keras.layers.TimeDistributed layer memory-expensive?

Problem description

I have a Titan RTX with 24 GB of memory, yet I can't even wrap MobileNetV2 in a tf.keras.layers.TimeDistributed layer, followed by an LSTM with 256 units, without running into an OOM error.

My data has the shape (batch_size, time_steps, 299, 299, 2), where batch_size = 32 and time_steps = 32. I can't run the full MobileNetV2 on this data. I tried reducing time_steps to 10, but that is still too large.
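To see why this shape is demanding: TimeDistributed applies the wrapped model to every time step, so the CNN effectively processes batch_size × time_steps frames per training step, and each layer's activations have to be kept for backpropagation. The sketch below is only a rough back-of-the-envelope estimate. It assumes float32 tensors and the standard MobileNetV2 layout at alpha = 1.0 (a first convolution with stride 2 and 32 filters, then block_1 expanding to 96 channels at the same 150 × 150 resolution). The helper name activation_bytes is hypothetical, and real usage also includes batch-norm and ReLU outputs, gradients, and framework overhead.

```python
def activation_bytes(frames, height, width, channels, bytes_per_value=4):
    """Bytes needed to hold one activation tensor for `frames` images (float32 by default)."""
    return frames * height * width * channels * bytes_per_value


GIB = 1024 ** 3

batch_size, time_steps = 32, 32
frames = batch_size * time_steps  # images the CNN sees per training step

# (height, width, channels) of a few early tensors; values are assumptions, see the text above.
tensors = {
    "input (299x299x2)": (299, 299, 2),
    "Conv1 output (150x150x32)": (150, 150, 32),
    "block_1_expand output (150x150x96)": (150, 150, 96),
}

for name, (h, w, c) in tensors.items():
    size = activation_bytes(frames, h, w, c)
    print(f"{name}: {size / GIB:.2f} GiB")
```

Under these assumptions a single block_1_expand activation already takes roughly 8 GiB at batch_size = 32 and time_steps = 32, and about 2.6 GiB at time_steps = 10, before any other layer is counted.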

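I've found that I need to reduce the model width with alpha = 0.5 and use only the first few layers (from the input layer to the end of the first block, 'block_1_project_BN') to actually train my model. This corresponds to ... of the model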
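For a sense of what alpha = 0.5 changes in these early layers, the sketch below reproduces in plain Python the channel-rounding rule used by the Keras MobileNetV2 implementation (filter counts are scaled by alpha and rounded to a multiple of 8). The function make_divisible is a local re-implementation written for illustration, not an import from Keras, and early_layer_channels is a hypothetical helper. The base filter counts (32 for the first convolution, 16 and 24 for the first two projection layers, expansion factor 6) are assumed from the standard MobileNetV2 layout.

```python
def make_divisible(value, divisor=8, min_value=None):
    """Round `value` to the nearest multiple of `divisor`, never ending more than 10% below it."""
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def early_layer_channels(alpha):
    """Output channel counts of the first MobileNetV2 layers for a width multiplier `alpha`."""
    conv1 = make_divisible(32 * alpha)
    expanded_conv_project = make_divisible(int(16 * alpha))
    # block_1 expands its incoming channels by a factor of 6 before the depthwise convolution.
    block_1_expand = 6 * expanded_conv_project
    block_1_project = make_divisible(int(24 * alpha))
    return {
        "Conv1": conv1,
        "expanded_conv_project": expanded_conv_project,
        "block_1_expand": block_1_expand,
        "block_1_project": block_1_project,
    }


for alpha in (1.0, 0.5):
    print(alpha, early_layer_channels(alpha))
```

Under these assumptions the widest early tensor (block_1_expand) drops from 96 to 48 channels, so alpha = 0.5 roughly halves the activation memory of these layers.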

The model I would like to train if memory were not an issue:

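import re

import tensorflow as tf
from tensorflow.keras import backend as K


def change_model_input_size(model, new_input_shape=(None, 299, 299, 2)):
    # Replace the input shape of the first (input) layer.
    # Note: _batch_input_shape is a private Keras attribute, so this may break between versions.
    model.layers[0]._batch_input_shape = new_input_shape
    # Rebuild the model architecture by exporting to and re-importing from JSON.
    # model_from_json restores the architecture only; the weights are freshly initialised.
    new_model = tf.keras.models.model_from_json(model.to_json())
    return new_model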

def mobilenetv2_early_layer_model(num_classes,learning_rate_schedule):
    
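    # Input shape is (time_steps, height, width, channels); the batch dimension is implicit.
    input = tf.keras.Input(shape=(None, 299, 299, 2), name='my_input')

    # pooling=None keeps a 4D feature map, so the GlobalMaxPooling2D applied further down has something to pool.
    x = tf.keras.applications.MobileNetV2(include_top=False, weights='imagenet', alpha=1.0, input_shape=(299, 299, 3), pooling=None)

    x = change_model_input_size(x)  # replace the 3-channel input layer with a 2-channel input layer
    x.trainable = True

    # Freeze all batch normalisation layers, as recommended for fine-tuning.
    # This runs after x.trainable = True, because setting trainable on the model propagates to every layer and would undo the freeze.
    for layer in x.layers:
        if re.search('bn', layer.name, flags=re.IGNORECASE):
            layer.trainable = False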
    
    #The code I currently run to shorten the model to fit in memory
    #mobilenet_input_layer = x._layers[0]
    #desired_layer = x.get_layer('block_1_project_BN')
    #shortened_mobilenet = tf.keras.Model(inputs = [mobilenet_input_layer.input],outputs=[desired_layer.output])
    #x = shortened_mobilenet
    
    visual_feature_extractor = tf.keras.layers.TimeDistributed(x)(input)
    pooled_mobilenet_output = tf.keras.layers.TimeDistributed(tf.keras.layers.GlobalMaxPooling2D())(visual_feature_extractor)
    
    lstm = tf.keras.layers.LSTM(256, dropout=0.2)(pooled_mobilenet_output)
    
    outputs = tf.keras.layers.Dense(num_classes,activation='softmax')(lstm)
    model = tf.keras.Model(inputs=[input], outputs=outputs, name='final_model')
    
    model.compile(optimizer=tf.optimizers.Adam(learning_rate_schedule),loss='categorical_crossentropy',metrics=[tf.keras.metrics.categorical_accuracy])

    model.save('C:/modified_mobilenetv2.h5')
    K.clear_session()
    model = tf.keras.models.load_model('C:/modified_mobilenetv2.h5')

    return model
