Problem description
I added an attention layer to the MobileNet model, as shown below.
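import tensorflow as tf
from tensorflow.keras.layers import Reshape, BatchNormalization, Dense
from tensorflow.keras.models import Model

# MultiHeadsAttModel is a custom attention class; its definition is not shown here.
mobile = tf.keras.applications.mobilenet.MobileNet(weights='imagenet')
# Input of the sixth layer from the end (reshaped below as a 7x7x1024 feature map)
x = mobile.layers[-6].input
if True:
    x = Reshape([7*7, 1024])(x)
    att = MultiHeadsAttModel(l=7*7, d=1024, dv=64, dout=1024, nv=16)
    x = att([x, x, x])
    x = Reshape([7, 7, 1024])(x)
    x = BatchNormalization()(x)
# Reuse the original MobileNet head layers on top of the attention output
x = mobile.get_layer('global_average_pooling2d')(x)
x = mobile.get_layer('reshape_1')(x)
x = mobile.get_layer('dropout')(x)
x = mobile.get_layer('conv_preds')(x)
x = mobile.get_layer('reshape_2')(x)
output = Dense(units=50, activation='softmax')(x)
model = Model(inputs=mobile.input, outputs=output)
# Freeze every layer except the last 23
for layer in model.layers[:-23]:
    layer.trainable = False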
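However, when I compute the heatmap, the gradient comes back as None. Which layer should I use as "last_conv_layer" here? Do I need to move the attention layer? Where is the best place to add it?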
with tf.GradientTape(persistent=True) as gtape:
    last_conv_layer = model.get_layer('conv_preds')
    iterate = tf.keras.models.Model([model.inputs], [model.output, last_conv_layer.output])
    model_out, last_conv_layer = iterate(img_tensor)
    class_out = model_out[:, np.argmax(model_out[0])]
    grads = gtape.gradient(class_out, last_conv_layer)
    print(grads)
Output:
WARNING:tensorflow:Calling GradientTape.gradient on a persistent tape inside its context is significantly less efficient than calling it outside the context (it causes the gradient ops to be recorded on the tape,leading to increased cpu and memory usage). Only call GradientTape.gradient inside the context if you actually want to trace the gradient in order to compute higher order derivatives.
None
Solution
No confirmed solution to this problem has been found yet.
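As a possible direction only, not a confirmed fix: two things may be worth checking. First, 'conv_preds' comes after global average pooling, so its output is 1x1x1000 and has no spatial extent to draw a heatmap from; a layer that still produces a 7x7 map (for example the BatchNormalization after the attention block) may be a more suitable target. Second, 'conv_preds' is reused from the original MobileNet, so it has been called on two different inputs, and last_conv_layer.output may not be the tensor that actually feeds the new model's output. The sketch below shows the usual Grad-CAM tape pattern on a tiny stand-in network. The network, the layer name "feat_conv", the helper grad_cam, and the random input are all hypothetical and are not taken from the question.

```python
import tensorflow as tf

# Tiny stand-in network (hypothetical; NOT the MobileNet + attention model above).
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu",
                           name="feat_conv")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(5, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)


def grad_cam(model, img_tensor, layer_name):
    """Return (heatmap, grads) for the top predicted class (hypothetical helper)."""
    feat_layer = model.get_layer(layer_name)
    # One model that returns the feature map AND the prediction from the same graph.
    grad_model = tf.keras.Model(model.inputs, [feat_layer.output, model.output])

    with tf.GradientTape() as tape:
        # The forward pass must run inside the tape so the ops are recorded.
        feature_maps, preds = grad_model(img_tensor)
        class_idx = int(tf.argmax(preds[0]).numpy())
        class_score = preds[:, class_idx]

    # Called outside the `with` block, which also avoids the warning shown above.
    grads = tape.gradient(class_score, feature_maps)
    if grads is None:
        raise ValueError("class score does not depend on the chosen layer output")

    # Grad-CAM: weight each channel by its mean gradient, then ReLU and normalize.
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    heatmap = tf.nn.relu(tf.reduce_sum(feature_maps[0] * weights, axis=-1))
    heatmap = heatmap / (tf.reduce_max(heatmap) + 1e-8)
    return heatmap.numpy(), grads


img_tensor = tf.random.uniform((1, 32, 32, 3))  # placeholder input, not a real image
heatmap, grads = grad_cam(model, img_tensor, "feat_conv")
print(heatmap.shape)
```

If the gradient is still None with this pattern on the real model, that would suggest the chosen tensor is not on the path from the input to the output, which is the situation the ValueError in the sketch is meant to flag.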