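Problem description
Thank you for your interest in this problem.
I want to compute the Hessian matrix of a tensorflow.keras.Model.
For higher-order derivatives, I tried nesting GradientTape.
# Example graph and inputs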
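import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense
xs = tf.constant(tf.random.normal([100,24]))
ex_model = Sequential()
ex_model.add(Input(shape=(24,)))
ex_model.add(Dense(10))
ex_model.add(Dense(1))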
with tf.GradientTape(persistent=True) as tape:
    tape.watch(xs)
    ys = ex_model(xs)
g = tape.gradient(ys,xs)
h = tape.jacobian(g,xs)
print(g.shape)
print(h.shape)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-20-dbf443f1ddab> in <module>
5 h = tape.jacobian(g,xs)
6 print(g.shape)
----> 7 print(h.shape)
AttributeError: 'NoneType' object has no attribute 'shape'
And another attempt...
with tf.GradientTape() as tape1:
    with tf.GradientTape() as tape2:
        tape2.watch(xs)
        ys = ex_model(xs)
    g = tape2.gradient(ys,xs)
h = tape1.jacobian(g,xs)
print(g.shape)
print(h.shape)
(100,24)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-17-c5bbb17404bc> in <module>
7
8 print(g.shape)
----> 9 print(h.shape)
AttributeError: 'NoneType' object has no attribute 'shape'
Why can't I compute the gradient of g with respect to x?
Solution
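You are computing the second-order gradient of ys with respect to xs, which is zero, as it should be when you differentiate a constant: ex_model has only Dense layers without activations, so ys is linear in xs and the first-order gradient g does not depend on xs. That is why tape1.jacobian(g, xs) returns None.
When the second-order gradient is not constant: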
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense
x = tf.Variable(1.0)
w = tf.constant(3.0)
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = w * x**3
    dy_dx = t1.gradient(y, x)
d2y_dx2 = t2.gradient(dy_dx, x)
print('dy_dx:', dy_dx)  # 3 * 3 * x**2 => 9.0
print('d2y_dx2:', d2y_dx2)  # 9 * 2 * x => 18.0
Output:
dy_dx: tf.Tensor(9.0, shape=(), dtype=float32)
d2y_dx2: tf.Tensor(18.0, shape=(), dtype=float32)
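The same nesting carries over to a Keras model once the model is nonlinear in its input. The following is a minimal sketch, not the asker's exact setup: the tanh activation, the batch size of 5, and the tape names outer and inner are assumptions added for illustration. Because xs is a plain tensor rather than a tf.Variable, both tapes have to watch it explicitly, and the first-order gradient has to be computed inside the outer tape so that it is recorded there.

```python
import tensorflow as tf

# Same layer sizes as in the question, but with a tanh activation
# (an assumption added here) so that the gradient depends on the input.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="tanh"),
    tf.keras.layers.Dense(1),
])

xs = tf.random.normal([5, 24])  # small batch to keep the Jacobian cheap

with tf.GradientTape() as outer:
    outer.watch(xs)  # xs is not a Variable, so each tape must watch it
    with tf.GradientTape() as inner:
        inner.watch(xs)
        ys = model(xs)
    # Computed inside the outer tape so the gradient ops are recorded.
    g = inner.gradient(ys, xs)

# Per-example Hessians: h[b] is the Jacobian of g[b] w.r.t. xs[b].
h = outer.batch_jacobian(g, xs)

print(g.shape)
print(h.shape)
```

batch_jacobian is used here because each output row depends only on its own input row; outer.jacobian(g, xs) would instead return the full cross-example Jacobian, most of which is zero.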
When the second-order gradient is constant:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense
x = tf.Variable(1.0)
w = tf.constant(3.0)
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = w * x
    dy_dx = t1.gradient(y, x)
d2y_dx2 = t2.gradient(dy_dx, x)
print('dy_dx:', dy_dx)
print('d2y_dx2:', d2y_dx2)
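Output:
dy_dx: tf.Tensor(3.0, shape=(), dtype=float32)
d2y_dx2: None
You can, however, compute a second-order gradient that involves the layer parameters (differentiating the input gradient with respect to the weights), as is done in input gradient regularization.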
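As a rough sketch of that last point, the example below differentiates a penalty built from the input gradient with respect to the model weights. The penalty (mean squared input gradient), the batch size, and the tape names are hypothetical choices made for illustration; they are not taken from the original answer or from any particular regularization recipe.

```python
import tensorflow as tf

# Linear model as in the question: two Dense layers without activations.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1),
])
xs = tf.random.normal([5, 24])

with tf.GradientTape() as outer:  # trainable variables are watched automatically
    with tf.GradientTape() as inner:
        inner.watch(xs)
        ys = model(xs)
    # First-order gradient w.r.t. the input, recorded by the outer tape.
    input_grad = inner.gradient(ys, xs)
    # Hypothetical regularization term built from the input gradient.
    penalty = tf.reduce_mean(tf.square(input_grad))

# Second-order step: gradient of the penalty w.r.t. the layer parameters.
param_grads = outer.gradient(penalty, model.trainable_variables)

for grad in param_grads:
    print(None if grad is None else grad.shape)
```

Even though the model is linear in xs, the input gradient still depends on the two kernels, so their entries in param_grads are tensors. The bias entries are expected to be None, since the input gradient of this model does not involve the biases.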