Computing the Hessian with TensorFlow gradient tape

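Problem description

Thank you for your interest in this question.

I want to compute the Hessian matrix of a tensorflow.keras.Model.

For higher-order derivatives, I tried nesting GradientTape.

# example graph and inputs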

xs = tf.constant(tf.random.normal([100,24]))

ex_model = Sequential()
ex_model.add(Input(shape=(24)))
ex_model.add(Dense(10))
ex_model.add(Dense(1))

with tf.GradientTape(persistent=True) as tape:
    tape.watch(xs)
    ys = ex_model(xs)
g = tape.gradient(ys,xs)
h = tape.jacobian(g,xs)
print(g.shape)
print(h.shape)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-20-dbf443f1ddab> in <module>
      5 h = tape.jacobian(g,xs)
      6 print(g.shape)
----> 7 print(h.shape)

AttributeError: 'NoneType' object has no attribute 'shape'

And another attempt...

with tf.GradientTape() as tape1:
    with tf.GradientTape() as tape2:
        tape2.watch(xs)
        ys = ex_model(xs)
    g = tape2.gradient(ys,xs)
h = tape1.jacobian(g,xs)
    
print(g.shape)
print(h.shape)


(100,24)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-17-c5bbb17404bc> in <module>
      7 
      8 print(g.shape)
----> 9 print(h.shape)

AttributeError: 'NoneType' object has no attribute 'shape'

Why can't I compute the gradient of g with respect to x?

Solution

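You have already computed the second-order gradient of ys with respect to xs, and it is zero, as it should be when the first-order gradient is a constant (the model has no nonlinear activation, so g does not depend on xs). That is why tape1.jacobian(g,xs) returns None.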
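To illustrate the point about the constant first-order gradient, here is a minimal sketch of a nested-tape Hessian that does not come back as None. It rests on assumptions that are not in the question: a tanh activation is added so the gradient actually depends on the inputs, the outer tape also watches xs (a tape only tracks tensors it watches), and the inner gradient is computed inside the outer tape's context so that it is recorded. The names outer, inner and model, and the small batch size, are illustrative only.

```python
import tensorflow as tf

tf.random.set_seed(0)
xs = tf.random.normal([4, 3])  # small batch, for illustration only

# Assumption: a tanh activation is added so that g depends on xs.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="tanh"),
    tf.keras.layers.Dense(1),
])

with tf.GradientTape() as outer:
    outer.watch(xs)  # the outer tape must watch xs as well
    with tf.GradientTape() as inner:
        inner.watch(xs)
        ys = model(xs)
    # Computed inside the outer context, so the outer tape records it.
    g = inner.gradient(ys, xs)

# One Hessian per sample: d g[i] / d xs[i].
h = outer.batch_jacobian(g, xs)

print(g.shape)  # (4, 3)
print(h.shape)  # (4, 3, 3)
```

batch_jacobian is used here instead of jacobian because each row of g depends only on the matching row of xs. jacobian would return a (4, 3, 4, 3) tensor that is mostly zeros.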

When the first-order gradient is not a constant:

import tensorflow as tf

x = tf.Variable(1.0)
w = tf.constant(3.0)
with tf.GradientTape() as t2:
  with tf.GradientTape() as t1:
    y = w * x**3
  dy_dx = t1.gradient(y,x)
d2y_dx2 = t2.gradient(dy_dx,x)

print('dy_dx:',dy_dx) # 3 * 3 * x**2 => 9.0
print('d2y_dx2:',d2y_dx2) # 9 * 2 * x => 18.0

Output:

dy_dx: tf.Tensor(9.0, shape=(), dtype=float32)
d2y_dx2: tf.Tensor(18.0, shape=(), dtype=float32)

When the first-order gradient is a constant:

import tensorflow as tf

x = tf.Variable(1.0)
w = tf.constant(3.0)
with tf.GradientTape() as t2:
  with tf.GradientTape() as t1:
    y = w * x
  dy_dx = t1.gradient(y,x)
d2y_dx2 = t2.gradient(dy_dx,x)

print('dy_dx:',dy_dx) # w => 3.0
print('d2y_dx2:',d2y_dx2) # dy_dx does not depend on x => None

Output:

dy_dx: tf.Tensor(3.0, shape=(), dtype=float32)
d2y_dx2: None

You can, however, compute the second-order gradient of the layer parameters with respect to xs, as in input gradient regularization.
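As a rough illustration of input gradient regularization, the sketch below penalizes the squared norm of the loss gradient with respect to the inputs, then differentiates the penalized loss with respect to the layer parameters. That last step is the second-order gradient. Everything specific here is an assumption made for the example: the tanh model, the random targets, the mean-squared-error loss and the penalty weight lam.

```python
import tensorflow as tf

tf.random.set_seed(0)
xs = tf.random.normal([8, 3])
targets = tf.random.normal([8, 1])  # hypothetical regression targets

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="tanh"),
    tf.keras.layers.Dense(1),
])
model(xs)   # call once so the layer weights are created
lam = 0.1   # hypothetical penalty weight

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        inner.watch(xs)
        preds = model(xs)
        loss = tf.reduce_mean(tf.square(preds - targets))
    # First-order gradient of the loss with respect to the inputs.
    input_grad = inner.gradient(loss, xs)
    # Penalize large input gradients (mean squared norm over the batch).
    penalty = tf.reduce_mean(tf.reduce_sum(tf.square(input_grad), axis=1))
    total = loss + lam * penalty

# Differentiating the penalty with respect to the weights is second order.
grads = outer.gradient(total, model.trainable_variables)

for var, grad in zip(model.trainable_variables, grads):
    print(var.shape, grad.shape)
```

The outer tape needs no explicit watch call because trainable variables are watched automatically. The resulting grads could be passed to an optimizer's apply_gradients in a custom training loop.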
