tf.GradientTape gives the wrong gradient

Problem description

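I am trying to write an optimizer by hand in TensorFlow for a very specific problem. I started with the built-in TF optimizers, but they did not work, so I inspected the gradients, and they make no sense.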

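The problem is as follows: I want to optimize some regression parameters re to minimize the loss function np.sum(np.abs(y - x@re)), or in TensorFlow, tf.math.reduce_sum(tf.math.abs(y_tf - tf.matmul(x_tf, re_tf))). I know the problem is convex and has a minimum at re = np.array([[1.0,-2.0,1.0]]).T, as the code shows: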

import numpy as np
import tensorflow as tf

# Initializing data
x = np.array([[[0.,0.,0.],[0.,1.],1.,[1.,2.],2.,3.],[2.,3.,4.],[3.,4.,5.]]])
y = np.array([[[ 0.],[ 0.],[ 1.],[-2.],[ 2.],[ 0.]]])
re = np.array([[1.0,-2.0,1.0]]).T

# Converting to tf variables
x_tf = tf.Variable(x)
y_tf = tf.Variable(y)
re_tf = tf.Variable(re)

# Calculating the gradient
with tf.GradientTape() as g:
    g.watch(re_tf)
    norm = tf.math.reduce_sum(tf.math.abs(y_tf - tf.matmul(x_tf,re_tf)))
grad = g.gradient(norm,re_tf)
print(grad)

This gives the result:

tf.Tensor(
[[0.]
 [0.]
 [0.]], shape=(3, 1), dtype=float64)

However, if I make the smallest change to re, for example re = np.array([[1.0,-2.0001,1.0]]).T, the output makes no sense:

tf.Tensor(
[[ -6.]
 [-11.]
 [-14.]], shape=(3, 1), dtype=float64)

Is there anything I can do to fix this gradient?
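
A hand check may help interpret these numbers. For a loss of the form sum(|y - x @ re|), one subgradient with respect to re is -x.T @ sign(y - x @ re). As far as I know, TensorFlow differentiates tf.math.abs as sign of its argument, with sign(0) = 0. The NumPy sketch below evaluates that formula on small hypothetical data: x_demo, y_demo, l1_loss and l1_subgradient are illustrative names and values, not the arrays from the question. It assumes the data fit exactly at the minimum, so that every residual there is zero.

```python
import numpy as np


def l1_loss(x, y, re):
    """Sum of absolute residuals, sum(|y - x @ re|)."""
    return np.sum(np.abs(y - x @ re))


def l1_subgradient(x, y, re):
    """One subgradient of l1_loss w.r.t. re, using the convention sign(0) = 0."""
    residual = y - x @ re
    return -x.T @ np.sign(residual)


# Hypothetical data (NOT the data from the question). y_demo is constructed
# so that every residual is exactly zero at re_star.
x_demo = np.array([[0., 1., 2.],
                   [1., 2., 3.],
                   [2., 3., 4.]])
re_star = np.array([[1.0, -2.0, 1.0]]).T
y_demo = x_demo @ re_star

# At the exact fit all residuals are 0, so sign(...) is 0 everywhere.
grad_at_min = l1_subgradient(x_demo, y_demo, re_star)
print(grad_at_min)   # all zeros, shape (3, 1)

# A tiny shift makes every residual slightly positive, so each sign jumps
# to +1 and the subgradient becomes minus the column sums of x_demo.
re_shifted = np.array([[1.0, -2.0001, 1.0]]).T
grad_shifted = l1_subgradient(x_demo, y_demo, re_shifted)
print(grad_shifted)  # [[-3.], [-6.], [-9.]]
```

Under these assumptions, a zero gradient exactly at the minimum and a large, integer-valued gradient right next to it are what the formula predicts: the loss is piecewise linear, so its slope does not shrink as re approaches the minimum. If that is what is happening here, a fixed step size would tend to bounce around the minimum, and a decaying step size (as is usual for subgradient methods) may behave better.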


gradient-descent machine-learning python tensorflow2.0