How to create a tensor using the batch size in a custom TensorFlow layer

Problem description

I am creating a custom TF layer, and inside it I create a tensor like this:

class MyLayer(Layer):
  def __init__(self,config,**kwargs):
    super(MyLayer,self).__init__(**kwargs)
    ....

  def call(self,x):
    B,T,C = x.shape.as_list()
    ...
    ones = tf.ones((B,C))
    ...
    # output projection
    y = ...
    return y

The problem is that when the layer is evaluated, B (the batch size) is None, which makes tf.ones fail with the following error:


ValueError: in user code:

    <ipython-input-69-f3322a54c05c>:29 call  *
        ones = tf.ones((B,C))
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper  **
        return target(*args,**kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py:3080 ones
        shape = ops.convert_to_tensor(shape,dtype=dtypes.int32)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/profiler/trace.py:163 wrapped
        return func(*args,**kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:1535 convert_to_tensor
        ret = conversion_func(value,dtype=dtype,name=name,as_ref=as_ref)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py:356 _tensor_shape_tensor_conversion_function
        "Cannot convert a partially known TensorShape to a Tensor: %s" % s)

    ValueError: Cannot convert a partially known TensorShape to a Tensor: (None,8,128)

How can I make this work?

Solution

If you only want a tensor with the same shape as x, you can use tf.ones_like, like this:

class MyLayer(Layer):

  ....

  def call(self,x):
    ones = tf.ones_like(x)

    ...

    # output projection
    y = ...
    return y

With tf.ones_like, the shape of x does not need to be known until run time.
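Note that tf.ones_like(x) has the full shape (B,T,C), while the question asks for (B,C). A minimal sketch of one way to get that shape is to read the batch size from tf.shape(x), which is evaluated at run time, instead of from the static x.shape. The layer name OnesLayer and the final broadcast-add are hypothetical and only there to make the example runnable; they are not part of the original layer.

```python
import tensorflow as tf


class OnesLayer(tf.keras.layers.Layer):
    """Hypothetical layer that builds a (B, C) tensor of ones."""

    def call(self, x):
        # x.shape is the static shape, so its batch entry may be None.
        # tf.shape(x) is a tensor resolved at run time, so B is always known.
        batch = tf.shape(x)[0]
        channels = tf.shape(x)[-1]
        ones = tf.ones(tf.stack([batch, channels]), dtype=x.dtype)
        # Illustrative use only: broadcast the (B, C) tensor over the T axis.
        return x + ones[:, tf.newaxis, :]


layer = OnesLayer()

# Eager call with a concrete batch size.
out = layer(tf.zeros((4, 8, 128)))


# Traced call where the batch dimension is statically unknown (None).
@tf.function(input_signature=[tf.TensorSpec([None, 8, 128], tf.float32)])
def run(x):
    return layer(x)


dynamic_out = run(tf.zeros((2, 8, 128)))
```

The second call is traced with an unknown batch size, which is the situation that made tf.ones((B,C)) fail in the question.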

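In general, though, we may need to know the input shape before run time. In that case we can implement the build() method in our layer. It takes input_shape as its argument and is called automatically the first time the layer is called on an input. Keep in mind that the batch dimension of input_shape can still be None at that point, so build() is best suited to sizes that depend on the non-batch dimensions, such as weight shapes.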

An example copied from the documentation:

import tensorflow as tf
from tensorflow import keras

class Linear(keras.layers.Layer):
    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        # input_shape[-1] is the feature dimension; the batch dimension is not needed here.
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
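To see when build() actually runs, here is a small sketch with a hypothetical layer, ShapeLogger, that records the input_shape it receives. The names ShapeLogger, seen_shape and scale are made up for illustration.

```python
import tensorflow as tf


class ShapeLogger(tf.keras.layers.Layer):
    """Hypothetical layer that remembers the shape passed to build()."""

    def build(self, input_shape):
        # Record the shape so it can be inspected after the first call.
        self.seen_shape = tuple(input_shape)
        # Create one weight per feature, sized from the last dimension.
        self.scale = self.add_weight(
            shape=(input_shape[-1],), initializer="ones", trainable=True
        )

    def call(self, x):
        return x * self.scale


layer = ShapeLogger()
built_before = layer.built  # build() has not run yet

y = layer(tf.zeros((4, 8, 128)))  # first call triggers build()
built_after = layer.built
```

Because the weights are sized only from the last dimension, the same layer instance can later be called with a different batch size.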