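Error when using the tf.sparse.to_dense function

Problem description

I am trying to parse a TFRecord dataset to use it for object detection. When I try to convert the sparse tensors to dense tensors, I get the following error, which I do not understand: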


ValueError: Shapes must be equal rank,but are 1 and 0
    From merging shape 3 with other shapes. for '{{node stack}} = Pack[N=5,T=DT_FLOAT,axis=1](SparsetoDense,SparsetoDense_1,SparsetoDense_2,SparsetoDense_3,Cast)' with input shapes: [?],[?],[].

My feature_description is:

import tensorflow as tf

feature_description = {
    'image/filename': tf.io.FixedLenFeature([], tf.string),
    'image/encoded': tf.io.FixedLenFeature([], tf.string),
    'image/object/bbox/xmin': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/ymin': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/xmax': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/ymax': tf.io.VarLenFeature(tf.float32),
    'image/object/class/label': tf.io.VarLenFeature(tf.int64),
}

My parsing code:

def _parse_image_function(example_proto):
    # Parse the input tf.Example proto using the dictionary above.
    return tf.io.parse_single_example(example_proto, feature_description)

def _parse_tfrecord(x):
    x_train = tf.image.decode_jpeg(x['image/encoded'], channels=3)
    x_train = tf.image.resize(x_train, (416, 416))
    labels = tf.cast(1, tf.float32)
    # print(type(x['image/object/bbox/xmin']))
    tf.print(x['image/object/bbox/xmin'])

    y_train = tf.stack([
        tf.sparse.to_dense(x['image/object/bbox/xmin']),
        tf.sparse.to_dense(x['image/object/bbox/ymin']),
        tf.sparse.to_dense(x['image/object/bbox/xmax']),
        tf.sparse.to_dense(x['image/object/bbox/ymax']),
        labels,
    ], axis=1)

    paddings = [[0, 100 - tf.shape(y_train)[0]], [0, 0]]
    y_train = tf.pad(y_train, paddings)
    return x_train, y_train


def load_tfrecord_dataset(train_record_file, size=416):
    dataset = tf.data.TFRecordDataset(train_record_file)
    parsed_dataset = dataset.map(_parse_image_function)
    final = parsed_dataset.map(_parse_tfrecord)
    return final


load_tfrecord_dataset(train_record_file, 416)

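I used a for loop to check whether something was wrong with my data, and tf.sparse.to_dense worked perfectly in the for loop, but when I use .map(_parse_tfrecord) it gives me the error shown above.

Result of printing x['image/object/bbox/xmin'] inside _parse_tfrecord(x):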

SparseTensor(indices=Tensor("DeserializeSparse_1:0",shape=(None,1),dtype=int64),values=Tensor("DeserializeSparse_1:1",),dtype=float32)

Result of printing x['image/object/bbox/xmin'] in the for loop:

SparseTensor(indices=[[0]
 [1]
 [2]
 ...
 [4]
 [5]
 [6]],values=[0.115384616 0.432692319 0.75 ... 0.581730783 0.0817307681 0.276442319],shape=[7])

My for loop:

for x in parsed_dataset:
    tf.print(x['image/object/bbox/xmin'])
    break

What is my mistake?

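Solution

The problem is that labels has shape (), that is, it is zero-dimensional (a scalar), while all the sparse tensors you are trying to stack are one-dimensional. tf.stack requires every input to have the same shape, so you should make a label tensor with the same shape as the box data tensors: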

# Assuming all box data tensors have the same shape
box_data_shape = tf.shape(x['image/object/bbox/xmin'])
# Make label data
labels = tf.ones(box_data_shape,dtype=tf.float32)
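
To make the rank mismatch concrete, here is a minimal sketch that reproduces it outside of a dataset pipeline. The sparse tensor below is a hypothetical stand-in for one parsed VarLenFeature with three boxes; it is not taken from your data. In eager mode the failure surfaces as a different exception type than the graph-mode ValueError above, so the sketch catches both.

```python
import tensorflow as tf

# Hypothetical stand-in for one parsed VarLenFeature holding three boxes.
xmin = tf.SparseTensor(
    indices=[[0], [1], [2]],
    values=tf.constant([0.1, 0.4, 0.7], dtype=tf.float32),
    dense_shape=[3],
)
dense_xmin = tf.sparse.to_dense(xmin)      # rank 1, shape (3,)
scalar_label = tf.cast(1, tf.float32)      # rank 0, shape ()

# Stacking a rank-1 tensor with a rank-0 tensor is rejected.
try:
    tf.stack([dense_xmin, scalar_label], axis=1)
    stack_failed = False
except (ValueError, tf.errors.InvalidArgumentError):
    stack_failed = True

# Giving the labels the same shape as the box data makes the stack valid.
labels = tf.ones(tf.shape(dense_xmin), dtype=tf.float32)
stacked = tf.stack([dense_xmin, labels], axis=1)
print(stack_failed, stacked.shape)  # shape is (3, 2): one row per box
```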

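Apart from that, since you are parsing single examples, all of your sparse tensors should be one-dimensional and contiguous, so you can skip the conversion to dense and simply take their .values: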

y_train = tf.stack([
    x['image/object/bbox/xmin'].values,
    x['image/object/bbox/ymin'].values,
    x['image/object/bbox/xmax'].values,
    x['image/object/bbox/ymax'].values,
    labels,
], axis=1)
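
Putting both suggestions together, the label-building part of the parse function could look roughly like the sketch below. The helper names boxes_to_padded_labels and make_sparse, the MAX_BOXES constant, and the two-box example are assumptions made for illustration; the padding to 100 rows is carried over from the question's code. Image decoding is left out so the sketch runs without a TFRecord file.

```python
import tensorflow as tf

MAX_BOXES = 100  # padding target, taken from the question's code


def boxes_to_padded_labels(features, max_boxes=MAX_BOXES):
    """Return a (max_boxes, 5) tensor of [xmin, ymin, xmax, ymax, label] rows."""
    coords = [
        features['image/object/bbox/' + key].values
        for key in ('xmin', 'ymin', 'xmax', 'ymax')
    ]
    # One label per box, so every stacked tensor is rank 1 with equal length.
    labels = tf.ones_like(coords[0])
    y = tf.stack(coords + [labels], axis=1)
    # Assumes an example never has more than max_boxes boxes;
    # otherwise the padding amount would be negative and tf.pad would fail.
    paddings = [[0, max_boxes - tf.shape(y)[0]], [0, 0]]
    return tf.pad(y, paddings)


def make_sparse(values):
    """Hypothetical helper: mimic what VarLenFeature yields for one example."""
    return tf.SparseTensor(
        indices=[[i] for i in range(len(values))],
        values=tf.constant(values, dtype=tf.float32),
        dense_shape=[len(values)],
    )


# Made-up example with two boxes, run through Dataset.map (graph mode).
features = {
    'image/object/bbox/xmin': make_sparse([0.1, 0.3]),
    'image/object/bbox/ymin': make_sparse([0.2, 0.4]),
    'image/object/bbox/xmax': make_sparse([0.5, 0.7]),
    'image/object/bbox/ymax': make_sparse([0.6, 0.8]),
}
dataset = tf.data.Dataset.from_tensors(features).map(boxes_to_padded_labels)
y_train = next(iter(dataset))
print(y_train.shape)  # (100, 5)
```

Rows past the real boxes are all zeros, including the label column, so downstream code can tell padded rows apart from real ones.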

object-detection python tensorflow tensorflow2.0