TypeError: '...' has type str, but expected one of: bytes

Problem description

I am trying to use basic TensorFlow bounding box object detection on the Open Images Dataset (v6)...

  File "/home/work/models/research/object_detection/dataset_tools/create_oid_tf_record.py", line 115, in main
    tf_example = oid_tfrecord_creation.tf_example_from_annotations_data_frame(
  File "/root/anaconda3/lib/python3.8/site-packages/object_detection/dataset_tools/oid_tfrecord_creation.py", line 71, in tf_example_from_annotations_data_frame
    dataset_util.bytes_feature('{}.jpg'.format(image_id)),
  File "/root/anaconda3/lib/python3.8/site-packages/object_detection/utils/dataset_util.py", line 33, in bytes_feature
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
TypeError: '000411001ff7dd4f.jpg' has type str, but expected one of: bytes

The relevant code appears to be here:

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

I thought changing it to value=[value.encode()] might fix it, but then it says:

AttributeError: 'bytes' object has no attribute 'encode'

(So which is it, TF? bytes or str?)

A row in the input file contains:

ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside,XClick1X,XClick2X,XClick3X,XClick4X,XClick1Y,XClick2Y,XClick3Y,XClick4Y
000411001ff7dd4f,xclick,/m/09b5t,1,0.1734375,0.46875,0.19791667,0.7916667,0

The feature map for the TFRecord:

feature_map = {
    standard_fields.TfExampleFields.object_bbox_ymin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_ymax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMax.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMax.to_numpy()),
    standard_fields.TfExampleFields.object_class_text:
        dataset_util.bytes_list_feature(
            filtered_data_frame_boxes.LabelName.to_numpy()),
    standard_fields.TfExampleFields.object_class_label:
        dataset_util.int64_list_feature(
            filtered_data_frame_boxes.LabelName.map(lambda x: label_map[x])
            .to_numpy()),
    standard_fields.TfExampleFields.filename:
        dataset_util.bytes_feature('{}.jpg'.format(image_id)),
    standard_fields.TfExampleFields.source_id:
        dataset_util.bytes_feature(image_id),
    standard_fields.TfExampleFields.image_encoded:
        dataset_util.bytes_feature(encoded_image),
}

Any ideas? I installed everything with pip3, and had to fix a number of package deprecation errors before getting this far.

pip3 install tensorflow
pip3 install tensorflow-object-detection-api

Edit:

Versions:

tensorflow                         2.3.1
tensorflow-object-detection-api    0.1.1

I tried:

  standard_fields.TfExampleFields.filename:
      dataset_util.bytes_feature(bytes(('{}.jpg'.format(image_id)),'ascii')),

But that produces the following:

TypeError: '000411001ff7dd4f' has type str, but expected one of: bytes

(Where did the .jpg go?)

Solution

The TypeError appears to refer to the filename created on this line:

dataset_util.bytes_feature('{}.jpg'.format(image_id)),

Perhaps this image name should be bytes, like this:

dataset_util.bytes_feature('{}.jpg'.format(image_id).encode()),
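The AttributeError in the question suggests that some callers already pass bytes (the encoded image, for instance) while others pass str, so an unconditional .encode() is not safe everywhere. A minimal sketch of a tolerant conversion follows. Note that `to_bytes` is a hypothetical helper name, not part of the Object Detection API, and UTF-8 is an assumed encoding.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it only when it is a str.

    Hypothetical helper for illustration; not part of any library.
    """
    if isinstance(value, bytes):
        return value  # already bytes, nothing to do
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got {}".format(type(value).__name__))


image_id = "000411001ff7dd4f"
filename = to_bytes("{}.jpg".format(image_id))   # str is encoded
unchanged = to_bytes(b"raw image data")          # bytes pass through
```

Wrapping the argument at each call site with such a helper keeps the library code untouched while still accepting both types.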

So ideally, Karl is right that we should "absolutely not try to fix the problem by editing library code."

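However, my copy of the library code does not currently work as shipped. I found a related GitHub issue for the bug and fixed it using .encode(). The bytes() solution appears to work as well. The ".jpg" went missing because the filename line was now passing, and the error had moved on to the next line (the source_id feature, which is just the image ID) and failed there.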
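To illustrate why the second error message shows the ID without ".jpg", here is a small sketch that does not use TensorFlow. `strict_bytes_feature` is a hypothetical stand-in that rejects str in roughly the way the error message above describes; it is not the real `dataset_util.bytes_feature`.

```python
def strict_bytes_feature(value):
    """Hypothetical stand-in for a bytes-only API; rejects anything else."""
    if not isinstance(value, bytes):
        raise TypeError("{!r} has type {}, but expected one of: bytes".format(
            value, type(value).__name__))
    return value


image_id = "000411001ff7dd4f"
message = None

# Dict entries are evaluated in order, so once the first entry is fixed
# the failure surfaces at the next entry that still passes a str.
try:
    features = {
        "filename": strict_bytes_feature("{}.jpg".format(image_id).encode("utf-8")),
        "source_id": strict_bytes_feature(image_id),  # still a str
    }
except TypeError as err:
    message = str(err)
    print(message)
```

The reported value is now the bare image ID, which matches the second traceback: the filename fix worked, and the complaint comes from the following feature.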

Fixed code:

  standard_fields.TfExampleFields.filename:
      dataset_util.bytes_feature(('{}.jpg'.format(image_id)).encode('utf-8')),
  standard_fields.TfExampleFields.source_id:
      dataset_util.bytes_feature(image_id.encode('utf-8')),

That only moved me on to the next error:

TypeError: '/m/09b5t' has type str, but expected one of: bytes

This is the same error, but for the bytes_list_feature() call. I fixed it by changing:

  standard_fields.TfExampleFields.object_class_text:
      dataset_util.bytes_list_feature(
          filtered_data_frame_boxes.LabelName.to_numpy()),

to:

  standard_fields.TfExampleFields.object_class_text:
      dataset_util.bytes_list_feature(
          filtered_data_frame_boxes.LabelName.map(lambda x: x.encode('utf8')).to_numpy()),

This follows the pull request at https://github.com/tensorflow/models/pull/4771/files (I also had to change as_matrix() to to_numpy()).
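The element-wise encoding in that last fix can be sketched without pandas. In this sketch a plain list stands in for the LabelName column, and `encode_labels` is a hypothetical helper, not part of the Object Detection API. UTF-8 is assumed.

```python
def encode_labels(labels, encoding="utf-8"):
    """Encode each str label to bytes; leave existing bytes untouched.

    Hypothetical helper for illustration; accepts any iterable of labels.
    """
    return [
        label if isinstance(label, bytes) else label.encode(encoding)
        for label in labels
    ]


# A plain list standing in for the LabelName column of the annotations.
labels = ["/m/09b5t", b"/m/09b5t"]
encoded = encode_labels(labels)
```

Because the helper only needs an iterable, the array returned by a column's to_numpy() call could presumably be passed to it in the same way.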