TypeError: '...' has type str, but expected one of: bytes

Problem

I am trying to run basic TensorFlow bounding-box object detection on the Open Images Dataset (v6)...

  File "/home/work/models/research/object_detection/dataset_tools/create_oid_tf_record.py", line 115, in main
    tf_example = oid_tfrecord_creation.tf_example_from_annotations_data_frame(
  File "/root/anaconda3/lib/python3.8/site-packages/object_detection/dataset_tools/oid_tfrecord_creation.py", line 71, in tf_example_from_annotations_data_frame
    dataset_util.bytes_feature('{}.jpg'.format(image_id)),
  File "/root/anaconda3/lib/python3.8/site-packages/object_detection/utils/dataset_util.py", line 33, in bytes_feature
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
TypeError: '000411001ff7dd4f.jpg' has type str, but expected one of: bytes

The relevant code appears to be here:

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

I thought value=[value.encode()] might fix it, but then it says:

AttributeError: 'bytes' object has no attribute 'encode'

(So which is it, TF? bytes or str?)

Rows in the input file look like this:

ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside,XClick1X,XClick2X,XClick3X,XClick4X,XClick1Y,XClick2Y,XClick3Y,XClick4Y
000411001ff7dd4f,xclick,/m/09b5t,1,0.1734375,0.46875,0.19791667,0.7916667,0

The feature map for the TFRecord:

feature_map = {
    standard_fields.TfExampleFields.object_bbox_ymin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_ymax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMax.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMax.to_numpy()),
    standard_fields.TfExampleFields.object_class_text:
        dataset_util.bytes_list_feature(
            filtered_data_frame_boxes.LabelName.to_numpy()),
    standard_fields.TfExampleFields.object_class_label:
        dataset_util.int64_list_feature(
            filtered_data_frame_boxes.LabelName.map(lambda x: label_map[x])
            .to_numpy()),
    standard_fields.TfExampleFields.filename:
        dataset_util.bytes_feature('{}.jpg'.format(image_id)),
    standard_fields.TfExampleFields.source_id:
        dataset_util.bytes_feature(image_id),
    standard_fields.TfExampleFields.image_encoded:
        dataset_util.bytes_feature(encoded_image),
}

Any ideas? I installed everything with pip3, and I had to fix a number of package deprecation errors before getting this far.

pip3 install tensorflow
pip3 install tensorflow-object-detection-api

Edit:

Versions:

tensorflow                         2.3.1
tensorflow-object-detection-api    0.1.1

I tried

  standard_fields.TfExampleFields.filename:
      dataset_util.bytes_feature(bytes(('{}.jpg'.format(image_id)),'ascii')),

but that gives:

TypeError: '000411001ff7dd4f' has type str, but expected one of: bytes

(Where did the .jpg go?)

Solution

The TypeError seems to refer to the filename created on this line:

dataset_util.bytes_feature('{}.jpg'.format(image_id)),

Perhaps this image name should be bytes, like so:

dataset_util.bytes_feature('{}.jpg'.format(image_id).encode()),
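
For illustration, here is a minimal sketch of the str-versus-bytes distinction behind both errors in the question. Calling .encode() works on a str but raises AttributeError on a value that is already bytes. That is one plausible reason why patching value.encode() into bytes_feature failed: the same function is also called with encoded_image, which is presumably bytes already. The helper name to_bytes is hypothetical and is not part of the Object Detection API.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it only when it is a str.

    Hypothetical helper for illustration; not part of any library.
    """
    if isinstance(value, bytes):
        # Already bytes: calling .encode() here would raise AttributeError.
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got {}".format(type(value).__name__))


# A str filename is encoded once.
filename = to_bytes("{}.jpg".format("000411001ff7dd4f"))

# Bytes pass through untouched (a stand-in for encoded image data).
image_bytes = to_bytes(b"\xff\xd8\xff")

print(type(filename).__name__, type(image_bytes).__name__)
```

Encoding at the call site, as in the line above, avoids touching the library; a tolerant helper like this one is only worth having if the same code path receives both str and bytes.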

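So ideally, Karl is right that we should "absolutely not try to fix the problem by editing library code."

But it looks like my copy of the library code is currently broken. I found a related GitHub issue and fixed the problem using .encode(). The bytes() solution appears to work too. The ".jpg" went missing because the filename line then succeeded, so the code moved on to the next line (source_id, which passes the bare image_id) and died there.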
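
To make the "moved on to the next line" behavior concrete, here is a small sketch. It assumes the features are checked one after another and that the first str value raises; strict_bytes is a hypothetical stand-in for the bytes check, not the real protobuf code, and the field list is reduced to the two entries discussed here.

```python
def strict_bytes(value):
    """Hypothetical stand-in for the bytes check: reject anything not bytes."""
    if not isinstance(value, bytes):
        raise TypeError("{!r} has type {}, but expected one of: bytes".format(
            value, type(value).__name__))
    return value


def first_failure(fields):
    """Return (field_name, message) for the first rejected field, else None."""
    for name, value in fields:
        try:
            strict_bytes(value)
        except TypeError as err:
            return name, str(err)
    return None


image_id = "000411001ff7dd4f"

# Original code: both values are str, so filename fails first.
before_fix = [("filename", "{}.jpg".format(image_id)),
              ("source_id", image_id)]

# Only filename encoded: the error moves to source_id, with no ".jpg".
after_filename_fix = [("filename", "{}.jpg".format(image_id).encode("utf-8")),
                      ("source_id", image_id)]

# Both encoded: nothing is rejected.
after_both_fixes = [("filename", "{}.jpg".format(image_id).encode("utf-8")),
                    ("source_id", image_id.encode("utf-8"))]

for fields in (before_fix, after_filename_fix, after_both_fixes):
    print(first_failure(fields))
```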

Fixed code:

  standard_fields.TfExampleFields.filename:
      dataset_util.bytes_feature(('{}.jpg'.format(image_id)).encode('utf-8')),
  standard_fields.TfExampleFields.source_id:
      dataset_util.bytes_feature(image_id.encode('utf-8')),

Only to move on to the next error:

TypeError: '/m/09b5t' has type str, but expected one of: bytes

Same error, but for the bytes_list_feature() call. Fixed by changing:

  standard_fields.TfExampleFields.object_class_text:
      dataset_util.bytes_list_feature(
          filtered_data_frame_boxes.LabelName.to_numpy()),

to:

  standard_fields.TfExampleFields.object_class_text:
      dataset_util.bytes_list_feature(
          filtered_data_frame_boxes.LabelName.map(lambda x: x.encode('utf8')).to_numpy()),
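
The per-element encoding can be sketched in plain Python, without pandas. The list below is a stand-in for the LabelName column; encode_labels is a hypothetical helper, and the second label is a made-up placeholder rather than a real Open Images class. Series.map in the snippet above applies the same function to each element of the column.

```python
def encode_labels(labels, encoding="utf8"):
    """Encode every str label to bytes; hypothetical helper for illustration."""
    return [label.encode(encoding) for label in labels]


# Stand-in for the LabelName column; the second value is a placeholder.
labels = ["/m/09b5t", "/m/placeholder"]
encoded_labels = encode_labels(labels)

# Every element must be bytes before it goes into a bytes list feature.
all_bytes = all(isinstance(label, bytes) for label in encoded_labels)
print(encoded_labels, all_bytes)
```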

This follows the pull request https://github.com/tensorflow/models/pull/4771/files (I also had to change as_matrix() to to_numpy()).
