Training a Keras + TensorFlow model on a Google Colaboratory TPU

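Problem description

Goal

I want to train an EfficientNet model on a Google Colab TPU, but the error shown below occurs. Could you explain what is going wrong? I normally use ImageDataGenerator, but as far as I understand it cannot be used with a Colab TPU, so I believe I have to switch to tf.data.

What I would really like is to keep using ImageDataGenerator.flow_from_dataframe, but the Colab TPU prevents that. Is there a way to solve this? Please explain in detail.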
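One part of flow_from_dataframe that has to be done by hand when moving to tf.data is turning the DataFrame rows into a list of file paths and a list of integer class indices. The sketch below shows that step in plain Python. The sample rows and the helper name build_path_label_lists are hypothetical; the column names image_path and label follow the code in this question, and assigning indices in sorted class order is an assumption made to mirror how Keras generators usually number classes.

```python
# Hypothetical rows standing in for df_train; with pandas these could come
# from df_train.to_dict("records").
rows = [
    {"image_path": "images/a.jpg", "label": "dog"},
    {"image_path": "images/b.jpg", "label": "cat"},
    {"image_path": "images/c.jpg", "label": "dog"},
]


def build_path_label_lists(rows, path_key="image_path", label_key="label"):
    """Return (paths, label_ids, class_indices) built from row dictionaries."""
    # Assign each distinct label an integer index in sorted order (assumption).
    class_names = sorted({row[label_key] for row in rows})
    class_indices = {name: index for index, name in enumerate(class_names)}
    paths = [row[path_key] for row in rows]
    label_ids = [class_indices[row[label_key]] for row in rows]
    return paths, label_ids, class_indices


paths, label_ids, class_indices = build_path_label_lists(rows)
print(class_indices)  # {'cat': 0, 'dog': 1}
print(label_ids)      # [1, 0, 1]
```

The two lists can then be passed as a tuple to tf.data.Dataset.from_tensor_slices, and integer labels of this kind match the sparse_categorical_crossentropy loss used in the question. If the label column already holds integers, this mapping step is unnecessary.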

Code

import tensorflow as tf
from tensorflow.keras.utils import to_categorical  # use tf.keras rather than standalone Keras

AUTOTUNE = tf.data.experimental.AUTOTUNE
IMAGE_SIZE = 240
TRAIN_IMAGE_COUNT = len(df_train)
VAL_IMAGE_COUNT = len(df_val)
BATCH_SIZE = 32

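def preprocess_image(path,label=None):
  # Read and decode one JPEG file, then resize it to the model input size.
  image = tf.io.read_file(path)
  image = tf.image.decode_jpeg(image,channels=3)
  image = tf.image.resize(image,[IMAGE_SIZE,IMAGE_SIZE])
  # The tf.keras EfficientNet models rescale their inputs internally and
  # expect pixel values in [0, 255], so the image is not divided by 255 here.
  image = tf.cast(image,tf.float32)

  return image,label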

data_train = tf.data.Dataset.from_tensor_slices((df_train['image_path'],df_train['label']))
data_train = data_train.map(preprocess_image)
# Cache the decoded images once, before shuffling and repeating;
# caching after repeat() would try to cache an endless dataset.
data_train = data_train.cache()
data_train = data_train.shuffle(buffer_size=TRAIN_IMAGE_COUNT)
data_train = data_train.repeat()
data_train = data_train.batch(BATCH_SIZE)
data_train = data_train.prefetch(buffer_size=AUTOTUNE)

data_val = tf.data.Dataset.from_tensor_slices((df_val['image_path'],df_val['label']))
data_val = data_val.map(preprocess_image)
# Same ordering as the training pipeline: cache first, prefetch last.
data_val = data_val.cache()
data_val = data_val.shuffle(buffer_size=VAL_IMAGE_COUNT)
data_val = data_val.repeat()
data_val = data_val.batch(BATCH_SIZE)
data_val = data_val.prefetch(buffer_size=AUTOTUNE)

# (next notebook cell) Select TensorFlow 2.x in Colab and connect to the TPU.
%tensorflow_version 2.x
import tensorflow as tf
print("Tensorflow version " + tf.__version__)

try:
  tpu = tf.distribute.cluster_resolver.TPUClusterResolver()  # TPU detection
  print('Running on TPU ',tpu.cluster_spec().as_dict()['worker'])
except ValueError:
  raise BaseException('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')

tf.config.experimental_connect_to_cluster(tpu)
tf.tpu.experimental.initialize_tpu_system(tpu)
strategy = tf.distribute.TPUStrategy(tpu)

# (next notebook cell) Build, compile and train the model.
from tensorflow.keras.applications import EfficientNetB1
from tensorflow.keras.layers import GlobalAveragePooling2D,Dense,Dropout
from tensorflow.keras import optimizers

# Keras documents step counts as integers, so use integer ceiling division.
TRAIN_STEPS_PER_EPOCH = (len(df_train) + BATCH_SIZE - 1) // BATCH_SIZE
VAL_STEPS_PER_EPOCH = (len(df_val) + BATCH_SIZE - 1) // BATCH_SIZE
EPOCS = 1

with strategy.scope():
  model = tf.keras.Sequential()

  conv_base = EfficientNetB1(weights='imagenet',input_shape=(IMAGE_SIZE,IMAGE_SIZE,3),include_top=False)

  model.add(conv_base)
  model.add(GlobalAveragePooling2D())
  model.add(Dense(5,activation='softmax'))

  model.compile(loss='sparse_categorical_crossentropy',optimizer=optimizers.Adagrad(),metrics=['accuracy'])

print('### START TRAINING ###')

history = model.fit(data_train,steps_per_epoch=TRAIN_STEPS_PER_EPOCH,epochs=EPOCS,validation_data=data_val,validation_steps=VAL_STEPS_PER_EPOCH)

print('### FINISH TRAINING ###')
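
The fit call above depends on step counts derived from the dataset sizes and on one global batch size shared by all TPU cores. A minimal sketch of that arithmetic in plain Python follows. The helper names steps_per_epoch and per_replica_batch are hypothetical, the replica count of 8 is an assumption based on the usual Colab TPU runtime, and the divisibility check reflects a common recommendation rather than a confirmed requirement for this code.

```python
def steps_per_epoch(num_examples, batch_size, drop_remainder=False):
    """Number of batches needed to pass over num_examples once."""
    if drop_remainder:
        # A final partial batch is discarded.
        return num_examples // batch_size
    # Integer ceiling division keeps the final partial batch.
    return (num_examples + batch_size - 1) // batch_size


def per_replica_batch(global_batch_size, num_replicas=8):
    """Split a global batch evenly across replicas (8 is an assumed count)."""
    if global_batch_size % num_replicas != 0:
        raise ValueError("global batch size should be divisible by the replica count")
    return global_batch_size // num_replicas


print(steps_per_epoch(1000, 32))        # 32
print(steps_per_epoch(1000, 32, True))  # 31
print(per_replica_batch(32))            # 4
```

With BATCH_SIZE = 32 as in the question, each of 8 replicas would process 4 images per step under these assumptions.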

Error message

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-4-a67e6b797143> in <module>()
     20 print('### START TRAINING ###')
     21 
---> 22 history = model.fit(data_train,validation_steps=VAL_STEPS_PER_EPOCH)
     23 
     24 print('### FINISH TRAINING ###')

5 frames
/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value,from_value)

InvalidArgumentError: Unable to parse tensor proto
