Dimension mismatch when using `keras.Model.fit` with `BERT` in TensorFlow

Problem description

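I followed the Fine-tuning BERT instructions to build a model with my own dataset (it is fairly large, over 20 GB), then re-encoded my data and loaded it from tf_record files. The training_dataset I created has the same signature as the one in the instructions:

training_dataset.element_spec
({'input_word_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None),
  'input_mask': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None),
  'input_type_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None)},
 TensorSpec(shape=(32,), dtype=tf.int32, name=None))

where batch_size is 32 and max_seq_length is 1024. As the instructions put it,

"The resulting tf.data.Datasets return (features, labels) pairs, as expected by keras.Model.fit"

Everything seems to work as expected (although the instructions do not show how to use training_dataset). However, the following code

bert_classifier.fit(
    x=training_dataset,
    validation_data=test_dataset,  # has the same signature as training_dataset
    batch_size=32,
    epochs=epochs,
    verbose=1,
)

fails with a dimension-mismatch error that looks strange to me: it involves 512, but my setup has nothing to do with 512, and I do not use 512 anywhere in my code. So what is wrong with my code, and how can I fix it?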

Solution

In the instructions, bert_classifier is created from bert_config, which is built from the bert_config_file loaded from bert_config.json:

bert_classifier, bert_encoder = bert.bert_models.classifier_model(bert_config, num_labels=2)

bert_config.json

{
    'attention_probs_dropout_prob': 0.1,
    'hidden_act': 'gelu',
    'hidden_dropout_prob': 0.1,
    'hidden_size': 768,
    'initializer_range': 0.02,
    'intermediate_size': 3072,
    'max_position_embeddings': 512,
    'num_attention_heads': 12,
    'num_hidden_layers': 12,
    'type_vocab_size': 2,
    'vocab_size': 30522
}

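According to this configuration, hidden_size is 768 and max_position_embeddings is 512. The model only has position embeddings for 512 positions, so the inputs you feed to BERT must fit that shape: the sequence length can be at most 512. That is why you get a shape mismatch, since your inputs have a sequence length of 1024.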
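As a minimal sketch of that constraint, the check below compares a chosen sequence length with the limit in the configuration before any data is encoded. The helper name `check_seq_length` is hypothetical and is not part of TensorFlow or the Model Garden; the config dictionary only repeats the two values quoted above.

```python
import json

# Only the two fields discussed above; the real bert_config.json has more.
config = json.loads('{"hidden_size": 768, "max_position_embeddings": 512}')


def check_seq_length(config, max_seq_length):
    """Hypothetical helper: reject sequence lengths the model cannot embed."""
    limit = config["max_position_embeddings"]
    if max_seq_length > limit:
        raise ValueError(
            f"max_seq_length={max_seq_length} exceeds "
            f"max_position_embeddings={limit}"
        )
    return max_seq_length


# 512 is accepted; 1024 (the value used in the question) is rejected.
check_seq_length(config, 512)
try:
    check_seq_length(config, 1024)
except ValueError as err:
    print(err)
```

Running a check like this before writing the tf_record files surfaces the problem early, rather than as a shape error inside `fit`.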

Therefore, to make it work, change every line that creates the input tensors to use 512 instead of 1024.
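One way to picture the change is the sketch below, which truncates or pads a single already-tokenized example to 512 positions and builds the three feature lists named in the question. It is an illustration under assumptions, not the preprocessing code from the instructions: `build_features` is a hypothetical helper, `pad_id=0` assumes the padding token has id 0, and the example is treated as a single segment.

```python
MAX_SEQ_LENGTH = 512  # must not exceed max_position_embeddings


def build_features(token_ids, max_seq_length=MAX_SEQ_LENGTH, pad_id=0):
    """Hypothetical helper: fit one tokenized example into max_seq_length."""
    # Plain truncation; real preprocessing would usually truncate the text
    # before adding special tokens so the trailing [SEP] is not cut off.
    ids = list(token_ids)[:max_seq_length]
    pad = max_seq_length - len(ids)
    return {
        "input_word_ids": ids + [pad_id] * pad,
        "input_mask": [1] * len(ids) + [0] * pad,   # 1 = real token, 0 = padding
        "input_type_ids": [0] * max_seq_length,     # single-segment input
    }


# A 1024-token example is cut down to 512 positions.
features = build_features(range(1, 1025))
print({name: len(values) for name, values in features.items()})
```

The same length then has to be used wherever the tf_record files are written and parsed, so that the dataset's element_spec reports (32, 512) instead of (32, 1024).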
