Batch size keeps changing, throwing "PyTorch ValueError Expected: input batch size does not match target batch size"

Problem description

I am working on a multi-label text classification task with BERT.

Here is the code used to generate the iterable datasets:

from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler

train_set = TensorDataset(X_train_id, X_train_attention, y_train)
test_set = TensorDataset(X_test_id, X_test_attention, y_test)

train_DataLoader = DataLoader(
    train_set, sampler=RandomSampler(train_set), drop_last=True, batch_size=13
)

test_DataLoader = DataLoader(
    test_set, sampler=SequentialSampler(test_set), batch_size=13
)

Here are the dimensions of the training set:

In []

print(X_train_id.shape)
print(X_train_attention.shape)
print(y_train.shape)

Out []

torch.Size([262754, 512])
torch.Size([262754, 512])
torch.Size([262754, 34])

There should be 262754 rows, each with 512 columns. The output should predict values from 34 possible labels. I split them into batches of 13.

Training code

optimizer = AdamW(model.parameters(), lr=2e-5)

# Training
def train(model):
    model.train()
    train_loss = 0
    for batch in train_DataLoader:
        b_input_ids = batch[0].to(device)
        b_input_mask = batch[1].to(device)
        b_labels = batch[2].to(device)
        optimizer.zero_grad()
        loss, logits = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask, labels=b_labels)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        train_loss += loss.item()
    return train_loss


# Testing
def test(model):
    model.eval()
    val_loss = 0
    with torch.no_grad():
        for batch in test_DataLoader:
            b_input_ids = batch[0].to(device)
            b_input_mask = batch[1].to(device)
            b_labels = batch[2].to(device)
            with torch.no_grad():
                (loss, logits) = model(b_input_ids, labels=b_labels)
            val_loss += loss.item()
    return val_loss

# Train task
max_epoch = 1
train_loss_ = []
test_loss_ = []

for epoch in range(max_epoch):
    train_ = train(model)
    test_ = test(model)
    train_loss_.append(train_)
    test_loss_.append(test_)

Out []

Expected input batch_size (13) to match target batch_size (442).

Here is the description of my model:

from transformers import BertForSequenceClassification, AdamW, BertConfig

model = BertForSequenceClassification.from_pretrained(
    "cl-tohoku/bert-base-japanese-whole-word-masking",  # Japanese pre-trained model
    num_labels=34,
    output_attentions=False,
    output_hidden_states=False,
)

I have explicitly stated that I want the batch size to be 13, yet PyTorch raises this runtime error during training.

Where does the number 442 even come from? I have explicitly stated that I want each batch to contain 13 rows.

I have confirmed that for each batch the input_ids have dimensions [13, 512], the attention tensors have dimensions [13, 512], and the labels have dimensions [13, 34].
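
For reference, a minimal per-batch shape check (a sketch reusing train_DataLoader from above) is enough to confirm those dimensions:

# Sketch: inspect the first batch from train_DataLoader (defined above)
# and print the per-batch tensor shapes.
b_input_ids, b_input_mask, b_labels = next(iter(train_DataLoader))
print(b_input_ids.shape)    # torch.Size([13, 512])
print(b_input_mask.shape)   # torch.Size([13, 512])
print(b_labels.shape)       # torch.Size([13, 34])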

I have tried digging in and initializing the DataLoader with a batch size of 442, but after one batch iteration it throws another PyTorch ValueError Expected: input batch size does not match target batch size, this time showing

ValueError: Expected input batch_size (442) to match target batch_size (15028).

Why does the batch size keep changing? Where does this number 15028 even come from?

Here are some answers I have looked through, but I had no luck applying them to my source code:

https://discuss.pytorch.org/t/valueerror-expected-input-batch-size-324-to-match-target-batch-size-4/24498

https://discuss.pytorch.org/t/valueerror-expected-input-batch-size-1-to-match-target-batch-size-64/43071

Pytorch CNN error: Expected input batch_size (4) to match target batch_size (64)

Thanks in advance. Your support is greatly appreciated :)

Solution

According to the documentation, it seems the model cannot handle the multi-target scenario:

labels (torch.LongTensor of shape (batch_size,), optional) – Labels for computing the sequence classification/regression loss. Indices should be in [0, ..., config.num_labels - 1]. If config.num_labels == 1 a regression loss is computed (Mean-Square loss); if config.num_labels > 1 a classification loss is computed (Cross-Entropy).

So you need to prepare your labels so that they have the shape torch.Size([batch_size]), with each entry being a class index in the range [0, ..., config.num_labels - 1], just like for PyTorch's original CrossEntropyLoss (see its examples section).
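
That is also where the confusing numbers come from: the model's cross-entropy loss flattens the label tensor, so a [13, 34] label batch becomes 13 × 34 = 442 targets against only 13 rows of logits, and with a batch size of 442 it becomes 442 × 34 = 15028.

Below is a minimal sketch of that label preparation, assuming each row of y_train / y_test is a one-hot encoding of exactly one of the 34 classes (variable names reused from the question). If your samples genuinely carry several labels at once, collapsing them with argmax is not a valid fix.

import torch
from torch.utils.data import TensorDataset

# Sketch: turn one-hot label rows into class indices so that each batch
# yields labels of shape (batch_size,), as the model expects.
# Assumes every row of y_train / y_test has exactly one column set to 1.
y_train_idx = y_train.argmax(dim=1).long()   # torch.Size([262754])
y_test_idx = y_test.argmax(dim=1).long()

train_set = TensorDataset(X_train_id, X_train_attention, y_train_idx)
test_set = TensorDataset(X_test_id, X_test_attention, y_test_idx)

# Batches drawn from DataLoaders built on these sets now carry labels of
# shape torch.Size([13]), matching the documented (batch_size,) requirement.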