Predicting with Dropout enabled on HuggingFace's BERT

Problem description

By default, HuggingFace's Trainer(...) disables Dropout when evaluating a model. Concretely, y_pred will be identical across all M runs of the loop below:

for i in range(M):
    logits, labels, metrics = trainer.predict(tokenized_datasets["eval"])
    y_pred = np.argmax(logits, axis=2)
    ...

I am now trying to apply the Monte Carlo Dropout trick introduced in this answer. It requires Dropout to stay turned on while predicting on the validation set.

How can I achieve this? Any input is appreciated.

Solution

You can put the whole model in eval mode and then switch only the dropout layers back to train mode:

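from torch import nn
from transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')
model.eval()  # put every layer in eval mode first

def apply_dropout(m):
    # re-enable only the Dropout layers; everything else stays in eval mode
    if isinstance(m, nn.Dropout):
        m.train()

# Module.apply calls apply_dropout on every submodule, recursively
model.apply(apply_dropout)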

This is the approach recommended on the PyTorch forums (here, here).
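
To make the Monte Carlo part concrete, here is a minimal sketch of how several stochastic forward passes could be averaged once dropout is re-enabled. The helper names `enable_dropout` and `mc_dropout_predict` are my own (hypothetical, not part of transformers or PyTorch), and the small `nn.Sequential` network is only a stand-in for a BERT classifier so that the snippet runs without downloading weights.

```python
import torch
from torch import nn


def enable_dropout(model: nn.Module) -> None:
    """Put the model in eval mode, then switch only Dropout layers back to train mode."""
    model.eval()
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.train()


def mc_dropout_predict(model: nn.Module, inputs: torch.Tensor, n_samples: int = 10):
    """Run n_samples stochastic forward passes and return the mean and
    standard deviation of the softmax probabilities across passes."""
    enable_dropout(model)
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(inputs), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)


# Toy stand-in for a BERT classifier (assumption, for illustration only).
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(16, 3)
)
x = torch.randn(4, 8)  # batch of 4 fake feature vectors

mean_probs, std_probs = mc_dropout_predict(toy_model, x, n_samples=20)
y_pred = mean_probs.argmax(dim=-1)  # predicted class per example
```

The per-class standard deviation gives a rough uncertainty estimate. One caveat worth checking in your transformers version: as far as I can tell, trainer.predict puts the model back into eval mode internally, which would undo apply_dropout, so you may need to run the forward passes yourself as above rather than through the Trainer.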

dropout huggingface-transformers