Extracting features from BertForSequenceClassification

Problem description

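Hi everyone, I am currently trying to develop a model for contradiction detection. By using and fine-tuning a BERT model I already get quite satisfying results, but I think I could achieve better accuracy by adding some other features. I followed this tutorial. After fine-tuning, my model looks like this: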

==== Embedding Layer ====

bert.embeddings.word_embeddings.weight                  (30000,768)
bert.embeddings.position_embeddings.weight                (512,768)
bert.embeddings.token_type_embeddings.weight                (2,768)
bert.embeddings.LayerNorm.weight                              (768,)
bert.embeddings.LayerNorm.bias                                (768,)

==== First Transformer ====

bert.encoder.layer.0.attention.self.query.weight          (768,768)
bert.encoder.layer.0.attention.self.query.bias                (768,)
bert.encoder.layer.0.attention.self.key.weight            (768,768)
bert.encoder.layer.0.attention.self.key.bias                  (768,)
bert.encoder.layer.0.attention.self.value.weight          (768,768)
bert.encoder.layer.0.attention.self.value.bias                (768,)
bert.encoder.layer.0.attention.output.dense.weight        (768,768)
bert.encoder.layer.0.attention.output.dense.bias              (768,)
bert.encoder.layer.0.attention.output.LayerNorm.weight        (768,)
bert.encoder.layer.0.attention.output.LayerNorm.bias          (768,)
bert.encoder.layer.0.intermediate.dense.weight           (3072,768)
bert.encoder.layer.0.intermediate.dense.bias                 (3072,)
bert.encoder.layer.0.output.dense.weight                 (768,3072)
bert.encoder.layer.0.output.dense.bias                        (768,)
bert.encoder.layer.0.output.LayerNorm.weight                  (768,)
bert.encoder.layer.0.output.LayerNorm.bias                    (768,)

==== Output Layer ====

bert.pooler.dense.weight                                  (768,768)
bert.pooler.dense.bias                                        (768,)
classifier.weight                                           (2,768)
classifier.bias                                                 (2,)

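My next step is to get the [CLS] token representation from this model, combine it with some handcrafted features, and feed them into a different model (an MLP) for classification. Any hints on how to do this?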

Solution

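You can use the pooled output of the BERT model, that is, the contextualized embedding of the [CLS] token after it has been passed through the pooling layer (bert.pooler in the listing above).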
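The following is a minimal sketch of that idea, not a definitive implementation. It assumes the Hugging Face transformers library with PyTorch. To stay self-contained it builds a tiny, randomly initialised BertForSequenceClassification instead of loading the fine-tuned checkpoint; in practice you would load your own model with from_pretrained. The token ids, the five handcrafted features and the MLP sizes are hypothetical placeholders.

```python
import torch
from torch import nn
from transformers import BertConfig, BertForSequenceClassification

# Assumption: a tiny random model stands in for the fine-tuned one so that the
# sketch runs without downloading weights. Replace it with your own checkpoint
# loaded via BertForSequenceClassification.from_pretrained(...).
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)
model.eval()

# Placeholder batch: 4 sequences of 10 token ids (normally produced by the tokenizer).
input_ids = torch.randint(0, config.vocab_size, (4, 10))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    # model.bert is the underlying BertModel; calling it skips the classifier head.
    outputs = model.bert(input_ids=input_ids, attention_mask=attention_mask)

# [CLS] hidden state after the pooler (dense layer + tanh): (batch, hidden_size)
pooled = outputs.pooler_output
# Alternative: the raw [CLS] hidden state of the last layer, before the pooler.
cls_raw = outputs.last_hidden_state[:, 0]

# Hypothetical handcrafted features, 5 per example (random placeholder values).
handcrafted = torch.rand(4, 5)

# Concatenate along the feature dimension and classify with a small MLP.
combined = torch.cat([pooled, handcrafted], dim=1)
mlp = nn.Sequential(
    nn.Linear(combined.shape[1], 16),
    nn.ReLU(),
    nn.Linear(16, 2),
)
logits = mlp(combined)
```

Here the BERT features are computed under torch.no_grad(), so only the MLP would be trained. If BERT should be updated together with the MLP, drop the no_grad context and put the model in training mode.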


bert-language-model huggingface-transformers nlp python
