Problem description
Hi everyone, I am currently trying to build a model for contradiction detection. By fine-tuning a BERT model I have already obtained fairly satisfactory results, but I think I could get better accuracy by adding some extra features. I based my work on this tutorial. After fine-tuning, my model looks like this:
==== Embedding Layer ====
bert.embeddings.word_embeddings.weight (30000,768)
bert.embeddings.position_embeddings.weight (512,768)
bert.embeddings.token_type_embeddings.weight (2,768)
bert.embeddings.LayerNorm.weight (768,)
bert.embeddings.LayerNorm.bias (768,)
==== First Transformer ====
bert.encoder.layer.0.attention.self.query.weight (768,768)
bert.encoder.layer.0.attention.self.query.bias (768,)
bert.encoder.layer.0.attention.self.key.weight (768,768)
bert.encoder.layer.0.attention.self.key.bias (768,)
bert.encoder.layer.0.attention.self.value.weight (768,768)
bert.encoder.layer.0.attention.self.value.bias (768,)
bert.encoder.layer.0.attention.output.dense.weight (768,768)
bert.encoder.layer.0.attention.output.dense.bias (768,)
bert.encoder.layer.0.attention.output.LayerNorm.weight (768,)
bert.encoder.layer.0.attention.output.LayerNorm.bias (768,)
bert.encoder.layer.0.intermediate.dense.weight (3072,768)
bert.encoder.layer.0.intermediate.dense.bias (3072,)
bert.encoder.layer.0.output.dense.weight (768,3072)
bert.encoder.layer.0.output.dense.bias (768,)
bert.encoder.layer.0.output.LayerNorm.weight (768,)
bert.encoder.layer.0.output.LayerNorm.bias (768,)
==== Output Layer ====
bert.pooler.dense.weight (768,768)
bert.pooler.dense.bias (768,)
classifier.weight (2,768)
classifier.bias (2,)
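(For reference, a listing like the one above can be printed with a short loop; this is only a minimal sketch, assuming model is the fine-tuned Hugging Face BertForSequenceClassification:)

# Minimal sketch: print every parameter name and its shape,
# assuming `model` is the fine-tuned BertForSequenceClassification.
for name, param in model.named_parameters():
    print(f"{name:<60} {tuple(param.size())}")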
My next step is to take the [CLS] token from this model, combine it with some handcrafted features, and feed them into a separate model (an MLP) for classification. Any hints on how to do this?
Solution
You can use the pooled output of the BERT model (the contextual embedding of the [CLS] token passed through the pooler layer):
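Below is a minimal sketch of that idea, assuming a Hugging Face BertModel (you could also reuse the encoder of your fine-tuned model via its .bert attribute). The class name BertWithFeatures, the feature dimension handcrafted_dim, and the MLP sizes are placeholders, not part of the original answer:

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertWithFeatures(nn.Module):
    """Concatenates BERT's pooled [CLS] output with handcrafted features
    and classifies the result with a small MLP."""

    def __init__(self, bert, handcrafted_dim, hidden_dim=256, num_labels=2):
        super().__init__()
        self.bert = bert
        self.mlp = nn.Sequential(
            nn.Linear(bert.config.hidden_size + handcrafted_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_labels),
        )

    def forward(self, input_ids, attention_mask, handcrafted_feats):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = outputs.pooler_output  # (batch, 768): [CLS] embedding after the pooler
        combined = torch.cat([pooled, handcrafted_feats], dim=-1)
        return self.mlp(combined)  # (batch, num_labels) logits

# Usage sketch with dummy inputs (a premise/hypothesis pair and 10 handcrafted features):
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
model = BertWithFeatures(bert, handcrafted_dim=10)

enc = tokenizer("It is raining outside.", "The sun is shining.", return_tensors="pt")
feats = torch.randn(1, 10)  # replace with your real handcrafted features
logits = model(enc["input_ids"], enc["attention_mask"], feats)

Note that on older transformers versions the pooled output is the second element of the returned tuple (outputs[1]) rather than outputs.pooler_output. If you only want to train the MLP, you can freeze the BERT weights first (for p in bert.parameters(): p.requires_grad = False).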