How to use the source code class RobertaSelfAttention(nn.Module)

Problem description

I want to obtain the queries and keys from the code below. The excerpt comes from https://huggingface.co/transformers/_modules/transformers/models/roberta/modeling_roberta.html#RobertaModel. Ultimately, I want to reproduce Figure 1 of this paper. Could someone give an intuitive explanation of how to go about it?

from torch import nn

class RobertaSelfAttention(nn.Module):
    def __init__(self, config):
        super().__init__()
        if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
            raise ValueError(
                f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
                f"heads ({config.num_attention_heads})"
            )

        self.num_attention_heads = config.num_attention_heads
        self.attention_head_size = int(config.hidden_size / config.num_attention_heads)
        self.all_head_size = self.num_attention_heads * self.attention_head_size

        # Q, K and V are plain linear projections of the incoming hidden states
        self.query = nn.Linear(config.hidden_size, self.all_head_size)
        self.key = nn.Linear(config.hidden_size, self.all_head_size)
        self.value = nn.Linear(config.hidden_size, self.all_head_size)

        self.dropout = nn.Dropout(config.attention_probs_dropout_prob)
        self.position_embedding_type = getattr(config, "position_embedding_type", "absolute")
        if self.position_embedding_type == "relative_key" or self.position_embedding_type == "relative_key_query":
            self.max_position_embeddings = config.max_position_embeddings
            self.distance_embedding = nn.Embedding(2 * config.max_position_embeddings - 1, self.attention_head_size)

        self.is_decoder = config.is_decoder
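
For intuition: the query and key projections above are ordinary nn.Linear layers, so the per-head Q and K matrices can be computed by hand from the hidden states entering the layer. Below is a minimal sketch of one way to do that (my own illustration, assuming the roberta-base checkpoint and a recent transformers release); it applies the first layer's query/key projections to the embedding output and rebuilds the raw attention scores the same way the module's transpose_for_scores step does internally.

import math
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")

with torch.no_grad():
    # The hidden states feeding layer 0 are the embedding output
    # (RobertaEmbeddings already applies LayerNorm; dropout is off in eval mode).
    hidden = model.embeddings(input_ids=inputs["input_ids"])

    # The class shown above lives at encoder.layer[i].attention.self
    self_attn = model.encoder.layer[0].attention.self
    q = self_attn.query(hidden)  # (batch, seq_len, all_head_size)
    k = self_attn.key(hidden)    # (batch, seq_len, all_head_size)

    # Split into heads, mirroring transpose_for_scores
    b, s, _ = q.shape
    h, d = self_attn.num_attention_heads, self_attn.attention_head_size
    q = q.view(b, s, h, d).transpose(1, 2)  # (batch, heads, seq_len, head_dim)
    k = k.view(b, s, h, d).transpose(1, 2)

    # Raw scores Q·K^T / sqrt(d); the padding mask is ignored here because
    # the single example sentence contains no padding tokens.
    scores = q @ k.transpose(-1, -2) / math.sqrt(d)
    attn_probs = scores.softmax(dim=-1)  # one (seq_len x seq_len) map per head

print(attn_probs.shape)  # torch.Size([1, 12, seq_len, seq_len]) for roberta-base

The same q and k tensors could also be captured without re-running the projections by registering forward hooks on self_attn.query and self_attn.key.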

Solution
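
If the goal is only to visualize attention maps (an assumption on my part about what Figure 1 of the linked paper shows; the paper is not identified in the post), the module internals do not need to be touched at all: RobertaModel can return the softmaxed attention probabilities of every layer when called with output_attentions=True. A minimal sketch, again assuming the roberta-base checkpoint and matplotlib for plotting:

import torch
import matplotlib.pyplot as plt
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
layer, head = 0, 0
attn = outputs.attentions[layer][0, head].numpy()

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
plt.imshow(attn, cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.yticks(range(len(tokens)), tokens)
plt.title(f"Layer {layer}, head {head}")
plt.colorbar()
plt.tight_layout()
plt.show()

Note that these are post-softmax probabilities; if the figure needs the raw query/key vectors themselves, the manual-projection sketch above (or a forward hook on the query/key layers) is the way to obtain them.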
