Problem description
I want to get the queries and keys from the code below, which comes from https://huggingface.co/transformers/_modules/transformers/models/roberta/modeling_roberta.html#RobertaModel. Ultimately, I want to reproduce Figure 1 of this paper. Could someone give an intuitive explanation of how to go about this?
import torch
from torch import nn


class RobertaSelfAttention(nn.Module):
    def __init__(self, config):
        super().__init__()
        if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
            raise ValueError(
                f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
                f"heads ({config.num_attention_heads})"
            )

        self.num_attention_heads = config.num_attention_heads
        self.attention_head_size = int(config.hidden_size / config.num_attention_heads)
        self.all_head_size = self.num_attention_heads * self.attention_head_size

        # Linear projections that produce the queries, keys, and values;
        # each maps hidden_size -> all_head_size = num_heads * head_size.
        self.query = nn.Linear(config.hidden_size, self.all_head_size)
        self.key = nn.Linear(config.hidden_size, self.all_head_size)
        self.value = nn.Linear(config.hidden_size, self.all_head_size)

        self.dropout = nn.Dropout(config.attention_probs_dropout_prob)
        self.position_embedding_type = getattr(config, "position_embedding_type", "absolute")
        if self.position_embedding_type == "relative_key" or self.position_embedding_type == "relative_key_query":
            self.max_position_embeddings = config.max_position_embeddings
            self.distance_embedding = nn.Embedding(2 * config.max_position_embeddings - 1, self.attention_head_size)

        self.is_decoder = config.is_decoder
Solution
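One way to capture the queries and keys is to register forward hooks on the query and key Linear modules of a loaded model, run a forward pass, and then reshape the captured outputs into per-head tensors the same way RobertaSelfAttention does internally. The sketch below is a minimal illustration under assumed choices (the roberta-base checkpoint, encoder layer 0, and the sample sentence are all arbitrary), not the paper's own code.

import math

import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
model.eval()

captured = {}

def save_output(name):
    # Forward hook factory: stores the hooked module's output under `name`.
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

layer = 0  # which encoder layer to inspect (arbitrary choice)
attn = model.encoder.layer[layer].attention.self
attn.query.register_forward_hook(save_output("query"))
attn.key.register_forward_hook(save_output("key"))

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

def split_heads(x, num_heads, head_size):
    # (batch, seq_len, all_head_size) -> (batch, num_heads, seq_len, head_size)
    b, s, _ = x.shape
    return x.view(b, s, num_heads, head_size).permute(0, 2, 1, 3)

q = split_heads(captured["query"], attn.num_attention_heads, attn.attention_head_size)
k = split_heads(captured["key"], attn.num_attention_heads, attn.attention_head_size)

# Scaled dot-product scores and post-softmax attention probabilities,
# matching the "absolute" position-embedding branch of RobertaSelfAttention.
scores = torch.matmul(q, k.transpose(-1, -2)) / math.sqrt(attn.attention_head_size)
probs = scores.softmax(dim=-1)  # (batch, num_heads, seq_len, seq_len)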
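If the figure you want to reproduce is an attention heatmap, you may not need the raw queries and keys at all: passing output_attentions=True makes the model return the post-softmax attention probabilities for every layer, which can be plotted directly. Again a sketch continuing from the code above; the layer and head indices are arbitrary, and whether this matches the paper's Figure 1 is an assumption, since the paper is not linked here.

import matplotlib.pyplot as plt

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple with one tensor of shape
# (batch, num_heads, seq_len, seq_len) per layer.
attn_map = outputs.attentions[layer][0, 0].numpy()  # chosen layer, head 0

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
plt.imshow(attn_map, cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.yticks(range(len(tokens)), tokens)
plt.colorbar()
plt.show()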