Building a Keras text embedding model with cosine proximity

Problem description

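I am trying to build a word-embedding Keras model in which I convert the input text into the corresponding input ids and masks (like the inputs to an Albert model) and get a 768-dimensional vector as output. My plan is to use an Albert layer followed by an LSTM layer and a Dense layer to return the vector. The target variable is also a 768-dimensional vector. I want to use something like the cosine proximity function to fit the model's weights. My overall architecture is shown in the code below. However, after training the model for a while, when I test it the output vectors for all inputs are almost identical. Is there something I need to change to make the model work properly?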
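The symptom described above (near-identical output vectors for every input) can be put into a number before changing anything in the model. The following is a minimal sketch, not part of the original post: `mean_pairwise_cosine` is a hypothetical helper, and the two random arrays merely stand in for what `model.predict(...)` would return on a handful of distinct inputs.

```python
import numpy as np


def mean_pairwise_cosine(vectors, eps=1e-12):
    """Average cosine similarity over all pairs of distinct rows."""
    v = np.asarray(vectors, dtype=np.float64)
    # L2-normalise each row so that dot products become cosine similarities.
    unit = v / np.maximum(np.linalg.norm(v, axis=1, keepdims=True), eps)
    sims = unit @ unit.T
    # Drop the diagonal (each vector compared with itself).
    off_diagonal = sims[~np.eye(len(unit), dtype=bool)]
    return float(off_diagonal.mean())


rng = np.random.default_rng(0)
# Hypothetical stand-ins for predictions on 8 different inputs.
collapsed = np.ones((8, 768)) + 1e-3 * rng.normal(size=(8, 768))
diverse = rng.normal(size=(8, 768))

collapsed_score = mean_pairwise_cosine(collapsed)
diverse_score = mean_pairwise_cosine(diverse)
print("collapsed:", collapsed_score)
print("diverse:", diverse_score)
```

A score close to 1 on predictions for clearly different inputs would confirm the collapse. Running the same measurement on the target vectors shows how much spread the model is being asked to reproduce in the first place.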

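# Assumes Keras 2.x; AlbertLayer is my custom layer and is not shown here.
from keras.layers import Input, RepeatVector, LSTM, Flatten, Dense
from keras.losses import cosine_proximity
from keras.models import Model

max_seq_length = 400
in_id = Input(shape=(max_seq_length,), name="input_ids")
in_mask = Input(shape=(max_seq_length,), name="input_masks")
in_segment = Input(shape=(max_seq_length,), name="segment_ids")

albert_inputs = [in_id, in_mask, in_segment]
albert_output = AlbertLayer(n_fine_tune_layers=3, pooling="first")(albert_inputs)
# RepeatVector(1) turns the pooled vector into a length-1 sequence for the LSTM.
x = RepeatVector(1)(albert_output)
x = LSTM(units=512, return_sequences=False, recurrent_dropout=0.3, dropout=0.3)(x)
x = Flatten()(x)
embedding_output = Dense(768)(x)

model = Model(inputs=albert_inputs, outputs=embedding_output)
model.compile(loss=cosine_proximity, optimizer='adam')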
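To make explicit what the loss in `model.compile` rewards, here is a minimal NumPy sketch. It assumes the loss is the negative cosine similarity of L2-normalised vectors, which is how `cosine_proximity` is defined in Keras 2.x as far as I know. The name `cosine_proximity_np` and the toy vectors are hypothetical and only for illustration.

```python
import numpy as np


def cosine_proximity_np(y_true, y_pred, eps=1e-12):
    """Negative cosine similarity per row (assumed Keras 2.x behaviour)."""
    t = np.asarray(y_true, dtype=np.float64)
    p = np.asarray(y_pred, dtype=np.float64)
    # Normalise both sides, so only the direction of each vector matters.
    t = t / np.maximum(np.linalg.norm(t, axis=-1, keepdims=True), eps)
    p = p / np.maximum(np.linalg.norm(p, axis=-1, keepdims=True), eps)
    return -np.sum(t * p, axis=-1)


target = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
same_direction = np.array([[5.0, 0.0, 0.0], [0.0, 0.1, 0.0]])  # rescaled targets
orthogonal = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
opposite = -target

loss_same = cosine_proximity_np(target, same_direction)
loss_orthogonal = cosine_proximity_np(target, orthogonal)
loss_opposite = cosine_proximity_np(target, opposite)
print(loss_same, loss_orthogonal, loss_opposite)
```

Under this definition the loss is -1 for a prediction that points the same way as the target, 0 for an orthogonal one and +1 for the opposite direction. The length of the predicted vector has no effect on it. If most target vectors point in a similar direction, a single constant output can therefore already score close to -1 for every example.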

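As for the target variable, I have a corresponding vector for each training instance. A single input vector may have several target vectors. In that case, I use a separate training instance for each target vector of that input.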
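The one-instance-per-target setup described above can be sketched as follows. This is an illustration under assumptions, not the original preprocessing: `expand_multi_target` is a hypothetical helper, and the toy arrays use tiny shapes in place of the 400-token inputs and 768-dimensional targets.

```python
import numpy as np


def expand_multi_target(inputs, targets_per_input):
    """Repeat each input once per target so rows pair up one-to-one."""
    xs, ys = [], []
    for x, targets in zip(inputs, targets_per_input):
        for t in targets:
            xs.append(x)
            ys.append(t)
    return np.stack(xs), np.stack(ys)


# Toy data: two inputs of length 4; the first one has two target vectors.
inputs = np.array([[1, 2, 3, 0], [4, 5, 0, 0]])
targets_per_input = [
    [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])],
    [np.array([0.0, 0.0, 1.0])],
]

x_train, y_train = expand_multi_target(inputs, targets_per_input)
print(x_train.shape, y_train.shape)
```

With the three-input model, the same repetition would have to be applied to the ids, masks and segment ids alike. One consequence of this setup: when an identical input appears with several targets, a cosine-based loss for that input is minimised by the direction of the sum of its normalised targets, not by any single one of them.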


deep-learning keras nlp word-embedding