Huggingface Transformers: converting logit scores to probabilities

Problem description

I am a beginner in this field and I am stuck. I am following this tutorial (https://towardsdatascience.com/multi-label-multi-class-text-classification-with-bert-transformer-and-keras-c6355eccb63a) to build a multi-label classifier with Huggingface transformers.

Below is the code I used to train the model:

# Required imports (assuming TensorFlow 2.x and the Huggingface transformers library)
from transformers import BertConfig, BertTokenizerFast, TFBertModel
from tensorflow.keras.layers import Input, Dropout, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.metrics import CategoricalAccuracy
from tensorflow.keras.initializers import TruncatedNormal
from tensorflow.keras.utils import to_categorical

# Name of the BERT model to use
model_name = 'bert-base-uncased'
# Max length of tokens
max_length = 100

PATH = 'uncased_L-12_H-768_A-12/'

# Load transformers config and set output_hidden_states to False
config = BertConfig.from_pretrained(PATH)
config.output_hidden_states = False

# Load BERT tokenizer
tokenizer = BertTokenizerFast.from_pretrained(PATH, local_files_only=True, config=config)
# tokenizer = BertTokenizer.from_pretrained(PATH, config=config)

# Load the Transformers BERT model
transformer_model = TFBertModel.from_pretrained(PATH, config=config, from_pt=True)
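# Note: from_pt=True converts the PyTorch checkpoint in PATH to TensorFlow weights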

#######################################
### ------- Build the model ------- ###

# Load the MainLayer
bert = transformer_model.layers[0]

# Build your model input
input_ids = Input(shape=(None,), name='input_ids', dtype='int32')
# attention_mask = Input(shape=(max_length,), name='attention_mask', dtype='int32')
# inputs = {'input_ids': input_ids, 'attention_mask': attention_mask}
inputs = {'input_ids': input_ids}

# Load the Transformers BERT model as a layer in a Keras model
bert_model = bert(inputs)[1]  # index 1 is the pooled [CLS] output
dropout = Dropout(config.hidden_dropout_prob, name='pooled_output')
# Passing training=False here hard-codes the flag, so dropout stays off even during fit()
pooled_output = dropout(bert_model, training=False)


# Then build your model output: one unit per class in the U_H_Label column.
# No activation is applied, so the model outputs raw logits.
issue = Dense(
    units=len(data['U_H_Label'].value_counts()),
    kernel_initializer=TruncatedNormal(stddev=config.initializer_range),
    name='issue')(pooled_output)
outputs = {'issue': issue}

# And combine it all in a model object
model = Model(inputs=inputs, outputs=outputs, name='BERT_MultiLabel_MultiClass')

# Take a look at the model
model.summary()


#######################################
### ------- Train the model ------- ###

# Set an optimizer
optimizer = Adam(
    learning_rate=5e-05, epsilon=1e-08, decay=0.01, clipnorm=1.0)

# Set loss and metrics
loss = {'issue': CategoricalCrossentropy(from_logits=True)}
# loss = {'issue': CategoricalCrossentropy()}
metric = {'issue': CategoricalAccuracy('accuracy')}
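# Note: with from_logits=True the softmax is applied inside the loss function,
# so model.predict() returns raw, unnormalized logits (see the Solution below).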

# Compile the model
model.compile(optimizer=optimizer, loss=loss, metrics=metric)

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
le.fit(data['U_H_Label'])

# Ready output data for the model
y_issue = to_categorical(le.transform(data['U_H_Label']))

# Tokenize the input (takes some time)
x = tokenizer(
    text=data['Input_Data'].to_list(),
    add_special_tokens=True,
    max_length=max_length,
    truncation=True,
    padding=True,
    return_tensors='tf',
    return_token_type_ids=False,
    return_attention_mask=True,
    verbose=True)

# Fit the model
history = model.fit(
    # x={'input_ids': x['input_ids'], 'attention_mask': x['attention_mask']},
    x={'input_ids': x['input_ids']},
    y={'issue': y_issue},
    validation_split=0.2,
    batch_size=64,
    epochs=10)

When I use the model.predict() function, I believe I get logit scores for each class, and I would like to convert them into probability scores between 0 and 1.

I have read in several blog posts that I need to use the softmax function, but I cannot figure out where or how to apply it. I would be grateful if someone could tell me which lines of code I need!

Solution

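Because the model's final Dense layer has no activation and the loss was compiled with from_logits=True, model.predict() returns raw logits. To turn them into probabilities between 0 and 1 that sum to 1 across the classes, apply a softmax over the class axis after prediction. Here is a minimal sketch; new_texts is a hypothetical example input, while model, tokenizer, max_length, and le come from the code above:

import tensorflow as tf

# Tokenize the texts to classify, the same way as the training data
new_texts = ['example input text']
x_new = tokenizer(
    text=new_texts,
    add_special_tokens=True,
    max_length=max_length,
    truncation=True,
    padding=True,
    return_tensors='tf',
    return_token_type_ids=False,
    return_attention_mask=True)

# predict() returns raw logits because the model head has no activation
preds = model.predict({'input_ids': x_new['input_ids']})
# Depending on the TensorFlow version, a model built with a named output dict
# returns either a dict keyed by output name or a plain array
logits = preds['issue'] if isinstance(preds, dict) else preds

# Softmax turns each row of logits into probabilities in [0, 1] that sum to 1
probs = tf.nn.softmax(logits, axis=-1).numpy()

# Map the most probable class index back to the original label string
predicted_labels = le.inverse_transform(probs.argmax(axis=-1))
print(probs)
print(predicted_labels)

Alternatively, you can build the softmax into the model itself: give the Dense layer activation='softmax' and compile with CategoricalCrossentropy(from_logits=False) (the commented-out loss line above); model.predict() will then return probabilities directly.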