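Problem description
I built a text classifier with sklearn that uses Tf-Idf features, and I would like to use BERT and ELMo embeddings in place of Tf-Idf.
How would one go about this?
I am currently getting BERT embeddings with the following code: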
from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings
# init embedding
embedding = TransformerWordEmbeddings('bert-base-uncased')
# create a sentence
sentence = Sentence('The grass is green .')
# embed words in sentence
embedding.embed(sentence)
import pandas as pd
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LogisticRegression
column_trans = ColumnTransformer([
    ('tfidf', TfidfVectorizer(), 'text'),
    ('number_scaler', MinMaxScaler(), ['number'])
])
# Initialize data
data = [
    ['This process,however,afforded me no means of.', 20, 1],
    # The target for this row is missing in the source; 1 is assumed here
    ['another long description', 21, 1],
    ['It never once occurred to me that the fumbling', 19, 0],
    ['How lovely is spring As we looked from Windsor', 18, 0]
]
# Create DataFrame
df = pd.DataFrame(data,columns=['text','number','target'])
X = column_trans.fit_transform(df)
X = X.toarray()
y = df.loc[:,"target"].values
# Perform classification
classifier = LogisticRegression(random_state=0)
classifier.fit(X,y)
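One possible direction, as a minimal sketch rather than a confirmed solution: wrap the embedding step in a small sklearn-compatible transformer and put it where TfidfVectorizer sits in the ColumnTransformer. The names SentenceEmbedder, make_flair_bert_embed_fn and build_classifier are hypothetical helpers introduced here for illustration. Averaging the per-token vectors into one vector per text is also an assumption; other pooling strategies may work better.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


class SentenceEmbedder(BaseEstimator, TransformerMixin):
    """Hypothetical helper: maps a column of strings to fixed-size vectors.

    embed_fn takes one string and returns a 1-D numeric vector. The
    embedding model is pretrained, so fit() has nothing to learn.
    """

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # One row per input text, one column per embedding dimension
        return np.vstack([np.asarray(self.embed_fn(text), dtype=float) for text in X])


def make_flair_bert_embed_fn(model_name='bert-base-uncased'):
    """Hypothetical helper: mean-pools flair token embeddings per text."""
    # Imported lazily so the rest of the sketch works without flair installed
    from flair.data import Sentence
    from flair.embeddings import TransformerWordEmbeddings

    embedding = TransformerWordEmbeddings(model_name)

    def embed(text):
        sentence = Sentence(text)
        embedding.embed(sentence)
        # Each token holds a torch tensor; average them (assumed pooling choice)
        vectors = [token.embedding.detach().cpu().numpy() for token in sentence]
        return np.mean(vectors, axis=0)

    return embed


def build_classifier(embed_fn):
    """Same layout as the Tf-Idf version, with the text column embedded instead."""
    column_trans = ColumnTransformer([
        ('embedding', SentenceEmbedder(embed_fn), 'text'),
        ('number_scaler', MinMaxScaler(), ['number']),
    ])
    return Pipeline([
        ('features', column_trans),
        ('classifier', LogisticRegression(random_state=0)),
    ])
```

With flair installed, this could be used as build_classifier(make_flair_bert_embed_fn()).fit(df[['text', 'number']], df['target']); the first call downloads the BERT weights. Because the transformer only depends on embed_fn, an ELMo-based function with the same one-string-in, one-vector-out shape could be swapped in without changing the rest of the pipeline.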