How can LDA and CorEx be compared? Is it even possible?

Problem description

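I am learning and testing LDA and CorEx for topic modeling on the 20 Newsgroups data. I have been following these tutorials: CorEx, LDA gensim.

I don't have a statistics background, but I noticed that, when it comes to evaluation, LDA is scored with coherence (for example UMass) and perplexity, while CorEx is scored with total correlation.

Are coherence and total correlation comparable? If not, how can I compare the two methods?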
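For reference, the quantities named above are defined quite differently. The following is a sketch of the standard definitions; the notation (X for the word variables, Y for the latent topics, w_d for a held-out document of N_d tokens) is assumed here and not taken from either tutorial.

```latex
% Total correlation of the word variables X = (X_1, ..., X_n)
TC(X) = \sum_{i=1}^{n} H(X_i) - H(X)
      = D_{\mathrm{KL}}\Big( p(x) \,\Big\|\, \prod_{i=1}^{n} p(x_i) \Big)

% Total correlation explained by the latent topics Y, the quantity CorEx maximizes
TC(X; Y) = TC(X) - TC(X \mid Y)

% Perplexity of held-out documents w_1, ..., w_M with N_d tokens each
\mathrm{perplexity} = \exp\Big( -\frac{\sum_{d=1}^{M} \log p(w_d)}{\sum_{d=1}^{M} N_d} \Big)
```

Total correlation and perplexity are both computed from a fitted model's own probabilities, so each is tied to that model. Coherence is computed only from a topic's top words and word co-occurrence counts in a reference corpus, which is why it can be applied to the output of either model.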

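=== Update ===

I found this code in the CorEx GitHub repository. I used it to measure coherence for CorEx.

It assumes that documents is a list of documents, each of which is a list of tokens, and that you have a trained CorEx model called corex_model.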

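from gensim.models.coherencemodel import CoherenceModel
from gensim import corpora
from tqdm import tqdm  # progress bar for the corpus loop

# Create the term dictionary, where every unique term is assigned an index
dictionary = corpora.Dictionary(documents)

# Create the bag-of-words corpus using the dictionary prepared above
corpus = [dictionary.doc2bow(doc) for doc in tqdm(documents)]

# Get the top words for each topic from the trained CorEx model.
# Assumes the model was fit with its vocabulary, so entries hold word strings.
topics = corex_model.get_topics(n_words=100)
# Keep only the first element (the word) of each entry; depending on the
# CorEx version, an entry may carry more fields than (word, score).
corex_topic_words = [[entry[0] for entry in topic] for topic in topics]

# Compute the c_v coherence score for the CorEx topics
cm_corex = CoherenceModel(topics=corex_topic_words, texts=documents,
                          corpus=corpus, dictionary=dictionary, coherence='c_v')
coherence_corex = cm_corex.get_coherence()
print(coherence_corex)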
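To put both models on one identical metric, a UMass-style coherence can also be computed directly from each model's ranked top words and the tokenized documents, with no dependence on how the model was trained. The following is a minimal sketch, not gensim's implementation: the function name umass_coherence, the smoothing constant eps and the toy documents are assumptions made for illustration. It sums the pairwise scores, whereas gensim's coherence='u_mass' may aggregate them differently, so absolute values need not match.

```python
import math
from itertools import combinations


def umass_coherence(topic_words, documents, eps=1.0):
    """UMass-style coherence for one topic.

    topic_words: top words of the topic, ordered from most to least important.
    documents:   list of tokenized documents (each a list of tokens).
    eps:         smoothing constant added to co-document counts (assumed 1.0).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every one of the given words
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i, j in combinations(range(len(topic_words)), 2):
        # i < j, so topic_words[i] is the higher-ranked word of the pair
        w_hi, w_lo = topic_words[i], topic_words[j]
        df_hi = doc_freq(w_hi)
        if df_hi == 0:
            continue  # skip words that never occur in the reference documents
        score += math.log((doc_freq(w_hi, w_lo) + eps) / df_hi)
    return score


# Toy data (hypothetical), standing in for the tokenized 20 Newsgroups documents
toy_documents = [["a", "b"], ["a", "b"], ["a", "c"], ["d"]]
coherent_score = umass_coherence(["a", "b"], toy_documents)    # words co-occur
incoherent_score = umass_coherence(["a", "d"], toy_documents)  # words never co-occur
print(coherent_score, incoherent_score)
```

The same function can be called on corex_topic_words and on the top-word lists of an LDA model, and the per-topic scores can then be averaged for each model. Scores closer to zero indicate top words that co-occur more often in the documents.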

