Problem description
I'm running into a problem when using gensim's CoherenceModel.
My code is:
    import gensim
    from gensim.models import CoherenceModel

    def compute_coherence_values(dictionary, corpus, texts, limit, start, step):
        coherence_values = []
        model_list = []
        # Train one LDA model per candidate topic count and score each with c_v
        for num_topics in range(start, limit, step):
            model = gensim.models.ldamodel.LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics)
            model_list.append(model)
            coherencemodel = CoherenceModel(model=model, texts=texts, dictionary=dictionary, coherence="c_v")
            coherence_values.append(coherencemodel.get_coherence())
        return model_list, coherence_values
    coherence_values = []
    model_list = []
    # Number of topics to evaluate
    nt = pre_nt
    start_ = nt
    limit_ = nt + 1
    step_ = 1
    model_list1, coherence_values1 = compute_coherence_values(dictionary=id2word, corpus=corpus, texts=texts_wi_new, start=start_, limit=limit_, step=step_)
The error is:
Traceback (most recent call last):
File "<string>",line 1,in <module>
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\spawn.py",line 105,in spawn_main
Traceback (most recent call last):
File "<input>",line 3,in <module>
File "<input>",line 92,in compute_coherence_values
File "D:\All Python\venv\lib\site-packages\gensim\models\coherencemodel.py",line 609,in get_coherence
confirmed_measures = self.get_coherence_per_topic()
File "D:\All Python\venv\lib\site-packages\gensim\models\coherencemodel.py",line 569,in get_coherence_per_topic
self.estimate_probabilities(segmented_topics)
File "D:\All Python\venv\lib\site-packages\gensim\models\coherencemodel.py",line 541,in estimate_probabilities
self._accumulator = self.measure.prob(**kwargs)
File "D:\All Python\venv\lib\site-packages\gensim\topic_coherence\probability_estimation.py",line 156,in p_boolean_sliding_window
return accumulator.accumulate(texts,window_size)
File "D:\All Python\venv\lib\site-packages\gensim\topic_coherence\text_analysis.py",line 444,in accumulate
workers,input_q,output_q = self.start_workers(window_size)
File "D:\All Python\venv\lib\site-packages\gensim\topic_coherence\text_analysis.py",line 478,in start_workers
worker.start()
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\process.py",line 112,in start
self._popen = self._Popen(self)
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\context.py",line 223,in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\context.py",line 322,in _Popen
return Popen(process_obj)
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\popen_spawn_win32.py",line 89,in __init__
reduction.dump(process_obj,to_child)
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\reduction.py",line 60,in dump
ForkingPickler(file,protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
exitcode = _main(fd)
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\spawn.py",line 114,in _main
prepare(preparation_data)
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\spawn.py",line 225,in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\multiprocessing\spawn.py",line 277,in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\runpy.py",line 261,in run_path
code,fname = _get_code_from_file(run_name,path_name)
File "C:\Users\lee96\AppData\Local\Programs\Python\python37\Lib\runpy.py",line 231,in _get_code_from_file
with open(fname,"rb") as f:
OSError: [Errno 22] Invalid argument: 'D:\\All Python\\<input>'
The error occurs at this call:
    coherencemodel.get_coherence()
I'm using PyCharm. How can I fix this?
Solution
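I hit exactly the same problem with exactly the same code. It ran fine from the Spyder IDE, but it failed once I plugged it into Power BI. To narrow it down, I took the code out of the function and the loop and reduced it to the basic lines below. The LDA model and the CoherenceModel are both created without trouble, but for some reason the call to get_coherence() raises an error.
    model = gensim.models.ldamodel.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=10)
    coherencemodel = CoherenceModel(model=model, texts=texts, dictionary=dictionary, coherence='c_v')
    test = coherencemodel.get_coherence()
Here is part of the error message I received: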
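RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable.
Details: DataSourceKind=Python DataSourcePath=Python Message=Python script error.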
I looked into this further and found a few other posts that helped. In the end, the error appears to come from how multiprocessing works on Windows.
where to put freeze_support() in a Python script? https://docs.python.org/2/library/multiprocessing.html#windows
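To show why the guard matters, here is a minimal standard-library sketch that has nothing to do with gensim; `square` and `parallel_squares` are made-up names used only for illustration. On Windows, multiprocessing starts children with the spawn method, and each child re-imports the main module from its file. Code that starts processes therefore has to sit behind `if __name__ == "__main__":`, and the script has to exist as a real file. That would also explain the `OSError: [Errno 22] Invalid argument: 'D:\\All Python\\<input>'` in the question's traceback, where the code was apparently run from a console rather than from a file.

```python
from multiprocessing import Pool, freeze_support


def square(value):
    """Work done in a child process; defined at module level so it can be pickled."""
    return value * value


def parallel_squares(values, processes=2):
    """Map square() over values using a pool of worker processes."""
    with Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Under the spawn start method each child re-imports this file.
    # The guard keeps the children from re-running the lines below
    # and trying to start worker pools of their own.
    freeze_support()  # only has an effect in a frozen Windows executable
    print(parallel_squares([1, 2, 3]))  # prints [1, 4, 9]
```

Save this as a `.py` file and run it as a script; pasting it into an interactive console on Windows may fail because the children have no file to re-import.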
What worked for me was putting all of my code under the following lines:
    from multiprocessing import freeze_support

    if __name__ == '__main__':
        freeze_support()
        model_list, coherence_values = compute_coherence_values(dictionary=dictionary, corpus=corpus, texts=texts, start=start, limit=limit, step=step)
        # Keep the model with the highest coherence score
        max_value = max(coherence_values)
        max_index = coherence_values.index(max_value)
        best_model = model_list[max_index]
        ldamodel = best_model
I'm not the best Python developer, but I got it working for what I need. If anyone has a better suggestion, I'm all ears :)
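A possible alternative, offered as an untested sketch rather than a confirmed fix: CoherenceModel accepts a `processes` argument, and passing `processes=1` should keep the probability estimation in the current process, so no worker processes are spawned at all. The helper name `single_process_coherence` is made up for this example, and I have not verified the behavior under PyCharm's console or Power BI.

```python
def single_process_coherence(model, texts, dictionary, coherence="c_v"):
    """Score a trained topic model without starting worker processes."""
    # Imported inside the function so the helper can be defined
    # before gensim itself is loaded.
    from gensim.models import CoherenceModel

    coherence_model = CoherenceModel(
        model=model,
        texts=texts,
        dictionary=dictionary,
        coherence=coherence,
        processes=1,  # assumption: 1 avoids the multiprocessing code path
    )
    return coherence_model.get_coherence()
```

Inside the loop of `compute_coherence_values`, this would be called as `coherence_values.append(single_process_coherence(model, texts, dictionary))`. The trade-off is speed: scoring runs on a single core, which may be slow for large corpora.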