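Problem description
I am trying to summarize the body of each text, and the console reports "max() arg is an empty sequence" at the step that computes the highest word frequency for each document. My dataset has 20,000 rows, and the code works fine on the first 7,000.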
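import spacy
from heapq import nlargest
from string import punctuation
from spacy.lang.en.stop_words import STOP_WORDS

# Assumption: an English pipeline; the question does not say which model is loaded.
nlp = spacy.load("en_core_web_sm")
stop_words = list(STOP_WORDS)

# data is the asker's dataset (about 20,000 rows with a text column).
for i in range(0, len(data)):
    doc = nlp(data.text[i])
    tokens = [token.text for token in doc]

    # Count every token that is neither a stop word nor punctuation.
    word_freq = {}
    for word in doc:
        if word.text.lower() not in stop_words:
            if word.text.lower() not in punctuation:
                if word.text not in word_freq.keys():
                    word_freq[word.text] = 1
                else:
                    word_freq[word.text] += 1

    # The error is raised on the next line.
    max_freq = max(word_freq.values())
    for word in word_freq.keys():
        word_freq[word] = word_freq[word] / max_freq

    # Score each sentence by summing the normalized frequencies of its words.
    sent_tokens = [sent for sent in doc.sents]
    sent_score = {}
    for sent in sent_tokens:
        for word in sent:
            if word.text.lower() in word_freq.keys():
                if sent not in sent_score.keys():
                    sent_score[sent] = word_freq[word.text.lower()]
                else:
                    sent_score[sent] += word_freq[word.text.lower()]

    # Keep the top 10% of sentences as the summary.
    N = int(len(sent_score) * 0.1)
    summary = nlargest(n=N, iterable=sent_score, key=sent_score.get)
    final_summary = [word.text for word in summary]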
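The error message means that word_freq was empty when max() was called, which happens whenever every token in a row is filtered out. The sketch below reproduces that situation without spaCy. The small STOP_WORDS set and the count_words helper are hypothetical stand-ins for the asker's setup, not part of the original code.

```python
from string import punctuation

# Hypothetical, tiny stop-word set standing in for spaCy's STOP_WORDS.
STOP_WORDS = {"the", "is", "a", "it", "this", "that"}

def count_words(tokens):
    """Count tokens that are neither stop words nor punctuation."""
    word_freq = {}
    for tok in tokens:
        key = tok.lower()
        if key in STOP_WORDS or key in punctuation:
            continue
        word_freq[key] = word_freq.get(key, 0) + 1
    return word_freq

# Every token here is a stop word or punctuation, so nothing is counted.
empty_case = count_words(["It", "is", "this", "."])

try:
    max(empty_case.values())  # max() over an empty dict view
except ValueError as exc:
    error_message = str(exc)  # exact wording differs between Python versions
```

An empty string, or a row that contains only whitespace, numbers of stop words, or punctuation, would produce the same empty dictionary.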
Solution
No confirmed solution has been posted for this question yet.
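One likely cause, not confirmed by the asker, is that some row after the first 7,000 contains no countable words (empty text, or only stop words and punctuation), so word_freq is empty when max() runs. A minimal sketch of a guard follows. The helper names normalize and top_sentences are hypothetical, and keeping at least one sentence is a design choice added here, not something the original code does.

```python
from heapq import nlargest

def normalize(word_freq):
    """Scale counts to the range (0, 1]; return {} when nothing was counted."""
    if not word_freq:
        return {}
    max_freq = max(word_freq.values())
    return {word: count / max_freq for word, count in word_freq.items()}

def top_sentences(sent_score, ratio=0.1):
    """Return the highest-scoring sentences, keeping at least one if any exist."""
    n = max(1, int(len(sent_score) * ratio))
    return nlargest(n, sent_score, key=sent_score.get)
```

Inside the loop, a row for which normalize returns an empty dictionary can be skipped with continue, or given an empty summary. Note also that the original code stores counts under word.text but looks them up with word.text.lower(), so capitalized words never contribute to a sentence score, and int(len(sent_score) * 0.1) is 0 for texts with fewer than ten scored sentences, which yields an empty summary.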