Write a function to find the 10 most common words in the Brown corpus that are not stopwords

Problem description

Hi all. I'm a math grad student, and this is my first real attempt at coding...

Here is what I wrote:

lowerbrown = [w.lower() for w in brown.words()]
stopwords = nltk.corpus.stopwords.words('english')
notstopwords = [w for w in lowerbrown if w not in stopwords]
cfd = nltk.ConditionalFreqDist(
    (genre,word)
    for genre in brown.categories()
    for word in notstopwords)

genres = ['news','editorial','reviews','religion','hobbies','lore','belles_lettres','government','learned','fiction','mystery','science_fiction','adventure','romance','humor']

result = cfd()
cfd[10:]

Here is what I get when I run it:

TypeError                                 Traceback (most recent call last)
<ipython-input-43-0b8092b42663> in <module>
      9 genres = ['news','humor']
     10 
---> 11 result = cfd()
     12 cfd[10:]

TypeError: 'ConditionalFreqDist' object is not callable

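I'm sure my code is just... a mess. Still, I would really appreciate any help with this.

Update: I looked at solutions to similar questions and modified my code: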

lowerbrown = [w.lower() for w in brown.words()]
stopwords = nltk.corpus.stopwords.words('english')
notstopwords = [w for w in lowerbrown if w not in stopwords]
cfd = nltk.ConditionalFreqDist(
    (genre,word)
    for genre in brown.categories()
    for word in notstopwords)

genres = brown.categories()
cfd.tabulate(conditions=genres)

But I'm not sure how to modify the code so that it gives me only the ten most common words.

Solution
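One possible approach, offered as a sketch rather than a definitive answer. The `TypeError` comes from `cfd()`: a `ConditionalFreqDist` is not a function, so it is indexed by condition instead, for example `cfd['news'].most_common(10)`. The generator in the question also pairs every genre with every word of the whole corpus, so each genre ends up with identical counts; `brown.words(categories=genre)` restricts the words to a single genre. The helper names `top_non_stopwords` and `brown_top_words` below are hypothetical, and dropping non-alphabetic tokens (`alpha_only`) is an assumption made so that punctuation does not crowd out real words. Counting is done with `collections.Counter`; NLTK's `FreqDist` is a `Counter` subclass, so `most_common` behaves the same way there.

```python
from collections import Counter


def top_non_stopwords(words, stop_words, n=10, alpha_only=True):
    """Return the n most frequent words that are not stopwords.

    words      -- any iterable of string tokens
    stop_words -- iterable of words to exclude (compared in lowercase)
    alpha_only -- assumption: also skip punctuation and number tokens
    """
    stop = {w.lower() for w in stop_words}  # a set makes membership tests fast
    counts = Counter()
    for w in words:
        w = w.lower()
        if alpha_only and not w.isalpha():
            continue
        if w not in stop:
            counts[w] += 1
    return counts.most_common(n)  # list of (word, count) pairs


def brown_top_words(n=10):
    """Top-n non-stopwords for the whole Brown corpus and for each genre.

    Needs NLTK plus the 'brown' and 'stopwords' data packages.
    """
    # Imported here so top_non_stopwords stays usable without NLTK installed.
    from nltk.corpus import brown, stopwords

    stop = stopwords.words('english')
    overall = top_non_stopwords(brown.words(), stop, n)
    per_genre = {
        genre: top_non_stopwords(brown.words(categories=genre), stop, n)
        for genre in brown.categories()
    }
    return overall, per_genre
```

Calling `overall, per_genre = brown_top_words()` returns a list of `(word, count)` pairs for the whole corpus and a dict of such lists keyed by genre. The corpus data can be fetched once with `nltk.download('brown')` and `nltk.download('stopwords')`.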
