StopIteration error/exception when using multiple if conditions and next()

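Problem description

I'm using Jupyter Notebook here.

The code is from a YouTube video. It works on the YouTuber's machine, but on mine it raises a StopIteration error.

Here I'm trying to get all the titles (of questions from a CSV) related to the "Go" language.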

import pandas as pd

df = pd.read_csv("Questions.csv",encoding = "ISO-8859-1",usecols = ["Title","Id"])

titles = [_ for _ in df.loc[lambda d: d['Title'].str.lower().str.contains(" go "," golang ")]['Title']]

# New cell

import spacy

nlp = spacy.load("en_core_web_sm",disable= ["ner"])

# New cell

def has_golang(text):
    doc = nlp(text)
    for t in doc:    
        if t.lower_ in [' go ','golang']:
            if t.pos_ != 'VERB':
                if t.dep_ == 'pobj':
                    return True
    return False

g = (title for title in titles if has_golang(title))
[next(g) for i in range(10)]

# This is the error

StopIteration                             Traceback (most recent call last)
<ipython-input-56-862339d10dde> in <module>
      9 
     10 g = (title for title in titles if has_golang(title))
---> 11 [next(g) for i in range(10)]

<ipython-input-56-862339d10dde> in <listcomp>(.0)
      9 
     10 g = (title for title in titles if has_golang(title))
---> 11 [next(g) for i in range(10)]

StopIteration: 

From the research I've done, I think this might be a bug.

All I want to do is get the titles that satisfy the 3 "if" conditions.

link to the youtube video

Solution

StopIteration is the result of calling next() on an exhausted iterator, i.e. g yields fewer than 10 results. You can find this in the output of help():

help(next)
Help on built-in function next in module builtins:
next(...)
    next(iterator[, default])
    
    Return the next item from the iterator. If default is given and the iterator
    is exhausted, it is returned instead of raising StopIteration.
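
As a small illustration of the default argument described in the help text, here is a minimal sketch. The sample titles and the mentions_go predicate are hypothetical stand-ins (not the asker's CSV data and not the spaCy-based has_golang), chosen so the snippet runs without any extra packages.

```python
# Hypothetical sample data standing in for the titles read from the CSV.
titles = ["Learning Go channels", "Python list question", "Deploying with golang"]

def mentions_go(title):
    # Simplified stand-in predicate (an assumption, not the spaCy version).
    return any(word in ("go", "golang") for word in title.lower().split())

g = (title for title in titles if mentions_go(title))

# Only two titles match, so the third call returns the default (None)
# instead of raising StopIteration.
results = [next(g, None) for _ in range(3)]
print(results)
```

With a default supplied, an exhausted generator shows up as None entries in the list rather than as an exception.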

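Edit

Your has_golang is incorrect. The first test can never match ' go ' because nlp tokenizes the text into words, i.e. the tokens carry no leading or trailing whitespace. Try this: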

def has_golang(text):
    doc = nlp(text)
    for t in doc:    
        if t.lower_ in ['go','golang']:
            if t.pos_ != 'VERB':
                if t.dep_ == 'pobj':
                    return True
    return False

I tracked this down by finding a title for which has_golang should return True. Then I ran the following code:

doc = nlp("Making a Simple FileServer with Go and Localhost Refused to Connect")
print("\n".join(str((t.lower_,t.pos_,t.dep_)) for t in doc))
('making','VERB','csubj')
('a','DET','det')
('simple','PROPN','compound')
('fileserver','dobj')
('with','ADP','prep')
('go','PROPN','pobj')
('and','CCONJ','cc')
('localhost','conj')
('refused','ROOT')
('to','PART','aux')
('connect','xcomp')

Looking at ('go','PROPN','pobj'), it's clear that PROPN is not VERB and pobj is pobj, so the problem must be in the token text: it is "go", not " go ".
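
The whitespace point can be checked without spaCy. In the sketch below, str.split() is only a rough stand-in for the spaCy tokenizer (an assumption made for illustration); what it shares with spaCy tokens is that the pieces carry no surrounding spaces.

```python
title = "Making a Simple FileServer with Go and Localhost Refused to Connect"

# str.split() is a rough stand-in for spaCy's tokenizer (an assumption for
# illustration); like spaCy tokens, the pieces have no surrounding spaces.
tokens = title.lower().split()

# The padded string ' go ' never equals a token, the bare string 'go' does.
padded_match = any(tok in [' go ', 'golang'] for tok in tokens)
plain_match = any(tok in ['go', 'golang'] for tok in tokens)
print(padded_match, plain_match)
```

The padded comparison fails for this title while the bare one succeeds, which is the same effect as in the original has_golang.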


Original reply

If you just want the titles that satisfy the 3 if conditions, skip the generator:

g = list(filter(has_golang,titles))

If you need the generator but also want a list:

g = (title for title in titles if has_golang(title))
list(g)
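
If the goal is still "at most the first 10 matches", itertools.islice is one option that stops quietly when the generator runs out. This is a minimal sketch: the titles and the has_go predicate are hypothetical stand-ins so that it runs without spaCy or the CSV file.

```python
from itertools import islice

# Hypothetical data: only four titles, fewer than the ten requested.
titles = [f"Question {i} about go" for i in range(4)]

def has_go(title):
    # Simplified stand-in for has_golang (an assumption, no spaCy involved).
    return "go" in title.lower().split()

g = (title for title in titles if has_go(title))

# islice takes up to 10 items and simply stops early if g is exhausted,
# so no StopIteration escapes.
first_ten = list(islice(g, 10))
print(len(first_ten))
```

Unlike calling next(g) ten times, this returns a shorter list when fewer than ten titles match.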