Why do I get "ValueError: Inconsistent number of columns" when reading sentences from a .conll file?

Problem description

from nltk.corpus.reader.conll import ConllCorpusReader

READER = ConllCorpusReader(
    root="./",
    fileids=".conll",
    columntypes=('words', 'pos', 'tree', 'chunk', 'ne', 'srl', 'ignore'),
)

# myConLLfile holds the name of the .conll file to read
READER.sents(myConLLfile)

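I am extracting sentences from a .conll file as lists of strings. The code above does not report any error, so I assumed that something was extracted for every sentence. However, when I try to print each sentence or add POS tags to it, every sentence after the 1007th raises the ValueError shown below.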

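What is going on? Is there a way to look at the sentences that were extracted but are badly structured?
How do I extract the sentences correctly? My guess is that some tokens are represented as a tuple of a string and an OBI tag rather than as a plain string. But it is strange that so many sentences produce the same error.
In the worst case, can I extract only the well-formed sentences?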

i = 0
for sentence in READER.sents(myConLLfile):
    print(i)
    print(sentence)
    i += 1

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-125-9c03d8d69ec0> in <module>()
      1 i = 0
----> 2 for sentence in READER.sents(myConLLfile):
      3     print(i)
      4     print(sentence)
      5     i += 1

2 frames
/usr/local/lib/python3.6/dist-packages/nltk/corpus/reader/conll.py in _read_grid_block(self,stream)
    206                 if len(row) != len(grid[0]):
    207                     raise ValueError('Inconsistent number of columns:\n%s'
--> 208                                      % block)
    209             grids.append(grid)
    210         return grids

ValueError: Inconsistent number of columns:
This    O
guy O
needs   O
his O
own O
show    O
on  O
discivery   B-corporation
Channel I-corporation
!   O
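
One way to look at the malformed sentences is to repeat the reader's consistency check by hand. The traceback shows that `_read_grid_block` raises as soon as any row in a block has a different number of columns than the first row of that block. The sketch below is only an approximation of that check in plain Python, not NLTK's own code: it assumes blocks are separated by blank lines and columns by whitespace, and the names `split_blocks` and `check_blocks` and the sample text are made up for illustration.

```python
def split_blocks(text):
    """Split CoNLL-style text into blocks (lists of non-blank lines)."""
    blocks, current = [], []
    for line in text.split("\n"):
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def check_blocks(text, sep=None):
    """Separate well-formed blocks from blocks with ragged rows.

    Returns (good, bad): good is a list of row grids, bad is a list of
    (block_index, raw_lines) pairs for blocks whose rows differ in length.
    """
    good, bad = [], []
    for index, lines in enumerate(split_blocks(text)):
        rows = [line.split(sep) for line in lines]
        if all(len(row) == len(rows[0]) for row in rows):
            good.append(rows)
        else:
            bad.append((index, lines))
    return good, bad


# Made-up sample: the second block has a token containing spaces.
sample = "This\tO\nguy\tO\n\nbad row here\tO\nok\tO\n\nfine\tB-corporation\n"
good, bad = check_blocks(sample)
for index, lines in bad:
    # repr() makes stray spaces and invisible characters visible
    print(index, [repr(line) for line in lines])
```

To try this on the real data, read the file into a string first and pass it to `check_blocks`. Printing each offending line with `repr()` may show whether an extra space or an unusual whitespace character is splitting a row into the wrong number of columns. The `good` list contains only blocks whose rows all have the same length, which is one possible way to keep just the well-formed sentences.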


conll nltk python valueerror