值错误:输入无效应该是字符串、字符串列表/元组或整数列表/元组

问题描述

from os import listdir
from os.path import isfile,join
from datasets import load_dataset
from transformers import BertTokenizer

test_files = [join('./test/',f) for f in listdir('./test') if isfile(join('./test',f))]

dataset = load_dataset('json',data_files={"test": test_files},cache_dir="./.cache_dir")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def encode(batch):
  return tokenizer.encode_plus(batch["abstract"],max_length=32,add_special_tokens=True,pad_to_max_length=True,return_attention_mask=True,return_token_type_ids=False,return_tensors="pt")

dataset.set_transform(encode)

当我运行这段代码时,我有

ValueError: Input is not valid. Should be a string,a list/tuple of strings or a list/tuple of integers.

我有一个字符串列表,而不是一个字符串列表。以下是batch["article"]的内容:

[['eleven politicians from 7 parties made comments in letter to a newspaper .',"said dpp alison saunders had ` damaged public confidence ' in justice .",'ms saunders ruled lord janner unfit to stand trial over child abuse claims .','the cps has pursued at least 19 suspected paedophiles with dementia .'],['an increasing number of surveys claim to reveal what makes us happiest .','but are these generic lists really of any use to us ?','janet street-porter makes her own list - of things making her unhappy !'],["author of ` into the wild ' spoke to five rape victims in missoula,montana .","` missoula : rape and the justice system in a college town ' was released april 21 .","three of five victims profiled in the book sat down with abc 's nightline wednesday night .",'kelsey belnap,allison huguet and hillary mclaughlin said they had been raped by university of montana football players .',"huguet and mclaughlin 's attacker,beau donaldson,pleaded guilty to rape in 2012 and was sentenced to 10 years .",'belnap claimed four players gang-raped her in 2010,but prosecutors never charged them citing lack of probable cause .','mr krakauer wrote book after realizing close friend was a rape victim .'],['tesco announced a record annual loss of £ 6.38 billion yesterday .','drop in sales,one-off costs and pensions blamed for financial loss .','supermarket giant now under pressure to close 200 stores nationwide .','here,retail industry veterans,plus mail writers,identify what went wrong .'],...,['snp leader said alex salmond did not field questions over his family .',"said she was not ` moaning ' but also attacked criticism of women 's looks .",'she made the remarks in latest programme profiling the main party leaders .','ms sturgeon also revealed her tv habits and recent image makeover .','she said she relaxed by eating steak and chips on a saturday night .']]

我该如何解决这个问题?

解决方法

暂无找到可以解决该程序问题的有效方法,小编努力寻找整理中!

如果你已经找到好的解决方法,欢迎将解决方案带上本链接一起发送给小编。

小编邮箱:dio#foxmail.com (将#修改为@)

相关问答

依赖报错 idea导入项目后依赖报错,解决方案:https://blog....
错误1:代码生成器依赖和mybatis依赖冲突 启动项目时报错如下...
错误1:gradle项目控制台输出为乱码 # 解决方案:https://bl...
错误还原:在查询的过程中,传入的workType为0时,该条件不起...
报错如下,gcc版本太低 ^ server.c:5346:31: 错误:‘struct...