Problem description
I am trying to save a Hugging Face tokenizer to disk so that I can later load it from a container that has no internet access.
from transformers import AutoTokenizer

BASE_MODEL = "distilbert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.save_vocabulary("./models/tokenizer/")
tokenizer2 = AutoTokenizer.from_pretrained("./models/tokenizer/")
However, the last line raises this error:
OSError: Can't load config for './models/tokenizer3/'. Make sure that:
- './models/tokenizer3/' is a correct model identifier listed on 'https://huggingface.co/models'
- or './models/tokenizer3/' is the correct path to a directory containing a config.json file
transformers version: 3.1.0
The question "How to load the saved tokenizer from pretrained model in Pytorch" did not help.
Edit 1
Following @ashwin's answer below, I switched to save_pretrained, but I now get the following error:
OSError: Can't load config for './models/tokenizer/'. Make sure that:
- './models/tokenizer/' is a correct model identifier listed on 'https://huggingface.co/models'
- or './models/tokenizer/' is the correct path to a directory containing a config.json file
I then tried renaming tokenizer_config.json to config.json, and got this error instead:
ValueError: Unrecognized model in ./models/tokenizer/. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: retribert, t5, mobilebert, distilbert, albert, camembert, xlm-roberta, pegasus, marian, mbart, bart, reformer, longformer, roberta, flaubert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm, ctrl, electra, encoder-decoder
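Both error messages point at the same requirement: the directory must contain a config.json that has a `model_type` key. The sketch below is one possible way to check for that and to work around it; it is not something confirmed in this thread. `check_tokenizer_dir` and `save_tokenizer_with_config` are hypothetical helper names. The second helper assumes that saving the model config next to the tokenizer files is enough for AutoTokenizer to resolve the tokenizer class, and that the saved config.json includes `model_type`, which has not been verified here against transformers 3.1.0.

```python
import json
from pathlib import Path


def check_tokenizer_dir(path):
    """Return a list of reasons why AutoTokenizer may refuse to load `path`.

    Hypothetical helper: it only checks the two conditions named in the
    error messages above (config.json present, `model_type` key present).
    """
    problems = []
    config_file = Path(path) / "config.json"
    if not config_file.is_file():
        problems.append("config.json is missing")
    else:
        config = json.loads(config_file.read_text(encoding="utf-8"))
        if "model_type" not in config:
            problems.append("config.json has no 'model_type' key")
    return problems


def save_tokenizer_with_config(model_name, path):
    """Save the tokenizer files and the model config side by side.

    Assumption: AutoTokenizer reads config.json from the same directory
    to decide which tokenizer class to instantiate.
    """
    # Imported lazily so check_tokenizer_dir works without transformers.
    from transformers import AutoConfig, AutoTokenizer

    # Create the target directory up front in case save_pretrained expects it.
    Path(path).mkdir(parents=True, exist_ok=True)
    # Writes the vocabulary and tokenizer_config.json.
    AutoTokenizer.from_pretrained(model_name).save_pretrained(path)
    # Writes config.json for the model.
    AutoConfig.from_pretrained(model_name).save_pretrained(path)
```

On a machine with internet access, one would call save_tokenizer_with_config(BASE_MODEL, "./models/tokenizer/"), confirm that check_tokenizer_dir("./models/tokenizer/") returns an empty list, and then copy the directory into the container. If AutoTokenizer still fails, loading with the concrete class (DistilBertTokenizer.from_pretrained on the same path) may avoid the config lookup altogether.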