ReversibleField fails when using a custom spaCy tokenizer

Problem description

ReversibleField works perfectly fine without spaCy

When I use tokenize=None in ReversibleField, everything works fine.

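ReversibleField fails when using spaCy

However, when I try to use spaCy as my tokenizer, it gives me a big bunch of strings that make no sense to me. For reference, this is the setup with tokenize=None, which reverses correctly: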

from torchtext.datasets import Multi30k
from torchtext.data import Field, BucketIterator, ReversibleField
import spacy

# Source (German) and target (English) fields; no custom tokenizer is passed here
SRC = ReversibleField(tokenize=None, init_token='<sos>', eos_token='<eos>', lower=True, batch_first=True)
TRG = ReversibleField(tokenize=None, batch_first=True)

train_data, valid_data, test_data = Multi30k.splits(exts=('.de', '.en'), fields=(SRC, TRG))
SRC.build_vocab(train_data, min_freq=2)
TRG.build_vocab(train_data, min_freq=2)

device = 'cuda:2'
BATCH_SIZE = 3

# Pass all three splits so that three iterators come back
train_iterator, valid_iterator, test_iterator = BucketIterator.splits(
    (train_data, valid_data, test_data), batch_size=BATCH_SIZE, device=device)

# Take one batch and turn the target indices back into strings
batch = next(iter(train_iterator))
TRG.reverse(batch.trg)

output>>>
['a group of kids playing with tires.','seven construction workers working on a building.','a man is performing with fire sticks before a crowd outside.']

What is going wrong here? How do I correctly convert tokens back into strings when using spaCy?
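
One way to sidestep the problem is to do the reverse step by hand. The sketch below is a minimal, torchtext-free illustration, not the library's own method: `reverse_batch` and its parameters are hypothetical names, and it assumes the garbled output comes from `ReversibleField.reverse` applying its own detokenization to tokens that spaCy produced (worth checking against the installed torchtext version). It maps indices back to tokens, cuts each sentence at the end-of-sentence token, drops special tokens, and joins the rest with spaces.

```python
from typing import List, Sequence


def reverse_batch(
    batch: Sequence[Sequence[int]],
    itos: Sequence[str],
    eos_token: str = "<eos>",
    specials: Sequence[str] = ("<sos>", "<pad>"),
    sep: str = " ",
) -> List[str]:
    """Hypothetical helper: turn rows of vocabulary indices back into strings."""
    sentences = []
    for row in batch:
        # Look up the token string for every index in the row
        tokens = [itos[i] for i in row]
        # Keep only what comes before the first end-of-sentence token
        if eos_token in tokens:
            tokens = tokens[: tokens.index(eos_token)]
        # Drop start-of-sentence and padding tokens
        tokens = [t for t in tokens if t not in specials]
        sentences.append(sep.join(tokens))
    return sentences


# Toy vocabulary and batch, made up for illustration
itos = ["<unk>", "<pad>", "<sos>", "<eos>", "a", "man", "walks", "."]
batch = [[2, 4, 5, 6, 7, 3, 1, 1]]
print(reverse_batch(batch, itos))  # ['a man walks .']
```

With the fields defined in the question, this would presumably be called as `reverse_batch(batch.trg.tolist(), TRG.vocab.itos)`. Joining on spaces leaves punctuation separated from the preceding word, so the result is readable but not identical to the original sentence.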
