Importing the distilbert-base-uncased tokenizer into an Android app along with a TFLite model

Problem description

I converted my model (.h5) to TFLite as follows:

import tensorflow as tf

# `model` is the trained Keras model loaded from the .h5 file
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Allow TensorFlow ops that have no TFLite builtin equivalent
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open("/models/tflite_models/5th_Jan/distilbert_sms_60_5_jan.tflite", "wb") as f:
    f.write(tflite_model)

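However, I also need the tokenizer so that the model can run locally in the Android app, without depending on an internet connection.

According to articles online and the answer to the Stack Overflow question "How to tokenize input text in android studio to process in NLP model?", we need a JSON file of the tokenizer's vocabulary to tokenize the words in new input.

When I run the following code: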

import json

with open( 'android/word_dict.json','w' ) as file:
    json.dump( tokenizer.word_index,file )

I get the following error:

AttributeError: 'DistilBertTokenizer' object has no attribute 'word_index'

I have not been able to find a way to use the distilbert-base-uncased tokenizer in an Android app. Any help would be appreciated. Thank you.

Solution

No confirmed solution to this problem has been found yet.
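One direction that may be worth trying, offered as an untested sketch and not a verified answer: `word_index` is an attribute of the Keras `Tokenizer`, whereas Hugging Face tokenizers expose their token-to-id mapping through `get_vocab()`. That mapping can be dumped to JSON in the same way, and the WordPiece splitting that the tokenizer performs then has to be reimplemented in the app. The code below shows the idea in Python using a tiny hypothetical vocabulary (`toy_vocab`) in place of `tokenizer.get_vocab()`, so that it runs without any extra libraries. The helper names `export_vocab`, `wordpiece`, and `encode` are made up for illustration, and the whitespace-only word splitting is a simplification: the real tokenizer also separates punctuation and strips accents.

```python
import json

# Hypothetical stand-in for tokenizer.get_vocab(); the real
# distilbert-base-uncased vocabulary is far larger and uses different ids.
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "free": 4, "prize": 5, "##s": 6, "win": 7, "##ner": 8}


def export_vocab(vocab, path):
    """Write the token -> id mapping to a JSON file to bundle with the app."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vocab, f, ensure_ascii=False)


def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one lowercased word."""
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a ## prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole word becomes [UNK]
        pieces.append(match)
        start = end
    return pieces


def encode(text, vocab, max_len=16):
    """Return (input_ids, attention_mask), both padded to max_len."""
    tokens = ["[CLS]"]
    for word in text.lower().split():  # simplification: whitespace split only
        tokens.extend(wordpiece(word, vocab))
    tokens = tokens[:max_len - 1] + ["[SEP]"]
    ids = [vocab[t] for t in tokens]
    pad = max_len - len(ids)
    return ids + [vocab["[PAD]"]] * pad, [1] * len(ids) + [0] * pad


ids, mask = encode("Free prizes", toy_vocab, max_len=8)
print(ids)   # [2, 4, 5, 6, 3, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```

With the real tokenizer, the call would presumably be `export_vocab(tokenizer.get_vocab(), 'android/word_dict.json')`. The same greedy matching would then need to be ported to Java or Kotlin, with `max_len` set to the sequence length the model was trained with.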
