Problem with NLTK downloads under Chaquopy

Problem description

Following up on "Chaquopy Not able to download Resource": I'm not sure whether that problem was ever resolved.

This problem occurs in the context of nltk. After including one of these nltk.download lines:

nltk.download('popular')
or
nltk.download('punkt')
or
nltk.download('all')

I get this stack trace:

2020-08-26 13:33:45.742 19765-19765/com.pro.useyournotes E/ExceptionTag: com.chaquo.python.PyException: BadZipFile: File is not a zip file
    com.chaquo.python.PyException: BadZipFile: File is not a zip file
        at <python>.zipfile._RealGetContents(zipfile.py:1335)
        at <python>.zipfile.__init__(zipfile.py:1268)
        at <python>.nltk.data.__init__(data.py:936)
        at <python>.nltk.compat._decorator(compat.py:41)
        at <python>.nltk.data.__init__(data.py:396)
        at <python>.nltk.compat._decorator(compat.py:41)
        at <python>.nltk.data.find(data.py:544)
        at <python>.nltk.data.find(data.py:557)
        at <python>.nltk.tag.perceptron.__init__(perceptron.py:168)
        at <python>.nltk.tag._get_tagger(__init__.py:106)
        at <python>.nltk.tag.pos_tag_sents(__init__.py:178)
        at <python>.uyn_pre_processing.pre_processing(uyn_pre_processing.py:88)
        at <python>.uyn_analysis_workflow.analyse_new_data(uyn_analysis_workflow.py:62)
        at <python>.uyn_main.main(uyn_main.py:266)
        at <python>.chaquopy_java.call(chaquopy_java.pyx:285)
        at <python>.chaquopy_java.Java_com_chaquo_python_PyObject_callAttrThrows(chaquopy_java.pyx:257)
        at com.chaquo.python.PyObject.callAttrThrows(Native Method)
        at com.chaquo.python.PyObject.callAttr(PyObject.java:209)
        at com.pro.useyournotes.MainActivity.getPythonHello(MainActivity.kt:70)
        at com.pro.useyournotes.MainActivity.onCreate(MainActivity.kt:59)
        at android.app.Activity.performCreate(Activity.java:7136)
        at android.app.Activity.performCreate(Activity.java:7127)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1271)
        at android.app.ActivityThread.performlaunchActivity(ActivityThread.java:2893)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3048)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:78)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:108)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:68)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1808)
        at android.os.Handler.dispatchMessage(Handler.java:106)
        at android.os.Looper.loop(Looper.java:193)
        at android.app.ActivityThread.main(ActivityThread.java:6669)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:858)

The code where this error occurs is:

    tagged_words=nltk.pos_tag_sents(tokenized_sentences)

at <python>.uyn_pre_processing.pre_processing(uyn_pre_processing.py:88)

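I also don't know where the nltk files are stored. Earlier, when I was programming purely on the Python side, I only remember using the import nltk command. I hope someone has already found a way to use nltk.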

Solution

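I was able to reproduce something similar on an emulator. In my case, the root cause was that the download failed with a DECRYPTION_FAILED_OR_BAD_RECORD_MAC error, leaving an incomplete ZIP file behind.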

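This appears to be a low-level issue with the emulator, not something specific to Python. If you can confirm that you are hitting the same problem (by seeing DECRYPTION_FAILED_OR_BAD_RECORD_MAC in the logcat output of nltk.download), please star the issue on the Android issue tracker to encourage them to fix it.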
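Since the traceback ends in BadZipFile, it can help to check whether a leftover archive from a failed download is actually a readable ZIP before loading it. The following is only a minimal sketch using the standard zipfile module. The helper name remove_if_corrupt is hypothetical, and the path you pass in is an assumption: you need to locate the downloaded archive in your own NLTK data directory.

```python
import os
import zipfile
import zlib


def remove_if_corrupt(zip_path):
    """Delete zip_path if it exists but is not a readable ZIP archive.

    Hypothetical helper: returns True if a corrupt file was removed,
    False if the file is missing or looks intact.
    """
    if not os.path.exists(zip_path):
        return False
    if zipfile.is_zipfile(zip_path):
        try:
            with zipfile.ZipFile(zip_path) as archive:
                # testzip() returns the first bad member name, or None if
                # every member passes its CRC check.
                if archive.testzip() is None:
                    return False
        except (zipfile.BadZipFile, zlib.error, OSError):
            # Fall through and treat the archive as corrupt.
            pass
    os.remove(zip_path)
    return True
```

Calling this on the suspect archive before retrying the download removes a truncated file, so the next attempt does not trip over it.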

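You can work around this by calling nltk.download repeatedly in a loop until it returns True. To save time, you should probably download only the data files you actually need. You can find out which ones those are by simply calling the relevant function and reading the error message, for example: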

>>> nltk.pos_tag_sents([["hello","world"]])
...
LookupError: 
**********************************************************************
  Resource averaged_perceptron_tagger not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('averaged_perceptron_tagger')

Then you can add this to your code:

while not nltk.download('averaged_perceptron_tagger'):
    print("Retrying download")

After a few iterations this completed successfully, and I was then able to call nltk.pos_tag_sents without errors.
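If you would rather not loop forever when the network is really down, the retry idea above could be bounded. This is a rough sketch, not part of nltk: the helper download_with_retry and its parameters max_attempts and delay_seconds are made up for illustration, and the download function is passed in as an argument so the helper does not depend on nltk itself.

```python
import time


def download_with_retry(download, resource, max_attempts=5, delay_seconds=1.0):
    """Call download(resource) until it returns a truthy value.

    Hypothetical helper: gives up after max_attempts tries and
    returns False, otherwise returns True on the first success.
    """
    for attempt in range(1, max_attempts + 1):
        if download(resource):
            return True
        print(f"Download of {resource!r} failed (attempt {attempt}/{max_attempts})")
        if attempt < max_attempts:
            # Short pause before the next attempt.
            time.sleep(delay_seconds)
    return False
```

With nltk imported, it would be called as download_with_retry(nltk.download, 'averaged_perceptron_tagger'), and the caller can show an error to the user if it returns False.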


Add this to your Python script:

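    # Keep retrying until the download succeeds; a return here would
    # exit after the first failure instead of retrying.
    while not nltk.download('punkt'):
        print("Retrying download punkt")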

Also, don't forget to add these permissions to your AndroidManifest:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />