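Reading English and Korean strings from a JSON file with Python?

Problem description

I'm trying to use Python to read a JSON file that contains information about the songs in a playlist, but it seems to have trouble with Korean and Chinese characters.

My code is as follows: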

import json
from pprint import pprint

with open('.../playlist.json', encoding="utf-8") as f:
    data = json.load(f)

for playlist in data['playlists']:
    for item in playlist['items']:
        pprint(item['track']['trackName'])

When I run it, every track name that is entirely in English prints fine, but for tracks whose names contain Korean or Chinese characters I get an error:

  File "...\parse.py", line 10, in <module>
    pprint(item['track']['trackName'])
  File "...\AppData\Local\Programs\Python\python39\lib\pprint.py", line 53, in pprint
    printer.pprint(object)
  File "...\AppData\Local\Programs\Python\python39\lib\pprint.py", line 148, in pprint
    self._format(object, self._stream, {}, 0)
  File "...\AppData\Local\Programs\Python\python39\lib\pprint.py", line 185, in _format
    stream.write(rep)
  File "...\AppData\Local\Programs\Python\python39\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input, self.errors, encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 5-6: character maps to <undefined>

Edit: it also seems to fall over when it hits the ′ character (U+2032), giving UnicodeEncodeError: 'charmap' codec can't encode character '\u2032'

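Solution

Is the output being redirected?

On Windows, Python uses UTF-8 for console output, so it can print any character. But when standard output is redirected (for example to a pipe or a file), Python uses the ANSI code page instead. In your traceback that code page is cp1252, which has no mapping for Korean or Chinese characters, nor for '\u2032'.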
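To see which encoding a given run is actually using, something along these lines may help. It is only a rough sketch: the helper name can_encode and the sample strings are made up for illustration, and the values printed will depend on how the script is launched.

```python
import sys


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text can be encoded with the codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # Encoding Python picked for stdout, and whether stdout is a terminal.
    print(sys.stdout.encoding, sys.stdout.isatty())

    # Illustrative sample: a Korean word plus the prime character U+2032.
    sample = "\uc548\ub155 \u2032"
    print("cp1252 ok:", can_encode(sample, "cp1252"))
    print("utf-8 ok:", can_encode(sample, "utf-8"))
```

Running the script once directly in a console and once with its output piped to a file should show whether the stdout encoding changes between the two.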

You can set the PYTHONUTF8=1 environment variable to use UTF-8 as the default text encoding. See https://docs.python.org/3/using/windows.html#utf-8-mode
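If changing the environment is not convenient, one alternative (not part of the answer above, so treat it as an assumption about what fits your setup) is to switch stdout to UTF-8 from inside the script. A minimal sketch, assuming Python 3.7 or later, where text streams have a reconfigure() method; the helper name force_utf8 is made up for illustration:

```python
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 when it supports reconfigure() (3.7+)."""
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8")
    return stream


if __name__ == "__main__":
    force_utf8(sys.stdout)
    # Korean text and U+2032 can now be written even when stdout is redirected.
    print("\uc548\ub155 \u2032")
```

The same effect for a single run, without touching the code, should be available by starting the interpreter in UTF-8 mode with python -X utf8 parse.py.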