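Problem description
I am trying to use Python to read a JSON file that contains information about the songs in a playlist, but it seems to have trouble with Korean and Chinese characters.
My code is as follows: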
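import json
from pprint import pprint

# Read the file as UTF-8 so that non-ASCII track names decode correctly
with open('.../playlist.json', encoding="utf-8") as f:
    data = json.load(f)
# Print the name of every track in every playlist
for playlist in data['playlists']:
    for item in playlist['items']:
        pprint(item['track']['trackName'])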
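When I run it, every track name that is entirely in English prints fine, but for tracks whose names contain Korean or Chinese characters I get an error message: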
  File "...\parse.py", line 10, in <module>
    pprint(item['track']['trackName'])
  File "...\AppData\Local\Programs\Python\python39\lib\pprint.py", line 53, in pprint
    printer.pprint(object)
  File "...\AppData\Local\Programs\Python\python39\lib\pprint.py", line 148, in pprint
    self._format(object,self._stream,{},0)
  File "...\AppData\Local\Programs\Python\python39\lib\pprint.py", line 185, in _format
    stream.write(rep)
  File "...\AppData\Local\Programs\Python\python39\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 5-6: character maps to <undefined>
Edit: it also seems to fall over when it hits the ′ character, giving UnicodeEncodeError: 'charmap' codec can't encode character '\u2032'
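Solution
Are you redirecting the output?
On Windows, Python uses UTF-8 for console output, so it can print every character. But when standard output is redirected (for example to a pipe or a file), Python uses the ANSI code page instead, which is cp1252 in your traceback and has no mapping for Korean or Chinese characters or for ′.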
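To see the difference without touching the real console, here is a minimal sketch that imitates a redirected stdout with an in-memory stream. The helper name write_like_redirected_stdout and the sample strings are made up for illustration; cp1252 is taken from the traceback in the question.

```python
import io


def write_like_redirected_stdout(text: str, encoding: str) -> bytes:
    """Write `text` to an in-memory text stream that uses `encoding`.

    Hypothetical helper: the stream stands in for a stdout that has been
    redirected to a file or pipe. Raises UnicodeEncodeError when the
    encoding cannot represent a character (default errors='strict').
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    stream.write(text)
    stream.flush()
    return stream.buffer.getvalue()


if __name__ == "__main__":
    # Sample strings are illustrative, not taken from the playlist file.
    for sample in ["Hello", "안녕하세요", "你好", "5\u2032"]:
        for encoding in ("cp1252", "utf-8"):
            try:
                write_like_redirected_stdout(sample, encoding)
                outcome = "ok"
            except UnicodeEncodeError:
                outcome = "UnicodeEncodeError"
            # ascii() keeps this report printable on any console encoding.
            print(ascii(sample), encoding, outcome)
```

Only the plain English sample should succeed with cp1252, while all of them should succeed with utf-8, which matches the behaviour described in the question.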
You can set the PYTHONUTF8=1 environment variable to use UTF-8 as the default text encoding. See https://docs.python.org/3/using/windows.html#utf-8-mode
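One way to check that the setting takes effect is to start a child interpreter with its stdout piped and ask which encoding it picked. This is only a sketch: the helper name child_stdout_encoding is hypothetical, and it relies on the PYTHONUTF8 variable described on the linked UTF-8 mode page (Python 3.7 or later).

```python
import os
import subprocess
import sys


def child_stdout_encoding(utf8_mode: bool) -> str:
    """Return sys.stdout.encoding as reported by a child Python process.

    Hypothetical helper: the child's stdout is a pipe, i.e. redirected.
    """
    env = dict(os.environ)
    # PYTHONIOENCODING would override the stdout encoding, so drop it.
    env.pop("PYTHONIOENCODING", None)
    env["PYTHONUTF8"] = "1" if utf8_mode else "0"
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Without UTF-8 mode the result depends on the platform and locale.
    print("PYTHONUTF8=0:", child_stdout_encoding(False))
    print("PYTHONUTF8=1:", child_stdout_encoding(True))
```

If changing the environment is not an option, calling sys.stdout.reconfigure(encoding="utf-8") at the top of the script (available since Python 3.7) is an alternative; that is a different technique from the one this answer suggests.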