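Converting nested JSONL files to CSV: unnesting JSONL and extracting it to CSV

Problem description

I currently have a script that converts JSONL files and writes them as CSV files in a separate folder. The script seems to work fine for the non-nested values in the JSONL, but it runs into trouble with the nested values on each line. The nested values start at "info":, as shown below.

Sample JSONL: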

{"symbol": "DOGE-PERP","timestamp": 1621948955550,"datetime": "2021-05-25T13:22:35.550Z","high": null,"low": null,"bid": 0.342372,"bidVolume": null,"ask": 0.3424855,"askVolume": null,"vwap": null,"open": null,"close": 0.3424025,"last": 0.3424025,"previousClose": null,"change": null,"percentage": 0.039249281423858244,"average": null,"baseVolume": null,"quoteVolume": 433162290.0506585,"info": {"name": "DOGE-PERP","enabled": true,"postOnly": false,"priceIncrement": "5e-7","sizeIncrement": "1.0","minProvideSize": "1.0","last": "0.3424025","bid": "0.342372","ask": "0.3424855","price": "0.3424025","type": "future","baseCurrency": null,"quoteCurrency": null,"underlying": "DOGE","restricted": false,"highLeverageFeeExempt": false,"change1h": "0.023470298206100425","change24h": "0.039249281423858244","changeBod": "-0.07136396489976689","quoteVolume24h": "433162290.0506585","volumeUsd24h": "433162290.0506585"}}
{"symbol": "DOGE-PERP","timestamp": 1621948955976,"datetime": "2021-05-25T13:22:35.976Z","bid": 0.3424955,"ask": 0.3427185,"close": 0.3427185,"last": 0.3427185,"percentage": 0.04020839466903005,"info": {"last": "0.3427185","bid": "0.3424955","ask": "0.3427185","price": "0.3427185","change1h": "0.024414849178225707","change24h": "0.04020839466903005","changeBod": "-0.07050693556414092","volumeUsd24h": "433162290.0506585"}}

I am hoping someone can help me unnest these values so that each one is written to its own column in the generated CSV file.
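To make the goal concrete, here is a minimal sketch of what unnesting a single record could look like. The helper name `flatten_record` and the `_` separator are assumptions made for illustration and are not part of the original script; the sample line is abbreviated from the data above.

```python
import json


def flatten_record(record, parent_key="", sep="_"):
    """Recursively turn nested dicts into one flat dict (hypothetical helper)."""
    flat = {}
    for key, value in record.items():
        # Prefix nested keys with their parent key, e.g. "info" + "_" + "name"
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Abbreviated version of one line from the sample JSONL
line = '{"symbol": "DOGE-PERP", "bid": 0.342372, "info": {"name": "DOGE-PERP", "type": "future", "bid": "0.342372"}}'
flat = flatten_record(json.loads(line))
print(list(flat))
# ['symbol', 'bid', 'info_name', 'info_type', 'info_bid']
```

Prefixing with the parent key keeps the top-level "bid" and the nested "bid" apart, so both can become separate CSV columns.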

In the CSV file the script currently produces, none of the nested values under "info" have been extracted so far.

My current script (I hard-coded every column header, and hard-coded every value to be extracted from the JSONL file):

import csv
import glob
import json
import os
import time

from flatten_json import flatten

start = time.time()

# Folder containing the jsonl files
File_path = r'C:\Users\Natthanon\Documents\Coding 101\Python\JSONL'
# Folder the csv files are written to
CSV_path = r'C:\Users\Natthanon\Documents\Coding 101\Python\CSV'

# Hard-coded column headers; none of the nested "info" values are listed here
headers = ["symbol","timestamp","datetime","high","low","bid","bidVolume","ask","askVolume","vwap","open","close","last","previousClose","change","percentage","average","baseVolume","quoteVolume"]

# Collect all jsonl files, including those in subfolders
files = glob.glob(os.path.join(File_path, "**", "*.jsonl"), recursive=True)

for f in files:
    # Name each csv file after its jsonl file
    file_name = os.path.splitext(os.path.basename(f))[0]
    with open(f, 'r') as F, open(os.path.join(CSV_path, file_name + ".csv"), 'w', newline='') as csv_file:
        thewriter = csv.writer(csv_file)
        thewriter.writerow(headers)

        for line in F:
            # Flatten each json line; nested keys become e.g. "info_name"
            data = json.loads(line)
            data_1 = flatten(data)
            # Only the hard-coded keys are written, so the flattened "info_*" keys are dropped
            thewriter.writerow([data_1.get(h) for h in headers])

Solution

No verified solution to this problem has been posted yet.
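As a possible starting point rather than a confirmed answer: the script above already flattens each line, but it only writes the hard-coded keys, so the flattened "info" keys never reach the CSV. One way around this is to build the header from the keys actually present and let `csv.DictWriter` fill in the gaps. The sketch below assumes each file fits in memory; `flatten_record` is a hypothetical stand-in for `flatten` from `flatten_json`, and `jsonl_to_csv` and the temporary file names are likewise made up for the demonstration.

```python
import csv
import json
import tempfile
from pathlib import Path


def flatten_record(record, parent_key="", sep="_"):
    """Hypothetical stand-in for flatten_json.flatten: flattens nested dicts."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def jsonl_to_csv(jsonl_path, csv_path):
    """Write one CSV column per flattened key found anywhere in the file."""
    rows = []
    fieldnames = {}  # dict used as an ordered set: keys in first-seen order
    with open(jsonl_path, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            flat = flatten_record(json.loads(line))
            rows.append(flat)
            for key in flat:
                fieldnames.setdefault(key, None)

    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        # restval fills columns that a given row does not have
        writer = csv.DictWriter(dst, fieldnames=list(fieldnames), restval="")
        writer.writeheader()
        writer.writerows(rows)
    return list(fieldnames)


# Small demonstration on made-up, abbreviated data in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    src_path = Path(tmp) / "sample.jsonl"
    dst_path = Path(tmp) / "sample.csv"
    src_path.write_text(
        '{"symbol": "DOGE-PERP", "bid": 0.342372, "info": {"name": "DOGE-PERP", "type": "future"}}\n'
        '{"symbol": "DOGE-PERP", "bid": 0.3424955, "info": {"name": "DOGE-PERP", "change1h": "0.0244"}}\n',
        encoding="utf-8",
    )
    columns = jsonl_to_csv(src_path, dst_path)
    csv_text = dst_path.read_text(encoding="utf-8")

print(columns)
print(csv_text)
```

Collecting the keys before writing means lines with different sets of fields still line up under the same header. For files too large to hold in memory, the same idea could be done in two passes over the file: one to collect the keys, one to write the rows.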
