Problem description

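I am loading a JSON file and then trying to populate the same JSON with different IDs. These IDs can appear multiple times, anywhere in the file. My goal is to find them, replace them, and write the result to a new file. The code works, but generating one file takes 13 seconds (my file is 5 MB), and I need to generate 250 GB in total. Generating all of the data would take about 7 days. Any suggestions or ideas for cutting that time down?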
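As a quick sanity check of the estimate above, the arithmetic can be written out. This is only a back-of-the-envelope sketch; it assumes decimal units (1 GB = 1000 MB) and that every file takes the same 13 seconds, and the variable names are made up for illustration.

```python
# Figures taken from the question; decimal units are an assumption.
file_size_mb = 5.0
seconds_per_file = 13.0
total_gb = 250.0

files_needed = total_gb * 1000.0 / file_size_mb         # number of 5 MB files
total_days = files_needed * seconds_per_file / 86400.0  # 86400 seconds per day

print("files: %d, days: %.1f" % (files_needed, total_days))
```

That comes to roughly 50,000 files, which is consistent with the "about 7 days" figure when the files are generated one after another.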

Here is my code:

import datetime
import re
import uuid

json_file = 'Sample.json'

# Read the whole file once; the with-block closes it automatically.
with open(json_file) as f:
    file_data = f.read()

# Find every 24-character id, then drop the duplicates.
matches = re.findall("[0-9a-z]{24}", file_data)
unique_array = list(dict.fromkeys(matches))

# print(...) with a single argument behaves the same on Python 2.7 and 3.
print('Total unique key: ' + str(len(unique_array)))
print(datetime.datetime.now())

# Replace each old id with a new random one.
# Note: every replace() call scans the entire text again.
for old_id in unique_array:
    new_id = uuid.uuid4().hex[:24]
    file_data = file_data.replace(old_id, new_id)

new_file_name = "My_New_File.json"
print(new_file_name)

with open(new_file_name, 'w') as a:
    a.write(file_data)

print(datetime.datetime.now())
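The loop above calls `replace` once per unique id, so the 5 MB text is scanned as many times as there are ids. One possible way to avoid that, sketched here rather than offered as a tested fix for this exact data, is to let `re.sub` walk the text a single time and look up each match in a dictionary. The names `ID_PATTERN` and `remap_ids` are hypothetical. Note one behavioural difference: this replaces exactly the non-overlapping matches the regex finds, whereas `str.replace` would also hit the same id at any other offset.

```python
import re
import uuid

# Same pattern as in the question, compiled once.
ID_PATTERN = re.compile(r"[0-9a-z]{24}")


def remap_ids(text):
    """Replace every 24-character id in one pass.

    Equal old ids always map to the same new id.
    Returns the new text and the old-to-new mapping.
    """
    mapping = {}

    def substitute(match):
        old_id = match.group(0)
        if old_id not in mapping:
            mapping[old_id] = uuid.uuid4().hex[:24]
        return mapping[old_id]

    return ID_PATTERN.sub(substitute, text), mapping
```

A call such as `new_text, mapping = remap_ids(file_data)` would then stand in for the `for old_id in unique_array` loop. Whether this is actually faster depends on how many unique ids the file holds, so it is worth timing on a real sample.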

Edit:

By the way, I tried to implement multithreading but could not get it to work, because I don't know Python at all.
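On the threading attempt: in CPython, threads generally do not speed up CPU-bound string work like this because of the global interpreter lock, so separate processes are the more usual route. Below is a minimal sketch using `multiprocessing.Pool` from the standard library. The function names `make_variant` and `generate_files` and the output file names are assumptions made for illustration, and the id replacement uses a single-pass `re.sub` rather than the loop from the question.

```python
import multiprocessing
import re
import uuid

ID_PATTERN = re.compile(r"[0-9a-z]{24}")


def make_variant(args):
    """Worker: read the template, swap every id, write one output file."""
    template_path, out_path = args
    with open(template_path) as src:
        text = src.read()

    mapping = {}  # old id -> new id, private to this output file

    def substitute(match):
        old_id = match.group(0)
        if old_id not in mapping:
            mapping[old_id] = uuid.uuid4().hex[:24]
        return mapping[old_id]

    with open(out_path, "w") as dst:
        dst.write(ID_PATTERN.sub(substitute, text))
    return out_path


def generate_files(template_path, out_paths, workers=None):
    """Build all output files in parallel; workers=None uses every CPU core."""
    pool = multiprocessing.Pool(workers)
    try:
        tasks = [(template_path, path) for path in out_paths]
        return pool.map(make_variant, tasks)
    finally:
        pool.close()
        pool.join()
```

It could be called from under an `if __name__ == "__main__":` guard, for example `generate_files("Sample.json", ["My_New_File_%d.json" % i for i in range(8)])`. Each worker only receives two file paths, so the 5 MB text is never copied between processes. The speedup is bounded by the number of cores and by disk write throughput.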

Solution

No working solution has been posted for this question yet.


data-generation dummy-data json multithreading python-2.7

Is there a better/faster way to replace old values with new ones?
