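Python itertools and multiprocessing

Problem description

I have a script that uses cycle and zip_longest from itertools to iterate over a list and a file together. In the code below, json_file_list is a list of files; each json_file is meant to process N lines of the url_list file, and this repeats until every line of url_list has been processed. Unfortunately, this is very slow. Is there a way to use multiprocessing here, so that each JSON file processes N lines of the url_list file at the same time, with the operation repeating until all lines in url_list have been processed?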

import glob
import json
import time
from itertools import cycle, zip_longest

import httplib2
from oauth2client.service_account import ServiceAccountCredentials

batch_size = 100

JSON_KEY_FILE_PATH = "json_files/"
JSON_FILENAME = '*.json'
json_file_list = glob.glob(JSON_KEY_FILE_PATH + JSON_FILENAME, recursive=True)
itr_length = len(json_file_list)


SCOPES = ["https://www.googleapis.com/auth/indexing"]
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

def grouper(iterable, n, fillvalue=None):
    # Collect items into fixed-length chunks, padding the last one with fillvalue.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def GoogleApiCall(url_file, json_file_list):
    with open(url_file) as f:
        counter = 0
        # Each key file in turn handles the next batch of batch_size lines.
        for json_file, lines in zip(cycle(json_file_list), grouper(f, batch_size)):
            credentials = ServiceAccountCredentials.from_json_keyfile_name(json_file, scopes=SCOPES)
            http = credentials.authorize(httplib2.Http())
            counter += 1
            for line in lines:
                # grouper pads the last batch with None; also skip blank lines.
                url = line.strip() if line else ""
                if not url:
                    continue
                # Build a valid JSON body without the trailing newline of the line.
                body = json.dumps({"url": url, "type": "URL_UPDATED"})
                response, content = http.request(ENDPOINT, method="POST", body=body)
                if response.status == 200:
                    status = 'successfully indexed, wait a while for google to refresh'
                else:
                    status = 'Failed to be indexed'
                print("=========================")
                print("====>> " + str(json_file))
                print(url + " " + status)
                print(response)

            print("my counter:" + str(counter))
            # Pause once every key file has been used for one batch.
            if counter % len(json_file_list) == 0:
                time.sleep(20)
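
To make the pairing in the loop above easier to follow, here is a small standalone illustration of how zip(cycle(...), grouper(...)) hands out batches. The names keys and urls and their values are made up for the example; they stand in for json_file_list and the lines of the URL file.

```python
from itertools import cycle, zip_longest


def grouper(iterable, n, fillvalue=None):
    # Same helper as in the question: fixed-size chunks, last one padded.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


keys = ["a.json", "b.json"]            # hypothetical key files
urls = ["u1", "u2", "u3", "u4", "u5"]  # hypothetical URL lines

# cycle() repeats the key files forever; zip() stops when the batches run out.
pairs = list(zip(cycle(keys), grouper(urls, 2)))
for key, batch in pairs:
    print(key, batch)
```

The batches are assigned to the key files in round-robin order, and the last batch is padded with None, which is why the inner loop has to skip empty entries. Everything still runs one batch after another, which is where the time goes.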

Solution

No working solution to this problem has been posted yet.
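
As one possible direction rather than a confirmed answer: the work is dominated by waiting on HTTP requests, so a thread pool from concurrent.futures may be a simpler fit than multiprocessing. The sketch below processes the URL lines in rounds, where each key file handles one batch concurrently and the script pauses between full rounds, as in the question. The names batches, run_rounds, send_batch and pause are hypothetical; send_batch is a placeholder for a function that would create the credentials and a fresh httplib2.Http object for its key file and POST each URL in its batch.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(lines, size):
    """Yield lists of up to `size` stripped, non-empty lines."""
    cleaned = (line.strip() for line in lines)
    cleaned = (line for line in cleaned if line)
    while True:
        batch = list(islice(cleaned, size))
        if not batch:
            return
        yield batch


def run_rounds(key_files, lines, send_batch, batch_size=100, pause=20):
    """Run send_batch(key_file, batch) concurrently, one batch per key file per round."""
    results = []
    batch_iter = batches(lines, batch_size)
    with ThreadPoolExecutor(max_workers=len(key_files)) as pool:
        while True:
            # Take at most one batch per key file for this round.
            round_batches = list(islice(batch_iter, len(key_files)))
            if not round_batches:
                break
            # map() pairs key file i with batch i and waits for the whole round.
            results.extend(pool.map(send_batch, key_files, round_batches))
            # Pause after a full round, mirroring the sleep in the question.
            if len(round_batches) == len(key_files):
                time.sleep(pause)
    return results


def fake_send(key_file, batch):
    # Stand-in for the real request loop; it only reports what it was given.
    return (key_file, batch)


demo = run_rounds(
    ["a.json", "b.json"],
    ["u1\n", "u2\n", "\n", "u3\n", "u4\n", "u5\n"],
    fake_send,
    batch_size=2,
    pause=0,
)
print(demo)
```

With the real script, lines would be the open URL file and send_batch would wrap the body of the outer loop in GoogleApiCall. Creating the Http object inside send_batch keeps each one confined to a single thread, which matters because httplib2.Http instances are not meant to be shared across threads. If the per-batch work were CPU-bound instead, swapping in ProcessPoolExecutor would be the multiprocessing variant of the same structure, provided send_batch is a picklable top-level function.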


itertools multiprocessing multithreading python
