How do I run code in parallel with ThreadPoolExecutor?

Problem description

Hi, I'm really new to threading and it's confusing me. How can I run this code in parallel?

import requests
from concurrent.futures import ThreadPoolExecutor


def search_posts(page):
    page_url = f'https://jsonplaceholder.typicode.com/posts/{page}'
    req = requests.get(page_url)
    res = req.json()
    title = res['title']
    return title


page = 1

while True:
    with ThreadPoolExecutor() as executer:
        t = executer.submit(search_posts, page)
        title = t.result()
        print(title)

    if page == 20:
        break

    page += 1

Another question: do I need to learn about operating systems to understand how threads work?

Solution

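The problem here is that you are creating a new ThreadPoolExecutor for every page and waiting for its single task to finish before moving on, so only one request is ever in flight. To do things in parallel, create only one ThreadPoolExecutor and use its map method: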

import concurrent.futures as cf
import requests


def search_posts(page):
    page_url = f'https://jsonplaceholder.typicode.com/posts/{page}'
    res = requests.get(page_url).json()
    return res['title']


if __name__ == '__main__':
    with cf.ThreadPoolExecutor() as ex: 
        results = ex.map(search_posts,range(1,21))
    for r in results:
        print(r)

Note that wrapping the code in if __name__ == '__main__' is a good habit that makes the code more portable.
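
If you prefer submit over map, the same "one shared pool" idea can be sketched as below: submit every task first, then collect the results. This is only an illustration, not the only way to do it. fetch_all and fake_search are hypothetical names introduced here, and fake_search is a stand-in for search_posts so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(worker, pages, max_workers=8):
    """Run worker(page) for every page in one shared pool.

    Returns two dicts keyed by page: successful results and raised exceptions.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        # Submit everything first so the tasks can overlap.
        future_to_page = {ex.submit(worker, page): page for page in pages}
        # Only then collect the results, in completion order.
        for future in as_completed(future_to_page):
            page = future_to_page[future]
            try:
                results[page] = future.result()
            except Exception as exc:
                errors[page] = exc
    return results, errors


def fake_search(page):
    # Hypothetical stand-in for search_posts; page 3 fails on purpose.
    if page == 3:
        raise ValueError('simulated failure')
    return f'title {page}'


if __name__ == '__main__':
    results, errors = fetch_all(fake_search, range(1, 6))
    for page in sorted(results):
        print(page, results[page])
    for page, exc in errors.items():
        print(page, 'failed:', exc)
```

Compared with map, this costs a few more lines, but a failure on one page does not stop you from reading the results of the others.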

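One thing to keep in mind when using threads: if you are using CPython (the implementation from python.org, which is the most common one), threads do not actually run Python code in parallel.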

To simplify memory management, only one thread at a time can execute Python bytecode in CPython. This is enforced by CPython's Global Interpreter Lock (GIL).

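The good news is that a program that uses requests to fetch web pages spends most of its time waiting on network I/O, and in general the GIL is released during I/O.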
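
To see the effect without touching the network, here is a rough sketch that simulates I/O with time.sleep, which also releases the GIL while waiting. fake_fetch and timed_fetch_all are hypothetical names, and the 0.2 second delay is an arbitrary assumption standing in for a real request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(page):
    # Stand-in for a network request: sleeping releases the GIL,
    # so other threads can run while this one waits.
    time.sleep(0.2)
    return f'title {page}'


def timed_fetch_all(pages, max_workers):
    """Fetch all pages with the given pool size; return (titles, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        titles = list(ex.map(fake_fetch, pages))
    return titles, time.perf_counter() - start


if __name__ == '__main__':
    pages = range(1, 11)
    _, one_worker = timed_fetch_all(pages, max_workers=1)
    _, ten_workers = timed_fetch_all(pages, max_workers=10)
    print(f'1 worker: {one_worker:.2f}s, 10 workers: {ten_workers:.2f}s')
```

With one worker the waits add up; with ten workers they overlap, so the second run should take roughly as long as a single request.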

However, if your worker function does computation (i.e. executes Python bytecode), you should use a ProcessPoolExecutor instead.

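If you use a ProcessPoolExecutor and you are running on MS Windows, the if __name__ == '__main__' wrapper is required, because in that case Python has to be able to import your main program without side effects.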
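
For CPU-bound work, a minimal sketch with ProcessPoolExecutor might look like this. count_primes is a hypothetical worker invented for illustration (a deliberately naive prime counter), and the limits are arbitrary. The worker is defined at module level so the worker processes can find it, and the pool is only created inside the if __name__ == '__main__' wrapper.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Count the primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == '__main__':
    limits = [10_000, 20_000, 30_000, 40_000]
    # Each call runs in a separate process with its own interpreter and GIL.
    with ProcessPoolExecutor() as ex:
        for limit, total in zip(limits, ex.map(count_primes, limits)):
            print(f'primes below {limit}: {total}')
```

The calling code is almost identical to the ThreadPoolExecutor version, but arguments and return values are pickled to cross process boundaries, so it pays off mainly when each task does a meaningful amount of computation.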

python python-multithreading