Python: using asyncio to hit an API and output .csv files

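Problem description

I'm thinking about how to rewrite some code asynchronously. I have to download ~7500 datasets from an API and write each one to a .csv file. Here is a reproducible example (assuming you have a free Alpha Vantage API key):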

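from alpha_vantage.timeseries import TimeSeries
import pandas as pd

api_key = ""  # put your Alpha Vantage API key here

def get_ts(symbol):
    # Download the full daily adjusted series for one symbol as a DataFrame
    ts = TimeSeries(key=api_key, output_format='pandas')
    data, meta_data = ts.get_daily_adjusted(symbol=symbol, outputsize='full')
    # Write it to ./data_dump/<symbol>_data.csv (the directory must already exist)
    fname = "./data_dump/{}_data.csv".format(symbol)
    data.to_csv(fname)

symbols = ['AAPL', 'GOOG', 'TSLA', 'MSFT']

# Sequential version: one blocking request per symbol
for s in symbols:
    get_ts(s)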

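The people who make the alpha_vantage API wrapper wrote an article on using it with asyncio (linked in the original post), but I'm not sure whether I should write two functions, one to pull the data and one to write the csv, as in another example linked in the original post.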

I haven't used asyncio before, so any pointers would be greatly appreciated. I'd just like to get my download time under 3 hours if possible!

Edit: one caveat is that I'm helping a researcher with this, so we are using Jupyter notebooks. See their caveat about asyncio (linked in the original post).

Solution

Without changing your function get_ts, it could look like this:

import multiprocessing

# PROCESSES = multiprocessing.cpu_count()
PROCESSES = 4  # number of parallel processes
CHUNKS = 6  # number of symbols handled by one task

# sample of the ~7.5k symbols
TICKERS = ["BCDA","WBAI","NM","ZKIN","TNXP","FLY","MYSZ","GASX","SAVA","GCE","XNET","SRAX","SINO","LPCN","XYF","SNSS","DRAD","WLFC","OILD","JFIN","TAOP","PIC","DIVC","MKGI","CCNC","AEI","ZCMD","YVR","OCG","IMTE","AZRX","LIZI","ORSN","ASPU","SHLL","INOD","NEXI","INR","SLN","RHE-PA","MAX","ARRY","BDGE","TOTA","PFMT","AMRH","IDN","OIS","RMG","IMV","CHFS","SUMR","NRG","ULBR","SJI","HOML","AMJL","RUBY","KBLMU","ELP"]

# split the flat list into sublists of CHUNKS symbols each
TICKERS = [TICKERS[i:i + CHUNKS] for i in range(0,len(TICKERS),CHUNKS)]


def download_data(pool_id,symbols):
    for symbol in symbols:
        print("[{:02}]: {}".format(pool_id,symbol))
        # do stuff here
        # get_ts(symbol)


if __name__ == "__main__":
    with multiprocessing.Pool(PROCESSES) as pool:
        pool.starmap(download_data,enumerate(TICKERS,start=1))

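See also a similar question (linked in the original answer).

In this example, I split the list of tickers into sublists so that each task retrieves data for several symbols, which limits the overhead of creating and destroying processes.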
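Since the work is mostly waiting on the network, a thread pool is another option, and it avoids pickling functions for child processes, which can be awkward when the functions are defined inside a notebook. The sketch below is a variant of the chunking idea above using the standard library's ThreadPoolExecutor rather than multiprocessing; it is not the code from the answer. The names chunked, fake_get_ts, download_chunk, and run are hypothetical, and fake_get_ts is only a stand-in for get_ts so the example runs without an API key.

```python
from concurrent.futures import ThreadPoolExecutor

WORKERS = 4      # number of concurrent worker threads (assumed value)
CHUNK_SIZE = 6   # number of symbols handled by one task (assumed value)


def chunked(items, size):
    """Split items into consecutive sublists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fake_get_ts(symbol):
    # Hypothetical stand-in for get_ts(symbol): returns the file name
    # that get_ts would write instead of calling the API.
    return "./data_dump/{}_data.csv".format(symbol)


def download_chunk(worker_id, symbols):
    # Process every symbol of one chunk sequentially inside one task.
    return [fake_get_ts(symbol) for symbol in symbols]


def run(tickers):
    chunks = chunked(tickers, CHUNK_SIZE)
    ids = range(1, len(chunks) + 1)
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        # map() pairs each id with its chunk and yields results in input order.
        results = pool.map(download_chunk, ids, chunks)
        return [name for chunk in results for name in chunk]


file_names = run(["AAPL", "GOOG", "TSLA", "MSFT", "NRG", "OIS", "MAX"])
```

To use it for real downloads, replace the call to fake_get_ts with get_ts. Whether more workers actually shorten the total time depends on the request limits of the API key, which this sketch does not handle.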
