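Updating a plot in real time from a growing number of CSV files

Problem description

I need to analyze some spectral data in real time and plot it as a self-updating graph. The program I use outputs a text file every two seconds.

Normally I do the analysis after the data has been collected, and the code works fine. I create a data frame in which each CSV file represents one column. The problem is that with several thousand CSV files the import becomes very slow, and building a data frame from all of them usually takes more than half an hour. Below is the code that creates the data frame from multiple CSV files.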

    import glob
    import os
    from datetime import datetime

    import pandas as pd

    # The original snippet omitted the def line; the function and parameter
    # names here are illustrative.
    def load_spectra(path, prefix):
        """Import all matching files and concatenate them into one data frame."""
        # Collect the files matching <prefix>*.txt, oldest first
        all_files = glob.glob(os.path.join(path, prefix + "*.txt"))
        all_files.sort(key=os.path.getmtime)
        frames = []
        for file in all_files:
            modified = datetime.fromtimestamp(os.path.getmtime(file))
            df = pd.read_csv(file, index_col=0, header=None, sep="\t",
                             engine="python", decimal=",", skiprows=15)
            frames.append(df.rename(columns={1: modified}))
        # Concatenate once, after the loop, rather than on every iteration
        full_spectra = pd.concat(frames, axis=1)
        # Relabel each column as minutes elapsed since the first file
        first = full_spectra.columns[0]
        full_spectra.columns = [(c - first).total_seconds() / 60
                                for c in full_spectra.columns]
        return full_spectra

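The solution I came up with is to use the watchdog module: every time a new text file is created, it is appended as a new column to the existing data frame, and the updated data frame is plotted. That way I do not need to loop over all the CSV files every time.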

I found a very nice example on how to use watchdog here
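
As a rough sketch of the detection step only (this is not the linked example), a watchdog observer could push each newly created text file onto a queue that the main loop empties. The names `is_spectrum_file`, `drain`, `start_watching`, and `new_files` are made up for illustration, and the `.txt` suffix check is an assumption based on the glob pattern in the code above.

```python
import queue
from pathlib import Path


def is_spectrum_file(path, suffix=".txt"):
    """Return True for paths that look like one of the exported text files."""
    return Path(path).suffix.lower() == suffix


def drain(new_files):
    """Return every path currently waiting in the queue, without blocking."""
    paths = []
    while True:
        try:
            paths.append(new_files.get_nowait())
        except queue.Empty:
            return paths


def start_watching(directory, new_files):
    """Start a watchdog observer that queues each newly created .txt file."""
    # watchdog is a third-party package (pip install watchdog); it is imported
    # here so the two helpers above stay usable without it.
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class NewFileHandler(FileSystemEventHandler):
        def on_created(self, event):
            # Runs in the observer thread, so only hand the path over.
            if not event.is_directory and is_spectrum_file(event.src_path):
                new_files.put(event.src_path)

    observer = Observer()
    observer.schedule(NewFileHandler(), str(directory), recursive=False)
    observer.start()
    return observer  # call observer.stop() and observer.join() when done
```

The main loop would then call `drain(new_files)` every second or so and process whatever paths it returns. Since the creation event may fire while the acquisition program is still writing the file, a short delay or a retry before reading might be needed.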

My problem is that once watchdog has detected a new file, I cannot find a way to read it and append it to the existing data frame.
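
One possible shape for the read-and-append step, assuming every file shares the same first column (so the rows line up) and using the same `read_csv` arguments as the batch code above. `SpectraTable` and `add_file` are hypothetical names. Columns are labelled with the minutes elapsed since the first file, as in the batch code.

```python
import os
from datetime import datetime

import pandas as pd


class SpectraTable:
    """Holds the growing data frame; one column per file read so far."""

    def __init__(self):
        self.start = None  # modification time of the first file
        self.data = None   # becomes a DataFrame once a file has been added

    def add_file(self, file):
        """Read one new file and append it as a column; return the frame."""
        modified = datetime.fromtimestamp(os.path.getmtime(file))
        if self.start is None:
            self.start = modified
        minutes = (modified - self.start).total_seconds() / 60
        # Same parsing options as the batch code; only this one file is read.
        column = pd.read_csv(file, index_col=0, header=None, sep="\t",
                             engine="python", decimal=",",
                             skiprows=15).iloc[:, 0].rename(minutes)
        if self.data is None:
            self.data = column.to_frame()
        else:
            # Aligns on the index. This copies the existing frame but avoids
            # re-reading any of the older files.
            self.data = pd.concat([self.data, column], axis=1)
        return self.data
```

Each path reported by watchdog would be passed to `add_file` once. Two files with the same modification time would produce duplicate column labels, so a different label (for example the file name) may be safer if that can happen.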

A minimal example would look something like this:

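    import glob
    import os

    import pandas as pd

    def latest_filename(directory="."):
        """Return the most recently modified text file in a directory."""
        # Stand-in for the watchdog step, which would report the new file;
        # the default directory is an assumption.
        return max(glob.glob(os.path.join(directory, "*.txt")), key=os.path.getmtime)

    df = pd.DataFrame()  # create a data frame

    newdata = pd.read_csv(latest_filename())  # the new file is found by watchdog

    df["newcolumn"] = newdata["desiredcolumn"]  # append the new data as a column

    df.plot()  # plot the data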

The plotting part should be easy; my idea is to adapt the code presented here. I am more concerned about the self-updating data frame.
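
For the plotting step, a minimal sketch (not the linked code) could simply clear the axes and redraw the whole data frame each time a column is added. `redraw` is a hypothetical helper, and redrawing everything is an assumption made for simplicity; with thousands of columns, plotting only the most recent ones may be more practical.

```python
import matplotlib.pyplot as plt
import pandas as pd


def redraw(ax, full_spectra):
    """Clear the axes and plot every column of the data frame again."""
    ax.clear()
    full_spectra.plot(ax=ax, legend=False)  # one line per column
    ax.figure.canvas.draw_idle()            # schedule a repaint of the figure
```

In an interactive session this might be driven by calling `plt.ion()` once, creating the axes with `fig, ax = plt.subplots()`, and then calling `redraw(ax, df)` followed by `plt.pause(0.1)` whenever the data frame has grown.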

Thanks for any help or alternative solutions to my problem!

matplotlib pandas python-3.x watchdog