线程计时器功能在多处理递归脚本中不起作用

问题描述

我是生物学家，并且对并行处理不熟悉。一些重要的背景是我的某些脚本可能需要15个小时才能执行。当主要功能（in_command）正在运行时，我试图并行运行一个将对硬件使用情况（CPU，RAM等）进行快照的功能。我遇到的问题是，计时器（get_stats）上的递归脚本分别运行时正确执行，但是一旦我使用多处理并行运行它，计时器似乎就无法工作。即使我有一个300秒的计时器，该函数也大约每秒运行一次。在其他脚本完成后，它确实停止了，但是我获得的快照比需要的更多。我也不喜欢我目前的方法，因此，如果有更好的方法，我愿意学习。我只是不能让它对其他脚本产生太大影响，因此快照方法只需要大致了解正在发生的事情。谢谢！

import psutil
import platform
from datetime import datetime
import multiprocessing
import time
import os
from threading import Timer
import multiprocessing as mp



def get_size(bytes,suffix="B"):
    """
    Scale bytes to its proper format
    e.g:
        1253656 => '1.20MB'
        1253656678 => '1.17GB'
    """
    factor = 1024
    for unit in ["","K","M","G","T","P"]:
        if bytes < factor:
            return f"{bytes:.2f}{unit}{suffix}"
        bytes /= factor


def get_stats(switch,snapshot_dict,beg_time):
    df =snapshot_dict
    start_time = beg_time
    df['time'].append(time.time() - start_time)
   
    # Get core information
    df['total_cores'].append(psutil.cpu_count(logical=True))
    df['physical_cores'].append(psutil.cpu_count(logical=False))
    cpufreq = psutil.cpu_freq()
    
    # cpu frequency in Mhz
    df['max_frequency'].append(cpufreq.max)
    df['min_frequency'].append(cpufreq.min)
    df['current_frequency'].append(cpufreq.current)
    cpu_core = {}
    for i,percentage in enumerate(psutil.cpu_percent(percpu=True,interval=1)):
        cpu_core[str(i)] = percentage
    df['cpu_core'].append(cpu_core)
    
    # get ram information
    svmem = psutil.virtual_memory()
    df['total_memory'].append(get_size(svmem.total))
    df['available_memory'].append(get_size(svmem.available))
    df['used_memory'].append(get_size(svmem.used))
    df['percent_memory'].append(svmem.percent)
    
    # swap memory if it exists
    swap = psutil.swap_memory()
    df['swap_total'].append(get_size(swap.total))
    df['swap_free'].append(get_size(swap.free))
    df['swap_used'].append(get_size(swap.used))
    df['swap_percentage'].append(swap.percent)
    print(df)

    #Call the code on a recursive function
    t = Timer(300,get_stats(switch,df,beg_time))
    t.start()

    print('Switch: ',switch.value)
    if switch.value == 1:
        t.cancel()


def in_command(file,switch):
    f = open(file,'r')
    f_lines = f.readlines()
    for line in f_lines:
        print(line)
        os.system(line)
    f.close()
    switch.value += 1


if __name__ == "__main__":

    manager= mp.Manager()

    df = {'time': [],'total_cores': [],'physical_cores': [],'max_frequency': [],'min_frequency': [],'current_frequency': [],'cpu_core': [],'total_memory': [],'available_memory': [],'used_memory': [],'percent_memory': [],'swap_total': [],'swap_free': [],'swap_used': [],'swap_percentage': []}


    process_switch = manager.Value('i',0)

    start_time = time.time()


    p1 = mp.Process(target=get_stats,args = (process_switch,start_time))
    p2 = mp.Process(target=in_command,args=('text_command.txt',process_switch))

    p1.start()
    p2.start()

    p1.join()
    p2.join()

    print('finished')

解决方法

谢谢迈克尔。因此，对于可能正在阅读本文的每个人，我都不会在调用函数中包含参数，而是将它们分开放置。

不正确： t = Timer(300,get_stats(switch,df,beg_time))

正确： t = Timer(5,function=get_stats,args=[switch,snapshot_dict,beg_time])

multiprocessing python recursion

线程计时器功能在多处理递归脚本中不起作用

问题描述

解决方法

相关问答