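psutil shows more than 250 GB of available memory, but I get a MemoryError when loading a 6.5 GB file

Problem description

I am using Python with a 6.5 GB dataset on a server with several hundred GB of RAM (confirmed with psutil). Trying to load the file into pandas raises a MemoryError. Here is the psutil output: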

import psutil
psutil.virtual_memory()

svmem(total=405042839552,available=254328373248,percent=37.2,used=148782104576,free=148047446016,active=79192813568,inactive=96666456064,buffers=20480,cached=108213268480,shared=767070208,slab=4305301504)
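
For readability, the byte counts in that output can be converted to larger units. The sketch below is only a convenience: the helper names `to_gb` and `to_gib` are made up for this illustration, and the numbers are copied from the svmem line above.

```python
# Byte counts copied from the svmem output above.
svmem_bytes = {
    "total": 405042839552,
    "available": 254328373248,
    "used": 148782104576,
    "free": 148047446016,
    "cached": 108213268480,
}


def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return n_bytes / 10**9


def to_gib(n_bytes):
    """Convert bytes to binary gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


for name, value in svmem_bytes.items():
    print(f"{name:>9}: {to_gb(value):7.1f} GB  ({to_gib(value):7.1f} GiB)")
```

The `available` field works out to about 254.3 GB in decimal units, or roughly 236.9 GiB in binary units.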

psutil shows 254.3 GB of available RAM, but when I try to load the 6.5 GB file I get the following traceback:

#filename is 6.5GB
df = pd.read_table(filename,sep='\t')

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-8-0b957ec637b5> in <module>
      
----> 2 df = pd.read_table(filename,sep='\t')

/hpc/packages/minerva-centos7/py_packages/3.7/lib/python3.7/site-packages/pandas/io/parsers.py in read_table(filepath_or_buffer,sep,delimiter,header,names,index_col,usecols,squeeze,prefix,mangle_dupe_cols,dtype,engine,converters,true_values,false_values,skipinitialspace,skiprows,skipfooter,nrows,na_values,keep_default_na,na_filter,verbose,skip_blank_lines,parse_dates,infer_datetime_format,keep_date_col,date_parser,dayfirst,cache_dates,iterator,chunksize,compression,thousands,decimal,lineterminator,quotechar,quoting,doublequote,escapechar,comment,encoding,dialect,error_bad_lines,warn_bad_lines,delim_whitespace,low_memory,memory_map,float_precision)
    765         # default to avoid a ValueError
    766         sep = ","
--> 767     return read_csv(**locals())
    768 
    769 

/hpc/packages/minerva-centos7/py_packages/3.7/lib/python3.7/site-packages/pandas/io/parsers.py in read_csv(filepath_or_buffer,float_precision)
    686     )
    687 
--> 688     return _read(filepath_or_buffer,kwds)
    689 
    690 

/hpc/packages/minerva-centos7/py_packages/3.7/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer,kwds)
    458 
    459     try:
--> 460         data = parser.read(nrows)
    461     finally:
    462         parser.close()

/hpc/packages/minerva-centos7/py_packages/3.7/lib/python3.7/site-packages/pandas/io/parsers.py in read(self,nrows)
   1196     def read(self,nrows=None):
   1197         nrows = _validate_integer("nrows",nrows)
-> 1198         ret = self._engine.read(nrows)
   1199 
   1200         # May alter columns / col_dict

/hpc/packages/minerva-centos7/py_packages/3.7/lib/python3.7/site-packages/pandas/io/parsers.py in read(self,nrows)
   2155     def read(self,nrows=None):
   2156         try:
-> 2157             data = self._reader.read(nrows)
   2158         except StopIteration:
   2159             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers._concatenate_chunks()

<__array_function__ internals> in concatenate(*args,**kwargs)

MemoryError: Unable to allocate 15.3 MiB for an array with shape (2003397,) and data type float64
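
One thing that may be worth checking, as an assumption rather than a diagnosis: psutil.virtual_memory() reports memory for the whole machine, while a single process can be held to a much lower limit (for example by ulimit or by a batch scheduler). The traceback does not say which applies here. The sketch below uses the standard-library resource module (Unix only) to print the limits of the current process, and also confirms how small the failed request was. The helper names `describe_limit` and `process_memory_limits` are made up for this illustration.

```python
import resource

# Size of the allocation that failed: 2003397 float64 values, 8 bytes each.
failed_bytes = 2003397 * 8
print(f"failed request: {failed_bytes / 2**20:.1f} MiB")


def describe_limit(value):
    """Format an rlimit value, which is a byte count or RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.2f} GiB"


def process_memory_limits():
    """Return (soft, hard) limits for address space and data segment."""
    limits = {}
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (describe_limit(soft), describe_limit(hard))
    return limits


for name, (soft, hard) in process_memory_limits().items():
    print(f"{name}: soft={soft}, hard={hard}")
```

If either limit is far below the memory psutil reports as available, the process could run out of address space long before the machine does. Limits enforced through other mechanisms, such as cgroups, would not show up in this output.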
