Problem reading an HDF file with pandas in Python

Problem description

I have a link to a NASA dataset in .hdf format (download link below):

https://reason.gesdisc.eosdis.nasa.gov/data/Vegetation_Indices/MODVI.005/2001/MODVI.200101.005.hdf

After downloading it locally, I tried the following code:

import pandas as pd
pd.read_hdf('MODVI.200101.005.hdf')

Unfortunately, this code raised an error (see below). Any help reading this file would be greatly appreciated!

HDF5ExtError                              Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\io\pytables.py in open(self,mode,**kwargs)
    654         try:
--> 655             self._handle = tables.open_file(self._path,self._mode,**kwargs)
    656         except IOError as err:  # pragma: no cover

~\anaconda3\lib\site-packages\tables\file.py in open_file(filename,title,root_uep,filters,**kwargs)
    314     # Finally,create the File instance,and return it
--> 315     return File(filename,**kwargs)
    316 

~\anaconda3\lib\site-packages\tables\file.py in __init__(self,filename,**kwargs)
    777         # Now,it is time to initialize the File extension
--> 778         self._g_new(filename,**params)
    779 

tables/hdf5extension.pyx in tables.hdf5extension.File._g_new()

HDF5ExtError: HDF5 error back trace

  File "C:\ci\hdf5_1545244154871\work\src\H5F.c",line 509,in H5Fopen
    unable to open file
  File "C:\ci\hdf5_1545244154871\work\src\H5Fint.c",line 1400,in H5F__open
    unable to open file
  File "C:\ci\hdf5_1545244154871\work\src\H5Fint.c",line 1700,in H5F_open
    unable to read superblock
  File "C:\ci\hdf5_1545244154871\work\src\H5Fsuper.c",line 411,in H5F__super_read
    file signature not found

End of HDF5 error back trace

Unable to open/create file 'MODVI.200101.005.hdf'

During handling of the above exception,another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-13-ee439b8cf983> in <module>
      1 import pandas as pd
----> 2 pd.read_hdf('MODVI.200101.005.hdf')

~\anaconda3\lib\site-packages\pandas\io\pytables.py in read_hdf(path_or_buf,key,errors,where,start,stop,columns,iterator,chunksize,**kwargs)
    395             raise FileNotFoundError(f"File {path_or_buf} does not exist")
    396 
--> 397         store = HDFStore(path_or_buf,mode=mode,errors=errors,**kwargs)
    398         # can't auto open/close if we are using an iterator
    399         # so delegate to the iterator

~\anaconda3\lib\site-packages\pandas\io\pytables.py in __init__(self,path,complevel,complib,fletcher32,**kwargs)
    535         self._fletcher32 = fletcher32
    536         self._filters = None
--> 537         self.open(mode=mode,**kwargs)
    538 
    539     def __fspath__(self):

~\anaconda3\lib\site-packages\pandas\io\pytables.py in open(self,**kwargs)
    685             # is not part of IOError,make it one
    686             if self._mode == "r" and "Unable to open/create file" in str(err):
--> 687                 raise IOError(str(err))
    688             raise
    689 

OSError: HDF5 error back trace

  File "C:\ci\hdf5_1545244154871\work\src\H5F.c",in H5F__super_read
    file signature not found

End of HDF5 error back trace

Unable to open/create file 'MODVI.200101.005.hdf'

Solution

Try running

pip install tables

The problem appears to be related to the tables (PyTables) package, which pandas uses to read HDF files.
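The traceback ends in "file signature not found", which is what the HDF5 library reports when a file does not start with the HDF5 signature. Since pd.read_hdf only reads HDF5 (through PyTables), it may be worth checking which format the file actually is before changing anything else. The sketch below is only an illustration: detect_hdf_version and list_hdf4_datasets are hypothetical helper names, and the HDF4 branch assumes the third-party pyhdf package is installed. I have not verified the format of this particular file, although older NASA .hdf products are often HDF4.

```python
from pathlib import Path

# Leading magic bytes of each format. HDF5 may also place its signature at
# offsets 512, 1024, ... when a user block is present; this sketch only
# checks offset 0.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
HDF4_SIGNATURE = b"\x0e\x03\x13\x01"


def detect_hdf_version(path):
    """Return 'hdf5', 'hdf4', or 'unknown' based on the first bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if head.startswith(HDF4_SIGNATURE):
        return "hdf4"
    return "unknown"


def list_hdf4_datasets(path):
    """List the scientific datasets in an HDF4 file.

    Assumption: the optional third-party package pyhdf is installed.
    """
    from pyhdf.SD import SD, SDC  # imported lazily because it is optional

    sd = SD(str(path), SDC.READ)
    try:
        # Maps each dataset name to a tuple describing its dimensions and type.
        return sd.datasets()
    finally:
        sd.end()


if __name__ == "__main__":
    path = "MODVI.200101.005.hdf"
    if Path(path).exists():
        kind = detect_hdf_version(path)
        print("Detected format:", kind)
        if kind == "hdf5":
            import pandas as pd

            # Works without a key only if the file holds a single pandas object.
            print(pd.read_hdf(path))
        elif kind == "hdf4":
            print(list_hdf4_datasets(path))
```

If the check reports hdf4, installing or reinstalling tables is unlikely to help on its own, because PyTables is built on HDF5. In that case an HDF4-capable reader such as pyhdf would be needed, and the resulting arrays can then be wrapped in a DataFrame.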