从简单的块中按名称选择尺寸

问题描述

我有一些grib格式的合奏文件,我想使用dask和xarray延迟加载到Python中。基于https://climate-cms.org/2018/09/14/dask-era-interim.html,我设法按预期方式延迟加载文件,但是现在我想切片并选择尺寸以绘制数据一段时间和水平。

我的程序如下:

import dask
import dask.array as da
import xarray as xr
import pandas as pd
import numpy as np
from glob import glob
from datetime import date,datetime,timedelta

import matplotlib.pyplot as plt

bpath = '/some/path/to/my/data'

# pressure levels
levels =['1000','925','850','700','500','300','250','200','50']

# ensemble member names
ensm = ['M01','M02','M03','M04','M05'] 

@dask.delayed
def open_file_delayed(file,vname):
    ds = xr.open_dataset(file,decode_cf='False',engine='pynio')
    return ds

def open_file(file,vname,nlevs,nlats,nlons,ftype):
    file_data = open_file_delayed(file,vname)[vname].data
    return da.from_delayed(file_data,(nlevs,nlons),ftype)

    # list of files to open (sorted by date)
    # filename mask: PREFIXMEMYYYYiMMiDDiHHiYYYYfMMfDDfHHf.grb
    # MEM: member name (see the levels list)
    # YYYYiMMiDDiHHi: analysis date (passed as an argument to the open_file function)
    # YYYYfMMfDDfHHf: forecast date
    files = sorted(glob(bpath + '/%(dateanl)s/%(mem)s/PREFIX%(mem)s%(dateanl)s*.grb'%
                        {'dateanl': date,'mem': member}))
   
    # open the first file in the list to get dimensions and coordinates
    ds0 = xr.open_dataset(files[0],engine='pynio')
   
    var0 = ds0[vname]
   
    levs = ds0.lv_ISBL2.data
    lats = ds0.g4_lat_0.data
    lons = ds0.g4_lon_1.data
    
    nlevs = ds0.lv_ISBL2.size
    nlats = ds0.g4_lat_0.size
    nlons = ds0.g4_lon_1.size
    ftype = var0.dtype
   
    ds0.close()
    
    # calculate the date range of the forecasts,in my case len(date_fmt) = 61 (every grib file has 61 times and 9 levels)
    date_fmt = pd.date_range(start=datetime.strptime(date,"%Y%m%d%H"),freq="6H",periods=61)
        
    # call the function 'open_file' for all files contained in the list 'files' and concatenate
    dask_var = da.concatenate([open_file(file,ftype) for file in files],axis=0)
        
    # xda is the data array with all files
    xda = xr.DataArray(dask_var,dims=['tlev','lat','lon']) 
            
    # set coordinates values
    # note: tlev is a chunk of 9 levels * 61 forecast dates
    xda.coords['tlev'] = ('tlev',np.tile(levels,len(date_fmt)))
    xda.coords['lat'] = ('lat',lats)
    xda.coords['lon'] = ('lon',lons)
    
    return xda

我要使用此代码(对于单个分析日期-202005300-2020年5月30日,以及一个名为ZGEO的变量):

lens_zgeo = [open_ensemble('2020053000',ens,'ZGEO') for ens in ensm]
dens_zgeo = xr.concat(lens_zgeo,dim='ens')
dens_zgeo.coords['ens'] = ('ens',ensm)

dens_zgeo一个数据数组,其中tlev的大小为549,即9级* 61个预测时间。

我可以通过执行以下操作将dens_zgeoens维度分组:

tmp = dict(list(dens_zgeo.groupby('ens')))

然后我可以选择一个集合成员和tlev中的第一个坐标:

tmp['NMC'].isel(tlev=0).plot()

哪个给我一个不错的经/纬图。在这一点上,我还不清楚选择哪个级别或时间,这引发了一个问题:

如何使用“ sel”而不是“ isel”按名称来选择时间和级别?

对于我来说,这样做很自然:

tmp['NMC'].sel(lev='925',time='2020-05-30').plot()

但是似乎无法执行此操作,因为tlev是同时根据levelsdate_fmttlev = np.tile(levels,len(date_fmt)))定义的。

解决方法

暂无找到可以解决该程序问题的有效方法,小编努力寻找整理中!

如果你已经找到好的解决方法,欢迎将解决方案带上本链接一起发送给小编。

小编邮箱:dio#foxmail.com (将#修改为@)