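How can I optimize 1D interpolation over a multidimensional data array?

Problem description

I have a 4D data array, and I interpolate it along the vertical axis only.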

import numpy as np
from scipy.interpolate import interp1d

# data array dims: da[time, plev, lat, lon]

# array with vertical levels
lev = da.plev

# new temperatures -> dummy values
tem = np.arange(10, 100, 5)

# output array (assumed shape; not defined in the original snippet)
holder = np.empty((da.time.size, tem.size, da.lat.size, da.lon.size))

# begin loop for interpolation
for time in range(da.time.size):
    for lat in range(da.lat.size):
        for lon in range(da.lon.size):
            f = interp1d(da[time, :, lat, lon], lev, fill_value='extrapolate')
            holder[time, :, lat, lon] = f(tem)

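The code works, but it takes a while to run. I am still learning apply_ufunc and dask. I have seen some examples here, and I think they could cut the run time considerably (at least compared with the for loops).

I tried running something like this: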

# return a tuple of DataArrays
res = xr.apply_ufunc(interp1d,hus,input_core_dims=[['plev'],['plev']],output_core_dims=[[]],vectorize=True)

But when I try to use the interpolation function:

holder = res(tem)

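I get an error saying that the DataArray object is not callable.

Update

I tried the following code, which puts the interpolator inside a function. I know it is working, because I printed some results just before the return statement. The problem is with the return statement.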

def interp(x,y):
    # Wrapper around scipy interp1d to use in apply_ufunc
    f = interp1d(x,y,fill_value='extrapolate')
    new = f(tem)
    return (new)

holder_new = xr.apply_ufunc(interp,check,p,vectorize=True)

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<timed exec> in <module>

~/anaconda3/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func,input_core_dims,output_core_dims,exclude_dims,vectorize,join,dataset_join,dataset_fill_value,keep_attrs,kwargs,dask,output_dtypes,output_sizes,Meta,dask_gufunc_kwargs,*args)
   1108             join=join,
   1109             exclude_dims=exclude_dims,
-> 1110             keep_attrs=keep_attrs,
   1111         )
   1112     # Feed Variables directly through apply_variable_ufunc

~/anaconda3/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func,signature,*args)
    260 
    261     data_vars = [getattr(a,"variable",a) for a in args]
--> 262     result_var = func(*data_vars)
    263 
    264     if signature.num_outputs > 1:

~/anaconda3/lib/python3.7/site-packages/xarray/core/computation.py in apply_variable_ufunc(func,*args)
    698             )
    699 
--> 700     result_data = func(*input_data)
    701 
    702     if signature.num_outputs == 1:

~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in __call__(self,*args,**kwargs)
   2106             vargs.extend([kwargs[_n] for _n in names])
   2107 
-> 2108         return self._vectorize_call(func=func,args=vargs)
   2109 
   2110     def _get_ufunc_and_otypes(self,func,args):

~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call(self,args)
   2180         """Vectorized call to `func` over positional `args`."""
   2181         if self.signature is not None:
-> 2182             res = self._vectorize_call_with_signature(func,args)
   2183         elif not args:
   2184             res = func()

~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call_with_signature(self,args)
   2244 
   2245             for output,result in zip(outputs,results):
-> 2246                 output[index] = result
   2247 
   2248         if outputs is None:

ValueError: setting an array element with a sequence.
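
The last frames of the traceback are inside np.vectorize, which suggests the error comes from a function returning an array where one scalar per call is expected. The following is a minimal NumPy-only sketch of that situation, not the asker's actual data. The function `returns_vector` and the signatures used here are illustrative assumptions.

```python
import numpy as np

def returns_vector(x):
    # Hypothetical stand-in for the wrapper: returns a length-3 array per scalar input
    return x * np.ones(3)

# "()->()" declares a scalar output, so storing a length-3 result in one slot fails
scalar_out = np.vectorize(returns_vector, signature="()->()")
try:
    scalar_out(np.arange(4.0))
    failed = False
except (ValueError, TypeError) as err:
    failed = True
    print(type(err).__name__, err)

# "()->(n)" declares one extra output dimension, so the same function now works
vector_out = np.vectorize(returns_vector, signature="()->(n)")
result = vector_out(np.arange(4.0))
print(result.shape)
```

In apply_ufunc, the counterpart of that output signature is the output_core_dims argument.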

Solution
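
One possible approach, offered as a sketch rather than a confirmed fix: declare plev as the core dimension of both inputs, and declare a new output core dimension for the interpolated values so that apply_ufunc knows the function returns an array. The synthetic `da`, the function name `interp_column`, and the output dimension name "tem" are assumptions made for illustration. The real data may need different dimension names.

```python
import numpy as np
import xarray as xr
from scipy.interpolate import interp1d

tem = np.arange(10, 100, 5)  # target values, as in the question

def interp_column(x, y):
    # x: values along plev for one column, y: the plev levels
    f = interp1d(x, y, fill_value="extrapolate")
    return f(tem)  # 1D array with one entry per target value

# Synthetic stand-in for the real 4D array (assumed dims: time, plev, lat, lon)
rng = np.random.default_rng(0)
ntime, nlev, nlat, nlon = 2, 6, 3, 4
plev = np.linspace(1000.0, 100.0, nlev)
profile = np.linspace(5.0, 105.0, nlev)
data = profile[None, :, None, None] + rng.uniform(0, 1, (ntime, 1, nlat, nlon))
da = xr.DataArray(data, dims=("time", "plev", "lat", "lon"), coords={"plev": plev})

res = xr.apply_ufunc(
    interp_column,
    da,
    da.plev,
    input_core_dims=[["plev"], ["plev"]],  # both inputs are consumed along plev
    output_core_dims=[["tem"]],            # each call returns a vector along "tem"
    vectorize=True,                        # loop over time, lat, lon via np.vectorize
)
res = res.assign_coords(tem=tem)
print(res.dims, res.shape)
```

Note that vectorize=True still loops in Python internally, so this mainly tidies the code. Larger speedups would likely need dask chunking or a natively vectorized interpolation routine.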
