How to drop non-matching time steps between two xarray time-series datasets

Problem

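I have two xarray datasets whose time coordinates partly match and partly do not. I want to drop from dataset 2 every time step that does not also appear in the time coordinate of dataset 1.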

ds1
    <xarray.Dataset>
    Dimensions:      (time: 149,x: 311,y: 266)
    Coordinates:
      * y            (y) float64 -3.256e+06 -3.256e+06 ... -3.263e+06 -3.263e+06
        spatial_ref  int32 3577
      * time         (time) datetime64[ns] 2016-01-01T00:09:15.704000 ... 2020-12...
      * x            (x) float64 1.913e+06 1.913e+06 1.913e+06 ... 1.92e+06 1.92e+06
    Data variables:
        FMCOB          (time,y,x) float64 78.63 48.68 85.0 ... 42.16 91.27 52.36
        Forest       (x,y) int64 0 0 0 3 3 3 3 3 0 0 0 0 ... 0 0 3 3 3 3 3 3 3 3 3
        Grass        (x,y) int64 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0
        Shrub        (x,y) int64 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0
    Attributes:
        crs:           epsg:3577
        grid_mapping:  spatial_ref
        units:         % dry matter

ds2
    <xarray.Dataset>
    Dimensions:      (time: 155,x: 76,y: 47)
    Coordinates:
      * y            (y) float64 -3.257e+06 -3.257e+06 ... -3.258e+06 -3.258e+06
        spatial_ref  int32 3577
      * time         (time) datetime64[ns] 2016-01-01T00:09:15.704000 ... 2020-12...
      * x            (x) float64 1.919e+06 1.919e+06 ... 1.921e+06 1.921e+06
    Data variables:
        FMCOB          (time,x) float64 81.67 87.5 74.4 95.0 ... nan 58.39 85.96
        Forest       (x,y) int64 0 0 0 0 0 0 3 3 3 3 3 3 ... 0 0 0 0 0 0 0 0 0 0 0
        Grass        (x,y) int64 0 0 1 1 1 1 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0
        Shrub        (x,y) int64 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 2 0 0 0 0 2
    Attributes:
        crs:           epsg:3577
        grid_mapping:  spatial_ref
        units:         % dry matter

This is what I tried:

for i in ds1.time:
    for k in ds2.time:
        if k!=i:
            
            ds2.drop_sel(time = np.datetime64(k))
            

But this raises the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-226-70fe5e9f97a4> in <module>
      4     for k in ds2.time:
      5         if k!=i:
----> 6             ds2.drop_sel(time = np.datetime64(k))

ValueError: Could not convert object to NumPy datetime   

Solution

If you want to select from ds2 all time slices that also exist in ds1, you can do this:

import numpy as np
# True for every ds2 time stamp that also occurs in ds1
time_ix = np.isin(ds2.time, ds1.time)
ds2_sel = ds2.sel(time=time_ix)

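Here, time_ix is a plain boolean array with one entry per element of ds2.time. An entry is True when that time stamp also appears in ds1.time, so the selection keeps only the matching time steps.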
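
To make the masking step concrete, here is a minimal sketch on synthetic data. The two small datasets, their dates, and the values are hypothetical stand-ins for ds1 and ds2, not the asker's data. The sketch also uses isel with the boolean mask, which is a choice made here for illustration; the answer above passes the same mask to sel.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-ins for ds1 and ds2 (not the asker's data)
t1 = pd.date_range("2016-01-01", periods=4, freq="D")  # Jan 1 .. Jan 4
t2 = pd.date_range("2016-01-03", periods=4, freq="D")  # Jan 3 .. Jan 6

ds1 = xr.Dataset({"FMCOB": ("time", np.arange(4.0))}, coords={"time": t1})
ds2 = xr.Dataset({"FMCOB": ("time", np.arange(10.0, 14.0))}, coords={"time": t2})

# Boolean mask over ds2.time: True where the time stamp also occurs in ds1.time
time_ix = np.isin(ds2.time.values, ds1.time.values)

# Keep only the ds2 time steps that are flagged True in the mask
ds2_sel = ds2.isel(time=time_ix)

print(ds2_sel.time.values)
```

With these inputs only the two shared dates (3 and 4 January 2016) remain in ds2_sel. As for the loop in the question: iterating over ds2.time yields zero-dimensional DataArray objects, which np.datetime64 cannot convert, and drop_sel returns a new dataset rather than modifying ds2 in place, so building one mask and selecting once avoids both issues.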