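How do I generate chunks of sliding windows while iterating over a time series?

Problem description

Edited:

I have a time series, for example ts = [[0 0][1 1][2 2][3 3][4 4][5 5][6 6][7 7][8 8]], and I want to split it into the following two sequences: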

 X = [[[[0][1]][[1][2]][[2][3]]] [[[1][2]][[2][3]][[3][4]]] [[[2][3]][[3][4]][[4][5]]] [[[3][4]][[4][5]][[5][6]]] [[[4][5]][[5][6]][[6][7]]] [[[5][6]][[6][7]][[7][8]]]] 
y = [[3][4][5][6][7][8]]

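X is a sequence of chunks, each made of three sliding windows of two steps, and y is the target for each chunk (the last value that chunk covers). My strategy is to first apply the following method: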

import numpy as np

def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        prev_end_ix = end_ix - 1
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # inputs: every column but the last; output: last column at the window's final step
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[prev_end_ix:end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)

which returns, for n_steps = 2:

X =[[[0][1]] [[1][2]] [[2][3]] [[3][4]] [[4][5]] [[5][6]] [[6][7]] [[7][8]]] 
y = [[1][2][3][4][5][6][7][8]]

Then I apply the following two methods to that result to obtain the desired output:

def separar_uni_X(sequencia,n_passos):
    X = list()
    for i in range(len(sequencia)):
        # find the end of this pattern
        end_ix = i + n_passos
        # check if we are beyond the sequence
        if end_ix > len(sequencia):
            break
        # gather input and output parts of the pattern
        seq_x = sequencia[i:end_ix,:]
        X.append(seq_x)
    return np.array(X)

def separar_uni_y(sequencia,n_passos):
    y = list()
    for i in range(len(sequencia)):
        # find the end of this pattern
        end_ix = i + n_passos
        # check if we are beyond the sequence
        if end_ix > len(sequencia):
            break
        # gather input and output parts of the pattern
        seq_y = sequencia[i:end_ix,:]
        y.append(seq_y[-1])
    return np.array(y)
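
To make the two-step pipeline concrete, here is a minimal sketch that reproduces it on the example ts with a window of 2 steps and chunks of 3 windows. The helper name `sliding` is hypothetical and only stands in for the loops above; it is not part of the original code.

```python
import numpy as np

def sliding(a, n):
    """Stack every run of n consecutive items of `a` along a new first axis."""
    return np.array([a[i:i + n] for i in range(len(a) - n + 1)])

ts = np.column_stack([np.arange(9), np.arange(9)])  # shape (9, 2)

# Step 1: two-step windows over the input column; the target is the
# last value of each window.
X1 = sliding(ts[:, :-1], 2)          # shape (8, 2, 1)
y1 = sliding(ts[:, -1], 2)[:, -1:]   # shape (8, 1)

# Step 2: group three consecutive windows into one chunk and keep the
# target of the last window in each chunk.
X = sliding(X1, 3)                   # shape (6, 3, 2, 1)
y = sliding(y1, 3)[:, -1]            # shape (6, 1)

print(X.shape, y.ravel())
```

Both X1 and X are fully materialised here, which is exactly the intermediate storage that becomes a problem for long series.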

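Problem: to get the desired output, the result of the first method has to be stored and passed on to the second, and when the sequence is too long this exceeds the available memory. To work around this, I tried to break the process into sub-steps inside a single function: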

def split_sequence_3D(sequences, n_steps, batch_size):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        prev_end_ix = end_ix - 1
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[prev_end_ix:end_ix, -1]
        sub_X, sub_y = [], []
        for j in range(batch_size):
            sub_X.append(seq_x)
            sub_y.append(seq_y)
        X.append(sub_X)
        y.append(sub_y[-1])
    return np.array(X), np.array(y)

which, for obvious reasons, gives the wrong output:

X = [[[[0][1]][[0][1]][[0][1]]] [[[1][2]][[1][2]][[1][2]]] [[[2][3]][[2][3]][[2][3]]] [[[3][4]][[3][4]][[3][4]]] [[[4][5]][[4][5]][[4][5]]] [[[5][6]][[5][6]][[5][6]]] [[[6][7]][[6][7]][[6][7]]] [[[7][8]][[7][8]][[7][8]]]]
y = [[1][2][3][4][5][6][7][8]]
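
The repetition comes from the inner loop: j never enters the slice, so the same window is appended batch_size times. A minimal sketch of the difference, using a hypothetical 1-D list rather than the 2-D ts so the effect is easy to read:

```python
seq = list(range(6))
n_steps, batch_size = 2, 3
i = 0  # start of the current chunk

# What the inner loop above does: j is unused, so the window never moves.
repeated = [seq[i:i + n_steps] for j in range(batch_size)]

# Offsetting the slice by j makes each window in the chunk advance one step.
shifted = [seq[i + j:i + j + n_steps] for j in range(batch_size)]

print(repeated)
print(shifted)
```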

I have searched extensively for alternatives but have not found any.

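Solution

Well, I really wanted to solve your problem, which was also mine. In the end the solution turned out to be simple: make the sliding-window iterator itself slide.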

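def input_3D(sequencia, lote, janela):
    # lote: number of windows per chunk; janela: number of time steps per window
    if lote > len(sequencia):
        raise ValueError('Tamanho do lote maior que o conjunto dos dados')
    if janela > len(sequencia):
        raise ValueError('Tamanho da janela maior que o conjunto dos dados')
    X_, y_ = [], []
    for j in range(len(sequencia)):
        # stop once a full chunk no longer fits
        # (note: this guard also skips the last chunk that would still fit;
        # comparing j+lote+janela-1 instead would include it)
        if j + lote + janela > len(sequencia):
            break
        X, y = [], []
        # j slides the chunk; i slides the window inside the chunk
        for i in range(j, j + lote, 1):
            end_ix = i + janela
            prev_end_ix = end_ix - 1
            seq_x, seq_y = sequencia[i:end_ix, :-1], sequencia[prev_end_ix:end_ix, -1]
            X.append(np.array(seq_x))
            y.append(np.array(seq_y[-1]))
        X_.append(np.array(X))
        # the chunk's target is the target of its last window
        y_.append(np.array(y[-1]))
    return np.array(X_), np.array(y_)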

Suppose your input is:

arr_x = list(range(0,100))
arr_y = list(range(0,100))
arr = np.stack([arr_x,arr_y])
arr = arr.T

Then your output (X followed by y; the shapes below correspond to lote = 4 and janela = 10) will be:

[[[[ 0]
   [ 1]
   [ 2]
   ...
   [ 7]
   [ 8]
   [ 9]]

  [[ 1]
   [ 2]
   [ 3]
   ...
   [ 8]
   [ 9]
   [10]]

  [[ 2]
   [ 3]
   [ 4]
   ...
   [ 9]
   [10]
   [11]]

  [[ 3]
   [ 4]
   [ 5]
   ...
   [10]
   [11]
   [12]]]

...

 [[[86]
   [87]
   [88]
   ...
   [93]
   [94]
   [95]]

  [[87]
   [88]
   [89]
   ...
   [94]
   [95]
   [96]]

  [[88]
   [89]
   [90]
   ...
   [95]
   [96]
   [97]]

  [[89]
   [90]
   [91]
   ...
   [96]
   [97]
   [98]]]] [12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 
 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98]
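
Since the original concern was memory, here is a hedged alternative sketch that builds the same chunk layout as strided views instead of copies. It assumes NumPy 1.20 or later, where `numpy.lib.stride_tricks.sliding_window_view` is available. The function name `chunked_windows` is hypothetical; the parameter names `lote` and `janela` are borrowed from input_3D above. Unlike input_3D as posted, this sketch also includes the final chunk that still fits.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def chunked_windows(seq, lote, janela):
    """Return (X, y) where X[j, m, k, f] == seq[j + m + k, f] for the input
    columns and y[j] is the last column at the final step of chunk j."""
    seq = np.asarray(seq)
    feats, target = seq[:, :-1], seq[:, -1]

    # Windows over time: sliding_window_view appends the window axis last,
    # giving (T - janela + 1, F, janela); move it next to the time axis.
    w = np.moveaxis(sliding_window_view(feats, janela, axis=0), -1, 1)

    # Chunks of `lote` consecutive windows, built the same way.
    X = np.moveaxis(sliding_window_view(w, lote, axis=0), -1, 1)

    # Target of chunk j is the value at index j + lote + janela - 2.
    y = target[lote + janela - 2:]
    return X, y

ts = np.column_stack([np.arange(9), np.arange(9)])
X, y = chunked_windows(ts, lote=3, janela=2)
print(X.shape, y)
```

The returned X is a read-only view on the original array, so no window is duplicated in memory until a slice is copied, for example with `np.array(X[a:b])` when feeding a model one batch of chunks at a time.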