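How do I perform certain operations on an Eigen tensor?

Problem description

I need to perform some operations on an Eigen tensor, but I have not found any examples or documentation.

I have two tensors: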

Eigen::Tensor<float, 3> feature_buffer(K,45,7); feature_buffer.setZero();

VectorXi number_buffer(K);

I need to perform the following operation on the tensor.

feature_buffer[:,:,-3:] = feature_buffer[:,:3] - \
    feature_buffer[:,:3].sum(axis=1,keepdims=True)/number_buffer.reshape(K,1,1)

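The code above is NumPy code. I have done everything else, but I am stuck on this last step.

Could someone help me with this? I have been stuck on it all day.

Thanks in advance.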

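Solution

I believe the NumPy operation is ill-posed in two places where the dimensions do not match. I am not very familiar with NumPy ndarray operations, so this may be a simple misunderstanding on my part, but if the operation does succeed, my guess is that NumPy can make an educated guess when some of the dimensions match.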
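
To see where the shapes stop lining up, it can help to spell out the rule NumPy applies. The sketch below assumes NumPy's usual broadcasting rule: shapes are aligned at the trailing dimension, and two sizes are compatible when they are equal or one of them is 1. The names `Shape`, `broadcastShape` and `assignable` are hypothetical helpers made up for this illustration; they are not part of Eigen or NumPy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<long>;

// Hypothetical helper: computes the shape of a NumPy-style elementwise
// operation on shapes a and b. Returns false if they cannot be broadcast.
bool broadcastShape(const Shape& a, const Shape& b, Shape& out) {
    const std::size_t n = std::max(a.size(), b.size());
    Shape result(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Align at the trailing dimension; missing leading dimensions count as 1.
        const long da = i < a.size() ? a[a.size() - 1 - i] : 1;
        const long db = i < b.size() ? b[b.size() - 1 - i] : 1;
        if (da != db && da != 1 && db != 1) return false;
        result[n - 1 - i] = (da == 1) ? db : da;
    }
    out = result;
    return true;
}

// Hypothetical helper (simplified): a value of shape src fits into a slice of
// shape dst if broadcasting only stretches src and leaves dst unchanged.
bool assignable(const Shape& dst, const Shape& src) {
    Shape r;
    return broadcastShape(dst, src, r) && r == dst;
}
```

Assuming that rule and a feature_buffer of shape (K,45,7): feature_buffer[:,:3] has shape (K,3,7), the sum with keepdims=True has shape (K,1,7), and both the division by (K,1,1) and the subtraction broadcast to (K,3,7). The step that does not fit is the assignment into feature_buffer[:,:,-3:], whose shape is (K,45,3). If the right-hand slices are read as [:,:,:3] instead, every step lines up: (K,45,3) minus (K,1,3), assigned into (K,45,3).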

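That said, I got the gist of what you are trying to accomplish, so I have written the equivalent C++ code step by step below. I took some liberties in reinterpreting the operation so that the dimensions match properly. Even if it turns out not to be exactly the same operation, I hope reading the syntax will help you work it out.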

#include <array>
#include <unsupported/Eigen/CXX11/Tensor>

int main(){

    long d0 = 10; // This is "K"
    long d1 = 10;
    long d2 = 10;
    Eigen::Tensor<float,3> feature_buffer(d0,d1,d2);
    Eigen::Tensor<float,1> number_buffer(d0);

    feature_buffer.setRandom();
    number_buffer.setRandom();

    // Step 1) Define numpy "feature_buffer[:,:,-3:]" in C++
    std::array<long,3> offsetA = {0,0,d2-3};
    std::array<long,3> extentA = {d0,d1,3};
    auto feature_sliceA        = feature_buffer.slice(offsetA,extentA);
    // Note: feature_sliceA is a "slice" object: it does not own the data in feature_buffer,
    //       it merely points to a rectangular subregion inside of feature_buffer.
    //       If you'd rather make a copy of that data, replace "auto" with "Eigen::Tensor<float,3>"
    //       (a copy would not write back into feature_buffer in Step 7, though).

    // Step 2) Define numpy "feature_buffer[:,:3]" in C++ (read here as the first 3 entries of dimension 2)
    std::array<long,3> offsetB = {0,0,0};
    std::array<long,3> extentB = {d0,d1,3};
    auto feature_sliceB        = feature_buffer.slice(offsetB,extentB);

    // Step 3) Perform the numpy operation "feature_buffer[:,:3].sum(axis=1,keepdims=True)"
    std::array<long,1> sumDims         = {1};
    std::array<long,3> newDims         = {d0,1,3}; // This takes care of "keepdims=True": d1 is summed over, then kept as size 1.
    Eigen::Tensor<float,3> feature_sum = feature_sliceB.sum(sumDims).reshape(newDims);

    // Step 4) The numpy division "feature_buffer[:,:3].sum(axis=1,keepdims=True)/number_buffer.reshape(K,1,1)"
    //         looks ill-formed: There are fewer elements in [:,:3] than in number_buffer.reshape(K,1,1).
    //         To go ahead, we could interpret this as dividing each of the 3 "columns" (in dimension 2) by number_buffer:
    //         Something like: "feature_sum/number_buffer.reshape(d0,1,3)"
    std::array<long,3> numBcast         = {1,1,3};
    std::array<long,3> numDims          = {d0,1,1};
    Eigen::Tensor<float,3> number_bcast = number_buffer.reshape(numDims).broadcast(numBcast);
    
    // Step 5) Perform the division operation

    Eigen::Tensor<float,3> feature_div = feature_sum/number_bcast;


    // Step 6) Perform the numpy subtraction
    //         "feature_buffer[:,:3] - feature_buffer[:,:3].sum(axis=1,keepdims=True)/number_buffer.reshape(K,1,1)"
    //         in our current program this corresponds to
    //              "feature_sliceB - feature_div"
    //          Actually, this is also ill-formed, since:
    //              feature_sliceB has dimensions (d0,d1,3) = (10,10,3)
    //              feature_div    has dimensions (d0,1,3)  = (10,1,3)
    //
    //          To go ahead we can reinterpret once again: Assume the subtraction happens once for each index along dimension 1.
    //          We use broadcast again to copy the contents of feature_div d1 times along dimension 1
    std::array<long,3> divBcast              = {1,d1,1};
    Eigen::Tensor<float,3> feature_div_bcast = feature_div.broadcast(divBcast);


    // Step 7) Perform the main assignment operation
    feature_sliceA = feature_sliceB - feature_div_bcast;

}

You can see the same code on godbolt.

I have not considered performance at all here. I am sure you can find a cleverer way to write this.
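
Since the expression above goes through several reshapes and broadcasts, a dependency-free loop version can serve as a reference to compare against. This is only a sketch of the reinterpreted operation: for every index i of dimension 0 and each of the first three entries k of dimension 2, take the sum over dimension 1, divide it by number_buffer(i), subtract that from each element, and store the result in the last three entries of dimension 2. `centerLastThree` is a hypothetical name, and the flat buffer is assumed to be column-major, which is the default layout of Eigen::Tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper, not part of Eigen.
// buf holds a (d0, d1, d2) tensor flattened in column-major order,
// so element (i, j, k) lives at index i + d0 * (j + d1 * k).
// counts plays the role of number_buffer and has d0 entries.
void centerLastThree(std::vector<float>& buf, const std::vector<float>& counts,
                     long d0, long d1, long d2) {
    assert(d2 >= 6);  // first 3 and last 3 entries of dimension 2 must not overlap
    assert(buf.size() == static_cast<std::size_t>(d0 * d1 * d2));
    assert(counts.size() == static_cast<std::size_t>(d0));

    auto at = [&](long i, long j, long k) -> float& {
        return buf[static_cast<std::size_t>(i + d0 * (j + d1 * k))];
    };

    for (long i = 0; i < d0; ++i) {
        for (long k = 0; k < 3; ++k) {
            // Sum over dimension 1, as in sum(axis=1, keepdims=True).
            float sum = 0.0f;
            for (long j = 0; j < d1; ++j) sum += at(i, j, k);
            const float scaled = sum / counts[static_cast<std::size_t>(i)];
            // Write the difference into the last three entries of dimension 2.
            for (long j = 0; j < d1; ++j) at(i, j, d2 - 3 + k) = at(i, j, k) - scaled;
        }
    }
}
```

With the program above, one could copy the contents of feature_buffer.data() into a std::vector<float> before Step 7, run this function on the copy, and compare the two buffers element by element, allowing for small floating-point differences because the summation order may differ.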