Thrust:并行计算多个segment的set_difference

问题描述

我编写了一些代码来使用推力计算每段的 set_difference。这个想法是使用额外的数组来指示哪个元素属于哪个段,以及一个自定义比较器。 这会产生正确的输出集,但如果输入段为空,则输出段大小是错误的。 Reduce-by-key 用于使用每个元素的输出段标识符来计算每个段的元素数量。如果输入段为空,则相应的段 id 不会出现在输出段 id 中,从而导致错误结果。

正确:

segmentsizesLeft {4,3}
segmentIdsLeft {0,1,1}

outputsegmentIds {0,1}
outputsegmentsizes {3,1}

错误:

segmentsizesLeft {0,3}
segmentIdsLeft {1,1}

outputsegmentIds {1}
outputsegmentsizes {1,uninitialized} //should be {0,1}

如何解决?

#include <thrust/set_operations.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/tuple.h>
#include <thrust/reduce.h>
#include <thrust/iterator/constant_iterator.h>
#include <thrust/iterator/discard_iterator.h>

#include <iostream>

int main(){

    //compute set_difference(left,right),per segment
    //output: contiguous range of remaining elements,array of output segment sizes

#if 1    
    //correct result
    int numLeft = 7;
    int dataLeft[7]{0,2,3,4};
    int segmentsizesLeft[2]{4,3};
    int psLeft[3]{0,4,7};
    int segmentIdsLeft[7]{0,1};
#else 
    //wrong result. empty input segment is no longer present in output segment sizes
    int numLeft = 3;
    int dataLeft[3]{1,4};
    int segmentsizesLeft[2]{0,3};
    int segmentIdsLeft[3]{1,1};
#endif    
 
    int dataRight[7]{2,4};
    int segmentsizesRight[2]{1,2};
    int psRight[3]{0,3};
    int segmentIdsRight[3]{0,1};

    auto first1 = thrust::make_zip_iterator(thrust::make_tuple(&segmentIdsLeft[0],&dataLeft[0]));
    auto last1 = thrust::make_zip_iterator(thrust::make_tuple(&segmentIdsLeft[0] + numLeft,&dataLeft[0] + numLeft));

    auto first2 = thrust::make_zip_iterator(thrust::make_tuple(&segmentIdsRight[0],&dataRight[0]));
    auto last2 = thrust::make_zip_iterator(thrust::make_tuple(&segmentIdsRight[0] + 3,&dataRight[0] + 3));

    int segmentIdsOutput[7];
    int dataOutput[7];
    int segmentsizesOutput[2];

    auto output = thrust::make_zip_iterator(thrust::make_tuple(&segmentIdsOutput[0],&dataOutput[0]));

    auto comp = [](const auto& t1,const auto& t2){
        const int idl = thrust::get<0>(t1);
        const int idr = thrust::get<0>(t2);

        if(idl < idr) return true;
        if(idl > idr) return false;

        return thrust::get<1>(t1) < thrust::get<1>(t2);
    };

    auto outputend = thrust::set_difference(first1,last1,first2,last2,output,comp);

    int outputsize = thrust::distance(output,outputend);

    thrust::reduce_by_key(&segmentIdsOutput[0],&segmentIdsOutput[0] + outputsize,thrust::make_constant_iterator(1),thrust::make_discard_iterator(),&segmentsizesOutput[0]
    );

    std::cerr << "raw data output: ";
    for(int i = 0; i < outputsize; i++){
        std::cerr << " " << dataOutput[i];
    }
    std::cerr << "\n";

    std::cerr << "result segment sizes: ";
    for(int i = 0; i < 2; i++){
        std::cerr << " " << segmentsizesOutput[i];
    }
    std::cerr << "\n";
}

解决方法

以下方法正在起作用。不是丢弃唯一的段 ID 并将减少的值存储在输出数组中,而是将两个输出都保存到临时存储中。然后,将输出数组初始化为零以考虑任何空段,并使用键值对在正确位置设置段大小。

int uniqueIds[7];
int reducedCounts[7];

int numUnique = thrust::distance(
    &uniqueIds[0],thrust::reduce_by_key(
        &segmentIdsOutput[0],&segmentIdsOutput[0] + outputsize,thrust::make_constant_iterator(1),&uniqueIds[0],&reducedCounts[0]
    ).first
);

thrust::fill(&segmentsizesOutput[0],&segmentsizesOutput[0] + 2,0);

thrust::for_each_n(
    thrust::make_zip_iterator(thrust::make_tuple(&uniqueIds[0],&reducedCounts[0])),numUnique,[&] (auto tup){
        segmentsizesOutput[thrust::get<0>(tup)] = thrust::get<1>(tup);
    }
);

相关问答

错误1:Request method ‘DELETE‘ not supported 错误还原:...
错误1:启动docker镜像时报错:Error response from daemon:...
错误1:private field ‘xxx‘ is never assigned 按Alt...
报错如下,通过源不能下载,最后警告pip需升级版本 Requirem...