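Problem description
I have some manually vectorized C++ code that I am trying to ship as a distributable binary via function multiversioning. Because the code uses SIMD intrinsics for different instruction sets (SSE2, AVX2, AVX512), it uses template specializations to decide which intrinsics to use.
The overall structure is roughly as follows: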
template <unsigned W, unsigned N> struct SIMD {}; // SIMD abstraction
template <> struct SIMD<128, 8> { // specialization for specific dimensions
    using Vec = __m128i;
    static always_inline Vec add(Vec a, Vec b) { return _mm_add_epi8(a, b); }
    ... // many other SIMD methods
};
... // many other dimension specializations for different instruction sets
template <unsigned W, unsigned N> class Worker {
    void docomputation(int x) {
        using S = SIMD<W, N>;
        ... // do computations using S:: methods
    }
};
The problem is that I need different instantiations of Worker to have different attributes, because each one targets a different instruction set. Something like this:
template __attribute__((target("avx2"))) void Worker<256, 8>::docomputation(int x);
template __attribute__((target("avx512bw"))) void Worker<512, 8>::docomputation(int x);
...
so that each instantiation is compiled for its own target. However, this still produces an error on Clang:
error: always_inline function 'add' requires target feature 'avx2', but would be inlined into function 'docomputation' that is compiled without support for 'avx2'
If I instead put a target attribute on the original method, it compiles, but on a machine without AVX-512 support it executes an illegal hardware instruction at runtime, so I guess my intuition about using annotated specializations as above does not work.
Is there a way to express this with Clang or GCC using function attributes?
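For reference, here is a minimal sketch of one arrangement that avoids the inlining error. It is not a confirmed answer to the question and it does not keep the Worker/SIMD split: it assumes the loop that uses the intrinsics can be moved into the same function that carries the target attribute, so no always_inline helper is ever inlined across a target boundary. All names here (Kernel, add_bytes, SKETCH_X86) are hypothetical, the byte-wise add is only a stand-in for the real computation, and the sketch assumes GCC or Clang, with a scalar fallback on non-x86 targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1  // hypothetical macro: x86 intrinsics are available
#endif

// Hypothetical kernel template: W is the vector width in bits, 0 means scalar.
template <unsigned W> struct Kernel;

// Scalar version, compiled for the baseline target.
template <> struct Kernel<0> {
    static void add(const std::uint8_t* a, const std::uint8_t* b,
                    std::uint8_t* out, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            out[i] = static_cast<std::uint8_t>(a[i] + b[i]);
    }
};

#ifdef SKETCH_X86
// AVX2 version. Members of a full class specialization are ordinary
// functions, so the target attribute applies to exactly this function.
// The loop and the intrinsics live in the same AVX2-enabled function.
template <> struct Kernel<256> {
    __attribute__((target("avx2")))
    static void add(const std::uint8_t* a, const std::uint8_t* b,
                    std::uint8_t* out, std::size_t n) {
        std::size_t i = 0;
        for (; i + 32 <= n; i += 32) {
            __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
            __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
            _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
                                _mm256_add_epi8(va, vb));
        }
        // Remaining tail elements go through the scalar kernel.
        Kernel<0>::add(a + i, b + i, out + i, n - i);
    }
};
#endif

// Runtime dispatch: the AVX2 kernel is only called when the CPU reports AVX2.
inline void add_bytes(const std::uint8_t* a, const std::uint8_t* b,
                      std::uint8_t* out, std::size_t n) {
    assert(n == 0 || (a && b && out));
#ifdef SKETCH_X86
    if (__builtin_cpu_supports("avx2")) {
        Kernel<256>::add(a, b, out, n);
        return;
    }
#endif
    Kernel<0>::add(a, b, out, n);
}
```

Calling add_bytes(a, b, out, n) picks the kernel at runtime, so the same binary should run on machines with and without AVX2. The trade-off is that each instruction set gets its own function body instead of sharing one templated docomputation.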