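Problem description
I am trying to find out why the run time in my benchmark grows only linearly with the array size. As far as I know, my L1 data cache is 32 KiB. I expected to see a large jump at some point in the benchmark, but there is none. Does anyone know what is going wrong?
Is my assumption correct that iterating over a small array that fits in L1 (for example, 512 bytes) should be orders of magnitude faster than iterating over one served from L3/RAM?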
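To make the sizes concrete, here is a minimal sketch that converts each array length into bytes and reports the smallest cache level that could hold it. Only the 32 KiB L1 data cache figure comes from this post; the 256 KiB L2 and 12 MiB L3 values are assumptions about this CPU, and the class and method names are made up for illustration. The 16-byte array header is ignored.

```java
public class WorkingSetSize {
    // Only the L1d figure is stated in the post; L2 and L3 sizes are assumptions.
    static final long L1D_BYTES = 32L * 1024;
    static final long L2_BYTES = 256L * 1024;        // assumption (per core)
    static final long L3_BYTES = 12L * 1024 * 1024;  // assumption (shared)

    /** Bytes occupied by the elements of a long[] of the given length. */
    static long payloadBytes(int longs) {
        return (long) longs * Long.BYTES;
    }

    /** Smallest cache level whose capacity covers the payload, or "RAM". */
    static String smallestFittingLevel(int longs) {
        long bytes = payloadBytes(longs);
        if (bytes <= L1D_BYTES) return "L1";
        if (bytes <= L2_BYTES) return "L2";
        if (bytes <= L3_BYTES) return "L3";
        return "RAM";
    }

    public static void main(String[] args) {
        // Same doubling sequence as the benchmark's size parameter.
        for (int longs = 64; longs <= 524288; longs *= 2) {
            System.out.printf("%7d longs = %8d bytes -> %s%n",
                    longs, payloadBytes(longs), smallestFittingLevel(longs));
        }
    }
}
```

Under these assumed sizes, the largest parameter (524288 longs, 4 MiB) would still fit in L3, so this parameter range may never reach main memory at all.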
System information
- OS: macOS
- CPU: i7-9750H
- In theory: 32 KiB L1 data cache (https://en.wikichip.org/wiki/intel/core_i7/i7-9750h)
Benchmark results on my machine
Benchmark (longs) Mode Cnt Score Error Units
AdditionBenchmark.addition 64 avgt 5 15.899 ± 0.434 us/op
AdditionBenchmark.addition 128 avgt 5 31.062 ± 0.855 us/op
AdditionBenchmark.addition 256 avgt 5 61.062 ± 0.524 us/op
AdditionBenchmark.addition 512 avgt 5 121.283 ± 3.414 us/op
AdditionBenchmark.addition 1024 avgt 5 243.102 ± 2.684 us/op
AdditionBenchmark.addition 2048 avgt 5 483.627 ± 3.025 us/op
AdditionBenchmark.addition 4096 avgt 5 969.184 ± 20.331 us/op
AdditionBenchmark.addition 8192 avgt 5 1948.989 ± 43.016 us/op
AdditionBenchmark.addition 16384 avgt 5 3848.305 ± 95.566 us/op
AdditionBenchmark.addition 32768 avgt 5 7755.592 ± 205.539 us/op
AdditionBenchmark.addition 65536 avgt 5 16663.239 ± 388.075 us/op
AdditionBenchmark.addition 131072 avgt 5 33894.256 ± 1218.559 us/op
AdditionBenchmark.addition 262144 avgt 5 67138.158 ± 512.422 us/op
AdditionBenchmark.addition 524288 avgt 5 139241.696 ± 9124.094 us/op
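One way to read this table is to divide each score by the number of elements visited per invocation (array length times the 1000 outer iterations). The sketch below does that for a few rows; the class and method names are hypothetical. The per-element cost stays at roughly a quarter of a nanosecond from the smallest to the largest array, which is consistent with a sequential scan whose memory latency is being hidden, though that interpretation is an assumption and not something the numbers prove on their own.

```java
public class PerElementCost {
    // Outer loop count used inside the benchmark method.
    static final int OUTER_ITERATIONS = 1000;

    /** Converts a JMH score in us/op into nanoseconds per array element visited. */
    static double nanosPerElement(double microsPerOp, int longs) {
        double elementsVisited = (double) longs * OUTER_ITERATIONS;
        return microsPerOp * 1000.0 / elementsVisited;
    }

    public static void main(String[] args) {
        // A few (size, score) pairs copied from the results table above.
        int[] sizes = {64, 4096, 65536, 524288};
        double[] scores = {15.899, 969.184, 16663.239, 139241.696};
        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("%7d longs: %.4f ns/element%n",
                    sizes[i], nanosPerElement(scores[i], sizes[i]));
        }
    }
}
```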
Benchmark code
import java.io.IOException;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@Warmup(iterations = 5, time = 2)
@Measurement(iterations = 5, time = 2)
@Fork(value = 1, jvmArgsAppend = {
        "-Xmx4G", "-Xms4G"
})
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class AdditionBenchmark {

    long[] array;

    // Array length in longs (8 bytes each).
    @Param({"64", "128", "256", "512", "1024", "2048", "4096", "8192", "16384", "32768", "65536", "131072", "262144", "524288"})
    int longs;

    @Setup(Level.Trial)
    public void setup() {
        ThreadLocalRandom r = ThreadLocalRandom.current();
        array = r.longs().limit(longs).toArray();
    }

    // Scans the whole array sequentially, 1000 times per invocation.
    @Benchmark
    public int addition() {
        int c = 0;
        for (int i = 0; i < 1000; i++) {
            for (long w : array) {
                c += w; // compound assignment narrows the long to int
            }
        }
        return c;
    }

    public static void main(String[] args) throws IOException {
        org.openjdk.jmh.Main.main(args);
    }
}
UPDATE 1: a more random walk, still almost linear
Benchmark (longs) Mode Cnt Score Error Units
AdditionBenchmark.addition 64 avgt 5 108.005 ± 9.939 us/op
AdditionBenchmark.addition 128 avgt 5 224.575 ± 5.631 us/op
AdditionBenchmark.addition 256 avgt 5 446.561 ± 3.309 us/op
AdditionBenchmark.addition 512 avgt 5 891.957 ± 9.686 us/op
AdditionBenchmark.addition 1024 avgt 5 1790.980 ± 65.162 us/op
AdditionBenchmark.addition 2048 avgt 5 3572.439 ± 108.912 us/op
AdditionBenchmark.addition 4096 avgt 5 7136.228 ± 42.287 us/op
AdditionBenchmark.addition 8192 avgt 5 14236.296 ± 224.648 us/op
AdditionBenchmark.addition 16384 avgt 5 28723.188 ± 832.471 us/op
AdditionBenchmark.addition 32768 avgt 5 57640.325 ± 1562.439 us/op
AdditionBenchmark.addition 65536 avgt 5 181636.470 ± 6433.434 us/op
@Benchmark
public int addition() {
    int c = 0;
    int mask = array.length - 1;
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < array.length; j++) {
            // The next index depends on the running sum.
            c += array[c & mask];
        }
    }
    return c;
}
UPDATE 2: now it makes sense
Benchmark (longs) Mode Cnt Score Error Units
L1
AdditionBenchmark.addition 64 avgt 5 127.561 ± 6.875 us/op
AdditionBenchmark.addition 128 avgt 5 251.343 ± 5.783 us/op
AdditionBenchmark.addition 256 avgt 5 502.992 ± 6.485 us/op
AdditionBenchmark.addition 512 avgt 5 1002.569 ± 4.776 us/op
AdditionBenchmark.addition 1024 avgt 5 2008.738 ± 51.341 us/op
AdditionBenchmark.addition 2048 avgt 5 4020.453 ± 37.940 us/op
AdditionBenchmark.addition 4096 avgt 5 8058.776 ± 99.703 us/op
L2
AdditionBenchmark.addition 8192 avgt 5 22918.915 ± 263.238 us/op
AdditionBenchmark.addition 16384 avgt 5 53222.615 ± 1162.671 us/op
AdditionBenchmark.addition 32768 avgt 5 117576.770 ± 2098.845 us/op
L3
AdditionBenchmark.addition 65536 avgt 5 528979.627 ± 16870.041 us/op
Main memory
Going from 1048576 to 2097152 longs costs roughly another 5x slowdown, but that run takes too long to complete.
@Setup // new approach for setup
public void setup() {
    // Fill with 0..longs-1, then shuffle (ArrayUtils is from Apache Commons Lang).
    array = LongStream.range(0, longs).limit(longs).toArray();
    ArrayUtils.shuffle(array);
}

@Benchmark
public int addition() {
    int c = 0;
    int mask = array.length - 1;
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < array.length; j++) {
            // The index mixes the running sum with the loop counter.
            c += array[(c ^ j) & mask];
        }
    }
    return c;
}
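The setup above relies on ArrayUtils.shuffle from Apache Commons Lang. For anyone who wants to reproduce it without that dependency, a minimal sketch of the same idea follows: fill the array with 0..n-1 and apply a Fisher-Yates shuffle. The class name, method name, and the seed parameter are hypothetical additions for illustration, not part of the original benchmark.

```java
import java.util.Random;
import java.util.stream.LongStream;

public class ShuffledIndices {
    /** Returns the values 0..n-1 in pseudo-random order (Fisher-Yates shuffle). */
    static long[] shuffledRange(int n, long seed) {
        long[] a = LongStream.range(0, n).toArray();
        Random r = new Random(seed); // fixed seed keeps runs repeatable
        for (int i = n - 1; i > 0; i--) {
            int j = r.nextInt(i + 1); // uniform in [0, i]
            long tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
        return a;
    }

    public static void main(String[] args) {
        long[] a = shuffledRange(8, 42L);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

In the benchmark, the body of setup() could then be replaced by something like array = ShuffledIndices.shuffledRange(longs, 42L), assuming a fixed seed is acceptable.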