How does Hyper-Threading affect the cache?

Problem description

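I ran into a strange phenomenon while using Hyper-Threading. I am measuring L2 hit latency in C. The timing test lives in program A; when A runs alone, it reports about 26 cycles per hit. I then wrote a second program, B, and bound B and A to the same physical core so that they run as sibling hyperthreads. B contains nothing but a useless loop, for example: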

for(;;){
}

In this case, when I run program A again, the reported L2 hit time drops to about 10 cycles.
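Binding two programs to sibling hardware threads, as described above, can be sketched on Linux roughly as follows. This is a minimal sketch, not the code from the original test: the helper names `pin_to_cpu` and `read_siblings` are hypothetical, and it assumes glibc's `sched_setaffinity` and the sysfs topology files are available on the machine.

```c
#define _GNU_SOURCE          /* needed for cpu_set_t, CPU_SET, sched_setaffinity */
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Pin the calling thread to one logical CPU.
   Returns 0 on success, -1 on error (errno is set by sched_setaffinity). */
static int pin_to_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof set, &set);   /* pid 0 = calling thread */
}

/* Read the list of logical CPUs that share a physical core with `cpu`
   (for example "0,28" or "0-1") into `out`.
   Returns 0 on success, -1 if the sysfs file cannot be read. */
static int read_siblings(int cpu, char *out, size_t len)
{
    char path[128];
    snprintf(path, sizeof path,
             "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list", cpu);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char *ok = fgets(out, (int)len, f);
    fclose(f);
    return ok ? 0 : -1;
}
```

Program A would call `pin_to_cpu` with one of the logical CPU numbers reported by `read_siblings`, and program B would call it with the other, before either starts its work.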

The code is long, so the test is shown as pseudocode below.

A:

    Select 16 cache lines (line 0 - 15) in one set.  // L1 cache is VIPT
    Link lines 0 - 8 into a linked list in shuffled order to defeat prefetching.
    Link lines 9 - 15 into a linked list in shuffled order to defeat prefetching.
    load lines 0 - 8    // it is a linked list, so the loads are serialized
    fence()
    t1 = rdtscp()
    load lines 9 - 15   // it is a linked list, so the loads are serialized
    t2 = rdtscp()
    print t2 - t1

B : for(;;){
}
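The timed part of the pseudocode for A could look roughly like the sketch below. It is only an illustration under stated assumptions: x86 with GCC or Clang (for `__rdtscp`, `_mm_lfence` and `_mm_mfence` from `<x86intrin.h>`), and the function names `build_chain` and `time_chain` are hypothetical. Choosing lines that map to the same cache set, and the warm-up loads of lines 0 - 8, are left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <x86intrin.h>   /* __rdtscp, _mm_lfence, _mm_mfence (x86, GCC/Clang) */

/* Link the n cache lines in `lines` into a NULL-terminated chain in shuffled
   order, storing the "next" pointer in the first word of each line.
   The `lines` array itself is shuffled in place. Returns the chain head. */
static void *build_chain(void **lines, size_t n, unsigned seed)
{
    assert(lines != NULL && n > 0);
    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {          /* Fisher-Yates shuffle */
        size_t j = (size_t)rand() % (i + 1);
        void *tmp = lines[i];
        lines[i] = lines[j];
        lines[j] = tmp;
    }
    for (size_t i = 0; i + 1 < n; i++)
        *(void **)lines[i] = lines[i + 1];
    *(void **)lines[n - 1] = NULL;
    return lines[0];
}

/* Walk the chain once and return the elapsed time-stamp-counter ticks.
   Each load depends on the previous one, so the loads are serialized.
   The number of lines visited is stored in *count. */
static uint64_t time_chain(void *head, size_t *count)
{
    unsigned aux;
    void *p = head;
    size_t n = 0;

    _mm_mfence();                    /* let earlier loads and stores complete */
    _mm_lfence();
    uint64_t t1 = __rdtscp(&aux);
    _mm_lfence();                    /* keep the chain loads after the first read */
    while (p != NULL) {
        p = *(void *volatile *)p;    /* volatile: the compiler must do the load */
        n++;
    }
    uint64_t t2 = __rdtscp(&aux);    /* rdtscp waits for the earlier loads */
    _mm_lfence();

    *count = n;
    return t2 - t1;
}
```

A caller would build one chain from lines 0 - 8 and walk it untimed, then build a second chain from lines 9 - 15 and pass it to `time_chain`. Note that `rdtscp` reads the time-stamp counter, which on most recent x86 processors ticks at a constant rate regardless of the current core clock, so the result is in TSC ticks rather than core clock cycles.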

When only A is running, the output averages 200. When B and A run as hyperthreads on the same core, A's output is 90.

If Hyper-Threading made the time longer, I could understand it, since that could be interference. But why does the time get shorter?

My server configuration is as follows:

 Linux version 4.15.0-122-generic (buildd@lcy01-amd64-010) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)) #124~16.04.1-Ubuntu SMP


c cpu-cache hyperthreading microbenchmark