Lock-free queue: why read `Atomic*` twice?

Question

I am reading The Art of Multiprocessor Programming, 2nd ed., and I noticed that the following pattern is used to read several Atomic* fields:

while (true) {
    var v1 = atomic1.get();
    var v2 = atomic2.get();
    ...
    var vN = atomicN.get();
    if (v1 == atomic1.get()) {
        // do work
    }
}

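What is the purpose of this construct?

The only explanation I found in the book is:

... check that the values read are consistent ...

I don't understand this explanation.

Here is the LockFreeQueue from the book, which uses this pattern: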

import java.util.concurrent.atomic.AtomicReference;

public class LockFreeQueue<T> {

  AtomicReference<Node> head, tail;

  public LockFreeQueue() {
    Node node = new Node(null);   // sentinel node
    head = new AtomicReference<>(node);
    tail = new AtomicReference<>(node);
  }

  public void enq(T value) {
    Node node = new Node(value);
    while (true) {
      Node last = tail.get();
      Node next = last.next.get();
      if (last == tail.get()) {   // <=== WHY: reading tail.get() again
        if (next == null) {
          if (last.next.compareAndSet(next,node)) {
            tail.compareAndSet(last,node);
            return;
          }
        } else {
          tail.compareAndSet(last,next);
        }
      }
    }
  }

  public T deq() throws EmptyException {
    while (true) {
      Node first = head.get();
      Node last = tail.get();
      Node next = first.next.get();
      if (first == head.get()) {  // <=== WHY: reading head.get() again
        if (first == last) {
          if (next == null) {
            throw new EmptyException();
          }
          tail.compareAndSet(last,next);
        } else {
          T value = next.value;
          if (head.compareAndSet(first,next))
            return value;
        }
      }
    }
  }
}

// Node uses T, so it is presumably an inner class of LockFreeQueue<T>.
public class Node {

  public T value;
  public AtomicReference<Node> next;

  public Node(T value) {
    this.value = value;
    next = new AtomicReference<Node>(null);
  }
}

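I saw another similar question on SO: Lock-free queue algorithm, repeated reads for consistency.
But:

the accepted answer has a negative score and states that everything would work without the repeated reads, but offers no proof
it discusses a different algorithm: that algorithm frees nodes explicitly, while the book is mostly about algorithms in Java (where nodes are freed implicitly by the GC).

UPD: the book says that LockFreeQueue is a slightly simplified version of a queue algorithm by Maged Michael and Michael Scott.
This is the same algorithm as the one discussed in the similar SO question mentioned above.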

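Answer

I think the general idea is that writers update the fields in a given order, and that the value of the first field always changes on every "update". So if a reader sees on the second read that the first field has not changed, it knows that it has read a consistent set of values (a snapshot) of all the fields.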
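The re-check idea can be sketched in isolation. The class below is a minimal illustration, not code from the book: the names `SnapshotDemo`, `version`, `payload`, `update` and `readConsistent` are all made up for the example, and it assumes the first field (`version`) only ever increases, so it never returns to an earlier value. Under that assumption, seeing `version` unchanged on the second read means it held that value the whole time in between, so `payload` was read while that `version` was still current. Note that this only shows the two values coexisted at one instant, not that a writer's multi-field update was complete.

```java
import java.util.concurrent.atomic.AtomicLong;

class SnapshotDemo {
    // Hypothetical fields: 'version' is the "first field" and changes on every update.
    private final AtomicLong version = new AtomicLong(0);
    private final AtomicLong payload = new AtomicLong(0);

    // Writer: changes the first field on every update, then the second one.
    void update(long newPayload) {
        version.incrementAndGet();
        payload.set(newPayload);
    }

    // Reader: read both fields, then re-read the first one.
    // If it is unchanged, both values were current at the moment payload was read.
    long[] readConsistent() {
        while (true) {
            long v = version.get();
            long p = payload.get();
            if (v == version.get()) {
                return new long[] { v, p };
            }
            // version moved while we were reading: retry
        }
    }

    public static void main(String[] args) {
        SnapshotDemo d = new SnapshotDemo();
        d.update(10);
        long[] s = d.readConsistent();
        System.out.println("version=" + s[0] + ", payload=" + s[1]);
    }
}
```

If the first field could return to an earlier value (the ABA situation), the re-check would no longer prove anything, which is presumably why the recycling variant of the queue compares stamps rather than references.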

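Thanks to Peter Cordes and Stephen C for their answers.
I think I understand it now; below is my attempt to explain it in detail:

It turns out that LockFreeQueue is a simplified version of the queue algorithm by Maged Michael and Michael Scott.

In the original algorithm, the repeated read is indeed used to read a consistent set of values (a snapshot) of all the fields:

To obtain consistent values of various pointers we rely on sequences of reads that re-check earlier values to be sure they haven't changed. These sequences of reads are similar to, but simpler than, the snapshots of Prakash et al. (we need to check only one shared variable rather than two).

The simplified LockFreeQueue actually works correctly without the repeated read (at least that is my conclusion: all the safety properties mentioned in the paper still hold even when I remove the repeated read).
The repeated read may give better performance, though.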
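To make that claim concrete, here is a sketch of the simplified queue with both re-checks removed. It is my own variant for illustration, not the book's code: the class name `NoRecheckQueue` is hypothetical, `Node` is made a static nested generic class, and `EmptyException` is a stand-in unchecked exception. It relies on the garbage collector: a node is never reused, and a `next` link that has become non-null never goes back to null. The single-threaded run in `main` only shows FIFO behavior; it is not a proof of thread safety.

```java
import java.util.concurrent.atomic.AtomicReference;

class NoRecheckQueue<T> {
    // Stand-in for the book's EmptyException (unchecked here to keep the sketch short).
    static class EmptyException extends RuntimeException {}

    private static class Node<T> {
        final T value;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;

    NoRecheckQueue() {
        Node<T> sentinel = new Node<>(null);
        head = new AtomicReference<>(sentinel);
        tail = new AtomicReference<>(sentinel);
    }

    void enq(T value) {
        Node<T> node = new Node<>(value);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            // No "last == tail.get()" re-check: a stale 'last' already has a
            // non-null next, so we fall into the helping branch below.
            if (next == null) {
                if (last.next.compareAndSet(null, node)) {
                    tail.compareAndSet(last, node); // may fail if another thread helped
                    return;
                }
            } else {
                tail.compareAndSet(last, next);     // help a lagging tail
            }
        }
    }

    T deq() {
        while (true) {
            Node<T> first = head.get();
            Node<T> last = tail.get();
            Node<T> next = first.next.get();
            // No "first == head.get()" re-check: a stale 'first' makes the CAS on head fail.
            if (first == last) {
                if (next == null) throw new EmptyException();
                tail.compareAndSet(last, next);     // help a lagging tail
            } else {
                T value = next.value;
                if (head.compareAndSet(first, next)) return value;
            }
        }
    }

    public static void main(String[] args) {
        NoRecheckQueue<Integer> q = new NoRecheckQueue<>();
        q.enq(1); q.enq(2); q.enq(3);
        System.out.println(q.deq() + " " + q.deq() + " " + q.deq());
    }
}
```

The re-check in the book's version can still pay off as an early exit: it skips the work in the body, including a CAS that is bound to fail, when the first value is already known to be stale.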

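The original algorithm uses the repeated read for correctness (that is, safety) as well.
This is mainly because that algorithm reuses Node objects that have been removed from the queue.
A Java version of the full algorithm, LockFreeQueueRecycle<T>, is given later in the book (it uses AtomicStampedReference instead of AtomicReference):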

/* 1  */ public T deq() throws EmptyException {
/* 2  */     int[] lastStamp = new int[1];
/* 3  */     int[] firstStamp = new int[1];
/* 4  */     int[] nextStamp = new int[1];
/* 5  */     while (true) {
/* 6  */         Node first = head.get(firstStamp);
/* 7  */         Node last = tail.get(lastStamp);
/* 8  */         Node next = first.next.get(nextStamp);
/* 9  */         if (head.getStamp() == firstStamp[0]) {
/* 10 */             if (first == last) {
/* 11 */                 if (next == null) {
/* 12 */                     throw new EmptyException();
/* 13 */                 }
/* 14 */                 tail.compareAndSet(last,next,
/* 15 */                        lastStamp[0],lastStamp[0]+1);
/* 16 */             } else {
/* 17 */                 T value = next.value;
/* 18 */                 if (head.compareAndSet(first,next,firstStamp[0],firstStamp[0]+1)) {
/* 19 */                     free(first);
/* 20 */                     return value;
/* 21 */                 }
/* 22 */             }
/* 23 */         }
/* 24 */     }
/* 25 */ }

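Here free(first) at line 19 makes the Node object available for reuse.

The repeated read at line 9, head.getStamp() == firstStamp[0], allows us to read consistent values of head, tail and head.next.
The fact that head.getStamp() has not changed means that head has not changed ⇒ no nodes were removed from the queue ⇒ tail and head.next point to correct (not yet recycled) nodes.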
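A small sketch of why the check compares stamps rather than references once nodes are recycled: the reference can come back to the very same object while the stamp keeps moving forward. The class name `StampDemo` and the plain `Object` values standing in for nodes are assumptions made for the example; only the `AtomicStampedReference` calls are real API.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampDemo {
    // Returns { referenceLooksUnchanged, stampLooksUnchanged }.
    static boolean[] run() {
        Object node = new Object();   // stands in for a Node that gets recycled
        Object other = new Object();  // stands in for some other Node
        AtomicStampedReference<Object> head = new AtomicStampedReference<>(node, 0);

        // Reader takes its first look at head and remembers the stamp.
        int[] stamp = new int[1];
        Object first = head.get(stamp);

        // Meanwhile head moves away and later comes back to the same object,
        // as it could after 'node' has been freed and reused.
        head.compareAndSet(node, other, 0, 1);
        head.compareAndSet(other, node, 1, 2);

        boolean sameReference = head.getReference() == first; // true: ABA, looks unchanged
        boolean sameStamp = head.getStamp() == stamp[0];      // false: the stamp exposes the change
        return new boolean[] { sameReference, sameStamp };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("same reference: " + r[0] + ", same stamp: " + r[1]);
    }
}
```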
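Without the check at line 9, incorrect behavior is possible. Imagine that right after line 7:

we have first == last and first.next != null
the current thread is suspended by the OS
other threads execute deq() several times until the first node is recycled. During recycling, first.next is set to null.
the current thread is resumed by the OS
line 8: next = null, so we have read a wrong value from the reused first node
line 9: skipped
line 10: first == last is true
line 11: next == null is true
line 12: EmptyException is thrown incorrectly (incorrect if the queue was never actually empty during this call).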
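The interleaving above can be replayed deterministically in a single thread. The sketch below is a simplified model rather than the book's LockFreeQueueRecycle: the class name `StaleReadReplay` is hypothetical, the "other threads" are played by plain sequential statements, and free() is assumed to do nothing more than reset the recycled node's next link to null. It reports what the stalled thread would conclude with and without the line 9 check.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StaleReadReplay {
    // Minimal hypothetical node: only the 'next' link matters for this replay.
    static class Node {
        final AtomicStampedReference<Node> next = new AtomicStampedReference<>(null, 0);
    }

    // Returns { wouldThrowWithoutCheck, line9CheckPasses }.
    static boolean[] replay() {
        Node a = new Node(), b = new Node(), c = new Node();
        a.next.set(b, 1); // a is the sentinel, b holds one item
        AtomicStampedReference<Node> head = new AtomicStampedReference<>(a, 0);
        AtomicStampedReference<Node> tail = new AtomicStampedReference<>(a, 0); // lagging tail

        // Stalled thread, lines 6-7: first == last and first.next != null.
        int[] firstStamp = new int[1], lastStamp = new int[1], nextStamp = new int[1];
        Node first = head.get(firstStamp);
        Node last = tail.get(lastStamp);

        // Other threads run while the first one is suspended.
        tail.compareAndSet(a, b, 0, 1);          // help the lagging tail
        b.next.set(c, 1);                        // enq: link c after b
        tail.compareAndSet(b, c, 1, 2);          // enq: swing tail to c
        head.compareAndSet(a, b, 0, 1);          // deq: take b's item, a leaves the queue
        a.next.set(null, a.next.getStamp() + 1); // free(a): assumed to reset the link

        // Stalled thread resumes, line 8: reads the recycled node's link.
        Node next = first.next.get(nextStamp);

        // Lines 10-11 without the line 9 check: the queue looks empty, although c's item is in it.
        boolean wouldThrow = (first == last) && (next == null);
        // Line 9: the stamp of head has moved on, so the check fails and the loop retries.
        boolean checkPasses = head.getStamp() == firstStamp[0];
        return new boolean[] { wouldThrow, checkPasses };
    }

    public static void main(String[] args) {
        boolean[] r = replay();
        System.out.println("without check, EmptyException thrown: " + r[0]);
        System.out.println("line 9 check passes: " + r[1]);
    }
}
```

In this replay the queue always holds at least one item (first b's, then c's), so an EmptyException would be wrong. The failed stamp comparison is what sends the stalled thread back to the top of the loop.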

Another example of incorrect behavior is shown in this answer.

atomic java lock-free multithreading
