Why does '-XX:+UseLWPSynchronization' negatively affect StampedLock on Windows?

Problem description

I wanted to implement a simple rate limiter in Java to learn how to use JMH. I created a simple GitHub project at 'https://github.com/William1104/rate-limiter'.

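Interestingly, when the '-XX:+UseLWPSynchronization' option is enabled, the throughput of some implementations (those using StampedLock) is affected. The benchmark was executed on a Windows machine, and I expected the option to have no effect on non-Solaris systems. However, the test results suggest otherwise. Can anyone help me understand what is going on?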
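
As a quick sanity check on which flags a JVM actually received, the standard RuntimeMXBean can list its input arguments. The sketch below is an illustration rather than part of the benchmark project; the class name JvmFlagDump is hypothetical, and it only reports what was passed on the command line, not whether a flag has any effect on the current platform.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the rate-limiter project.
class JvmFlagDump {

    // Returns the -XX options that were passed to the running JVM.
    static List<String> xxFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(arg -> arg.startsWith("-XX:"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        xxFlags().forEach(System.out::println);
    }
}
```

Because JMH normally runs benchmarks in forked JVMs, a check like this is most informative when it is called from inside the benchmark process itself.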

Here are the test results on my machine for reference:

With options: -server, -XX:+UnlockDiagnosticVMOptions, -XX:+UseNUMA

Benchmark                         (rateLimiterType)                     Mode  Cnt      Score      Error  Units
RaterLimiterBenchmark.thread_1    StampLockLongArrayRateLimiter        thrpt   90  21487.385 ± 1082.163  ops/ms
RaterLimiterBenchmark.thread_1    StampLockInstantArrayRateLimiter     thrpt   90  13162.330 ± 1585.555  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedLongArrayRateLimiter     thrpt   90  15362.934 ±  227.704  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedInstantArrayRateLimiter  thrpt   90  17281.675 ± 2148.057  ops/ms
RaterLimiterBenchmark.thread_10   StampLockLongArrayRateLimiter        thrpt   90   6868.653 ±  146.372  ops/ms
RaterLimiterBenchmark.thread_10   StampLockInstantArrayRateLimiter     thrpt   90   8189.747 ±  335.517  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedLongArrayRateLimiter     thrpt   90   6643.004 ±  103.568  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedInstantArrayRateLimiter  thrpt   90   5252.975 ±  190.363  ops/ms
RaterLimiterBenchmark.thread_100  StampLockLongArrayRateLimiter        thrpt   90   7352.890 ± 2109.446  ops/ms
RaterLimiterBenchmark.thread_100  StampLockInstantArrayRateLimiter     thrpt   90   8675.814 ±  922.653  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedLongArrayRateLimiter     thrpt   90   6509.368 ±  157.212  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedInstantArrayRateLimiter  thrpt   90   5042.867 ±  192.971  ops/ms

With options: -server, -XX:+UseNUMA, -XX:+UseLWPSynchronization

Benchmark                         (rateLimiterType)                     Mode  Cnt      Score      Error  Units
RaterLimiterBenchmark.thread_1    StampLockLongArrayRateLimiter        thrpt   90  11383.198 ±  353.921  ops/ms
RaterLimiterBenchmark.thread_1    StampLockInstantArrayRateLimiter     thrpt   90  11666.918 ±  842.426  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedLongArrayRateLimiter     thrpt   90  15696.852 ±  371.078  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedInstantArrayRateLimiter  thrpt   90  15357.617 ±  650.846  ops/ms
RaterLimiterBenchmark.thread_10   StampLockLongArrayRateLimiter        thrpt   90   6937.050 ±  130.727  ops/ms
RaterLimiterBenchmark.thread_10   StampLockInstantArrayRateLimiter     thrpt   90   8268.909 ±  291.471  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedLongArrayRateLimiter     thrpt   90   9134.319 ± 1208.998  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedInstantArrayRateLimiter  thrpt   90   5294.341 ±  225.995  ops/ms
RaterLimiterBenchmark.thread_100  StampLockLongArrayRateLimiter        thrpt   90   8453.825 ± 1075.312  ops/ms
RaterLimiterBenchmark.thread_100  StampLockInstantArrayRateLimiter     thrpt   90  16297.921 ±  611.255  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedLongArrayRateLimiter     thrpt   90  12536.378 ±  974.951  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedInstantArrayRateLimiter  thrpt   90   9051.560 ± 1303.856  ops/ms

Here are the implementations of StampLockLongArrayRateLimiter and SynchronizedLongArrayRateLimiter:

package one.williamwong.ratelimiter;

import java.time.Duration;
import java.util.Arrays;
import java.util.concurrent.locks.StampedLock;

public class StampLockLongArrayRateLimiter implements IRateLimiter {

    private final long duration;   // window length in nanoseconds
    private final long[] records;  // ring buffer of acquire timestamps
    private final StampedLock lock;
    private int pointer;           // slot holding the oldest record

    public StampLockLongArrayRateLimiter(int maxInvokes, Duration duration) {
        this.duration = duration.toNanos();
        this.records = new long[maxInvokes];
        this.lock = new StampedLock();
        this.pointer = 0;
    }

    @Override
    public void acquire() {
        final long stamp = lock.writeLock();
        try {
            final long now = System.nanoTime();
            // A zero slot has not been written since construction or reset.
            if (records[pointer] != 0) {
                // Time elapsed since the oldest recorded invocation.
                final long awayFromHead = now - records[pointer];
                if (awayFromHead < duration) {
                    handleExcessLimit(records.length, Duration.ofNanos(awayFromHead));
                }
            }
            // Overwrite the oldest record and advance the ring buffer.
            records[pointer] = now;
            pointer = (pointer + 1) % records.length;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    @Override
    public void reset() {
        final long stamp = lock.writeLock();
        try {
            Arrays.fill(records, 0);
            this.pointer = 0;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

}

package one.williamwong.ratelimiter;

import java.time.Duration;
import java.util.Arrays;

public class SynchronizedLongArrayRateLimiter implements IRateLimiter {

    private final long duration;   // window length in nanoseconds
    private final long[] records;  // ring buffer of acquire timestamps
    private final Object lock;
    private int pointer;           // slot holding the oldest record

    public SynchronizedLongArrayRateLimiter(int maxInvokes, Duration duration) {
        this.duration = duration.toNanos();
        this.records = new long[maxInvokes];
        this.lock = new Object();
        this.pointer = 0;
    }

    @Override
    public void acquire() {
        synchronized (lock) {
            final long now = System.nanoTime();
            // A zero slot has not been written since construction or reset.
            if (records[pointer] != 0) {
                // Time elapsed since the oldest recorded invocation.
                final long awayFromHead = now - records[pointer];
                if (awayFromHead < duration) {
                    handleExcessLimit(records.length, Duration.ofNanos(awayFromHead));
                }
            }
            // Overwrite the oldest record and advance the ring buffer.
            records[pointer] = now;
            pointer = (pointer + 1) % records.length;
        }
    }

    @Override
    public void reset() {
        synchronized (lock) {
            Arrays.fill(records, 0);
            this.pointer = 0;
        }
    }

}
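
Both classes call handleExcessLimit, which is not shown above and presumably comes from IRateLimiter. The following is a minimal sketch of what such an interface could look like, assuming the helper simply throws when the limit is exceeded; the exception type and message are assumptions, not the project's actual code. A compact limiter (DemoRateLimiter, a hypothetical name) is included only to make the example self-contained.

```java
import java.time.Duration;
import java.util.Arrays;

// Assumed shape of the interface; the real one in the project may differ.
interface IRateLimiter {
    void acquire();

    void reset();

    // Assumption: exceeding the limit is reported by throwing.
    default void handleExcessLimit(int maxInvokes, Duration sinceOldest) {
        throw new IllegalStateException(
                maxInvokes + " invocations already happened within the last " + sinceOldest);
    }
}

// Hypothetical limiter using the same ring-buffer logic as the classes above.
class DemoRateLimiter implements IRateLimiter {
    private final long duration;
    private final long[] records;
    private int pointer;

    DemoRateLimiter(int maxInvokes, Duration duration) {
        this.duration = duration.toNanos();
        this.records = new long[maxInvokes];
    }

    @Override
    public synchronized void acquire() {
        final long now = System.nanoTime();
        if (records[pointer] != 0 && now - records[pointer] < duration) {
            handleExcessLimit(records.length, Duration.ofNanos(now - records[pointer]));
        }
        records[pointer] = now;
        pointer = (pointer + 1) % records.length;
    }

    @Override
    public synchronized void reset() {
        Arrays.fill(records, 0);
        pointer = 0;
    }
}

class RateLimiterDemo {
    public static void main(String[] args) {
        IRateLimiter limiter = new DemoRateLimiter(2, Duration.ofHours(1));
        limiter.acquire();
        limiter.acquire();
        try {
            limiter.acquire(); // third call inside the one-hour window
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this assumed behaviour, a rejected call leaves the ring buffer untouched, because the exception is thrown before the new timestamp is stored.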

Solution

Thanks for your comments. I reran my benchmark with different settings. Running JMH again with 1 second per iteration gives the following results:

With options: -server, -XX:+UnlockDiagnosticVMOptions, -XX:+UseNUMA, -XX:-UseLWPSynchronization

Benchmark                         (rateLimiterType)                     Mode  Cnt      Score      Error  Units
RaterLimiterBenchmark.thread_1    StampLockLongArrayRateLimiter        thrpt   90  23573.282 ±  364.739  ops/ms
RaterLimiterBenchmark.thread_1    StampLockInstantArrayRateLimiter     thrpt   90  23062.260 ± 1035.395  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedLongArrayRateLimiter     thrpt   90  34667.411 ±  246.003  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedInstantArrayRateLimiter  thrpt   90  36426.369 ± 1248.360  ops/ms
RaterLimiterBenchmark.thread_10   StampLockLongArrayRateLimiter        thrpt   90  13592.158 ±   76.319  ops/ms
RaterLimiterBenchmark.thread_10   StampLockInstantArrayRateLimiter     thrpt   90  14564.306 ±  474.613  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedLongArrayRateLimiter     thrpt   90  13524.610 ±  155.850  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedInstantArrayRateLimiter  thrpt   90  13080.967 ±  309.736  ops/ms
RaterLimiterBenchmark.thread_100  StampLockLongArrayRateLimiter        thrpt   90  13224.529 ±  459.035  ops/ms
RaterLimiterBenchmark.thread_100  StampLockInstantArrayRateLimiter     thrpt   90  13890.278 ±  456.182  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedLongArrayRateLimiter     thrpt   90  12672.925 ±  314.118  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedInstantArrayRateLimiter  thrpt   90  12245.120 ±  296.395  ops/ms

With options: -server, -XX:+UseLWPSynchronization

Benchmark                         (rateLimiterType)                     Mode  Cnt      Score      Error  Units
RaterLimiterBenchmark.thread_1    StampLockLongArrayRateLimiter        thrpt   90  24842.514 ±  372.521  ops/ms
RaterLimiterBenchmark.thread_1    StampLockInstantArrayRateLimiter     thrpt   90  24327.864 ±  322.659  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedLongArrayRateLimiter     thrpt   90  34490.411 ±  330.288  ops/ms
RaterLimiterBenchmark.thread_1    SynchronizedInstantArrayRateLimiter  thrpt   90  38383.257 ±  654.269  ops/ms
RaterLimiterBenchmark.thread_10   StampLockLongArrayRateLimiter        thrpt   90  13536.284 ±   74.613  ops/ms
RaterLimiterBenchmark.thread_10   StampLockInstantArrayRateLimiter     thrpt   90  13702.022 ±  289.616  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedLongArrayRateLimiter     thrpt   90  12530.107 ±  243.471  ops/ms
RaterLimiterBenchmark.thread_10   SynchronizedInstantArrayRateLimiter  thrpt   90  10795.833 ±  158.400  ops/ms
RaterLimiterBenchmark.thread_100  StampLockLongArrayRateLimiter        thrpt   90  13204.275 ±  200.937  ops/ms
RaterLimiterBenchmark.thread_100  StampLockInstantArrayRateLimiter     thrpt   90  11606.823 ±  224.213  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedLongArrayRateLimiter     thrpt   90  11504.124 ±  107.543  ops/ms
RaterLimiterBenchmark.thread_100  SynchronizedInstantArrayRateLimiter  thrpt   90  10732.451 ±  118.753  ops/ms

Whether or not 'UseLWPSynchronization' is enabled, I did not observe a large performance difference. The problem I ran into was caused by an unstable JMH setup.
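
To put a number on how unstable the first run was, the reported error can be compared with the score for the same row in both runs. The sketch below does this for the thread_100 / StampLockLongArrayRateLimiter row measured without the flag enabled, using figures copied from the tables above; the class and method names are made up for illustration.

```java
// Hypothetical helper for reading JMH result rows.
class RelativeError {

    // Error as a percentage of the score.
    static double percent(double score, double error) {
        return 100.0 * error / score;
    }

    public static void main(String[] args) {
        // thread_100, StampLockLongArrayRateLimiter, UseLWPSynchronization not enabled
        double firstRun = percent(7352.890, 2109.446);  // original settings
        double rerun = percent(13224.529, 459.035);     // 1 second per iteration
        System.out.printf("first run: %.1f%%, rerun: %.1f%%%n", firstRun, rerun);
    }
}
```

For this row the error is roughly 29% of the score in the first run and about 3.5% in the rerun, which is consistent with the first set of measurements being too noisy to compare.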
