Fisher-Yates algorithm is not producing unbiased results

Problem description

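The Fisher-Yates algorithm, as described on Wikipedia:

The algorithm produces an unbiased permutation: every permutation is equally likely.

I read some articles that explain how the naive shuffle and the Fisher-Yates shuffle produce biased and unbiased permutations, respectively, of the items in a set.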

Links to the articles

Fisher-Yates Shuffle – An Algorithm Every Developer Should Know

Randomness is hard: learning about the Fisher-Yates shuffle algorithm & random number generation

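The articles go on to show charts of the nearly unbiased and the heavily biased results of the two algorithms. I tried to reproduce those probabilities, but I can't seem to produce any difference.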

Here is my code

import java.util.*

class Problem {
    private val arr = intArrayOf(1,2,3)
    private val occurrences = mutableMapOf<String,Int>()
    private val rand = Random()

    fun biased() {
        for (i in 1..100000) {
            for (i in arr.indices) {
                val k = rand.nextInt(arr.size)
                val temp = arr[k]
                arr[k] = arr[i]
                arr[i] = temp
            }


            val combination = arr.toList().joinToString("")

            if (occurrences.containsKey(combination)) {
                occurrences[combination] = occurrences[combination]!! + 1
            } else {
                occurrences[combination] = 1
            }
        }

        print("Naive:\n")
        occurrences.forEach { (t,u) ->
            print("$t: $u\n")
        }
    }

    /**
    * Fisher yates algorithm - unbiased
    */
    fun unbiased() {
        for (i in 1..100000) {
            for (i in arr.size-1 downTo 0) {
                val j = rand.nextInt(i + 1)
                val temp = arr[i]
                arr[i] = arr[j]
                arr[j] = temp
            }

            val combination = arr.toList().joinToString("")

            if (occurrences.containsKey(combination)) {
                occurrences[combination] = occurrences[combination]!! + 1
            } else {
                occurrences[combination] = 1
            }
        }

        print("Fisher Yates:\n")
        occurrences.forEach { (t,u) ->
            print("$t: $u\n")
        }
    }
}

fun main() {
    Problem().biased()
    Problem().unbiased()
}

The results are as follows

Naive:
312: 16719
213: 16654
231: 16807
123: 16474
132: 16636
321: 16710
Fisher Yates:
123: 16695
312: 16568
213: 16923
321: 16627
132: 16766
231: 16421

My results are not very different between the two cases. My question is: is my implementation wrong, or is my understanding wrong?

Solution

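Your implementation of both algorithms has a bug that cancels out the bias introduced by the naive shuffle. You don't start each shuffle from the same permutation; instead, you start from the permutation produced by the previous shuffle. Because the starting point is then itself effectively random, the bias of each individual shuffle gets averaged out over the run. A simple fix is to reset the array to [1,2,3] before each shuffle: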

import java.util.*

class Problem {
    private var arr = intArrayOf(1,2,3)
    private val occurrences = mutableMapOf<String,Int>()
    private val rand = Random()

    fun biased() {
        for (i in 1..100000) {
            arr = intArrayOf(1,2,3)  // reset arr before each shuffle
            for (i in arr.indices) {
                val k = rand.nextInt(arr.size)
                val temp = arr[k]
                arr[k] = arr[i]
                arr[i] = temp
            }


            val combination = arr.toList().joinToString("")

            if (occurrences.containsKey(combination)) {
                occurrences[combination] = occurrences[combination]!! + 1
            } else {
                occurrences[combination] = 1
            }
        }

        print("Naive:\n")
        occurrences.forEach { (t,u) ->
            print("$t: $u\n")
        }
    }

    /**
    * Fisher yates algorithm - unbiased
    */
    fun unbiased() {
        for (i in 1..100000) {
            arr = intArrayOf(1,2,3)  // reset arr before each shuffle
            for (i in arr.size-1 downTo 0) {
                val j = rand.nextInt(i + 1)
                val temp = arr[i]
                arr[i] = arr[j]
                arr[j] = temp
            }

            val combination = arr.toList().joinToString("")

            if (occurrences.containsKey(combination)) {
                occurrences[combination] = occurrences[combination]!! + 1
            } else {
                occurrences[combination] = 1
            }
        }

        print("Fisher Yates:\n")
        occurrences.forEach { (t,u) ->
            print("$t: $u\n")
        }
    }
}

fun main() {
    Problem().biased()
    Problem().unbiased()
}

Output:

Naive:
213: 18516
132: 18736
312: 14772
321: 14587
123: 14807
231: 18582
Fisher Yates:
321: 16593
213: 16552
231: 16674
132: 16486
123: 16802
312: 16893
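
The naive counts above cluster around two values, roughly 18,500 and 14,800. One way to see where these come from is to enumerate every possible run of the naive shuffle on [1,2,3] instead of sampling: each of the 3 steps picks one of 3 indices, so there are 3^3 = 27 equally likely runs, and 27 cannot be split evenly across 6 permutations. The sketch below is only an illustration; the function name naiveShuffleCounts is made up for this example and is not part of the original code.

```kotlin
// Enumerate every possible run of the naive shuffle, starting from 1..n each time.
// Each of the n steps picks an index k in 0 until n, giving n^n equally likely runs.
fun naiveShuffleCounts(n: Int = 3): Map<String, Int> {
    val counts = sortedMapOf<String, Int>()
    var total = 1
    repeat(n) { total *= n }                  // n^n sequences of choices
    for (code in 0 until total) {
        val arr = IntArray(n) { it + 1 }      // fresh [1, 2, ..., n] for every run
        var rest = code
        for (i in arr.indices) {
            val k = rest % n                  // decode the i-th choice from `code`
            rest /= n
            val temp = arr[k]
            arr[k] = arr[i]
            arr[i] = temp
        }
        val key = arr.joinToString("")
        counts[key] = (counts[key] ?: 0) + 1
    }
    return counts
}

fun main() {
    naiveShuffleCounts().forEach { (perm, count) -> println("$perm: $count/27") }
    // 123: 4/27
    // 132: 5/27
    // 213: 5/27
    // 231: 5/27
    // 312: 4/27
    // 321: 4/27
}
```

With 100,000 trials, 5/27 corresponds to about 18,519 and 4/27 to about 14,815, which agrees with the naive output above. Fisher-Yates instead has 3 * 2 * 1 = 6 equally likely runs, one per permutation.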

I'm not a Kotlin programmer, so there may be a more elegant way to do this, but I think it's good enough.
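
For what it's worth, here is one possible tidier arrangement, offered as a sketch rather than as the canonical way to write it. It allocates a fresh array inside each trial, so state cannot leak from one shuffle to the next, and it counts outcomes with the standard library's groupingBy and eachCount. The helper names naiveShuffle, fisherYates and tally are invented for this example.

```kotlin
import kotlin.random.Random

// Naive shuffle: every position is swapped with an index drawn from the whole array.
fun naiveShuffle(arr: IntArray, rand: Random) {
    for (i in arr.indices) {
        val k = rand.nextInt(arr.size)
        val temp = arr[k]
        arr[k] = arr[i]
        arr[i] = temp
    }
}

// Fisher-Yates: position i is swapped with an index drawn from 0..i only.
fun fisherYates(arr: IntArray, rand: Random) {
    for (i in arr.lastIndex downTo 1) {
        val j = rand.nextInt(i + 1)
        val temp = arr[i]
        arr[i] = arr[j]
        arr[j] = temp
    }
}

// Runs `shuffle` on a fresh [1, 2, 3] for each trial and counts the outcomes.
fun tally(trials: Int, shuffle: (IntArray) -> Unit): Map<String, Int> =
    List(trials) {
        val arr = intArrayOf(1, 2, 3)   // new array per trial, so no carry-over
        shuffle(arr)
        arr.joinToString("")
    }.groupingBy { it }.eachCount()

fun main() {
    val rand = Random.Default
    println("Naive:")
    tally(100_000) { naiveShuffle(it, rand) }.forEach { (p, n) -> println("$p: $n") }
    println("Fisher Yates:")
    tally(100_000) { fisherYates(it, rand) }.forEach { (p, n) -> println("$p: $n") }
}
```

Passing the shuffle as a function keeps the counting logic in one place, so both algorithms are measured in exactly the same way.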