Problem description
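I was experimenting with the $RANDOM variable in a Unix shell and noticed something odd. I ran the following command, which reads $RANDOM 100,000 times in a loop and pipes the output to uniq to find duplicates.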
$ for i in {1..100000}; do echo $RANDOM; done | uniq -d
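I ran the command more than 7 times, and the same two numbers (4455 and 4117) were reported as duplicates all 7 times. The screenshot below shows the command-line output.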
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
kali@kali:~% for i in {1..100000}; do echo $RANDOM; done | uniq -d
4455
4117
See: https://i.stack.imgur.com/5bpEe.png
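I also opened another terminal window and repeated the process. In the second terminal the numbers were different, but they repeated in the same way. This made me wonder about the entropy of the $RANDOM variable and how it is seeded.
My guess is that it is reseeded whenever bash is invoked, but I would like to know why the same values repeat when I rerun the command in a single terminal window.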
Solution
Pseudo-random number generators are not perfect. The bash sources use a Lehmer random number generator with the "standard" constants:
x(n+1) = 16807 * x(n) mod (2**31 - 1)
In addition, bash limits the output to 15 bits only:
# define BASH_RAND_MAX 32767
...
return ((unsigned int)(rseed & BASH_RAND_MAX));
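The recurrence and the 15-bit mask can be tried out directly in the shell. The following is only a minimal sketch of the formula quoted above, not bash's actual source. The function names `lehmer_next` and `lehmer_output` are made up for illustration, and the sketch assumes the shell's arithmetic is at least 64 bits wide so that the product does not overflow.

```bash
# Hypothetical helper: one step of x(n+1) = 16807 * x(n) mod (2**31 - 1).
lehmer_next() {
    echo $(( (16807 * $1) % 2147483647 ))
}

# Hypothetical helper: keep only the low 15 bits, like "rseed & BASH_RAND_MAX".
lehmer_output() {
    echo $(( $1 & 32767 ))
}

# Walk three steps, starting from an arbitrary state of 1.
state=1
for step in 1 2 3; do
    state=$(lehmer_next "$state")
    echo "state=$state output=$(lehmer_output "$state")"
done
```

Because each output drops the upper 16 bits of the state, two different states can print the same number, and a given number can appear twice in a row.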
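With the seed your shell happens to have, it just so happens that somewhere in the 100000 consecutive random numbers it prints, 4455 and 4117 each come out twice in a row. There is really nothing strange about it. You could work out the seed that yields the same number twice in a row, knowing: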
# We know that lower 15 bits of previous number are equal to 4455
x(n) mod 32768 = 4455
# We know that lower 15 bits of next number are equal to 4455
x(n+1) mod 32768 = 4455
# We know the relation between next and previous number
x(n+1) = 16807 * x(n) mod (2**31 - 1)
# You could find x(n)
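The search hinted at in the last comment can be brute-forced, since only the upper 16 bits of x(n) are unknown. This is a sketch under the assumption that the generator is exactly the recurrence above with the low 15 bits as output. `find_states` is a hypothetical helper name, and whether any state matches a given pair depends on the numbers.

```bash
# Hypothetical helper: print every internal state x whose low 15 bits equal $1
# and whose successor under the Lehmer step has low 15 bits equal to $2.
find_states() {
    first=$1
    second=$2
    k=0
    while [ "$k" -lt 65536 ]; do
        # Candidate state: known low 15 bits plus a guess for the upper bits.
        x=$(( first + 32768 * k ))
        next=$(( (16807 * x) % 2147483647 ))
        if [ $(( next & 32767 )) -eq "$second" ]; then
            echo "$x"
        fi
        k=$(( k + 1 ))
    done
}

# States that would print 4455 twice in a row (may print nothing).
find_states 4455 4455
```

On average only a couple of the 65536 candidates survive the second condition, which is why one specific repeated pair pins the internal state down so tightly.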
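Why do the same $RANDOM numbers repeat?
Because the pseudo-random number generation method used in the bash sources, together with the seed currently in your shell, happens to produce the same number twice in a row.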
This is due to a zsh bug/"behavior" of RANDOM in subshells. The bug does not appear in bash.
echo $RANDOM # changes at every run
echo `echo $RANDOM` # always returns the same value until you run the first line again
This is because RANDOM is seeded by its last value, but a value drawn in a subshell does not update the state in the main shell.
From man zshparam:
RANDOM <S>
A pseudo-random integer from 0 to 32767, newly generated each
time this parameter is referenced. The random number generator
can be seeded by assigning a numeric value to RANDOM.
The values of RANDOM form an intentionally-repeatable
pseudo-random sequence; subshells that reference RANDOM will
result in identical pseudo-random values unless the value of
RANDOM is referenced or seeded in the parent shell in between
subshell invocations.
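The "intentionally-repeatable" part of that description is easy to check: assigning the same seed twice should give the same values twice. A minimal sketch, assuming a shell that provides RANDOM (bash or zsh), with 42 as an arbitrary seed and `first_run`/`second_run` as illustrative variable names:

```bash
# Seed the generator, draw three values, then reseed and draw again.
RANDOM=42
first_run="$RANDOM $RANDOM $RANDOM"
RANDOM=42
second_run="$RANDOM $RANDOM $RANDOM"

echo "first:  $first_run"
echo "second: $second_run"

if [ "$first_run" = "$second_run" ]; then
    echo "same seed, same sequence"
fi
```

A subshell starts from a copy of the parent's generator state, so it behaves like the second run here for as long as the parent's state stays untouched.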
It gets even crazier, because piping into uniq makes the loop run in a subshell:
for i in {1..10}; do echo $RANDOM; done # changes at every run
for i in {1..10}; do echo $RANDOM; done | uniq # always the same 10 numbers
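One way to sidestep this, sketched here under the assumption that the repetition comes only from reading RANDOM inside a subshell, is to draw the numbers in the current shell first and pipe the collected text afterwards. The variable name `values` is illustrative.

```bash
# Collect the random numbers in the current shell, one per line.
values=""
for i in 1 2 3 4 5 6 7 8 9 10; do
    values="${values}${RANDOM}
"
done

# Only the already-generated text goes through the pipe, so the generator
# state in the current shell has advanced by the time of the next run.
printf '%s' "$values" | uniq -d
```

Redirecting the loop to a temporary file and running uniq on that file afterwards should have the same effect, since a redirected loop does not need a subshell.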