Is this attempt to implement real randomness valid?

Question

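Pseudo-randomness becomes real randomness when a series of generated values lacks any actual pattern; in other words, the sequence of random elements would have to be potentially infinite before it repeats itself.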

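I know that the way random.py seed()s is designed to get as far as possible from the "pseudo" character (by using the current timestamp, machine parameters, etc.), which is fine for the majority of cases. But what if one needs to mathematically ensure zero predictability?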
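As a minimal sketch of what the "pseudo" character means in practice: two generators given the same seed produce exactly the same sequence. The seed value 12345 is arbitrary and chosen only for illustration.

```python
import random

# Two independent generators seeded with the same (arbitrary) value.
gen_a = random.Random(12345)
gen_b = random.Random(12345)

seq_a = [gen_a.random() for _ in range(5)]
seq_b = [gen_b.random() for _ in range(5)]

# The sequences match: the output is a function of the seed alone,
# so all of the unpredictability has to come from the seed.
print(seq_a == seq_b)  # True
```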

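I have read that real randomness can be achieved when we seed() based on specific physical events such as radioactive decay. But what if, for example, I used an array derived from a recorded audio stream?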

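The following is an example of how I am overriding the default random.seed() behavior for this purpose. I am using the sounddevice library, which implements bindings to the services responsible for managing I/O sound devices.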

# original random imports here
# ...

from sounddevice import rec

__all__ = ["module level functions here"]

# original random constants here
# ...

# sounddevice related constants
# ----------------------------------------------------------------------
# FS: Sampling frequency in Hz (samples per second);
# DURATION: Duration of the recorded audio stream (seconds);
# *Note: increasing the duration will result in a slower generator, since
# the seed method must wait for the entire stream to be recorded
# before processing further.
# CHANNELS: Number of audio channels used by the recording function (rec);
# DTYPE: Data type of the np.ndarray returned by rec;
# *Note: dtype can also be a np.dtype object, e.g. np.dtype("float64").

FS = 48000 
DURATION = 0.1
CHANNELS = 2 
DTYPE = 'float64'


# ----------------------------------------------------------------------
# The class implements a custom random generator with a seed obtained
# through the default audio input device.
# It's a subclass of random.Random that overrides only the seed method;
# it records an audio stream with the default parameters and returns the
# content in a newly created np.ndarray.
# Then the array's elements are added together and some transformations
# are performed on the sum, in order to obtain a less uniform float.
# This operation causes the randomness to concern the decimal part in
# particular, which is subject to high fluctuation, even when the noise
# of the surrounding environment is homogeneous over time.
# *Note: the blocking parameter suspends the execution until the entire
# stream is recorded, otherwise the np array will be partially empty.
# *Note: when the seed argument is specified and different than None,
# SDRandom will behave exactly like its superclass.

class SDRandom(Random):

    def seed(self, a=None, version=2):
        if a is None:
            stream = rec(
                frames=round(FS * DURATION),
                samplerate=FS,
                channels=CHANNELS,
                dtype=DTYPE,
                blocking=True,
            )

            # Sum and standard deviation of the flattened ndarray.
            sum_, std_ = stream.sum(), stream.std()

            # round() determines the result's sign.
            b = sum_ - round(sum_)

            # Collect one exponent term per digit of the std: +1 if odd, -1 if even.
            # Non-digit characters (e.g. "e" or "-" in scientific notation) are
            # skipped so that int() cannot fail on them.
            e = [1 if int(c) % 2 else -1
                 for c in str(std_).strip("0.") if c.isdigit()]

            a = b * 10 ** sum(e)

        # Forward version too, so an explicit seed behaves like the superclass.
        super().seed(a, version)


# ----------------------------------------------------------------------
# Create one instance, seeded from an audio stream, and export its
# methods as module-level functions.
# The functions share state across all uses.

_inst = SDRandom()
# binding class methods to module level functions here
# ...

## ------------------------------------------------------
## ------------------ fork support  ---------------------

if hasattr(_os,"fork"):
    _os.register_at_fork(after_in_child=_inst.seed)


if __name__ == '__main__':
    _test()  # See random._test() definition.

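According to theory, my implementation still does not achieve real randomness. How is that possible? How can audio input be deterministic in any way, even when considering the following?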

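This operation causes the randomness to concern the decimal part in particular, which is subject to high fluctuation, even when the noise of the surrounding environment is homogeneous over time.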

Answer

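You are better off just using the secrets module for "real" randomness. It gives you data from your kernel's CSPRNG, which should be constantly gathering and mixing in new entropy, in a way designed to make life very difficult for any attacker.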
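A short sketch of what using secrets looks like in practice. The sizes and the list of colours are arbitrary illustration values; the functions themselves are part of the standard library.

```python
import secrets

key = secrets.token_bytes(32)    # 32 bytes from the OS CSPRNG
token = secrets.token_hex(16)    # 16 random bytes as 32 hex characters
roll = secrets.randbelow(6) + 1  # uniform integer in the range 1..6
pick = secrets.choice(["red", "green", "blue"])

print(len(key), token, roll, pick)
```

There is no seed to manage here: every call pulls fresh data from the operating system.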

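Your use of "infinite" is not appropriate either: you cannot run something for "infinitely long", since the heat death of the universe will happen long before then.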

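Using the standard Mersenne Twister (as Python's random module does) also seems inappropriate, because an attacker can recover the internal state after observing just 624 consecutive 32-bit outputs, no matter how the generator was seeded. Using a CSPRNG makes this much harder, and constantly mixing in new state, as your kernel probably does, hardens it further.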
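To make the 624-output claim concrete, here is a sketch of the well-known state-recovery technique against CPython's random.Random. It assumes the attacker sees raw getrandbits(32) outputs; the helper names (untemper, clone_from_outputs) are made up for this illustration, while the shift and mask constants are the standard MT19937 tempering parameters.

```python
import random


def _undo_right_shift_xor(y, shift):
    # Invert y = x ^ (x >> shift) by recovering `shift` more bits per pass.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def _undo_left_shift_xor(y, shift, mask):
    # Invert y = x ^ ((x << shift) & mask) in the same iterative way.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def untemper(y):
    # Reverse the four MT19937 tempering steps, last step first.
    y = _undo_right_shift_xor(y, 18)
    y = _undo_left_shift_xor(y, 15, 0xEFC60000)
    y = _undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = _undo_right_shift_xor(y, 11)
    return y


def clone_from_outputs(outputs):
    # Rebuild a generator from 624 consecutive 32-bit outputs.
    if len(outputs) != 624:
        raise ValueError("exactly 624 outputs are required")
    state = tuple(untemper(o) for o in outputs)
    clone = random.Random()
    # Index 624 tells the generator to regenerate its state on the next draw.
    clone.setstate((3, state + (624,), None))
    return clone


victim = random.Random()  # seeded from OS entropy, unknown to the "attacker"
observed = [victim.getrandbits(32) for _ in range(624)]

clone = clone_from_outputs(observed)
predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [victim.getrandbits(32) for _ in range(10)]
print(predicted == actual)
```

The quality of the seed plays no part in this: once the outputs are visible, the clone follows the victim exactly, which is why a better seed() does not fix the problem.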

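Finally, treating the samples as floats and then taking their sum and standard deviation does not seem appropriate. You would be better off keeping them as ints and just passing them through a cryptographic hash. For example: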

import hashlib
import random

import sounddevice as sd

samples = sd.rec(
    frames=1024, samplerate=48000, channels=2, dtype='int32', blocking=True,
)

rv = int.from_bytes(hashlib.sha256(samples).digest(),'little')
print(rv)

random.seed(rv)
print(random.random())
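The same idea can be tried without audio hardware. In the sketch below the sample values are synthetic stand-ins for a recording, and seed_from_samples is a hypothetical helper name, not part of any library. It illustrates that every bit of every sample feeds into the seed, whereas a float sum discards most of that detail.

```python
import hashlib
import random
from array import array


def seed_from_samples(samples):
    """Hash raw integer samples into a 256-bit integer seed."""
    digest = hashlib.sha256(samples.tobytes()).digest()
    return int.from_bytes(digest, "little")


# Synthetic stand-in for recorded audio (assumed values, not real data).
recorded = array("i", [12, -7, 3, 0, 5, -1, 9, -4])

# The same "recording" with the lowest bit of one sample flipped.
nearby = array("i", recorded)
nearby[0] ^= 1

seed_a = seed_from_samples(recorded)
seed_b = seed_from_samples(nearby)
print(seed_a != seed_b)  # a one-bit change gives an unrelated seed

rng = random.Random(seed_a)
print(rng.random())
```

Note that hashing only preserves the entropy already present in the samples; it does not create any, and the generator seeded this way is still a Mersenne Twister.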

But then again, please just use secrets; it is a much better option.

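Note: recent versions of the Linux, Windows, OSX, FreeBSD and OpenBSD kernels all work as I described above. They make a good attempt at gathering entropy and mix it into a CSPRNG in a reasonable way; for example, see Fortuna.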
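If the goal is to keep the random.Random interface, as SDRandom does, the standard library's random.SystemRandom may be a closer fit than a custom subclass: it draws from os.urandom, which reads the kernel CSPRNG. A minimal sketch, with arbitrary sizes and ranges:

```python
import os
import random

raw = os.urandom(16)             # 16 bytes straight from the kernel CSPRNG

sys_rng = random.SystemRandom()  # Random-compatible API backed by os.urandom
value = sys_rng.random()         # float in [0.0, 1.0)
die = sys_rng.randint(1, 6)      # integer in 1..6

sys_rng.seed(12345)              # accepted but ignored: there is no state to seed

print(raw.hex(), value, die)
```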

python python-sounddevice random random-seed
