How do I fill an array in Python according to a distribution for each element?

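Problem description

Suppose I have 3 boxes and 3 animals, and I want to create an array of boxes, each holding 1 animal, drawn according to each box's own distribution: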

animals = ["Cat","Dog","Bunny"]
Boxes = []

The probabilities are given by

       "Cat"   "Dog"   "Bunny"
Box 1   0.3     0.4     0.3    
Box 2   0.2     0.3     0.5
Box 3   0.5     0.3     0.2

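How would I fill the boxes array so that the first element equals "Cat" with probability 0.3, "Dog" with probability 0.4, and "Bunny" with probability 0.3, while the second element equals "Cat" with probability 0.2, "Dog" with probability 0.3, and so on?

Additionally, suppose the first element/box turns out to be "Cat". Looking at the second and third boxes: since the first box is already filled with the cat, its contents can no longer change, so no other animal can go there with probability > 0. Nor can the second box contain a cat with probability greater than 0, because the cat is already in box 1.

Is there a sensible way to account for this by rescaling the remaining rows/columns so that they sum to 1 while keeping their ratios the same? For example, if box 1 is the cat, we would get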

       "Cat"   "Dog"   "Bunny"
Box 1   1       0       0    
Box 2   0       0.375   0.625
Box 3   0       0.6     0.4
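
The rescaling described above can be sketched with NumPy as follows. This is only an illustration, assuming the probabilities are stored in a 2-D array with one row per box and one column per animal; the helper name `condition_on` is hypothetical and not part of any library.

```python
import numpy as np


def condition_on(prob, box, animal):
    """Return a copy of prob after fixing `animal` (column) in `box` (row).

    Hypothetical helper: the chosen row and column are zeroed, the chosen
    cell is set to 1, and every other row is rescaled to sum to 1 so that
    its remaining entries keep their original ratios.
    """
    out = np.array(prob, dtype=float)  # work on a copy
    out[box, :] = 0.0
    out[:, animal] = 0.0
    out[box, animal] = 1.0
    row_sums = out.sum(axis=1, keepdims=True)
    # Rows with no remaining probability mass are left as all zeros.
    np.divide(out, row_sums, out=out, where=row_sums > 0)
    return out


prob = np.array([
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.3, 0.2],
])

# Box 1 (row 0) is known to hold the cat (column 0).
conditioned = condition_on(prob, box=0, animal=0)
print(conditioned)
```

Each remaining row is divided by its own sum, so the ratio between "Dog" and "Bunny" within a box is unchanged.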

Solution

You can use random.choices, which performs weighted selection automatically:

import random

boxes = []

animals = ["Cat", "Dog", "Bunny"]
box1 = [0.3, 0.4, 0.3]
box2 = [0.2, 0.3, 0.5]
# box3 = [0.5, 0.3, 0.2] is commented out because it can be ignored:
# the last box simply receives whichever animal is left

# Choose the first item to go in box1
# (random.choices returns a list, so take its single element)
boxes.append(random.choices(animals, weights=box1, k=1)[0])
chosen_ind = animals.index(boxes[0])

# Remove the chosen item from animals and box2
animals.pop(chosen_ind)
box2.pop(chosen_ind)

# Choose the second item; the weights do not need to sum to 1
boxes.append(random.choices(animals, weights=box2)[0])
chosen_ind = animals.index(boxes[1])

# Remove the chosen item from animals, append the only remaining item
animals.pop(chosen_ind)
boxes.append(animals[0])

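I am well aware that this is not a particularly clean or scalable way to solve the problem, but it gets the job done in this case.

Edit: here is a new version that uses a numpy array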

import random

import numpy as np

boxes = []

# n animals to choose from
animals = ['Cat', 'Dog', 'Bunny']   # as many items as needed

# n x n matrix of probabilities: prob[i][j] = P(box i+1 holds animals[j])
# (values taken from the question; use as many rows and columns as needed)
prob = np.array([
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.3, 0.2],
])

for box_ind, box in enumerate(prob):
    # random.choices returns a list, so take its single element
    boxes.append(random.choices(animals, weights=box)[0])
    col_ind = animals.index(boxes[box_ind])

    # This line sets the probability of a chosen item to 0 for future iterations
    prob[:, col_ind] = 0
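
To make the loop above reusable and easier to check, it could be wrapped in a function along the following lines. This is a rough sketch rather than a definitive implementation: the function name `fill_boxes` and the `rng` parameter are assumptions introduced here, and the probabilities are the ones from the question. The check at the end only confirms that every animal is placed exactly once and prints how often each animal lands in box 1.

```python
import random
from collections import Counter

import numpy as np


def fill_boxes(animals, prob, rng=random):
    """Assign one animal per box, sampling box by box without replacement.

    Hypothetical helper: `prob` has one row per box and one column per
    animal; `rng` is anything with a `choices` method (the random module
    or a random.Random instance).
    """
    weights = np.array(prob, dtype=float)  # copy, so the caller's data is untouched
    boxes = []
    for row in weights:  # each row is a view, so earlier zeroing is visible here
        choice = rng.choices(animals, weights=row.tolist())[0]
        boxes.append(choice)
        # The chosen animal can no longer appear in any later box.
        weights[:, animals.index(choice)] = 0.0
    return boxes


animals = ["Cat", "Dog", "Bunny"]
prob = [
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.3, 0.2],
]

rng = random.Random(0)  # seeded only to make the check repeatable
trials = 20_000
first_box = Counter()
for _ in range(trials):
    boxes = fill_boxes(animals, prob, rng)
    assert sorted(boxes) == sorted(animals)  # each animal used exactly once
    first_box[boxes[0]] += 1

for animal in animals:
    print(animal, first_box[animal] / trials)
```

Because random.choices only uses relative weights, the remaining entries of each row do not have to be rescaled to sum to 1 before sampling. One caveat: if every remaining weight in a row is zero, random.choices raises a ValueError on Python 3.9 and later, so a matrix containing zeros may need extra handling.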
