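Problem description
Suppose I have 3 boxes and 3 animals, and I want to create an array of boxes, each holding exactly 1 animal, drawn according to each box's own distribution: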
animals = ["Cat","Dog","Bunny"]
Boxes = []
The probabilities are given by
        "Cat"  "Dog"  "Bunny"
Box 1   0.3    0.4    0.3
Box 2   0.2    0.3    0.5
Box 3   0.5    0.3    0.2
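How would I populate the Boxes array so that the first element is "Cat" with probability 0.3, "Dog" with probability 0.4, and "Bunny" with probability 0.3, while the second element is "Cat" with probability 0.2, "Dog" with probability 0.3, and so on?
Furthermore, suppose the first element/box turns out to be "Cat". When we look at the second and third boxes, box 1 can no longer change with any probability > 0, because it is already filled with the cat. Nor can the second box contain another cat with probability > 0, because the cat is already in box 1.
Can this be handled in a principled way by rescaling the remaining rows/columns so that they sum to 1 while keeping their proportions the same? For example, if box 1 is the cat, we would get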
        "Cat"  "Dog"  "Bunny"
Box 1   1      0      0
Box 2   0      0.4    0.6
Box 3   0      0.6    0.4
Solution
You can use random.choices. It performs the weighted selection for you:
import random

boxes = []
animals = ["Cat","Dog","Bunny"]
box1 = [0.3,0.4,0.3]
box2 = [0.2,0.3,0.5]
# box3 = [0.5,0.3,0.2] is commented out because it can be ignored:
# the last box simply receives whichever animal is left
# Choose the first item to go in box1
# (random.choices returns a list, so take its single element)
boxes.append(random.choices(animals,k = 1,weights = box1)[0])
chosen_ind = animals.index(boxes[0])
# Remove the chosen item from animals and box2
animals.pop(chosen_ind)
box2.pop(chosen_ind)
# Choose the second item
boxes.append(random.choices(animals,weights = box2)[0])
chosen_ind = animals.index(boxes[1])
# Remove the chosen item from animals, append the only remaining item
animals.pop(chosen_ind)
boxes.append(animals[0])
I am well aware that this is not a particularly clean or scalable way to solve the problem, but it gets the job done in this case.
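The pop steps above amount to the rescaling asked about in the question, because random.choices treats its weights as relative and they do not need to sum to 1. To make the rescaled matrix explicit, here is a minimal sketch; the helper name condition_on is made up for illustration and is not part of any library. Note that exact rescaling of box 2's row gives 0.375 and 0.625, which the question's table appears to round to 0.4 and 0.6.

```python
def condition_on(prob_rows, box_ind, animal_ind):
    """Return new rows after fixing animal `animal_ind` in box `box_ind`.

    Hypothetical helper: the fixed box becomes a 0/1 row, and every other
    row drops the taken animal and is rescaled to sum to 1.
    """
    out = []
    for r, row in enumerate(prob_rows):
        if r == box_ind:
            # The decided box holds the chosen animal with certainty
            new_row = [1.0 if c == animal_ind else 0.0 for c in range(len(row))]
        else:
            # Zero the taken animal, then renormalize what is left
            kept = [0.0 if c == animal_ind else p for c, p in enumerate(row)]
            total = sum(kept)
            new_row = [p / total for p in kept]
        out.append(new_row)
    return out


prob_rows = [
    [0.3, 0.4, 0.3],  # Box 1
    [0.2, 0.3, 0.5],  # Box 2
    [0.5, 0.3, 0.2],  # Box 3
]

# Box 1 (index 0) turned out to be "Cat" (index 0)
conditioned = condition_on(prob_rows, 0, 0)
for row in conditioned:
    print([round(p, 3) for p in row])
```

The ratios within each remaining row are unchanged (0.3 : 0.5 for box 2, 0.3 : 0.2 for box 3), which is why simply removing or zeroing the chosen entry before calling random.choices gives the same distribution.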
Edit: a new version using numpy arrays looks like this:
import random
import numpy as np
boxes = []
# n animals to choose from
animals = ['cat','dog','bunny' ... ] # as many items as needed
# n x n matrix of probabilities (the entries below are placeholders)
prob = np.array([
[prob(box1,cat),prob(box1,dog),...],
[prob(box2,cat),prob(box2,dog),...],
...
])
for box_ind,box in enumerate(prob):
    # random.choices returns a list, so take its single element
    boxes.append(random.choices(animals,weights = box)[0])
    col_ind = animals.index(boxes[box_ind])
    # This line sets the probability of a chosen item to 0 for future iterations
    prob[:,col_ind] = 0
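For a version that can be run as-is, here is a sketch of the same column-zeroing idea applied to the 3 x 3 example from the question. It is written with plain lists so it has no dependency beyond the standard library; the function name fill_boxes and the rng parameter are assumptions made for this example, not something from the answer above.

```python
import random


def fill_boxes(animals, prob, rng=random):
    """Assign one distinct animal per box, row by row.

    `prob[i][j]` is the weight of animal j for box i. Once an animal is
    chosen, its column is zeroed so later boxes cannot pick it again.
    `rng` may be the random module or a random.Random instance.
    """
    weights = [list(row) for row in prob]  # work on a copy, leave the input intact
    boxes = []
    for row in weights:
        # Weights are relative, so the partly zeroed row needs no renormalizing
        chosen_ind = rng.choices(range(len(animals)), weights=row, k=1)[0]
        boxes.append(animals[chosen_ind])
        # Zero the chosen animal's column for every row still to come
        for later_row in weights:
            later_row[chosen_ind] = 0.0
    return boxes


animals = ["Cat", "Dog", "Bunny"]
prob = [
    [0.3, 0.4, 0.3],  # Box 1
    [0.2, 0.3, 0.5],  # Box 2
    [0.5, 0.3, 0.2],  # Box 3
]

boxes = fill_boxes(animals, prob, random.Random(0))
print(boxes)  # some ordering of the three animals
```

Passing a seeded random.Random makes a run reproducible, which helps when checking that box 1 follows its row of the table over many draws.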