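Problem description
I am implementing a genetic algorithm in numpy, and I am trying to figure out how to correctly implement selection via roulette wheel and stochastic universal sampling (SUS). The examples I have seen on Stack Overflow and elsewhere use Python loops rather than vectorized numpy code.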
For example, here is how DEAP implements the two algorithms (the two functions listed below).
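My own roulette wheel implementation is the NumPy snippet near the end of this question. It appears to be weighted sampling with replacement, but I am not sure about the replace parameter.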
import random
from operator import attrgetter


def selRoulette(individuals, k, fit_attr="fitness"):
    """Select *k* individuals from the input *individuals* using *k*
    spins of a roulette. The selection is made by looking only at the first
    objective of each individual. The list returned contains references to
    the input *individuals*.

    :param individuals: A list of individuals to select from.
    :param k: The number of individuals to select.
    :param fit_attr: The attribute of individuals to use as selection criterion
    :returns: A list of selected individuals.

    This function uses the :func:`~random.random` function from the python base
    :mod:`random` module.
    """
    s_inds = sorted(individuals, key=attrgetter(fit_attr), reverse=True)
    sum_fits = sum(getattr(ind, fit_attr).values[0] for ind in individuals)
    chosen = []
    for i in range(k):  # the quoted code used xrange (Python 2)
        u = random.random() * sum_fits
        sum_ = 0
        for ind in s_inds:
            sum_ += getattr(ind, fit_attr).values[0]
            if sum_ > u:
                chosen.append(ind)
                break
    return chosen


def selStochasticUniversalSampling(individuals, k, fit_attr="fitness"):
    """Select the *k* individuals among the input *individuals*.
    The selection is made by using a single random value to sample all of the
    individuals by choosing them at evenly spaced intervals. The list returned
    contains references to the input *individuals*.

    :param individuals: A list of individuals to select from.
    :param k: The number of individuals to select.
    :param fit_attr: The attribute of individuals to use as selection criterion
    :return: A list of selected individuals.
    """
    s_inds = sorted(individuals, key=attrgetter(fit_attr), reverse=True)
    sum_fits = sum(getattr(ind, fit_attr).values[0] for ind in individuals)
    distance = sum_fits / float(k)
    start = random.uniform(0, distance)
    points = [start + i * distance for i in range(k)]
    chosen = []
    for p in points:
        i = 0
        sum_ = getattr(s_inds[i], fit_attr).values[0]
        while sum_ < p:
            i += 1
            sum_ += getattr(s_inds[i], fit_attr).values[0]
        chosen.append(s_inds[i])
    return chosen
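As for SUS selection: am I correct that, when implementing it in numpy, the only thing I need to change is to sample without replacement, or should I also remove the weights? This is the numpy code I have so far: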
# population is a 2D array of integers
# population_fitness is a 1D array of float of same length as population
weights = population_fitness / population_fitness.sum()
selected = population[np.random.choice(len(population),size=n,replace=True,p=weights)]
Thanks for any suggestions!
Solution
Both strategies may select the same individual more than once, so replacement is not what distinguishes them.
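As a rough illustration of that point, the sketch below contrasts `replace=True` and `replace=False` in `Generator.choice`. The fitness values are made up for the example and are not taken from the question. With `replace=False` every draw must be a different individual, so a weak individual is guaranteed a place whenever the sample is as large as the population, which is not what either selection scheme intends.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

# Hypothetical fitness values: individual 0 dominates the population.
fitness = np.array([97.0, 1.0, 1.0, 1.0])
weights = fitness / fitness.sum()

# Roulette-wheel style: independent spins, so repeats are allowed.
with_replacement = rng.choice(len(fitness), size=4, replace=True, p=weights)

# replace=False forces four distinct individuals, so the three weak
# individuals are always selected despite their tiny fitness.
without_replacement = rng.choice(len(fitness), size=4, replace=False, p=weights)

# Over many spins with replacement, the share of individual 0 is close to 0.97.
many = rng.choice(len(fitness), size=10_000, replace=True, p=weights)
share_of_best = np.mean(many == 0)

print("with replacement:   ", with_replacement)
print("without replacement:", np.sort(without_replacement))
print("share of individual 0 over many spins:", share_of_best)
```

Sampling with replacement keeps the selection proportional to fitness, which is the behaviour both roulette wheel selection and SUS aim for.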
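I don't know how np.random.choice is implemented internally, and in any case the function's contract does not specify the implementation method, so it could change at any time. Below are my implementations of both selection strategies using numpy.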
Be sure to test these functions before relying on them for anything serious.
Edit: there is no need to sort by fitness; I don't know what I was thinking.
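To see why the ordering does not matter, here is a minimal sketch with made-up fitness values and a helper name, `slot_widths`, introduced only for this illustration. On a wheel built with a cumulative sum, each individual's slot is exactly as wide as its own fitness, whatever the order of the population, so its selection probability stays the same.

```python
import numpy as np

rng = np.random.default_rng(0)

fitness = np.array([0.5, 2.0, 1.0, 0.25])  # hypothetical values


def slot_widths(f):
    """Width of each individual's slot on the wheel built by cumsum."""
    wheel = f.cumsum()
    return np.diff(wheel, prepend=0.0)


order = rng.permutation(len(fitness))  # an arbitrary reordering
widths_original = slot_widths(fitness)
widths_shuffled = slot_widths(fitness[order])

# Each individual keeps a slot as wide as its own fitness in both orderings.
print(np.allclose(widths_original, fitness))
print(np.allclose(widths_shuffled, fitness[order]))
```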
import numpy as np

formatters = {
    'int': lambda x: '%4d' % x,
    'float': lambda x: '%.02f' % x,
}


def print_report(population, fitness, wheel, selectors, selected_individuals):
    with np.printoptions(formatter=formatters):
        print('          Population:', population)
        print('             Fitness:', fitness)
        print('      Roulette wheel:', wheel)      # fitness cumulative sum
        print('      Sampled values:', selectors)  # roulette "extractions"
        print('Selected individuals:', selected_individuals)


# This should be equivalent (in distribution) to
# rng.choice(len(population), size, p=fitness / fitness.sum())
def roulette_wheel_selection(rng: np.random.Generator,
                             population: np.ndarray,
                             fitness: np.ndarray,
                             size: int) -> np.ndarray:
    """ :Authors: Gianluca Gippetto """
    if size > len(population):
        raise ValueError
    fitness_cumsum = fitness.cumsum()  # the "roulette wheel"
    fitness_sum = fitness_cumsum[-1]   # sum of all fitness values (size of the wheel)
    sampled_values = rng.random(size) * fitness_sum
    # For each sampled value, get the corresponding roulette wheel slot
    # (an index into population)
    selected = np.searchsorted(fitness_cumsum, sampled_values)
    print_report(population, fitness, fitness_cumsum, sampled_values, selected)
    return selected


def sus(rng: np.random.Generator,
        population: np.ndarray,
        fitness: np.ndarray,
        size: int) -> np.ndarray:
    """ https://en.wikipedia.org/wiki/Stochastic_universal_sampling
    :Authors: Gianluca Gippetto """
    if size > len(population):
        raise ValueError
    fitness_cumsum = fitness.cumsum()  # the "roulette wheel"
    fitness_sum = fitness_cumsum[-1]   # sum of all fitness values (size of the wheel)
    step = fitness_sum / size          # we'll move by this amount in the wheel
    start = rng.random() * step        # sample a start point in [0, step)
    # Get `size` evenly spaced points in the wheel. Building them from an
    # integer range avoids the length ambiguity of np.arange with a float step.
    selectors = start + step * np.arange(size)
    selected = np.searchsorted(fitness_cumsum, selectors)
    print_report(population, fitness, fitness_cumsum, selectors, selected)
    return selected


if __name__ == "__main__":
    from numpy.random import default_rng

    n = 10
    sample_size = 5
    rng = default_rng()
    # Random population data.
    # I'm sorting by fitness just to make the report easier to read
    population = np.arange(n)
    fitness = np.sort(
        np.abs(rng.normal(size=len(population)))
    )
    print('Roulette wheel sampling:')
    roulette_wheel_selection(rng, population, fitness, sample_size)
    print()
    print('SUS:')
    sus(rng, population, fitness, sample_size)
Output:
Roulette wheel sampling:
          Population: [   0    1    2    3    4    5    6    7    8    9]
             Fitness: [0.34 0.35 0.47 0.61 0.62 0.67 0.73 0.84 1.12 1.93]
      Roulette wheel: [0.34 0.69 1.16 1.77 2.39 3.06 3.79 4.64 5.75 7.69]
      Sampled values: [0.93 3.93 5.32 7.10 4.11]
Selected individuals: [   2    7    8    9    7]

SUS:
          Population: [   0    1    2    3    4    5    6    7    8    9]
             Fitness: [0.34 0.35 0.47 0.61 0.62 0.67 0.73 0.84 1.12 1.93]
      Roulette wheel: [0.34 0.69 1.16 1.77 2.39 3.06 3.79 4.64 5.75 7.69]
      Sampled values: [1.25 2.79 4.33 5.86 7.40]
Selected individuals: [   3    5    7    9    9]
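Since the answer recommends testing before serious use, here is one possible sanity check, offered as a sketch rather than a complete test suite. The helpers `roulette_indices` and `sus_indices` are compact, print-free restatements of the two functions above, written only for this check, and the fitness values are random placeholders. The check relies on two properties: roulette wheel frequencies should approach `fitness / fitness.sum()` over many spins, and in every single SUS run each individual's count should differ from its expected count `size * weight` by less than one.

```python
import numpy as np


def roulette_indices(rng, fitness, size):
    """Independent spins: one uniform draw per selected individual."""
    wheel = fitness.cumsum()
    return np.searchsorted(wheel, rng.random(size) * wheel[-1])


def sus_indices(rng, fitness, size):
    """One random offset, then `size` evenly spaced pointers."""
    wheel = fitness.cumsum()
    step = wheel[-1] / size
    pointers = rng.random() * step + step * np.arange(size)
    return np.searchsorted(wheel, pointers)


rng = np.random.default_rng(42)
fitness = np.abs(rng.normal(size=10)) + 0.05  # hypothetical, strictly positive
weights = fitness / fitness.sum()
n = len(fitness)

# Roulette wheel: over many spins, frequencies approach the weights.
draws = roulette_indices(rng, fitness, 200_000)
frequencies = np.bincount(draws, minlength=n) / draws.size
roulette_ok = bool(np.allclose(frequencies, weights, atol=0.01))

# SUS: in every run, each count is within one of size * weight.
size = 5
expected = size * weights
sus_ok = True
for _ in range(1_000):
    counts = np.bincount(sus_indices(rng, fitness, size), minlength=n)
    sus_ok = sus_ok and bool(np.all(np.abs(counts - expected) < 1.0))

print("roulette frequencies match weights:", roulette_ok)
print("SUS counts within one of expectation:", sus_ok)
```

The tighter bound on SUS counts reflects its lower variance: it uses a single random offset instead of one random draw per selected individual.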