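Problem description
I'm trying to train my neural network to play a very simple game, but without success. The problem is that differential_evolution() from scipy doesn't run long enough: I set maxiter=1000, but the function runs for only 41 iterations.
Here is the code: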
import numpy as np
import tensorflow as tf
from scipy.optimize import differential_evolution
from tensorflow import keras
from tensorflow.keras import layers

# genome_to_nn, nn_to_genome and Game_2 are my own helpers (not shown here).

def fitness_func(x, *args):
    #print('fitness func started')
    arch, width, height = args
    # Build a network from the flat parameter vector x.
    net = genome_to_nn(x, arch)
    my_game = Game_2(height, width)
    count = 0
    move = -1
    # Play 100 steps of the game with the candidate network.
    while count < 100:
        count += 1
        field = my_game.get_np_field()
        decision_tensor = net(field)
        move = int(tf.math.argmax(decision_tensor, axis=1))
        if move != 2:
            my_game.make_a_move(move)
        if count % 2:
            my_game.make_random()
        my_game.next_iter()
    # Lower is better: a higher score gives a smaller return value.
    result = 1/(150 + my_game.score)
    return result

if __name__ == '__main__':
    field_width = 5
    field_height = 10
    inp_size = field_width*(field_height-1) + 1
    model = keras.Sequential(
        [
            layers.Dense(10, input_dim=inp_size),
            layers.Dense(10, input_dim=10, activation='sigmoid'),
            layers.Dense(3, input_dim=10, activation='softmax'),
        ]
    )
    args = (model, field_width, field_height)
    # One (-10, 10) bound per network parameter.
    bounds = np.asarray([(-10, 10) for i in range(len(nn_to_genome(model)))])
    print('start evolution')
    res = differential_evolution(fitness_func, bounds=bounds, args=args, maxiter=50, workers=70, disp=True)
    print('DE finished')
    fitted_model = genome_to_nn(res.x, model)
    print(res)
I then get the following output:
start evolution
/opt/anaconda3/envs/myenv2/lib/python3.7/site-packages/scipy/optimize/_differentialevolution.py:494: UserWarning: differential_evolution: the 'workers' keyword has overridden updating='immediate' to updating='deferred'
" updating='deferred'",UserWarning)
differential_evolution step 1: f(x)= 0.00606061
differential_evolution step 2: f(x)= 0.00606061
differential_evolution step 3: f(x)= 0.00606061
differential_evolution step 4: f(x)= 0.00606061
differential_evolution step 5: f(x)= 0.00606061
differential_evolution step 6: f(x)= 0.00606061
differential_evolution step 7: f(x)= 0.00606061
differential_evolution step 8: f(x)= 0.00606061
differential_evolution step 9: f(x)= 0.00606061
differential_evolution step 10: f(x)= 0.00606061
differential_evolution step 11: f(x)= 0.00606061
differential_evolution step 12: f(x)= 0.00606061
differential_evolution step 13: f(x)= 0.00606061
differential_evolution step 14: f(x)= 0.00606061
differential_evolution step 15: f(x)= 0.00606061
differential_evolution step 16: f(x)= 0.00606061
differential_evolution step 17: f(x)= 0.00606061
differential_evolution step 18: f(x)= 0.00606061
differential_evolution step 19: f(x)= 0.00606061
differential_evolution step 20: f(x)= 0.00606061
differential_evolution step 21: f(x)= 0.00606061
differential_evolution step 22: f(x)= 0.00606061
differential_evolution step 23: f(x)= 0.00606061
differential_evolution step 24: f(x)= 0.00606061
differential_evolution step 25: f(x)= 0.00606061
differential_evolution step 26: f(x)= 0.00606061
differential_evolution step 27: f(x)= 0.00606061
differential_evolution step 28: f(x)= 0.00606061
differential_evolution step 29: f(x)= 0.00606061
differential_evolution step 30: f(x)= 0.00606061
differential_evolution step 31: f(x)= 0.00606061
differential_evolution step 32: f(x)= 0.00606061
differential_evolution step 33: f(x)= 0.00606061
differential_evolution step 34: f(x)= 0.00606061
differential_evolution step 35: f(x)= 0.00606061
differential_evolution step 36: f(x)= 0.00606061
differential_evolution step 37: f(x)= 0.00606061
differential_evolution step 38: f(x)= 0.00606061
differential_evolution step 39: f(x)= 0.00606061
differential_evolution step 40: f(x)= 0.00606061
differential_evolution step 41: f(x)= 0.00606061
DE finished
fun: 0.006060606060606061
message: 'Optimization terminated successfully.'
nfev: 399735
nit: 41
success: True
x: array([-6.59142662,-6.6655827,7.01109519,-6.61588426,-8.99447424,...
just a lot of unnecessary numbers here
...
-4.19133698,1.62013289,5.72924953,-0.29303238,-2.17649926,1.91011116,9.8819633,-9.58588766,6.05450803])
WARNING:tensorflow:From /opt/anaconda3/envs/myenv2/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: model_trained/assets
P.S. It doesn't matter whether maxiter is 50 or 5000 (anything above 41): the function still runs for 41 iterations.
If I set maxiter=30, I get the following result:
message: 'Maximum number of iterations has been exceeded.'
nfev: 298425
nit: 30
success: False
UPD: I changed the return value of the fitness function to
result = - my_game.score
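So the return value is now in the range [-100, 100] (previously it was in [1/250, 1/50]), and it works! However, I still don't know why it didn't work with the old version of the function. The official documentation doesn't mention any restrictions on the return value (other than that it must be a number).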
Solution
Hi, you can also use the DE optimizer in TensorFlow Probability; see the TF Probability docs for DE.
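
As for why the old return value stopped at 41 iterations with "Optimization terminated successfully", one plausible explanation (not confirmed in this thread) is the stopping test described in the SciPy documentation for differential_evolution: the solver stops once np.std(population_energies) <= atol + tol * np.abs(np.mean(population_energies)), with defaults tol=0.01 and atol=0. The test is relative to the mean, so 1/(150 + score) squeezes every candidate into a narrow band around a nonzero mean and can pass the test while the scores still differ. The sketch below reproduces that test with NumPy only. The helper name de_converged and the score values are made up for illustration; they are not taken from the question.

```python
import numpy as np

def de_converged(energies, tol=0.01, atol=0.0):
    """Relative-spread stopping test as described in the SciPy docs
    for differential_evolution (hypothetical helper, not SciPy code)."""
    energies = np.asarray(energies, dtype=float)
    return bool(np.std(energies) <= atol + tol * np.abs(np.mean(energies)))

# Hypothetical game scores for a small population (illustrative values only).
scores = np.array([14, 15, 13, 15, 14, 15, 12, 15], dtype=float)

old_fitness = 1.0 / (150.0 + scores)  # return value used originally
new_fitness = -scores                 # return value after the update

# Same scores, different scaling of the objective.
print("old objective converged:", de_converged(old_fitness))
print("new objective converged:", de_converged(new_fitness))
```

With these made-up scores the old objective passes the test and the new one does not, which would match the behaviour in the question. If this is indeed the cause, keeping the old return value and passing a smaller tol to differential_evolution should make the test stricter and let the run continue longer.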