SciPy differential evolution does not run for the given number of iterations

Problem description

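I'm trying to train my neural network to play a very simple game, without success. The problem is that differential_evolution() from SciPy doesn't run long enough: I set maxiter=1000, but the function runs for only 41 iterations. Here is the code: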

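import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from scipy.optimize import differential_evolution

# Game_2, genome_to_nn and nn_to_genome are defined elsewhere (not shown here).

def fitness_func(x,*args):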
    #print('fitness func started')
    arch,width,height = args
    net = genome_to_nn(x,arch)
    my_game = Game_2(height,width)
    count = 0
    move = -1
    while count < 100:
        count += 1
        field = my_game.get_np_field()
        decision_tensor = net(field)
        move = int(tf.math.argmax( decision_tensor,axis =1))
        if move != 2:           
            my_game.make_a_move(move)       
        if count % 2:
            my_game.make_random()        
        my_game.next_iter()
    
    result = 1/(150 + my_game.score)
    return result


if __name__ == '__main__':
    field_width = 5
    field_height = 10
    inp_size = field_width*(field_height-1) + 1    
    
    model = keras.Sequential(
        [
            layers.Dense(10,input_dim = inp_size),layers.Dense(10,input_dim = 10,activation='sigmoid'),layers.Dense(3,input_dim =10,activation='softmax')
        ]
    )

    args = (model,field_width,field_height)
    bounds = np.asarray([(-10,10) for i in range(len(nn_to_genome(model)))])

    print('start evolution')
    res = differential_evolution(fitness_func,bounds= bounds,args=args,maxiter=50,workers=70,disp=True) 

    print('DE finished')
    
    fitted_model = genome_to_nn(res.x,model)
    
    print(res)

Then I get the following output:

start evolution
/opt/anaconda3/envs/myenv2/lib/python3.7/site-packages/scipy/optimize/_differentialevolution.py:494: UserWarning: differential_evolution: the 'workers' keyword has overridden updating='immediate' to updating='deferred'
  " updating='deferred'",UserWarning)
differential_evolution step 1: f(x)= 0.00606061
differential_evolution step 2: f(x)= 0.00606061
differential_evolution step 3: f(x)= 0.00606061
differential_evolution step 4: f(x)= 0.00606061
differential_evolution step 5: f(x)= 0.00606061
differential_evolution step 6: f(x)= 0.00606061
differential_evolution step 7: f(x)= 0.00606061
differential_evolution step 8: f(x)= 0.00606061
differential_evolution step 9: f(x)= 0.00606061
differential_evolution step 10: f(x)= 0.00606061
differential_evolution step 11: f(x)= 0.00606061
differential_evolution step 12: f(x)= 0.00606061
differential_evolution step 13: f(x)= 0.00606061
differential_evolution step 14: f(x)= 0.00606061
differential_evolution step 15: f(x)= 0.00606061
differential_evolution step 16: f(x)= 0.00606061
differential_evolution step 17: f(x)= 0.00606061
differential_evolution step 18: f(x)= 0.00606061
differential_evolution step 19: f(x)= 0.00606061
differential_evolution step 20: f(x)= 0.00606061
differential_evolution step 21: f(x)= 0.00606061
differential_evolution step 22: f(x)= 0.00606061
differential_evolution step 23: f(x)= 0.00606061
differential_evolution step 24: f(x)= 0.00606061
differential_evolution step 25: f(x)= 0.00606061
differential_evolution step 26: f(x)= 0.00606061
differential_evolution step 27: f(x)= 0.00606061
differential_evolution step 28: f(x)= 0.00606061
differential_evolution step 29: f(x)= 0.00606061
differential_evolution step 30: f(x)= 0.00606061
differential_evolution step 31: f(x)= 0.00606061
differential_evolution step 32: f(x)= 0.00606061
differential_evolution step 33: f(x)= 0.00606061
differential_evolution step 34: f(x)= 0.00606061
differential_evolution step 35: f(x)= 0.00606061
differential_evolution step 36: f(x)= 0.00606061
differential_evolution step 37: f(x)= 0.00606061
differential_evolution step 38: f(x)= 0.00606061
differential_evolution step 39: f(x)= 0.00606061
differential_evolution step 40: f(x)= 0.00606061
differential_evolution step 41: f(x)= 0.00606061
DE finished
     fun: 0.006060606060606061
 message: 'Optimization terminated successfully.'
    nfev: 399735
     nit: 41
 success: True
       x: array([-6.59142662,-6.6655827,7.01109519,-6.61588426,-8.99447424,...
         just a lot of unnecessary numbers here
         ...
       -4.19133698,1.62013289,5.72924953,-0.29303238,-2.17649926,1.91011116,9.8819633,-9.58588766,6.05450803])
WARNING:tensorflow:From /opt/anaconda3/envs/myenv2/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: model_trained/assets

P.S. It doesn't matter whether maxiter is 50 or 5000 (anything above 41): the function still runs for exactly 41 iterations.
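The behaviour described above is consistent with maxiter being only an upper bound. According to the SciPy documentation, the solver also stops as soon as the spread of the population's objective values falls below a tolerance (tol=0.01 by default). Here is a minimal sketch of that effect, using a made-up constant objective (flat_objective is hypothetical and has nothing to do with the game above):

```python
from scipy.optimize import differential_evolution

def flat_objective(x):
    # Hypothetical objective: every candidate gets the same value,
    # so the population's objective values have zero spread.
    return 1.0

bounds = [(-10, 10)] * 2

# maxiter is a cap, not a target: the run ends early once the
# convergence test on the population is satisfied.
res = differential_evolution(flat_objective, bounds, maxiter=1000, polish=False)

print(res.nit, res.success, res.message)
```

The run should report success long before reaching 1000 iterations, because the solver considers the population converged.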

If I set maxiter=30, I get the following result:

message: 'Maximum number of iterations has been exceeded.'
    nfev: 298425
     nit: 30
 success: False

UPD: I changed the return value of the fitness function to

result = - my_game.score

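So now the return value is in the range [-100, 100] (previously it was in [1/250, 1/50]), and it works! However, I still don't understand why it didn't work with the old version of the function. The official documentation doesn't mention any restrictions on the return value (other than that it should be a number).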
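One possible explanation, offered as a sketch rather than a confirmed diagnosis: the SciPy docs describe the stopping rule as np.std(energies) <= atol + tol * abs(np.mean(energies)), which is a relative test. With 1/(150 + score), the constant 150 squeezes all values close together relative to their mean, so the test can pass even while the scores still differ. The scores below are hypothetical, not taken from the run above, and de_converged is an illustrative helper, not a SciPy function:

```python
import numpy as np

def de_converged(energies, tol=0.01, atol=0.0):
    """Documented stopping rule: std <= atol + tol * |mean| (SciPy defaults)."""
    energies = np.asarray(energies, dtype=float)
    return bool(np.std(energies) <= atol + tol * abs(np.mean(energies)))

# Hypothetical game scores for a small population (assumption for illustration).
scores = np.array([15, 15, 14, 15, 16, 15, 13, 15])

old_fitness = 1.0 / (150 + scores)   # original return value
new_fitness = -scores.astype(float)  # return value after the update

for name, f in [("1/(150+score)", old_fitness), ("-score", new_fitness)]:
    rel_spread = np.std(f) / abs(np.mean(f))
    print(f"{name}: relative spread = {rel_spread:.4f}, converged = {de_converged(f)}")
```

With these made-up scores, the same population counts as converged under the old fitness and as not converged under the new one. If that is what happened here, passing a smaller tol to differential_evolution would make the test stricter and keep the old fitness function running longer.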

Solution

Hi, you can also use the DE optimizer in TensorFlow Probability; see the TF Probability docs for DE.