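Evaluation function for minimax

Question

Hi everyone, I'm currently taking the CS50AI course. The first assignment is to build a tictactoe AI using minimax. My question is this: as far as I understand, minimax needs a static evaluation of game positions. I tried to write something like this in pseudocode: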

If the next move is a winning move:
    return 10 points
elif the opponent is about to win and the move stops them:
    return 8 points

and so on. However, when I looked at other people's minvalue/maxvalue functions, there was nothing like this.

import math

# player, actions, result, terminal, utility and X come from the rest of
# the assignment's tictactoe module and are not shown here.

def minimax(board):
    """
    Returns the optimal action for the current player on the board.
    """
    currentactions = actions(board)
    if player(board) == X:
        vT = -math.inf
        move = set()
        for action in currentactions:
            v, count = maxvalue(result(board, action), 0)
            if v > vT:
                vT = v
                move = action
    else:
        vT = math.inf
        move = set()
        for action in currentactions:
            v, count = minvalue(result(board, action), 0)
            if v < vT:
                vT = v
                move = action
    print(count)
    return move

def maxvalue(board, count):
    """
    Calculates the max value of a given board recursively together with minvalue
    """
    if terminal(board):
        return utility(board), count + 1

    v = -math.inf
    posactions = actions(board)

    for action in posactions:
        vret, count = minvalue(result(board, action), count)
        v = max(v, vret)

    return v, count + 1

def minvalue(board, count):
    """
    Calculates the min value of a given board recursively together with maxvalue
    """
    if terminal(board):
        return utility(board), count + 1

    v = math.inf
    posactions = actions(board)

    for action in posactions:
        vret, count = maxvalue(result(board, action), count)
        v = min(v, vret)

    return v, count + 1

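These are the max/min functions from sikburn's tictactoe implementation. I don't understand what the maxvalue and minvalue functions actually return. Can anyone clarify the logic for me? By the way, terminal() checks whether the game is over (someone has won or it is a tie), and result() takes a board and an action as input and returns the resulting board. Thanks for all your help.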

Answer

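In the utility function (not included in your code), you presumably assign 1 to a win for X, -1 to a win for O, and 0 otherwise. The minimax function recursively calls minvalue and maxvalue for every possible move until the game ends, whether in a tie or a win. Only then does it call utility to get a value, so no static evaluation of intermediate positions is needed: tictactoe is small enough to search all the way to the end. Each non-terminal board simply takes the best value among its successors, which is how minvalue and maxvalue ensure that both X and O always choose their best move.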
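As a rough illustration of what such a utility function might look like, here is a minimal sketch. It assumes the usual CS50-style board (a 3x3 list of lists holding "X", "O", or None); the winner helper and the exact representation are assumptions, since the original utility code is not shown.

```python
# Assumed representation: 3x3 list of lists containing "X", "O", or None.
X, O, EMPTY = "X", "O", None

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    lines = [list(row) for row in board]                               # rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]       # columns
    lines.append([board[i][i] for i in range(3)])                      # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])                  # anti-diagonal
    for a, b, c in lines:
        if a is not EMPTY and a == b == c:
            return a
    return None

def utility(board):
    """Score a finished game: 1 if X won, -1 if O won, 0 for a tie."""
    w = winner(board)
    if w == X:
        return 1
    if w == O:
        return -1
    return 0
```

With this scoring, X is the maximizing player and O the minimizing player, which is why the recursion alternates between maxvalue and minvalue.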

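Don't forget to check whether the board is terminal, and to return None in that case, before doing anything else in the minimax function.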

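In minimax, swap the calls to the minvalue and maxvalue functions: for X, call minvalue (because after X moves it is O's turn, and X wants to know O's best reply), and for O, call maxvalue (for the same reason).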
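Putting the terminal check and the swapped calls together, a self-contained sketch could look like the following. The helper functions (player, actions, result, winner, terminal, utility) are my own assumed versions of the CS50 helpers, not the original ones, and the count bookkeeping is left out to keep it short.

```python
import math

X, O, EMPTY = "X", "O", None  # assumed CS50-style markers

def player(board):
    """X moves first, so X is to move when both sides have placed equally many marks."""
    xs = sum(row.count(X) for row in board)
    os_ = sum(row.count(O) for row in board)
    return X if xs == os_ else O

def actions(board):
    """All empty cells as (row, column) pairs."""
    return {(r, c) for r in range(3) for c in range(3) if board[r][c] is EMPTY}

def result(board, action):
    """Return a new board with the current player's mark placed at action."""
    r, c = action
    new = [row[:] for row in board]
    new[r][c] = player(board)
    return new

def winner(board):
    lines = [list(row) for row in board]
    lines += [[board[r][c] for r in range(3)] for c in range(3)]
    lines += [[board[i][i] for i in range(3)], [board[i][2 - i] for i in range(3)]]
    for a, b, c in lines:
        if a is not EMPTY and a == b == c:
            return a
    return None

def terminal(board):
    return winner(board) is not None or not actions(board)

def utility(board):
    return {X: 1, O: -1}.get(winner(board), 0)

def maxvalue(board):
    """Best value X can force from this board (X to move)."""
    if terminal(board):
        return utility(board)
    return max(minvalue(result(board, a)) for a in actions(board))

def minvalue(board):
    """Best value O can force from this board (O to move)."""
    if terminal(board):
        return utility(board)
    return min(maxvalue(result(board, a)) for a in actions(board))

def minimax(board):
    if terminal(board):
        return None  # no move to make on a finished game
    if player(board) == X:
        # After X moves it is O's turn, so score each candidate with minvalue.
        return max(actions(board), key=lambda a: minvalue(result(board, a)))
    # After O moves it is X's turn, so score each candidate with maxvalue.
    return min(actions(board), key=lambda a: maxvalue(result(board, a)))
```

For example, on a board where X holds (0, 0) and (0, 1) and O holds (1, 0) and (1, 1), it is X's turn and minimax should pick the winning cell (0, 2).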

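If you want to see the evaluation at each iteration, you can print something like f"Minvalue: {v}, Iteration: {count+1}" and f"Maxvalue: {v}, Iteration: {count+1}" at the end of the minvalue and maxvalue functions, just before returning those values. I think that will make it easier to understand.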
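To show what that tracing looks like without the whole tictactoe module, here is a small sketch on a made-up game tree. The tree (nested lists whose leaves are utility values) is purely hypothetical; the print statements and the count handling follow the suggestion above.

```python
import math

def maxvalue(node, count):
    """Maximizer's turn. A node is either a number (leaf utility) or a list of children."""
    if isinstance(node, (int, float)):
        return node, count + 1
    v = -math.inf
    for child in node:
        vret, count = minvalue(child, count)
        v = max(v, vret)
    print(f"Maxvalue: {v}, Iteration: {count + 1}")
    return v, count + 1

def minvalue(node, count):
    """Minimizer's turn. Same node convention as maxvalue."""
    if isinstance(node, (int, float)):
        return node, count + 1
    v = math.inf
    for child in node:
        vret, count = maxvalue(child, count)
        v = min(v, vret)
    print(f"Minvalue: {v}, Iteration: {count + 1}")
    return v, count + 1

# Hypothetical two-level tree: the maximizer picks a branch, the minimizer picks a leaf.
tree = [[1, 0], [-1, 1]]
value, visited = maxvalue(tree, 0)
# Prints:
# Minvalue: 0, Iteration: 3
# Minvalue: -1, Iteration: 6
# Maxvalue: 0, Iteration: 7
```

The minimizer would answer the first branch with 0 and the second with -1, so the maximizer takes the first branch and the root is worth 0. The count returned at the end is the number of nodes visited (7 here).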

I hope this clears up your doubts.

artificial-intelligence cs50 minimax python