python - Connect 4 alpha-beta pruning possibly failing :(


Reposted by: 塔克拉玛干 | Updated: 2023-11-03 03:56:01

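I have been working on AI problems for a while, and this week I tried to write an AI for Connect 4 in Python. At first I had trouble copying the board, but I found that in Python you need to use deepcopy to avoid copying errors.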
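
A minimal sketch of the copying problem, assuming the board is a list of row lists as in the code below. A shallow copy shares the row lists with the original, so a move made on the copy also shows up on the original board; copy.deepcopy duplicates the rows as well.

```python
import copy

# A 6x7 board stored as a list of row lists.
board = [[" "] * 7 for _ in range(6)]

shallow = list(board)        # copies only the outer list; the rows are shared
deep = copy.deepcopy(board)  # copies the outer list and every row

shallow[0][3] = "x"          # also changes board[0][3]
deep[0][4] = "o"             # leaves the original board untouched

print(repr(board[0][3]), repr(board[0][4]))
```

The search code in the question avoids copying inside the recursion by making and unmaking moves on one board, which is usually cheaper than a deepcopy per node.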

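Eventually I managed to write an alpha-beta pruning algorithm and it seemed to work, but then I tested my algorithm at depth 8 against an online alpha-beta pruning algorithm at depth 6, and to my surprise my algorithm lost. I built the evaluation function following Harvard course instructions and adapted the alpha-beta code from msaveski's code (both links are in the code).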

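Could someone who has worked on these problems check whether my algorithm and evaluation function work as intended? I am fairly sure there is a bug somewhere. I know I could use transposition tables, iterative deepening and so on to make the code faster and more efficient, but my other goal is to keep the code simple.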

Here is my code:

# -*- coding: utf-8 -*-

import copy

class ConnectFour:
    def __init__(self):
        self.moves = 0  # Number of moves played; 42 means the board is full
        self.turn = 0  # Tracks whose turn it is

    def PrintGameBoard(self, board):
        print('  0   1   2   3   4   5   6') # This function just draws the board
        for i in range(5, -1, -1):
            print('|---|---|---|---|---|---|---|')
            print("| ",end="")
            for j in range(7):
                print(board[i][j],end="")
                if j != 6:
                    print(" | ",end="")
                else:
                    print(" |")
        print('`---------------------------´')

    def LegalRow(self, col, board):
        # Transpose the board into columns and count the marks in the given column.
        # Returns the row where the next mark lands, or -1 if the column is full.
        stacks = [[x[i] for x in board] for i in range(len(board[0]))]
        countofitems = stacks[col].count("x") + stacks[col].count("o") # count of items in stack
        if (countofitems) < 6:
            return (countofitems)
        else:
            return -1

    def LegalMoves(self, board):
        legalmoves = []
        order = [3,2,4,1,5,0,6] # Centre columns first (move ordering)
        for i in order:
            if self.LegalRow(i, board)!=-1:
                legalmoves.append(i)
        return legalmoves

    def MakeMove(self, board, col, player, row):
        board[row][col] = player # This function make a move and increases count of moves
        self.moves += 1
        return board

    def UnmakeMove(self, board, col, row):
        board[row][col] = " " # This function undoes a move and decreases the count of moves
        self.moves -= 1
        return board

    def IsWinning(self, currentplayer, board):
        for i in range(6): # This function returns True if the given player has four marks in a row (a win), otherwise False
            for j in range(4):
                if board[i][j] == currentplayer and board[i][j+1] == currentplayer and board[i][j+2] == currentplayer and board[i][j+3] == currentplayer:
                    return True
        for i in range(3):
            for j in range(7):
                if board[i][j] == currentplayer and board[i+1][j] == currentplayer and board[i+2][j] == currentplayer and board[i+3][j] == currentplayer:
                    return True     
        for i in range(3):
            for j in range(4):
                if board[i][j] == currentplayer and board[i+1][j+1] == currentplayer and board[i+2][j+2] == currentplayer and board[i+3][j+3] == currentplayer:
                    return True
        for i in range(3,6):
            for j in range(4):
                if board[i][j] == currentplayer and board[i-1][j+1] == currentplayer and board[i-2][j+2] == currentplayer and board[i-3][j+3] == currentplayer:
                    return True
        return False

    def PlayerMove(self, board, player):
        allowedmove = False     # This function asks for the player's move on their turn and returns the board after making it.
        while not allowedmove:
            try:
                print("Choose a column where you want to make your move (0-6): ",end="")
                col =input()
                col=int(col)
                row = self.LegalRow(col, board)
            except (NameError, ValueError, IndexError, TypeError, SyntaxError) as e:
                print("Give a number as an integer between 0-6!")
            else:
                if row != -1 and (col<=6 and col>=0):
                    board[row][int(col)] = player
                    self.moves += 1
                    allowedmove = True
                elif col>6 or col<0:
                    print("The range was 0-6!!!")
                else:
                    print("This column is full")
        return board

    def SwitchPlayer(self, player): # This function returns the other player's mark
        if player=="x":
            nextplayer="o"
        else:
            nextplayer="x"
        return nextplayer

    def evaluation(self, board): # This function evaluates the game board (heuristic). The evaluation rules are described at: http://isites.harvard.edu/fs/docs/icb.topic788088.files/Assignment%203/asst3c.pdf
        list = []
        player = "x"
        opponent = "o"
        if self.IsWinning(player, board):
            score = -512
        elif self.IsWinning(opponent, board):
            score = +512
        elif self.moves==42:
            score=0
        else:
            score = 0
            for i in range(6):  #append to list horizontal segments
                for j in range(4):
                    list.append([str(board[i][j]),str(board[i][j+1]),str(board[i][j+2]),str(board[i][j+3])])
            for i in range(3): #append to list vertical segments
                for j in range(7):
                    list.append([str(board[i][j]),str(board[i+1][j]),str(board[i+2][j]),str(board[i+3][j])])
            for i in range(3): #append to list diagonal segments (up-right)
                for j in range(4):
                    list.append([str(board[i][j]),str(board[i+1][j+1]),str(board[i+2][j+2]),str(board[i+3][j+3])])
            for i in range(3, 6): #append to list diagonal segments (down-right)
                for j in range(4):
                    list.append([str(board[i][j]),str(board[i-1][j+1]),str(board[i-2][j+2]),str(board[i-3][j+3])])
            for k in range(len(list)): #add to score with rules of site above
                if ((list[k].count(str("x"))>0) and (list[k].count(str("o"))>0)) or list[k].count(" ")==4:
                    score += 0
                if list[k].count(player)==1 and list[k].count(opponent)==0:
                    score -= 1
                if list[k].count(player)==2 and list[k].count(opponent)==0:
                    score -= 10
                if list[k].count(player)==3 and list[k].count(opponent)==0:
                    score -= 50
                if list[k].count(opponent)==1 and list[k].count(player)==0:
                    score += 1
                if list[k].count(opponent)==2 and list[k].count(player)==0:
                    score += 10
                if list[k].count(opponent)==3 and list[k].count(player)==0:
                    score += 50
            if self.turn==player:
                score -= 16
            else:
                score += 16
        return score

    def maxfunction(self, board, depth, player, alpha, beta):
        opponent = self.SwitchPlayer(player)
        self.turn = opponent
        legalmoves = self.LegalMoves(board)
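        # Stop at the depth limit, on a full board, or once either side already has four in a row
        if depth==0 or self.moves==42 or self.IsWinning("x", board) or self.IsWinning("o", board):
            return self.evaluation(board)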
        value=-1000000000
        for col in legalmoves:
            row = self.LegalRow(col, board)
            newboard = self.MakeMove(board, col, opponent, row)
            value = max(value, self.minfunction(board, depth-1, opponent,alpha, beta))
            newboard = self.UnmakeMove(board, col, row)
            if value >= beta:
                return value
            alpha = max(alpha, value)
        return value

    def minfunction(self, board, depth, opponent, alpha, beta):
        player = self.SwitchPlayer(opponent)
        self.turn = player
        legalmoves = self.LegalMoves(board)
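        # Stop at the depth limit, on a full board, or once either side already has four in a row
        if depth==0 or self.moves==42 or self.IsWinning("x", board) or self.IsWinning("o", board):
            return self.evaluation(board)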
        value=1000000000
        for col in legalmoves:
            row = self.LegalRow(col, board)
            newboard = self.MakeMove(board, col, player, row)
            value = min(value, self.maxfunction(board, depth-1, player ,alpha, beta))
            newboard = self.UnmakeMove(board, col, row)
            if value <= alpha:
                return value
            beta = min(beta, value)
        return value

    def alphabetapruning(self, board, depth, opponent, alpha, beta): #This is the alphabeta-function modified from: https://github.com/msaveski/connect-four
        values = []
        cols = []
        value = -1000000000
        for col in self.LegalMoves(board):
            row = self.LegalRow(col, board)
            board = self.MakeMove(board, col, opponent, row)
            value = max(value, self.minfunction(board, depth-1, opponent, alpha, beta))
            values.append(value)
            cols.append(col)
            board = self.UnmakeMove(board, col, row)
        largestvalue= max(values)
        print(cols)
        print(values)
        for i in range(len(values)):
            if largestvalue==values[i]:
                position = cols[i]
                return largestvalue, position

    def searchingfunction(self, board, depth, opponent):
        #This function runs alphabeta (the main algorithm) on a copy of the board, then plays the chosen column on the real board and returns it.
        newboard = copy.deepcopy(board)
        value, position=self.alphabetapruning(newboard, depth, opponent, -10000000000, 10000000000)
        board = self.MakeMove(board, position, opponent, self.LegalRow(position, board))
        return board

def PlayerGoesFirst():
    print("Player is X and AI is O") #This function just asks who goes first
    player = 'x'
    opponent = 'o'
    print('Do you want to play first? (y/n) : ',end="")
    return input().lower().startswith('y')

def PlayAgain():
    print('Do you want to play again? (y/n) :',end="") #This function asks whether the player wants to play a new game
    return input().lower().startswith('y')

def main():
    print("Connect4") #The main function. It initializes the game board (table), prints the board after each turn, asks for the player's move, makes the AI's move, and checks after each move whether the game is a tie, a win or a loss.
    print("-"*34)
    while True:
        connectfour = ConnectFour()
        gameisgoing = True
        table  = [[],[],[],[],[],[]]
        for i in range(7):
            for j in range(6):
                table[j].append(" ")
        player = "x"
        opponent = "o"
        if PlayerGoesFirst():
            turn = "x"
        else:
            turn = "o"
        while gameisgoing:
            connectfour.PrintGameBoard(table)
            if turn=="x":
                table = connectfour.PlayerMove(table, player)
                if connectfour.IsWinning(player, table):
                    connectfour.PrintGameBoard(table)
                    print('You won the game!')
                    gameisgoing = False
                else:
                    if connectfour.moves==42:
                        connectfour.PrintGameBoard(table)
                        print('Game is tie')
                        gameisgoing=False
                    else:
                        turn = "o"
            else:
                table = connectfour.searchingfunction(table, 6, opponent) #Here is AI's move. Takes as input current table (board), depth and opponents mark. Output should be new gameboard with AI's move.
                if connectfour.IsWinning(opponent, table):
                    connectfour.PrintGameBoard(table)
                    print('Computer won the game')
                    gameisgoing = False
                else:
                    if connectfour.moves==42:
                        connectfour.PrintGameBoard(table)
                        print('Game is tie')
                        gameisgoing=False
                    else:
                        turn = "x"
        if not PlayAgain():
            print("Game ended")
            print("-"*34)
            break

if __name__ == '__main__':
    main()

Best answer

The implementation looks fine, and it plays reasonable moves.

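It is hard to draw conclusions about playing strength from a single game. You need to run more games to get reliable statistics, for example by varying the starting position and letting both engines play each position once as "X" and once as "O".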
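
A minimal sketch of such a match, under stated assumptions: run_match, play_game, first_player_wins and the openings list are hypothetical names introduced here for illustration, not part of the question's code. The play_game callback is assumed to play one game from an opening and report "first", "second" or "draw".

```python
from collections import Counter

def run_match(engine_a, engine_b, play_game, openings):
    """Play every opening twice with sides swapped and tally results for engine_a."""
    tally = Counter()
    for opening in openings:
        for a_moves_first in (True, False):
            if a_moves_first:
                first, second = engine_a, engine_b
            else:
                first, second = engine_b, engine_a
            result = play_game(first, second, opening)  # "first", "second" or "draw"
            if result == "draw":
                tally["draw"] += 1
            elif (result == "first") == a_moves_first:
                tally["a_wins"] += 1
            else:
                tally["b_wins"] += 1
    return tally

# Stand-in for a real game loop (hypothetical): whoever moves first wins.
def first_player_wins(first, second, opening):
    return "first"

openings = [[3], [2], [4]]  # hypothetical opening move lists (column indices)
tally = run_match("engine A", "engine B", first_player_wins, openings)
print(dict(tally))
```

With this stand-in game the tally comes out even, which shows why the sides are swapped: a first-move advantage cancels out instead of being credited to one engine.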

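There is also an element of luck. Alpha-beta generally plays better as the search depth increases. In some cases, however, it may end up playing a bad move that it would not have played with a shallower search.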

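Different evaluation functions have the same effect. Even if one evaluation function is worse overall, in some positions the worse one will lead to the better move.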

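The luck factor only disappears once you play enough games, or once your search is deep enough to see all variations. Connect 4 is a solved game: with perfect play, the player who moves first always wins. That is the only objective way to pinpoint search errors. Heuristics, like your evaluation function, will always have weaknesses.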
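
Searching Connect 4 to the end is expensive. As a cheaper, complementary sanity check on the pruning itself (an addition here, not part of the answer), alpha-beta can be compared with plain minimax on small trees, because pruning must never change the value at the root. The sketch below assumes game trees given as nested lists with integer leaves; minimax, alphabeta and random_tree are illustrative names.

```python
import random

def minimax(node, maximizing):
    """Exhaustive minimax over a nested-list tree with integer leaves."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    """Same search with alpha-beta cutoffs."""
    if isinstance(node, int):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            if value >= beta:
                break  # the minimizer above will never allow this line
            alpha = max(alpha, value)
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        if value <= alpha:
            break  # the maximizer above will never allow this line
        beta = min(beta, value)
    return value

def random_tree(depth, branching, rng):
    """Build a random tree of the given depth and branching factor."""
    if depth == 0:
        return rng.randint(-50, 50)
    return [random_tree(depth - 1, branching, rng) for _ in range(branching)]

rng = random.Random(0)
mismatches = 0
for _ in range(200):
    tree = random_tree(4, 3, rng)
    if minimax(tree, True) != alphabeta(tree, True):
        mismatches += 1
print("mismatches:", mismatches)
```

Any mismatch would point to a pruning bug rather than a weakness of the heuristic. The same comparison could be run on the Connect 4 code at a small depth by temporarily disabling the cutoffs.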

Regarding python - Connect 4 alpha-beta pruning possibly failing :(, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44970883/
