java - TicTacToe minimax algorithm returns unexpected results in 4x4 games

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:20:36

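In my method newminimax499 I have a minimax algorithm that uses memoization and alpha-beta pruning. The method works normally for 3x3 games, but when I play 4x4 games I get strange, unexpected position choices from the computer. It still never loses, but it doesn't seem to be playing to win. To illustrate the problem, here are two scenarios, one from a 3x3 game and one from a 4x4 game. First, a scenario from a 3x3 game where the player is X and makes the first move: [image not available]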

This isn't bad; in fact, it's what one would expect the computer to do. Now take a look at a scenario from a 4x4 game. Again, O is the computer and X starts: [image not available]

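As you can see, the computer simply places Os one after another in systematic order, and only breaks that order to block X when X is about to win. This is very defensive play, unlike what we saw in the 3x3 game. So why does the method behave differently for 3x3 and 4x4?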

Here's the code:

//This method returns a 2 element int array containing the score of the best possible
//next move and the position of that move, in that order. Utilizes memoization and
//alpha beta pruning to achieve better performance.
public int[] newminimax499(int a, int b){
    //int bestScore = (turn == 'O') ? +9 : -9; //X is minimizer, O is maximizer
    int bestPos=-1;
    int alpha= a;
    int beta= b;
    int currentScore;
    //boardShow();
    String stateString = new String(state); //Memoization key, e.g. "OX------X"
    int[] oldAnswer = oldAnswers.get(stateString);
    if (oldAnswer != null)
        return oldAnswer;
    if(isGameOver2()!='N'){ //Terminal state: score it and remember the result
        int[] answer = {score(), bestPos};
        oldAnswers.put (stateString, answer);
        return answer;
    }
    else{
        for(int x:getAvailableMoves()){
            if(turn=='X'){ //X is minimizer
                setX(x);
                //System.out.println(stateID++);
                currentScore = newminimax499(alpha, beta)[0];
                revert(x);
                if(currentScore<beta){
                    beta=currentScore;
                    bestPos=x;
                }
                if(alpha>=beta){
                    break;
                }
            }
            else { //O is maximizer
                setO(x);
                //System.out.println(stateID++);
                currentScore = newminimax499(alpha, beta)[0];
                revert(x);
                if(currentScore>alpha){
                    alpha=currentScore;
                    bestPos=x;
                }
                if(alpha>=beta){
                    break;
                }
            }
        }
    }
    if(turn=='X'){
        int[] answer = {beta, bestPos};
        oldAnswers.put (stateString, answer);
        return answer;
    }
    else {
        int[] answer = {alpha, bestPos};
        oldAnswers.put (stateString, answer);
        return answer;
    }
}

Below are the other components and supplementary methods you need to run the code. The fields and constructor used in my class State2:

//Needs at the top of the file: import java.util.HashMap; import java.util.Map;
private char [] state;  //Actual content of the board
private char turn; //Whose turn it is
private Map<String,int[]> oldAnswers; //Used for memoization. It saves every state along with the score it yielded which allows us to stop exploring the children of a certain node if a similar node's score has been previously calculated. The key is the board state(i.e OX------X for example), the int array is a 2 element array containing the score and position of last placed seed of the state.
private Map<Integer, int []> RowCol; //A mapping of positions from a board represented as a normal array to a board represented as a 2d array. For example: The position 0 maps to 0,0 on a 2d array board, 1 maps to 0,1 and so on.
private static int n; //Size of the board
private static int stateID; //A simple incrementer used to show number of recursive calls in the newminimax499 method.
private static int countX, countO; //Number of placed Xs and Os
private static int lastAdded; //Position of last placed seed
private char [][] DDState; //A 2d array representing the board. Contains the same values as state[]. Used for simplicity in functions that check the state of the board.

public State2(int n){
    int a=0;
    State2.n=n;
    state=new char[n*n];
    RowCol=new HashMap<Integer, int []>();
    countX=0;
    countO=0;
    //Initializing the board with empty slots
    for(int i = 0; i<state.length; i++){
        state[i]='-';
    }
    //Mapping
    for(int i=0; i<n; i++){
        for(int j=0; j<n; j++){
            RowCol.put(a, new int[]{i, j});
            a++;
        }
    }
    a=0;
    DDState=new char[n][n];
    //Initializing the 2d array with the values from state[](empty slots)
    for(int i=0; i<n; i++){
        for(int j=0; j<n; j++){
            DDState[i][j]=state[a];
            a++;
        }
    }
    oldAnswers = new HashMap<String,int[]>();
}
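
As a side note on the RowCol mapping described in the field comments: the same position-to-(row, column) relationship can be computed with integer division and remainder. The sketch below is only an illustration of that relationship, not part of the original class; the class name IndexMapping and the helpers rowOf and colOf are hypothetical.

```java
public class IndexMapping {
    // Hypothetical helper: row of a flat position on an n x n board.
    static int rowOf(int pos, int n) {
        return pos / n;
    }

    // Hypothetical helper: column of a flat position on an n x n board.
    static int colOf(int pos, int n) {
        return pos % n;
    }

    public static void main(String[] args) {
        int n = 4;
        // Position 0 maps to (0, 0), position 1 to (0, 1), and so on,
        // matching the pairs the constructor stores in RowCol.
        for (int pos = 0; pos < n * n; pos++) {
            System.out.println(pos + " -> (" + rowOf(pos, n) + ", " + colOf(pos, n) + ")");
        }
    }
}
```

With n = 4, position 5 maps to (1, 1) and position 15 to (3, 3), the same pairs that the nested loops in the constructor produce.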

Supplementary methods:

getAvailableMoves returns an array containing the empty slots on the board (i.e. the possible next moves).

public int[] getAvailableMoves(){
    int count=0;
    int i=0;
    for(int j=0; j<state.length; j++){
        if(state[j]=='-')
            count++;
    }
    int [] availableSlots = new int[count];
    for(int j=0; j<state.length; j++){
        if(state[j]=='-')
            availableSlots[i++]=j;
    }
    return availableSlots;
}

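isGameOver2() simply checks the current state of the board to determine whether the game is over. It returns one of the chars 'X', 'O', 'D' and 'N', which stand for X won, O won, draw, and not game over, respectively.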

public char isGameOver2(){
    char turnOpp;
    int count;
    if(turn=='X'){
        count=countO;
        turnOpp='O';
    }
    else {
        count=countX;
        turnOpp='X';
    }
    if(count>=n){
        //Check column for win
        for(int i=0; i<n; i++){
            if(DDState[i][RowCol.get(lastAdded)[1]]!=turnOpp)
                break;
            if(i==(n-1)){
                return turnOpp;
            }
        }

        //Check row for win
        for(int i=0; i<n; i++){
            if(DDState[RowCol.get(lastAdded)[0]][i]!=turnOpp)
                break;
            if(i==(n-1)){
                return turnOpp;
            }
        }

        //Check diagonal for win
        if(RowCol.get(lastAdded)[0] == RowCol.get(lastAdded)[1]){

            //we're on a diagonal
            for(int i = 0; i < n; i++){
                if(DDState[i][i] != turnOpp)
                    break;
                if(i == n-1){
                    return turnOpp;
                }
            }
        }

        //check anti diagonal

        for(int i = 0; i<n; i++){
            if(DDState[i][(n-1)-i] != turnOpp)
                break;
            if(i == n-1){
                return turnOpp;
            }
        }

        //check for draw
        if((countX+countO)==(n*n))
            return 'D';

    }
    return 'N';
}

boardShow prints a matrix display of the current state of the board:

public void boardShow(){
    if(n==3){
        System.out.println(stateID);
        for(int i=0; i<=6;i+=3)
            System.out.println("["+state[i]+"]"+" ["+state[i+1]+"]"+" ["+state[i+2]+"]");
        System.out.println("***********");
    }
    else {
        System.out.println(stateID);
        for(int i=0; i<=12;i+=4)
            System.out.println("["+state[i]+"]"+" ["+state[i+1]+"]"+" ["+state[i+2]+"]"+" ["+state[i+3]+"]");
        System.out.println("***********");
    }
}

score is a simple evaluation function that returns +10 for an O win, -10 for an X win and 0 for a draw:

public int score(){
    char result = isGameOver2(); //Evaluate the board once instead of twice
    if(result=='X')
        return -10;
    else if(result=='O')
        return +10;
    else
        return 0;
}

The seed setters:

//Sets an X at a certain location and updates the turn, countX and lastAdded variables
public void setX(int i){
    state[i]='X';
    DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='X';
    turn='O';
    countX++;
    lastAdded=i;
}

//Sets an O at a certain location and updates the turn, countO and lastAdded variables
public void setO(int i){
    state[i]='O';
    DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='O';
    turn='X';
    countO++;
    lastAdded=i;
}

revert simply undoes a move. For example, if an X has been placed at position 0, revert(0) puts a '-' in its place and updates the variables changed by setX:

public void revert(int i){
    state[i]='-';
    DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='-';
    if(turn=='X'){
        turn = 'O';
        countO--;
    }
    else {
        turn = 'X';
        countX--;
    }
}

The main method might look like this:

public static void main(String[] args) {
    State2 s=new State2(4);
    int [] results=new int[2];
    s.setX(0);
    long startTime = System.currentTimeMillis();
    results=s.newminimax499(Integer.MIN_VALUE,Integer.MAX_VALUE);
    long endTime = System.currentTimeMillis();
    System.out.println("Score: "+results[0]+" Position: "+ results[1]);
    System.out.println("Run time: " + (endTime-startTime));
    s.boardShow();
}

Best answer

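I don't believe there's a bug here. In the 3x3 game, if O plays in one of the earlier positions it gets forked, whereas if it plays in the center it forces a draw. Presumably the 4x4 game is even harder to win or lose, so the computer is indifferent between its moves and chooses the first open square.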
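
A minimal sketch of why indifference ends up as "first open square", assuming (as in the maximizer branch of the question's loop) that a move replaces the current best only when its score is strictly greater. The class name TieBreak and the helper pickMax are hypothetical and exist only for illustration; recursion and pruning are left out.

```java
public class TieBreak {
    // Picks a move the way the maximizer branch does: the best position is
    // updated only on a strictly greater score, so ties keep the earliest move.
    static int pickMax(int[] moves, int[] scores) {
        int best = Integer.MIN_VALUE;
        int bestPos = -1;
        for (int i = 0; i < moves.length; i++) {
            if (scores[i] > best) {
                best = scores[i];
                bestPos = moves[i];
            }
        }
        return bestPos;
    }

    public static void main(String[] args) {
        int[] moves = {1, 2, 3, 5};

        // Every move leads to a draw (score 0): the first listed move is kept.
        System.out.println(pickMax(moves, new int[]{0, 0, 0, 0})); // prints 1

        // One move leads to a win (score 10): that move is chosen instead.
        System.out.println(pickMax(moves, new int[]{0, 0, 10, 0})); // prints 3
    }
}
```

When every reply evaluates to a draw, the scores carry no preference, and the order of getAvailableMoves decides the move.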

Below, 1 denotes the forced response by O, 2 denotes the forking move by X, and ? denotes a possible winning position.

X|O|
-+-+-
2|X|?
-+-+-
?| |1

X| |O
-+-+-
X|2|?
-+-+-
1| |?

X|2|?
-+-+-
O|X|
-+-+-
 |?|1
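
The three diagrams can be checked mechanically. The sketch below is an illustration under stated assumptions, not code from the question: it encodes each 3x3 position after X's forking move (2) as a 9-character string in the same '-' convention as state[], and counts how many empty squares would complete a line for X. The class name ForkCheck and its helpers are hypothetical.

```java
public class ForkCheck {
    // All eight winning lines of a 3x3 board, as flat positions.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   // columns
        {0, 4, 8}, {2, 4, 6}               // diagonals
    };

    // True if the player occupies every square of at least one line.
    static boolean wins(char[] board, char player) {
        for (int[] line : LINES) {
            if (board[line[0]] == player && board[line[1]] == player && board[line[2]] == player) {
                return true;
            }
        }
        return false;
    }

    // Counts the empty squares where the player would win immediately.
    static int countWinningMoves(String position, char player) {
        char[] board = position.toCharArray();
        int count = 0;
        for (int i = 0; i < board.length; i++) {
            if (board[i] != '-') continue;
            board[i] = player;          // try the move
            if (wins(board, player)) count++;
            board[i] = '-';             // undo it
        }
        return count;
    }

    public static void main(String[] args) {
        // Each string is one diagram with O on square 1 and X on square 2 filled in.
        String[] positions = {"XO-XX---O", "X-OXX-O--", "XX-OX---O"};
        for (String p : positions) {
            System.out.println(p + " -> " + countWinningMoves(p, 'X') + " winning squares for X");
        }
    }
}
```

Each position reports two winning squares for X, which is what makes it a fork: O can block only one of them.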

Regarding "java - TicTacToe minimax algorithm returns unexpected results in 4x4 games", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32119093/
