Saturday, November 8, 2014

Alpha Beta pruning - Minimax Algorithm for Tic Tac Toe [Java]

I found this tutorial very helpful for understanding alpha-beta pruning:


What needs to be done:

1. Initially alpha and beta variables are set to Integer.MIN_VALUE and Integer.MAX_VALUE respectively to represent -infinity and +infinity

2. Traverse (Depth First) the whole tree (you can also set the depth in the code by setting the value of uptoDepth variable.

3. Traverse to the left (DFS) (intially from the root) until either leaf node (with or without the depth constraint) has been reached.

4. After every child node's evaluation, update parent's alpha or beta if required.

5. Evaluate other sibling only of beta > alpha. If that is not the case, we need to do pruning.

6. If beta <= alpha : We must return a value from the function alphaBetaMinimax(params) that sets the score value for a node-
    return Integer.MIN_VALUE if parent node (of the node being pruned) is a Max. Max will always     select the highest, so we  assign -infinity to the node we do not want to evaluate further. 

    return Integer.MAX_VALUE if parent node is a Min. Min will always select the minimum, so we       assign +infinity to the node we do not want to evaluate further so that it doesn't interfere with the       optimal result we've already found previously.

7. If Integer.MIN_VALUE or Integer.MAX_VALUE has been returned for a node, we know that pruning has been done and we don't want to evaluate more siblings, so we break out of the evaluation loop for that parent.

8. Beta for the root of the tree remains Infinity while the children are being evaluated. 

I've written this code in Java for Tic-Tac-Toe (Minimax + Alpha Beta pruning) :

Source:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Scanner;

class Point {

    int x, y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public String toString() {
        return "[" + x + ", " + y + "]";
    }
}

class PointsAndScores {

    int score;
    Point point;

    PointsAndScores(int score, Point point) {
        this.score = score;
        this.point = point;
    }
}

class Board {

    List<Point> availablePoints;
    Scanner scan = new Scanner(System.in);
    int[][] board = new int[3][3]; 

    List<PointsAndScores> rootsChildrenScore = new ArrayList<>();

    public int evaluateBoard() {
        int score = 0;

        //Check all rows
        for (int i = 0; i < 3; ++i) {
            int blank = 0;
            int X = 0;
            int O = 0;
            for (int j = 0; j < 3; ++j) {
                if (board[i][j] == 0) {
                    blank++;
                } else if (board[i][j] == 1) {
                    X++;
                } else {
                    O++;
                }

            } 
            score+=changeInScore(X, O); 
        }

        //Check all columns
        for (int j = 0; j < 3; ++j) {
            int blank = 0;
            int X = 0;
            int O = 0;
            for (int i = 0; i < 3; ++i) {
                if (board[i][j] == 0) {
                    blank++;
                } else if (board[i][j] == 1) {
                    X++;
                } else {
                    O++;
                } 
            }
            score+=changeInScore(X, O);
        }

        int blank = 0;
        int X = 0;
        int O = 0;

        //Check diagonal (first)
        for (int i = 0, j = 0; i < 3; ++i, ++j) {
            if (board[i][j] == 1) {
                X++;
            } else if (board[i][j] == 2) {
                O++;
            } else {
                blank++;
            }
        }

        score+=changeInScore(X, O);

        blank = 0;
        X = 0;
        O = 0;

        //Check Diagonal (Second)
        for (int i = 2, j = 0; i > -1; --i, ++j) {
            if (board[i][j] == 1) {
                X++;
            } else if (board[i][j] == 2) {
                O++;
            } else {
                blank++;
            }
        }

        score+=changeInScore(X, O);

        return score;
    }
    private int changeInScore(int X, int O){
        int change;
        if (X == 3) {
            change = 100;
        } else if (X == 2 && O == 0) {
            change = 10;
        } else if (X == 1 && O == 0) {
            change = 1;
        } else if (O == 3) {
            change = -100;
        } else if (O == 2 && X == 0) {
            change = -10;
        } else if (O == 1 && X == 0) {
            change = -1;
        } else {
            change = 0;
        } 
        return change;
    }
    
    //Set this to some value if you want to have some specified depth limit for search
    int uptoDepth = -1;
    
    public int alphaBetaMinimax(int alpha, int beta, int depth, int turn){
        
        if(beta<=alpha){ System.out.println("Pruning at depth = "+depth);if(turn == 1) return Integer.MAX_VALUE; else return Integer.MIN_VALUE; }
        
        if(depth == uptoDepth || isGameOver()) return evaluateBoard();
        
        List<Point> pointsAvailable = getAvailableStates();
        
        if(pointsAvailable.isEmpty()) return 0;
        
        if(depth==0) rootsChildrenScore.clear(); 
        
        int maxValue = Integer.MIN_VALUE, minValue = Integer.MAX_VALUE;
        
        for(int i=0;i<pointsAvailable.size(); ++i){
            Point point = pointsAvailable.get(i);
            
            int currentScore = 0;
            
            if(turn == 1){
                placeAMove(point, 1); 
                currentScore = alphaBetaMinimax(alpha, beta, depth+1, 2);
                maxValue = Math.max(maxValue, currentScore); 
                
                //Set alpha
                alpha = Math.max(currentScore, alpha);
                
                if(depth == 0)
                    rootsChildrenScore.add(new PointsAndScores(currentScore, point));
            }else if(turn == 2){
                placeAMove(point, 2);
                currentScore = alphaBetaMinimax(alpha, beta, depth+1, 1); 
                minValue = Math.min(minValue, currentScore);
                
                //Set beta
                beta = Math.min(currentScore, beta);
            }
            //reset board
            board[point.x][point.y] = 0; 
            
            //If a pruning has been done, don't evaluate the rest of the sibling states
            if(currentScore == Integer.MAX_VALUE || currentScore == Integer.MIN_VALUE) break;
        }
        return turn == 1 ? maxValue : minValue;
    }  

    public boolean isGameOver() {
        //Game is over is someone has won, or board is full (draw)
        return (hasXWon() || hasOWon() || getAvailableStates().isEmpty());
    }

    public boolean hasXWon() {
        if ((board[0][0] == board[1][1] && board[0][0] == board[2][2] && board[0][0] == 1) || (board[0][2] == board[1][1] && board[0][2] == board[2][0] && board[0][2] == 1)) {
            //System.out.println("X Diagonal Win");
            return true;
        }
        for (int i = 0; i < 3; ++i) {
            if (((board[i][0] == board[i][1] && board[i][0] == board[i][2] && board[i][0] == 1)
                    || (board[0][i] == board[1][i] && board[0][i] == board[2][i] && board[0][i] == 1))) {
                // System.out.println("X Row or Column win");
                return true;
            }
        }
        return false;
    }

    public boolean hasOWon() {
        if ((board[0][0] == board[1][1] && board[0][0] == board[2][2] && board[0][0] == 2) || (board[0][2] == board[1][1] && board[0][2] == board[2][0] && board[0][2] == 2)) {
            // System.out.println("O Diagonal Win");
            return true;
        }
        for (int i = 0; i < 3; ++i) {
            if ((board[i][0] == board[i][1] && board[i][0] == board[i][2] && board[i][0] == 2)
                    || (board[0][i] == board[1][i] && board[0][i] == board[2][i] && board[0][i] == 2)) {
                //  System.out.println("O Row or Column win");
                return true;
            }
        }

        return false;
    }

    public List<Point> getAvailableStates() {
        availablePoints = new ArrayList<>();
        for (int i = 0; i < 3; ++i) {
            for (int j = 0; j < 3; ++j) {
                if (board[i][j] == 0) {
                    availablePoints.add(new Point(i, j));
                }
            }
        }
        return availablePoints;
    }

    public void placeAMove(Point point, int player) {
        board[point.x][point.y] = player;   //player = 1 for X, 2 for O
    }

    public Point returnBestMove() {
        int MAX = -100000;
        int best = -1;

        for (int i = 0; i < rootsChildrenScore.size(); ++i) {
            if (MAX < rootsChildrenScore.get(i).score) {
                MAX = rootsChildrenScore.get(i).score;
                best = i;
            }
        }

        return rootsChildrenScore.get(best).point;
    }

    void takeHumanInput() {
        System.out.println("Your move: ");
        int x = scan.nextInt();
        int y = scan.nextInt();
        Point point = new Point(x, y);
        placeAMove(point, 2);
    }

    public void displayBoard() {
        System.out.println();

        for (int i = 0; i < 3; ++i) {
            for (int j = 0; j < 3; ++j) {
                System.out.print(board[i][j] + " ");
            }
            System.out.println();

        }
    } 
    
    public void resetBoard() {
        for (int i = 0; i < 3; ++i) {
            for (int j = 0; j < 3; ++j) {
                board[i][j] = 0;
            }
        }
    } 
}

public class AlphaBetaMinimaxTTT {
 
    public static void main(String[] args) { 
        Board b = new Board();
        Random rand = new Random();

        b.displayBoard();

        System.out.println("Who's gonna move first? (1)Computer (2)User: ");
        int choice = b.scan.nextInt();
        if (choice == 1) {
            Point p = new Point(rand.nextInt(3), rand.nextInt(3));
            b.placeAMove(p, 1);
            b.displayBoard();
        }

        while (!b.isGameOver()) {
            System.out.println("Your move: ");
            Point userMove = new Point(b.scan.nextInt(), b.scan.nextInt());

            b.placeAMove(userMove, 2); //2 for O and O is the user
            b.displayBoard();
            if (b.isGameOver()) break;
            
            b.alphaBetaMinimax(Integer.MIN_VALUE, Integer.MAX_VALUE, 0, 1);
            for (PointsAndScores pas : b.rootsChildrenScore) 
                System.out.println("Point: " + pas.point + " Score: " + pas.score);
            
            b.placeAMove(b.returnBestMove(), 1);
            b.displayBoard();
        }
        if (b.hasXWon()) {
            System.out.println("Unfortunately, you lost!");
        } else if (b.hasOWon()) {
            System.out.println("You win!");
        } else {
            System.out.println("It's a draw!");
        }
    }
}

Output:

0 0 0
0 0 0
0 0 0
Who's gonna move first? (1)Computer (2)User:
2
Your move:
2 2

0 0 0
0 0 0
0 0 2
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 4
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 4
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 4
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 4
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 5
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 6
Pruning at depth = 4
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 7
Pruning at depth = 6
Pruning at depth = 7
Pruning at depth = 5
Pruning at depth = 6
Pruning at depth = 7
... more
Point: [0, 0] Score: -90
Point: [0, 1] Score: -90
Point: [0, 2] Score: -90
Point: [1, 0] Score: -91
Point: [1, 1] Score: 0
Point: [1, 2] Score: 0
Point: [2, 0] Score: 0
Point: [2, 1] Score: 0

0 0 0
0 1 0
0 0 2
Your move:
0 0

2 0 0
0 1 0
0 0 2
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 3
Pruning at depth = 5
Pruning at depth = 3
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 3
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 3
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 2
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 3
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 2
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 5
Pruning at depth = 3
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 5
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 2
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 2
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 4
Pruning at depth = 2
Point: [0, 1] Score: 0
Point: [0, 2] Score: 0
Point: [1, 0] Score: 0
Point: [1, 2] Score: 0
Point: [2, 0] Score: -90
Point: [2, 1] Score: 0

2 1 0
0 1 0
0 0 2
Your move:
2 1

2 1 0
0 1 0
0 2 2
Pruning at depth = 3
Pruning at depth = 3
Point: [0, 2] Score: -109
Point: [1, 0] Score: -100
Point: [1, 2] Score: -100
Point: [2, 0] Score: 0

2 1 0
0 1 0
1 2 2
Your move:
0 2

2 1 2
0 1 0
1 2 2
Point: [1, 0] Score: -100
Point: [1, 2] Score: 0

2 1 2
0 1 1
1 2 2
Your move:
1 0

2 1 2
2 1 1
1 2 2
It's a draw!

7 comments:

  1. Thanks for the great tutorial.

    ReplyDelete
  2. when i run this code it takes first input then after we input second move it throws exception
    what should i do ?
    and how we can write this code for ( computer always win or match draw )

    ReplyDelete
  3. You might have entered the input for the 3rd row/column as 3; it should be 2 (the indexes are 0 based).

    ReplyDelete
    Replies
    1. when its time to human input it takes input and says k problem in returnbestmove function in main class

      Delete
  4. Does the principle of alpha beta pruning also work for a scoring system with only three scores, for example +10 for an O win, -10 for an X win and 0 for a draw? If yes how would it look?

    ReplyDelete
  5. I want to make ludo game using alpha beta pruning, can anybody help.?

    ReplyDelete