Minimax Algorithm Guide: How to Create an Unbeatable AI

Have you ever played a game of chess or checkers against a computer and been amazed at its ability to plan multiple steps ahead and seemingly predict your every move? The "secret weapon" powering many of these unbeatable AIs is an algorithm called minimax.

Minimax is a decision rule for minimizing the worst-case potential loss in a two-player zero-sum game. In plainer terms, it's a way for an AI to look ahead at all possible future moves to determine the best course of action that leads to a win or draw, assuming the opponent also plays optimally.

While minimax was originally conceived for use in two-player games, the core concepts have applications in all sorts of competitive domains where agents engage in strategic decision making, from economics to cybersecurity.

In this guide, I'll walk through how the minimax algorithm works, provide step-by-step code examples of using it to create an unbeatable Tic-Tac-Toe AI, discuss ways to optimize its performance, and explore other applications of this fascinating algorithm. Let's dive in!

Minimax Fundamentals: A High-Level Overview

The minimax algorithm works by constructing a search tree of all possible future game states, with the current game position as the root node. The algorithm performs a depth-first exploration of this tree, all the way down to the terminal nodes (leaf nodes) which represent completed games where the outcome is known (win, loss or draw).

Each game state is assigned a score or utility value based on whether it leads to a win, loss, or draw from the perspective of the maximizing player. These utilities are then propagated back up the tree: nodes representing the maximizing player's turn take the maximum utility of their children, and nodes representing the minimizing player's turn take the minimum utility of their children.

This back-and-forth between selecting the maximum and minimum utility is why the algorithm is called minimax—the maximizer aims to maximize its chances of winning, while the minimizer aims to minimize the maximizer's score.

Once the utilities have been propagated all the way back up to the root node, the AI simply chooses whichever move corresponds to the child node with the highest utility. This represents the move that gives the maximizer the best worst-case outcome, assuming optimal play by both sides.

Here is some pseudo-code illustrating the basic minimax algorithm:

function minimax(node, depth, maximizingPlayer) is
    if depth == 0 or node is a terminal node then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, minimax(child, depth − 1, FALSE))
        return value
    else (* minimizing player *)
        value := +∞
        for each child of node do
            value := min(value, minimax(child, depth − 1, TRUE))
        return value

The key components are:

  1. A recursive function that takes the current node (game state), the depth to search to, and a boolean indicating whether the current player is the maximizer or minimizer
  2. A base case that returns the utility of the node if it's a terminal node or the maximum search depth has been reached
  3. A recursive case that either maximizes or minimizes the utility of child nodes depending on whose turn it is
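Before translating this into a real game, the back-up step can be exercised on a tiny hand-built tree. This is a minimal sketch (not from the original article) that drops the depth cutoff, since the whole tree fits in a nested list whose leaves are terminal utilities:

```python
# Minimal minimax over a hand-built tree: a node is either a number
# (terminal utility) or a list of child nodes.
def minimax(node, maximizing_player):
    if isinstance(node, (int, float)):  # terminal node: return its utility
        return node
    values = [minimax(child, not maximizing_player) for child in node]
    return max(values) if maximizing_player else min(values)

# Root is the maximizer; each inner list holds the minimizer's replies.
# The minimizer turns [3, 5] into 3 and [2, 9] into 2; the root picks max(3, 2).
print(minimax([[3, 5], [2, 9]], True))  # 3
```

Note that the maximizer avoids the branch containing the tempting 9, because the minimizer would never let that outcome happen.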

With this high-level understanding in mind, let's see how to implement minimax in code to create an unbeatable Tic-Tac-Toe AI.

Implementing Minimax in Code: Tic-Tac-Toe Example

We'll implement a simple command-line Tic-Tac-Toe game in Python where a human player faces off against an AI using the minimax algorithm.

First, let's define a function to print out the board:

def print_board(board):
    print("|".join(board[0:3]))
    print("-" * 5)
    print("|".join(board[3:6]))  
    print("-" * 5)
    print("|".join(board[6:9]))

The board itself will be represented by a list of strings, with "X", "O", or " " in each position.

Next, we need a function that checks if anyone has won the game:

def check_win(board, player):
    win_positions = [
        [0, 1, 2], [3, 4, 5], [6, 7, 8],  # rows
        [0, 3, 6], [1, 4, 7], [2, 5, 8],  # columns
        [0, 4, 8], [2, 4, 6]              # diagonals
    ]
    for pos in win_positions:
        if board[pos[0]] == board[pos[1]] == board[pos[2]] == player:
            return True
    return False

This checks all possible winning combinations of positions to see if the given player has three in a row anywhere.
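As a quick sanity check, the same logic can be written compactly with any() and tried on a board where X holds the top row (a rewritten equivalent of check_win, included here so the snippet runs standalone):

```python
# Compact equivalent of check_win using any() over the eight winning lines.
def check_win(board, player):
    wins = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
            (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
            (0, 4, 8), (2, 4, 6)]              # diagonals
    return any(board[a] == board[b] == board[c] == player for a, b, c in wins)

board = ["X", "X", "X",
         "O", "O", " ",
         " ", " ", " "]
print(check_win(board, "X"))  # True: X holds the top row
print(check_win(board, "O"))  # False: O has only two in a row
```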

Now for the key part — the minimax algorithm itself:

def minimax(board, depth, maximizing_player):
    # Base cases: a decided game or a full board (draw)
    if check_win(board, "X"):
        return 1
    elif check_win(board, "O"):
        return -1
    elif " " not in board:
        return 0

    if maximizing_player:
        max_eval = -float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "X"  # try the move
                score = minimax(board, depth + 1, False)
                board[i] = " "  # undo the move
                max_eval = max(max_eval, score)
        return max_eval
    else:
        min_eval = float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "O"
                score = minimax(board, depth + 1, True)
                board[i] = " "
                min_eval = min(min_eval, score)
        return min_eval

This recursive function follows the same basic logic as the pseudo-code above:

  1. First it checks the base cases — if either player has won or the board is full (a draw), it returns the appropriate utility value (+1 for X win, -1 for O win, 0 for draw).

  2. If it's the maximizing player's turn, it tries placing an X in each open position, recursively calls minimax on the resulting board state, and keeps track of the maximum utility.

  3. If it's the minimizing player's turn, it does the same thing, but placing an O and keeping track of the minimum utility.

  4. It returns the best utility value found for the current player.
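Putting check_win and minimax together (both copied from above so this snippet runs on its own), we can verify two expected results: a position where X has an immediate winning move evaluates to +1, and the empty board evaluates to 0, i.e. perfect Tic-Tac-Toe play always ends in a draw:

```python
def check_win(board, player):
    wins = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6), (1, 4, 7),
            (2, 5, 8), (0, 4, 8), (2, 4, 6)]
    return any(board[a] == board[b] == board[c] == player for a, b, c in wins)

def minimax(board, depth, maximizing_player):
    if check_win(board, "X"):
        return 1
    if check_win(board, "O"):
        return -1
    if " " not in board:
        return 0
    if maximizing_player:
        best = -float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "X"
                best = max(best, minimax(board, depth + 1, False))
                board[i] = " "
        return best
    else:
        best = float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "O"
                best = min(best, minimax(board, depth + 1, True))
                board[i] = " "
        return best

print(minimax(["X", "X", " ",
               "O", "O", " ",
               " ", " ", " "], 0, True))  # 1: X can win on the spot
print(minimax([" "] * 9, 0, True))        # 0: perfect play is a draw
```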

To have the AI actually make a move, we use minimax to evaluate the utility of each possible move and choose the one with the maximum utility:

def ai_move(board):
    best_score = -float("inf")
    best_move = None
    for i in range(9):
        if board[i] == " ":
            board[i] = "X"
            score = minimax(board, 0, False)
            board[i] = " "
            if score > best_score:
                best_score = score
                best_move = i
    board[best_move] = "X"
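To see ai_move in action (the helper functions are repeated here so the snippet runs on its own), give the AI a board where it can either win on the spot or block the opponent's threat; it correctly prefers the immediate win:

```python
def check_win(board, player):
    wins = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6), (1, 4, 7),
            (2, 5, 8), (0, 4, 8), (2, 4, 6)]
    return any(board[a] == board[b] == board[c] == player for a, b, c in wins)

def minimax(board, depth, maximizing_player):
    if check_win(board, "X"):
        return 1
    if check_win(board, "O"):
        return -1
    if " " not in board:
        return 0
    if maximizing_player:
        best = -float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "X"
                best = max(best, minimax(board, depth + 1, False))
                board[i] = " "
        return best
    best = float("inf")
    for i in range(9):
        if board[i] == " ":
            board[i] = "O"
            best = min(best, minimax(board, depth + 1, True))
            board[i] = " "
    return best

def ai_move(board):
    best_score = -float("inf")
    best_move = None
    for i in range(9):
        if board[i] == " ":
            board[i] = "X"
            score = minimax(board, 0, False)
            board[i] = " "
            if score > best_score:
                best_score = score
                best_move = i
    board[best_move] = "X"

board = ["X", "X", " ",
         "O", "O", " ",
         " ", " ", " "]
ai_move(board)
print(board[2])  # "X": completing the top row beats blocking at index 5
```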

Finally, we can put it all together in a main game loop:

def play_game():
    board = [" "] * 9
    human_player = "O"
    ai_player = "X"
    current_player = "X"

    while True:
        print_board(board)
        if current_player == human_player:
            move = int(input("Enter your move (0-8): "))
            while board[move] != " ":
                move = int(input("That square is taken. Enter your move (0-8): "))
            board[move] = human_player
        else:
            ai_move(board)

        if check_win(board, current_player):
            print_board(board)
            print(f"{current_player} wins!")
            break
        elif " " not in board:
            print_board(board)
            print("It's a draw!")
            break

        current_player = "O" if current_player == "X" else "X"

play_game()

This alternates between the human player and AI making moves until someone wins or the game ends in a draw.

And there you have it — an unbeatable Tic-Tac-Toe AI using minimax! The AI will always either win or force a draw, because it exhaustively searches the entire game tree to find the optimal move.

Optimizing Minimax with Alpha-Beta Pruning

While minimax works great for simple games like Tic-Tac-Toe, it quickly becomes intractable for more complex games like chess that have much larger state spaces. The number of game states grows exponentially with each additional ply (a ply refers to one player's move), leading to impossibly large search trees.

For example, chess has an average branching factor of about 35 and games often last 50+ moves (100+ plies), so exhaustively searching the tree would require evaluating 35^100 nodes — a number vastly greater than the number of atoms in the observable universe!
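That comparison is easy to check directly: 35^100 has about 155 decimal digits, while the number of atoms in the observable universe is usually estimated at around 10^80:

```python
# Rough scale of an exhaustive chess search: branching factor ~35, 100 plies.
nodes = 35 ** 100
print(len(str(nodes)))  # 155 decimal digits, dwarfing the ~80 digits of 10**80
```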

To make minimax practical for these larger games, we need to prune away irrelevant parts of the search tree. This is where the alpha-beta pruning algorithm comes in.

Alpha-beta keeps track of two additional values as it performs the depth-first search:

  • alpha: the best (highest) value found so far at any point along the maximizer's path; it starts at −∞.
  • beta: the best (lowest) value found so far at any point along the minimizer's path; it starts at +∞.

As it searches, the algorithm updates alpha whenever it finds a larger value, and beta whenever it finds a smaller value. Before evaluating each new node, it checks whether alpha is greater than or equal to beta. If so, it stops evaluating that node and its children (i.e. it prunes the tree at that point), because the other player would never let the current position occur.

In other words, alpha-beta eliminates subtrees that are provably worse than a previously examined move, saving huge amounts of search time. With perfect ordering, alpha-beta can search a tree twice as deep as minimax in the same amount of time!

Here's the updated minimax function with alpha-beta pruning:

def minimax(board, depth, alpha, beta, maximizing_player):
    if check_win(board, "X"):
        return 1
    elif check_win(board, "O"):
        return -1
    elif " " not in board:
        return 0

    if maximizing_player:
        max_eval = -float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "X"
                score = minimax(board, depth + 1, alpha, beta, False)
                board[i] = " "
                max_eval = max(max_eval, score)
                alpha = max(alpha, score)
                if beta <= alpha:
                    break  # prune: the minimizer would never allow this line
        return max_eval
    else:
        min_eval = float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "O"
                score = minimax(board, depth + 1, alpha, beta, True)
                board[i] = " "
                min_eval = min(min_eval, score)
                beta = min(beta, score)
                if beta <= alpha:
                    break  # prune: the maximizer would never allow this line
        return min_eval

The only differences are the two new parameters alpha and beta (passed in as −∞ and +∞ at the root), the updates to their values in each recursive call, and the checks that break out of the loops when beta <= alpha.
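One way to appreciate the savings is to count node visits with and without pruning from the empty Tic-Tac-Toe board. This sketch instruments both versions (helpers copied from above) with a simple counter; the pruned search visits a small fraction of the roughly half a million nodes in the full game tree while returning the same value:

```python
# Count nodes visited by plain minimax vs. alpha-beta on the empty board.
def check_win(board, player):
    wins = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6), (1, 4, 7),
            (2, 5, 8), (0, 4, 8), (2, 4, 6)]
    return any(board[a] == board[b] == board[c] == player for a, b, c in wins)

def plain(board, maximizing, counter):
    counter[0] += 1
    if check_win(board, "X"): return 1
    if check_win(board, "O"): return -1
    if " " not in board: return 0
    scores = []
    for i in range(9):
        if board[i] == " ":
            board[i] = "X" if maximizing else "O"
            scores.append(plain(board, not maximizing, counter))
            board[i] = " "
    return max(scores) if maximizing else min(scores)

def pruned(board, alpha, beta, maximizing, counter):
    counter[0] += 1
    if check_win(board, "X"): return 1
    if check_win(board, "O"): return -1
    if " " not in board: return 0
    if maximizing:
        best = -float("inf")
        for i in range(9):
            if board[i] == " ":
                board[i] = "X"
                best = max(best, pruned(board, alpha, beta, False, counter))
                board[i] = " "
                alpha = max(alpha, best)
                if beta <= alpha:
                    break  # the minimizer would never allow this line
        return best
    best = float("inf")
    for i in range(9):
        if board[i] == " ":
            board[i] = "O"
            best = min(best, pruned(board, alpha, beta, True, counter))
            board[i] = " "
            beta = min(beta, best)
            if beta <= alpha:
                break  # the maximizer would never allow this line
    return best

full_count, pruned_count = [0], [0]
v1 = plain([" "] * 9, True, full_count)
v2 = pruned([" "] * 9, -float("inf"), float("inf"), True, pruned_count)
print(v1, full_count[0])    # value 0, roughly half a million node visits
print(v2, pruned_count[0])  # same value 0, far fewer node visits
```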

Alpha-beta pruning makes minimax feasible for much more complex games and is a core component of many world-class game AIs. Chess engines built on heavily optimized minimax search have outplayed even the best humans since IBM's Deep Blue defeated world champion Garry Kasparov in 1997.

Limitations and Alternative Approaches

As powerful as minimax is, it still has some significant limitations. Most notably, the algorithm requires assigning an accurate utility value to each game state, which can be difficult or impossible for very complex games. The utility function needs to capture all strategically relevant features of the position to enable good play.

Additionally, minimax assumes that the opponent always plays optimally, which may not reflect realistic gameplay. It's often beneficial to model the specific opponent and adapt strategy accordingly. Stochastic game AIs can address this by using probability distributions to represent uncertainty about the opponent's strategy.

Another issue is that search time still grows exponentially with the search depth, even with alpha-beta pruning. This has motivated techniques such as transposition tables, which cache the values of previously evaluated positions, and memory-enhanced variants like the MTD(f) algorithm that exploit them.

Many successful game AIs also use entirely different approaches from minimax:

  • Monte Carlo tree search builds a search tree based on random sampling and can handle very large state spaces
  • Reinforcement learning AIs like AlphaZero learn by playing against themselves and gradually adjusting their parameters
  • Knowledge-based approaches rely on hard-coded human expert knowledge to prune the search space

Each approach has its own strengths and weaknesses, and the cutting edge of game AI research involves combining them in clever ways to achieve superhuman play.

Applications Beyond Games

While minimax is most commonly associated with two-player board games, the core concepts of adversarial search and utility maximization have applications in many real-world domains:

  • Cybersecurity: AI can model the goals and constraints of both the attacker and defender to identify optimal strategies
  • Military strategy: Minimax-style reasoning can help commanders anticipate enemy plans and devise countermeasures
  • Negotiations: Agents can use adversarial search to anticipate counteroffers and identify terms that secure a strong worst-case outcome
  • Autonomous vehicles: Self-driving cars must predict the behavior of other agents on the road and plan accordingly
  • Drug discovery: Minimax ideas can help find compounds that are maximally effective while minimally toxic

As AI systems tackle increasingly complex real-world problems, the concepts pioneered by minimax will be essential for enabling robust and strategic decision making. By reasoning through the possible actions of other agents, AIs can arrive at solutions that are not just locally optimal, but globally optimal even in the face of adversaries.

While it may have started with simple games like Tic-Tac-Toe, the minimax algorithm has grown into a cornerstone of modern artificial intelligence. As you look to create AIs that can navigate an uncertain and competitive world, take a page from the minimax playbook and remember: sometimes the best offense is a good defense.
