
Tic-tac-Go: A Gopher’s Approach to the Minimax Algorithm

Kevin Feng
14 min read · Aug 16, 2023


So I recently started learning Go, the programming language. Why? Because the gopher is very cute, of course.

Folding belly flesh is the only reason you need to start learning Go

Although I do think the gopher is quite cute, I decided to pick up Go for a number of other reasons:

  1. I feel like I’ve never truly mastered a language, so why not give Go a try?
  2. I got sucked down the rabbit hole that is the Rust vs Go debate, and I learned that Go is far easier to learn while still being blazingly fast
  3. DevOps is something that I’m currently exploring, and some of its fundamental tools are built on Go: Docker, Kubernetes, and Terraform are all written in Go

So after watching just a few videos on Go and completing A Tour of Go, I decided to build something very simple. Or so I thought.


Tic-tac-toe is not that simple

Tic-tac-toe is one of those projects that you may consider building when you first learn a language. You probably won’t have to import any external packages; it’s really just a project to familiarize yourself with the syntax of a programming language.

Average RPS enjoyer

But as this section’s heading implies, I’m here to convince you that tic-tac-toe isn’t that simple. To demonstrate why this is so, let’s consider another (truly) simple game: Rock paper scissors. In rock paper scissors, both players simultaneously perform an option select between three choices: The titular rock, paper, or scissors. The results of the round are determined by what the players have chosen: Rock beats scissors, scissors beats paper, and paper beats rock. In the case that both players select the same option, the round ends in a tie. RPS is typically played as a single round or as a best two out of three.


And here’s the point I’d like to make: There is no optimal move to be made in rock paper scissors, at any point in the game. After boiling down all the potential mind games and the myriad of statistics that make claims such as “men are more likely to start with rock,” the game ultimately comes down to chance. Regardless of what option you have selected, there are three choices that your opponent can make, giving you a 33% chance to win, a 33% chance to lose, and a 33% chance to tie. In essence, the game has no “thinking” to it.


However, the very opposite can be said about tic-tac-toe, even considering the simplicity of the game. Tic-tac-toe is a two-player game played on a 3x3 grid. Players take turns marking open cells with their symbol, either ‘X’ or ‘O.’ The objective of the game is to align three of your own symbols in a row, either horizontally, vertically, or diagonally. If the grid fills up without a winner, the game ends in a draw. Unlike rock paper scissors, which is a game of chance, tic-tac-toe is a strategic, albeit solved, game that requires planning ahead to block your opponent while setting up your own winning plays at the same time.

Implementing tic-tac-toe as a PvP (Player vs Player) game is actually very easy. Just create some type of data structure (a 2D array) to store the board, use a game loop that will continuously run so long as the board is not full or neither player has won, and have each player take turns placing their own marker. Pretty simple right?
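That loop can be sketched as follows. The helper names winner and full are my own, and the moves are scripted rather than read from the terminal so the sketch is self-contained:

```go
package main

import "fmt"

// winner returns the marker of the winning player, or ' ' if nobody
// has aligned three markers yet (cells are numbered 0-8).
func winner(b [3][3]rune) rune {
	lines := [8][3]int{{0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {0, 3, 6}, {1, 4, 7}, {2, 5, 8}, {0, 4, 8}, {2, 4, 6}}
	for _, l := range lines {
		a := b[l[0]/3][l[0]%3]
		if a != ' ' && a == b[l[1]/3][l[1]%3] && a == b[l[2]/3][l[2]%3] {
			return a
		}
	}
	return ' '
}

// full reports whether the board has no vacant cells left.
func full(b [3][3]rune) bool {
	for _, row := range b {
		for _, c := range row {
			if c == ' ' {
				return false
			}
		}
	}
	return true
}

func main() {
	var board [3][3]rune
	for i := range board {
		for j := range board[i] {
			board[i][j] = ' '
		}
	}
	// Scripted moves (cell numbers 0-8); in a real game these would
	// come from stdin. Here x takes the top row.
	moves := []int{0, 4, 1, 8, 2}
	players := [2]rune{'x', 'o'}
	// The game loop: run while nobody has won and the board isn't full,
	// alternating markers each turn.
	for turn := 0; winner(board) == ' ' && !full(board); turn++ {
		m := moves[turn]
		board[m/3][m%3] = players[turn%2]
	}
	fmt.Printf("winner: %c\n", winner(board)) // winner: x
}
```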

What happens if the user doesn’t have another person to play against? Then our game should feature a PvE (Player vs Environment) mode so that the player can play against the computer. Implementing a PvE option means we’ll have to consider how a computer will calculate the optimal move to make, no matter the board that it’s given. Not so simple after all.

Enter the minimax algorithm.


The theory behind the minimax algorithm

The minimax algorithm is a game theory backtracking algorithm: it determines future game states using recursion and assigns a score to each move that leads to those states. A positive score is assigned to moves that lead the player to victory, and a negative score is assigned to moves that lead to the player’s loss. Neutral scores are given in the case of a tie. The goal of the minimax algorithm is to maximize the score for the player and to minimize the score for the opponent.

Don’t get confused by the terminology of “player” here. Although we mentioned just earlier that a Player is a human, we’re now thinking from the perspective of the tic-tac-toe AI that we’re building. The computer wants the human, its “opponent,” to lose, but it wants itself, the “player,” to win.

To further explain the minimax algorithm in the context of tic-tac-toe, I’m going to incorporate some visual aids, although I lack the tenacity or artistic abilities to create diagrams better than those I found on this blog post by Never Stop Building:

Sidenote: I found Never Stop Building’s website to be extremely interesting. It is focused on woodworking and furniture design that draws inspiration from traditional Japanese techniques — quite the deviation from programming and algorithm design.

Back to the minimax algorithm…

Source: Never Stop Building

Consider this recursion tree of the minimax algorithm in the context of a tic-tac-toe game. The player placing X’s is the “player” of the minimax algorithm while the player placing O’s is the “opponent.” Each game state is denoted by a number above the corresponding board.

From state 1 of the game, X can go in three different places:

  1. Top — leads to state 2
  2. Right — leads to state 3
  3. Center — leads to state 4

If X goes top (state 2), then O can make one of two moves: center or right, the latter of which hands X the win (Yay!). Realistically, since we’re considering the opponent to also be a perfect player, the former will occur and X will lose.

The same can be said for the scenario in which X goes right (state 3). O can make one of two choices: one which leads to O’s victory, and one which leads to X’s forced victory.

Lastly, X going center (state 4) will lead to an automatic win.

So now that we’ve clarified all the possible boards that can occur from the original state, what do the numbers -10 and +10 mean in the context of our algorithm?

Essentially, a score of -10 or +10 will indicate that the board is in an end state: One of the players has aligned three of their markers. More specifically, a score of -10 will mean that the “player” has lost, while a score of +10 will indicate that the “player” has won. However, if the board is completely full without a winner, the game state will have a score of 0.

Me when implementing a recursive game theory algorithm

More theory behind the minimax algorithm

Since the minimax algorithm is recursive, we should consider the first, and perhaps most important, piece of logic: The base case. In layman’s terms, that’s the condition (or conditions) that, when satisfied, triggers a return statement and ends the recursion.

Let’s look at a zoomed-in version of the recursion tree from earlier to gain a better understanding of what our base case should be:

Source: Never Stop Building

Immediately, we can see one reason for our recursive algorithm to end: If one of the players has won the game. This is the case in the leftmost recursive call, where X has aligned three markers diagonally. But that doesn’t constitute the entire base case. After all, the board can fill up entirely without either player winning.

Based on the original game state, no matter what X does, someone is bound to win. But what about a game in which both players play perfectly? That will ultimately result in a full board with no winner: a tie, a game state with a score of 0. This represents the missing part of our base case.

So our base case should be the following: “Check if the board is in an end state. If so, return.” To make checking this base case easier, we can encapsulate this logic in a function, which we’ll code up later.

The next step is to design the core logic for our recursive algorithm that occurs immediately following the base case, but there is one last piece of theory that we need to go over before we write any additional code: Depth.


頑張れ (Ganbare)!

As a wise aficionado of fisticuffs once said:

“It’s not about how hard you hit. It’s about how hard you can get hit and keep moving forward.”

And to echo that sentiment, I say 頑張れ! or Do your best! Even if you are going to lose the battle, that doesn’t mean you have to lose your honor.

I know that was quite the tangent coming from our discussion of a tic-tac-toe minimax algorithm, but I promise you that Rocky Balboa’s words of wisdom will help us in designing our perfect tic-tac-toe AI.

Consider this game state and just some of its potential outcomes:

Credit to Never Stop Building again

No matter what move O makes, X is bound to win, regardless of the depth of the recursion tree. But X winning should not be possible between two perfect players, right? That’s correct, but take a closer look at game state 1. This board could not have been constructed by two players making optimal moves.

Just think about it: It’s turn 6 and neither player has taken control of center. Although going center isn’t the optimal starting move in a perfect game, center should have certainly been taken by turn 2. In essence, this is a board that is littered with mistakes — moves that aren’t optimal.

But that doesn’t mean our algorithm shouldn’t account for these scenarios. A robust algorithm should handle any type of case thrown at it, at least in the scope of what it’s meant to do. To illustrate it more clearly, here’s what happens if we don’t account for these edge cases regarding depth:

Source: Never Stop Building

In this game state, we would expect O, the purportedly perfect player, to go top-right and block X’s vertical path to victory. Instead, if we don’t take depth into account, our minimax algorithm will cause our AI to make a “dumb” play and go center, which allows X to easily win by going top-right. Why does this happen?

Well, if we look at the recursion tree from earlier, thinking from the perspective of O, we know that no matter what, X is bound to win; all game states ultimately result in a score of -10. Since every available move scores the same -10, the algorithm has no reason to prefer the blocking move, and it simply returns one of the losing options it finds at a depth of 2.

As such, this gameplay doesn’t look very optimal! Rather than prolonging the game even though the algorithm is “aware” that it is guaranteed to lose, it just chooses to kamikaze and lose as quickly as possible. To take depth into account, there’s a relatively simple solution:

Never Stop Building is the GOAT

Here’s the same recursion tree from earlier, but this time, instead of naively assigning scores of -10 for losses and +10 for victories, we do some simple arithmetic involving those scores and the depth of the end game states. This gives us more useful information for prolonging the game in the case of inevitable defeat.

If we lose, set the score of that game state to depth - 10. This expression will always be negative, but the deeper the loss, the greater (less negative) the value. Since the minimax algorithm always chooses the route with the highest score, this is how we achieve “deeper” losses.

In the case of a victory, we instead set the score of that game state to be 10-depth. The resulting score will always be a positive number, but scores that are higher indicate a shallower depth — a faster victory, which is what we want for optimal gameplay.
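To make the arithmetic concrete, here’s a tiny hypothetical helper (terminalScore is my own name, not something from the article’s code) showing the scores these two expressions produce:

```go
package main

import "fmt"

// terminalScore illustrates the depth adjustment: wins are worth
// 10 - depth (so faster wins score higher), while losses are worth
// depth - 10 (so slower losses score higher).
func terminalScore(computerWon bool, depth int) int {
	if computerWon {
		return 10 - depth
	}
	return depth - 10
}

func main() {
	fmt.Println(terminalScore(true, 2))  // 8: a quick win beats...
	fmt.Println(terminalScore(true, 4))  // 6: ...a slower one
	fmt.Println(terminalScore(false, 2)) // -8: a quick loss is worse than...
	fmt.Println(terminalScore(false, 4)) // -6: ...a deferred loss
}
```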

Now I think we’re ready to write some code…


Minimax Algorithm in Go

For reference, here’s the source code for this entire project:

The first order of business is to create some type of struct that represents our game board. It should use a 2D array, which will have marginally better performance than a 2D slice; besides, there’s no need for the size of our tic-tac-toe board to change. It should be a 3x3 grid of cells no matter what.

I’ve defined the following bit of code in a separate package called tictactoe, in a file called tictactoe.go:

package tictactoe

type TicTacToe struct {
	Board [3][3]rune
}

I’ve chosen rune as my data type, since we only need to store characters in each cell, but using string would be fine as well.

In our main.go file, we can import this and initialize a game board:

package main

import (
	"tictacgo/tictactoe"
)

func main() {
	var t tictactoe.TicTacToe
	t.Board = [3][3]rune{
		{' ', ' ', ' '},
		{' ', ' ', ' '},
		{' ', ' ', ' '},
	}
}

And we’ll also want a way to print the board, which we can define a method for in our tictactoe.go file:

This is what the printed board looks like in my terminal:

The numbers surrounding the printed board are just there to help the player see which number options correspond to which cells. So if it’s X’s turn, and they see that bottom-left is open, they can input 6 into the CLI and an X will be placed there.

And let’s not forget about implementing a function to check the game status of a given board:

I know there’s some cursed logic in here, yell at me why dontcha
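That function is also embedded as a gist, so here’s a compact sketch of my own that is consistent with how the minimax base case consumes its two return values (the article’s internals may well differ): the bool is true while the game is still in progress, and the int encodes the result once it’s over.

```go
package main

import "fmt"

// TicTacToe mirrors the struct defined earlier.
type TicTacToe struct {
	Board [3][3]rune
}

// GameStatus reports (true, 0) while the game is in progress;
// once it's over, the bool is false and the int is the result:
// 2 = 'o' (the computer) wins, 1 = 'x' wins, 0 = tie.
func GameStatus(t TicTacToe) (bool, int) {
	wins := [8][3]int{
		{0, 1, 2}, {3, 4, 5}, {6, 7, 8}, // rows
		{0, 3, 6}, {1, 4, 7}, {2, 5, 8}, // columns
		{0, 4, 8}, {2, 4, 6}, // diagonals
	}
	at := func(i int) rune { return t.Board[i/3][i%3] }
	for _, w := range wins {
		if a := at(w[0]); a != ' ' && a == at(w[1]) && a == at(w[2]) {
			if a == 'o' {
				return false, 2 // computer has won
			}
			return false, 1 // human has won
		}
	}
	for i := 0; i < 9; i++ {
		if at(i) == ' ' {
			return true, 0 // vacant cell found: game still going
		}
	}
	return false, 0 // full board, no winner: tie
}

func main() {
	t := TicTacToe{Board: [3][3]rune{
		{'o', 'o', 'o'},
		{'x', 'x', ' '},
		{' ', ' ', 'x'},
	}}
	status, tie := GameStatus(t)
	fmt.Println(status, tie) // false 2: the computer has won
}
```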

Now all we have to do is implement the minimax algorithm itself:

First, let’s break down the function signature of the minimax algorithm:

func MiniMax(t TicTacToe, depth int, max bool) (int, int, int)

In terms of parameters, we’re taking in a TicTacToe struct, an int called depth, which represents the depth of the current call, as well as a bool called max. What does max indicate?

Since the minimax algorithm has to simulate placing markers for each player, we need some sort of flag that indicates whether we want to maximize or minimize the score for the current player. In the context of our program, max being true means that the current call of the minimax algorithm is on the computer, or the AI that should play perfectly.

Since we want the AI to play perfectly, we maximize the score from subsequent recursive calls. If max is false, then that means the minimax recursive call is on the opponent, or the human player. In this case, we minimize the score, since we want the human player to lose, AKA computer victory.

Our minimax algorithm returns three integers: the best score found, plus the row and column of the move that produces that score.

Next, let’s take a look at the base case; I think the code is pretty self-explanatory:

status, tie := GameStatus(t)

if status == false && tie == 2 {
	// return 10 - depth if computer wins
	return 10 - depth, 0, 0
} else if status == false && tie == 1 {
	// return -10 + depth if computer inevitably loses
	return -10 + depth, 0, 0
} else if status == false && tie == 0 {
	// return 0 if it's a tie
	return 0, 0, 0
}

And last, let’s take a look at the recursive logic, which is essentially written out twice: Once for the maximizing player and once for the minimizing player. The following code is for the maximizing player:

if max {
	bestScore := -69
	var row int
	var col int
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			if t.Board[i][j] == ' ' {
				t.Board[i][j] = 'o'
				score, _, _ := MiniMax(t, depth+1, false)
				t.Board[i][j] = ' '
				if score > bestScore {
					bestScore = score
					row = i
					col = j
				}
			}
		}
	}
	return bestScore, row, col
}

The value of bestScore for the maximizing player is initialized to an arbitrary negative number; the exact value doesn’t matter, so long as it’s smaller than -10, since -10 is the lowest possible score before even considering depth.

We then loop through the board, simulating the AI player placing its marker, the ‘O,’ in every vacant cell, and recursively calling MiniMax on each of those boards. Note that in this recursive MiniMax call, we increase depth by 1 and also flip the value of max. Since max was true in this chunk of code, we flip it to false, since the next turn should be the human player’s move.

In the case that the resulting score is better than bestScore, bestScore gets updated and we return that score as well as the row and column which lead to that optimal outcome.

The same logic is applied to the minimizing player:

else {
	bestScore := 69
	var row int
	var col int
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			if t.Board[i][j] == ' ' {
				t.Board[i][j] = 'x'
				score, _, _ := MiniMax(t, depth+1, true)
				t.Board[i][j] = ' '
				if score < bestScore {
					bestScore = score
					row = i
					col = j
				}
			}
		}
	}
	return bestScore, row, col
}

The differences here are that we’re simulating the player who places ‘X,’ the initial value of bestScore is set to a positive number greater than 10, we update bestScore in the case that it gets smaller (rather than larger), and we flip the max parameter to true for the subsequent layer of recursive calls to MiniMax.
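To see everything working end to end, here’s a condensed, self-contained assembly of the above. My GameStatus here is a compact sketch (the article’s embedded version may differ internally), and this MiniMax folds the maximizing and minimizing branches into one loop; the demo checks that the AI makes the obvious block from earlier.

```go
package main

import "fmt"

type TicTacToe struct {
	Board [3][3]rune
}

// GameStatus: (true, 0) while in progress; otherwise false plus
// 2 = 'o' (computer) wins, 1 = 'x' wins, 0 = tie.
func GameStatus(t TicTacToe) (bool, int) {
	wins := [8][3]int{{0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {0, 3, 6}, {1, 4, 7}, {2, 5, 8}, {0, 4, 8}, {2, 4, 6}}
	at := func(i int) rune { return t.Board[i/3][i%3] }
	for _, w := range wins {
		if a := at(w[0]); a != ' ' && a == at(w[1]) && a == at(w[2]) {
			if a == 'o' {
				return false, 2
			}
			return false, 1
		}
	}
	for i := 0; i < 9; i++ {
		if at(i) == ' ' {
			return true, 0
		}
	}
	return false, 0
}

func MiniMax(t TicTacToe, depth int, max bool) (int, int, int) {
	if status, tie := GameStatus(t); !status {
		switch tie {
		case 2:
			return 10 - depth, 0, 0 // computer wins: faster is better
		case 1:
			return -10 + depth, 0, 0 // computer loses: slower is better
		default:
			return 0, 0, 0 // tie
		}
	}
	mark, best := 'x', 69
	if max {
		mark, best = 'o', -69
	}
	var row, col int
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			if t.Board[i][j] != ' ' {
				continue
			}
			t.Board[i][j] = mark // simulate the move...
			score, _, _ := MiniMax(t, depth+1, !max)
			t.Board[i][j] = ' ' // ...then undo it
			if (max && score > best) || (!max && score < best) {
				best, row, col = score, i, j
			}
		}
	}
	return best, row, col
}

func main() {
	// x threatens to complete the right column; a perfect 'o'
	// must block top-right (row 0, col 2).
	t := TicTacToe{Board: [3][3]rune{
		{' ', ' ', ' '},
		{'o', ' ', 'x'},
		{' ', ' ', 'x'},
	}}
	_, row, col := MiniMax(t, 0, true)
	fmt.Println(row, col) // 0 2: the AI blocks
}
```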

We’re done!

Conclusion

And that’s it! We’ve successfully created a perfect tic-tac-toe AI. No matter how you play against the computer, you will only ever tie at best. As that statement implies, if you make just a single mistake, you’ll lose. Since the AI goes second, that mistake will typically manifest as missing an obvious block; no real foresight is required.

I hope this article helped you learn something, whether it was something about game decision theory, recursive algorithm design, or just Go syntax. Leave a comment below if you have any questions, fellow gophers.

Kevin Feng || Website || GitHub || LinkedIn || Medium (you’re already here!)
