<< Java problem with mutual TLS authentication when using incoming and outgoing connections simultaneously | Home | Using Reinforcement Learning To Learn To Play Tic-Tac-Toe >>

The best opening move in a game of tic-tac-toe

As part of a machine learning project, I had to understand tic-tac-toe better, and so I have written an algorithm which a) finds all the possible unique games and b) gathers statistical information about those games.

Based on Wikipedia's tic-tac-toe article, consider a board with the nine positions numbered as follows:

  j=0 j=1 j=2
i=0 1 2 3
i=1 4 5 6
i=2 7 8 9

Assume X always starts. As an example, take the game where X moves top left, followed by O moving top right, then X going middle left followed by O going top middle. These first four moves can be written down as "1342". The game could continue and once completed could be written as "134258769". It's not a perfect game because the first player misses a few opportunities to win and in the end it's a draw.

Every possible combination of moves making up unique games of tic-tac-toe are hence found somewhere between the numbers 123456789 and 999999999 (although probably iterating up to 987654321 suffices). Most of the numbers are illegitimate because each cell is only allowed to be filled once, so for example the number 222222222 does not represent a valid combination. In order to find every valid combination we simply start with that lowest number and iterate up to nine nines, attempting to determine if each number is a valid combination and if it is, record the results of the game.

In order to determine if a combination is valid, we use the following Javascript snippet, which checks that the nine digit number does not contains zeros and contains all of the digits 1 to 9:
//gameAsString contains the current number, e.g. '123456798'
let isValidGame = true;
for(idx = 1; idx < 10; idx++) {
    if(gameAsString.indexOf('' + idx) === -1 //only valid if every digit from 1-9 is found
       || gameAsString.charAt(idx-1) === '0') //only 1-9 are valid
    {
        isValidGame = false;
        break;
    }
}
            
If a number does represent a valid combination, the next step is to iterate through each digit starting at the left, to work out when the game is finished (it can be won after just 5 moves, or it can require all 9 moves, ending in a win or a draw). For example numbers starting with '12437' represent combinations completed after just 5 moves. Although there are 9999 such valid combinations, the game is only recorded once and all other combinations of numbers which start with 12437 are discarded.

A game is analysed using the following algorithm. The empty board is built by creating a two dimensional array of objects using indexes i and j as shown above. An element in an array of cells represents a cell on the board and has attributes 'i', 'j' and 'v' which records the contents of the cell, either null, 'X' or 'O', depending on whether a player has moved in that cell.
let board = buildEmtpyBoard();
let moves = '';
for(idx = 0; idx < gameAsString.length; idx++){
    let c = gameAsString[idx];
    moves += c; //initially e.g. '1', then '12', etc.
    let coords = convertIndexToCoordinates(parseInt(c)); //turns e.g. '3' into [0,2]
    if(board[coords[0]][coords[1]].v)
        throw new Error("cell " + i + "," + j + " is already selected by " +
                        board[coords[0]][coords[1]].v);
    let player = idx % 2 === 0 ? X : O;
    let rival = player === X ? O : X;
...
    //move
    board[coords[0]][coords[1]].v = player;

    let winner = checkFinished(board);
    if(winner){
        let o = uniqueGames[moves];

        if(!o){
            ...
            //it doesnt exist. update stats
            updateStats(gameAsString, winner, idx, relevantStats);
            uniqueGames[moves] = {}; //record this game so it isn't harvested again
        }

        break;
    }
}
The function checkFinished(board) simply checks whether or not the game is over (a draw, or a player has three in a row), and returns 'DRAW', 'X' or 'O' depending upon the outcome of the game. 'uniqueGames' is an object used to track whether or not the current game has already been seen. Using the example above the combination '12437' will be seen 9999 times, but to all intents and purposes it is just one game of tic-tac-toe so we discard all subsequent sightings of it. If the game is finished and has not yet been seen, the statistics are updated.

The full algorithm is available on GitHub and you can run it live in your browser by clicking on this link: stats.html, which takes a few minutes to run.

That page records the outcome of the game from the perspective of the first move. "cornerXWins5" shows how many times X won games that were only 5 moves long by starting in a corner. As part of updating statistics, the results are grouped into games which are:
  • "stupid" - the player has missed winning in the current move or missed stopping the opponent from winning in the next move; grouped under "stupid stats"
  • won with a "fork" - the winning player had more than one position to move on the board in order to win, when they won; grouped under "fork stats"
  • all other games; grouped under "stats"
The functions used to categorise the games are determineIfStupid and determineIfWinningWithFork.

First of all, notice how there are 255,168 unique games (shown near the bottom of the page after computation is complete). This correlates with other results.

The following table is copied from that page and consolidates these raw results to show how often the starting player can win when opening in the relevant location.

14652 14232 14652
14232 15648 14232
14652 14232 14652

For example, the first player wins in 14652 of the 255168 unique games if they start in the top left. This number comes from adding the number of corner wins after 5, 7 and 9 moves (X doesn't win after 6 or 8 moves, rather O does), for each group of game (0+992+1344 from "stats", 0+1984+0 from "fork stats" and 720+19776+33792 from "stupid stats"), and importantly dividing by four, because there are four corners - they are shared out over the four corners of the table above. Results for the edges are also divided by four, but since there is only one centre cell the result is left undivided.

This table demonstrates that the best location to start is in fact the centre of the board, because starting there wins the most amount of games. This is confirmed by Azahar Machwes blog article Tic Tac Toe: Statistical Analysis with arbitrary Grid Size (albeit for different reasons than he gives) which states:
For the first movers the best starting square by far is 5 (right in the middle and opens up 4 winning lines – most of any starting square, all other squares open up to 3!). Second best option then becomes the corners (1, 3, 7, 9).
A similar result is shown when considering how often the first player loses after they start in one of the following locations (this is done by consolidating results of games that end after 6 or 8 moves, i.e. when the second player wins, beating the first player):

 7896 10176  7896
10176  5616 10176
 7896 10176  7896

For example, the first player loses in 10176 of the 255168 unique games when they start in the middle of the top row. This also demonstrates that the opening move for the first player, which leads to the least number of games lost, is also the centre. Finally we can consider how many games are drawn when the first player opens in the relevant cell:

5184 5184 5184
5184 4608 5184
5184 5184 5184

The first player draws least when opening in the centre.

As such, the centre is the best position for the first player to start, as it leads to a) the most number of games won and b) the least number of draws and c) the least number of games lost.

But because we are considering every possible game, this statement does depend upon the players being random in their moves and not necessarily moving so that they win as soon as possible or block an opponent from winning on their next turn. For that reason, the results have been grouped so that we can also analyse what happens when a more clever player plays. Let us disregard all "stupid" games where a move to win instantly is missed or a move to block the opponent winning next is missed. We are now considering a more "perfect player". They are only "more" perfect, rather than exactly "perfect", because the algorithm used to ignore stupid games works as follows. First it checks to see if the player is missing a chance to win immediately. If they are, they count as stupid. Second, it checks to see if the opponent could win in their next move, and if they can, and the current player does not block them, they count as stupid too. Now consider the game "1234567", as it has some special properties. After "12345" the board looks like this:

X O X
O X  
     

O, the second player, needs to move to position 7 or 9 in order to attempt to block the fork created when X moved to position 5. Of course any move which O makes is futile, and as such, the move to 6 which O is about to make does not count as stupid, because there is nothing they can do at this stage to avoid losing. Perhaps Os move to position 4 should have counted as stupid, because it didn't block X from getting a fork in their next move. The algorithm doesn't go that far, and as such this game is marked as being non-stupid and won with a fork by X in 7 moves. Since it isn't as good as it could be, I describe the players as more perfect rather than perfect.

So, when considering these more perfect players, the number of games won by the first player moving to the relevant cell then becomes:

1080 1240 1080
1240  480 1240
1080 1240 1080

The centre no longer leads to the most number of games won, rather starting on an edge improves the chances of a victory. The number of games lost by the first player moving to the relevant cell is:

304 608 304
608 128 608
304 608 304

The centre is still the starting position which results in the least number of losses for the first player. Although the edge has become the starting position which results in the most number of wins (see above), it is also the starting position resulting in the most number of games lost. Draw statistics for these more perfect players look like this:

1668 2144 1668
2144  816 2144
1668 2144 1668

Starting in the centre still results in the least number of draws. The results of the last three tables are a little conflicting - is it best to start in the centre and avoid a draw or start on an edge which leads to most victories but also to most losses? One way of answering this question is to consolidate the three tables by combining the results using a "reward function" - a formula to sum the individual results which can be weighted if required.

Imagine writing an artificial intelligence algorithm which had to determine the best cell to move to. It might be trained to favour a win over a draw and a draw over a loss. The reward function could be:
reward = (100*oddsOfWinning) + (10*oddsOfDrawing) + (-1*oddsOfLosing)
            
The table of rewards for perfect play, using the above weightings, is:

1,509,144 1,464,864 1,509,144
1,464,864 1,605,264 1,464,864
1,509,144 1,464,864 1,509,144

The result in this case is that a first move to the centre is still the best. Even if we change the weightings, it isn't possible to tune the reward function so that the corner becomes the best opening move.

Maybe it depends on how "best" is defined. The popular webcomic xkcd posted a complete map of optimal tic-tac-toe moves, perhaps in 2010, which is apparently based on the work titled "Fractal Images of Formal Systems", by Paul St. Denis and Patrick Grim, which shows a more complete map, because it does not show just the top left corner where X starts in the top left. For some reason, only the top left corner of their image was reproduced by xkcd. A few people have asked why this is on the xkcd forum and the best explanation I could find is from someone using the name Algrokoz, who states:
Opening in the corner is optimal because of the number of chances it gives O to not play optimally, not because of total potential win scenarios.
That means a corner opening gives the opponent less places to go in order to play optimally, whatever "optimally" means. Assuming that the best counter to an opening in a corner is the centre, and assuming that the best counter to an opening in the centre is a corner, then yes, one could argue that there are four corners and just one centre, so it is better to start in a corner, because a random opponent has a one in eight chance of finding the best counter move. Compare this to opening in the centre where a random opponent has a four in eight chance of finding the best counter move. We will examine this logic in a bit...

Martin Gardner, an American popular mathematics and science writer wrote the following in the 1988 edition of his book Hexaflexagons and Other Mathematical Diversions
If played "rationally" by both sides, the game must end in a draw. The only chance of winning is to catch an unwary opponent in a "trap" where a row can be scored on the next move in two ways, only one of which can be blocked. Of the three possible opening plays- a corner, the center or a side box- the strongest opening is the corner, because the opponent can avoid being trapped at the next move only by one of the eight possible choices: the center. Conversely, center opening traps can be blocked only by seizing a corner.
The "traps" which he is talking about are what Wikipedia and I refer to as forks - plays which result in a winning opportunity in more than one cell, forcing the opponent to lose. In his statement Gardner is saying that the only place the opponent can avoid a fork, after an opening in a corner, is the centre. If we analyse just games that end in a fork, and search for what happens after two moves we find that when X starts in a corner and O takes the centre, there are still 24 games that X can win with a fork, so I am not sure that Gardners statement is correct. However of all the other second moves, countering with the centre results in the least amount of possible forks later in the game (this is easy to calculate by grouping game results by their two opening moves, see the "second move analysis" shown on the statistics page after the computation completes).

His logic however does seem to be similar to that of the xkcd comment. Interestingly, if we consider what happens when X starts in the centre and O takes a corner, there are also 24 forks by which X can win the game, and this too is the best counter move from the point of view of forks.

If we consider only games won by a fork (see the "fork stats" section on the page which calculates them), then we find that starting in a corner results in 496 (1984/4) forks, compared to only 320 when starting in the centre. Finally we have a situation where it is indeed better to start in the corner.

But is it correct to just consider games that end in a fork? I don't believe it is possible to force a player into a situation where they will be confronted with a fork. Think about the game presented earlier, "1234567", where a fork could have been avoided by O moving to the centre (5) instead of 4. If a player is clever enough to avoid a fork then the game will certainly end in a draw. And so we are talking about players who know how to avoid losing from the start. There is no "best" place to go in order to avoid a draw, when perfect players are playing - Wikipedia and my own analysis shows that when perfect players play, the only possible result is a draw. An optimal opening position can surely only exist when playing a player who might make a mistake or a move which is less than ideal. In that case it makes no sense to analyse only games which end in a fork and we end up having to also consider games that end otherwise. As we have already seen, a corner opening is not the best.

Let's return quickly to the argument about the number of places the second player can move to. Maybe Gardners argument is the same, that in total, because there are four corners and just one centre, there are more moves available to the opponent to avoid a fork, when opening in the centre. But in the same quote he has reduced the number of opening moves to just three, because of board symmetries, so such a conclusion wouldn't be logical.

The problem with Gardners statement (and the graphic published by xkcd) is that the Wikipedia tic-tac-toe page uses them to make the following claim:
The first player, who shall be designated "X", has 3 possible positions to mark during the first turn... Player X can win or force a draw from any of these starting marks; however, playing the corner gives the opponent the smallest choice of squares which must be played to avoid losing[16]. This makes the corner the best opening move for X, when the opponent is not a perfect player.
Reference 16 is the Gardner book that I reference above. The last sentence in the Wikipedia quote is problematic because of the phrase "when the opponent is not a perfect player". As argued above, if players are not perfect, we need to consider at least games ending in a fork and otherwise, but probably also some games which involve stupid moves. As shown in this article, an opening in the centre is better than one in a corner.

Why is all of this so important? I've been working on training a machine learning algorithm to play tic-tac-toe and I was surprised how it learned to open in the centre rather than a corner. I assumed Wikipedia was correct... As a result of this study I shall now endevour to get the Wikipedia page corrected, before a generation of children wrongly learn that opening in a corner is the best move!


Copyright ©2018, Ant Kutschera
Social Bookmarks :  Add this post to Slashdot    Add this post to Digg    Add this post to Reddit    Add this post to Delicious    Add this post to Stumble it    Add this post to Google    Add this post to Technorati    Add this post to Bloglines    Add this post to Facebook    Add this post to Furl    Add this post to Windows Live    Add this post to Yahoo!



Add a comment Send a TrackBack