Creating Tic Tac Tableau Part 2: 5 Million Records – Really?
If you missed them, see the original Tic Tac Toe Dashboard and Part 1 first.
Chris Love asked if it was really necessary to have 5 million records. After all, there are only 19,683 possible variations of boards – even less when you consider what would be a valid board.
(By the way, check out Chris’ post Generating Tic Tac Toe Data the Alteryx Way which has caused me to bump “Learn Alteryx” up on my to-do list. Visually working with the data beats coding any day. That’s why I love Tableau! And I’m really hoping to see an Alteryx solution that incorporates the complexity below.)
At first, Chris’ question caused some self-doubt. Maybe I had over-complicated things and really should have had 19,683 records of data instead of 5 million. That would have derailed the posts I had planned to explain how I turned 5 million records into less than a million!
But then I remembered why I needed such a large data set. What makes the Tic Tac Toe dashboard work is not having every possible board. It’s having every possible move that leads to every possible board. You can see that here:
You’ll notice Board 1 at the top. That’s the starting point. X can play in any empty square, which represents 9 possible children boards for Board 1 (only 3 are shown). For any one of those boards, O has 8 possible moves. Those represent the grand-children of board 1. So there are 9 children and 9 X 8 = 72 grand-children of Board 1. So, just 2 plays into the game, there are 72 possible outcomes, but 72 + 9 combinations of moves. When you work your way through to every possible outcome, there are (9 X 8 X 7 X 6 X 5 X 4 X 3 X 2 X 1) + (9 X 8 X 7 X 6 X 5 X 4 X 3 X 2) + (9 X 8 X 7 X 6 X 5 X 4 X 3) + (9 X 8 X 7 X 6 X 5 X 4) + (9 X 8 X 7 X 6 X 5) + (9 X 8 X 7 X 6) + (9 X 8 X 7) + (9 X 8 X 7) + (9 X 8) + 9 possible moves. Except that some moves result in wins before every space is used. Then, multiply the result by 9 (one record for every space) and the end result is nearly 5,000,000.
That’s the number of records required for the full data set required by the Tic Tac Toe dashboard. Additionally, to allow Tableau to understand how to play, each record must have these elements:
- Board ID: the ID of the current board
- Child ID: the ID of the board resulting from the move. This will allow the action filter to work
- Parent ID: the ID of the board that preceded this one. The child ID alone would be enough except that the dashboard resulting from a click will always skip a generation. In the example above, Board 1 will be shown but no board from the second level will ever display. Upon clicking, Tableau will select a board from the next generation. The third level represents the resulting board after Tableau has taken its turn. Thus, the grand-children are the set of boards from which Tableau must pick the “best”
- Game Status: In Progress, X Win, O Win, or Draw. Tableau should select O Win, if possible
- Leads to Inevitable Loss: the board will result in a loss for Tableau (so, Tableau needs to not select it). This was a late addition to the data set (so it’s not in the code). I added it after the fact, once I realized it was necessary.
- # X Wins: the number of X wins possible from this board (Tableau needs to minimize this)
- # O Wins: the number of O wins possible from this board (Tableau needs to maximize this)
Now, out of the 5 million records that result from every combination of every possible move, I can eliminate some records and use some tricks to have Tableau supply missing data I remove from the source. I’ll show how in future posts…
In the meantime, here’s a view of the first 30 possible boards (with 9 spaces for each board)