Once upon a time I wrote a Connect Four player program for a competition. My program was able to play the game optimally, winning it whenever it was possible to win, in a few seconds. I won the competition, and the program turned out to be the only one that played optimally. The strategy was Minimax with various optimizations: caching, mirroring, a form of alpha-beta pruning, an opening book (hard coding the first moves), move ordering and… bit hacks.

The bit hacks made it possible to store the game state (the board) in two 64 bit fields plus an array for column heights. I did not invent this myself, instead using a Stack Overflow question, but actually this goes back to at least 1988 when the game was solved twice: first by James D. Allen, then quickly after by Victor Allis who used a very creative approach. More background here.

Recently I was able to take these Connect Four bit hacks a step further, coming up with some things I have not found anywhere else. I thought it would be fun to give a quick overview and share the insights.

Representation

The board representation I used has two 64 bit fields, one for each player like this:

5 14 23 32 41 50 59
4 13 22 31 40 49 58
3 12 21 30 39 48 57
2 11 20 29 38 47 56
1 10 19 28 37 46 55
0  9 18 27 36 45 54

If a bit is 1 then the corresponding position has a checker for that player, if 0 it doesn’t. That’s it.

Move

With this representation, the most straightforward way to do a move is like this:

bitboard |= 1 << (height[column] + 9*column)
height[column]++

But actually you don’t need height[] because you can do it like this:

filled = bitboardPlayerToMove | bitboardOtherPlayer
COL0 = 0x3f
ROW0 = 0x40201008040201
VALID_PLACES = COL0 * ROW0
moves = (filled + ROW0) & VALID_PLACES // main part
column_mask = COL0 << (column * 9)
move = moves & column_mask
bitboardPlayerToMove |= move

which is a trick relying on the fact that the columns in the combined bitboard filled are just consecutive sequences of 1 bits, starting from the bottom row. I haven’t seen this exactly before but something similar is done here.

Also note that switching the player to move is as simple as

bitboardPlayerToMove, bitboardOtherPlayer =
    bitboardOtherPlayer, bitboardPlayerToMove

No need to store a boolean or move counter.

Winning

The idea behind using the bitboards to check for a four-in-a-row is covered in the Stack Overflow question and more in-depth here. But to quickly explain, the idea behind it relies on this idea:

two_in_a_row = bitboard & (bitboard >> 1)

Here, bit i in two_in_a_row is 1 if bit i and i+1 in bitboard are 1. If you repeat it:

four_in_a_row = two_in_a_row & (two_in_a_row >> 2)

then four_in_a_row is 1 if bit i, i+1, i+2, i+3 in biboard are 1. Because some bits between the columns in the bitboard representation are not used (e.g. bits 6-8) there is no risk of overflow. Effectively, checking if four_in_a_row != 0 allows you to check for horizontal four-in-a-rows. You can play with the 1 and 2 shift values to make it work for vertical or diagonal four-in-a-rows. In Rust it would look like this:

const HOR_STRIDE: u64 = 9;
const VER_STRIDE: u64 = 1;
const DIAG_DOWN_STRIDE: u64 = HOR_STRIDE-1;
const DIAG_UP_STRIDE: u64 = HOR_STRIDE+1;

fn and4(x: u64, stride: u64) -> u64 {
    let and2 = x & (x >> stride);
    return and2 & (and2 >> (2 * stride));
}

fn won(board: u64) -> bool {
    let h4 = and4(board, HOR_STRIDE);        // Horizontal -
    let v4 = and4(board, VER_STRIDE);        // Vertical |
    let dd4 = and4(board, DIAG_DOWN_STRIDE); // Diagonal \
    let du4 = and4(board, DIAG_UP_STRIDE);   // Diagonal /
    return h4|v4|dd4|du4 != 0;
}

Almost-Wins

Instead of just checking if we have won on the current board, we can actually use bit hacks to see if we can make a “win-in-1-move”. For that we need what I’ll name “almost-wins”: positions that would result in a win for a certain player if a checker would be placed there. For that we need something similar to the and4() function used before, but unlike and4() which checks (for the horizontal case) if all the bits i, …, i+3 are 1, it checks if 3 out of 4 of those bits are 1:

fn comb34(x: u64, stride: u64) -> u64 {
    let and2 = x & (x >> stride);
    let xor2 = x ^ (x >> stride);
    // Check for patterns x[i]=1, x[i+s]=1 and
    // either x[i+2s]=1 or x[i+3s]=1
    let b21 = and2 & (xor2 >> (2 * stride));
    // Check for patterns x[i+2s]=1, x[i+3s]=1 and
    // either x[i]=1 or x[i+s]=1
    let b12 = xor2 & (and2 >> (2 * stride));
    return b21 | b12;
}

This function can be used to see if at bit i we have an almost-win due to bits i, …, i+3. That is useful, but bit i would also be an almost-win if, say, bits i-3, …, i would have three out of four 1 bits. So we need another operator

fn or4(x: u64, stride: u64) -> u64 {
    let or2 = x | (x << stride);
    return or2 | (or2 << (2 * stride));
}

This operator checks (for the horizontal case) if any of the bits i-3, …, i are 1. If we combine them like this: or4(comb34(board, 1), 1), we are able to check the full range. If we repeat that for the other cases, we get

fn almost_wins(board: u64) -> u64 {
    let h = or4(comb34(board, HOR_STRIDE), HOR_STRIDE);
    let v = or4(comb34(board, VER_STRIDE), VER_STRIDE);
    let dd = or4(comb34(board, DIAG_DOWN_STRIDE), DIAG_DOWN_STRIDE);
    let du = or4(comb34(board, DIAG_UP_STRIDE), DIAG_UP_STRIDE);
    return h | v | dd | du;
}

Now if we can place a checker on any of the almost-win positions (and if there wasn’t a checker before), then we would win. You can see that the number of instructions has about doubled compared to just checking for a win. This is little compared to checking for a win for each of the 7 columns. On top of that, this function lends itself well to SIMD.

Alternatively, these almost-wins can be used to check if the opponent can make more than one winning move on the current board. If so, the current player would always lose, no matter where they place their checker. Also, placing a checker directly below an opponent’s almost-win position would result in a win for the opponent. If you combine these two ideas you achieve the same as checking if the current player can make a winning move, and it is a little bit faster (about 10% for me). This idea is implemented here.

Color Less Checkers

This is an idea I haven’t seen before, and allows for large speed ups when using a cache. When computing the optimal move for a board in Connect Four with Minimax, you are often looking at board positions for which you already have computed the optimal move. With a cache you can look the boards up and return the score or move. You don’t have to look up the exact same board. As a simple example, you can look at the horizontally mirrored board. On top of that, when looking at a board there might be checkers on that board which cannot be part of a four-in-a-row (“color less checkers”). Consider this extreme example:

OOXO...
XOXX.OX
OOXO.OX
XXOO.XO
OOOXXOO
XXXOXXX

If we mark all color less checkers with * we get

**XO...
***X.O*
**XO.**
*XOO.X*
*OO*X**
****X**

The following similar board (only the checker in the top left corner is changed) would give the same color less checkers:

XOXO...
OOXX.OX
OOXO.OX
XXOO.XO
OOOXXOO
XXXOXXX

When determining the optimal moves, the color of these checkers doesn’t matter, so when looking them up we can use the same optimal move even though the boards are different. As you can imagine this can matter a lot when the boards are almost full.

To compute those color less checkers, we can first look at all positions that could still be part of a four-in-a-row, and then simply bit flip the result. To find positions that could still be part of a four-in-a-row, we can take a player’s checkers plus empty places, and check for places which are part of a four-in-a-row. Checking which places are part of a four-in-a-row can be done like this:

fn wins(self) -> BitBoard {
    let ph = or4(and4(self.board, HOR_STRIDE), HOR_STRIDE);
    let pv = or4(and4(self.board, VER_STRIDE), VER_STRIDE);
    let pdd = or4(and4(self.board, DIAG_DOWN_STRIDE), DIAG_DOWN_STRIDE);
    let pdu = or4(and4(self.board, DIAG_UP_STRIDE), DIAG_UP_STRIDE);
    return BitBoard {board: ph | pv | pdd | pdu};
}

Then we compute places where we can still potentially have a four-in-a-row like this:

potentialCurrent = wins(!bitboardOtherPlayer & VALID_PLACES)

This can be repeated for the other player. The color less checkers are then

(bitboardCurrentPlayer | bitboardOtherPlayer) &
    !potentialCurrent & !potentialOther

Using this when caching gave me a ~3.3x speedup.

Final Remarks

Connect Four lends itself very well to bitboard representations and surely there are further speedups possible. That said, not all bit hacks pay them selves back. The cost of the check has to be weighted against the cost of the savings, and this gets harder as you add more speedups.

I have implemented a full implementation of a Connect Four solver in Rust (with SIMD), containing all these ideas and more. It can be found on GitHub.

The Wondrous World of Connect Four Bit Boards