Sudoku Helper - Strategies


Overview

In what follows a row denotes a horizontal row of 9 squares, a column denotes a vertical column of 9 squares, a block denotes one of the nine 3x3 groups of squares, and a unit denotes any row, column or block.

Sudoku Helper solves positions by maintaining a list of possible digits for each unsolved square. A square can be solved immediately if it has only one possibility, or if one of its possible digits cannot be in any other square in the same unit. If no squares can be solved immediately then Sudoku Helper uses four increasingly complicated strategies to try to eliminate possibilities; the following sections describe these strategies.

1. Sudoku Rules

The rules of Sudoku state that a digit can appear at most once in any unit. Thus, when a square is solved, the digit in that square can be eliminated as a possibility from all other squares in the same row, column or block.

2. Buckets

A bucket is three consecutive horizontal or vertical squares within a block; note that this is the intersection of the block with a row or column respectively. The Bucket strategy is to try to find a digit that must be in a certain horizontal/vertical bucket; that digit can then be eliminated from the squares of the other horizontal/vertical buckets in the same row/column or block. There are four variations of this strategy which are stated almost identically:

  1. If a digit cannot be in two of the horizontal buckets in a row, then it must be in the third, and is therefore not in the other two horizontal buckets in the same block.
  2. If a digit cannot be in two of the horizontal buckets in a block, then it must be in the third, and is therefore not in the other two horizontal buckets in the same row.
  3. If a digit cannot be in two of the vertical buckets in a column, then it must be in the third, and is therefore not in the other two vertical buckets in the same block.
  4. If a digit cannot be in two of the vertical buckets in a block, then it must be in the third, and is therefore not in the other two vertical buckets in the same column.
The Bucket strategy is variously known as Intersection Removal, Pointing Pairs/Triples, and Box-Line Reduction.

3. Pigeonhole

The pigeonhole principle states that if there are n + 1 pigeons and n pigeonholes, then some pigeonhole must contain at least two pigeons. Two variations of this principle apply to Sudoku if we assume that every pigeonhole contains exactly one pigeon (and every pigeon is in a pigeonhole):

  1. If a set of k pigeons can only be in k pigeonholes, then no other pigeons are in any of these k pigeonholes.
  2. If a set of k pigeonholes can only contain k pigeons, then no other pigeonholes contain any of these k pigeons.
In Sudoku the pigeons are the digits 1-9, and the pigeonholes are the nine squares in a single unit. If some set of k digits can only appear in k squares in the unit then we can eliminate all other digits as possibilities in these squares, and if some set of k squares can only contain k digits then we can eliminate these digits as possibilities in all other squares in the same unit.

In theory the Pigeonhole strategy can be applied with 2 ≤ k ≤ 8, but because of the dual nature of this strategy we never need to consider k > 4. Suppose, for example, that some set of 5 digits can only appear in 5 squares in a unit (so that we can eliminate the other 4 digits as possibilities from these squares). This means that the remaining 4 squares do not have these 5 digits as possibilities, so these 4 squares can only contain the remaining 4 digits, so we can apply the Pigeonhole strategy with k = 4 and remove these digits from the 5 original squares. More generally, this argument shows that if the Pigeonhole strategy can be used with k = n, then the dual strategy can be used with k = 9 - n to eliminate the same set of possibilities.

Special cases of the pigeonhole strategy are known as Hidden/Naked Pairs/Triples/Quads.

4. Chains

Suppose we can prove that one of two squares A, B must contain a digit d. It follows that d can be eliminated as a possibility from any other square which shares units with both A and B. Logically, the statement "Either A or B contains d" is equivalent to "If A does not contain d then B contains d". The chain strategy seeks to prove such statements by finding a chain of implications

(d is not in A)  =>  (x1 is in X1)   =>  (x2 is not in X2)   =>  …   =>  (d is in B)

Four basic rules are used to construct the individual implications in these chains:

  1. If d and e are the only possible digits in A, then (d is not in A)  =>  (e is in A).
  2. If A and B are the only squares in a unit which can contain d, then (d is not in A)  =>  (d is in B).
  3. If d and e are both possibilities of A, then (d is in A)  =>  (e is not in A).
  4. If A and B are in the same unit and d is a possibility of each, then (d is in A)  =>  (d is not in B).
Two special cases of chains are easier to describe (and easier for a human to find) than such Generalized Chains. The first, a Chain of Pairs, is a sequence of squares where consecutive squares share a unit, each square has exactly two possibilities, and the sequence of possibilities is of the form

(a, b),  (b, c),  (c, d),  ,  (x, a)

Using rules 1 and 4, if the first square does not contain a, then the first square does contain b, then the second square does not contain b, then the second square does contain c, etc., giving a chain of implications that ends with the last square containing a. Note that the digits need not all be distinct; the only requirement is that the second digit in each pair is the same as the first digit in the next (and, of course, that the very first and very last digits are the same). Special cases of the Chain of Pairs strategy are known as XY-Wing, Y-Wing, Y-Wing Chains, and Remote Pairs.

The second special case is a Chain of Digits for some digit d. This is a sequence of an even number of squares where consecutive squares again share a unit, each square has d as a possibility, and for each n ≥ 1 the (2n - 1)th and (2n)th squares in the sequence are the only two squares in some unit which can contain d. Using rules 2 and 4, if the first square does not contain d, then the second square does contain d, then the third square does not contain d, etc., giving a chain of implications that again ends with the last square containing d. Special cases of the Chain of Digits strategy are known as X-Wing, Swordfish, Guardians, Broken Wings and Turbot-Fish.

Sudoku Helper will always first look for a Chain of Pairs or a Chain of Digits, reporting the smallest chain that it can find. Only if there are no such chains will it make use of Generalized Chains since these tend to be quite complicated. When Sudoku Helper finds a chain, the hint will report the chain of distinct squares, even if some square actually appears twice consecutively in the chain of implications (which occurs when rules 1 or 3 are used).