website

website.git
git clone git://git.lenczewski.org/website.git
Log | Files | Refs

cdoku.html (8565B)


      1 <h3>Solving Sudoku</h3>
      2 <p>
      3   A while ago, for a university assignment, we were asked to write a sudoku solver.
      4   My first implementation used the classical brute force approach, written in 
      5   python. It worked relatively well for simple grids, but took ridiculously long 
      6   for more complex grids with fewer given values. It also had a pathological case 
      7   which couldn&#39;t be completed after 3 hours of runtime:
      8 </p>
      9 
     10 <pre><code class="language-text">
     11  . . . | . . . | . . .
     12  . . . | . . 3 | . 8 5
     13  . . 1 | . 2 . | . . .
     14 =======#=======#=======
     15  . . . | 5 . 7 | . . .
     16  . . 4 | . . . | 1 . .
     17  . 9 . | . . . | . . .
     18 =======#=======#=======
     19  5 . . | . . . | . 7 3
     20  . . 2 | . 1 . | . . .
     21  . . . | . 4 . | . . 9
     22 </code></pre>
     23 
     24 <p>
     25   This case is pathological, because the initial row consists of the numbers 
     26   9 -&gt; 1. This means that the algorithm takes the longest amount of time possible 
     27   on the first row, having to test 9! posibilities; the algorithm starts with 1, 
     28   attempts to fill in the rest of the grid, and if it fails backtracks and tries 
     29   with 2, then 3, etc. This poor performance motivated me to try to write a better 
     30   solver.
     31 </p>
     32 
     33 <h3>Sudoku and Exact Covers</h3>
     34 <p>
     35   During a trip down a particular internet rabbit hole concerning sudoku solvers, 
     36   I ran across the so-called Algorithm X. It was invented by Donald E. Knuth so 
     37   solve <q>exact cover</q> problems. An exact cover problem is one where you have a 
     38   set X of possible values, and you have another set S containing sets of values 
     39   in X:
     40 </p>
     41 
     42 <pre><code class="language-text">
     43 For example:
     44 X = { 0, 1, 2, 3, 4, 5, 6 }
     45 
     46 S = { A, B, C, D }
     47 A = { 0, 1, 3 }
     48 B = { 2, 4, 5 }
     49 C = { 2, 6 }
     50 D = { 6 }
     51 </code></pre>
     52 
     53 <p>
     54   In an exact cover problem, you are trying to find a subset of S such that each 
     55   element in the universe set X is present <em>exactly</em> once. Luckily for us, sudoku 
     56   can be expressed as exactly this kind of problem!
     57 </p>
     58 <p>
     59   In sudoku, you have a 2D grid. This grid usually has 9 rows and 9 columns, and 
     60   is arranged as a 3x3 grid of 3x3 cells. To satisfy a sudoku, you must have exactly 
     61   one of each number between 1 and 9 (inclusive) in each row, in each column, and 
     62   in each cell. See the link? We have 4 sets of constraints:
     63 </p>
     64 
     65 <ul>
     66 <li><q>value-constraints</q> : (<strong><em>i</em></strong>, <strong><em>j</em></strong>) = <strong><em>v</em></strong></li>
     67 <li><q>row-constraints</q> : <strong><em>v</em></strong> in row <strong><em>j</em></strong></li>
     68 <li><q>column-constraints</q>: <strong><em>v</em></strong> in column <strong><em>i</em></strong></li>
     69 <li><q>cell-constraints</q> : <strong><em>v</em></strong> in cell (<strong><em>i&#39;</em></strong>, <strong><em>j&#39;</em></strong>)</li>
     70 </ul>
     71 
     72 <p>
     73 Our universe set (aka constraint set) X is therefore the union of these 4 sets, 
     74 and our possibility set S will be the union of all numer-position pairs.
     75 </p>
     76 
     77 <h3>Introducing Algorithm X</h3>
     78 <p>
     79   Knuth&#39;s Algorithm X is an efficient way of solving exact cover problems. It 
     80   takes in a matrix representation of the problem, with the universe set X as 
     81   the columns of the matrix, and the possibility set S as the rows. For each row, 
     82   the presence of a 1 in a given column means that the element represented by the 
     83   column is present in the set represented by the row, and similarly a 0 in the 
     84   same column would mean that the element is not present in the set.
     85 </p>
     86 <p>
     87   Algorithm X works as follows:
     88 </p>
     89 
     90 <pre><code class="language-text">
     91 Let M be our possibility-constraint matrix
     92 Let M(i,j) denote the value in column i, row j
     93 
     94 1. If M has no columns, return true
     95 2. Pick a column c in M
     96 3. Pick a row r in column c such that M(c,r) = 1
     97 4. Add the row to a solution set
     98 5. For each column j such that M(j,r) = 1
     99       For each row i such that M(j,i) = 1
    100          Remove row i from the matrix
    101       Remove column j from the matrix
    102 6. Recurse on the new, reduced matrix M
    103 </code></pre>
    104 
    105 <p>
    106   Knuth recommends picking the column with the fewest 1s in step 2. Although any 
    107   consistent way of picking a column would work, it could lead to more operations 
    108   and thus a slower solve. Thus, we will stick with the recommended method. Using 
    109   our previous example problem, heres how Algorithm X would run through the problem:
    110 </p>
    111 
    112 <pre><code class="language-text">
    113 X = { 0, 1, 2, 3, 4, 5, 6 }
    114 
    115 S = { A, B, C, D }
    116 A = { 0, 1, 3 }
    117 B = { 2, 4, 5 }
    118 C = { 2, 6 }
    119 D = { 6 }
    120 
    121 M:
    122     0 1 2 3 4 5 6 
    123 A [ 1 1 0 1 0 0 0 ]
    124 B [ 0 0 1 0 1 1 0 ]
    125 C [ 0 0 1 0 0 0 1 ]
    126 D [ 0 0 0 0 0 0 1 ]
    127 
    128 Solution = { }
    129 
    130 Algorithm:
    131 1. M has columns, continue
    132 2. Pick column 0 (first with fewest 1s)
    133 3. Pick row A
    134 4. S += A (S = { A })
    135 5. For j in { 0, 1, 3 }
    136       For i in { A }
    137          Remove row i from M
    138       Remove column j from M
    139 
    140 M:
    141     2 4 5 6
    142 B [ 1 1 1 0 ]
    143 C [ 1 0 0 1 ]
    144 D [ 0 0 0 1 ]
    145 
    146 6. Recurse on the new, reduced M
    147 7. M has columns, continue
    148 8. Pick column 4 (first with fewest 1s)
    149 9. Pick row B
    150 10. S += B (S = { A, B })
    151 11. For j in { 2, 4, 5 }
    152        For i in { B, C }
    153           Remove row i from M
    154        Remove column j from M
    155 
    156 M:
    157     6
    158 D [ 1 ]
    159 
    160 12. Recurse on the new, reduced M
    161 13. M has columns, continue
    162 14. Pick column 6 (first with fewest 1s)
    163 15. Pick row D
    164 16. S += D (S = { A, B, D })
    165 17. For j in { 6 }
    166        For i in { D }
    167           Remove row i from M
    168        Remove column j from M
    169 
    170 M:
    171 [ ]
    172 
    173 18. Recurse on the new, reduced M
    174 19. M has no columns, return true
    175 
    176 Therefore, S = { A, B, D}
    177 </code></pre>
    178 
    179 <p>
    180   As we can see, the algorithm is quite efficient. We didn&#39;t have to make many 
    181   tests and comparisons, and arrived at the correct answer quickly.
    182 </p>
    183 
    184 <h3>Optimisations</h3>
    185 <h4>Using a Linked List Matrix</h4>
    186 <p>
    187   The example given above was fairly trivial. It showed how the algorithm works, 
    188   but it was a tiny problem. For larger problems, the naive version of Algorithm X 
    189   (using an array-based matrix) spends a lot of its time iterating over rows and 
    190   columns looking for a 1. This problem is exacerbated when the matrix is sparse 
    191   (a lot of the values are 0). To solve this problem, Knuth recommends using a 
    192   linked-list based matrix, which stores only the 1s in the matrix. This is done 
    193   by having a row node indicate the presence of a 1.
    194 </p>
    195 <p>
    196   An example of what the linked-list based matrix would look like for our example:
    197 </p>
    198 
    199 <pre><code class="language-text">
    200 &quot;Dense&quot; Matrix:
    201     0 1 2 3 4 5 6 
    202 A [ 1 1 0 1 0 0 0 ]
    203 B [ 0 0 1 0 1 1 0 ]
    204 C [ 0 0 1 0 0 0 1 ]
    205 D [ 0 0 0 0 0 0 1 ]
    206 
    207 &quot;Sparse&quot; Matrix:
    208   | 0      1     2     3     4     5    6
    209 ==|=======================================
    210 A | A0 &lt;&gt; A1 &lt;------&gt; A3
    211 B |             B2 &lt;------&gt; B4 &lt;&gt; B5
    212 C |             C2 &lt;------------------&gt; C6
    213 D |                                     D6
    214 </code></pre>
    215 
    216 <p>
    217   All nodes in a column are doubly-linked to the elements above them and below 
    218   them (i.e the column header for 2 is linked downwards to B2, and upwards to C2).
    219   All nodes in a row are doubly-linked to the elements to their left and right 
    220   (i.e A0 is linked to the left to A3, and to the right to A1).
    221 </p>
    222 <p>
    223   Fun Fact: This 2D doubly-linked list forms a torus topology, where the column 
    224   headers form the initial 2D ring, and the rows make the ring 3D (give the 
    225   doughnut its thickness).
    226 </p>
    227 
    228 <h4>Dancing Links</h4>
    229 
    230 <p>
    231   Another optimisation that follows naturally from the use of a linked-list matrix 
    232   is to use a technique called <q>dancing links</q>. This is a method of implementing 
    233   backtracking using linked lists, by splicing out an element (without resetting 
    234   its previous and next pointers; this means it is in an invalid and unsafe state), 
    235   recursing, then splicing it back afterwards (using its stored pointers).
    236 </p>
    237 <p>
    238   Splicing an element out of a linked list:
    239 </p>
    240 
    241 <pre><code class="language-text">
    242 A &lt;---&gt; B &lt;---&gt; C
    243 
    244 After splicing out B:
    245 A &lt;---&gt; C
    246  \_ B _/
    247 </code></pre>
    248 
    249 <p>
    250   Here, B has retained its previous pointer to A and its next pointer to C. It can 
    251   then be added to the list again by settings its previous element&#39;s (A) next 
    252   pointer to itself (B), and settings its next element&#39;s (C) previous pointer to 
    253   itself (B):
    254 </p>
    255 
    256 <pre><code class="language-text">
    257 A &lt;---&gt; C
    258  \_ B _/
    259 
    260 After splicing in B:
    261 A &lt;---&gt; B &lt;---&gt; C
    262 </code></pre>
    263 
    264 <h3>Implementation</h3>
    265 <p>
    266   Although this post has mostly been on Algorithm X, and hasn&#39;t shown any code, 
    267   my implementation (in C) can be found
    268   <span><a href="https://git.lenczewski.org/cdoku/log.html">here</a></span>.
    269 </p>