The trivial test for my initial solver sketch passed. Will it hold up? Let’s find out.

Here’s the test that passes:

    def test_nearly_solved(self):
        puzzle = Puzzle('123312230')
        solver = Solver(puzzle)
        solved = solver.solve()
        assert solved.game == '123312231'

We have there a 3x3 puzzle with just one open slot, and the current solver sketch finds the right answer.

A wise person might unsolve (sic) just one more value and see what happens. A wild person might just give the program the whole problem and let it rip. If it fails, and I kind of fear that it will, the conservative test may be best.

Let’s be conservative: it’ll be easier to see the problem if it fails and avoids a big disappointment if we put in a complete problem setup and let it try.

    def test_two_steps(self):
        puzzle = Puzzle('123312200')
        solver = Solver(puzzle)
        solved = solver.solve()
        assert solved.game == '123312231'

That test passes as well! OK, now I’ll give it the test I originally intended for 33 with this:

1 0 3
0 0 0
0 0 1

Leading to this:

1 2 3
3 1 2
2 3 1

The test is this:

    def test_full_3x3(self):
        puzzle = Puzzle('10300001')
        solver = Solver(puzzle)
        solved = solver.solve()
        assert solved.game == '123312231'

And it fails:

Expected :'123312231'
Actual   :'12323131'

Darn. My hopes were up, now they are dashed. Let’s simplify the input here until the first failure.

‘123312001’ passes. ‘123310001’ passes. ‘123300001’ passes. ‘123000001’ passes. ‘103000001’ passes.

Duh! My initial test only had 8 numbers in the input, not the requisite 9. It works!

Commit: Solves full 3x3.

I was so ready to have the code be wrong that I didn’t double-check the test. This does answer the question, however, of what happens in TDD when the test is wrong. The answer is, you work until all the tests pass, confident that wrong code or mistaken test will be caught somewhere.

Let’s review the Solver with an eye to making it work on 9x9.

class Solver:
    def __init__(self, puzzle):
        self.puzzle = puzzle

    def solve(self):
        position = self.puzzle.first_unsolved_position()
        if position == -1:
            return self.puzzle  # it's solved
        avails = self.puzzle.find_available_values(position)
        # print(f'\nin {self.puzzle.game=} {position=} {avails=}')
        if not avails:
            return None
        for guess in avails:
            puzzle = Puzzle.trying(self.puzzle, guess, position)
            # print(f'trying {guess=} {position=} = {puzzle.game}')
            solver = Solver(puzzle)
            solved = solver.solve()
            if solved is not None:
                return solved
        return None

There’s nothing about 3 vs 9 here. All the solver does is check to see if it has been given a solved puzzle, in which case it returns the puzzle. If not, it gets a list of values that can go into the current cell. If there are none, it returns None, signifying a failure and a need to try again on a prior cell. If there are values to try, it tries them repeatedly, creating a new Solver, passing it the puzzle with the current level’s current guess filled in. If that solver returns other than None, the problem was solved and we return the solved puzzle. Otherwise we loop. If the loop terminates, none of our guesses worked, and we return None so that an earlier cell gets retried.

I recall some confusion about what the solver returns from solve: it returns a Puzzle, or None. Let’s type-hint the return ad rename a temp:

    def solve(self) -> Puzzle | None:
        position = self.puzzle.first_unsolved_position()
        if position == -1:
            return self.puzzle  # it's solved
        avails = self.puzzle.find_available_values(position)
        # print(f'\nin {self.puzzle.game=} {position=} {avails=}')
        if not avails:
            return None
        for guess in avails:
            puzzle = Puzzle.trying(self.puzzle, guess, position)
            # print(f'trying {guess=} {position=} = {puzzle.game}')
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle is not None:
                return solved_puzzle
        return None

I think that’s better. Commit: tidying.

Anyway, that looks good to me. Let’s see what Puzzle is doing that needs to be changed to handle 9x9. I know there are some things, not least that we don’t check the sub-grids.

class Puzzle:
    @classmethod
    def trying(cls, puzzle, guess, position):
        old = puzzle.game
        new = old[:position] + guess + old[position+1:]
        return cls(new)

    def __init__(self, a_game_string):
        self.game = a_game_string

    def used_numbers_in_row(self, position):
        start = position // 3 * 3
        used_in_row = self.game[start:start + 3]
        return [c for c in used_in_row if c != '0']

    def used_numbers_in_column(self, position):
        start = position % 3
        used_in_column = self.game[start:len(self.game):3]
        return [c for c in used_in_column if c != '0']

    def find_available_values(self, position):
        row = self.used_numbers_in_row(position)
        column = self.used_numbers_in_column(position)
        all_available = '123'
        return [c for c in all_available if c not in row and c not in column]

    def first_unsolved_position(self):
        try:
            return self.game.index('0')
        except ValueError:
            return -1

Quite a bit to take in there. Let’s begin in the __init__, checking the input size and setting up a member for length, which we use a lot, and for all available values.

    def __init__(self, a_game_string):
        self.game = a_game_string
        if len(a_game_string) == 9:
            self.available_values = '123'
            self.line_size = 3
        elif len(a_game_string) == 81:
            self.available_values = '123456789'
            self.line_size = 9
        else:
            raise ValueError('problem must have length 9 or 81')

Now we want to change:

    def used_numbers_in_row(self, position):
        start = position // 3 * 3
        used_in_row = self.game[start:start + 3]
        return [c for c in used_in_row if c != '0']

I’d be a fool to change this code, and the other clever bits, without tests. Let’s add a test for the utility functions, starting with this one.

    def test_used_in_row(self):
        puzzle_lines = [
            '123456780',
            '023456789',
            '123400789',
            '123456789',
            '123456789',
            '123456789',
            '123456789',
            '123456789',
            '123456789',
        ]
        puzzle = Puzzle.from_list(puzzle_lines)
        used = puzzle.used_numbers_in_row(4)
        assert used == ['1', '2' '3' '4', '5', '6', '7', '8']

I figured there was no way I could type an 81 character string correctly, so I posited a new class method from_list:

Urrk! Somehow either the tests didn’t run or I didn’t notice. My new creation method is failing. Ah, they were all actually passing in short test string. Fixed, not worth reading about.

In my from_list, I did the join wrong. I always do.

With that fixed I get this error:

Expected :['1', '2', '3', '4', '5', '6', '7', '8']
Actual   :['4', '5', '6']

I’m not going to figure that out, I’m just going to adjust the method:

    def used_numbers_in_row(self, position):
        start = position // self.line_size * self.line_size
        used_in_row = self.game[start:start + self.line_size]
        return [c for c in used_in_row if c != '0']

That passes. I’ll add some lines to the test, which is why I left other holes in the input:

    def test_used_in_row(self):
        puzzle_lines = [
            '123456780',
            '023456789',
            '123400789',
            '123456789',
            '123456789',
            '123456789',
            '123456789',
            '123456789',
            '123456789',
        ]
        puzzle = Puzzle.from_list(puzzle_lines)
        used = puzzle.used_numbers_in_row(4)
        assert used == ['1', '2', '3', '4', '5', '6', '7', '8']
        used = puzzle.used_numbers_in_row(13)
        assert used == ['2', '3', '4', '5', '6', '7', '8', '9']
        used = puzzle.used_numbers_in_row(22)
        assert used == ['1', '2', '3', '4', '7', '8', '9']

Green. Commit: updated used_numbers_in_row fox 9x9.

OK, now for the column one:

    def used_numbers_in_column(self, position):
        start = position % 3
        used_in_column = self.game[start:len(self.game):3]
        return [c for c in used_in_column if c != '0']

Here again, let’s get a failing test, make it work, then make the test harder to be sure we’re on the right track.

    def test_used_in_column(self):
        puzzle_lines = [
            '023456789',
            '103456798',
            '203456987',
            '345678910',
            '423456789',
            '523456789',
            '623456789',
            '723456789',
            '823456789',
        ]
        puzzle = Puzzle.from_list(puzzle_lines)
        used = puzzle.used_numbers_in_column(9)  # first column
        assert used == ['1', '2', '3', '4', '5', '6', '7', '8']

I think that’s right. Fails anyway, which is a good sign. The failure print is awful Anyway let’s make it work, if we can …

    def used_numbers_in_column(self, position):
        start = position % self.line_size
        used_in_column = self.game[start:len(self.game):self.line_size]
        return [c for c in used_in_column if c != '0']

Test passes. Add a couple more checks and I changed the input for better checking:

    def test_used_in_column(self):
        puzzle_lines = [
            '023456789',
            '103456798',
            '203456987',
            '345678910',
            '453456785',
            '563456784',
            '673456783',
            '783456782',
            '893456781',
        ]
        puzzle = Puzzle.from_list(puzzle_lines)
        used = puzzle.used_numbers_in_column(9)  # first column
        assert used == ['1', '2', '3', '4', '5', '6', '7', '8']
        used = puzzle.used_numbers_in_column(19)  # second column
        assert used == ['2', '4', '5', '6', '7', '8', '9']
        used = puzzle.used_numbers_in_column(26)  # last column
        assert used == ['9', '8', '7', '5', '4', '3', '2', '1']

All passing. Commit: updated used_numbers_in_column for 9x9.

I think we’re left with a slightly larger problem now, the checking of the sub-grids. I am tired, there is something in my eye, and this article is long enough. Let me sum up.

Summary

Once I gave the Solver a well-formed problem, it solved a full 3x3 Sudoku without change. We are left with the conversion of Solver and Puzzle to support 9x9. All the changes appear to be in Puzzle, adjusting for the longer rows and columns—now done—and adding in the sub-grid check. We have a free-standing test for that, so it should be fairly simple.

My tests are bearing a lot of weight here. The code skips around in the puzzle string, and slices out bits allowing for changes, and so on. All those tricky bits are covered by tests, and that means that my confidence in the solver code is high. Furthermore, I imagine that once this works, we may want to refactor to make it nicer or more clever or something, and the tests will give us confidence that all is well.

I am pleased so far, and I continue to believe that I will shortly prove that tDD and XP are not the cause of all the evil in the universe.

See you next time!