Let’s review the code and the “project”, to see what, if anything, we can conclude.

The Project So Far

It seems that we have built a Sudoku solver that works via a stack-based brute-force method whose only “optimization” is that it never tries a number that breaks the rules of Sudoku. Other than that, it just processed from top to bottom left to right, trying values until it either succeeds or runs out of ideas.

This is only the fifth article in the series, so it is pretty safe to estimate that, counting writing articles, testing, and coding, the “project” up until these words, has taken about eight or ten hours of work. Probably you could do better. Surely someone could. I hope they write some contemporaneous articles while they do it, and that I get a chance to read them. I might learn something, and it will surely be interesting.

We “warmed up” by writing some tests and code to manipulate the puzzle, choosing a simple string of 81 characters 0-9 to represent the puzzle’s [partial] solution, and performing some magical arithmetic to find the row, column and/or sub-grid corresponding to any given position 0-80 in that string.

By then, if I recall the sequence of things, we had a general idea of what we wanted the solver to do, namely work from left to right in the input, finding the next ‘0’ character, representing an empty cell in the puzzle, then try each possible value one after another, and then recursively trying to assign the subsequent cell, rinse, repeat until all the cells are filled. If that happens, the puzzle is solved, and if the first level runs out of guesses, the puzzle cannot be solved.

We were working by that point with 3x3 puzzles, which were sufficient to exercise the code and small enough that we had a chance of figuring out what was happening.

Right about there, I somehow, more or less magically, typed in an attempted solve method, which looks like this:

    def solve(self) -> Puzzle | None:
        position = self.puzzle.first_unsolved_position()
        if position == -1:
            return self.puzzle  # it's solved
        avails = self.puzzle.find_available_values(position)
        if not avails:
            return None
        for guess in avails:
            puzzle = Puzzle.trying(self.puzzle, guess, position)
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle is not None:
                return solved_puzzle
        return None

Let’s see if we can explain that a bit while we’re here. We should also consider making the code more clear as part of finishing up this series.

  1. We search the puzzle for an unsolved position, which will be a ‘0’ character in the string, although the solver just knows whether we found a position or not.

  2. We ask the puzzle for the values available to guess at this position. These will be all the characters from ‘1’ to ‘9’ that do not occur in the position’s column, row, or sub-grid.

  3. If there are no available values,, this is a dead end in an attempted solution, and we return None, indicating a failure at our level. Since this is a recursive algorithm, as we’ll see in a moment, if the first level runs out of available numbers, the puzzle cannot be solved and the whole thing returns None.

  4. If there are available values to guess, we loop through them. For each one, we create a new Puzzle with our guess plugged in, create a new Solver for that puzzle, and call its solve method. This is the recursion step of the Solver. If that call to solve returns other than None, it has returned a solved Puzzle and we return that as our result. Otherwise we loop, trying our next guess.

  5. If we run out of guesses, we return None, signifying that given what has gone before, we cannot solve. If this happens at the first level, the initial call to solve returns None, signifying no solution can be found.

In short:

If the puzzle is not solved, we find the next position needing a guess, and we make repeated guesses at that position, then recursively trying to solve the next position, until either we get a solved puzzle or run out of guesses.

My recollection is that I got that method pretty close to right immediately, You can review the tapes if you wish.

With that correct solve method in hand, it came down to some testing of 3x3 and then a move to 9x9. With 9x9, we needed the sub-grid, which does not exist in 3x3, so a digression to code that up was followed by typing in a 9x9 sudoku and seeing if we could solve it.

We could, although I think I got a different answer for the first one. Quite often, Sudoku layouts can have alternate solutions, usually where two values can be swapped without harm, or, I think, sometimes larger numbers of values can change. It is also entirely possible that my initial 9x9 test was incorrectly transcribed: I did find that to be the case in the second one I tried. Even after carefully looking at it and declaring it to be correct, I found, when I typed it in again from scratch, that I had made an error in the input. The result was a valid solution to the puzzle I had provided, but of course it didn’t match the published one, because I wasn’t solving the same problem!

All the input errors aside, the solve method was good and the project can be declared to be a success.

Now Then

Let’s examine our code to see if we can make it more clear. I’m not really concerned about solve, though we’ll see if we can make it more clear, but my larger concern is the code that finds the row, column, or sub-grid values, because the subscript calculations are far from obvious. We’ll get to those today or tomorrow depending on how things go. Probably today.

But first, Solver:

class Solver:
    solver_count = 0

    def __init__(self, puzzle):
        Solver.solver_count += 1
        self.puzzle = puzzle

    def solve(self) -> Puzzle | None:
        position = self.puzzle.first_unsolved_position()
        if position == -1:
            return self.puzzle  # it's solved
        avails = self.puzzle.find_available_values(position)
        if not avails:
            return None
        for guess in avails:
            puzzle = Puzzle.trying(self.puzzle, guess, position)
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle is not None:
                return solved_puzzle
        return None

The solver_count is just there to count how many solvers are created, because I was curious. We can ignore that or remove it as we wish. Let’s focus on solve.

It seems to me that Puzzle could be a bit more helpful here. We ask it for the next available position and then for the values at that position and then, later, plug in a guess at that position. What if Puzzle class were more helpful? We could write something like this:

    def solve(self) -> Puzzle | None:
        if self.puzzle.is_solved:
            return self.puzzle
        guesses = self.puzzle.find_available_guesses()
        for guess in guesses:
            puzzle = Puzzle.trying(self.puzzle, guess)
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle is not None:
                return solved_puzzle
        return None

That would be better, I think. Let’s refactor toward that.

Let’s have a look at Puzzle and see if we can make it more helpful:

class Puzzle:
    @classmethod
    def trying(cls, puzzle, guess, position):

    @classmethod
    def from_list(cls, a_list):

    def __init__(self, a_game_string):
        self.game = a_game_string
        if len(a_game_string) == 9:
            self.available_values = '123'
            self.line_size = 3
        elif len(a_game_string) == 81:
            self.available_values = '123456789'
            self.line_size = 9
        else:
            raise ValueError('problem must have length 9 or 81')

    def used_numbers_in_row(self, position):

    def used_numbers_in_column(self, position):

    def used_numbers_in_sub_grid(self, position):


    def find_available_values(self, position):
        row = self.used_numbers_in_row(position)
        column = self.used_numbers_in_column(position)
        sub_grid = self.used_numbers_in_sub_grid(position) if self.line_size == 9 else []
        return [c for c in self.available_values if c not in row and c not in column and c not in sub_grid]

    def first_unsolved_position(self):
        try:
            return self.game.index('0')
        except ValueError:
            return -1

I’ve folded the methods that I think we can ignore for now, though we do want to improve them later.

What if the Puzzle were to compute the first available position upon creation and remember it, then use it instead of the position parameter in find... and trying? Let’s just try that: if the tests run, it’s a good idea.

    def __init__(self, a_game_string):
        self.game = a_game_string
        if len(a_game_string) == 9:
            self.available_values = '123'
            self.line_size = 3
        elif len(a_game_string) == 81:
            self.available_values = '123456789'
            self.line_size = 9
        else:
            raise ValueError('problem must have length 9 or 81')
        self.guess_position = self.first_unsolved_position()

I gave it the name guess_position to help me avoid confusion with position, a parameter we use throughout.

Now can’t I just use it here?

    def find_available_values(self, ignored_position):
        row = self.used_numbers_in_row(self.guess_position)
        column = self.used_numbers_in_column(self.guess_position)
        sub_grid = self.used_numbers_in_sub_grid(self.guess_position) if self.line_size == 9 else []
        return [c for c in self.available_values if c not in row and c not in column and c not in sub_grid]

Not quite: two tests fail, because they are checking the actual logic of this operation. But we were going to call the new method find_available_guesses anyway, so:

    def find_available_guesses(self):
        return self.find_available_values(self.guess_position)

And use it:

    def solve(self) -> Puzzle | None:
        position = self.puzzle.first_unsolved_position()
        if position == -1:
            return self.puzzle  # it's solved
        avails = self.puzzle.find_available_guesses()
        if not avails:
            return None
        for guess in avails:
            puzzle = Puzzle.trying(self.puzzle, guess, position)
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle is not None:
                return solved_puzzle
        return None

Green. Commit: refactoring to eliminate position parameter.

Now the Puzzle can know whether it is solved:

class Puzzle:
    @property
    def is_solved(self):
        return self.guess_position == -1

class Solver:
    def solve(self) -> Puzzle | None:
        position = self.puzzle.first_unsolved_position()
        if self.puzzle.is_solved:
            return self.puzzle
        avails = self.puzzle.find_available_guesses()
        if not avails:
            return None
        for guess in avails:
            puzzle = Puzzle.trying(self.puzzle, guess, position)
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle is not None:
                return solved_puzzle
        return None

Now we’d like to change trying not to need the position. I suspect we have tests for that that may fail. We have. We’ll change this:

    @classmethod
    def trying(cls, puzzle, guess, position):
        old = puzzle.game
        new = old[:position] + guess + old[position+1:]
        return cls(new)

To this:

    @classmethod
    def trying(cls, puzzle, guess, position=None):
        if position is None:
            position = puzzle.guess_position
        old = puzzle.game
        new = old[:position] + guess + old[position+1:]
        return cls(new)

That works for all the tests, but now:

    def solve(self) -> Puzzle | None:
        if self.puzzle.is_solved:
            return self.puzzle
        avails = self.puzzle.find_available_guesses()
        if not avails:
            return None
        for guess in avails:
            puzzle = Puzzle.trying(self.puzzle, guess)
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle is not None:
                return solved_puzzle
        return None

Green. Commit: remove use of position in solve.

One more thing: avails should be renamed to guesses, and it always comes back as a list, or should, so we can do this:

    def solve(self) -> Puzzle | None:
        if self.puzzle.is_solved:
            return self.puzzle
        guesses = self.puzzle.find_available_guesses()
        for guess in guesses:
            puzzle = Puzzle.trying(self.puzzle, guess)
            solver = Solver(puzzle)
            solved_puzzle = solver.solve()
            if solved_puzzle:
                return solved_puzzle
        return None

Green. Commit: simplifying solve.

We could do some in-lining. How about this:

    def solve(self) -> Puzzle | None:
        if self.puzzle.is_solved:
            return self.puzzle
        for guess in self.puzzle.find_available_guesses():
            new_puzzle = Puzzle.new_puzzle_trying_guess(self.puzzle, guess)
            solved_puzzle = Solver(new_puzzle).solve()
            if solved_puzzle:
                return solved_puzzle
        return None

I think that’s better. We’ll probably look again later to see if we think it’s too compact, but for now I am satisfied. Green. Commit: simplifying.

Let’s take a quick look at Puzzle, with an eye to the magic subscripting. We might not change it in this session, and I’m sure we won’t finish it, because the article is long enough. So we’ll just take a look, maybe tweak, but mostly just try to prime our minds for next time.

The methods I’m concerned about are below. My concern is that the numeric calculations are far from obvious, even to me, and I wrote them.

class Puzzle:
    def used_numbers_in_column(self, position):
        start = position % self.line_size
        used_in_column = self.game[start:len(self.game):self.line_size]
        return [c for c in used_in_column if c != '0']

    def used_numbers_in_row(self, position):
        start = position // self.line_size * self.line_size
        used_in_row = self.game[start:start + self.line_size]
        return [c for c in used_in_row if c != '0']

    def used_numbers_in_sub_grid(self, position):
        first_index_in_row = position // self.line_size * self.line_size
        offset_in_row = position % 9 // 3 * 3
        first_index_in_sub_grid = first_index_in_row // 27 * 27 + offset_in_row
        # print(f'\n{position=}\n{first_index_in_row=}\n{first_index_in_sub_grid=}')
        result = []
        for row in range(first_index_in_sub_grid, first_index_in_sub_grid+27, 9):
            for col in range(0, 3):
                result.append(self.game[row+col])
        return result

The column one is about the simplest. The starting index for a column is 0-9, and is equal to the index of the item we are considering, mod nine. Then we slice every 9th item from the start to the end of the whole string.

Oh! This reminds me. Our cells can contain 0-9, zero representing an unsolved cell. The method column and row methods are eliminating any zeros. That’s not necessary (and in fact the sub_grid doesn’t do it), because when we get the available numbers we only consider one through nine anyway. So we could simplify those first two methods to just return the slice … except that we have some specific tests that don’t expect to see any zeros. Maybe we’ll consider that later.

In the calculation for row, there is the construct

start = position // self.line_size * self.line_size

What is p // x * x anyway? The construct occurs a few times in the code above, including some literal 3’s and 27’s and such that really shouldn’t be there.

I can sort of explain what that does. It sets p to the nearest even multiple of x less than p: If x is, say, 3,

p p//x*x
0 0
1 0
2 0
3 3
4 3
5 3
6 6
7 6
8 6


Like this:

p 0 1 2 3 4 5 6 7 8
p//x*x 0 0 0 3 3 3 6 6 6


Or this:

data a b c d e f g h i
cell 0 1 2 3 4 5 6 7 8
row start 0 0 0 3 3 3 6 6 6


Or this:

row start cells data
0 0 0 1 2 a b c
1 3 3 4 5 d e f
2 6 6 7 8 g h i


So if the rows are of length 3, position p’s row starts at cell p//3*3. It’s kind of like the ceil function, in a way.

In the code in Puzzle, we make use of that “fact” a number of times, selecting the row, selecting the row start for a sub-grid, and so on.

I do not find that code clear. Yes, it works. We could perhaps simplify some cases, but if we did, we wouldn’t make anything more clear.

We’ll let that concern sit for today. I see a concern and not a solution.

Summary

For today, we have simplified the Puzzle / Solver relationship and, I think, made Solver a bit more clear, and certainly simpler, at little or no cost in Puzzle.

I feel that Puzzle is a bit more complicated than it needs to be, but I don’t see yet what we might do about it. It’s surely “good enough”, or nearly so, and everything is well tested and works nicely.

I suspect that Puzzle could be more bullet-proof, but my general practice is to ensure that things work when they are used correctly and not to care what they do if they are used incorrectly, relying on tests to ensure that they are used correctly. In a sufficiently large system, we would probably want to do more error-checking and recovery than I usually do.

Our purpose here is to examine problems and solutions, and to see how clear and simple we can make things. In a live environment, our priorities would probably shift.

For now, where we are, I’m pretty well pleased, and hope that you are as well.

See you next time!