Sudoku: An Idea
This thing just keeps on giving. I want to check something out. Result surprises me and I’ll need to do some more thinking. Got good ideas? Let me know!
I had a delicious lie-in this morning, almost 10 hours of restful sleep with a slow easy semi-musing wakeup at the end. Nice. The musing was based on the approach to Sudoku taken by my Zoom friend, whom I’ll call Bryan, or B for short. B is approaching Sudoku with code, but holding off on solving as long as he can, writing instead various things he thinks will be useful, including a full validator that checks a puzzle to see if it is solved.
When B does get around to solving, his plan is to solve, not by a brute force search like I’ve done, but instead to apply the analysis and heuristics that he uses when he solves them by hand. His intention, I think, is never to have to guess at a value, and therefore never to have to backtrack. I am certain of one thing and suspect another.
I am certain that there exist Sudoku puzzles with more than one possible solution. If that’s true, then it seems to me that no matter how much reasoning you do, there will come a time when there is a selection of possibilities for some cell, with no reason to choose one over the others, because any of them will work. This will come about, it seems to me, with a selection of cells and integers such that we can swap those values around and the puzzle will still be solved.
I suspect that in published Sudoku puzzles, this ambiguity is considered to be a bad thing, because we want there to be a unique solution to which one can, in principle, reason. If I’m right about both of these things, B’s approach will work, if he knows enough heuristics.
And it made me wonder about something else. If a puzzle can be solved without guessing, doesn’t that mean that if we listed all the possible values for all the open cells, there must be at least one cell that only has one possibility?
This morning, I plan to explore that question.
I propose to take my “hard” puzzle, and compute the available values for each cell, and then see if any of them have only one available value. What happens after that depends on the answer to that question.
def test_all_available(self):
puzzle = Puzzle.from_list(self.puzzle_list)
available_values = [len(puzzle.possible_answers(position)) for position in range(81)]
best_guess = min(available_values)
for p in range(81):
if available_values[p] == best_guess:
print(f'{p=} {puzzle.possible_answers(p)=}')
assert best_guess == 0
This must of course fail. And it prints:
p=71 puzzle.possible_answers(p)=['1', '6']
I think we should refine the available_values to skip the defined results. I expect this to make no difference.
def test_all_available(self):
puzzle = Puzzle.from_list(self.puzzle_list)
available_values = [len(puzzle.possible_answers(position))
for position in range(81)
if puzzle.game[position] == '0']
best_guess = min(available_values)
for p in range(81):
if available_values[p] == best_guess:
print(f'{p=} {puzzle.possible_answers(p)=}')
assert best_guess == 0
I do not get the same answer. I get:
p=53 puzzle.possible_answers(p)=['4', '7', '8', '9']
OK, there are issues here. First of all, the new formulation will not get 81 answers: it’ll skip the defined values entirely. Second, why did it get a different minimum? Oh … that’s because I recompute the list and the indices are wrong.
I’m somewhat surprised to find that there is guessing involved in this puzzle. Let me reformulate the test: I think I’d like to explore further.
def test_all_available(self):
puzzle = Puzzle.from_list(self.puzzle_list)
available_values = [puzzle.possible_answers(position)
for position in range(81)]
lengths = [len(values) for values in available_values]
best_guess = min(lengths)
for p in range(81):
if lengths[p] == best_guess:
print(f'{p=} {puzzle.possible_answers(p)=}')
assert best_guess == 0
Still getting the same answer as before:
p=71 puzzle.possible_answers(p)=['1', '6']
I decide to pretty-print to get a picture of what happened.
Original Puzzle:
| 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ||
| 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 3 | ||
| 0 | 7 | 4 | 0 | 8 | 0 | 0 | 0 | 0 | ||
| 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 2 | ||
| 0 | 8 | 0 | 0 | 4 | 0 | 0 | 1 | 0 | ||
| 6 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | ||
| 0 | 0 | 0 | 0 | 1 | 0 | 7 | 8 | 0 | ||
| 5 | 0 | 0 | 0 | 0 | 9 | 0 | 0 | 0 | ||
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 |
Number of Available Values
| 4 | 5 | 6 | 5 | 4 | 4 | 6 | 4 | 7 | ||
| 3 | 3 | 4 | 5 | 4 | 5 | 6 | 4 | 5 | ||
| 3 | 5 | 5 | 4 | 5 | 3 | 5 | 4 | 4 | ||
| 4 | 4 | 4 | 4 | 3 | 5 | 5 | 4 | 5 | ||
| 4 | 5 | 5 | 3 | 5 | 3 | 4 | 5 | 4 | ||
| 5 | 4 | 5 | 5 | 3 | 4 | 4 | 3 | 4 | ||
| 4 | 4 | 4 | 3 | 5 | 4 | 5 | 5 | 3 | ||
| 5 | 4 | 6 | 5 | 4 | 5 | 4 | 3 | 2 | ||
| 6 | 4 | 7 | 4 | 5 | 5 | 6 | 5 | 4 |
Available Values
| 1389 | 2 | 135689 | 13479 | 3579 | 1457 | 145689 | 5679 | 1456789 | ||
| 189 | 159 | 1589 | 6 | 2579 | 12457 | 124589 | 2579 | 3 | ||
| 139 | 7 | 4 | 1239 | 8 | 125 | 12569 | 2569 | 1569 | ||
| 1479 | 1459 | 1579 | 1789 | 679 | 3 | 45689 | 5679 | 2 | ||
| 2379 | 8 | 23579 | 279 | 4 | 267 | 3569 | 1 | 5679 | ||
| 6 | 1349 | 12379 | 5 | 279 | 1278 | 3489 | 379 | 4789 | ||
| 2349 | 3469 | 2369 | 234 | 1 | 2456 | 7 | 8 | 569 | ||
| 5 | 1346 | 123678 | 23478 | 2367 | 9 | 1236 | 236 | 16 | ||
| 123789 | 1369 | 1236789 | 2378 | 23567 | 25678 | 123569 | 4 | 1569 |
A bit of inspection convinces me that my numbers are probably correct. We can check this result another way. If we were to provide two versions of the above puzzle, one with a one provided in position 71 and one with a 6, and both are solvable, the puzzle is convicted of being ambiguous. Here goes.
def test_is_ambiguous(self):
puzzle = Puzzle.from_list(self.puzzle_list)
puzzle_1 = Puzzle.new_puzzle_trying_guess(puzzle, '1', 71)
solved_1 = Solver(puzzle_1).solve()
assert solved_1.is_correctly_solved
puzzle_6 = Puzzle.new_puzzle_trying_guess(puzzle, '6', 71)
solved_6 = Solver(puzzle_6).solve()
assert solved_6 is None
Note the check for None at the end: the version of the puzzle where we guess ‘6’ does not solve. I wonder why. Let’s see if we can instrument the solver a bit more.
No … that’s not easy, since the solver works from top to bottom left to right, there are a lot of fails and retries before we get any information. I’ll have to think about that.
What I’m left with is mostly surprise. What other constraints are there on those cells such that checking for available values makes us think that 6 would work, but in fact it won’t? Or … it’s always possible that I have a flaw in my thinking or my code.
How could this result be true, that is, that the basic check for what’s possible comes up with one and six, but six won’t work? Well, the solution that we do find is this:
| 1 | 2 | 6 | 4 | 3 | 7 | 9 | 5 | 8 | ||
| 8 | 9 | 5 | 6 | 2 | 1 | 4 | 7 | 3 | ||
| 3 | 7 | 4 | 9 | 8 | 5 | 1 | 2 | 6 | ||
| 4 | 5 | 7 | 1 | 9 | 3 | 8 | 6 | 2 | ||
| 9 | 8 | 3 | 2 | 4 | 6 | 5 | 1 | 7 | ||
| 6 | 1 | 2 | 5 | 7 | 8 | 3 | 9 | 4 | ||
| 2 | 6 | 9 | 3 | 1 | 4 | 7 | 8 | 5 | ||
| 5 | 4 | 8 | 7 | 6 | 9 | 2 | 3 | 1 | ||
| 7 | 3 | 1 | 8 | 5 | 2 | 6 | 4 | 9 |
We see that the cells where the sixes finally showed up had plenty of options at the beginning:
- Column: 1569
- Row: 1235679
- Sub-Grid: 1235679
So there’s nothing leaping out at me to say why the 6 won’t work, but I think we know that it doesn’t.
Must reflect. Odd. Needs some research and thought. If you have insights, please let me know!