Sudoku: Haskell? Functional? (r, c, v)?
Tomas has been nerd-sniped and is solving Sudoku with Haskell. I nerd-snipe myself, thinking about Haskell, immutability, and updating. Some learning, almost no progress.
Haskell is a language that I find quite strange, and it’s one of those languages that is enough unlike the C, Java, Python C-like family that I can’t just read it and get the drift. Haskell is functional, and includes pattern matching and list comprehensions as first-class notions. If I have any experience with it, it was in a pairing sessions two decades ago, and come to think of it, the session I have in mind might have been with Scala. Hm. Anyway, Tomas is doing Sudoku in Haskell and his code, and our mastodon conversations, have given me to think.
The trouble is, when I think about what I might do, inspired by that code and those interactions, I can’t quite remember enough details about my own code, right here, that I was working on less than 24 hours ago. What I do know is that the Python code has evolved, and some of the older code is almost certainly more ad hoc and less coherent than the newer. In short, we may be building upon sand, and I am told that building on code sand is fraught. Here’s what’s troubling me, bunkie …
The core of our implementation is in the Puzzle / Component / Candidates space, where those three classes among them represent the state of a puzzle. I’m not clear on:
- Is the Puzzle supposed to be immutable, or not? When we decide on a value in a cell, do we update the puzzle we have, or create a new instance. Or, worst case, do we do it one way sometimes and another way other times?
- Is a Component a pure view of a Puzzle, or is it a copy of some information about the Puzzle? If the Puzzle did change (see above), would the Component reflect that change?
- Is a Component a list of lists, or a Candidates of lists, or what? What should it be? If it’s a cover for Candidates, should Component class exist at all?
- I’m pretty sure that we recompute the lists / Candidates / pointers every time we ask for a Component. Would it be better to cache them in the Puzzle?
- … and probably more …
To The Code, Robin!
- Note
- You deserve to know that aside from a few simple but useful refactoring bits, all I manage to accomplish here is to discover that we need something and that, so far, I don’t know how to provide it.
-
Feel free, nay encouraged, to skim.
My guess: it’s a bit of a mess. Let’s review and see if we want to clean anything up before we work on our next Technique.
Puzzle is pretty long. Let’s look at some key aspects.
class Puzzle:
def __init__(self, a_game_string):
Puzzle.puzzle_count += 1
if len(a_game_string) == 9:
self.available_values = '123'
self.line_size = 3
elif len(a_game_string) == 81:
self.available_values = '123456789'
self.line_size = 9
else:
raise ValueError('problem must have length 9 or 81')
self.candidates = Candidates(a_game_string)
self.game = a_game_string
self.guess_position = self.first_unsolved_position()
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position=None):
if position is None:
position = puzzle.guess_position
old = puzzle.game
new = old[:position] + guess + old[position+1:]
return cls(new)
There are two things that this much addresses: what the Puzzle knows, and what happens when we try a value in a cell / position.
When we create a puzzle, we are recomputing its candidates. And when we make a guess, we create a new puzzle. That means we create new candidates, and that, in turn, means that any logic we have applied to the candidates in an existing puzzle is lost. If we have used NakedPairs to eliminate some possibilities, we cannot use new_puzzle_trying_guess
without losing that information.
And guess what! Here’s what we do if we discover that some cell has only one possible value:
class SudokuTechniques:
def apply(self):
self.only_one_possible_value()
self.only_one_possible_position()
return self.puzzle
def only_one_possible_value(self):
changed = True
while changed:
changed = False
for cell in self.puzzle.unknown_cells():
available = self.puzzle.possible_answers(cell)
if len(available) == 1:
SudokuTechniques.only_one += 1
self.puzzle = Puzzle.new_puzzle_trying_guess(self.puzzle, available[0], cell)
changed = True
If clever Sudoku reasoning had eliminated some possible candidates in some possible cells, we are throwing that information away. This is not as good as it might be.
Honestly, that won’t do. I’m going to belay the larger-scale study listed above, and fix that issue right here and now. We’ll write a test, show the problem happening, then fix it.
def test_does_not_lose_info(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(1, ['7', '8'])
assert row.candidates_at(1) == ['7', '8']
new_puzzle = Puzzle.new_puzzle_trying_guess(puzzle, '1', 0)
new_row = Component.row(new_puzzle, 0)
assert new_row.candidates_at(1) == ['7', '8']
As expected, this fails. It should be failing with the candidates 2-9. Yes:
Expected :['7', '8']
Actual :['2', '3', '4', '5', '6', '7', '8', '9']
We have the test that shows that we don’t have what we wanted. Now how can we get it?
Our init works like this:
class Puzzle:
def __init__(self, a_game_string):
...
self.candidates = Candidates(a_game_string)
self.game = a_game_string
self.guess_position = self.first_unsolved_position()
And, again, the guessing method:
class Puzzle:
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position=None):
if position is None:
position = puzzle.guess_position
old = puzzle.game
new = old[:position] + guess + old[position+1:]
return cls(new)
Let’s pass in the old puzzle’s candidates and use them if they are there. And let’s have a rename while we’re at it.
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position=None):
if position is None:
position = puzzle.guess_position
old_string = puzzle.game
new_string = old_string[:position] + guess + old_string[position+1:]
return cls(new_string, puzzle.candidates)
And …
def __init__(self, a_game_string, candidates=None):
Puzzle.puzzle_count += 1
if len(a_game_string) == 9:
self.available_values = '123'
self.line_size = 3
elif len(a_game_string) == 81:
self.available_values = '123456789'
self.line_size = 9
else:
raise ValueError('problem must have length 9 or 81')
self.candidates = candidates if candidates else Candidates(a_game_string)
self.game = a_game_string
self.guess_position = self.first_unsolved_position()
I really thought that was going to work. In fact, two tests fail. I wish I had committed my new test, it would make rolling back a bit easier. But I cannot resist the temptation of looking. Besides, it might be useful information.
One test that fails is a 3x3 test on the full solver. It complains thus:
Expected :'123312231'
Actual :'123212221'
Yeah, well, that isn’t even a solution, what’s up with that? Oh, I know: we can’t just pass in our old Candidates: we need at least to update them to remove the new guess. (We should rename guess
as well, since our Techniques are not guessing, they know.)
I can think I can create a hack that can fix this, let me try it just so that we can hate on it.
This is getting to be a crock faster than I thought. So far I have this:
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position=None):
if position is None:
position = puzzle.guess_position
old_string = puzzle.game
new_string = old_string[:position] + guess + old_string[position+1:]
new_puzzle = cls(new_string, puzzle.candidates)
# update new puzzle candidates
return new_puzzle
I am hating this sooner and more than I thought, but I want to push it through, not to keep it, but to help me understand enough to—I fondly hope—come up with a better idea.
First, “by intention” (wishful thinking):
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position=None):
if position is None:
position = puzzle.guess_position
old_string = puzzle.game
new_string = old_string[:position] + guess + old_string[position+1:]
new_puzzle = cls(new_string, puzzle.candidates)
new_puzzle.update_candidates()
return new_puzzle
I think we’ll be able to move that but we’ll still hate it. Anyway:
No. In fact let’s move it now.
class Puzzle:
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position=None):
if position is None:
position = puzzle.guess_position
old_string = puzzle.game
new_string = old_string[:position] + guess + old_string[position+1:]
return cls(new_string, puzzle.candidates)
def __init__(self, a_game_string, candidates=None):
Puzzle.puzzle_count += 1
if len(a_game_string) == 9:
self.available_values = '123'
self.line_size = 3
elif len(a_game_string) == 81:
self.available_values = '123456789'
self.line_size = 9
else:
raise ValueError('problem must have length 9 or 81')
if candidates:
self.candidates = candidates
self.update_candidates()
else:
self.candidates = Candidates(a_game_string)
self.game = a_game_string
self.guess_position = self.first_unsolved_position()
Still wishful thinking of course. But don’t we have a method that does this somewhere?
def __init__(self, a_game_string, candidates=None):
Puzzle.puzzle_count += 1
if len(a_game_string) == 9:
self.available_values = '123'
self.line_size = 3
elif len(a_game_string) == 81:
self.available_values = '123456789'
self.line_size = 9
else:
raise ValueError('problem must have length 9 or 81')
if candidates:
self.candidates = candidates
self.candidates.fill_in_solved(a_game_string) # <===
else:
self.candidates = Candidates(a_game_string)
self.game = a_game_string
self.guess_position = self.first_unsolved_position()
This passes my new test but is still failing the 3x3. What does fill_in_solved
do?
- Note
- I know I am on a failed path and should roll back. I’m struggling, partly with sunk cost, partly because I feel that if I can make it work I can make it better, and partly because the objects aren’t really helping. It’s that last bit that really needs addressing. As I’ve been saying for days now.
-
I will roll back soon, but not yet. Someone should reach over my shoulder and just do it.
def fill_in_solved(self, a_game_string):
for cell_position, cell_value in enumerate(a_game_string):
if cell_value != '0':
impacted_positions = Component.all_impacted_positions(self, cell_position)
for impacted_position in impacted_positions:
self.candidates[impacted_position] = \
[value for value in self.candidates[impacted_position]
if value != cell_value]
self.candidates[cell_position] = [cell_value]
Why are we setting candidates[cell_position]? There are no viable candidates there. We should clear that cell, or do nothing. What if we clear i?
I try that and doing nothing. Two tests fail including the trivial 3x3. Roll back, keeping the failing test.
We are back to our original test failing:
def test_does_not_lose_info(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(1, ['7', '8'])
assert row.candidates_at(1) == ['7', '8']
new_puzzle = Puzzle.new_puzzle_trying_guess(puzzle, '1', 0)
new_row = Component.row(new_puzzle, 0)
assert new_row.candidates_at(1) == ['7', '8']
Three hundred lines in, and my editor’s lines are often whole paragraphs. Fortunately, I generally write short paragraphs. Still 300 of the things and we do not have a viable solution.
It might be time to sit back and think, since just doing the obvious broke things.
Why did the little 3x3 break? It broke in two different ways. Early on, it was breaking because the candidates, having been copied instead of reconstituted, included a value that had been used already, and it got reused, returning a solution that was not correct, with lots of duplicated values.
Ah. And later it failed for want of finding a solution. Probably the same issue, just manifesting differently, although that is just a guess. Occam’s Razor or something.
Let’s begin with a simple thing, renaming this method:
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position=None):
if position is None:
position = puzzle.guess_position
old = puzzle.game
new = old[:position] + guess + old[position+1:]
return cls(new)
Let’s be more positive here. We are creating a new puzzle, a sub-puzzle if you will, with one new cell filled in. Why? Well, sometimes we’re guessing, but other times, with our Techniques, we know the value.
And what about that odd default, where we do not provide a position? Is that used outside tests? It is only used outside tests and it’s used here:
class Puzzle:
def find_next_puzzles(self):
guesses = self.possible_answers(self.guess_position)
Puzzle.guess_count += len(guesses)
puzzles = (Puzzle.new_puzzle_trying_guess(self, guess) for guess in guesses)
return puzzles
We could certainly be providing the position here. And why do we set that value, guess_position
, as a member variable? Who else cares?
First, let’s make it required, since this is the only place that relies on it being optional.
def find_next_puzzles(self):
position = self.guess_position
guesses = self.possible_answers(position)
Puzzle.guess_count += len(guesses)
puzzles = (Puzzle.new_puzzle_trying_guess(self, guess, position) for guess in guesses)
return puzzles
@classmethod
def new_puzzle_trying_guess(cls, puzzle, guess, position):
old_string = puzzle.game
new_string = old_string[:position] + guess + old_string[position+1:]
return cls(new_string)
We’re green, because I set the new test to skip. I want to commit some stuff and would prefer nto to commit a broken test. Commit: refactoring to remove optional position.
Now let’s work to remove the guess_position member. Here’s how it’s set:
def __init__(self, a_game_string):
Puzzle.puzzle_count += 1
if len(a_game_string) == 9:
self.available_values = '123'
self.line_size = 3
elif len(a_game_string) == 81:
self.available_values = '123456789'
self.line_size = 9
else:
raise ValueError('problem must have length 9 or 81')
self.candidates = Candidates(a_game_string)
self.game = a_game_string
self.guess_position = self.first_unsolved_position()
Let’s see, what is a nice error-free way to do this?
Let’s remove that line, replacing thus:
def __init__(self, a_game_string):
Puzzle.puzzle_count += 1
if len(a_game_string) == 9:
self.available_values = '123'
self.line_size = 3
elif len(a_game_string) == 81:
self.available_values = '123456789'
self.line_size = 9
else:
raise ValueError('problem must have length 9 or 81')
self.candidates = Candidates(a_game_string)
self.game = a_game_string
@property
def guess_position(self):
return self.first_unsolved_position()
Green. Now inline guess_position. Bummer, can’t inline methods with decorators. That would have been so nice. OK find the senders and do it the hard way.
There are only two. Replaced. Green. Commit: remove guess_position member.
Where was I? Oh, right, wanting to rename new_puzzle_trying_guess
. And to implement it too.
You know what? I’m going to stop here and start a new article. A clean sheet of virtual paper might be just what I need.
Summary
I came in here because I wasn’t clear on how things work, in the light of ideas about immutability and pattern matching and the general feeling that something isn’t quite right.
We have a test that shows that when we evolve the puzzle (Hey! That’s the word! Evolve!) when, as I was saying, we evolve the puzzle, we are not properly carrying knowledge forward in the Candidates. If we have eliminated any candidates via Techniques, we lose that information. We need not to do that.
On the one hand, I think we need to make the Puzzle essentially immutable, but on the other hand, I think our Techniques want to modify the Candidates but generally not to make a cell assignment and thus evolve the puzzle. On the gripping hand, it may be that we should make a sort of deep copy operation for use by the brute solver and let our Techniques modify things.
I think immutable is better. I just don’t see quite how to do it.
Thus, I want to start on a clean page. See you there!