Sudoku: Better
Did you ever look at your code? I mean REALLY look at it? This is going to be good! And it is!
Most mornings, as I begin to wake up, I start thinking about the current code. I have intentionally developed that habit, because, if I just let myself ruminate randomly, my thoughts tend to turn depressing, and that’s no way to start the day. And some mornings, those code thoughts actually drive me to get up earlier than I had planned, because I think there’s some great work to be done. Today is one of those days.
- Freely Granted
- In what we jokingly refer to as “real life”, we do not often get the chance to repeatedly consider the same 40 or 50 lines of code day after day. We’re under too much pressure to write the next 40 or 50 lines. So the repeated refinement of the NakedPairsTechnique is probably not something people who work for a living get any chances to do.
- Big Payoff
- However, practice really pays off. Practice in making code clear, in recognizing what it is, in seeing better ways, in refactoring toward those better ways. The more we practice, the faster we can recognize room for improvement, and the faster we can make needed improvements. And, of course, the more we write good code, the better code we’ll write.
So, in that light, let’s look again at NakedPairsTechnique:
class NakedPairsTechnique:
def __init__(self, component):
self.component = component
def apply(self):
pair_indices = self.find_pairs_of_indices_of_naked_pairs()
self.remove_each_pair_from_other_candidates(pair_indices)
def remove_each_pair_from_other_candidates(self, naked_pair_positions: list):
for pair_positions in naked_pair_positions:
self.remove_pair_from_other_candidates(pair_positions)
def remove_pair_from_other_candidates(self, pair_positions):
other_candidate_positions = set(range(9)) - set(pair_positions)
pair_to_remove = self.component.candidates_at(pair_positions[0])
self.remove_pair_from_these_positions(pair_to_remove, other_candidate_positions)
def remove_pair_from_these_positions(self, pair_to_remove, other_candidate_positions):
for other_position in other_candidate_positions:
self.remove_pair_from_component(pair_to_remove, other_position)
def remove_pair_from_component(self, pair_to_remove, other_position):
self.component.remove_candidates_at(other_position, pair_to_remove)
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
for first_position in range(9):
candidates = self.component.candidates_at(first_position)
if len(candidates) == 2:
for second_position in range(first_position+1, 9):
second_candidates = self.component.candidates_at(second_position)
if second_candidates == candidates:
result.append([first_position, second_position])
return result
Here, in what may be increasing difficulty of being noticed, are some things we might think about this code:
- It’s really odd how there are eight tiny simple methods and one big messy one.
- The code is very obsessed with positions, and less interested in values.
- The code makes very detailed demands on component candidates. Could those objects be more helpful?
- Once we have applied this rule, there is no point applying it again unless something changes.
Slight Digression
I don’t recall whether I have mentioned the following before. If not, great, here it is now, and if so, great, it needs emphasis.
One of the things that has helped me with naked pairs is that I have tried, three, four, five different ways of expressing the technique so that I, and other humans, could understand what it does. Let me try again right now:
If the same pair of values [A, B] appears twice in a component’s candidates, A and B can be removed from any other candidates in that component, because those two values must occur in the paired candidates, either A, B or B, A, there being no other options for those cells.
Let’s try again.
If two cells in a component each have only the same two possible values, A and B, then either the first cell has A and the second B, or vice versa. Therefore, we can eliminate A and B from all the other candidates sets in the component.
Again.
If [A, B] occurs twice in a component’s candidates lists, remove A and B from all candidate lists other than the ones containing exactly [A, B].
What am I trying to do here? Well, I’m trying to be sure that I understand the requirement, and that you do as well. But more than that, I am trying to express the requirement so that its implementation becomes clear to me, and as simple as I can make it.
What I am down to at the moment is something like:
Find the pairs that occur twice. Remove them from any candidates list that includes them, except for the original twinned pairs.
I think we’ll find the code more interesting, when we get there. The larger point is that rephrasing the requirement helps me think of ways to implement it.
End Digression
OK, what about those four observations?
- It’s really odd how there are eight tiny simple methods and one big messy one.
This is a clear sign that the code could use a little improvement. Yesterday, I was thinking we’d improve that one long method. We might still do that, but …
- The code is very obsessed with positions, and less interested in values.
When we work with indices, we’re pretty deep in the code. Modern languages are good with values, and we often do better to take advantage of that. You know, functional programming and all that jazz.
- The code makes very detailed demands on component candidates. Could those objects be more helpful?
This is always worth looking at. We have, in our code, told the component to do some pretty detailed stuff, in a loop. Couldn’t the component, or the candidate list, or the candidates be more helpful?
- Once we have applied this rule, there is no point applying it again unless something changes.
Something to think about. A human applies these techniques one at a time and is unlikely to go back often and worry about those naked pairs, because she can see at a glance that the values are not repeated anywhere else in the component. We would do well to avoid running techniques that can’t possibly find anything to do. This is an “interesting problem”, by which I mean “I don’t see how to do that at this time”.
Tentative Plan
Let’s review how the puzzle, components, and candidates work, and then see whether we can work more with values and less with indices. We may or may not touch that large method as we go: I think we’ll start with the small methods, which are doing the job of updating, while the large one is doing the job of finding pairs.
Let’s Do It
Let’s look at our algorithm and whether we can make it more oriented toward values. Our code that updates the candidates is this:
class NakedPairsTechnique:
def apply(self):
pair_indices = self.find_pairs_of_indices_of_naked_pairs()
self.remove_each_pair_from_other_candidates(pair_indices)
def remove_each_pair_from_other_candidates(self, naked_pair_positions: list):
for pair_positions in naked_pair_positions:
self.remove_pair_from_other_candidates(pair_positions)
def remove_pair_from_other_candidates(self, pair_positions):
other_candidate_positions = set(range(9)) - set(pair_positions)
pair_to_remove = self.component.candidates_at(pair_positions[0])
self.remove_pair_from_these_positions(pair_to_remove, other_candidate_positions)
def remove_pair_from_these_positions(self, pair_to_remove, other_candidate_positions):
for other_position in other_candidate_positions:
self.remove_pair_from_component(pair_to_remove, other_position)
def remove_pair_from_component(self, pair_to_remove, other_position):
self.component.remove_candidates_at(other_position, pair_to_remove)
Notice again how obsessed we are with the positions, including even a very clever set manipulation. But instead of looping over the candidate positions, why don’t we just tell the Candidates to remove the values? Well, I think the big reason is that we don’t have a Candidates instance at this point. A component looks like a list of cell indices, I think. Let’s see:
class Component:
def __init__(self, puzzle, start, steps):
self.puzzle = puzzle
self.start = start
self.positions = [start + step for step in steps]
Right. And it only provides individual candidate lists for given positions:
def candidates_at(self, position):
return self.puzzle.candidates.at(self.positions[position])
def remove_candidates_at(self, position, to_remove):
candidates = self.candidates_at(position)
candidates = [candidate for candidate in candidates if candidate not in to_remove]
self.set_candidates_at(position, candidates)
def set_candidates_at(self, position, candidates):
self.puzzle.set_candidates_at(position, candidates)
Well, that’s no help. I do think we could get some help from an instance of Candidates here, but we’ll just keep that in mind. Let’s work in the Technique, and see how to make it better with more reliance on the Component. Later on, we can look at Component and see if Puzzle or Candidates can help there.
Let’s proceed in NakedPairsTechnique by intention, or as I am starting to think of it, by wishful thinking. Let’s just flat out tell the component what we want it to do, changing this:
class NakedPairsTechnique:
def apply(self):
pair_indices = self.find_pairs_of_indices_of_naked_pairs()
self.remove_each_pair_from_other_candidates(pair_indices)
def remove_each_pair_from_other_candidates(self, naked_pair_positions: list):
for pair_positions in naked_pair_positions:
self.remove_pair_from_other_candidates(pair_positions)
def remove_pair_from_other_candidates(self, pair_positions):
other_candidate_positions = set(range(9)) - set(pair_positions)
pair_to_remove = self.component.candidates_at(pair_positions[0])
self.remove_pair_from_these_positions(pair_to_remove, other_candidate_positions)
To this:
class NakedPairsTechnique:
def apply(self):
pair_indices = self.find_pairs_of_indices_of_naked_pairs()
self.remove_each_pair_from_other_candidates(pair_indices)
def remove_pair_from_other_candidates(self, pair_positions):
pair_to_remove = self.component.candidates_at(pair_positions[0])
self.component.remove_naked_pair(pair_to_remove)
We’ll push the finding of the actual pair upward soon. For now, we’re working toward Component and Candidates. Tests fail for want of the new method on Component, so:
class Component:
def remove_naked_pair(self, pair):
self.puzzle.remove_naked_pair(pair, self.positions)
I briefly considered giving Component a candidates method or member, but since Component only has the puzzle, let’s try this. The puzzle will need to know which positions are of interest, so I’m passing that down.
class Puzzle:
def remove_naked_pairs(self, pair, positions):
self.candidates.remove_naked_pairs(pair, positions)
Will no one do this for me? OK, Candidates, it’s down to you:
class Candidates:
def remove_naked_pairs(self, pair, positions):
for pos in positions:
if (cand := self.candidates[pos]) != pair:
self.candidates[pos] = [v for v in cand if v not in pair]
I actually expected this to work, but it does not. That’s a bummer. Ah. I called the method in Puzzle remove_naked_pairs, not singular pair. Fixing that, we are green.
We can remove these methods from NakedPairsTechnique:
def remove_pair_from_these_positions(self, pair_to_remove, other_candidate_positions):
for other_position in other_candidate_positions:
self.remove_pair_from_component(pair_to_remove, other_position)
def remove_pair_from_component(self, pair_to_remove, other_position):
self.component.remove_candidates_at(other_position, pair_to_remove)
Let’s commit, this is better. Commit: converting NakedPairsTechnique to use values, not indices.
Now we have this:
class NakedPairsTechnique:
def apply(self):
pair_indices = self.find_pairs_of_indices_of_naked_pairs()
self.remove_each_pair_from_other_candidates(pair_indices)
def remove_each_pair_from_other_candidates(self, naked_pair_positions: list):
for pair_positions in naked_pair_positions:
self.remove_pair_from_other_candidates(pair_positions)
def remove_pair_from_other_candidates(self, pair_positions):
pair_to_remove = self.component.candidates_at(pair_positions[0])
self.component.remove_naked_pair(pair_to_remove)
Inline the last method, giving:
class NakedPairsTechnique:
def apply(self):
pair_indices = self.find_pairs_of_indices_of_naked_pairs()
self.remove_each_pair_from_other_candidates(pair_indices)
def remove_each_pair_from_other_candidates(self, naked_pair_positions: list):
for pair_positions in naked_pair_positions:
pair_to_remove = self.component.candidates_at(pair_positions[0])
self.component.remove_naked_pair(pair_to_remove)
Inline again:
class NakedPairsTechnique:
def apply(self):
pair_indices = self.find_pairs_of_indices_of_naked_pairs()
for pair_positions in pair_indices:
pair_to_remove = self.component.candidates_at(pair_positions[0])
self.component.remove_naked_pair(pair_to_remove)
Require the find to return the pairs, not the indices. This will break the tests.
class NakedPairsTechnique:
def apply(self):
naked_pairs = self.find_pairs_of_indices_of_naked_pairs()
for naked_pair in naked_pairs:
self.component.remove_naked_pair(naked_pair)
That was two renames and a line deletion. Tests are broken as expected. Change this:
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
for first_position in range(9):
candidates = self.component.candidates_at(first_position)
if len(candidates) == 2:
for second_position in range(first_position+1, 9):
second_candidates = self.component.candidates_at(second_position)
if second_candidates == candidates:
result.append([first_position, second_position])
return result
We want the candidates, not their indices. This should make the tests run:
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
for first_position in range(9):
candidates = self.component.candidates_at(first_position)
if len(candidates) == 2:
for second_position in range(first_position+1, 9):
second_candidates = self.component.candidates_at(second_position)
if second_candidates == candidates:
result.append(candidates)
return result
Two tests fail, but it’s a different two. Here’s one:
def test_find_etc(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(0, ['7', '8'])
row.set_candidates_at(3, ['7', '8'])
row.set_candidates_at(2, ['5', '6'])
row.set_candidates_at(7, ['5', '6'])
technique = NakedPairsTechnique(row)
pairs = technique.find_pairs_of_indices_of_naked_pairs()
assert len(pairs) == 2
assert pairs[0] == [0, 3]
assert pairs[1] == [2, 7]
We need to change our expectations:
assert pairs[0] == ['7', '8']
assert pairs[1] == ['5', '6']
Passes. The other is similar:
def test_find_result_with_three(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(1, ['7', '8'])
row.set_candidates_at(3, ['7', '8'])
row.set_candidates_at(5, ['7', '8'])
technique = NakedPairsTechnique(row)
pairs = technique.find_pairs_of_indices_of_naked_pairs()
assert len(pairs) == 3
assert pairs[0] == [1, 3]
assert pairs[1] == [1, 5]
assert pairs[2] == [3, 5]
This is testing a situation that cannot occur without an error. It’s there to show what happens. Let’s rename it and fix it up.
def test_find_result_with_three_illegal_configuration(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(1, ['7', '8'])
row.set_candidates_at(3, ['7', '8'])
row.set_candidates_at(5, ['7', '8'])
technique = NakedPairsTechnique(row)
pairs = technique.find_pairs_of_indices_of_naked_pairs()
assert len(pairs) == 3
assert pairs[0] == ['7', '8']
assert pairs[1] == ['7', '8']
assert pairs[2] == ['7', '8']
Green. Commit? I’d like to improve that find a bit, but sure, commit first: using values not indices.
Now about this find:
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
for first_position in range(9):
candidates = self.component.candidates_at(first_position)
if len(candidates) == 2:
for second_position in range(first_position+1, 9):
second_candidates = self.component.candidates_at(second_position)
if second_candidates == candidates:
result.append(candidates)
return result
I think my mind was contaminated by that other Python Sudoku thing I glanced at, and that set me on the path of considering positions. How might we do this without them?
How about this:
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
component_candidates = [self.component.candidates_at(pos) for pos in range(9)]
for pos, cand in enumerate(component_candidates):
if len(cand) == 2:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
That breaks one test, that weird one, because now it only finds two the same, not three. Fixed. Commit: improve find....
What did we just do? Well, we made a list of all our candidates. That should really be done a better way, or perhaps in a different place, but for now hides the double looping and actually reduces processing a bit. We could improve its behavior, but not performance, this way:
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
component_candidates = [self.component.candidates_at(pos) for pos in range(9)]
for pos, cand in enumerate(component_candidates):
if len(cand) == 2 and cand not in result:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
Now we should get no duplicates in our weird test. Sure enough:
Expected :2
Actual :1
I fix that up. Now, as I was saying, we scan the candidates and if we find a pair that’s not in our results already, we check to see if it is below us (using the slice), and if it is, we add it to the result.
Much nicer, I think. I have some evil ideas about improving it more but I’ll save them for later. And we really should do something about creating that list. Let’s do that before we go:
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
component_candidates = [self.component.all_candidates()]
for pos, cand in enumerate(component_candidates):
if len(cand) == 2 and cand not in result:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
Hm, that didn’t work. Roll back:
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
component_candidates = [self.component.candidates_at(pos) for pos in range(9)]
for pos, cand in enumerate(component_candidates):
if len(cand) == 2:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
class Components:
def all_candidates(self):
return [self.candidates_at(pos) for pos in self.positions]
Oh. Too many brackets. Do again.
class NakedPairsTechnique:
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
component_candidates = self.component.all_candidates()
for pos, cand in enumerate(component_candidates):
if len(cand) == 2:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
Summary
Enough. It’s about 2 1/2 hours in and I’m clearly tiring. I think our results are rather good, in that the code is more expressive, less tied to specific indices, and substantially shorter.
The whole NakedPairsTechnique class is now just 17 lines long, and it was 38 when we started.
Is there more to do? Yes, my thoughts include:
- The Technique seems to want to think it terms of candidates, but only has the puzzle, so we wind up pushing capabilities down what seems like a level too far. We should explore the balance of behavior among Component, Candidates, and Puzzle.
- And should Candidates really be a raw list of lists? Is there another object waiting to be born?
But for this morning, I am pleased. We’ll revisit this set of changes, probably next time, but what we have seen to be helpful includes:
- Noticing unbalanced code and doing something about it;
- Repeatedly rephrasing the problem and solution for clarity and insight;
- Very small steps toward improvement;
- Pushing details down, abstractions up.
And of course, having a test suite that we trust and can rely on.
See you next time!