Sudoku: Design Changes
Ron, you keep changing the design all the time! Why don’t you just nail down the design once and for all, and move on?
Well, bunkie, I’ll answer that. I only have a little over six decades of software design and development experience, so I’m just not good enough to get the design right in one go, or even several. Yet. Maybe someday I’ll be that good. I should live so long.
As some wise person before me said, at the beginning of the effort, we know less than we ever will again. Surely that’s not a good time to be nailing anything down.
Naturally, I try to make good design decisions, not bad ones, although often enough the decisions I make do not pan out. (Yes, I hear you saying that makes your case for coming up with a solid design early, but no, it doesn’t. It makes mine: I don’t know enough to make a better decision.) So I try to make small decisions, simple decisions, decisions that will be easy to change when, not if, a better idea comes along.
Fortunately, there are some approaches to help get those simple flexible designs: it basically comes down to creating small objects that support a single coherent purpose. Also fortunately, there are practices and tools to help us change the design when we need to, including small steps, tests, and refactoring. And, again, small objects with a single coherent purpose.
But the best news of all is this: even if it were possible to come up with a near-perfect design early on, it seems not to have happened in any code I’ve even encountered, so I’ve generally been working in a system whose design leaves something to be desired. And guess what: small steps, tests, refactoring, making small coherent objects out of large incoherent ones … all that gives us the ability to improve the design of whatever code we work with.
- One More Thing
- In most every article of the 5,000 articles here, at least the programming ones, I have intentionally done minimal design up front. My intention was to demonstrate what happens when I do that, expecting that small steps, tests, and refactoring would enable me to improve the design as needed.
-
In almost every situation, that has turned out to be the case. Incremental design, at least at the level I do it here, works well. And the techniques thereof allow us to improve any code that needs it.
Let’s Look at Some Code and Design
In the Sudoku program here at hand, the part that I think is currently the best is what we worked on the past few days, the naked pairs technique. What that does is this:
If the same pair of values occurs twice as cell candidates in any row, column, or sub-grid, those two cells must contain those values in one order or another. Therefore those two values cannot be valid candidates in any other element of the component and should be removed from those other lists of candidates.
Note that I keep expressing this different ways. I do this on purpose, because every now and then, I accidentally put it particularly nicely and it suggests a nice implementation. Maybe not this time.
Let’s look at the code:
class NakedPairsTechnique:
def __init__(self, component):
self.component = component
def apply(self):
naked_pairs = self.find_pairs_of_indices_of_naked_pairs()
for naked_pair in naked_pairs:
self.component.remove_naked_pair(naked_pair)
def find_pairs_of_indices_of_naked_pairs(self) -> list:
result = []
component_candidates = self.component.all_candidates()
for pos, cand in enumerate(component_candidates):
if len(cand) == 2:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
There is something to like here, and something to like somewhat less. I like that the apply
method just says … oh my that’s the wrong name, isn’t it? We find the pairs now. Function+Shift+F6.
def apply(self):
naked_pairs = self.find_all_naked_pairs()
for naked_pair in naked_pairs:
self.component.remove_naked_pair(naked_pair)
Commit: rename method.
A different author would just make the change and start writing there. I want you to see that when we see something that needs improvement, we do it, test it, commit it.
As I was saying, I like that the method just says to get all the pairs and then, for each one tells the component to deal with it. Why do I like that second part? Well, because otherwise we’d have to loop over the individual lists in the component, creating new lists and putting them back. That’s like rifling through your friend’s pockets, removing anything you think shouldn’t be there. Generally considered rude.
This practice is called, variously, “tell, don’t ask”, or “Law of Demeter”. We tell the component what to do.
But now, let’s see how the component does that, because a) we’re curious and 2) I think we may discover something interesting.
class Component:
def remove_naked_pair(self, pair):
self.puzzle.remove_naked_pair(pair, self.positions)
Well that’s interesting and curious. Component just tells Puzzle to do it. Puzzle has this:
class Puzzle:
def remove_naked_pair(self, pair, positions):
self.candidates.remove_naked_pair(pair, positions)
Curiouser and curiouser. Puzzle just asks its Candidates to do it. I hope it doesn’t ask the component, that would be bad.
class Candidates:
def remove_naked_pair(self, pair, positions):
for pos in positions:
if (cand := self.candidates[pos]) != pair:
self.candidates[pos] = [v for v in cand if v not in pair]
So for all the given positions if that candidate isn’t a copy of the pair, we set that candidate to all its prior elements except for elements of the pair, if they were in there.
So I would call that nice. The flow goes Technique to Component to Puzzle to Candidates, where the work is done. Why nice? Because now only Candidates knows the format of an individual candidate.
But not perfectly nice, because let’s review Candidates in a bit more detail:
class Candidates:
def __init__(self, a_game_string):
self.game = a_game_string
self.line_size = 0
self.candidates = self.create_raw_candidates(a_game_string)
self.fill_in_solved(a_game_string)
def at(self, cell):
return self.candidates[cell]
def set_at(self, cell, values):
self.candidates[cell] = values
def create_raw_candidates(self, a_game_string):
if len(a_game_string) == 9:
available_values = '123'
elif len(a_game_string) == 81:
available_values = '123456789'
else:
raise ValueError('problem must have length 9 or 81')
self.line_size = len(available_values)
all_possible = [value for value in available_values]
return [all_possible for cell in range(len(a_game_string))]
def fill_in_solved(self, a_game_string):
for cell, value in enumerate(a_game_string):
if value != '0':
self.candidates[cell] = [value]
row = Component.row(self, cell)
col = Component.column(self, cell)
sub = Component.sub_grid(self, cell)
positions = set(row.positions) | set(col.positions) | set(sub.positions)
positions.discard(cell)
for pos in positions:
self.candidates[pos] = [v for v in self.candidates[pos] if v != value]
self.candidates[cell] = [value]
def remove_naked_pair(self, pair, positions):
for pos in positions:
if (cand := self.candidates[pos]) != pair:
self.candidates[pos] = [v for v in cand if v not in pair]
Now here, there are things to critique.
-
We refer to a “game string”, a reflection of the fact that the Puzzle holds a simple string of characters 0-9 as the current puzzle state. The fact that Candidates knows that, creates coupling between Puzzle and Candidates. If we wanted to change how Puzzle represents the state, we would very likely have to change Candidates as well.
-
fill_in_solved
is pretty complicated and knows a lot about the game string, and rows and columns and sub-grids and the classes that support them. Here again, Candidates is coupled. -
We seem to have confusion with the word “cell” and the word “position”. Here, they seem to be synonymous, but possibly they should not be.
In Candidates’ defense, the weird code is in setting up the object. On the other hand, it really doesn’t do much else … yet. But that’s because we don’t have many Techniques yet.
Let’s review in a bit more detail, aiming toward fill_in_solved
. We can see in __init__
:
def __init__(self, a_game_string):
self.game = a_game_string
self.line_size = 0
self.candidates = self.create_raw_candidates(a_game_string)
self.fill_in_solved(a_game_string)
I happen to know what we’re doing here. We begin by setting every candidates list to either all the values from ‘1’ through ‘9’ if we are on on 9x9 puzzle and ‘1’ through ‘3’ otherwise. Then we go through the puzzle (game string) and if there is a non-zero (i.e. given) value there, we remove that value from the corresponding list.
I think we could do better. This is a local change, and I think we need some global changes, but couldn’t we just go through the game string and set each list correctly once and for all?
Consider this
def fill_in_solved(self, a_game_string):
for cell, value in enumerate(a_game_string):
if value != '0':
self.candidates[cell] = [value]
row = Component.row(self, cell)
col = Component.column(self, cell)
sub = Component.sub_grid(self, cell)
positions = set(row.positions) | set(col.positions) | set(sub.positions)
positions.discard(cell)
for pos in positions:
self.candidates[pos] = [v for v in self.candidates[pos] if v != value]
self.candidates[cell] = [value]
Here’s a tweak to make it a bit better:
def fill_in_solved(self, a_game_string):
for cell, value in enumerate(a_game_string):
if value != '0':
self.candidates[cell] = [value]
row = Component.row(self, cell)
col = Component.column(self, cell)
sub = Component.sub_grid(self, cell)
positions = set(row.positions) | set(col.positions) | set(sub.positions) - {cell}
for pos in positions:
self.candidates[pos] = [v for v in self.candidates[pos] if v != value]
self.candidates[cell] = [value]
Instead of the discard
, I subtracted the cell directly. Check the end of the positions-setting line if it isn’t visible.
Commit: tidying.
Anyway, that code only sets the components for cells that are given, which typically amount to less than 10 percent of the field. So the two-phase operation is probably worth it.
What about this bit:
row = Component.row(self, cell)
col = Component.column(self, cell)
sub = Component.sub_grid(self, cell)
positions = set(row.positions) | set(col.positions) | set(sub.positions) - {cell}
What are those positions
? They are the positions of all the cells impacted by a decision about the cell we’re looking at. I suggest that Component should be providing that.
Let’s ask for that. Remind me to discuss something nasty here in a moment.
class Candidates:
def fill_in_solved(self, a_game_string):
for cell, value in enumerate(a_game_string):
if value != '0':
self.candidates[cell] = [value]
positions = Component.all_impacted_positions(self, cell)
for pos in positions:
self.candidates[pos] = [v for v in self.candidates[pos] if v != value]
self.candidates[cell] = [value]
class Component:
@classmethod
def all_impacted_positions(cls, puzzle, position):
row = cls.row(puzzle, position).positions
col = cls.column(puzzle,position).positions
sub = cls.sub_grid(puzzle, position).positions
return set(row) | set(col) | set(sub) - {position}
This reduces Candidates’ dependency on Component a bit, and leaves the determination of the cells where it belongs. I still don’t like the fact that we’re passing a bunch of subscripts into the game back and forth. However:
Nasty? Yes, nasty. The Component’s creation methods refer to a puzzle:
def __init__(self, puzzle, start, steps):
self.puzzle = puzzle
self.start = start
self.positions = [start + step for step in steps]
@classmethod
def row(cls, puzzle, position):
if puzzle.line_size == 9:
steps = [0, 1, 2, 3, 4, 5, 6, 7, 8]
else:
steps = [0, 1, 2]
start = position - position % puzzle.line_size
return cls(puzzle, start, steps)
But what does Candidates pass in? Itself! That works, not quite by luck, because Candidates, like Puzzle, has the game string and that’s really all that Component cares about. (In fact, it really only cares about the length of the string.)
So the good news is, duck typing. The bad news is, duck typing. If this doesn’t surprise and confuse the programmer just about every time they look at this, I myself will be surprised.
Someone should do something about that sometime. We are the someone but now may not be the time. We are green, so commit: simplify Candidates’ use of Component somewhat. Beware duck typing.
Oh! You want to know how confusing and nasty that is? It’s not the string that Component asks for, it is the member variable line_size
, which Puzzle has and which Candidates computes during create_raw_candidates
. Weird bit of coupling. One way to remove it would be to settle on 9x9 as our size. If we need generalized size, we’d need a better way. Right now, we do use size 3x3 sometimes, in tests, and the code is not the better for it.
I’ve mentioned this before … there is something about the interaction between Puzzle, Component, and Candidates that seems not quite right. I honestly don’t know what it is, it just doesn’t feel right to me.
One thing that I don’t like here in Candidates is in the code we’ve been improving:
def fill_in_solved(self, a_game_string):
for cell, value in enumerate(a_game_string):
if value != '0':
self.candidates[cell] = [value]
positions = Component.all_impacted_positions(self, cell)
for pos in positions:
self.candidates[pos] = [v for v in self.candidates[pos] if v != value]
self.candidates[cell] = [value]
We set self.candidates[cell]
twice. Once should suffice, but the code makes me wonder why it’s done that way. I think I know what’s up:
It makes more sense to me to set the single value first. But at one time, the positions we accessed were all those in the row, column, and subgrid for cell … which includes cell. So we had to do the setting last. Then, when we got the set solution, we could put the single cell’s setting first, and forgot to remove the trailing one.
I think we should leave it at the end. Why? Because it is not obvious from here that all_impacted_positions
excludes the cell itself. A better name might help, but for now:
def fill_in_solved(self, a_game_string):
for cell, value in enumerate(a_game_string):
if value != '0':
positions = Component.all_impacted_positions(self, cell)
for pos in positions:
self.candidates[pos] = [v for v in self.candidates[pos] if v != value]
self.candidates[cell] = [value]
Now about cell
vs pos
. I think we’d do best to have pos
(which I will allow as an abbreviation for position
because of long and faithful service in that role) mean the position and cell
, if we ever use it, mean the actual little square that contains the numbers. Currently we do not have such an object. So let’s rename here:
def fill_in_solved(self, a_game_string):
for cell_position, cell_value in enumerate(a_game_string):
if cell_value != '0':
impacted_positions = Component.all_impacted_positions(self, cell_position)
for impacted_position in impacted_positions:
self.candidates[impacted_position] = \
[value for value in self.candidates[impacted_position]
if value != cell_value]
self.candidates[cell_position] = [cell_value]
I think that most of my colleagues would prefer those long names. I am not so sure, but we’ll let it ride for now. Commit: tidying.
I think we’ve done enough to make our points for the day.
Summary
We have observed some somewhat commendable code in NakedPairsTechnique, and followed it down to some less commendable code in Candidates. We improved that code by providing better support in Component, and by some judicious renaming. We have improved the system design, through some very small and simple changes, with our tests running all the time.
I repeat: we have improved the design.
I’m not good enough to create the right design for even a moderate-sized program at the beginning. What I am nearly good enough at is recognizing design issues and finding ways to improve them. As I refine and refine and refine, the design moves toward smaller objects, with single responsibilities and simple code. The design improves, bit by bit, almost by itself.
Many programmers find themselves in my situation, unable to imagine a perfect design at the beginning of things, often even at the beginning of a relatively small step. My example here might suggest not trying to do the thing we cannot, and instead doing the things we can:
- Take small steps
- Guided by tests
- Refactoring to improve the design
- Rinse, repeat.
I’ll see you next time!