Sudoku: Immutable?

Based on input from my betters, I work on a faster way to deal with Candidates. GeePaw puts a word to it. Probably the wrong word. I go off on a tangent and discover profiling. An odd morning, with some interesting discoveries and one nice tiny object.

Now that I’ve made the Puzzle more or less immutable, I’m not at all sure that I like it. I’ve had a number of interesting inputs on the question:

Jessitron suggests that perhaps there is a missing idea representing “how the program is currently thinking about the puzzle”. Also, a Candidates might be immutable and able to return a new Candidates with a change in it. And that a Technique is a process that deals with Candidates in the light of a Puzzle, not so much dealing with a Puzzle per se. (I write here my understanding of what Jessica said, so this may be a misinterpretation of the ideas. (This is always the case when we say something to someone else. What they get is almost certainly not what we offered.))

At FGNO¹ last night, we were talking briefly about my immutability exercise and somehow the notion of editing the Candidates came up, and GeePaw mentioned the well-known(?) notion of a “pick list”, which after discussion we came to understand is a well-known(?) approach to editing a string, not by breaking it up and inserting and so on, but by building a table “above it”, reflecting edits, and you spin through the table to find the characters you’re looking for. Subsequent research turned up the “piece table” notion, which may or may not have been what GeePaw was thinking about.
Ward Cunningham(!), on Mastodon, mentioned his work with Sudoku with a link to some work he did four and two years ago. Besides feeling all glowy that Ward would chime in, his JavaScript code made me feel that I still have much to learn about elegance and simplicity.
Subsequent musing in the light of these events and the in dark of early morning leads me to think that I should have a little object, maybe called CandidatesEditor, that sits on top of a Candidates (or another CandidatesEditor, it could happen) and accepts edits and behaves as if the Candidates object were edited but really it’s the Editor reflecting the changes. If we did that, we wouldn’t need to copy the Candidates so thoroughly, we would instead just toss a CandidatesEditor on top.
Further reflection on the above caused me to think about the switch from the notion of “guess” to “evolve” when we create a new game based on the one we’re solving. The Techniques never guess: they only ever adjust the Candidates for one or more cells. And the Solver, our brute, always guesses. It has to be prepared to be wrong, which is why we create a new Puzzle for it and hold on to the old one, and it’s the closest to a good reason why we made the Puzzle immutable, so that the earlier puzzle isn’t damaged by the later one’s work. That’s what really led to copying the Candidates, because we do want the later puzzle to be able to take advantage of whatever has been learned before.

So Therefore?

Well, so therefore I think we should do these things:

Create a CandidatesEditor thingie that sits on top of Candidates, accepts edits, and looks like the Candidates have been changed;
Change our evolve method on Puzzle so that it doesn’t deep copy the Candidates but instead used a CandidatesEditor;
Discover, probably, that we can accomplish #2 by changing Candidates.copy() with no change to Puzzle;
Only use evolve when guessing a value, not when its value is forced by the puzzle constraints. Does this imply a PuzzleEditor?
Consider whether we are going to mix the Solver and the Techniques together at all. In favor: Techniques can speed up the Solver. Opposed: once you bring out the Solver, nothing interesting will ever happen again. You smash the puzzle with the biggest hammer you’ve got.
Probably should do this item first, because it says to do #5 next, because that consideration may change what we do for the #1-4.

Warmup: CandidatesEditor

I’m not sure about the name, but let’s TDD up a CandidatesEditor to see what one might look like. I do have a theory.

The Editor will essentially be a dictionary from integer (e.g 0-80) to (whatever) and an associated object also accessible via integers. When it is asked for the value at n, it accesses the dictionary and if n is missing, it accesses its associate.

I can think of at least two ways to do this, one with the access built in to the Editor and the other with a less intelligent object that does the dictionary / missing / default stuff and an editor part that understands the Candidates methods:

class Candidates:
    def at(self, cell):
    def copy(self):
    def create_raw_candidates(self, a_game_string):
    def fill_in_solved(self, a_game_string):
    def remove_naked_pair(self, pair, positions):
    def set_at(self, cell, values):

We should look at __init__ and copy in particular, since we foresee that we’ll be changing copy and that will almost certainly impact __init__.

    def __init__(self, a_game_string, candidates=None):
        self.game = a_game_string
        if not candidates:
            self.line_size = 0
            self.candidates = self.create_raw_candidates(a_game_string)
            self.fill_in_solved(a_game_string)
        else:
            self.line_size = candidates.line_size
            candidate_list = candidates.candidates
            better_list = [items.copy() for items in candidate_list]
            self.candidates = better_list

    def copy(self):
        return Candidates(self.game, self)

Basically we have two entirely different ways of creating this object. It seems to me that we need a simpler __init__ that gets the line size (if we use it) and the candidates object (array or new editor thing) and the clever stuff in the current __init__ should be done in factory methods.

A look at the use of line_size suggests to me that it is never used. Let’s see about removing it.

Oh! Dammit! That duck typing trick here:

    def fill_in_solved(self, a_game_string):
        for cell_position, cell_value in enumerate(a_game_string):
            if cell_value != '0':
                impacted_positions = Component.all_impacted_positions(self, cell_position)
                for impacted_position in impacted_positions:
                    self.candidates[impacted_position] = \
                        [value for value in self.candidates[impacted_position]
                         if value != cell_value]
                self.candidates[cell_position] = [cell_value]

In that impacted_positions = line, we pass self to the Component. The component really expects a Puzzle, and it wants to know the line size. So we hacked a line_size into Candidates and relied on duck typing to make it work.

Flash Retrospective

First of all, all you strict typing folks can add a point to your tally, because this bit of duck typing saved me some work when I did it and is causing me trouble now, and the only way I could find that I couldn’t remove it was to run the code. Now remove a point, because the problem isn’t strict vs duck typing, the problem is really somewhere else, probably in that Candidates does not have a puzzle to refer to.

But if there is something to learn it is that when I realized that I needed line_size and didn’t have it, that flagged a design issue and instead of working around it, it would likely have been better to dealt with the issue. In my defense, I have been expressing confusion about the relationship between Puzzle, Candidates, and Components for ages, so what I’m saying is yes, but I didn’t know then what to do and I don’t know now.

Let’s keep this in mind, since we’re going to improve the creation of Candidates soon. For now, let’s press on.

Back to Work

Let’s build our editor thing. We’ll do it in tests first to get a feeling for how to do it.

class TestCandidates:
    def test_dictionary_cover(self):
        base = [10, 20, 30, 40]
        cover = ListEditor(base)
        cover[1] = 200
        assert base[1] == 20
        assert cover[0] == 10
        assert cover[1] == 200
        assert cover[2] == 30

And I provide this small class:

class ListEditor:
    def __init__(self, covered):
        self.cover = dict()
        self.covered = covered

    def __setitem__(self, key, value):
        self.cover[key] = value

    def __getitem__(self, key):
        try:
            return self.cover[key]
        except KeyError:
            return self.covered[key]

I think I rather like that. As a second way of implementing the idea, I had thought to subclass dict, which a lot of Pythonistas would do, but when we subclass dict we pick up all kinds of dict behavior that we really don’t want. And this is quite simple. Since it’s pretty general, this name might be OK. Of course it can edit anything indexed with a single key value. We’ll leave the name for now. Names are easy to change. I’ll move this to its own non-test file.

Commit: test and build ListEditor.

Now could we change Candidates.copy to return a ListEditor? Not quite. It would need to be a Candidates containing a ListEditor.

class Candidates:
    def __init__(self, a_game_string, candidates=None):
        self.game = a_game_string
        if not candidates:
            self.line_size = 0
            self.candidates = self.create_raw_candidates(a_game_string)
            self.fill_in_solved(a_game_string)
        else:
            self.line_size = candidates.line_size
            candidate_list = candidates.candidates
            better_list = [items.copy() for items in candidate_list]
            self.candidates = better_list

    def copy(self):
        return Candidates(self.game, self)

We’re in the else part if we’re copying. I think this should work:

class Candidates:
    def __init__(self, a_game_string, candidates=None):
        self.game = a_game_string
        if not candidates:
            self.line_size = 0
            self.candidates = self.create_raw_candidates(a_game_string)
            self.fill_in_solved(a_game_string)
        else:
            self.line_size = candidates.line_size
            self.candidates = ListEditor(candidates)

It fails grandly, 14 tests. Must be something egregiously obvious.

It is, it’s the !@#$%@!! candidates.candidates thing. Try this:

class Candidates:
    def __init__(self, a_game_string, candidates=None):
        self.game = a_game_string
        if not candidates:
            self.line_size = 0
            self.candidates = self.create_raw_candidates(a_game_string)
            self.fill_in_solved(a_game_string)
        else:
            self.line_size = candidates.line_size
            self.candidates = ListEditor(candidates.candidates)

Now only one test fails. This displeases me.

Ah, it’s the test for candidates.copy:

    def test_candidates_copy(self):
        game_string = '103000001'
        puzzle = Puzzle(game_string)
        assert puzzle.candidates.at(1) == ['2']
        new_candidates = puzzle.candidates.copy()
        assert new_candidates.at(1) is not puzzle.candidates.at(1)

Well, you see, in the new scheme of things, the existing items come back. I think this test can be removed, but maybe we should recast it. Let’s do:

    def test_candidates_copy(self):
        game_string = '103000001'
        puzzle = Puzzle(game_string)
        assert puzzle.candidates.at(1) == ['2']
        new_candidates = puzzle.candidates.copy()
        assert new_candidates.at(1) is puzzle.candidates.at(1)
        new_candidates.set_at(1, ['3'])
        assert puzzle.candidates.at(1) == ['2']
        assert new_candidates.at(1) == ['3']

That’s better. Commit: Candidates copy uses ListEditor, not a full copy.

I wonder if this sped things up. When I last ran the hard 9x9 Solver, it took 24 seconds. Let’s try it now. About the same. Bummer.

class Puzzle:
    puzzle_count = 0
    guess_count = 0

    @classmethod
    def evolve_with_change(cls, puzzle, change, position):
        old_string = puzzle.game
        new_string = old_string[:position] + change + old_string[position+1:]
        copy = puzzle.copy()
        copy.game = new_string
        positions_needing_update = Component.all_impacted_positions(copy, position)
        for position in positions_needing_update:
            values = [v for v in copy.candidates.at(position) if v != change]
            copy.candidates.set_at(position, values)
        return copy

That’s a lot of messing about but it should only be addressing a relatively small number of positions, 21 out of 81. We used to reconstruct all the possible candidates from scratch.

Who’s using this method? Certainly Solver but are the Techniques using it?

class SudokuTechniques:
    def only_one_possible_value(self):
        changed = True
        while changed:
            changed = False
            for cell in self.puzzle.unknown_cells():
                available = self.puzzle.possible_answers(cell)
                if len(available) == 1:
                    SudokuTechniques.only_one += 1
                    self.puzzle = Puzzle.evolve_with_change(self.puzzle, available[0], cell)
                    changed = True

This isn’t exactly a Technique … or, more accurately we don’t have a very good definition of Technique. There are things we can do that change the candidates, our “current understanding” of the puzzle per Jessitron, and there are things that change the solution, like this thing does, by discovering that there is only one possible answer. Even in this case, do we really need a new Puzzle? There are two possibilities:

We’re not running the Solver, so there is no guessing involved and this change is therefore not going to be rolled back, or
We are running the Solver, in which case the Solver has saved the previous puzzle and will return to it, and we can (I think) change this one with impunity. And anyway
We’re going to stop mixing Solver and Techniques, and so only #1 applies.

I was mistaken about “two possibilities”. As usual.

I think Solver would benefit from running the current and only Technique, the one that forces. And it does benefit: when I turn on techniques in the long test it changes from 24 seconds to four. Still too long to run it all the time but it’s worth having.

Break / Retro

I realize that I am sort of bouncing around randomly, without any noticeable purpose. Need to settle down, pick a direction, focus a bit.

One question arises. I really thought that improving the Candidates.copy situation was going to pay off in performance, since it avoids all the copying. It adds the cost of the try-except on about 3/4 of the accesses to candidates. Whatever, it’s not faster, and I thought it would be. Copying a Puzzle should be nearly free. Let’s do some timing.

    def test_puzzle_copies(self):
        puzzle_list = [
            '020000000',
            '000600003',
            '074080000',
            '000003002',
            '080040010',
            '600500000',
            '000010780',
            '500009000',
            '000000040'
        ]
        puzzle = Puzzle.from_list(puzzle_list)
        count = 1000
        start = timeit.default_timer()
        for i in range(count):
            pass
        stop = timeit.default_timer()
        overhead = stop - start
        start = timeit.default_timer()
        for i in range(count):
            new_puzzle = puzzle.copy()
        stop = timeit.default_timer()
        total = stop - start - overhead
        microseconds = int(total * 1000000 // count)
        assert microseconds == pytest.approx(140, abs=10)

OK, 140 microseconds per copy isn’t bad. How much is 140 times 70,000 microseconds? About ten seconds. We’re using 24ish without forces. We need some more study if we want to know why Solver is so slow. Might have to learn how to profile.

Right, that was easy, except that I’m not sure how to make the window go away. So we see that the bulk of our time is in fill_in_solved and impacted_positions and in list comprehensions, doubtless the ones that those methods call. What is that graph that they offer me?

I’ve put the graph in a footnote², to move it to the end: it’s long and tall. I want you to see the words that follow. I could be wrong for wanting that.

If you open that graph² in its own page you should be able to see it at full scale and look around in it. I’ll want to study the graph and table a bit but they certainly are showing us some things to look at, notably our use of list comprehensions.

Ideas come to mind …

Instead of just picking the next empty cell in Solver, we should select the cell with the minimum number of possibilities. Ideally, I think we’d check all the forces first, though if Solver were selecting short ones, it would pick a force if there was one. Still, filling them all in would be better, since the force wouldn’t really have to do all that copying.
We remove elements from our candidates lists by regenerating the lists with comprehensions. It would probably be better to remove them with a list operation. Even better would be to make them into a dictionary. I know that some Sudoku implementations have used a bitmap to represent possibilities. Each cell’s possibilities would only be 9 bits long and we could remove an item with a couple of quick xors or something. (Hm, that almost sounds like fun.)
We make a lot of calls to row, column and sub-grid. Speeding those up could pay off. They seem to be using about six percent of the time, each.
fill_in_solved accounts for 90% (on the graph), between its own 35%, and 27% in its list comprehension, and a bunch of time in all_impacted_positions, which calls the row column sub-grid family.

Pick a Direction!

I’m almost frustrated with myself, until I remember why we do this. We are not here to build any particular thing, we are here to enjoy exploring how to build whatever we build, how to refactor whatever we need to improve, how to test whatever we build.

So I don’t really have to do any particular thing. That said, somethings we might do include:

Optimize Solver to make it as fast as we can reasonably make it, using the Profiling info. Pro: Kind of fun, feeling of accomplishment, learn use of profiling info.
Generate a few more techniques. Pro: They are not easy to express, we might learn some good ways to express them. They are not easy to set up. They are not easy to test.
Grab a bunch of Sudoku and solve them.
Move on to the next exciting thing, which might be Robot World II. Or IV. Or MCMLVII, I don’t know.

That’s for tomorrow. For today, just a few learnings:

Summary

We have a rather nice little editor that can sit on top of a Candidates list and allow “editing” it without changing the underlying list. That’s … not as useful as I had thought, especially if we allow Techniques to change the Candidates freely. We’ll see, we have reason to work near there.

We have learned, via profiling, that our list comprehension approach to editing the candidate lists is costly. Why are we doing that? Well, mostly because Python lists throw an exception if you try to remove something that’s already not there. We may profit from changing the base Candidates structure to an array of dictionaries rather than lists.

We know that the Component creation is taking a notable amount of time. That’s because we compute the scripts every time we create a Component. There are only 81 possible cells to inquire about, and only 27 unique Components, since each one appears nine times across the row, down the column, etc. We could compute them once and be done. Since we iterate thousands of times, that could be good news.

We have learned that PyCharm’s profiling feature is rather nice, including a really nifty graph.

An odd morning, but you were warned. See you next time!

Friday Geeks Night Out, our Zoom meeting held every Tuesday evening. ↩
↩ ↩²