Sudoku: Moar Nicer

I’ve come up with something that I think will be much nicer for computing the base positions from which to fetch our rows, columns, and sub-grids. A little refactoring does improve things, I think.

In order to pull out the column, row, and sub-grid, we have to compute the starting position from which to slice, based on the position we’re looking at. For example, the position 11 is in column 2 (counting from zero) and we need to slice nine values starting from two.

Similarly, the cell at position 30 is in the row numbered 3, starting at 27 and selecting nine consecutive numbers.

And the sub-grid, well, that’s interesting. for example, cells 30, 31, 32, 39, 40, 41, 48, 49, 50 are in the sub-grid that starts in row number 3, and column number 3. And it selects three numbers from there, three from there plus 9, and three from there plus 18.

I’ve been using calculations like this to find those chunks:

    def used_numbers_in_column(self, position):
        start = position % self.line_size
        used_in_column = self.game[start:len(self.game):self.line_size]
        return [c for c in used_in_column if c != '0']

    def used_numbers_in_row(self, position):
        start = position // self.line_size * self.line_size
        used_in_row = self.game[start:start + self.line_size]
        return [c for c in used_in_row if c != '0']

    def used_numbers_in_sub_grid(self, position):
        first_index_in_row = position // self.line_size * self.line_size
        offset_in_row = position % 9 // 3 * 3
        first_index_in_sub_grid = first_index_in_row // 27 * 27 + offset_in_row
        # print(f'\n{position=}\n{first_index_in_row=}\n{first_index_in_sub_grid=}')
        result = []
        for row in range(first_index_in_sub_grid, first_index_in_sub_grid+27, 9):
            for col in range(0, 3):
                result.append(self.game[row+col])
        return result

The position%self.line_size is pretty clearly the starting position of a given column, I think, but the other expressions are far from obvious:

start = position // self.line_size * self.line_size

And

first_index_in_row = position // self.line_size * self.line_size
offset_in_row = position % 9 // 3 * 3q
first_index_in_sub_grid = first_index_in_row // 27 * 27 + offset_in_row

Again, those are correct but I can’t really see why they’re correct, and I wrote them!

I think we can do better, because last night I realized that this expression does nice things for us:

start = position - position % length

Let me show some examples:

Cell Numbers

0	1	2	3	4	5	6	7	8
9	10	11	12	13	14	15	16	17
18	19	20	21	22	23	24	25	26

27	28	29	30	31	32	33	34	35
36	37	38	39	40	41	42	43	44
45	46	47	48	49	50	51	52	53

54	55	56	57	58	59	60	61	62
63	64	65	66	67	68	69	70	71
72	73	74	75	76	77	78	79	80

Cell % 9 = Column Number

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

Cell - Cell % 9 == First Number in Row

0	0	0	0	0	0	0	0	0
9	9	9	9	9	9	9	9	9
18	18	18	18	18	18	18	18	18

27	27	27	27	27	27	27	27	27
36	36	36	36	36	36	36	36	36
45	45	45	45	45	45	45	45	45

54	54	54	54	54	54	54	54	54
63	63	63	63	63	63	63	63	63
72	72	72	72	72	72	72	72	72

Cell - Cell % 27 == First Number in First Row of Sub-Grid

0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0

27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27

54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54

The above + (i % 9)//3*3 = Start of Sub-Grid

0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6

27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33

54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60

Thus far, I haven’t been able to get rid of that last nasty one. Let’s plug these into the code and see if it looks better.

    def used_numbers_in_row(self, position):
        start = position - position % self.line_size
        used_in_row = self.game[start:start + self.line_size]
        return [c for c in used_in_row if c != '0']

    def used_numbers_in_sub_grid(self, position):
        starting_row_of_sub_grid = position - position % (3*self.line_size)
        offset_in_row = position % 9 // 3 * 3
        first_index_in_sub_grid = starting_row_of_sub_grid + offset_in_row
        result = []
        for row in range(first_index_in_sub_grid, first_index_in_sub_grid+27, 9):
            for col in range(0, 3):
                result.append(self.game[row+col])
        return result

Well, maybe a little better. Let’s extract a method from the end of the second one:

    def used_numbers_in_sub_grid(self, position):
        starting_row_of_sub_grid = position - position % (3*self.line_size)
        offset_in_row = position % 9 // 3 * 3
        first_index_in_sub_grid = starting_row_of_sub_grid + offset_in_row
        return self.fetch_sub_grid(first_index_in_sub_grid)

    def fetch_sub_grid(self, first_index_in_sub_grid):
        result = []
        for row in range(first_index_in_sub_grid, first_index_in_sub_grid + 27, 9):
            for col in range(0, 3):
                result.append(self.game[row + col])
        return result

What might we do to the fetch? Let’s try a few things.

    def fetch_sub_grid(self, first_index_in_sub_grid):
        result = []
        for row_index in range(3):
            cell = first_index_in_sub_grid + row_index*9
            result.extend(self.game[cell:cell+3])
        return result

I think that’s better by a fair margin. It expresses the sub-grid as three slices of size three. We are green and should have been committing for a while.

Commit: refactoring to improve clarity, we hope.

Let’s sum up: I have things to worry about today.

Summary

We have, I think, slightly improved the code that slices and dices the puzzle string to fetch out the row, column, and sub-grids. We do have tests for the sub-gridding, and of course getting the correct puzzle solution is the best test of all.

If this were a production program, I would of course want to test it against some more problems, but I am in fact quite confident that it’s working.

Speaking of production, my friend Ed asked me yesterday about my decision not to study existing solutions, pointing out that in his experience, one gets better results after learning more about solutions that are out there.

And Ed is right to point that out. I replied that I do these exercises in the hope that they’ll be useful to someone, but really for my own joy. And my own joy comes from discovery of solutions and from improving my code.

If I were tasked with coming up with a real production-ready product of any kind, I would want to know as much as possible about the domain and solutions in the domain. I would still want to produce running code as soon and as frequently as possible, but I would much prefer to be primed with lots of knowledge.

As for Sudoku, we know that solving the hard one we have solved takes 78000 attempts at less than 3 microseconds per attempt. However, in my search for problems to solve, I believe that I have seen that programs that use known Sudoku oriented heuristics solve the problems with far fewer operations.

So if we wanted a really fast solver, we might do well to look into more sophisticated algorithms for solving the puzzles. If 20 milliseconds ever gets too slow for my Sudoku solving, perhaps I’ll look into that.

But seriously: humans need heuristics because we are pretty good at pattern recognition and pretty bad at recursive brute-force calculations, so a good set of Sudoku heuristics is important. It’s probably even fun to pull some seldom-used heuristic out of your um bag of heuristics and apply it, resulting in a nifty solution.

I wouldn’t know. Other than a few when the puzzles were just becoming popular, I’ve never solved many at all. I didn’t find it fun. I did find this to be fun. And I enjoy watching videos of people solving them. And videos of people turning trees into tables. Weird.

Fun is where you find it, I guess.

No idea what we’ll do next time, but I hope to see you then!

0	1	2	3	4	5	6	7	8
9	10	11	12	13	14	15	16	17
18	19	20	21	22	23	24	25	26

27	28	29	30	31	32	33	34	35
36	37	38	39	40	41	42	43	44
45	46	47	48	49	50	51	52	53

54	55	56	57	58	59	60	61	62
63	64	65	66	67	68	69	70	71
72	73	74	75	76	77	78	79	80

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	0	0	0	0	0	0	0	0
9	9	9	9	9	9	9	9	9
18	18	18	18	18	18	18	18	18

27	27	27	27	27	27	27	27	27
36	36	36	36	36	36	36	36	36
45	45	45	45	45	45	45	45	45

54	54	54	54	54	54	54	54	54
63	63	63	63	63	63	63	63	63
72	72	72	72	72	72	72	72	72

0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0

27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27

54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54

0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6

27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33

54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60

0	1	2	3	4	5	6	7	8
9	10	11	12	13	14	15	16	17
18	19	20	21	22	23	24	25	26

27	28	29	30	31	32	33	34	35
36	37	38	39	40	41	42	43	44
45	46	47	48	49	50	51	52	53

54	55	56	57	58	59	60	61	62
63	64	65	66	67	68	69	70	71
72	73	74	75	76	77	78	79	80

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	0	0	0	0	0	0	0	0
9	9	9	9	9	9	9	9	9
18	18	18	18	18	18	18	18	18

27	27	27	27	27	27	27	27	27
36	36	36	36	36	36	36	36	36
45	45	45	45	45	45	45	45	45

54	54	54	54	54	54	54	54	54
63	63	63	63	63	63	63	63	63
72	72	72	72	72	72	72	72	72

0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0

27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27

54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54

0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6

27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33

54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60

0	1	2	3	4	5	6	7	8
9	10	11	12	13	14	15	16	17
18	19	20	21	22	23	24	25	26

27	28	29	30	31	32	33	34	35
36	37	38	39	40	41	42	43	44
45	46	47	48	49	50	51	52	53

54	55	56	57	58	59	60	61	62
63	64	65	66	67	68	69	70	71
72	73	74	75	76	77	78	79	80

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8
0	1	2	3	4	5	6	7	8

0	0	0	0	0	0	0	0	0
9	9	9	9	9	9	9	9	9
18	18	18	18	18	18	18	18	18

27	27	27	27	27	27	27	27	27
36	36	36	36	36	36	36	36	36
45	45	45	45	45	45	45	45	45

54	54	54	54	54	54	54	54	54
63	63	63	63	63	63	63	63	63
72	72	72	72	72	72	72	72	72

0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0	0

27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27
27	27	27	27	27	27	27	27	27

54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54
54	54	54	54	54	54	54	54	54

0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6
0	0	0	3	3	3	6	6	6

27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33
27	27	27	30	30	30	33	33	33

54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60
54	54	54	57	57	57	60	60	60