Moving On With Sudoku

A little more explanation of what I'm up to, and a test of a method that actually figures out what could go in a cell. Whee!

Added: Nice Catch, Howard

Howard Trickey of Google noticed that the givens matrix shown below had a typo in it. That makes me think that a little verification testing might be a good thing. We’ll think about that in upcoming articles. Thanks, Howard!

What WAS I Thinking?

I haven’t been writing down my every thought on this program, partly because I don’t have any thoughts, and partly from lack of practice. I’ll try to rectify both of those issues here. Here’s a test I typed in last night. I found a bunch of Sudoku at websudoku.com, and this is Easy Puzzle number 6,484,450,847. Wow, they are well-organized over there.

    def test_first_game
      givens = 
      [0, 2, 3, 8, 0, 0, 0, 0, 7,
       0, 5, 8, 9, 7, 6, 2, 0, 0,
       7, 0, 0, 2, 0, 0, 0, 5, 0,
       9, 0, 0, 4, 0, 2, 0, 0, 0,
       6, 0, 7, 0, 0, 0, 4, 0, 5,
       0, 0, 0, 7, 0, 8, 0, 0, 6,
       0, 1, 0, 0, 0, 7, 0, 0, 8, # modified to fix typo
       0, 0, 9, 5, 8, 4, 1, 7, 0,
       8, 0, 0, 0, 0, 3, 5, 6, 0]
      game = Game.new(givens)
      assert_equal(-1, game.first_constrained)
    end
  end

I was reading on some web site, and chatting on the TDD list (I think) that often, in easy games, a cell is fully constrained by what has gone on before, namely there is only one value that can possibly be in it. I looked at the game above on the web, and found a cell in it that can only have one value. So my thought is to find the first cell in the game that is fully constrained. It seems to me that if at each stage in a game, there is at least one fully constrained cell, then we can just repeatedly find the first one, set it to its necessary value, and repeat. This seems like a job for a computer, not a human.

(Except, perhaps, for the fact that the cell I found is way down toward the bottom of the puzzle, and I found it in very few tries, maybe even one. I did that by just looking at the layout of the givens, and looking in a square that had a lot of values in it, and a cell that had relatively large number of values in the rows and columns aiming at it. It seemed obvious to me that such a cell would be more likely to be fully constrained. So I didn't search each cell until I found one -- I picked a likely one. I don't think a computer would mind just searching them all, but we could think about whether it would be interesting to have it look in some other order than sequential.)

I didn’t mention earlier why I wanted those methods that fetch all the values in a row, column, and square. Well, it seems to me that a given empty cell can contain only a value that is not already in its row, column, and square. So those seem to me to be something we’ll need. (I choose to say “we’ll need” here to emphasize that implementing those methods in yesterday’s spike was a violation of YAGNI, the rule that says not to implement things just because you need them.)

"And thirdly, the Code is more what you'd call 'guidelines' than actual rules. Welcome aboard the Black Pearl, Miss Turner."

So sue me. I felt the need to write some code that manipulated the puzzle, and identifying rows and such is a pretty clear need.

However, Cory Foy posted some code last night that seems to be asking similar questions about the rows and columns, but that’s doing so by scanning the rows and columns, looking for things. I’m planning not to do that, as I expect we’ll see in this article, and the row and column fetchers are part of that. Let’s see if I use those, and then let’s see if we can figure out why I did it that way.

Finding the First Constrained Cell

That’s what the test above implies, finding the first constrained cell. My cunning plan for doing that is to go through the cells until I find an empty one that is fully constrained. By that I mean that in the values in its row, its column, and its square, there is only one number left.

I typed in the test above because I had the Sudoku puzzle up on the screen, so it was convenient to type it in. I’m not at all sure that I can make that puzzle run all at once, because it’s going to take a fair amount of code. I have to figure out which row number a cell is in, which column, and which square. Each of those is a bunch of integer divides and modular arithmetic, which I have a good chance of getting wrong. I might have to back off and do some simpler tests, but I’m tempted to go right for this one.

No. As we talk about it, I’m just sure that that I’ll do at least one of those calculations wrong. I don’t want to write 243 tests for row, column, and square, but I’d better write some first. So I’ll put that test on hold and write some other ones first. First the row:

    def test_row_containing
      assert_equal(0, @game.row_containing(0))
    end

That should be easy. I might even fake it. While I was writing that test, I changed all the existing ones to use @game, a member variable, which I initialized in setup, and removed all the individual setups. There were already six or seven tests, so I had been derelict in not removing that duplication. It happens when you get excited. Anyway, let’s make that test run.

    def row_containing(aCell)
      0
    end

Well, done. Ship it. On the other hand, maybe a few more test values. Now some people might write a test for every row they plan to look up. I myself do not intend to do that: I’ll just test some more values in our existing test.

    def test_row_containing
      assert_equal(0, @game.row_containing(0))
      assert_equal(0, @game.row_containing(8))
      assert_equal(1, @game.row_containing(10))
      assert_equal(2, @game.row_containing(20))
      assert_equal(3, @game.row_containing(30))
      assert_equal(4, @game.row_containing(40))
      assert_equal(5, @game.row_containing(50))
      assert_equal(6, @game.row_containing(60))
      assert_equal(7, @game.row_containing(70))
      assert_equal(8, @game.row_containing(80))
    end

I’m not sure why I went by tens. I can see how those values could lead to a bad implementation. And I’m not entirely comfortable with the method name, because it could mean “look for a row containing the value X” instead of “look for a row containing the cell X”. I’ll let that ride but might rename it if it bothers me more.

The test fails at line 37. Now if I had used that one assertion per test trick, the error message might have been more clear, but at the cost of nine more test methods. Today I prefer this. Now to implement row_containing:

    def row_containing(aCell)
      aCell / 9
    end

The tests all run. The rows go 0..8, 9..17, and so on. So that implementation seemed obvious. Less obvious will be the columns. Let’s rape and paste that test:

    def test_column_containing
      assert_equal(0, @game.column_containing(0))
      assert_equal(8, @game.column_containing(8))
      assert_equal(1, @game.column_containing(10))
      assert_equal(2, @game.column_containing(20))
      assert_equal(3, @game.column_containing(30))
      assert_equal(4, @game.column_containing(40))
      assert_equal(5, @game.column_containing(50))
      assert_equal(6, @game.column_containing(60))
      assert_equal(7, @game.column_containing(70))
      assert_equal(8, @game.column_containing(80))
    end

Now I don’t like this one, because the same implementation would work for most of the items, because I’m going down the diagonal. Better choose some harder values.

    def test_column_containing
      assert_equal(0, @game.column_containing(0))
      assert_equal(8, @game.column_containing(8))
      assert_equal(1, @game.column_containing(19))
      assert_equal(2, @game.column_containing(29))
      assert_equal(3, @game.column_containing(39))
      assert_equal(4, @game.column_containing(49))
      assert_equal(5, @game.column_containing(59))
      assert_equal(6, @game.column_containing(69))
      assert_equal(7, @game.column_containing(79))
      assert_equal(8, @game.column_containing(71))
    end

Now I’m still uncomfortable. I did this test by kind of reasoning about the puzzle board. I need to draw one that’s neat enough to examine. But first I’l see whether this one can be made to work. Let’s think. The column number is just the cell number modulo 9, isn’t it? Cell 0 is column zero, cell 9 is zero, and so on. So …

    def column_containing(aCell)
      aCell % 9
    end

The test runs. I wasn’t thinking about modulo when I wrote the test, so I’m much more confident than I was.

How Confident Should I Be?

My guess is that you may not be entirely confident in this code right now, and that Chet might not be if he were here pairing with me. When you’re not confident in the code, the cool thing to do is to propose a test that you think won’t run. That way you don’t have to impugn the moral integrity of your pair. Let’s add column tests going up the minor diagonal of the puzzle:

    def test_column_containing_minor_diagonal
      assert_equal(0, @game.column_containing(72))
      assert_equal(1, @game.column_containing(64))
      assert_equal(2, @game.column_containing(56))
      assert_equal(3, @game.column_containing(48))
      assert_equal(4, @game.column_containing(40))
      assert_equal(5, @game.column_containing(32))
      assert_equal(6, @game.column_containing(24))
      assert_equal(7, @game.column_containing(16))
      assert_equal(8, @game.column_containing(8))
    end

I got those numbers by looking at a picture. Let’s see if it runs. In fact it does! Now about the squares. For those, what would be good to test? Let’s test the center value in each square, and a few others, maybe the major diagonal of the top left and bottom right squares. Then we’ll see how we feel.

    def test_square_containing
      assert_equal(0, @game.square_containing(10))
      assert_equal(1, @game.square_containing(13))
      assert_equal(2, @game.square_containing(16))
      assert_equal(3, @game.square_containing(37))
      assert_equal(4, @game.square_containing(40))
      assert_equal(5, @game.square_containing(43))
      assert_equal(6, @game.square_containing(64))
      assert_equal(7, @game.square_containing(67))
      assert_equal(8, @game.square_containing(70))
    end

I did that one by looking at the picture again. I’ll make that work then add a few more tests. Let’s reason. The square numbers go across and then down:

0 1 2
3 4 5
6 7 8

Those values are three times something, plus 0, 1, or 2, depending on something else. What are the something and the something else? They must relate to the row number and column number. Ha. “Something” is the row number of the cell, divided by three. The “something else” is the column number of the cell, also divided by three. At least that’s my bet …

    def square_containing(aCell)
      row_containing(aCell) / 3 + column_containing(aCell) / 3
    end

Bach, Hummel. Wrong. Let’s think again. Ah. I forgot the “three times something”

    def square_containing(aCell)
      row_containing(aCell) / 3 * 3 + column_containing(aCell) / 3
    end

Right. That works, and it looks awfully familiar. Where have I seen that kind of expression before? Ah … something much like it appears in this method:

    def start_cell(square_number)
      first_row = square_number / 3 * 3
      first_column = (square_number % 3) * 3
      first_row * 9 + first_column
    end

Now all this modular and integer arithmetic is fine, and I think we’re good. (I’m not going to forget to check some more values in the square_containing test, though.) But it’s not entirely clear what all the divides and mods are about, even though they are correct. I’m wondering if there aren’t some mathematical identities lurking here in this puzzle manipulation, and some objects waiting to be born in the code. I might think about those … they might make the code more clear, or more reliable.

I’m also noticing a bunch of 3s and 9s. Those relate, of course, to the order of the puzzle, and its square, the length of the rows. Those will have to be changed if and when we go to different sized puzzles. Might be better to clean that up soon, before there are a lot of them. They’re definitely a kind of duplication, though giving them names won’t help with that, and they definitely don’t communicate what they mean, which names will help with. Remind me of that when we get to a refactoring stage. For now, a couple more tests:

    def test_square_containing
      assert_equal(0, @game.square_containing(10))
      assert_equal(0, @game.square_containing(0))
      assert_equal(0, @game.square_containing(20))
      assert_equal(1, @game.square_containing(13))
      assert_equal(2, @game.square_containing(16))
      assert_equal(3, @game.square_containing(37))
      assert_equal(4, @game.square_containing(40))
      assert_equal(5, @game.square_containing(43))
      assert_equal(5, @game.square_containing(33))
      assert_equal(5, @game.square_containing(53))
      assert_equal(6, @game.square_containing(64))
      assert_equal(7, @game.square_containing(67))
      assert_equal(8, @game.square_containing(70))
      assert_equal(8, @game.square_containing(60))
      assert_equal(8, @game.square_containing(80))
    end

Those new ones all run. I’m feeling a bit edgy, though, because of the regularity of those numbers. I think I’ll add a couple more, from the minor diagonal of the sub-squares:

    def test_square_containing
      assert_equal(0, @game.square_containing(10))
      assert_equal(0, @game.square_containing(0))
      assert_equal(0, @game.square_containing(20))
      assert_equal(1, @game.square_containing(13))
      assert_equal(2, @game.square_containing(16))
      assert_equal(3, @game.square_containing(37))
      assert_equal(3, @game.square_containing(29))
      assert_equal(3, @game.square_containing(45))
      assert_equal(4, @game.square_containing(40))
      assert_equal(5, @game.square_containing(43))
      assert_equal(5, @game.square_containing(33))
      assert_equal(5, @game.square_containing(53))
      assert_equal(6, @game.square_containing(64))
      assert_equal(7, @game.square_containing(67))
      assert_equal(7, @game.square_containing(59))
      assert_equal(7, @game.square_containing(75))
      assert_equal(8, @game.square_containing(70))
      assert_equal(8, @game.square_containing(60))
      assert_equal(8, @game.square_containing(80))
    end

Test still runs, so I’m confident that it’s good. I should probably have been confident sooner, since the math is so simple, but when you do modular arithmetic and integer divides, you need to be sure about the edge cases. Usually with a zero base, such as we are using, you’re good with the simple expressions, but it can’t hurt to be sure.

However, these tests are not terribly easy to read, and they were a pain to write. That might be a sign that my objects aren’t helping, but in this case I don’t think so, I think it’s just showing me that I’m testing the insides of the object. But still … maybe there is an object trying to get out. Some kind of row-column-square thing? I don’t know, and don’t see it yet.

For now, the code works. Let’s move on to constraints.

Constraints

I’m thinking that to make my big test run, I need a couple of helper methods, and I’m going to build them first. This is not my usual way. Usually I’d just code up the first_constrained method by intention and then make its inner methods work. I could do that … it’s just a loop over the cells, asking whether each is constrained. Doing the helper methods feels better to me. Maybe it means I’ve thought too much about this, or that I’m following my anticipation too much.

Talked myself into it. I’ll just write first_constrained and make it work. If it doesn’t go easily, I’ll write tests for the inner guys. Here goes:

    def first_constrained
      (0..80).each do
        | cell |
        return cell if cell.constrained?
      end
      return -1
    end

I don’t much like that minus one return if there is no cell constrained. Hmm. It should be nil. An integer in Ruby is an object. Too much C# on my mind. I’ll change it to return nil if it doesn’t find a constrained.

    def first_constrained
      (0..80).each do
        | cell |
        return cell if cell.constrained?
      end
      return nil
    end

    def constrained?(aCell)
      possible_values(aCell).length == 1
    end

    def possible_values(aCell)
      possibles = 0..9
      possibles = possibles - row(row_containing(aCell))
      possibles = possibles - column(column_containing(aCell))
      return possibles - square(square_containing(aCell))
    end

What’s going on here? Well, a cell is constrained if the number of possible values it has is one. Its possible values are 1 through 9, minus all the values already in its row, minus the values in its column, minus the values in its square. So this should do it.

I expect the test to fail, because it’s looking for -1. Then I’ll manually check the cell that it finds to see if it is in fact constrained. Here’s the test again, so we don’t have to scroll up:

    def test_first_game
      givens = 
      [0, 2, 3, 8, 0, 0, 0, 0, 7,
       0, 5, 8, 9, 7, 6, 2, 0, 0,
       7, 0, 0, 2, 0, 0, 0, 5, 0,
       9, 0, 0, 4, 0, 2, 0, 0, 0,
       6, 0, 7, 0, 0, 0, 4, 0, 5,
       0, 0, 0, 7, 0, 8, 0, 0, 6,
       0, 1, 0, 0, 0, 7, 0, 0, 8, # modified to fix typo
       0, 0, 9, 5, 8, 4, 1, 7, 0,
       8, 0, 0, 0, 0, 3, 5, 4, 0]
      game = Game.new(givens)
      assert_equal(-1, game.first_constrained)
    end

Here we go. Hold on, that was a lot of code, ten or so lines. Bfff, that was odd. I did cell.constrained? and should have said this:

    def first_constrained
      (0..80).each do
        | cell |
        return cell if constrained?(cell)
      end
      return nil
    end

That tells me something, however. I wanted to send the constrained? message to the cell. But the cell is a plain integer. This may indicate an object trying to be discovered. We’ll see. Now to run the test again … darn. Ranges don’t understand -. We’ll make our starting possibles an array.

    def possible_values(aCell)
      possibles = [1, 2, 3, 4, 5, 6, 7, 8, 9]
      possibles = possibles - row(row_containing(aCell))
      possibles = possibles - column(column_containing(aCell))
      return possibles - square(square_containing(aCell))
    end

The starting possibles were wrong anyway, I had a zero in there. Did you notice that? Running the test again … it says -1 expected, but was three. That’s not so good, since cell 3 already has something in it. I need to include the notion that the cell is empty.

    def constrained?(aCell)
      @cells(aCell) == 0 && possible_values(aCell).length == 1
    end

This is feeling a bit hackish. Two dumb mistakes. The tests are helping me, but I think I’m skating on thin ice here. Better slow down. Possibly doing the inner tests first was a good idea. Run the test again … arrgh. Syntax error. Shoudl have had:

    def constrained?(aCell)
      @cells[aCell] == 0 && possible_values(aCell).length == 1
    end

I’m definitely going too fast, and need a pair. I should take a break. I’ll see if this test will work … it should be OK now. Or am I beating my head on the keyboard? Let’s see … now it says -1 expected but was 23. That’s at least plausible. In the game, cell 23 is the one in the lower right corner of square 1. I’d better give you a picture of the game you can understand. Hold on …

The blue numbers there are the two constrained cells I found by inspection. The cell we care about, again, is the lower right in square one, right under the six. Is it constrained? Let’s see … from the column, it can’t contain 2, 3, 4, 6, 7, 8. From its row, it can’t contain 2, 5, or 7, making the list 2, 3, 4, 5, 6, 7, 8. In the square, we see 2, 6, 7, 8, 9, adding the 9 to the list. Therefore the cell can only contain 1, and it is constrained!

The first_constrained method has definitely found a constrained cell, but can we be sure that it hasn’t missed any earlier ones? The only way I see is to manually check all the preceding cells. Or I could write a test where I know the answer, I suppose. Let’s leave that to a later test. For now, I’ll fill in this value in this test, and test the proposed cell value as well:

    def test_first_game
      givens = 
      [0, 2, 3, 8, 0, 0, 0, 0, 7,
       0, 5, 8, 9, 7, 6, 2, 0, 0,
       7, 0, 0, 2, 0, 0, 0, 5, 0,
       9, 0, 0, 4, 0, 2, 0, 0, 0,
       6, 0, 7, 0, 0, 0, 4, 0, 5,
       0, 0, 0, 7, 0, 8, 0, 0, 6,
       0, 1, 0, 0, 0, 7, 0, 0, 8, # modified to fix typo
       0, 0, 9, 5, 8, 4, 1, 7, 0,
       8, 0, 0, 0, 0, 3, 5, 4, 0]
      game = Game.new(givens)
      assert_equal(23, game.first_constrained)
      assert_equal(1, game.proposed_value(23))
    end

I’ll implement proposed_value and see if the test runs:

    def proposed_value(aCell)
      possible_values(aCell).first
    end

The test runs, but that code is a bit weak … I’m relying on the fact that there will only be one possible value. Let’s make it stronger, by returning nil if there’s more than one possible:

    def proposed_value(aCell)
      possibles = possible_values(aCell)
      return nil if possibles.length != 1
      return possibles.first
    end

Tests are all good. Time to go to T’ai Chi class. See you after lunch.

The Prodigal Returns

I’m back, exercised and fed. Also had time to drop off a few CDs I had borrowed from my son, over at Jolly Pumpkin. Let’s see where we stand.

I think that the first_constrained thing actually works, but it would be nice to have a test that’s more obviously right. I’d have to invent it, I guess.

I got a couple of nice suggestions from Joseph Graves, on the TDD list. One was the suggestion that I might take a solved puzzle, remove one value, then solve it, then remove two, and so on. He suggested comparing the returned result each time to the known good answer, which is kind of a cool notion. We might try that, depending on how things go.

For right now, our tests are green, so let’s take a break and come back in the next article. Enjoy, and keep those cards and letters coming in.