I have shipped a defect, and a serious one. What happened? How could I have avoided it?

I loaded the Dung program on my tv room iPad last night, planning to browse the code and build up a sense of how best to proceed on the “big story” about providing level designers more control over dungeon objects in a given level. Fiddling around, I found something odd. I noticed that Inventory items are supposed to highlight when you touch them in the inventory bar at the top of the screen. I didn’t recall that happening, so I tried it.

It turns out that if you have two items and touch one, it highlights, but if you have only one item and touch it, it does not highlight. That’s not the defect. In testing the highlighting, at one point I had a PathFinder in inventory and when I touched it, the system crashed. The bug, sorry, defect, was immediately obvious:

function PathMap:initCells()
    self.cells = {}
    for key,tile in self.dungeon.map:pairs() do
        local x = tile.mapPoint:x()
        local y = tile.mapPoint:z() -- note z
        self.cells[key] = MapCell(x,y, self, self.dungeon)
    end
end

Dungeon does not have .map any more. The correct code is this:

function PathMap:initCells()
    self.cells = {}
    for key,tile in self.dungeon:getMap():pairs() do
        local x = tile.mapPoint:x()
        local y = tile.mapPoint:z() -- note z
        self.cells[key] = MapCell(x,y, self, self.dungeon)
    end
end

I made that change and then—and only then—this happened:

1: Initial Map -- PathFinder:147: attempt to call a nil value (method 'getMap')
2: One Frontier Step -- PathFinder:147: attempt to call a nil value (method 'getMap')
3: A few steps -- PathFinder:147: attempt to call a nil value (method 'getMap')
4: Full Loop -- PathFinder:147: attempt to call a nil value (method 'getMap')
5: Obstacle -- PathFinder:147: attempt to call a nil value (method 'getMap')
6: target accepts tile -- PathFinder:147: attempt to call a nil value (method 'getMap')

Six PathFinder tests failed. PathFinder is extensively tested, because it is a somewhat tricky algorithm. But no tests failed when I made the change requiring getMap, and when I did get PathFinder right, six tests suddenly fail. What’s up.

Test Double

PathFinder tests use a FakeDungeon test double, Reviewing the initial PathFinder article, I said

The real Dungeon class is too big. I’ll fake this one too.

I had already created a FakeMapTile, since I didn’t need any particular Tile functionality. The FakeMapDungeon is quite simple, just providing Tile access and an isAccessible method:

FakeMapDungeon = class()

function FakeMapDungeon:init(tiles, runner, player)
    self.map = tiles
end

function FakeMapDungeon:cellIsAccessible(pos)
    return true
end

function FakeMapDungeon:getTile(pos)
    return self.map:atXYZ(pos.x, 0, pos.y)
end

And there at the bottom we see that we fetch our map with self.map, so when the PathFinder initializes itself, the map member happens to be present, and the old tests used it and ran. But in the real Dungeon, map is not present, and PathFinder crashes. Implement it correctly, and it crashes during testing.

It’s easy enough to fix the tests:

function FakeMapDungeon:getMap()
    return self.map
end

function FakeMapDungeon:getTile(pos)
    return self:getMap():atXYZ(pos.x, 0, pos.y)
end

As a matter of policy, I feel that if an object has a getter for some member, it’s better to use the getter even internally. So I did that here. Now the tests are green and the PathFinder works. Ship it. Commit: Fix defect causing pathfinder to crash the system.

I am generally reluctant to use test doubles, mock objects, and the like, and what has happened here is exactly why. When I test against my real objects, defects like this one will be found. Testing against fake objects, changes to the real object have to be reflected in the fakes, or else things like this can go wrong.

Are We There Yet?

We’ve fixed the problem and fixed the tests. But are we done? It’s hard to argue that we are: suppose we were to change Dungeon’s method getMap to getTheMap. If we don’t find and change PathFinder, it would again pass all the tests, and yet not work in the game. That would be bad, especially since everyone would remember the first time and ask why we didn’t make the problem impossible.

I’m really reluctant to deal with this. The tests are all set up to rely on the fake objects. Here’s a typical test:

            local dungeonClass = FakeMapDungeon
            local tiles = FakeMapTile:create(50,50)
            local dungeon = dungeonClass(tiles)
            local targ = dungeon:getTile(vec2(10,10))
            local testing = true
            local map = PathMap(dungeon, targ, testing)
            map:targetXY(10,10)
            map:step()
            _:expect(#map:queue()).is(4)
            map:step()
            _:expect(#map:queue()).is(6)
        end)

OK, how about this. Our objects that get the dungeon don’t use it for much:

function PathMap:initCells()
    self.cells = {}
    for key,tile in self.dungeon:getMap():pairs() do
        local x = tile.mapPoint:x()
        local y = tile.mapPoint:z() -- note z
        self.cells[key] = MapCell(x,y, self, self.dungeon)
    end
end

function PathMap:nextAfter(tile)
    local cell = self:getCellFromTile(tile)
    local prev = cell.parent
    if not prev then return nil end
    return self.dungeon:getTile(vec2(prev.x,prev.y))
end

function MapCell:isAccessible()
    return self.accessible and self.dungeon:cellIsAccessible(self:posVec())
end

Could we pass in a DungeonAnalyzer or something like that, instead of a full Dungeon? Or could we build a real Dungeon but just give it the tiles as is done in the test?

The object we’d like to provide is probably the one referred to as “tile finder”, implemented in the class Tiles.

Tiles = class()

function Tiles:asTileFinder()
    return self
end

function Tiles:getMap()
    return self.map
end

function Tiles:getTile(x,y)
    return self.map:atXYZ(x,0,y) or Tile:edge(x,y,self)
end

function Tiles:setMap(aMap)
    self.map = aMap
end

function Tiles:tilePairs()
    return self.map:pairs()
end

Ah, this is seeming fruitful. The Tiles object, often referred to as “finder” or “tile finder” can do almost everything that the Path objects need. We should be able to use it in the path tests instead of our fake dungeon. Let me remind myself what that class can do:

FakeMapDungeon = class()

function FakeMapDungeon:init(tiles, runner, player)
    self.map = tiles
end

function FakeMapDungeon:cellIsAccessible(pos)
    return true
end

function FakeMapDungeon:getMap()
    return self.map
end

function FakeMapDungeon:getTile(pos)
    return self:getMap():atXYZ(pos.x, 0, pos.y)
end

I’m not fond of returning the map. Why do we need that?

function PathMap:initCells()
    self.cells = {}
    for key,tile in self.dungeon:getMap():pairs() do
        local x = tile.mapPoint:x()
        local y = tile.mapPoint:z() -- note z
        self.cells[key] = MapCell(x,y, self, self.dungeon)
    end
end

Ah. We do that differently in Tiles: we return the pairs from inside. Change PathMap to do that:

function PathMap:initCells()
    self.cells = {}
    for key,tile in self.dungeon:tilePairs() do
        local x = tile.mapPoint:x()
        local y = tile.mapPoint:z() -- note z
        self.cells[key] = MapCell(x,y, self, self.dungeon)
    end
end

Implement that in our FakeMapDungeon:

function FakeMapDungeon:tilePairs()
    return self.map:pairs()
end

I also removed the getMap function, since I don’t want anyone using it in pathing, and went back to saying .map in the getTile method:

function FakeMapDungeon:getTile(pos)
    return self.map:atXYZ(pos.x, 0, pos.y)
end

Now the two classes, Tiles and FakeMapDungeon have nearly the same members. I think we’ll find some issues in X,Y and we don’t have cellIsAccessible yet. Provde the latter:

function Tiles:cellIsAccessible(position)
    return self:getTile(position.x,position.y):isFloor()
end

I’m putting that in without a new test, because I plan to exercise it in the path tests.

One more issue, the path finder accesses by position, not by x and y, and to use a Tiles you have to go by x and y. Change that:

function FakeMapDungeon:getTile(x,y)
    return self.map:atXYZ(x, 0, y)
end

Change the references. Mostly tests. Also when I test I find that the calls with x and y don’t work in play, because the real Dungeon uses getTile with position not x and y. We need to fix that. But the real fix is this:

function GameRunner:mapToWayDown()
    return PathMap(self:getDungeon():asTileFinder(), self.wayDown:getTile())
end

We pass a Tiles (asTileFinder) to the PathMap … and it works. This tells me that I should be able to change the tests, like this:

        _:test("One Frontier Step", function()
            local tiles = FakeMapTile:create(50,50)
            local dungeon = Tiles()
            dungeon:setMap(tiles)
            local targ = dungeon:getTile(10,10)
            local testing = true
            local map = PathMap(dungeon, targ, testing)
            map:initCells()
            map:targetXY(10,10)
            _:expect(map:queue()[1]).is(map:cell(10,10))
            map:step()
            _:expect(#map:queue()).is(4)
            local queue = map:queue()
            _:expect(queue).has(map:cell(9,10))
            _:expect(queue).has(map:cell(10,9))
            _:expect(queue).has(map:cell(11,10))
            _:expect(queue).has(map:cell(10,11))
            _:expect(map:cell(9,10).parent).is(map:cell(10,10))
            _:expect(map:cell(10,11).distance).is(1)
        end)

This may break, but if it does it’ll guide me to finishing up. I am hopeful that it’ll work.

2: One Frontier Step -- Tiles:123: attempt to call a nil value (method 'isFloor')

Ah, this’ll be tricky. Or will it? We have a FakeMapTile here. Give it the method.

function FakeMapTile:isFloor()
    return true
end

Test. Tests are green. Change the others … and remove the FakeMapDungeon class, testing after each change.

We’re green. Double check by running a pathfinder. We’re all good. Commit: Change PathMap and tests to use Tiles instance instead of FakeMapDungeon. Ensures tests run on same code as game.

Let’s lift our heads out of the mud and reflect.

Reflection

Mirror mirror on the wall … never mind, I know I’m pretty.

We have been bitten firmly on the fundament by the fundamental flaw in test doubles: if they do not correctly mirror1 the behavior of the real objects, our tests cannot be relied upon. I’m sure my colleagues from the “London School2” of testing will let me know why that never happens, but it happens to me, and it happened right there3.

The fix was easy. Making sure that the problem could nor recur required either that I somehow magically keep the fake object current with the real one, or get rid of the fake. Plugging the Dungeon in there seemed daunting, but I remembered my new helper objects, the DungeonBuilder and DungeonAnalyzer, and, finally, the one I needed, Tiles. (I still think I prefer the name TileFinder.)

To plug Tiles in, I first made the two classes, Tiles and FakeMapDungeon, have the same protocol. Then plugging then in was easy, although I hit a glitch when it turned out that Dungeon uses getTiles(pos) while Tiles uses getTiles(x,y). That was easily fixed by casting the Dungeon to Tiles, which we wanted to do anyway.

Could the problem have been avoided?

Well, obviously if I were sufficiently smart, the problem could have been avoided. At the time that the pathfinding was implemented, Dungeon was a big object and hard to create, so that creating a fake one made sense in context. Perhaps a really careful search for .map would have turned up the FakeMapDungeon when we put in the getMap. But demanding “a really careful search” is not the way to avoid problems. We avoid problems by assuming—usually correctly—that we’ll be rushed, not particularly clever, and generally error-prone when we change things. We avoid problems by building traps for our mistakes.

A manual test protocol that said “always test a PathFinder” would have found the defect. That’s unrealistic.

One thing that “could” have happened is that before making my FakeMapDungeon, I could have observed that the real Dungeon was—and is—hard to set up, and done something about it then and there. Perhaps the Tiles object could have come along sooner, to everyone’s advantage. Again, that’s not entirely realistic, at least not if you’re dealing with me. Making a little fake handler for tiles really does make a lot of sense when creating something like the pathfinder.

Honestly, despite the seriousness of this defect, I don’t see a realistic idea for how the problem could have been avoided. We could have a rule against fake objects, but that seems a bit extreme. We can remind ourselves that when objects are hard to test, we should make them easier to test. We can recognize that the desire to test manually is an indication that we don’t trust our tests … but that’s not very useful in setting a direction.

Maybe next time I want to test manually—maybe every time I want to test manually—I need to then sit back and require myself to write the test that would make me not have that feeling. And certainly, whenever my manual testing finds a problem, I should try to write a test to find it.

Looking back, I think I did the best that I knew to do in the moment. It just wan’t as good as it might have been. It’s better now, and so am I.

I think that’s enough learning for today. What shall we do tomorrow?


D2.zip



  1. See what I did there? 

  2. By the way, the opposite of “London School” is not “Chicago School”. It’s “Detroit School”. Whether Londoners only know two US cities, New York and Chicago, or whether Chicago somehow contrived to steal from Detroit, I do not know. I do know where I was when we first did TDD in the Detroit style. (Strictly speaking, Center Line, but it’s part of metro Detroit.) 

  3. I hypothesize that the London School folks will be programming in a language with the Interface capability, and they can require the real class and the fake to implement the same Interface, so that when they changed Dungeon, FakeMapDungeon would have failed to compile. Yeah, maybe.