I found the defect, and I think I can fix it. Let’s find out. Then I rationalizeXXX explain something.

I did some bisecting, and discovered that the PathFinder not going down the WayDown happened back in the May-June timeframe. The same crash that we saw yesterday had cropped up, and my fix for it made the PathFinder not finish its path by vanishing. I never noticed, and if we had any users, they might not notice either, because they wouldn’t know what was supposed to happen. Anyway, we have no users, which is sad but just about impossible to fix.

Here’s the method causing the problem:

function Monster:chooseMove()
    -- return true if in range of player
    if self:isDead() then return false end
    self:executeMoveStrategy()
    return self:manhattanDistanceFromPlayer() <= 10
end

If we remove the PathFinder’s tile, the distance call fails on the nil. But then I checked the caller of the method:

function Monsters:move()
    for i,m in ipairs(self.table) do
        m:chooseMove()
    end
end

That code makes no use of the return value. Must be left over from ages ago, probably when I used that check to turn the music on and off.

So we can remove that check.

function Monster:chooseMove()
    if self:isDead() then return false end
    self:executeMoveStrategy()
end

The check for isDead is interesting, too, since we probably should kill the PathFinder when its path completes. We’ll check that while we change the strategy back to what it should be:

function PathMonsterStrategy:execute(dungeon)
    local oldTile = self.monster:getTile()
    local newTile = self.map:nextAfter(oldTile)
    if newTile then
        newTile:moveObject(self.monster)
    else
        self.monster:pathComplete()
    end
end

I think we fiddled with the PathMap too.

function PathMap:nextAfter(tile)
    local cell = self:getCellFromTile(tile)
    local prev = cell.parent
    if not prev then return tile end
    return self.dungeon:getTile(vec2(prev.x,prev.y))
end

We should return nil, not the current tile. Returning the current tile is what kept the monster where it was, on top of the WayDown.

function PathMap:nextAfter(tile)
    local cell = self:getCellFromTile(tile)
    local prev = cell.parent
    if not prev then return nil end
    return self.dungeon:getTile(vec2(prev.x,prev.y))
end

Now let’s see what pathComplete does, just to be sure.

function Monster:pathComplete()
    self.runner:removeMonster(self)
end

That looks just right, doesn’t it? Let’s test. And it works as intended. PathFinder moves to the WayDown and on its next move, vanishes. Perfect.

Commit: Fix defect where PathFinder didn’t vanish.

Since this is my second article this morning, let’s reflect and publish.

Reflection

I’m confident that the defect is fixed: we read through all the relevant code, and I tested it manually, so I think it’s fixed. However, we didn’t have a CodeaUnit test that failed, and we still don’t have one.

If we had had one, we’d have caught this defect three or four months ago. If we had one now, we’d be at least as confident that we fixed the defect, and we’d have added insurance that we wouldn’t somehow break it later.

I’m not up to writing that test today. I think we’d have to set up a bunch of fake objects and fiddle around. Maybe tomorrow. Or maybe it’s one of those things we just deal with and move on.

Let’s talk about that a bit.

Putting in the Hex capability has given me the opportunity to grub around in the depths of the code, and while I’d argue that it’s not bad, it’s not great either. I feel confident that any of my many betters would see issues, and more valuable, see reasonable ways to make things better.

Code tends to deteriorate as we work on it, because we’re not perfect, and we’re impatient and/or lazy. We put in the best code we can think of at the moment. We do make a point of looking at it to refactor it better, but, in the moment, we don’t see all the issues, or don’t see good improvements, or see some improvement but not the best, or see some improvement and it’s lunch time.

On a real product team, we are very likely to be feeling pressure, whether imposed by well-meaning but perhaps less than idea management, or by our own concerns, or, occasionally, by upcoming dates that are actually important. That pressure causes us to try to save time by leaving things “good enough”, even though they’re not as good enough as we deserve, not as good enough as to really let us move as rapidly as we wish.

Because that’s the rub: doing a little more improvement now almost always pays off in moving more easily, and faster, later. It’s like sharpening your knives. Yes, you can over-sharpen, but most of us sharpen too seldom, and our cooking process gets slowed down for lack of an effective knife.

Similarly, our development gets slowed down because our code isn’t sharp enough. It has rough edges, and those edges demand attention when we work with it, and cause surprising defects when we don’t notice.

Mostly, I think all we can do is observe what happens as we work in different ways, and try to work in the ways that seem most effective and most pleasant. (For me those are almost always aligned.) And when we find something that needs a little work, work on it the next time we’re in the area. (I don’t recommend just randomly improving code that isn’t being worked on: I think it can wait. But if you do get some free time to clean up something, sure, do it. It’s your code. As Hill puts it, the code works for you: you don’t work for the code.

Anyway, we fixed a defect. Maybe soon we’ll work out a test for it. Maybe we won’t. I’m not the boss of me.

See you next time!