Dungeon 303
Loose ends. Finding little things to do. Generalizing from few cases and many years. Delenda Chicago est.
There are lots of yellow sticky notes, my local substitute for 3x5 cards, stuck to my zebrawood desk. And there’s one ignored test that nags at me whenever I run the tests, which is really quite often. Let’s clean up some things and see whether they bear any lessons.
Ignored Test
I’ve rigged CodeaUnit to display broken test suite names in red, suites with ignored tests in yellow, and suites that are all good in green. It looks like this:
What is the ignored Decor test? I don’t even remember. It was literally days ago.
_:ignore("Decor gives item only once", function()
local tile = FakeTile()
local item = FakeItem()
local decor = Decor(tile,item)
decor:giveItem()
_:expect(receivedItem).is("item")
receivedItem = "nothing"
decor:giveItem()
_:expect(receivedItem).is("nothing")
end)
Ah. Yes. In days of yore, Decor actually gave its contained item to the player. This test insures that, once it has done that, it won’t do it again. But we’ve changed how that all happens.
Now, the Decor, when first bumped, creates a Loot item on its own square. The Loot handles the giving. So what we want to test here is that the Decor doesn’t keep creating loots.
We’d best look at how it works.
function Decor:actionWith(aPlayer)
self.sequence:action(aPlayer)
end
function Decor:firstAction(aPlayer)
self:changeAppearance()
self:showLoot()
self:applyDamage(aPlayer)
end
function Decor:secondAction()
-- nothing
end
function Decor:thirdAction()
self.open = true
end
And the sequence is this:
function Decor:initActionSequence()
local actionTable = {
self.firstAction,
self.secondAction,
self.thirdAction
}
self.sequence = ActionSequence(self, actionTable)
end
We don’t need to test the ActionSequence here. It’s tested where that class is defined. And, honestly, it’s clear by inspection that it works. But what would prevent me from accidentally improving the sequences and introducing an error? I guess, based on years of experience, nothing will prevent that, but a test might catch it.
But this object has a side effect, creating and showing the Loot, and we want to check to be sure that it does that only once. Let’s see how it does it now:
function Decor:showLoot()
self:currentTile():moveObject(self.loot)
end
Meh. We have created our loot in init
, so that all we have to do is tell our current tile to move the loot to itself. We want to be sure that Decor only does that once, no matter how many times we interact with it. But it’s all action at a distance, isn’t it? And currentTile, if I’m not mistaken, is interesting in itself:
function DungeonObject:currentTile()
return Bus:query("tileContaining", self)
end
In the olden days, to do a decent job testing this, we’d need to set up a GameRunner, a Dungeon, and all that jazz … but also in the olden days, the Decor saved its tile, which might have given us a leg up here.
Let’s see what happens if we turn our test on and recast it to use the current scheme.
So far, all I’ve done here is replace the ignore
with test
, and change the giveItem
to actionWith
. I’m hoping it won’t be too hard to make that work. One issue is that actionWith
is supposed to pass in a player, and we’re not.
Test to see what happens.
6: Decor shows Loot only once -- Decor:147: attempt to index a nil value
That’s here:
function Decor:showLoot()
self:currentTile():moveObject(self.loot)
end
So currentTile
must have returned nil. Not too much of a surprise, since we’re using a fake tile in the test. In the Decor init, we do this:
tile:moveObject(self)
And that does this:
function Tile:moveObject(thing)
self:addContents(thing)
end
function Tile:addContents(aDungeonObject)
Bus:publish("moveObjectToTile", aDungeonObject, self)
end
Is this the spooky “important business logic getting implicit” that Hill was worried about yesterday? Maybe. Be that as it may, if we add that publish to our FakeTile, what happens?
function FakeTile:addContents(something)
self.added = something
Bus:publish("moveObjectToTile", something, self)
end
This happens:
6: Decor shows Loot only once -- CombatRound:128: attempt to index a nil value (field 'defender')
Yes. That’s happening because we’re doing damage. The code is written without conditions:
function Decor:firstAction(aPlayer)
self:changeAppearance()
self:showLoot()
self:applyDamage(aPlayer)
end
function Decor:applyDamage(aPlayer)
local round = CombatRound(self,aPlayer)
self.danger(self,round)
end
The creation of the round is wasteful but harmless. We won’t use it unless danger
says to. Danger, irritatingly, is random:
local dt = {self.doNothing, self.doNothing, self.castLethargy, self.castWeakness}
self.danger = dt[math.random(1,#dt)]
We’ve done this before. We’ll do it again.
_:test("Decor shows Loot only once", function()
local tile = FakeTile()
local item = FakeItem()
local decor = Decor(tile,item)
decor.danger = decor.doNothing
decor:actionWith()
_:expect(receivedItem).is("item")
receivedItem = "nothing"
decor:actionWith()
_:expect(receivedItem).is("nothing")
end)
Here, I’ve patched the decor instance to do nothing with the combat round. A bit iffy, yes, but we’re trying to make this work.
The test now fails as we might expect:
6: Decor shows Loot only once -- Actual: nil, Expected: item
I rather hate this. We’ll talk about that below. I really just want to know that showLoot
is called only once. I don’t want to patch it, but let’s do that as a stopgap.
local lootCount
_:test("Decor shows Loot only once", function()
local tile = FakeTile()
local item = FakeItem()
local decor = Decor(tile,item)
decor.danger = decor.doNothing
decor.showLoot = function()
lootCount = lootCount + 1
end
lootCount = 0
decor:actionWith()
_:expect(lootCount, "did not show loot").is(1)
decor:actionWith()
_:expect(lootCount, "showed loot again!").is(1)
end)
I monkey-patched Decor twice. First, to be sure it doesn’t try to apply damage. Second, to replace its standard showLoot with a function that ticks a lootCount
. We expect that showLoot
to be done once, and the test checks for that.
The test runs. We’ve shown that Decor only displays its loot once. We can double-check that by inserting a defect:
function Decor:secondAction()
-- nothing
self:showLoot()
end
The test should now fail. It does:
6: Decor shows Loot only once showed loot again! -- Actual: 2, Expected: 1
Take out the defect, test, commit: Decor tests now check that Loot is rezzed only once.
Let’s reflect.
Reflection
Well, the good news is, we have a test that locks in the Decor’s requirement to show its Loot once and only once.
The bad news is that we had to patch the object to come up with a reasonable test: we patched once to be sure the Decor wouldn’t try to do damage in our testing situation, and again to cause the showLoot
to tick a counter instead of actually showing the Loot. The reason that was “desirable” was that the alternative was to test this sequence:
- The Decor sees
actionWith
. - Decor does ‘showLoot`, which
- Uses the Bus to get the Decor’s tile, and then
- Tells the tile to accept the Loot, which
- Publishes move object to tile, which
- Is subscribed by DungeonContents, which
- Tucks away the Loot.
We want to know that that happens only once. Even if we wired up enough stuff for that to work, we can’t ask the DungeonContents how many Loots are there … because it stores them by identity and even if we called it a dozen times with the same Loot, it would still contain only one.
How it is tempting to blame the Bus for this, but in the previous implementation the sequence was equally impervious to testing. The only difference was that the tile was cached in the Decor.
But wait! We do have a FakeTile in play. What does it do with the object move?
Ah!
function FakeTile:moveObject(object)
self:addContents(object)
end
Let’s enhance FakeTile to count additions.
function FakeTile:init()
self.contentsCount = 0
end
function FakeTile:addContents(something)
self.contentsCount = self.contentsCount + 1
self.added = something
Bus:publish("moveObjectToTile", something, self)
end
function FakeTile:getContentsCount()
return self.contentsCount
end
Now we should be able to do this:
_:test("Decor shows Loot only once.", function()
local tile = FakeTile()
local item = FakeItem()
local decor = Decor(tile,item)
decor.danger = decor.doNothing
_:expect(tile:getContentsCount(), "Decor not added").is(1)
decor:actionWith()
_:expect(tile:getContentsCount(), "Did not show loot").is(2)
decor:actionWith()
_:expect(tile:getContentsCount(), "Showed loot twice").is(2)
end)
OK. That’s nearly good. We still patch to doNothing
, but that doesn’t trouble me much. If we really cared, we’d be passing in the kind of danger when we create these guys, and since we do have a “big story” about better control over dungeon contents, that may well turn up. For now, I’m OK with that patch. And we were already using our FakeTile, which avoids that deep nest of calls. Enhancing it to detect how many times it was used was just the sort of thing one expects to do with a Fake.
Commit: improved test showing Loot only rezzed once.
- Pausing to reflect is a good thing
- We see here the value of stepping back. I managed to make the test work, but with a couple of monkey-patches installed. Then, pausing game me the chance to think, which made room for the idea about the FakeTile, which can legitimately count the things moved onto it.
-
Thinking is good. Clearing our head is good. It’s all good.
Let’s sum up by thinking about some still larger issues.
Larger Issues
- WARNING
- Some convoluted and wandering thinking occurs below. I include it because thinking around the fuzzy edges of our design is just what we need to come up with better design notions. Still, watch out for tight turns and slippery spots.
It “should” be simple …
You’d think we would have a pretty simple situation here. A Decor item has a Loot item. When player bumps the Loot, it is sent actionWith
. On the first bump, it should show the Loot. On the second, it should do nothing, because the Loot is going to give itself to the player. On the third, it should “open”, which means that it can be stepped upon.
The Decor uses an ActionSequence
object to provide the three states that it needs. ActionSequence
is well tested. (We claim. But it is.) So what we want to be sure of is that Decor does rez the Loot in its first step, and does not in subsequent tests. In essence, we want to know what is, or is not, in each of Decor’s three internal methods that it sequences through.
With tests, we can only determine what this kind of object does by observing what it does to the objects it interacts with, which include a tile, a loot, a player, and so on. What’s irritating is that it is easy enough to look at the code and see that it works. But, if we want to nail down that behavior even after changes are made, we need a test.
I honestly do not see a good way to do this other than to use a test double, such as FakeTile. My usual practice, as a Detroit School TDDer, is to test things by looking at the results they return, or, occasionally, regretfully looking at their internal state. I’m not opposed to test doubles … I just seem not to need them often.
Are test doubles inherently a problem?
But here, in the dungeon program, I seem to be using test doubles more and more. That raises questions in my mind, including:
- What is different about D2 that causes test doubles to be needed more often than I usually see?
- To what extent is that difference bad? To what extent is it good?
- What should I change to get more good and less bad?
This is a wild guess but I think I recognize a key difference. The dungeon program’s design is that there are a lot of little objects that are interacting with each other to get the job done. I think it’s likely that most of the OO programs I write work more in a hierarchical fashion: an object wants an answer, calls another to get it, which calls a few others, and it fans out and then rolls back up into an answer.
Is the difference tied to program structure?
So the difference might be the difference between a hierarchical decomposition of a problem and a decomposition into objects that interact more freely to produce behavior.
It might be the difference between a program that produces answers, and a program that produces complex, unpredictable behavior.
Even an incredibly complex business program, such as a payroll, about which I know more than I’d like to, mostly comes down to a long series of calls to get specific numbers, and a deterministic calculation on those numbers. Net pay is gross pay minus deductions. Gross pay is salary plus bonus. Deductions are the sum of a possibly large collection of specific values with specific calculations providing them.
Even income taxes, about which, again, I know more than I would care to, are all about adding and subtracting and multiplying specific values, each one defined by a specific calculation, all done in a specific order.
Hierarchical works well with Detroit
Programs built like that are amenable to Detroit School TDD: you get an object, ask it a question, and check the answer. You can generally even test them all one at a time or in very small groups, because you’re quite sure that if A works and B works, and C adds up whatever A and B said, the whole thing works. You don’t really need to put A and B under C when you test C, if you can just make sure that C adds whatever two numbers it’s given.
So if you do Detroit School on hierarchically decomposed programs, you might need trivial fakes to provide two numbers to C, but you don’t need to set up complex networks of objects: you can mostly test them one at a time.
Interacting objects don’t have “answers”
But the D2 program isn’t like that. It’s trying to be a loose collection of autonomous treasures, monsters, and you, all interacting to make it seem like you’re exploring a dungeon. The result is that there’s a lot of spooky action at a distance stuff going on.
You click a Lever. Levers could change anything, anywhere: we have one that turns off all the Spikes in the entire level. How do we do that? We could have the Lever own a collection of Spikes and tell them all to turn off. But we don’t. We have Spikes listen for a specific shout from a Lever, and if they hear it, they stop working. And the Lever just shouts when it is toggled.
At the logical level, it’s easy to see that this works. Lever does this:
function Lever:nextPosition()
self.position = self.position + 1
if self.position > 4 then self.position = 1 end
Bus:publish("lever", {name=self.name, position=self.position})
end
That’s a message like “lever” with name “Lever 1” and position 1,2,3, or 4. And we have this:
function Spikes:lever(info)
self.stayDown = false
local pos = info.position or 1
if pos == 4 then
self.stayDown = true
end
end
So if we hear a lever message with position 4, we stay down. It’s easy to see that this is right … or since we’re looking at it … to see that it’s wrong.
As implemented, any Lever that is set to position 4 will cause the spikes to stay down. Is that what we intend? Perhaps we do intend that. Perhaps we really only want one kind of Lever, a “Spike Control Lever”, to do it. If that were the case, we should be checking the Lever name in the code above.
How might we test this? There are at least a couple of ways:
- Create a couple of Levers and a Spikes. Send the levers
nextPosition
and check the Spikes forstayDown
. - Create a Lever and test what it publishes on a call to
nextPosition
. Create a Spikes and test what it does on a call to lever, and test that it subscribes tolever
on the Bus. (The latter might be tricky to test.)
This quickly gets complex. Our program is made up of a bunch of objects, each with its own concerns, interacting in an open and configurable way. That’s enough to make it tricky to test whether Levers and Spikes collaborate as we intend, but our current setup leaves us with additional complexity.
The pub-sub style with the Bus means that we can arrange objects like Lever and Spike so that they can collaborate without regard to much else that goes on. We’re in the process of loosening other connections, such as the connections between a dungeon object and its tile. That connection is now managed by DungeonContents, but not that long ago, dungeon objects held their tile and sent messages to it, and they still send those messages even though they don’t currently hold the tile.
But the testing is strange.
A wild thought has appeared!
Ah. Here’s another reason why musing like this is helpful. It’d be even more helpful with a team.
We have this code in our Decor:
function Decor:showLoot()
self:currentTile():moveObject(self.loot)
end
function DungeonObject:currentTile()
return Bus:query("tileContaining", self)
end
We could argue that the Decor doesn’t care about a tile. It does want to show the loot … wherever it, the Decor, happens to be.
Would the world be a better place—or a worse place—if we had this:
function Decor:showLoot()
Bus:publish("createOnMe", self, self.loot)
end
Presumably, someone reasonable would subscribe to that message. Perhaps it would even be DungeonContents, which can find the self
(the Decor) and put the loot into the same collection. The tile doesn’t get involved at all.
Then our test could come down to providing a TestBus that does things like count calls. We could just ask it how many times it saw “createOnMe”
Not a conclusion exactly, but …
It’s important, I think, to note that we did get a decent test for Decor right away, albeit with some monkey-patching, and then a better test with less monkey-patching, by enhancing a Fake object.
And I think I may have a tiny grasp on an insight about when Detroit School works well—with hierarchical decomposition—and when London School may apply better—non-hierarchical interacting objects. Just a tiny grasp, a tingle. But maybe there’s something there.
And another possible insight: if an object, or a couple of them, are hard to test in this program, maybe it’s not the loose connections getting in the way. Perhaps more use of the Bus, even looser connections, will improve things.
Back to Lever and Spikes
Let’s think a bit more about the Lever and Spikes. For that to work, the Lever has to publish a message, and the Spikes have to subscribe to that message. In addition, when the Spikes get that message, they have to set their state to stayDown. We can test that last bit directly: create a Spikes, send it the lever
message, check the state.
We cannot currently test the rule that Lever and Spikes publish and subscribe to the same message. Could we do that, in a test? I’m not sure. But I do think we can probably test easily enough wether Spikes gets the message from Lever by creating each and triggering the lever.
I don’t know. Maybe Lever/Spikes is too easy. Or maybe we should work to make our other cases, like Decor/Loot, just that easy. And maybe, just maybe, the Bus can help us.
Bottom line (for today) …
I’m fairly confident that the complexity that’s making testing hard is not increased by the Bus. So far, the Bus idea seems to be working well and it seems that more of it will likely make things better.
I do think we’ll find that we want to have an “instrumented bus” for use in testing, to check that things were called and so on. There’s some London School name for that kind of test double. Remind me to look it up.
it seems to me that we are still learning just how best to use this Bus idea, and how best to make our dungeon objects seem to be independent cooperating things that create the game experience.
And most of all, I am amazed that there is still learning to be found in here, after 300 articles and 18 months of development.
See you next time?
And don’t forget. There is no “Chicago School” of TDD. There’s “Detroit”, and “London”, and that’s all there is.