Some rationale, an experiment, some learning, some protection.

Even I was wondering yesterday why I was obsessing over the global Universe U and other tiny details. This morning I want to remind us, because here in outer space, obsessing is good.

In day to day work for most programmers, there is plenty of pressure to get things done, and often very little pressure to get them right. Just a day or so ago one of my colleagues reported a question from his class of professional programmers, “How do we convince our boss to let us test the code?”

Now I’ll remind you that my answer from yesterday applies: don’t ask, just do the right thing. But the question tells us a lot about the working situation of that person, and it’s probably not one of peaceful contemplation.

Here in this little exercise, we can think about whatever interest us, try things to see if we like them, change them out if we don’t. I’m not sure why you do it, but I do it because I enjoy building things and I enjoy figuring out how to build them well. And I enjoy sharpening my tools. That’s what this is.

So that’s why, or at least a big part of why.

As to the global U itself, the nature of Codea is such that something global simply must contain everything and dispatch at least some level of commands and information to everything. The Codea draw and touch events occur only in Main, and if anything is to draw itself, it must be told to do so via a path that starts at Main. If something else wants to draw that thing, that something else must be triggered by Main.

So we can conclude that there is some kind of path of code and data that starts at Main and reaches out to all the things that can be seen or heard. The question on my mind: what are good ways to do that?

The current Universe knows an awful lot of stuff:

function Universe:init()
    self.processorRatio = 1.0
    self.score = 0
    self.rotationStep = math.rad(1.5) -- degrees
    self.missileVelocity = vec2(MissileSpeed,0)
    self.frame64 = 0
    self.timeBetweenWaves = 2
    self:defineSounds()
    self.button = {}
    self.asteroids = {}
    self.missiles = {}
    self.explosions = {}
    self.attractMode = true
end

function Universe:defineSounds()
    self.sounds = {}
    self.sounds.bangLarge = asset.bangLarge
    self.sounds.bangMedium = asset.bangMedium
    self.sounds.bangSmall = asset.bangSmall
    self.sounds.beat1 = asset.beat1
    self.sounds.beat2 = asset.beat2
    self.sounds.extraShip = asset.extraShip
    self.sounds.fire = asset.fire
    self.sounds.saucerBig = asset.saucerBig
    self.sounds.saucerSmall = asset.saucerSmall
    self.sounds.thrust = asset.thrust
end

And Universe does a lot as well. It starts the game, creates new waves, tells everyone when to move or draw themselves, plays the beat, finds the collisions, plays almost all the other sounds, helps everything move, draws the score … there’s a lot going on here.

Is it all well organized? I’m sure it could be better but it’s not bad. But there are signs and portents.

Signs and Portents

Yesterday I convinced myself that removing the DeadAsteroids buffer collection would be harmless. The program failed to run.

When our hypotheses about our program go wrong, it’s a sign that the program may not be well structured, may be too complex, may have something wrong with it.

The program’s object collections work in an “interesting” way. Rather than keeping a linear array of objects and updating it with the attendant cost of compacting it, I chose to keep objects in collections hashed to point to themselves:

  U.asteroids[self] = self

That permits us to remove an object by just setting its slot to nil, which removes it from the collection in hash time rather than liinear array time:

  U.asteroids[self] = nil

That’s neat. You could argue that it’s too clever, and I’d argue that the programmer is supposed to know how Lua works. However…

You could also point out that in order to do that action above, the asteroid had to know the name of the collection in Universe and its format. That’s a pretty strong coupling and if you pointed that out, it would hit me harder. Even if the object hashing trick is a good idea – and I think it is – it’s hard to make a good case for coupling that idea so tightly into every object that uses it.

Similarly, when objects are created, they automatically insert themselves into the proper collection in U. Even if they didn’t know the format, it’s generally bad form for instance creation to take care of collections like that. That sort of thing would be done in a Factory object or with a call-back to the creator or something.

Now our program is too small to be glommed up with all the fancy patterns that we know (or in my case know of). So we might decide to accept that oddity.

But they are oddities, some small, some large. We do well to sniff out code smells like that, and to decide if they are something going off that needs correction, or if it’s OK, asteroids just sell like that.

So we listen to the signs and portents, we sniff the air … and then we decide. Those things are hints and allegations, not commands from above. We decide what to do. Our mission is to decide wisely.

In aid of that, I want to understand something that I don’t understand.

DeadAsteroids

Yesterday removing the use of DeaAsteoids to defer the death of asteroids until later broke the system. I didn’t think it would, and I want to understand why it did. There is a gap in my understanding and while I’ve “learned” that it has to be there, I’ve not learned why.

Understanding why might tell us how to remove it, which would be a simplification. It might tell us that we’d better use the technique elsewhere because we’ve just been lucky so far. It might just make us a micron smarter. It might. We can dream.

So my plan here is to remove that usage and see why it breaks.

Here’s where we do it:

function Asteroid:split()
    self:bang()
    DeadAsteroids[self] = self
    Splat(self.pos)
    if self.scale ~= 4 then
        Asteroid(self.pos, self.scale//2)
        Asteroid(self.pos, self.scale//2)
    end
end

When we want an asteroid destroyed, we tuck it into the DeadAsteroids collection and later that collection will be called, presumably when it is safe to delete things.

Why isn’t it safe now? Why can’t we do this:

function Asteroid:split()
    self:bang()
    U.asteroids[self] = nil
    Splat(self.pos)
    if self.scale ~= 4 then
        Asteroid(self.pos, self.scale//2)
        Asteroid(self.pos, self.scale//2)
    end
end

When we run the program, we find out:

invalid key to 'next'
stack traceback:
	[C]: in function 'next'
	Universe:112: in method 'findCollisions'
	Universe:69: in method 'draw'
	Main:12: in function 'draw'

The function next is the name of the function called in a Lua iterator. It’s an internal thing, and we can’t look at it. The Universe code looks like this:

function Universe:findCollisions()
    local needNewWave = true
    for i,a in pairs(self.asteroids) do -- line 112
        needNewWave = false
        self:checkMissileCollisions(a)
        if self.ship then self:checkShipCollision(a) end
    end
    if needNewWave == true then
        if self.timeOfNextWave == 0 then
            self.timeOfNextWave = self.currentTime + self.timeBetweenWaves
        end
    end
end

Narrowing that down, the problem occurs in this loop:

    for i,a in pairs(self.asteroids) do
        needNewWave = false
        self:checkMissileCollisions(a)
        if self.ship then self:checkShipCollision(a) end
    end

The error happens when an asteroid is destroyed. It happens whether the asteroid is hit by a missile or crashes into the ship. But you’d think, wouldn’t you, that we’d be done with the asteroid that has been removed, and that the next would move on to the next pair in pairs.

Here’s a description of next from pgl.yoyo.org:

Allows a program to traverse all fields of a table. Its first argument is a table and its second argument is an index in this table. next returns the next index of the table and its associated value. When called with nil as its second argument, next returns an initial index and its associated value. When called with the last index, or with nil in an empty table, next returns nil. If the second argument is absent, then it is interpreted as nil. In particular, you can use next(t) to check whether a table is empty.

The order in which the indices are enumerated is not specified, even for numeric indices. (To traverse a table in numeric order, use a numerical for or the ipairs function.)

The behavior of next is undefined if, during the traversal, you assign any value to a non-existent field in the table. You may however modify existing fields. In particular, you may clear existing fields.

That last paragraph makes me think I should be allowed to do what I just did, at least if “field” means table element. It’s pretty clear that I cannot. Without seeing the code for next we may be stuck. I have one more idea to try:

function Universe:findCollisions()
    local needNewWave = true
    for i,a in pairs(clone(self.asteroids)) do -- clone the table
        needNewWave = false
        self:checkMissileCollisions(a)
        if self.ship then self:checkShipCollision(a) end
    end
    if needNewWave == true then
        if self.timeOfNextWave == 0 then
            self.timeOfNextWave = self.currentTime + self.timeBetweenWaves
        end
    end
end

function clone(tab)
    local r = {}
    for k,v in pairs(tab) do
        r[k] = v
    end
    return r
end

Here I “just” copied the table and used the copy. It’s no surprise that it now works just fine. We’re no longer editing the table we’re looping over, so no harm is done.

I started to submit a bug report to the Codea forum. I had expected no joy but thought a courteous question might enlighten me. So I created a bug report that built a simple table and iterated over it while deleting some of its elements.

I was unable to make it fail with any reasonable combination of events. This confuses me and makes me wonder what’s going on.

I have one more idea: what if the search loop used the key, not the value, in the processing?

    for a,xxx in pairs(self.asteroids) do

Still fails same way. That’s good, I guess.

Further experimentation finds this:

function Asteroid:split()
    self:bang()
    Splat(self.pos)
    if self.scale ~= 4 then
        Asteroid(self.pos, self.scale//2)
        Asteroid(self.pos, self.scale//2)
    end
    U.asteroids[self] = nil
end

Doing the deletion last makes the defect not happen. Something in the creation of new ones is breaking it. Creating just one is sufficient. But wait:

Read that last bit from pgl.yoyo.org:

The behavior of next is undefined if, during the traversal, you assign any value to a non-existent field in the table. You may however modify existing fields. In particular, you may clear existing fields.

That basically says that we can’t do what I’m doing, namely adding new things to the table … unless I’m happy with “undefined” behavior. But what’s interesting is that the problem doesn’t happen unless I clear the field. That’s not going to help me in a court of law, because since I have to both delete and add to cause the problem, the results are undefined so take it or leave it.

So. The DeadAsteroids collection “fixes” the problem, by avoiding the delete, but it leaves us still adding, which is “undefined”.

What to do? Here’s my thought:

  • Use the clone function I wrote to isolate the list we’re looping over. That makes any manipulation, adding or deleting, perfectly legit, since the collection is no longer being looped on.
  • Move to “hide” the mysterious implementation of those tables by providing and using new functions on Universe, addAsteroid, deleteAsteroid, and similarly for Missiles. That will reduce coupling a bit and allow us to make any new adjustments inside Universe.
  • Remove the DeadAsteroids collection and its use.
  • Request deletion of the old asteroid last. This is superstition.

Should be “easy”. (He said.)

function Asteroid:init(pos, size)
    self.pos = pos or vec2(math.random(WIDTH), math.random(HEIGHT))
    self.scale = size or 16
    self.shape = Rocks[math.random(1,4)]
    local angle = math.random()*2*math.pi
    self.step = vec2(Vel,0):rotate(angle)
    U:addAsteroid(self)
end

function Universe:addAsteroid(asteroid)
    self.asteroids[asteroid] = asteroid
end

function Universe:deleteAsteroid(asteroid)
    self.asteroids[asteroid] = nil
end

function Universe:findCollisions()
    local needNewWave = true
    for i,a in pairs(clone(self.asteroids)) do
        needNewWave = false
        self:checkMissileCollisions(a)
        if self.ship then self:checkShipCollision(a) end
    end
    if needNewWave == true then
        if self.timeOfNextWave == 0 then
            self.timeOfNextWave = self.currentTime + self.timeBetweenWaves
        end
    end
end

function clone(tab)
    local r = {}
    for k,v in pairs(tab) do
        r[k] = v
    end
    return r
end


function Asteroid:split()
    self:bang()
    Splat(self.pos)
    if self.scale ~= 4 then
        Asteroid(self.pos, self.scale//2)
        Asteroid(self.pos, self.scale//2)
    end
    U:deleteAsteroid(self)
end

With these changes and the deletion of killDeadAsteroids, the game works fine. Now I’ll check all reverences to U.asteroids just to be sure I haven’t missed any. There are none.

Commit: “use clone in findCollisions. use add/deleteAsteroids”.

The use of clone is obscure. I can imagine some future programmer thinking that it need not be there. (I seem to recall thinking something similar only yesterday.) A better name might be needed. I could put in a comment, but I was taught that a comment is the code’s way of asking to be made better, so … yeah, I’d better put in a comment until I have a better idea.

    -- we clone the asteroids collection to allow live editing

Now let’s do the missiles similarly. They are like this:

function Missile:init(ship)
    function die()
        self:die()
    end
    self.pos = ship.pos
    self.step = U.missileVelocity:rotate(ship.radians) + ship.step
    U.missiles[self] = self
    tween(3, self, {}, tween.easing.linear, die)
end

function Missile:die()
    U.missiles[self] = nil
end

I suspect they partake of a similar defect but got away with it because they don’t add anything, they just remove and that works. Be that as it may, we do the same thing for symmetry:

function Missile:init(ship)
    function die()
        self:die()
    end
    self.pos = ship.pos
    self.step = U.missileVelocity:rotate(ship.radians) + ship.step
    U:addMissile(self)
    tween(3, self, {}, tween.easing.linear, die)
end

function Missile:die()
    U:deleteMissile(self)
end

function Universe:addMissile(missile)
    self.missiles[missile] = missile
end

function Universe:deleteMissile(missile)
    self.missiles[missile] = nil
end

All seems well. Commit: “missiles use add/deleteMissile”.

Summing Up

Today was interesting. I had not expected to need to drill that far into the workings of Codea, I’m glad I did, because I (finally) figured out that while the problem was triggered by the deletion of an element while the table is being looped over, it only happens if you’ve also adding items, which is “undefined”.

So, with some confusion, we found the bug. We fixed it with a moderately large hammer, the cloning of the table. The thing is, that pretty much always works, but it is a arguably a bit expensive.

We also isolated all the code that edits Universe’s tables into Universe. That’s a good thing.

If all we cared about was getting the game “done”, we wouldn’t be doing all this design. The point here is to do it in a space that is small enough that we can understand it, yet large enough to make the exercise real.

And we just found an anomaly deep inside Lua. How cool is that?

Here’s the code. See you next time!

Zipped Code