Let’s reach deep into the bag of tricks, and see whether we like what we find there.

Yesterday one of the things that bit me, in addition to general lack of perspicacity and vision, was adding items to a table while the table was being iterated over. In Lua, the behavior is undefined when you do that. The item might show up later in the loop, or it might not. Curiously, deleting an item does work. If it hasn’t already shown up, it won’t.

In our programs, we have the occasional need to add to a collection that’s being iterated. In Dung, we were iterating over the contents of a Tile that contained a chest. While the loop was in mid-iteration, the chest opened, created a new object. The chest added the object correctly–we’ll discuss details below–but it happened that the object also added itself immediately. That resulted in the object being found in the same loop, which meant we interacted with it before we even saw it.

Now I had a perfectly good way of ensuring that this wasn’t a problem, called add deferred, which used an auxiliary collection to store additions until they could be placed into the original collection at a safe moment. The chest correctly used add deferred, but creation did not. In general, creation of objects was done at a time when the Tile wasn’t iterating its contents, so a direct add worked fine.

But when things are created on the fly in the game, they have to be done with the add deferred approach.

The problem was obvious once I saw it, same as the elephant over in the far corner, but at first I didn’t see it. For more than two hours, I didn’t see it. It was about a one-line change. I didn’t see it.

This is not the first time I’ve made this mistake. I know this fact well, but you get to coding, and you call off to random objects, and somewhere down there they add something and you’re not thinking about iterating, you’re thinking about getting into the Tile, and BANG! you have a spooky bug.

I asked myself what I could do to avoid this problem, and I asked some of my friends. I got a few answers, including:

“I do that sometimes, too.”

“I often extract what I need from the list and then iterate over the extract.”

“I might create a special kind of “DeferredTable” that encapsulates the necessary object. A problem is knowing when to update, there might be two loops running.”

None of these seemed to be Quite The Thing, for reasons I’ll put here:

I do that sometimes, too.
Yes, well, no help here, though OP did suggest a special table type in a subsequent sentence. We’ll deal with that below.
I often extract and loop over that …
I’m not sure I understood this idea, but it sounds like copying the table and iterating over the copy. That seems like two passes, unless one has some kind of table copy internal function.
I might create a special kind of table …
This is closest to something we might be able to do, and is the subject of this article. Sorry about how long it took to get here.

Narrator (v.o.): He wasn’t sorry.

Ron: You get out of here!!!

Narrator: Sorry.

Ron: He wasn’t sorry.

Deferred Table

What might a deferred table be, and how might it work?

Tables are an essential data structure in Lua. In fact, they are the only data structure. As such, Lua code tends to come down quickly to manipulation of tables, and it’s optimized to make this easy and efficient.

A table can have, at the same time, array aspects and hash table aspects. If you insert things at consecutive integer keys, Lua will try to store the data as an array, without explicit keys, and arranged for rapid looping. If you store at explicit non-integer keys (or even distant integer keys, Lua will store that information as a hash map of key to value.

We’ll focus mostly on the hash map aspect. We put a key-value pair into the table like this:

tab[key] = value

We remove a key-value pair thus:

tab[key] = nil

(Right. Tables can’t contain nils. Deal with it.)

We loop over a k-v table with:

for k,v in pairs(tab) do
    stuffWith(k,v)
end

And that’s the sort of thing you wind up writing, indeed almost have to wind up writing, at some level of the code.

So I’d have a method in GameRunner:

function GameRunner:storeFoo(aFoo)
    self.fooTab[aFoo] = aFoo
end

(I store regular object hashed by their own identity. No need for a special key on the object. The instance itself is a perfectly good key.)

First solution: only provide deferred addition.

I also might have

function GameRunner:storeDeferredFoo(aFoo)
    self.newFooTab[aFoo] = aFoo
end

function GameRunner:updateFoo()
    for k,v in pairs(self.newFooTab) do
        self.fooTab[k] = v
    end
    self.newFooTab = {}
end

Then someone, somewhere, has to call updateFoo, or the new things never show up. So this leads me to my first “solution”, which would be to develop a habit of always creating the store deferred notion and not even implementing the regular store.

One problem with that “solution” is that most tables don’t need to be deferred, and they’d all then need us to find a place to update them, and that would be error-prone and irritating.

Another issue is that inside the GameRunner (or whatever class holds the table) one would still be writing code that ran against the raw table, and so we could commit a similar error on the inside anyway.

If we did this religiously, which we wouldn’t, because it’s not always needed, we might build up better habits and avoid the problem most of the time. That would be good, but not great.

Second solution: an object.

Could we build a class, DeferredTable that acted like a regular table, but was automatically buffered, and that provided an update method?

The short answer is “probably not”. We could certainly write a class that covered a regular table, and that whose add method stored in a buffer table, and whose update method did the copy from buffer to regular table. That’s easy.

But we can’t use a DeferredTable as a regular table. We can’t say

dtab = DeferredTable()
...
dtab[k] = v

Because dtab is a table, but it isn’t the table that contains our k-v pairs, it’s a table that contains the table that contains our k-v pairs. So we can’t, on the face of it, treat it like a table that just happens to have deferral.

(There may be a way to make this happen. That will be part of our exploration today, if time and brain power permits.)

Worse yet, we can’t iterate over our table by saying

for k,v in pairs(dtab) do
   somethingWith(k,v)
end

Because dtab is a table, and its contents are the regular table, the buffer table, and the methods of the class, like update. We don’t want to loop over that.

And we can’t readily override pairs, because there’s nowhere to stand. We could override it entirely, maybe. Then we’d have to decode whether we were looking at a DeferredTable and do realPairs on its contained table, otherwise do realPairs on the table in hand.

(Again, I’ve recently discovered ways that might possibly allow us to make this happen, and we’ll try to explore those ideas.)

Third solution: metatable

Tables in Lua all have associated “metatables” that control what happens when table accesses come up with unexpected results (or when you want to do something special with an expected access).

We might be able to use metatables to get the behavior we want. In fact, I know we can get most of us, as we’ll see in the experiment we’re about to start, because I played with the idea last night. Let’s begin by describing metamethods, and two in particular.

Metatables and Metamethods

As I mentioned, each table has, at least in principle, a metatable. That table can contain metamethods, which are pointers to functions, with defined calling sequences depending on the metamethod in question. When certain events occur, a corresponding metamethod is looked up in the metatable, and if it’s found, that function is called.

One such metamethod is __newindex. If you were to write

tab["a"] = "alphaboo"

And if there was not already an entry for “a” in tab, Lua would look in the metatable to see if there is a function stored under key __newindex. If there is, it’ll be called, passed tab, “a”, and “alphaboo”. If for our __newindex, we had something like:

function ourNewIndex(t,k,v)
    bufferTab[k] = v
end

Then, it seems, we could save our new items away in bufferTab, awaiting a call to update. How could we deal with that?

Well, Lua also has __index, like new index except for an access to a key that doesn’t exist. If we had a function kind of like this:

function ourIndex(t,k)
    if k == "update" then
        -- copy bufferTab into t
        -- empty buffertab
    end
end

Then it seems we could “pretend” that our table had a method update and if you typed

myTab:update()

We’d update it.

It appears that we could make all this work, if only we had a place to store the buffer. We can’t store it in our table: if we did, then looping over the table would find it. However, if we use a closure, we can store the buffer in an upvalue, and if that phrase just zoomed right over your head, I’m not surprised: this is quite deep in the trick bag for most of us, including me.

But as I read what I’ve just written, it seems quite likely that we can make a kind of table that does just what we want, and that behaves otherwise just like a table.

Let’s first figure out how to do it.

Yeah, your scientists were so preoccupied with whether or not they could, they didn’t stop to think whether they should. – Dr Ian Malcolm

This is a job for TDD …

If there ever was one. I’ll start a new project with CodeaUnit in it.

Seems to me that the first test can be pretty simple:

        _:test("DT has provided contents", function()
            local dt = DeferredTable{a=13, b=15}
            local a = dt.a
            local b = dt.b
            _:expect(a).is(13)
            _:expect(b).is(15)
        end)

Yes, that syntax at the top is legit, you can call a function with a single table following it and it acts as if it were in parens.

This test will fail looking for DeferredTable.

2: DT has provided contents -- Tests:19: attempt to call a 
nil value (global 'DeferredTable')

So far so good. I’m going to implement DeferredTable right here in the test tab, since this is a spike, and in the spirit of Keith Braithwaite’s “TDD as if you really meant it”.

function DeferredTable(contents)
    return {}
end

This will fail for having no contents.

2: DT has provided contents  -- Actual: nil, Expected: 13
2: DT has provided contents  -- Actual: nil, Expected: 15

Now, if contents is provided, it’s a table (or better be) and if it isn’t, we want an empty table, so:

function DeferredTable(contents)
    return contents or {}
end

I expect this to work, though it’s just a regular table at this point. Our test does pass.

Let’s compact that test a bit, with inline:

        _:test("DT has provided contents", function()
            local dt = DeferredTable{a=13, b=15}
            _:expect(dt.a).is(13)
            _:expect(dt.b).is(15)
        end)

Still runs of course.

Extend the test a bit. Maybe I should write a new one but I have a story in my head:

        _:test("DT has provided contents", function()
            local dt = DeferredTable{a=13, b=15}
            _:expect(dt.a).is(13)
            _:expect(dt.b).is(15)
            dt.c = 17
            _:expect(dt.c).is(nil)
        end)

We want c not to appear in our deferred table until we do update. This test will fail saying expected nil got 17.

2: DT has provided contents  -- Actual: 17, Expected: nil

As expected. Now we have to get to work. I’ll go in small steps, and try to explain them as I go. Here’s my first try. I’m optimistic about it:

function DeferredTable(contents)
    local actual = contents or {}
    local buffer = {}
    local mt = getmetatable(actual)
    mt.__newindex = function(t,k,v)
        buffer[k] = v
    end
    return actual
end

Our function DeferredTable defines two local variables, actual and buffer and puts any provided contents into actual and returns it as the result of the call.

Then we fetch the metatable of actual and give it a __newindex function that receives the original table, which we ignore, and the k and v trying to be stored. Our new index function stores the k-v in buffer.

I expect this to make the test run. We shouldn’t find dt.c any more. It should be in buffer. (What’s buffer? We’ll come to that in a moment.) Let’s see if the test runs. But it doesn’t:

2: DT has provided contents -- Tests:35: attempt to index a nil value (local 'mt')

Hm, how come mt is nil?

Note Bene: This is exactly the sort of thing that happens when you’re working deep in your bag of tricks. Things are a bit mysterious, tricky, and sensitive to misunderstandings, while your brain is full of misunderstandings and half-remembered notions. This is a sign that what we’re doing may be too fancy to be allowed. Naturally, we’ll press on.

I think there is no metatable at the beginning of a table’s existence. We’ll create one instead.

function DeferredTable(contents)
    local actual = contents or {}
    local buffer = {}
    local mt = {}
    mt.__newindex = function(t,k,v)
        buffer[k] = v
    end
    setmetatable(actual, mt)
    return actual
end

Same thing, except we create the table ourselves, then set it as actual’s metatable. Test runs, the value isn’t there.

Now we want to extend the test like this:

        _:test("DT has provided contents", function()
            local dt = DeferredTable{a=13, b=15}
            _:expect(dt.a).is(13)
            _:expect(dt.b).is(15)
            dt.c = 17
            _:expect(dt.c).is(nil)
            dt:update()
            _:expect(dt.c).is(17)
        end)

We want to be able to call update and after we do, we want our table to contain anything that was in the buffer. This will fail because there is no update in the table:

2: DT has provided contents -- Tests:24: attempt to call a nil value (method 'update')

Furthermore, there shouldn’t ever be an update in the table, unless someone stores that key in there, and they had bloody well not do that. (An interesting caveat in the use of deferred tables: no key can be “update”.)

Well, we know that the __index metamethod is used when a key doesn’t exist on access (rather than storing), so …

OK, you’d better sit down. Here’s what I did:

function DeferredTable(contents)
    local actual = contents or {}
    local buffer = {}
    local mt = {}
    mt.__newindex = function(t,k,v)
        buffer[k] = v
    end
    mt.__index = function(t,k)
        if k ~= "update" then
            return nil
        else
            return function()
                for kk,vv in pairs(buffer) do
                    rawset(actual, kk, vv)
                end
                buffer = {}
            end
        end
    end
    setmetatable(actual, mt)
    return actual
end

Yeah. Let’s just look at the __index bit.

    mt.__index = function(t,k)
        if k ~= "update" then
            return nil
        else
            return function()
                for kk,vv in pairs(buffer) do
                    rawset(actual, kk, vv)
                end
                buffer = {}
            end
        end
    end

Remember, __index is called when an index doesn’t exist, and its job is to return the right thing. If the thing being asked for isn’t “update”, we want to return nil, the standard return for a missing key.

If the user has asked for “update”, however, we need to return, not a result, but a function that does the update, so that the () there in the test can call it. That function is:

            function()
                for kk,vv in pairs(buffer) do
                    rawset(actual, kk, vv)
                end
                buffer = {}
            end

The function copies buffer to actual, and resets buffer to empty. Why does it use rawset? Because those indexes don’t exist, and if we were to just say

    actual[kk] = vv

We would trigger the __newindex behavior, and that would be bad.

Now we’re left with one weird question: Where the hell is buffer? We only have ahold of the table actual, and it doesn’t have a copy of buffer.

I have two questions, actually. First, is there only one copy of buffer? If we build two of these tables, will they each have their own buffers? We’d better write a test for that.

        _:test("Two DTs don't interact", function()
            local at = DeferredTable()
            local bt = DeferredTable()
            at.a = "a"
            bt.a = "b"
            at:update()
            bt:update()
            _:expect(at.a).is("a")
            _:expect(bt.a).is("b")
        end)

Here we store two different things at key “a”, namely “a” and “b”. Then we check to be sure each updated table has the right value. I expect this to work, but won’t be too surprised if it doesn’t, because of the two weird questions: where’s buffer and are we sure it isn’t shared? Let’s run the test.

The test runs correctly. So the answer is that, somehow, buffer is inside the particular invocation of the function DeferredTable. Let me try to explain, so that I can understand it. You? You’re on your own.

When the function DeferredTable is invoked, it creates a local environment, as usual, and that has the three variables actual, buffer, and mt in it. Functions defined inside that function have access to those values, lexically. But what happens when that function returns? Now what do those functions refer to?

The answer is, those functions refer to “upvalues”, which might be better called “external local variables”, according to the creator of Lua, but history prevails. And the answer is: Lua handles this correctly. And the answer is: not entirely satisfying to me, because I don’t quite see how it works, and I don’t like magic.

We might worry about garbage collection. Maybe these things are tagging around uncollected, and after a garbage collection, they might disappear, and bad things might happen. If bad things did happen, we’d be in big trouble. But that can’t happen–I’m pretty sure–because this code is compiled to have access to its buffer, and the code is stored in a metatable that is held by our deferred table, and until we let go of that, the metatable is safe, so the functions are safe, so the variables are safe.

It’s all good. We could do a garbage collection as part of out test, of course. Shall we?

        _:test("Two DTs don't interact", function()
            local at = DeferredTable()
            local bt = DeferredTable()
            at.a = "a"
            bt.a = "b"
            collectgarbage()
            at:update()
            bt:update()
            _:expect(at.a).is("a")
            _:expect(bt.a).is("b")
        end)

Test still runs. Therefore:

We have implemented a deferred table that saves added values until we call update, which is pretty much just what we were looking for.

But wow. Is this too deep in the bag of tricks to be allowed to live? Back when Extreme Programming was new, we used to say that if anyone called you over to their desk to look at some cool code, you should delete it. This code may deserve just that treatment.

And yet, it works, and I think it’s actually quite solid.

I am tempted, sore tempted, to try it in the Dung program, because it’s neat, it almost probably maybe works, and I’ll learn something if it doesn’t bear weight after all. No ghosts will be harmed if it fails. I might be embarrassed by having to report a failure, but get serious, I’m well over that concern.

Let’s stop here and start a new article about our other options.

See you then!