I think we have only one hard problem remaining. Let’s see what we can do about that. MORAL: Don’t be like Ron’s brother’s brother.

The arm broke off my glasses this morning, so I am working with even less vision than usual. If I make any mistakes at all, we’ll blame it on that, shall we?

We’re far from done, but it seems to me that we have only a very few real concerns left, followed, as usual, by a lot of work doing all the things. Let’s make a list, in no particular order:

  • Details of responses;
  • Handling multiple connections;
  • Specifying the world via control files;
  • Pits and Mines;
  • Firing at other robots.

I may refuse that last one, on the grounds that firing guns at innocent robots isn’t part of the game I want to write. Maybe it’s a game of hacky-sack tag where you toss the sack at the opponent to tag them. Maybe there’s a form of trade at a distance. Message passing, perhaps. I don’t know.

Specifying the world via control files will be tedious, but I see no big difficulty there. Pits and mines have behavior, but it’s pretty localized. And mines are even nastier than guns.

The details of responses are important, but since we have requests and responses in place, there’s no mystery there. I do think we would like to wrap the dictionaries better in objects and provide better methods for building them. But again, no mystery.
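Something along these lines, perhaps, just to show the flavor. (Response and its methods here are names I’m making up on the spot, not anything we’ve built.)

-- a rough sketch of wrapping a response dictionary; all names invented
Response = class()

function Response:init()
    self._dict = {}
end

-- add one key/value pair, returning self so calls can chain
function Response:add(key, value)
    self._dict[key] = value
    return self
end

-- encode the accumulated dictionary for sending down the socket
function Response:asJson()
    return json.encode(self._dict) -- or whatever encoder we wind up using
end

The chaining is the “better methods for building them” part; the rest is just wrapping.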

We’re left with multiple connections. We’ve cracked sending JSON across a socket, so that part of the problem is in hand, but we’ve done nothing about handling more than one attached robot at a time.

So I think we need to crack that problem. But first I want to talk about the general strategy I’m following here.

End to End

We’ve spoken before about having a walking skeleton, an end-to-end, or nearly end-to-end, minimal version of the program. The elegant GeePaw Hill chose socket connections as his walking skeleton, if I’m not mistaken. I chose a visible game, with a robot symbol that could move and look around.

You could certainly argue that my implementation wasn’t end to end, and I wouldn’t argue too much. It did run from robot to world and back, which is pretty much end to end, but it didn’t include a critical component in the middle, the sockets.

So I did a few experiments with sockets, to learn how they work in Codea Lua, to increase my certainty that we could build this thing with the technology we chose. And I am confident, and I can say with certainty that the whole team here is confident.

I’ve connected two sockets to a server and served both of them with trivial responses, but I’ve not built anything that can really deal with multiple connections. So to be sure that we can go from end to end, I think we need to cover that issue.

The way I look at this is that among all the many problems we have to solve in order to build our product, there are some that we are sure we can solve, and some about which we are uncertain. Some of the uncertain ones are critical: if we can’t solve them, we’re doomed.

So, as early in the process as is reasonable, we at least produce quick solutions, “spikes”, to ensure that we can crack the problems.

I’m not really worried about handling multiple connections, but we haven’t done it, and it’s an important capability, so, in the spirit of having everything we need ready to hand, I’m going to work on multiple connections for the next day or two. Whatever it takes.

Multiple Connection Review

Let’s see if I even remember how this stuff works.

On the server side, we have a primary server socket, to which clients connect. That socket has a method accept which, when there is a request to connect, returns a client object representing that connection.

Client objects can be sent receive, which will return a line if one is available, and otherwise nil plus an error value, signifying either that there’s nothing available (a timeout) or that something has gone wrong. We can send a reply back to a client using the send method.

This would be sufficient to do the job: we could check for a new connection, then loop over all the connections we have, and if any of them give us a line, we’d then and there decode it, call the robot, and return the response. We have no reason to do anything more clever than call the robot and wait for a response, because we have no threads in Codea, and we only have one processor that we’re aware of.
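In rough LuaSocket-style terms, that naive version might look about like this. It’s only a sketch: handleLine stands in for the decode-call-the-robot-and-respond part, and the port number means nothing.

-- a sketch of the naive approach: poll for connections, then poll every client
local socket = require("socket")   -- Codea bundles LuaSocket, if memory serves

local server = socket.bind("*", 9000)
server:settimeout(0)               -- accept() returns nil,"timeout" if nobody's knocking

local clients = {}

while true do
    local client = server:accept()
    if client then
        client:settimeout(0)       -- receive() returns nil,"timeout" if no line is waiting
        table.insert(clients, client)
    end
    for _,c in ipairs(clients) do
        local line, err = c:receive()
        if line then
            c:send(handleLine(line).."\n")  -- decode, call the robot, return the response
        end
    end
end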

If we were to do that, however, the server would sit in a tight loop, spinning, waiting for input. The iPad would run down its battery rapidly and probably get too hot to hold in our lap. So we’d like to provide as much idle time as possible.

There’s another angle on the idle time question. Presumably the screen of our iPad server needs to display some kind of interesting information, like the number of robots in the game and their scores. We might even want the server to display the whole World. People do weird things.

In Codea Lua terms, this means that our server loop can’t spin, but it also cannot hang. And we can’t create a thread and allow Codea to interrupt us … because we don’t have first-class threads, although we do have the ability to write coroutines. I don’t see how coroutines can help us here.

Therefore, I think we’re going to have to do some kind of polling solution, serving just one socket at a time. I think I’m ready to write a story about what we’re going to do.

Multiple Connection Story

Set up a server function, to be called at regular intervals by the Codea runtime. On each call, check for new connections, and add any connections to a client table. Scan the client table and if there are requests, serve one or more (a settable parameter).

I think we’ll build this first in a new Codea project, named Multi. We’ll TDD as much of this as we can, and take other actions as needed to be sure that it works. In the process, we’ll not use real sockets, at least not at the beginning.

I’ll want a FakeServer and a FakeClient.

        _:test("FakeServer", function()
            local server = FakeServer()
            result,err = server:accept()
            _:expect(result).is(nil)
            _:expect(error).is("timeout")
            server:ready()
            client,err = server:accept()
            _:expect(client:is_a()).is(FakeClient)
        end)

This is a little story about the server. It starts with no connections. When told that it is ready, it will then return a (fake) client. I should extend the test right now, while I’m thinking of it: after it returns the client it goes back to being unready:

        _:test("FakeServer", function()
            local server = FakeServer()
            result,err = server:accept()
            _:expect(result).is(nil)
            _:expect(error).is("timeout")
            server:ready()
            client,err = server:accept()
            _:expect(client:is_a()).is(FakeClient)
            result,err = server:accept()
            _:expect(result).is(nil)
            _:expect(error).is("timeout")
        end)

The tests will drive out the implementation:

1: FakeServer -- TestFakes:15: 
attempt to call a nil value (global 'FakeServer')

FakeServer = class()

1: FakeServer -- TestFakes:18: 
attempt to call a nil value (method 'accept')

function FakeServer:init()
    self._ready = false
end

function FakeServer:accept()
    if not self._ready then
        return nil, "timeout"
    end
end

I may have gotten ahead of myself, because:

1: FakeServer  -- 
Actual: function: 0x10530f5b0, 
Expected: timeout
1: FakeServer -- TestFakes:31: 
attempt to call a nil value (method 'ready')

Looking at the test tells me that you can’t return a result into err and then expect to find it in error. (That function value in the output is Lua’s built-in error function, which is what the global error still held.) Slow down, Ron.

        _:test("FakeServer", function()
            local server = FakeServer()
            result,error = server:accept()
            _:expect(result, "first result nil").is(nil)
            _:expect(error, "first err timeout").is("timeout")
            server:ready()
            client,error = server:accept()
            _:expect(client:is_a()).is(FakeClient)
            result,error = server:accept()
            _:expect(result).is(nil)
            _:expect(error).is("timeout")
        end)

Test:

1: FakeServer -- TestFakes:31: attempt to call a nil value (method 'ready')

OK …

function FakeServer:ready()
    self._ready = true
end

Test:

1: FakeServer -- TestFakes:37: attempt to index a nil value (global 'client')

That tells me we didn’t return a FakeClient, nor would we, given this:

function FakeServer:accept()
    if not self._ready then
        return nil, "timeout"
    end
end

Permit me to implement this in steps larger than the smallest possible. We’ll see if I regret it.

function FakeServer:accept()
    if not self._ready then
        return nil, "timeout"
    end
    self._ready = false
    return FakeClient(), nil
end

FakeClient = class()

Test:

1: FakeServer  -- 
Actual: false, 
Expected: table: 0x280013ac0

This is wrong:

            _:expect(client:is_a()).is(FakeClient)

Should say:

            _:expect(client:is_a(FakeClient)).is(true)

See what going too fast does? Little mistakes that actually slow me down. Test. Test is green.

We could commit, but this is just a spike … or is it? Let’s set up Working Copy so we can Git it. Commit: Initial Commit. FakeServer.

Now we need a similar feature in FakeClient … In fact it’s so similar I’m tempted to use FakeServer for the whole thing. It’s really the same except we send receive to the clients instead of accept.

I think the right thing is to use a separate class. Using the same one could lead to confusion and it’s a matter of moments to do this:

FakeClient = class()

function FakeClient:init()
    self._ready = false
end

function FakeClient:receive()
    if not self._ready then
        return nil, "timeout"
    end
    self._ready = false
    return "message\n", nil
end

function FakeClient:ready()
    self._ready = true
end

I didn’t TDD that. I copied and pasted. But I do expect that our next tests will exercise the thing well enough.
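If we did want a direct test anyway, a minimal one, mirroring the FakeServer test, might look about like this:

        _:test("FakeClient", function()
            local client = FakeClient()
            local line,err = client:receive()
            _:expect(line, "first receive nil").is(nil)
            _:expect(err, "first err timeout").is("timeout")
            client:ready()
            line,err = client:receive()
            _:expect(line).is("message\n")
            line,err = client:receive()
            _:expect(line, "back to not ready").is(nil)
            _:expect(err).is("timeout")
        end)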

I’m not entirely sure where I’m going, but I think I’m going to implement an object named Server that does the work for us. Let’s ease into it:

        _:test("Server accumulates clients", function()
            local count = 0
            local proc = function(client)
                count = count + 1
            end
            local socket = FakeServer()
            local server = Server(socket)
            server:process()
            _:expect(#server:clients()).is(0)
            socket:ready()
            server:process()
            _:expect(#server:clients()).is(1)
            server:processClients(proc)
            _:expect(count).is(0)
            server:ready(1)
            server:processClients(proc)
            _:expect(count).is(1)
        end)

I guess I’d have to agree that I didn’t ease into it. This is a full story about the new Server object. We create a FakeServer (which I now want to rename FakeSocket), and open a Server on it. We tell it to process and it doesn’t add anything to its clients. We tell the socket to be ready and process again, and Server adds a client to its clients. None of them (the one) is ready, so when we processClients, our client-processing function proc isn’t called, leaving count at zero. We ready a client and go again, processing one.

I think we can just crunch through this long test, driving out the Server.

2: Server accumulates clients -- TestFakes:70: attempt to call a nil value (global 'Server')

Server = class()

function Server:init(socket)
    self._socket = socket
    self._clients = {}
end

Test.

2: Server accumulates clients -- TestFakes:78: attempt to call a nil value (method 'process')

Code:

function Server:process()
    local client, err = self._socket:accept()
    if client then table.insert(self._clients, client) end
end

Test:

2: Server accumulates clients -- TestFakes:83: attempt to call a nil value (method 'clients')

Code:

function Server:clients()
    return self._clients
end

Test:

2: Server accumulates clients -- TestFakes:91: attempt to call a nil value (method 'processClients')

Implement in part:

function Server:processClients()
    
end

Test:

2: Server accumulates clients -- TestFakes:97: attempt to call a nil value (method 'ready')

Implement:

function Server:ready(clientNumber)
    if clientNumber <= #self._clients then
        self._clients[clientNumber]:ready()
    end
end

Test, expecting a failure with expected 1, actual 0:

2: Server accumulates clients  -- 
Actual: 0, 
Expected: 1

Complete processClients:

function Server:processClients(callBack)
    for i,c in ipairs(self._clients) do
        callBack(client)
    end
end

Expect green. Don’t get it:

2: Server accumulates clients  -- 
Actual: 1, 
Expected: 0
2: Server accumulates clients  -- 
Actual: 2, 
Expected: 1

Better review the test. But note: here again I’ve taken a big bite, all sure of myself, and now, instead of just typing in a line and knowing whether it’s right or wrong, I don’t know what’s wrong, and I’m going to take precious time finding my problem.

Take my advice, I’m not using it.

The test is wrong:

        _:test("Server accumulates clients", function()
            local count = 0
            local proc = function(client)
                count = count + 1
            end
            local socket = FakeServer()
            local server = Server(socket)
            server:process()
            _:expect(#server:clients()).is(0)
            socket:ready()
            server:process()
            _:expect(#server:clients()).is(1)
            server:processClients(proc)
            _:expect(count).is(0)
            server:ready(1)
            server:processClients(proc)
            _:expect(count).is(1)
        end)

Clearly, if there’s one client, the first expectation on count should be 1 not zero. That line’s in the wrong place. Should be:

        _:test("Server accumulates clients", function()
            local count = 0
            local proc = function(client)
                count = count + 1
            end
            local socket = FakeServer()
            local server = Server(socket)
            server:process()
            _:expect(#server:clients()).is(0)
            server:processClients(proc)
            socket:ready()
            server:process()
            _:expect(#server:clients()).is(1)
            _:expect(count).is(0)
            server:ready(1)
            server:processClients(proc)
            _:expect(count).is(1)
        end)

However, there’s a bug in my code. Watch this: if I add another check on processClients at the end:

        _:test("Server accumulates clients", function()
            local count = 0
            local proc = function(client)
                count = count + 1
            end
            local socket = FakeServer()
            local server = Server(socket)
            server:process()
            _:expect(#server:clients()).is(0)
            server:processClients(proc)
            socket:ready()
            server:process()
            _:expect(#server:clients()).is(1)
            _:expect(count).is(0)
            server:ready(1)
            server:processClients(proc)
            _:expect(count).is(1)
            server:processClients(proc)
            _:expect(count).is(1) -- still one, client should be already processed
        end)

I expect that to fail.

2: Server accumulates clients  -- 
Actual: 2, 
Expected: 1

Well, wait. Is that really wrong? We need to think a bit more deeply about what we want here.

Think, Before It’s Too Late!

There is the socket function select, which, given a collection of sockets, will return the ones that are ready to be read. I was half thinking that we’d use that function to cull the sockets we process. But there’s no real reason to do that, it seems to me.
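For the record, I believe its use goes roughly like this, though we’ll not be relying on it:

-- roughly, per the LuaSocket docs: select culls the sockets ready to be read
local readable, writable, err = socket.select(clients, nil, 0)
for _,sock in ipairs(readable) do
    -- these sockets have something waiting for us
end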

If in our callback procedure we do non-blocking receives on each socket and process the one (or ones) that give us a value, that should do the job for us.

But let’s extend our proc to count only clients who respond with a message:

            local proc = function(client)
                local r,e = client:receive()
                if r then
                    count = count + 1
                end
            end

I expect green. I am wrong again; the test is failing (again), at least.

2: Server accumulates clients  -- 
Actual: 0, 
Expected: 1
2: Server accumulates clients  -- 
Actual: 0, 
Expected: 1

I’m not counting at all in proc. Means the clients aren’t returning a message?

I’m at least five minutes into debugging. It’s 1230. Let’s see how long it takes me to figure this out. Smaller steps would surely have been better.

It’s 1245 and I still do not understand what’s happening. I ready the client correctly, as far as I can see, and then when I call receive, I don’t get a message back, because it thinks it’s not ready any more.

Finally, my eyes see this:

function Server:processClients(callBack)
    for i,c in ipairs(self._clients) do
        callBack(client)
    end
end

Might help to pass the loop variable to the callback!

function Server:processClients(callBack)
    for i,client in ipairs(self._clients) do
        callBack(client)
    end
end

Test expecting green. Green. Remove any random prints. Test again. Green. Commit: Server loops over all clients using callback. Time is 1249. At least 20 minutes wasted, and quite probably more like 25.

As it is nearly 1300 hours, and we are green, let’s sum up.

Summary

Darn. Don’t be like Ron’s brother’s brother. I was sure I was hot (and truth be told, I was about as hot as I ever am), but I wasn’t hot enough to go writing ten or twenty lines of code without a single error, and definitely not hot enough to spot the error in those lines.

You’re surely better than I am, but there’s no reason to spin out long chunks of code (or even tests) when small steps are just as fast and don’t produce long 20-minute head-scratching sessions. Small steps feel slow, but they’re not as slow as being confused.

Now sure, you’d never have made that silly mistake of calling the iteration variable c on one line and then in the next line referring to it as client. Only I could be that useless. But, seriously, it seems like every time I start spinning out lots of lines, I get in trouble.

You’d think I’d learn my lesson. As for you, well, do as you see fit, but I’d advise you to try smaller steps and see whether you can tell the difference between when you squeeze them down to tiny, and when you don’t.

But the result?

Yes, the result is good. We’re not done, but we have the shell of done, and I think I’ve got more clearly in my mind what will suffice for our purposes.

I think what we’ll do is put our calls to process and processClients on a timer, and call them repeatedly. Usually nothing will happen. Every now and then, we’ll get a new connection or a message from a client, and we’ll deal with it and then skip out. And I think we’ll just handle all the clients. There’s no advantage to processing only one or a couple; it’ll just queue up people waiting, and our screen update is less important than the message handling.

I think I’ll package up the calls to process and processClients into a single call; that is, before running processClients, see whether we can accept any new connections. And we should perhaps prepare for there to be more than one new client ready, although within a few fractions of a second we’d get them all anyway.
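I imagine it coming out roughly like this, with Codea’s draw serving as our timer. This is only a sketch: service and handleRequest are names I’m just trying on.

-- a sketch of packaging process and processClients into one call
function Server:service(callBack)
    -- accept as many new connections as are waiting right now
    local client, err = self._socket:accept()
    while client do
        table.insert(self._clients, client)
        client, err = self._socket:accept()
    end
    -- then give every current client a chance to be served
    self:processClients(callBack)
end

-- Codea calls draw() every frame, which is timer enough for our purposes
function draw()
    server:service(handleRequest)  -- handleRequest would decode, call the robot, reply
    -- ... then draw scores, the World, whatever the screen should show
end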

I’d say that my understanding of the problem solution is about 0.7 baked, and that another day should get us pretty close to fully baked.

But darn it, I wish I’d learn to go in smaller steps. The cat is starving because it’s past her lunch time.

See you next time!