Robot 16

OK, sockets. At least for a little while. And a conversation with myself. No, I’m OK, really. And thanks to Dave1707.

Somehow, I have guilted myself into spiking a bit with sockets and inter-machine communication. Here’s why: at this point in a real product application, with a week of programming done, I think that at least some team members would be worried about the communication between machines. It is pretty clear now that we can do most anything that could be asked of the robot and world objects, since we have seen that we can keep track of the robot’s position in the world, and scan in everything around it.

There isn’t much more capability needed. The robot can turn, which seems easy enough, and we have at least one way working. It can lay mines. It can shoot its weapon, and it can be destroyed.

There are also some separate world-building requirements. If I recall, they are something like a text file containing a description of what to build. In any case, we have the fundamental code to place anything in the world that we want.

There is lots of work to do, but none of it seems difficult.

But aside from a decent understanding of how we might convert the commands and returns in and out of JSON, we’re left with a big gap in between World and Robot. In the real game, they are only connected via a socket connection that passes the JSON back and forth.

We need to learn how to do that. We need …

A Spike

In his discussion of his communications spike for the game, GeePaw Hill makes the point that a spike is code that we intend to throw away, whose purpose is to learn how to do something.

I agree with that entirely, and that’s what I’d recommend you do if you were to ask me. And, I freely grant that I often think of it a bit differently, and sometimes act differently.

I often think of a spike as proving that we can do something. The important thing is the learning, but I do think of it, often, as a demonstration that we can stop panicking over something, because we’ve cracked it.

Even worse, sometimes—I’m ashamed to say this—sometimes I don’t throw the code away. Because I generally start with very spare, even awkward code, I’ve been known to keep some of the code from my spikes, and use it directly as the basis for going forward.

This is probably a bad idea. The reason is that the code will surely not be robust, will not be well tested, perhaps with no tests at all, and is likely to be as ugly as that lady’s monkey. Not really a good place to start.

But I do it sometimes, and this is a place of truth, so there you have it. I understand why you might not want to shake my hand now, should we ever meet.

Anyway, we’re going to do a spike, and I have some code that my friend Dave1707 wrote, from which we’re going to start.

Hold on while I go get the iPad from the other room.

Sockets

Dave has an example that I expect to run, and it looks like this:

viewer.mode=STANDARD

function setup() 
    --print(osc.start())
    b,e,cnt=0,0,0
    sh=remoteDisplay()
end

function draw()
    background(0)
    
    --[[
    cnt=cnt+1
    if cnt%60==0 then
        sh:sendMessage("draw count  "..cnt)
    end
    --]]
    
    sh:draw()
end

function touched(t)
    --[[
    if t.state==BEGAN then
        b=b+1
        sh:sendMessage("BEGAN count "..b)
    end
    if t.state==CHANGED then
        sh:sendMessage("x= "..t.x.."   y= "..t.y)
    end
    if t.state==ENDED then
        e=e+1
        sh:sendMessage("ENDED count "..e)
    end
    --]]
end

remoteDisplay=class()

function remoteDisplay:init()
    self.socket=require("socket")
    self.theirIp="0.0.0.0"  
    self:getMyIp()
    self:setMySocket()
    self:setTheirSocket()
    self:getTheirIp()
end

function remoteDisplay:draw()
    self:receiveTheirMessage()
    fill(255)  
    text("My IP "..self.myIp,WIDTH/2-100,HEIGHT-25)
    text("Their IP "..self.theirIp,WIDTH/2+100,HEIGHT-25)
end

function remoteDisplay:setMySocket()    
    self.server=self.socket.udp()
    self.server:setsockname(self.myIp,5544)
    self.server:settimeout(0)
end

function remoteDisplay:setTheirSocket()
    self.client=self.socket.udp()
    self.client:settimeout(0)
end

function remoteDisplay:getMyIp()
    self.server=self.socket.udp()
    self.server:setpeername("1.1.1.1",80)
    self.myIp,self.myPort=self.server:getsockname()
    self.ip1,self.ip2=string.match(self.myIp,"(%d+.%d+.%d+.)(%d+)")
end

function remoteDisplay:getTheirIp()
    self.client=self.socket.udp()
    self.client:settimeout(0)
    -- send a message to everyone on this network except to myself
    for z=1,255 do
        if z~=tonumber(self.ip2) then
            self.client:setpeername(self.ip1..z,5544)
            self.client:send("get their ip")
        end
    end
end

function remoteDisplay:receiveTheirMessage()
    local data,msg,port=self.server:receivefrom()
    if data~=nil then
        self.theirIp=msg       
        if data=="get their ip" then
            self.client=self.socket.udp()
            self.client:settimeout(0)
            self.client:setpeername(msg,5544)
            self.client:send("got their ip"..self.myIp)            
        elseif string.sub(data,1,12)=="got their ip" then
            self.theirIp=msg
        else
            print(data)
        end
    end
end

function remoteDisplay:sendMessage(tempStr)
    self.client = self.socket.udp()
    self.client:setpeername(self.theirIp,5544)
    self.client:settimeout(0)
    self.client:send(tempStr)
end

I’ve sort of read this, in the dead of night, and I sort of get what it does. And the code is pretty clear as well, should you read it. What we do with this is we uncomment the commented bits on one machine, run it on two machines, and our touches on the uncommented one should display numbers on the other one.

We’ll try it.

Mirabile dictu, it works! Both iPads quickly display their own IP and the other’s at the top of the screen. The second iPad starts displaying draw counts in its console, and if you touch the screen on the first, it displays the x,y coordinates of the touch.

screen shot showing two ip addresses at top

screen shot showing x y coordinates in console

That information was sent by the first iPad, received by the second, and printed by this method:

function remoteDisplay:receiveTheirMessage()
    local data,msg,port=self.server:receivefrom()
    if data~=nil then
        self.theirIp=msg       
        if data=="get their ip" then
            self.client=self.socket.udp()
            self.client:settimeout(0)
            self.client:setpeername(msg,5544)
            self.client:send("got their ip"..self.myIp)            
        elseif string.sub(data,1,12)=="got their ip" then
            self.theirIp=msg
        else
            print(data)
        end
    end
end

Now to figure out how it works. My biggest question is this: clearly the program on the first iPad sends a message every 60 draw cycles:

function draw()
    background(0)
    
    ---[[
    cnt=cnt+1
    if cnt%60==0 then
        sh:sendMessage("draw count  "..cnt)
    end
    --]]
    
    sh:draw()
end

And during touch events:

function touched(t)
    ---[[
    if t.state==BEGAN then
        b=b+1
        sh:sendMessage("BEGAN count "..b)
    end
    if t.state==CHANGED then
        sh:sendMessage("x= "..t.x.."   y= "..t.y)
    end
    if t.state==ENDED then
        e=e+1
        sh:sendMessage("ENDED count "..e)
    end
    --]]
end

What I’m wondering has to do with receiving those messages. Is the program in some kind of loop doing the receive? Or is the receive somehow called automatically?

We’ll read the code together, carefully, now that we know that it works and what it does.

Reading Dave’s Program.

The program begins by just initializing a few variables and creating an instance of remoteDisplay:

function setup() 
    --print(osc.start())
    b,e,cnt=0,0,0
    sh=remoteDisplay()
end

Dave mostly programs with the computer in his lap, and doesn’t use any more capitalized characters than he must. Different programmer, different programming style. We don’t mind. remoteDisplay is a class.

What does that class do?

remoteDisplay=class()

function remoteDisplay:init()
    self.socket=require("socket")
    self.theirIp="0.0.0.0"  
    self:getMyIp()
    self:setMySocket()
    self:setTheirSocket()
    self:getTheirIp()
end

OK, this tells a story, which makes some sense. This line, I gather, loads the socket library and saves it in our member variable self.socket. Presumably we’ll be able to send messages to it, when our object is sent messages:

    self.socket=require("socket")

Let’s just move on through the methods in the order they’re called.

function remoteDisplay:getMyIp()
    self.server=self.socket.udp()
    self.server:setpeername("1.1.1.1",80)
    self.myIp,self.myPort=self.server:getsockname()
    self.ip1,self.ip2=string.match(self.myIp,"(%d+.%d+.%d+.)(%d+)")
end

OK, there’s good news and bad news here. The bad news first: he’s using UDP, not TCP/IP. TCP/IP guarantees delivery (or an error, I suppose), and UDP does not. It is possible that a UDP message just goes off into the ether. We probably need to use TCP/IP in our game. Maybe.

Reading on …

I’m ignoring setpeername for now, because it’s some kind of setup. I think 1.1.1.1 is Cloudflare’s public DNS server, and 80 is surely the port number. We’ll come back to that if we need to.

Then we set up our member variables myIp and myPort using getsockname and I figure it’s clear enough what that does. Then we parse myIp into ip1 and ip2. That’s going to be the first three numbers of the ip in ip1 and the last in ip2. I happen to remember some of what’s going to happen, and he’s basically going to send a message to every other address on our local network. We’ll see.
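Here is a small sketch of that parsing step on its own, so we can see what lands in ip1 and ip2. The function name splitIp is mine, not Dave’s. One thing I noticed while writing it: in a Lua pattern a bare dot matches any character, so this version escapes the dots as %. and anchors both ends. Dave’s pattern works fine on a real IP; this one is just pickier about what it accepts.

```lua
-- Split a dotted IPv4 string into its first three octets (with the
-- trailing dot) and its last octet. "%." matches a literal dot;
-- a bare "." would match any character at all.
-- splitIp is a hypothetical helper name, not part of Dave's program.
function splitIp(ip)
    return string.match(ip, "^(%d+%.%d+%.%d+%.)(%d+)$")
end

local prefix, last = splitIp("192.168.197.247")
print(prefix, last)
```

With those two pieces in hand, prefix..z gives each candidate address on the network for z from 1 to 255.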

So there’s that method. Then:

function remoteDisplay:setMySocket()    
    self.server=self.socket.udp()
    self.server:setsockname(self.myIp,5544)
    self.server:settimeout(0)
end

Interesting. We get a new UDP socket and we set its name to our ip and its port to 5544. I’ve read the words on this function, and I freely grant that I don’t understand them entirely.

unconnected:setsockname(address, port)

Binds the UDP object to a local address.

Address can be an IP address or a host name. If address is '*' the system binds to all local interfaces using the constant INADDR_ANY. If port is 0, the system chooses an ephemeral port.

If successful, the method returns 1. In case of error, the method returns nil followed by an error message.

Note: This method can only be called before any datagram is sent through the UDP object, and only once. Otherwise, the system automatically binds the object to all local interfaces and chooses an ephemeral port as soon as the first datagram is sent. After the local address is set, either automatically by the system or explicitly by setsockname, it cannot be changed.

Some kind of setup that says, OK, we are really using this ip address and port. I don’t understand the ramifications yet. Moving on, we set timeout to zero.

According to my reading, receive generally blocks, with a timeout. If we set the timeout to zero, receive will not block, instead returning no message and, I’m sure, some kind of code. We’ll probably find out.
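To check that reading, here is a minimal sketch, assuming LuaSocket is available as require("socket"). It binds a UDP socket to the loopback address, sets the timeout to zero, and tries to receive when nothing has been sent. The function name pollEmptySocket is made up for the experiment.

```lua
local socket = require("socket")

-- Try to receive on a socket that nobody has sent to.
-- pollEmptySocket is a hypothetical name for this experiment.
function pollEmptySocket()
    local listener = assert(socket.udp())
    -- Port 0 asks the system for any free (ephemeral) port.
    assert(listener:setsockname("127.0.0.1", 0))
    listener:settimeout(0)          -- zero timeout: do not block
    local data, err = listener:receivefrom()
    listener:close()
    return data, err                -- expect nil plus an error string
end

print(pollEmptySocket())
```

As I understand the docs, the "some kind of code" is the string "timeout" in the second return value, which is why Dave’s code only has to check whether data is nil.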

What I’m doing here may seem like fumbling, and it is. I’ve read some of the docs and sort of understand them. I have one such open now, from which I pasted that stuff above.

My knowledge is spotty, weak, and probably often wrong. I’m using this code and my understanding of what it does externally, to fill in my understanding.

That’s all for setMySocket, which I might have called setUpMySocket.

Next is this:

function remoteDisplay:setTheirSocket()
    self.client=self.socket.udp()
    self.client:settimeout(0)
end

This clearly just creates another socket, intended, I suppose, to be the one we read. We’ll find out. So far, that socket doesn’t have that setsockname stuff done to it, so the document I quoted tells me that it probably isn’t live yet. And next we call:

function remoteDisplay:getTheirIp()
    self.client=self.socket.udp()
    self.client:settimeout(0)
    -- send a message to everyone on this network except to myself
    for z=1,255 do
        if z~=tonumber(self.ip2) then
            self.client:setpeername(self.ip1..z,5544)
            self.client:send("get their ip")
        end
    end
end

This code immediately undoes the code above, dropping the old client and making a new one. If I didn’t know Dave’s habits, this would confuse me, but my guess is that the setTheirSocket and this code are reflecting something that he didn’t clean up. I suspect we could remove setTheirSocket safely. Anyway this method … loops over all the possible last numbers in the ip, 1-255, and, if it isn’t our own ip2, sends the message “get their ip” out to that ip address.

That sends “get their ip” to anyone listening to port 5544, on our local network.

I gather that setpeername is what it takes to tell the socket the address to which to send.

Now the init is over and we have a server socket and have sent “get their ip” all over the network. (I read somewhere that you can broadcast to your whole network at once. We might want to try that.)
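For what it’s worth, here is roughly what I think the broadcast version would look like. This is a sketch I have not run on the iPads. It assumes LuaSocket’s "broadcast" socket option and the limited broadcast address 255.255.255.255, and the function name broadcastDiscovery is invented. Some networks and operating systems refuse or drop broadcasts, so the function returns whatever sendto reports rather than asserting.

```lua
local socket = require("socket")

-- Send the discovery message once to the whole local network,
-- instead of looping over 255 addresses.
-- broadcastDiscovery is a hypothetical name; untested on Codea.
function broadcastDiscovery(port)
    local udp = assert(socket.udp())
    udp:settimeout(0)
    -- Bind to an IPv4 wildcard address first, so the socket exists
    -- before we set an option on it.
    assert(udp:setsockname("0.0.0.0", 0))
    assert(udp:setoption("broadcast", true))
    local ok, err = udp:sendto("get their ip", "255.255.255.255", port)
    udp:close()
    return ok, err      -- ok is nil, with err set, if the send was refused
end
```

If this works, the receiving side would not need to change at all, since it already answers whoever sent "get their ip".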

So now what? Here’s a trick that Dave used:

function draw()
    background(0)
    
    ---[[
    cnt=cnt+1
    if cnt%60==0 then
        sh:sendMessage("draw count  "..cnt)
    end
    --]]
    
    sh:draw()
end

On every draw cycle, he calls draw on our remoteDisplay instance, and that looks like this:

function remoteDisplay:draw()
    self:receiveTheirMessage()
    fill(255)  
    text("My IP "..self.myIp,WIDTH/2-100,HEIGHT-25)
    text("Their IP "..self.theirIp,WIDTH/2+100,HEIGHT-25)
end

So about 60 times a second, we call our receive, and since there is a zero timeout, that may or may not receive anything. Let’s see what it does:

function remoteDisplay:receiveTheirMessage()
    local data,msg,port=self.server:receivefrom()
    if data~=nil then
        self.theirIp=msg       
        if data=="get their ip" then
            self.client=self.socket.udp()
            self.client:settimeout(0)
            self.client:setpeername(msg,5544)
            self.client:send("got their ip"..self.myIp)            
        elseif string.sub(data,1,12)=="got their ip" then
            self.theirIp=msg
        else
            print(data)
        end
    end
end

Well, without reading the docs, we can see that we’re trying to receive a message to the server, and we can guess that receivefrom returns data, message, and port, and we see that data can be nil, probably when there was nothing to receive, and when it isn’t, it might be “get their ip”, which is exactly what we sent when we were trying to get their ip. So data, apparently, is a lot like what we sent out.

If we’ve received “get their ip”, we setup a client, yet again. From the call to setpeername, it looks to me as if msg is their ip. Learning, slowly. The client will be whoever sent us that message, and we send them back the message “got their ip” with our ip concatenated. It’ll be something like “got their ip192.168.197.247”. So we send that off and that’s that.

If on the other hand the string starts with “got their ip”, we save their ip in our member variable theirIp.

And finally, if that’s not the case, we print the data we received.
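Those three cases are easier for me to see with the socket calls taken out. Here is a sketch that only decides what to do with a datagram. The function classifyDatagram and the action names "reply", "remember", and "print" are mine, invented for illustration; Dave’s method does the deciding and the doing in one place.

```lua
-- Decide what to do with one received datagram, without touching a socket.
-- Returns an action name, plus the text to send back or print, if any.
-- classifyDatagram and its action names are hypothetical.
function classifyDatagram(data, myIp)
    if data == "get their ip" then
        -- Someone is looking for us: answer with our own ip appended.
        return "reply", "got their ip" .. myIp
    elseif string.sub(data, 1, 12) == "got their ip" then
        -- Someone answered our search: the caller should save the sender's ip.
        return "remember"
    else
        -- Anything else is ordinary traffic to display.
        return "print", data
    end
end

print(classifyDatagram("get their ip", "192.168.197.247"))
```

A nice side effect of splitting it this way is that the protocol logic could be tested without two iPads.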

I’m ready to write a few words about how Lua sockets UDP works and how it is used. This will be a weak description, and I’ll write it as if someone had asked “what do you know about Lua UDP?” and I didn’t want to lie to them.

What I’ve “Learned”

What do you know about socket and UDP?

Well, not as much as I’d like to. Here’s what I understand about it. Caveat Emptor.

UDP sends “datagrams”, which amount to strings, from the program to a designated ip address. I think I’ve read that that can be a dotted IP, or an address that needs to be looked up on a name server. I’ve only seen it used with an explicit IP.

Anyway, we can create a UDP object on our machine’s ip and some port, and we can call “receivefrom”, which returns the data from a sender, plus the sender’s ip and port. There’s also a receive that only returns the data.

The receive functions block execution, with a timeout. If we set the timeout to zero, the receives will return immediately with no data. I suppose that if the timeout were set to one second, if we still didn’t have a message, it’d return with no data. Probably there’s a message saying it was a timeout.

There’s a send function that sends provided data to a “peer”, the ip and port you want to send to. The peer is set up with setpeername. There’s also a sendto that lets you send the message providing the peer ip and port in that function call. I think it’s equivalent to setpeername/send.

Once you have an ip and port, you can send data as often as you wish. The other end just repeatedly executes receive or receivefrom and deals with the messages.
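To make that concrete, here is a minimal sketch of one datagram going out and coming back, all on one machine over the loopback address. It assumes LuaSocket via require("socket"); the name loopbackRoundTrip is made up. It uses sendto, so there is no setpeername step.

```lua
local socket = require("socket")

-- Send one datagram to ourselves over loopback and return what arrives.
-- loopbackRoundTrip is a hypothetical name for this experiment.
function loopbackRoundTrip(message)
    local receiver = assert(socket.udp())
    assert(receiver:setsockname("127.0.0.1", 0))  -- port 0: system picks a free port
    receiver:settimeout(1)                         -- wait up to one second
    local ip, port = receiver:getsockname()        -- find out which port we got

    local sender = assert(socket.udp())
    assert(sender:sendto(message, ip, port))       -- destination given in the call

    -- receivefrom returns the data plus the sender's ip and port.
    local data, fromIp, fromPort = receiver:receivefrom()
    sender:close()
    receiver:close()
    return data, fromIp, fromPort
end

print(loopbackRoundTrip("hello"))
```

The sender’s port will be whatever ephemeral port the system chose, which is the port a reply would have to go back to.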

Do you think we should use it in our game?

I think we could use it. Our robot clients would get our World server’s IP address somehow. It might be a well-known address like “robotworld.com/world” or whatever. Then each client would register a robot, providing a name, which I think is just what the World wants anyway, and we’d save a client object with that name and whatever robot info World wants to save.

We’d have some kind of loop over the clients to see if they had sent us any messages. Since UDP packets are just strings, we could have them be the JSON that is part of our design.

One issue that I’ve read about is that when no client has sent a message, you probably don’t want to sit in a tight loop doing nothing, burning server time. Our AWS bills might get large. One article I read said that the CPU usage was three orders of magnitude more than the amount actually needed. I think that is a thousand times more, if they meant what they said.

I’m not sure how we’d manage that issue. I have read that there is a LuaSocket function select that blocks until there is something to receive. That function takes a collection of sockets to wait on. In our case, I think we would just have the one, the World’s one ip and port. I’ve not used this and don’t know if it really exists or exactly how it works.
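As far as I can tell from the LuaSocket reference, the function is socket.select(recvt, sendt, timeout), and it returns a table of sockets that are ready to read. Here is a sketch of how I think a server could wait on its one socket without spinning. The wrapper name waitForDatagram is mine, and I have only reasoned about this, not run it in Codea.

```lua
local socket = require("socket")

-- Wait until the given UDP socket has something to read, or until the
-- timeout (in seconds) passes. Returns true if a receive would find data.
-- waitForDatagram is a hypothetical helper name.
function waitForDatagram(udp, seconds)
    -- First argument: sockets to watch for reading. Second: for writing (none).
    local readable = socket.select({udp}, nil, seconds)
    -- The returned table is also keyed by the socket objects themselves.
    return readable[udp] ~= nil
end
```

A World server might call waitForDatagram(server, 1) in its loop and only call receivefrom when it returns true, so an idle server sleeps inside select instead of burning CPU.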

There is an issue, however, and I don’t know whether it means that we can’t use UDP: it is not lossless. It is possible that the client would send a message and it would never get to the server. If that were to happen, you might see something like tapping the control to move forward, and your robot wouldn’t move.

If we used the TCP/IP part of LuaSocket instead, message delivery is “guaranteed”. I don’t know exactly what that means. Anyway, we might find that TCP/IP is a better choice for our game.

What’s next with this?

I can see two main possibilities right now. One would be to do a similar spike to Dave’s using TCP/IP instead of UDP. That would bring us more up to speed on those calls. The other would be to spike something a bit closer to our game using UDP, sending some of our own messages back and forth.

However, I think I’d suggest that we put off connecting our World and Robot over a network until we have the messages going back and forth through JSON and the official tables in the spec. Thus far, we’re just calling and returning between Robot and World.

And that reminds me of an issue. I believe, but am not certain, that with TCP/IP, the World replies to the message from the Robot, so the Robot will (I think) just block until the message comes back. I think that means that we can use the current call/return structure in the code, doing the JSON decode and such before returning to the original call on the Robot side.
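Here is a rough sketch of that request-and-reply shape over TCP, with both ends in one program on the loopback address so it can run on a single machine. It assumes LuaSocket’s socket.bind, socket.tcp, connect, accept, send, and receive. The name tcpRoundTrip, the "reply to" text, and the one-line-per-message framing are my inventions, not anything from the game spec.

```lua
local socket = require("socket")

-- One request and one reply over TCP, both ends in this process.
-- tcpRoundTrip and the newline framing are assumptions for illustration.
function tcpRoundTrip(request)
    local server = assert(socket.bind("127.0.0.1", 0))  -- listen; system picks the port
    local ip, port = server:getsockname()

    local robot = assert(socket.tcp())                   -- stands in for the Robot
    assert(robot:connect(ip, port))
    local world = assert(server:accept())                -- stands in for the World

    assert(robot:send(request .. "\n"))                  -- newline marks end of message
    local received = assert(world:receive("*l"))         -- read one line
    assert(world:send("reply to " .. received .. "\n"))

    local reply = assert(robot:receive("*l"))            -- blocks until the reply arrives
    robot:close()
    world:close()
    server:close()
    return reply
end

print(tcpRoundTrip("forward"))
```

The last receive on the robot side is the blocking wait I was describing: the Robot’s call would simply not return until the World’s answer came back.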

If we go with UDP, the reply message is a separate message from World to Robot, not a reply, so that Robot will have to be changed a bit to accommodate that. I have a couple of ideas about how to do that, amounting to providing a callback function in each Robot call, rather than just expecting the World call to come right back with an answer.
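The callback idea might look something like this sketch. Everything here is hypothetical: the names request and handleReply, the sendFn parameter standing in for the real UDP send, and the "id:text" message format used to match a reply to its request. Our real messages would be JSON.

```lua
-- pending maps a request id to the function waiting for that reply.
local pending = {}
local nextId = 0

-- Send a command and remember who wants the answer.
-- sendFn stands in for whatever actually sends the datagram.
function request(command, sendFn, callback)
    nextId = nextId + 1
    pending[nextId] = callback
    sendFn(nextId .. ":" .. command)
    return nextId
end

-- Call this when a reply datagram arrives. It routes the body to the
-- waiting callback and returns true, or returns false if nobody is waiting.
function handleReply(datagram)
    local id, body = string.match(datagram, "^(%d+):(.*)$")
    local callback = id and pending[tonumber(id)]
    if callback then
        pending[tonumber(id)] = nil     -- each request gets at most one reply
        callback(body)
        return true
    end
    return false
end
```

The Robot would call request with a function that updates its state, and the receive code in draw would pass each incoming datagram to handleReply.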

We might do that with coroutines. I’m just not sure of the details, but I’m sure we can figure out something useful.
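Here is one way the coroutine version might go, as a sketch in plain Lua with no sockets at all. The Robot’s logic stays straight-line: it yields the command it wants sent and picks up again when a reply is handed back. The names robotScript and runScript are made up, and the list of replies stands in for datagrams arriving from the World.

```lua
-- A Robot action written as straight-line code. Each yield hands out a
-- command to send and resumes with the reply that came back.
function robotScript()
    local first = coroutine.yield("forward")
    local second = coroutine.yield("scan")
    return first .. "," .. second
end

-- Drive a script. replies is a list standing in for the World's answers.
-- Returns the list of commands the script sent and its final result.
function runScript(script, replies)
    local co = coroutine.create(script)
    local sent = {}
    local ok, value = coroutine.resume(co)       -- run to the first yield
    local i = 0
    while coroutine.status(co) ~= "dead" do
        sent[#sent + 1] = value                  -- the command the script asked for
        i = i + 1
        ok, value = coroutine.resume(co, replies[i])  -- feed it the next reply
    end
    return sent, value
end

local sent, result = runScript(robotScript, {"moved", "clear"})
print(table.concat(sent, " "), result)
```

In the real program the resume would not happen in a loop like this. It would happen in the draw cycle, at the moment receivefrom actually returned a reply.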

OK, but what’s next?

At this moment, I’m really not sure. I’d like to get back to fleshing out the robot, if fleshing out is what you do to a robot, but I might recommend doing a few more experiments with the UDP and/or TCP/IP sockets.

I’ll need some time to think about it. I’ll have a solid recommendation by tomorrow.

OK, see you then!

Right.