Client-Server III

A brief report on study and what I learned, then a look at the server side. Where is the bright side? I’d like to look at that sometimes.

I read the RealPython tutorial section on this multi-connection code, casually, not deep study. I think that my code reading was just about right. The RealPython page led me to a Socket Programming HOWTO on python.org. It was there, or near there, that I learned that the connect_ex method does make a connection, just as connect does, but the ex refers to the fact that it returns an error code: zero on success, otherwise an errno value. I gather that vanilla connect raises exceptions where this one does not, at least for errors coming from the underlying connect call. Possibly good to know.
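Here is a small sketch of that difference, as I understand it. The helper name try_connect is mine, and the throwaway listener on port 0 (meaning "let the OS pick a free port") is just an assumption to make the example self-contained.

```python
import socket

def try_connect(host, port):
    """Return connect_ex's error indicator: 0 on success, an errno value otherwise."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2.0)  # don't wait forever if something is odd
        return s.connect_ex((host, port))

# A throwaway listener so that there is something to connect to.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: the OS chooses a free port
listener.listen()
port = listener.getsockname()[1]

ok = try_connect("127.0.0.1", port)       # someone is listening
listener.close()
refused = try_connect("127.0.0.1", port)  # nobody is listening any more

print("with listener:", ok)
print("without listener:", refused)
```

With plain connect, the second attempt would raise an exception (typically ConnectionRefusedError) instead of returning a number.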

I do recommend the HOWTO, certainly if you work with sockets, but also as an interesting, useful, and enjoyable writeup of a technical topic.

I also browsed the general description of sockets in the Python docs, which is comprehensive and quite dense, including many things one will never need to know, but which ones, which ones, that is the question.

You probably have your own preferred way of learning something new, I’m just describing mine, not as a shining exemplar, just how one person does it. I always prefer involving code in my study, and generally like to write at least a few tests in whatever testing framework I’m using, to isolate and document ideas. I rarely go back to those tests for “documentation”, but sometimes I do go back if I remember that I worked something out and no longer quite remember it.

History: Back in the early days of XP and the One True Agile, we thought that tests could serve as useful documentation of how things worked. That did not pan out as we had imagined. It is true that when a test breaks, we certainly read it to figure out what was supposed to happen. But if anyone ever just browsed the tests to learn about a class, I have not met them. C’est la vie, say the old folks, of whom I am one.

Aside: I have just thought of a new way of addressing my articles on the site that is even more arcane than the one I use now. I do promise not to change the address of my articles, but it should be pretty clear from the recent URLs that you can’t guess the names. There are almost 6000 files in this site, so it’s no wonder that the naming scheme has not lasted.

OK, let’s look at the server code. I’ll paste it here. Scan it with me and then we’ll pull out parts and see what we can figure out.

import selectors
import socket
import sys
import types

sel = selectors.DefaultSelector()

def accept_wrapper(sock):
    conn, addr = sock.accept()  # Should be ready to read
    print(f"Accepted connection from {addr}")
    conn.setblocking(False)
    data = types.SimpleNamespace(addr=addr, inb=b"", outb=b"")
    events = selectors.EVENT_READ | selectors.EVENT_WRITE
    sel.register(conn, events, data=data)

def service_connection(key, mask):
    sock = key.fileobj
    data = key.data
    if mask & selectors.EVENT_READ:
        recv_data = sock.recv(1024)  # Should be ready to read
        if recv_data:
            data.outb += recv_data
        else:
            print(f"Closing connection to {data.addr}")
            sel.unregister(sock)
            sock.close()
    if mask & selectors.EVENT_WRITE:
        if data.outb:
            print(f"Echoing {data.outb!r} to {data.addr}")
            sent = sock.send(data.outb)  # Should be ready to write
            data.outb = data.outb[sent:]


if len(sys.argv) != 3:
    print(f"Usage: {sys.argv[0]} <host> <port>")
    sys.exit(1)

host, port = sys.argv[1], int(sys.argv[2])
lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lsock.bind((host, port))
lsock.listen()
print(f"Listening on {(host, port)}")
lsock.setblocking(False)
sel.register(lsock, selectors.EVENT_READ, data=None)

try:
    while True:
        events = sel.select(timeout=None)
        for key, mask in events:
            if key.data is None:
                accept_wrapper(key.fileobj)
            else:
                service_connection(key, mask)
except KeyboardInterrupt:
    print("Caught keyboard interrupt, exiting")
finally:
    sel.close()

As before, I’ll start looking at the bottom, where the “main” is. This part first:

host, port = sys.argv[1], int(sys.argv[2])
lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lsock.bind((host, port))
lsock.listen()
print(f"Listening on {(host, port)}")
lsock.setblocking(False)
sel.register(lsock, selectors.EVENT_READ, data=None)

Here we just read the args from the “command line”, which will be the host and port. I used 127.0.0.1 and 65432. We create a socket, bind it to the host/port, and listen on it. We set it to non-blocking, because we will be servicing lots of sockets and don’t want any one of them to hang the loop.

Then we register our listening socket with the selector, which was set way up above:

sel = selectors.DefaultSelector()

I’ve still not studied selectors, but what I’ve gathered is that we create one, generally a DefaultSelector, because they are generally just fine, and we register our sockets with them. I think selectors can deal with things other than sockets but I’m not sure about that. They’re on the agenda for further study, but we can see pretty well how they work just from our code here.

One thing that seems to be the case is that when we register with the selector, we can provide a data object, which, I think, can be any object one wants. Basically it serves as the buffer into which we read information or from which we write it, plus whatever other info we might want. Again, this is what I’ve gleaned, not what I’ve learned. I could be quite wrong.
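To check that gleaning, here is a minimal sketch. It uses socket.socketpair() to get two connected sockets without any network setup, and the contents of the info object are made up for illustration. The point is that whatever object we pass as data comes back, untouched, on the key that select returns.

```python
import selectors
import socket
import types

sel = selectors.DefaultSelector()
a, b = socket.socketpair()  # two connected sockets, no server needed

# Any object will do as data; this one is purely illustrative.
info = types.SimpleNamespace(name="demo", outb=b"")
sel.register(a, selectors.EVENT_READ, data=info)

b.send(b"hello")  # makes `a` readable

events = sel.select(timeout=1)
key, mask = events[0]
same_data = key.data is info   # the very object we registered
same_sock = key.fileobj is a   # and the socket we registered it with
received = key.fileobj.recv(1024)

sel.unregister(a)
a.close()
b.close()
sel.close()
```

So the selector does not interpret the data at all. It is just a place to hang per-connection state, which the server here uses for its buffers.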

So now we have our socket listening and in the selector. It is the one with the “well-known” address that our client code connected to in order to start talking with the server. We may look back at that shortly.

After we have set up our one listener, we enter our server loop:

try:
    while True:
        events = sel.select(timeout=None)
        for key, mask in events:
            if key.data is None:
                accept_wrapper(key.fileobj)
            else:
                service_connection(key, mask)
except KeyboardInterrupt:
    print("Caught keyboard interrupt, exiting")
finally:
    sel.close()

Inside a try that we’ll think about in a moment, we enter a nice tight loop “forever”. Clearly, the events list is a list of key, mask pairs, for connections that have something to say to us.

If key.data is None, that tells us that the socket is the main listener one created above, the only one we registered with no data. So we call accept_wrapper, passing key.fileobj. I think that key.fileobj will be used to set up the connection, so it is probably some network identifier thing. We’ll see.

Else, there is data, in which case we do service_connection, passing the key, mask pair. That’ll be the code that accepts messages from the various connectees, and sends messages to them.

And the rest of the try is pretty clear: we handle a keyboard interrupt (control C), and we guarantee to close the selector on our way out.
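Note that sel.close() closes the selector itself; as far as I can tell it does not close the sockets registered with it. If one wanted a tidier exit, something like the following might do. The function name close_all is my own invention, not something from the tutorial.

```python
import selectors
import socket

def close_all(sel):
    """Unregister and close every socket still registered, then close the selector."""
    # Copy the keys first, since unregistering changes the selector's map.
    for key in list(sel.get_map().values()):
        sel.unregister(key.fileobj)
        key.fileobj.close()
    sel.close()

# Try it on a pair of connected sockets.
sel = selectors.DefaultSelector()
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ, data="a")
sel.register(b, selectors.EVENT_READ, data="b")
close_all(sel)

print(a.fileno(), b.fileno())  # a closed socket reports fileno -1
```

In the server, a call like this could go in the finally clause in place of the bare sel.close().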

That all seems to make sense to me. Let’s check accept_wrapper:

def accept_wrapper(sock):
    conn, addr = sock.accept()  # Should be ready to read
    print(f"Accepted connection from {addr}")
    conn.setblocking(False)
    data = types.SimpleNamespace(addr=addr, inb=b"", outb=b"")
    events = selectors.EVENT_READ | selectors.EVENT_WRITE
    sel.register(conn, events, data=data)

The parameter sock was key.fileobj when we called this, so we are now pretty sure that what was returned there was a socket kind of thing. I’m rather sure that it is the original listener socket. When we tell it to accept, it returns a new connection (a socket, I assume) and an addr, which, I’d bet, is a host, port pair.
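Those guesses are easy to check with a tiny experiment. This sketch uses blocking sockets on the loopback address, with port 0 so that the OS picks a free port; none of the names here come from the tutorial.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: the OS chooses a free port
listener.listen()
port = listener.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))

# accept hands back a brand new socket plus the address of the far end.
conn, addr = listener.accept()
same_object = conn is listener          # the listener itself is not reused
client_address = client.getsockname()   # the client's own (host, port)

print(type(conn).__name__, addr, client_address)

for s in (conn, client, listener):
    s.close()
```

So the connection is indeed a new socket, separate from the listener, and for AF_INET the addr is a (host, port) pair describing the client’s end.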

We set the connection to non-blocking, we create a little data object with the addr info, and two buffers, inb and outb, containing empty binary strings. We create an events mask accepting read and write and register our new socket, conn with the selector. From now on, our while True loop will service this new connection as well as the original listener.
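The events mask is ordinary bit arithmetic: the two constants are distinct bits, combined with | when registering and tested with & when servicing. A little sketch, with a describe function of my own devising:

```python
import selectors

# Register interest in both reading and writing.
events = selectors.EVENT_READ | selectors.EVENT_WRITE

def describe(mask):
    """List which events are set in a mask, tested the way service_connection does."""
    names = []
    if mask & selectors.EVENT_READ:
        names.append("read")
    if mask & selectors.EVENT_WRITE:
        names.append("write")
    return names

print(describe(events))
print(describe(selectors.EVENT_WRITE))
```

The mask that select hands back uses the same bits, telling us which of the events we asked about are ready right now.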

I do not know why they called that method accept_wrapper. I might have called it accept_connection but perhaps there is a good reason for the name.

Now service_connection is up.

def service_connection(key, mask):
    sock = key.fileobj
    data = key.data
    if mask & selectors.EVENT_READ:
        recv_data = sock.recv(1024)  # Should be ready to read
        if recv_data:
            data.outb += recv_data
        else:
            print(f"Closing connection to {data.addr}")
            sel.unregister(sock)
            sock.close()
    if mask & selectors.EVENT_WRITE:
        if data.outb:
            print(f"Echoing {data.outb!r} to {data.addr}")
            sent = sock.send(data.outb)  # Should be ready to write
            data.outb = data.outb[sent:]

We see two cases, read and write. We recall that the client side, upon connection, sends a series of messages from a list. So we’re likely to have something to read one of these days, and we’ll do this:

    sock = key.fileobj
    data = key.data
    if mask & selectors.EVENT_READ:
        recv_data = sock.recv(1024)  # Should be ready to read
        if recv_data:
            data.outb += recv_data
        else:
            print(f"Closing connection to {data.addr}")
            sel.unregister(sock)
            sock.close()

One of the things I learned is that there will always be at least one byte of data coming back from recv, until the remote socket disconnects. Then, and only then, recv_data is empty: an empty bytes object, b"", not None. So, the else clause, knowing that the remote is gone, unregisters our socket from the selector and closes it. My reading suggests that it will get closed anyway, but advises against counting on that.
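A quick experiment shows the empty result on disconnect. Again I use socket.socketpair() to avoid needing a server; the variable names are just for this sketch.

```python
import socket

a, b = socket.socketpair()

b.send(b"bye")
b.close()  # the remote end goes away

first = a.recv(1024)   # the bytes sent before the close
second = a.recv(1024)  # nothing left and the peer is gone: empty bytes

print(repr(first), repr(second))

a.close()
```

Since an empty bytes object is falsy, the server’s `if recv_data:` test is all it takes to tell “got something” from “remote has closed”.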

The more interesting case, of course, is what we do if we have data: we add whatever recv_data is to the socket’s own output buffer outb. A more useful program might process that information (when it was found to be complete) and put a useful answer into the socket’s outb.

When this is done, we are finished with the read. Next time around, and indeed all the time, we’ll get an opportunity to write:

    sock = key.fileobj
    data = key.data
    ...
    if mask & selectors.EVENT_WRITE:
        if data.outb:
            print(f"Echoing {data.outb!r} to {data.addr}")
            sent = sock.send(data.outb)  # Should be ready to write
            data.outb = data.outb[sent:]

Our not very clever server, upon finding the connection available for writing, sends bytes from data.outb. We just put some bytes in there, you’ll recall, from the socket’s own recv_data. The send returns the number of bytes sent, which may not empty the entire buffer, so we slice off the number of bytes sent, possibly leaving more in the buffer for next time around. Sooner or later, we have sent them all, in which case the event does nothing.
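To watch the slicing at work without depending on what a real network decides to do, here is a sketch with a pretend socket. ChunkySocket and write_some are hypothetical names of mine: the first accepts only a few bytes per send, and the second does what one EVENT_WRITE pass does in the server.

```python
class ChunkySocket:
    """Hypothetical stand-in for a socket whose send() takes at most `limit` bytes."""

    def __init__(self, limit):
        self.limit = limit
        self.sent = b""

    def send(self, data):
        chunk = data[:self.limit]
        self.sent += chunk
        return len(chunk)  # like a real send: the number of bytes accepted

def write_some(sock, outb):
    """One write pass: send what we can and return the unsent remainder."""
    if outb:
        sent = sock.send(outb)
        outb = outb[sent:]
    return outb

sock = ChunkySocket(limit=4)
outb = b"hello world"
passes = 0
while outb:
    outb = write_some(sock, outb)
    passes += 1

print(passes, sock.sent)
```

Eleven bytes at four per pass takes three passes, and nothing is lost or sent twice, because each pass picks up exactly where the last one stopped.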

Reflection

So, it begins to become clear how one might do something useful. We could, for example, send a number over to our server and have our server take its square root and send that back, as an example of a very poor way to calculate the square root. For that to work, we would have to come up with an agreement between client and server as to when the input message from the client was complete. Because of the vagaries of the Internet, we are not sure that all the bytes of our message are sent, nor that they are received all at once at the other end. And there is no standard way of indicating end of message. Options include:

fixed length messages
some kind of end of message pattern
begin with a message length, probably formatted in fixed length, e.g. two bytes

But that detail aside, it’s easy enough to see that on the READ, we’ll accumulate enough bytes to represent the number, and then take its square root, convert that back to bytes, and stuff the result in the output buffer.
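As a rough sketch of the third option, here is one way the length-prefix idea and the square root might fit together. Everything here is my own assumption, not the tutorial’s design: a two-byte length in network byte order, the number sent as ASCII text, and the function names frame, extract_messages, and sqrt_reply.

```python
import math
import struct

HEADER = struct.Struct("!H")  # two-byte unsigned length, network byte order

def frame(payload):
    """Prefix a payload with its length."""
    return HEADER.pack(len(payload)) + payload

def extract_messages(buffer):
    """Split a receive buffer into (complete_messages, leftover_bytes)."""
    messages = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # the rest of this message has not arrived yet
        messages.append(buffer[HEADER.size:end])
        buffer = buffer[end:]
    return messages, buffer

def sqrt_reply(message):
    """Turn a message like b"16" into a framed reply holding its square root."""
    number = float(message.decode("ascii"))
    return frame(str(math.sqrt(number)).encode("ascii"))

wire = frame(b"16")
partial, rest = extract_messages(wire[:3])  # only part of it has arrived
complete, leftover = extract_messages(wire)
reply = sqrt_reply(complete[0])
print(partial, complete, reply)
```

In the server, the READ case would append recv_data to an input buffer, call something like extract_messages on it, keep the leftover for next time, and append each reply to outb.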

I’m actually a bit tempted to do something like that just for fun, but not this morning. I have visitors coming and preparations to prepare.

Summary

The shape of socket programming is starting to loom out of the fog of my ignorance. I’ve learned some general ideas and a few specifics from reading the tutorial, HOWTO, and documentation, and learned a lot more by reading and figuring out the code in front of me. For me, as the saying goes, “Exemplum docet”.

Each person’s style of learning belongs to them. Some like videos, some like to read manuals, some like to work from example code and modify it. I like all those things, though in general, I like videos the least, because we’re pretty much forced to consume those in the order presented. I often want to skip around, or I’m just looking for one small tidbit, and I don’t want to sit through a half hour of lecture to find what I need. But YMMV, you do you, and so on.

I find that having the code in an editor in front of me works much better than reading it from a tutorial, because I can skip over things the first time through and can jump around to see what I want to see. And, again, you do you!

The next example in the RealPython tutorial is called Application Client and Server, and they provide four files, not just two, ‘app-server.py’, ‘libserver.py’, ‘app-client.py’ and ‘libclient.py’. The ‘lib’ ones, reading ahead, seem to be going to contain a Message class, probably the “domain” information for the app. We’ll see, probably next time or the time after that.

See you then!