Client-Server 2

Today I plan to try to run and understand the second example in the RealPython tutorial. See how an old man learns about unfamiliar code.

I do very much recommend RealPython if you’re learning Python, or want to keep up on areas you’re not yet expert in. They’re the most consistently useful source I’ve found, and I’m counting python.org in that list. And I can assure you that their explanation of the code we’ll look at today is better and more comprehensive than mine. What you see here is a veteran programmer learning from an example. What you see there is a very good and detailed tutorial, which the veteran programmer has not studied in detail because he’s impatient to try things.

The echo server we played with yesterday just scratched the surface, showing us some elementary setup, sending and receiving. As soon as it had done one send and receive, it exited. Not terribly useful really. This next demo handles multiple connections, I’m told, and keeps running for a while. We’ll see about that.

I’m going to jump right to the whole files, none of this step-by-step for me.

Hmm: We should talk about that, I’m usually on the step-by-step team.

We’re probably going to have to fiddle these to run them, but we’ll see.

Turns out you can use file-level copy/paste to get files into your project. Thanks PyCharm!

I can’t resist trying to run the things without further ado. I find that PyCharm has a Run with Parameters option, just what I need to pass in the host and port (and, apparently, connection count in the client side code).

I’m just gonna do it. Worst case I have to reboot, I suppose.

I love it when I jump off a cliff and invent wings on the way down. It worked!

From the server:

/Users/ron/PycharmProjects/cs1/.venv/bin/python /Users/ron/PycharmProjects/cs1/multiconn-server.py 127.0.0.1 65432 
Listening on ('127.0.0.1', 65432)
Accepted connection from ('127.0.0.1', 58143)
Accepted connection from ('127.0.0.1', 58144)
Echoing b'Message 1 from client.Message 2 from client.' to ('127.0.0.1', 58143)
Accepted connection from ('127.0.0.1', 58145)
Echoing b'Message 1 from client.Message 2 from client.' to ('127.0.0.1', 58144)
Echoing b'Message 1 from client.Message 2 from client.' to ('127.0.0.1', 58145)
Closing connection to ('127.0.0.1', 58143)
Closing connection to ('127.0.0.1', 58144)
Closing connection to ('127.0.0.1', 58145)

And from the client:

/Users/ron/PycharmProjects/cs1/.venv/bin/python /Users/ron/PycharmProjects/cs1/multiconn-client.py 127.0.0.1 65432 3 
Starting connection 1 to ('127.0.0.1', 65432)
Starting connection 2 to ('127.0.0.1', 65432)
Starting connection 3 to ('127.0.0.1', 65432)
Sending b'Message 1 from client.' to connection 1
Sending b'Message 2 from client.' to connection 1
Sending b'Message 1 from client.' to connection 2
Sending b'Message 2 from client.' to connection 2
Sending b'Message 1 from client.' to connection 3
Sending b'Message 2 from client.' to connection 3
Received b'Message 1 from client.Message 2 from client.' from connection 1
Closing connection 1
Received b'Message 1 from client.Message 2 from client.' from connection 2
Closing connection 2
Received b'Message 1 from client.Message 2 from client.' from connection 3
Closing connection 3

Process finished with exit code 0

The server is still running. It stops only on keyboard interrupt. I don’t seem to have a window where I can type to this baby. I have discovered, too late, that there is a checkbox in the run config. I’ll just stop it and hope that it closes the socket.

Good news! Pressing the stop button results in this in the output:

Caught keyboard interrupt, exiting

Process finished with exit code 0

Life is good, and JetBrains really does things right.

Let’s now have a look at the code to see how it does what it did. Reading the server output tells us that the server interleaved accepting connections and responding to messages. The message numbering is a bit confusing. Let’s have a quick look at what the clients actually do.

I’ll browse the code and then paste segments here as I focus on them. I start at the bottom, where the client starts:

if len(sys.argv) != 4:
    print(f"Usage: {sys.argv[0]} <host> <port> <num_connections>")
    sys.exit(1)

host, port, num_conns = sys.argv[1:4]
start_connections(host, int(port), int(num_conns))

try:
    while True:
        events = sel.select(timeout=1)
        if events:
            for key, mask in events:
                service_connection(key, mask)
        # Check for a socket being monitored to continue.
        if not sel.get_map():
            break
except KeyboardInterrupt:
    print("Caught keyboard interrupt, exiting")
finally:
    sel.close()

We start up, get our parameters (or print the hint), pull out the host, port, and number of connections and start the connections. We’ll check that in a moment. Here in the main, we enter a loop.

Rather clearly, select is going to give us some events. It’s not clear how many we’ll really get, but however many, we’ll call service_connection to deal with key and mask, whatever they may be. We’ll find out. I don’t know what get_map is, but clearly if we don’t get one, we break out. Since we did break out, it must be that we didn’t get one. I’ll search to see what it is. My search is not fulfilling. I’ll have to check the tutorial, I think. Anyway, clearly it comes back None or False or something equally falsy, and we exit. Now I want to look at service_connection, even though one might imagine looking at start_connections first. It does happen first, but I think I want to see how this thing does its work.
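Here is a small sketch of what get_map appears to do, going by the selectors documentation: it returns a mapping from registered file objects to their keys, and an empty mapping counts as false, which is what the `if not sel.get_map()` check leans on. The socketpair is only my stand-in for a real client connection, and the variable names are made up for the demo.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()  # stand-in for a real connection (assumption for the demo)

sel.register(a, selectors.EVENT_READ, data="demo")
registered_while_open = len(sel.get_map())  # one socket is being monitored

sel.unregister(a)
map_is_empty = not sel.get_map()  # empty mapping is falsy, so the main loop would break

a.close()
b.close()
sel.close()
```

So the loop ends when every connection has been unregistered, not when select comes back empty.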

def service_connection(key, mask):
    sock = key.fileobj
    data = key.data
    if mask & selectors.EVENT_READ:
        recv_data = sock.recv(1024)  # Should be ready to read
        if recv_data:
            print(f"Received {recv_data!r} from connection {data.connid}")
            data.recv_total += len(recv_data)
        if not recv_data or data.recv_total == data.msg_total:
            print(f"Closing connection {data.connid}")
            sel.unregister(sock)
            sock.close()
    if mask & selectors.EVENT_WRITE:
        if not data.outb and data.messages:
            data.outb = data.messages.pop(0)
        if data.outb:
            print(f"Sending {data.outb!r} to connection {data.connid}")
            sent = sock.send(data.outb)  # Should be ready to write
            data.outb = data.outb[sent:]

We’re going to have to find more info on these selector deals but we see that the socket is stored in the fileobj member of the key. Interesting. If I knew more about UNIX and MacOS than I do, I think I’d know that all kinds of stream-like things are represented by file-like objects. But I don’t know that, I just vaguely remember it.

Anyway, we get the sock (socket, no doubt; thanks for saving those two characters) and data.

Clearly the mask tells us whether we have a read event or a write. I wonder how we got those write events. Probably we’ll find out.
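To see key and mask for myself, a minimal sketch, assuming a socketpair as a stand-in for the real connection: register one end for both event types, call select once, and poke at what comes back. As far as I can tell from the selectors docs, the key carries the registered socket in fileobj and our own object in data, and the mask is a set of bits to test with `&`.

```python
import selectors
import socket
import types

sel = selectors.DefaultSelector()
a, b = socket.socketpair()  # stand-in for a client connection (assumption)
a.setblocking(False)

data = types.SimpleNamespace(connid=1)
sel.register(a, selectors.EVENT_READ | selectors.EVENT_WRITE, data=data)

events = sel.select(timeout=1)
key, mask = events[0]

same_socket = key.fileobj is a                 # the socket we registered
same_data = key.data is data                   # the object we attached
can_write = bool(mask & selectors.EVENT_WRITE)
can_read = bool(mask & selectors.EVENT_READ)   # nothing has been sent to us yet

sel.unregister(a)
a.close()
b.close()
sel.close()
```

In this little run the socket reports writable and not readable, even though nobody has done anything to it.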

On a read, we recv (thanks again for that valuable saving of extra characters) some data and add its length into the recv_total attribute of data. And we print what we got. We do not save what we got, which is interesting. You’d think we’d have to. I guess this isn’t really an app.

Then, if either we got no data, or the total received is equal to the expected msg_total in data, we unregister and close our socket. (I try to make a mental note that there is more than one thing to do here, unregister and then close. Doubtless I’m going to forget to do that soon.)

Writing is even more arcane to me. I need to know what this data thing is that knows so much about what’s going on. Anyway, when we have something to send in outb, we send it and get a return back that tells us how much we sent. We strip that much off the outb thing. This means that if outb was really long and didn’t get sent in one go, we’d keep pecking away at it.
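The pecking-away idea can be sketched without a network at all. TrickleSocket below is a hypothetical class of my own, not anything from the tutorial or the socket library: it pretends to be a socket that accepts at most a few bytes per send, so we can watch the slice-off-what-was-sent loop do its job.

```python
class TrickleSocket:
    """Hypothetical stand-in for a socket that accepts at most `limit` bytes per send."""

    def __init__(self, limit):
        self.limit = limit
        self.received = b""

    def send(self, payload):
        chunk = payload[: self.limit]
        self.received += chunk
        return len(chunk)  # like socket.send, report how many bytes were taken


sock = TrickleSocket(limit=8)
outb = b"Message 1 from client."  # 22 bytes
calls = 0
while outb:
    sent = sock.send(outb)
    outb = outb[sent:]  # drop what went out, keep the rest for next time
    calls += 1
```

With an 8-byte limit the 22-byte message goes out in three sends. In the real client the "loop" is the event loop itself: each write event sends whatever it can and leaves the remainder in outb.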

Quick Reflection

So, I’m beginning to get the drift, and I clearly need to understand more about what this key thing is and its data component in particular. Coming up on time to RTFM if we can find one. But let’s check the rest of the client.

import selectors
import socket
import sys
import types

sel = selectors.DefaultSelector()
messages = [b"Message 1 from client.", b"Message 2 from client."]


def start_connections(host, port, num_conns):
    server_addr = (host, port)
    for i in range(0, num_conns):
        connid = i + 1
        print(f"Starting connection {connid} to {server_addr}")
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        sock.connect_ex(server_addr)
        events = selectors.EVENT_READ | selectors.EVENT_WRITE
        data = types.SimpleNamespace(
            connid=connid,
            msg_total=sum(len(m) for m in messages),
            recv_total=0,
            messages=messages.copy(),
            outb=b"",
        )
        sel.register(sock, events, data=data)

Further Reflection

Ah. Some questions begin to be answered. The data thing is registered with the DefaultSelector sel. I don’t know quite what that is, yet, but it seems to hold a socket, the events we care about, and our data object.

The data object, we can see even if, like me, we do not know quite what SimpleNamespace is, appears to be an object with those members. We see that it sets msg_total to the sum of the lengths of the messages. So the writing action is going to write all the messages one after another until they are all consumed.
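A quick sketch of the SimpleNamespace part, using the same fields as the client code: as I read the types docs, it is just an object whose attributes are whatever keywords you hand it, and you can read and assign them freely afterward.

```python
import types

messages = [b"Message 1 from client.", b"Message 2 from client."]

data = types.SimpleNamespace(
    connid=1,
    msg_total=sum(len(m) for m in messages),  # 22 + 22 bytes
    recv_total=0,
    messages=messages.copy(),  # each connection gets its own list to consume
    outb=b"",
)

# What the write branch does when outb is empty: move the next message into it.
data.outb = data.messages.pop(0)
```

The copy matters: popping from data.messages leaves the module-level messages list alone, so the next connection still gets both messages.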

We’ll clearly register however many connections were asked for, numbering them from 1 to N. They’re not blocking, which presumably means that they don’t hang waiting for data, stopping the whole program.
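To check my presumption about non-blocking, a minimal sketch, again with a socketpair standing in for a real connection: asking a non-blocking socket for data that isn't there raises BlockingIOError right away instead of waiting.

```python
import socket

a, b = socket.socketpair()  # stand-in for a real connection (assumption)
b.setblocking(False)

try:
    b.recv(1024)  # nothing has been sent, so there is nothing to read
    would_have_blocked = False
except BlockingIOError:
    would_have_blocked = True  # a blocking socket would have sat here waiting

a.close()
b.close()
```

That would explain why the client only calls recv after the selector says the socket is ready to read.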

And at a guess, connect_ex is the external connection to host and port, since that’s what it is set to.
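My guess deserves a check. According to the Python socket documentation, connect_ex is like connect except that it returns an error indicator instead of raising an exception: 0 on success, otherwise an errno value. Here is a sketch with a throwaway listener of my own on an OS-chosen port; none of this setup is from the tutorial. In the tutorial the socket is non-blocking, so I would expect an "in progress" code there rather than 0, though I have not verified that.

```python
import socket

# A throwaway listener on an OS-chosen port, standing in for the server (assumption).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
addr = listener.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ok_result = client.connect_ex(addr)  # 0 means the connection succeeded
client.close()
listener.close()

# With nothing listening, connect() would raise; connect_ex hands back an errno value.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
refused_result = client.connect_ex(addr)
client.close()
```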

Remaining Question

I don’t know how these selector things work. In particular, I don’t understand when we’d see a WRITE one. I can imagine that the socket will be checked to see if it has anything to READ … maybe a socket is always ready to write?

That aside, the code has given me a fairly decent sense of how it must work. I’m certainly not equipped to write new code, but there are already places we could imagine changing to get somewhat different behavior.

I think the way the client finishes up is odd:

    if mask & selectors.EVENT_READ:
        recv_data = sock.recv(1024)  # Should be ready to read
        if recv_data:
            print(f"Received {recv_data!r} from connection {data.connid}")
            data.recv_total += len(recv_data)
        if not recv_data or data.recv_total == data.msg_total:
            print(f"Closing connection {data.connid}")
            sel.unregister(sock)
            sock.close()

We set msg_total to the number of characters (bytes) in our messages, which means we’re going to stop when we have seen all those bytes echoed back to us. I’d think that a more normal socket would receive a ‘stop’ message, or some similar end of file indication. We wouldn’t normally know how many characters we were going to receive from the server.
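There is in fact an end-of-stream indication hiding in that code: the `not recv_data` half of the condition. As I understand the socket docs, recv returns empty bytes once the other side has closed the connection. A minimal sketch, using a socketpair as a stand-in for the client and server:

```python
import socket

a, b = socket.socketpair()  # stand-in for the two ends of a connection (assumption)
a.sendall(b"last words")
a.close()  # the peer hangs up

first = b.recv(1024)   # the data that was sent before the close
second = b.recv(1024)  # empty bytes: the stream has ended
b.close()
```

So a client that didn't know the byte count could simply read until recv comes back empty, provided the server closes its end when it is done.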

I presume this is just an artifact of the example chosen, trying to keep it simple enough.

The main thing I want to learn about is the selector idea, in particular regarding when it will show ready to write.

By way of experiment, I do this:

    if mask & selectors.EVENT_WRITE:
        if not data.outb and data.messages:
            data.outb = data.messages.pop(0)
        if data.outb:
            print(f"Sending {data.outb!r} to connection {data.connid}")
            sent = sock.send(data.outb)  # Should be ready to write
            data.outb = data.outb[sent:]
        else:
            print('nothing to send')

I want to see if it just prints that once, or many times. The answer is, many times. Very many. I conclude that a socket is pretty much always ready to write, and that you’ll just get write events coming out your ears. I change the print:

            print(f'nothing to send {data.connid}')

And sure enough

/Users/ron/PycharmProjects/cs1/.venv/bin/python /Users/ron/PycharmProjects/cs1/multiconn-client.py 127.0.0.1 65432 3 
Starting connection 1 to ('127.0.0.1', 65432)
Starting connection 2 to ('127.0.0.1', 65432)
Starting connection 3 to ('127.0.0.1', 65432)
Sending b'Message 1 from client.' to connection 1
Sending b'Message 2 from client.' to connection 1
Sending b'Message 1 from client.' to connection 2
Sending b'Message 1 from client.' to connection 3
nothing to send 1
Sending b'Message 2 from client.' to connection 2
Sending b'Message 2 from client.' to connection 3
nothing to send 1
nothing to send 2
nothing to send 3
nothing to send 1
nothing to send 2
nothing to send 3
...

OK, so that means that our event loop is just going to spin, offering our program the chance to write, over and over. Probably a “real” program would have a buffer somewhere that it could fill up and it would get sent. Or it might add new messages to the messages list, which is serving as the output queue. Something like that.
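One possibility for taming the spin, offered as a sketch and not as what the tutorial does: the selectors module has a modify call that changes which events we are asking about. If there is nothing to send, we could ask only about reads, and switch writes back on when something lands in the queue. The socketpair is again a stand-in for a real connection.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()  # stand-in for a client connection (assumption)
a.setblocking(False)

sel.register(a, selectors.EVENT_READ | selectors.EVENT_WRITE)
ready_before = sel.select(timeout=0)  # the idle socket still reports "writable"

# Nothing queued to send, so stop asking about writability.
sel.modify(a, selectors.EVENT_READ)
ready_after = sel.select(timeout=0)  # now there is nothing to report

sel.unregister(a)
a.close()
b.close()
sel.close()
```

With only read events registered, select can wait out its timeout quietly instead of returning immediately every time around the loop.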

Summary

I think we’ve done enough exploration for today. We’ll explore the server next time.

I mentioned step-by-step, and what we did here was slam two big chunks of code into our project and just run them.

Now the RealPython tutorial is sort of step-by-step, though I have still not gone through it. But it’s not small steps each of which will run on its own or pass tests on its own. They just go through the big program in an order that they think is useful. And, if I wanted to follow the tutorial, I’d just read it … but I couldn’t put the code in and run it bit by bit. It only works as two complete programs.

That’s what we generally encounter in our own work: a whole big program, parts of which are unfamiliar. And we need to understand them, at least well enough to fix them or enhance them. We can’t learn them all at once: we are fundamentally serial beings. We can only choose where we focus our serial learning. So you just saw me learning step by step, looking at small pieces of code. It’s just that we couldn’t really run them in pieces, so I slammed them in, ran them and observed what they did, and then set out to scan, explore, and study.

It’s step by step, but the step sizes aren’t quite as optional as we’d like and, generally, we can’t run them bit by bit.

An Option: Sometimes you’ll see people suggest that we can learn how a program works by refactoring it. Often, we can in fact do that, and often, we can even write tests for the pieces we refactor out. Because of my history, which dates back to 1961, when the computer was hundreds of feet below me, underground, I have the habit of understanding code by reading it, not so much by refactoring it. But the refactoring way is quite possibly useful, and both you and I should probably consider it.

What you see above is one programmer, very experienced overall, but not terribly experienced in Python, and particularly inexperienced in network programming, and totally clueless in whatever this Selector thing is, beginning to work out what some existing code does, in the absence of much documentation about the program.

Now in this case, there is documentation, in particular the very well-written tutorial on RealPython, but that is often not the case on the ground. On the ground, generally there is no documentation, or the documentation is so old as to need white gloves when we turn the pages, and parts of it are clearly Just Plain Wrong, and anyway the document is a million pages long and we only wanted to know these couple of things and if they’re in there we can’t find them.

The code we have here does not tell its story particularly well, but it’s at least fairly straightforward and we can begin to figure out what the story is.

And we find ourselves with some more pointed questions, which we can often research more effectively than our general Whatta??? questions about the code we face. In particular, my questions right now are:

What’s the deal on this selector thing? I’ll study the selectors topic in Python for that one.
What does one do about the repeated WRITE events in real life? Do we just ignore them and let the app spin? Doesn’t that overheat one of our cores?
What does the tutorial say? Since I do have it available, I’ll at least scan it now to see what I’ve mistaken, and to fill in my gaps. This is a luxury: usually we don’t have a convenient tutorial about the horrible legacy code we’re working on, even if it’s the horrible legacy code we wrote yesterday.
What do my colleagues say? Will anyone toot me on Mastodon and correct me, give me hints, advise me? Will any of my pals who read these articles Slack-ping me with their thoughts? If we were on a team, I’d certainly be talking with the team about this. Ideally, I’d have been paired or mobbing on this.

Bottom line, in a couple of hours of reading, running, and writing (which takes most of the time but does help focus my thoughts) I have a fairly decent sense of how one deals, on the client side, with multiple connections. I expect the server will be similar, but that it will have a listen and accept kind of thing going on, as well as regular message service.

We’ll find out, probably next time! See you then!