Client-Server š š»
OK, I promised to test reading, and Iām gonna do it. I sort of have a plan.
Weāre here to test this method:
def read(self):
self._read()
if self._jsonheader_len is None:
self.process_protoheader()
if self._jsonheader_len is not None:
if self.jsonheader is None:
self.process_jsonheader()
if self.jsonheader:
if self.response is None:
self.process_response()
Last time, I listed a lot of aspects that we might want to test:
- Do we correctly accumulate bytes into the receive buffer?
- When we consume those bytes, do we correctly remove them?
- When we read the two byte āprotocol headerā, do we get the value sent in?
- When we read the protocol header do we actually set the
_jsonheader_len
variable, so that the next call will occur? - Will the code correctly drop through on a single read that has all the necessary data?
- Does the code properly handle all the gaps that might occur due to incomplete reads?
- Does
process_response
correctly decide between json and other? - Does whatever it calls deal properly with an unexpected format?
Now I think Iād argue that at most two of the questions above are really about this specific method. Iāve highlighted them: the one about properly dropping through, and the one about handling gaps.
The basic scheme of this method is that it reads whatever bytes are available and then tries to get the two-byte header, the json header, and the json response. It starts with none of those and if a given one isnāt there, it tries to process it and then goes on to see if if can do another and another.
If we knew we had all the bytes we need, this code could be more like this:
def read(self):
self._read()
self.process_protoheader()
self.process_jsonheader()
self.process_response()
If this were my code, I think Iād actually try writing it that way, with guard clauses inside those methods causing them to skip out until there is enough data to do their job. It might be better that way. It might not. I could also imagine using some explaining method names, like this:
def read(self):
self._read()
if self.we_need_the_proto_header:
self.process_protoheader()
if self.we_have_proto_header_and_no_json_header:
self.process_jsonheader()
if self.we_have_json_header_and_no_response:
self.process_response()
I suspect that python thinkers might almost read the original code and think the thoughts above. And I think that in this case, that transformation doesnāt provide enough additional clarity to be worth it. But I would at least consider it, and might try it so as to look at it (as I did above) to help me decide. My Smalltalk mentors would almost certainly advise doing something like that.
Anyway ā¦
Weāre here now and we want to see how we could build up our confidence that this works. I have a rough plan.
I propose to run the actual program and capture a message that is actually sent from the server to the client, and use that as the basis for my tests. I donāt think Iāve ever run this app before, though I did run the earlier ones.
The app-server starts with host and port, and the app-client starts with host, port, action, and value, where action pretty much has to be āsearchā and value can be āmorpheusā or āringā. I choose āringā.
The client run looks like this:
/Users/ron/PycharmProjects/cs1/.venv/bin/python /Users/ron/PycharmProjects/cs1/app-client.py 127.0.0.1 65432 search ring
Starting connection to ('127.0.0.1', 65432)
Sending b'\x00g{"byteorder": "little", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 37}{"action": "search", "value": "ring"}' to ('127.0.0.1', 65432)
Received response {'result': 'In the caves beneath the Misty Mountains. š'} from ('127.0.0.1', 65432)
Got result: In the caves beneath the Misty Mountains. š
Closing connection to ('127.0.0.1', 65432)
Process finished with exit code 0
And the server said:
/Users/ron/PycharmProjects/cs1/.venv/bin/python /Users/ron/PycharmProjects/cs1/app-server.py 127.0.0.1 65432
Listening on ('127.0.0.1', 65432)
Accepted connection from ('127.0.0.1', 58194)
Received request {'action': 'search', 'value': 'ring'} from ('127.0.0.1', 58194)
Sending b'\x00g{"byteorder": "little", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 60}{"result": "In the caves beneath the Misty Mountains. \xf0\x9f\x92\x8d"}' to ('127.0.0.1', 58194)
Closing connection to ('127.0.0.1', 58194)
So I figure I should be able to use that Misty Mountains string as my test string. (I still do not understand how they get a header length of 2 out of that.) So I begin to write my test:
def test_query_response(self):
answer = (b'\x00g{"byteorder": "little", "content-type": "text/json", '
b'"content-encoding": "utf-8", "content-length": 60}'
b'{"result": "In the caves beneath the Misty Mountains. '
b'\xf0\x9f\x92\x8d"}')
My fond hope is that I can stuff this into the input buffer and run the code and find a useful answer in the output buffer.
I should say āmy fond hope wasā, because this is _read
def _read(self):
try:
# Should be ready to read
data = self.sock.recv(4096)
except BlockingIOError:
# Resource temporarily unavailable (errno EWOULDBLOCK)
pass
else:
if data:
self._recv_buffer += data
else:
raise RuntimeError("Peer closed.")
The _read
method accesses the socket sock
. So Iāll have to put my data in there, I guess. Letās make a little test double for the socket:
class FakeSocket:
def __init__(self, data):
self.data = data
def recv(self, size):
return self.data
My new fond hope is that we can fudge around with various copies of this guy and test what we want to test. The first test weāll provide the whole message and see if we can get an expected response from it.
def test_query_response(self):
answer = (b'\x00g{"byteorder": "little", "content-type": "text/json", '
b'"content-encoding": "utf-8", "content-length": 60}'
b'{"result": "In the caves beneath the Misty Mountains. '
b'\xf0\x9f\x92\x8d"}')
sock = FakeSocket(answer)
msg = Message(None, sock, None, None)
msg.read()
This fails and by golly Iāll know the reason why or heads will roll.
Received response {'result': 'In the caves beneath the Misty Mountains. š'} from None
Got result: In the caves beneath the Misty Mountains. š
Closing connection to None
Error: selector.unregister() exception for None: AttributeError("'NoneType' object has no attribute 'unregister'")
Oh this is delicious! It worked, and then tried to unregister. I need to pass something in as the selector
, with an empty unregister
method. Letās just add the method to our fake thing and rename it.
I do that, discover that I also need close
and we have this working without exploding. No assert yet.
class FakeSocketAndSelector:
def __init__(self, data):
self.data = data
def recv(self, size):
return self.data
def unregister(self, sock):
pass
def close(self):
pass
class TestLibClient:
def test_query_response(self):
answer = (b'\x00g{"byteorder": "little", "content-type": "text/json", '
b'"content-encoding": "utf-8", "content-length": 60}'
b'{"result": "In the caves beneath the Misty Mountains. '
b'\xf0\x9f\x92\x8d"}')
fake = FakeSocketAndSelector(answer)
msg = Message(fake, fake, None, None)
msg.read()
This is nice, and it even prints the response all nicely decoded. I can fetch that same value and check it:
def test_query_response(self):
answer = (b'\x00g{"byteorder": "little", "content-type": "text/json", '
b'"content-encoding": "utf-8", "content-length": 60}'
b'{"result": "In the caves beneath the Misty Mountains. '
b'\xf0\x9f\x92\x8d"}')
fake = FakeSocketAndSelector(answer)
msg = Message(fake, fake, None, None)
msg.read()
assert (msg.response.get('result') ==
'In the caves beneath the Misty Mountains. š')
Thatās green. Commit the new test. Now we can proceed to test what happens with partial information. Letās try a simple one, with just half the data and then the other half. I think that I can satisfy my concerns by splitting the data in various ways.
Hm. I was going to create a fake with partial data, call read, the smash in new data and call read again. But now that we have gone to the trouble of writing this fake thing, letās make it a bit smarter. If we were to give it a list of byte strings, and have it tick through them, that might be useful.
Change the fake to expect a list, and change the user code accordingly.
class FakeSocketAndSelector:
def __init__(self, data_list):
self.data_list = data_list
self.index = -1
def recv(self, size):
self.index += 1
return self.data_list[self.index]
def unregister(self, sock):
pass
def close(self):
pass
Now we can try a new test:
def test_split_query(self):
answers = [b'\x00g{"byteorder": "little", "content-type": "text/json", ',
b'"content-encoding": "utf-8", "content-length": 60}',
b'{"result": "In the caves beneath the Misty Mountains. ',
b'\xf0\x9f\x92\x8d"}']
fake = FakeSocketAndSelector([answers])
msg = Message(fake, fake, None, None)
msg.read()
assert (msg.response.get('result') ==
'In the caves beneath the Misty Mountains. š')
- Note
- I copy-pasted the
fake =
line and the brackets should not be there. Skip to the next Note to see how easy this process would have been had I not confused myself with that.
I was sure that would work. Unfortunately it fails for some reason, and Iāve burnt too much time trying to figure it out. Hereās the failure:
def _read(self):
try:
# Should be ready to read
data = self.sock.recv(4096)
except BlockingIOError:
# Resource temporarily unavailable (errno EWOULDBLOCK)
pass
else:
if data:
> self._recv_buffer += data
E TypeError: cannot concat list to bytes
How did it get a list?
OK, let me see here.
def test_split_query(self):
answers = [b'\x00g{"byteorder": "little", "content-type": "text/json", ',
b'"content-encoding": "utf-8", "content-length": 60}',
b'{"result": "In the caves beneath the Misty Mountains. ',
b'\xf0\x9f\x92\x8d"}']
fake = FakeSocketAndSelector([answers])
msg = Message(fake, fake, None, None)
data = fake.recv(4096)
assert data == b''
This has returned the entire array. And I see why in the test, finally. I left the [] in the setup. Duh. And also, I need to read four times:
def test_split_query(self):
answers = [b'\x00g{"byteorder": "little", "content-type": "text/json", ',
b'"content-encoding": "utf-8", "content-length": 60}',
b'{"result": "In the caves beneath the Misty Mountains. ',
b'\xf0\x9f\x92\x8d"}']
fake = FakeSocketAndSelector(answers)
msg = Message(fake, fake, None, None)
msg.read()
msg.read()
msg.read()
msg.read()
assert (msg.response.get('result') ==
'In the caves beneath the Misty Mountains. š')
- Note
- I think weād have quickly moved to this point, had I not made the mistake of pasting that line. If I had typed it I would almost certainly not have put in the extra brackets.
Passes! Woot! Commit: split read test passing.
OK, we now are more confident about splitting, but letās find some more difficult splits. And letās change how the test works:
def test_split_query(self):
answers = [b'\x00g{"byteorder": "little", "content-type": "text/json", ',
b'"content-encoding": "utf-8", "content-length": 60}',
b'{"result": "In the caves beneath the Misty Mountains. ',
b'\xf0\x9f\x92\x8d"}']
fake = FakeSocketAndSelector(answers)
msg = Message(fake, fake, None, None)
for _ in answers:
msg.read()
assert (msg.response.get('result') ==
'In the caves beneath the Misty Mountains. š')
Now weāll read as many times as there are elements in the array. I expect at least to copy and paste this code and perhaps even to extract it.
I want just one more test for now, splitting right near the beginning. I go wild with splits:
def test_vicious_split_query(self):
answers = [b'\x00',
b'g',
b'{"byteorder": "lit',
b'tle", "content-type": "text/json", ',
b'"content-encoding": "utf',
b'-8", "content-length": 60}',
b'{"res',
b'ult": "In the caves beneath the Misty Mountains. ',
b'\xf0\x9f',
b'\x92\x8d"}']
fake = FakeSocketAndSelector(answers)
msg = Message(fake, fake, None, None)
for _ in answers:
msg.read()
assert (msg.response.get('result') ==
'In the caves beneath the Misty Mountains. š')
That passes. And it satisfies my need to test, because Iāve split up just about everything I can. I suppose we could write a test to feed one byte at a time, which might actually have been better, but honestly Iām satisfied.
Summary
This didnāt quite go as I had imagined, mot even counting my confusion over having written [answers]
when I should have written answers
. I had though Iād have to inject short strings one at a time, but once I had my fake object set up, giving it an array of byte strings came to mind and that was easier, and I think more clear. And that last test seems to me to be about as good as it can get, especially breaking out the first two bytes separately.
An additional nice thing about having this test is that now we could refactor that read
code into the different forms I was contemplating, if we wanted to, and we could be quite sure it was still working. But the main thing here is that it really wasnāt that hard to test, and had I not copied that line with the brackets, it would have gone very smoothly.
Iāll put skip notes in the article so you can see how smoothly it would have gone, just for fun.
It wasnāt that bad even with the silly typo, and itās fascinating how, when it didnāt work, I never even considered that the fake was set up wrong. And there were plenty of things I could have done to find it. But I assumed (and you know what happens when we assume) that the problem was something important and subtle, so I was looking at the read code and how it was manipulating the socket and all the stuff over there when I should have been looking right here.
This, by the way, is why it is nearly impossible to predict how long it will take to fix a defect. Unless we already know what it is and what line of code to change, the time to fix a defect is often dominated strongly by the time it takes to find the two characters that need changing.
Maybe ChatGPT or Claude would have done better. Feel free to go read their articles if you can find them. Theyāre pretty busy destroying whatās left of the environment, so donāt be surprised if theyāre even less amusing than I am.
I have mixed feelings about the fake object. It really seemed like the thing to do, and I am always pleased when I can create a simple fake and use it, and always a bit displeased that I needed it. Generally, I think the need to use a fake hints at a problem in the design.
What might be the problem here, you ask? Well, we have a very complicated wrapper for our very simple actual answer about the Misty Mountains. We have a two-byte count giving the length of a multi-element dictionary giving among other things the length of another dictionary containing a key āresultā and the string that is the only thing we care about:
In the caves beneath the Misty Mountains. š
If the two-byte header had been the complete message length, we could have a simple āread everythingā and an exceedingly simple āoh everything is the exact answer we wantedā.
But itās a tutorial, and it really is valuable to show how we might do something really complicated. Although I might have gone another way if I were doing the tutorial. But itās a very good tutorial and it can take the credit for a lot of what I now know about using sockets. I continue to find RealPython to be a very fine source of information.
Bottom line: We have tested a very intricate part of this legacy program we inherited from RealPython, and once we got started testing, it wasnāt all that difficult.
I think next time weāll stop testing this thing and decide what to do with our robot world. See you then!