OK, Now What?
I was thinking this morning, while trying not to get up yet, about the programmer, the programmer’s code, and their care. As to my recent defect, I really don’t see what to do better. Yet.
By my lights, an important understanding for any artist or craftsperson is to fully understand that we are not our product. We need to be able to look at our product with the same analytical clarity that we would look at the work of another. We need to be able to assess our own work with our senses turned up, our kindness turned up, and our sensitivity turned down.
We need to be able to look at the code, see its good bits, and, importantly, see its flaws. When we see them, we need to take them in with our eyes open. We might need to explain the code, to others or ourselves. We might need to explain how it got that way. We will benefit from reflecting on how we made it that way.
But we will not benefit, in my view, by blaming ourselves, or by feeling defensive about ourselves. If we do that, it will be harder to take in what we need to take in: how to do better next time.
This notion also applies to reviewing the code of others, whether pairing or mobbing or zooming. It’s scary for most people to have their work reviewed by others. This might stem from their younger years, where teachers may have critiqued them harshly. When we review the work of others, in their presence, one of our most important goals is surely to enable them to improve.
We can help the person whose product is being reviewed, whether they are another or ourselves, in a few ways. We can help them see:
- issues in the code;
- tests that could have been written;
- alternative code options;
- alternative designs;
- refactoring opportunities;
- and so much more …
Most importantly, I think, we can help them see that they are OK, that they can improve, that they want to improve, and that we care enough about them to treat them well.
I do not mean this in some sappy, squishy, touchy-feely way. OK, maybe I do, but be that as it may, I think we should do this because it will make our life better, the other’s life better, and our products better. It’ll be good for everyone.
Our Recent Defect
I didn’t write the above so that we could go gently on me in reviewing the defect that tied me up over the past three days. My defects here don’t hurt anyone, so I don’t have to feel regret at harming another, or even giving them a bad morning, and I have made so many mistakes in my code (and in my life) that, while I don’t want to make them, I am quite aware that I do. So let’s look at this latest one and see what we can learn from it.
The defect was a crash, a raised exception, due to sending a message to a None that the program thought was a Room. That None was right there in one’s face:
class DungeonLayout:
def _straight_path_room(self):
cells = self._straight_path_cells()
if cells:
return Room(cells, 'path')
else:
return None
I almost remember writing that and realizing that it was going to cause trouble if it ever returned None. I may have even thought that if and when it happened, some test would have caused it, and we’d deal with it in due time.
As it turns out, a test did not find it.
In the light of what I’ve learned now, we might have thought:
OK, if we try to create a path and cannot, we don’t want to add a room with no cells, because we know that is bad. Reading that now, I think, OK, some code is calling this method asking for a room that may or may not be possible. It will be up to them to add the room: there’s really nothing else to do.
An alternative and likely preferable option would be
class DungeonLayout:
def _add_straight_path_room(self):
cells = self._straight_path_cells()
if cells:
self.add_room(Room(cells, 'path'))
Now we just don’t add the room if we can’t create it.
That would have been better. It would have stopped the crash. It did not fix the real problem.
It’s worth looking at what I actually did, which was inferior to the move above. Here’s a quote from a couple of days ago:
Quote Begins
My eyes happen to fall on this method:
class DungeonLayout:
def _straight_path_room(self):
cells = self._straight_path_cells()
if cells:
return Room(cells, 'path')
else:
return None
Callers of that?
class DungeonLayout:
def ensure_connected(self):
while not self.is_fully_connected:
room = self._straight_path_room()
self.add_room(room)
And that is called from main. And there’s yer problem right there.
We can’t make a room with no cells: that would just fail on the next clause, and there is surely lots of code that expects to find at least one cell in a room.
Here is a proposed fix. Change this:
def add_room(self, room):
self.rooms.append(room)
To this:
def add_room(self, room):
if room:
self.rooms.append(room)
That should do the job. But we really should write a test that shows the defect and that than shows it as fixed.
I have an errand to run, and after that, we’ll do just that thing.
Quote Ends
My fix was to patch add_room to ignore the None. The fix considered here never creates the None, which is better at that point.
We could argue that having add_room swallow any None that happen to come by is somehow safer. We could argue that a language with more strict typing would likely have flagged our None-creating method. All arguably true, and all interesting.
Had I just made that patch and walked away, I’d have saved five or six hours of confusion and tedious work trying to figure out what was causing that None, which “should” never have happened.
The real cause of the issue was in this code:
def _straight_path_cells(self):
cells = []
suites = self.define_suites()
for s1, s2 in zip(suites, suites[1:]):
cells.extend(s1.find_path_cells(s2))
return list(set(cells))
This code, given suites S1, S2, S3, S4, returned the cells making up a path from S1 to S2, a path from S2 to S3, and a path from S3 to S4. Our caller, above, turns those cells into a Room, and stores it in the layout.
The idea behind this notion is that we are trying to get the dungeon fully connected, so we break it into Suites (sets of connected rooms) and then connect the Suites as above, until finally we have connected them all.
The defect is that we create a single room made up of all the paths created in that pass. And that room is, obviously now that we think of it, adjacent to all of S1, S2, S3, and S4. We built it that way.
But that path is not continuous! It has, as we saw in this picture, segments that are far separate from each other in the map:

Now, I knew that was happening, and I knew it was odd, because there is a card right here beside me where I made a not asking whether paths should be one room or many. And I crossed the item off, presumably because I decided it was OK as is. It was not OK as is.
In the picture above, the map is not connected. It will not appear to be connected to ensure_connected. But when we create the separate Suites, they will both include the same room, the path room. When our code decides how to draw the path between those suites, it will look for cells in all the rooms of the one and all the rooms of the other, that are closest together. Like this:
class Suite:
def find_closest_pair(self, suite):
my_best = None
his_best = None
distance = 1_000_000
for mine in self.cells:
for his in suite.cells:
dist = mine.manhattan_distance(his)
if dist < distance:
my_best = mine
his_best = his
distance = dist
return [my_best, his_best]
We consider all the cells in all the rooms of the sites. Therefore we consider all the cells of the path vs all the cells of the path. Those cells are, how shall I put it, very close to each other. So, inevitably, we decide to draw a path between some cell of the existing path and the same cell, as found in the other suite.
Path between a cell and itself? Empty. Code returns None. Crash.
But the real defect isn’t fixed by ignoring the None, nor by simply not creating the room from empty cells. The real defect is that we have a room that is discontinuous and appears in multiple locations.
Therefore the real fix for this particular defect was to create a separate room for each path between suites:
def _make_path_rooms(self):
suites = self.define_suites()
for s1, s2 in zip(suites, suites[1:]):
cells = s1.find_path_cells(s2)
if cells:
room = Room(cells, 'path')
self.add_room(room)
# cells can be empty if no path is possible
# because we only accept source room,
# target room, or available cells.
What Have We Learned?
I think that the code that never creates a None is far better than code that might, then covered by someone else ignoring the None. That seems pretty clear to me, and while I don’t remember why I did that, it would have been better to have said “No don’t return some weird case, change how we do this”. I didn’t think of that, and I’ll try to be more sensitive to it in the future.
We could perhaps benefit from type hinting. Let’s digress and find another return None. Here’s one now:
class CellSpace:
def random_available_cell(self):
available = [cell for cell in self.cells if cell.is_available]
if available:
return random.choice(available)
else:
return None
It is possible that no cell around this one is available: it could be in the middle of a room. So let’s see who’s calling this.
class Cell:
@classmethod
def random_available_cell(cls):
return cls.space.random_available_cell()
And this is commonly used without protection, for example:
main.py
def make_a_cave_room(dungeon: DungeonLayout, maker: RoomMaker):
origin = Cell.random_available_cell()
size = random.randint(50, 70)
room = maker.cave(number_of_cells=size, origin=origin)
dungeon.add_room(room)
What will the RoomMaker do? It takes a bit of digging, more than you want to read, but the bottom line is that we’ll send a message to the None, asking for adjacent_cells and we will crash.
Could this actually happen? I would bet that I could make it happen in two tries, if not one. Just loop building cave rooms until the entire space is full. The random_available_cell then returns None and we crash.
Will it happen? Probably not but it could. I suspect that the same is true in most of the cases.
Is there a principle here?
Yes, I think there is an important notion here. We are building a sort of utility portion of a dungeon game which, it seems likely, can build an infinity of random dungeons, in which our intrepid adventurer Dot can have an infinity of adventures. Crashing will get in the way of that infinity, and might cause Dot to loose all her experience points, and our customer to curse us to oblivion.
Returning a flag, as we do here, when something isn’t possible, is perhaps “better” than raising an exception, although we could debate cases. Python’s try-except is fairly compact and not notably harder than if foo is None:. But either way, we have to deal with the issue everywhere that we call random_avaiable_cell, which we already do more than a half a dozen times, and probably will more as our room types grow.
What should we do? At this writing, I do not know, and it probably depends om the individual None return. Options include:
- Use the
addrather thanreturnidea, and just don’t add if you can’t; - Use type hinting to get better advice from the IDE and compiler;
- Perhaps there is some way to use Python’s
withstatement to advantage; - Make more broad use of
try-except; - Ask permission before doing something;
- Make use of Missing Object pattern, if we can remember it;
My preference is code that has a “happy path” and no other path. Code that always works, is always harmless. In that light, as we’ll probably discuss in upcoming days, I prefer fewer conditionals, and zero try-excepts. We’ll see what we can do.
Suppose, for example, that make_a_cave_room were make_a_cave_room_if_possible, at least in concept, and that it had a convenience return of the room size, possibly zero. Most code could safely ignore the return and if anyone was actually trying to fill the dungeon, they could repeat until the return was zero.
I think we are at a convenient stopping point, though it doesn’t look much like where I thought we’d go. Standard day around here.
Summary
I’ve made a card saying, in big letters “No None”, and a few quick notes. We’ll take occasions to work on improving the None situation.
I still have no good sense of how this issue could really have been avoided. I think that with the original patch in place, we might have created dungeons that the Layout thought were connected, but that were not, but in fact it was unlikely … occurring perhaps once in twenty or thirty complex dungeons and — I believe — even more rarely in simpler dungeons. We might never have noticed.
I believe that I might be able to craft a dungeon that would display the bug, but now that you can’t make a discontinuous path room, I’d have to deliberately build a broken room to even show the problem. Don’t build discontinuous rooms, there’s an idea.
Should we verify rooms for being continuous? We could, but it would be costly and to what benefit? We’re not going to do it, are we?
I’m not going to forget this issue, to try to learn from it, and to try to see whether something similar might be lurking. But in all honesty, it was a mistake, I made it, I didn’t see an issue at the time, and it took me many hours to see it at all. I’m not sanguine about it. I’m concerned, because I like to work such that my mistakes show up quickly, right after they’re made, so they can be fixed, the code can be improved, and so that I can do better.
It’s fixed, the code is better, and so far my only solid idea, “No None”, would not have solved the problem that arose.
I don’t know what I could really have done better. Yet.
See you next time!