My design fantasies and experiments need to stop. Let’s do the periphery feature. I have an idea. Results are satisfactory.

I do believe that the new, partially implemented scheme is “better”, but it has taken perhaps as much as six hours of coding and article-writing to get things working again. Still work to be done with the new “space” idea, I think. We have no real decision in place for how we’ll represent “where” a cell is, available, in a room, or wherever else it might be.

We have performance issues and ideas for improving things. Let’s do that and see if the objects are more helpful than they were. If they’re not, we’ll improve them. First up, periphery.

Periphery

A Room builds itself by repeatedly calling find_adjacent_cell until it can find none or the room has the requested number of cells. And find_adjacent_cell just looks at random cells in the room, checking each cell’s neighbors in random order, until it finds a cell that is available.

As the room grows, this process becomes less and less efficient, because more and more of the room’s cells are interior and have no available neighbors. Today we plan to fix that.

Let’s have a test that creates a room, counting the number of cells examined and asserting it. When we have completed our improvement, that test should fail with a lower number, hopefully a lot lower. We have no direct tests for Room (but plenty of tests that use it), so we’ll create a new test file:

class TestRoom:
    def test_hookup(self):
        assert True

My standard hookup test fails when I start, with assert False and succeeds now. Auto-run is still working, and I love it. Thanks @blackoverflow, for that reminder.

Now to test the number of checks done in building a room.

    def test_number_of_probes(self):
        random.seed(456)
        space = CellBank(64, 56)
        start = space.at(32, 32)
        room = Room(space, 100, start)
        assert room.probe_count == 42

Room has no probe_count. Implement it:

class Room:
    def __init__(self, space: CellBank, size, origin):
        self.space = space
        self.origin = None
        self.cells:list[Cell] = []
        r = random.randint(0, 255)
        g = random.randint(0, 255)
        b = random.randint(0, 255)
        self.color = pygame.Color(r, g, b)
        self.probe_count = 0
        self._build(space, size, origin)

    def _build(self, bank, length, start_cell: Cell):
        self.origin = start_cell
        new_cell = start_cell
        for _ in range(length):
            bank.take(new_cell)
            self.cells.append(new_cell)
            new_cell = self.find_adjacent_cell(bank)
            if new_cell is None:
                return

    def find_adjacent_cell(self, bank: CellBank):
        for cell in sample(self.cells, len(self.cells)):
            neighbors = cell.neighbors
            for neighbor in sample(neighbors, len(neighbors)):
                self.probe_count += 1
                if neighbor in bank:
                    return neighbor
            # cell is not on periphery?
        return None

Test fails. I am hoping for a large number not matching 42.

Expected :42
Actual   :472

Perfect. Change the test to expect 472. To no one’s surprise, it passes.

Cunning Plan

Here’s my cunning plan, formulated in my head, so subject to change based on the code, but it goes like this:

We will maintain two collections of cells. The cells collection will continue to be the official list of cells in the room. The other collection will be periphery and we’ll maintain it like this:

  1. We prime it with the starting cell, provided on creation;
  2. When we add a new cell to the room, we’ll add it to periphery as well;
  3. When a cell is checked in find_adjacent_cell and found to have no suitable neighbors, we’ll remove it from the periphery.

This should mean that every cell that goes in to the interior will be checked at most one more time before it is cast into the outer darkness. I considered checking explicitly, but that would involve checking all the cells all the time. This idea seems better.

Because we sample periphery, we can safely delete things as we loop. However, because we sample it, it will have to be a list, which won’t be as quick as if it were a set. After we get it working maybe we can think of something even better than a list.

Let’s do it:

class Room:
    def __init__(self, space: CellBank, size, origin):
        self.space = space
        self.origin = None
        self.cells:list[Cell] = []
        self.periphery: list[Cell] = []
        r = random.randint(0, 255)
        g = random.randint(0, 255)
        b = random.randint(0, 255)
        self.color = pygame.Color(r, g, b)
        self.probe_count = 0
        self._build(space, size, origin)

    def __iter__(self):
        return iter(self.cells)

    def _build(self, bank, length, start_cell: Cell):
        self.origin = start_cell
        new_cell = start_cell
        for _ in range(length):
            bank.take(new_cell)
            self.cells.append(new_cell)
            self.periphery.append(new_cell)
            new_cell = self.find_adjacent_cell(bank)
            if new_cell is None:
                return

    def find_adjacent_cell(self, bank: CellBank):
        for cell in sample(self.cells, len(self.periphery)):
            neighbors = cell.neighbors
            for neighbor in sample(neighbors, len(neighbors)):
                self.probe_count += 1
                if neighbor in bank:
                    return neighbor
            self.periphery.remove(cell)
        return None

That seems right to me, but PyCharm says three tests have failed. We’ll see about that. Let’s hope that one of them is my speed test.

Yes but this is odd:

>           self.periphery.remove(cell)
E           ValueError: list.remove(x): x not in list

Ah! I meant:

for cell in sample(self.periphery, len(self.periphery)):

Now only my performance test fails:

Expected :472
Actual   :363

Well, that’s better but I don’t love it. But maybe we should only have counted the cells probed, not the number of checks of its neighbors. Move the tally up:

    def find_adjacent_cell(self, bank: CellBank):
        for cell in sample(self.periphery, len(self.periphery)):
            self.probe_count += 1 # tally
            neighbors = cell.neighbors
            for neighbor in sample(neighbors, len(neighbors)):
                if neighbor in bank:
                    return neighbor
            # self.periphery.remove(cell)
        return None

Now the magic number is 169 cells examined to create a room of 100. Put the fix back.

Expected :169
Actual   :140

Well, that’s nice, a 23% improvement, but it’s not super. I think the savings is worth having, so let’s commit: Room building culls known interior cells from the search.

Starting the game without drawing paths does seem a bit faster. There is another change to be made, however. Cells now understand is_avaiable and that should be faster than using in. Let’s do that.

    def find_adjacent_cell(self, bank: CellBank):
        for cell in sample(self.periphery, len(self.periphery)):
            self.probe_count += 1
            neighbors = cell.neighbors
            for neighbor in sample(neighbors, len(neighbors)):
                if neighbor.is_available():
                    return neighbor
            self.periphery.remove(cell)
        return None

A fanatic would have measured that. I myself … well, OK< just for the fun of it, let’s do.

    def test_in_vs_is_available(self):
        random.seed(456)
        space = CellBank(64, 56)
        start = space.at(32, 32)
        start_time = time.perf_counter()
        for _ in range(100):
            room = Room(space, 100, start)
        end_time = time.perf_counter()
        assert end_time - start_time == 0
>       assert end_time - start_time == 0
E       assert (435597.446799416 - 435597.1678505) == 0

That’s using is_available. Change to in hoping for something much larger.

>       assert end_time - start_time == 0
E       assert (435740.974675208 - 435740.709438833) == 0

Well that’s not impressive, is it? We’ll leave the is_available form in, and we should really disable the in on CellBank entirely.

Ah. Here’s why it wasn’t impressive:

class Cell:
    def __contains__(self, cell: Cell):
        cell = self.at(cell.x, cell.y)
        return cell is not None and cell.is_available()

We intercept in and do the right thing anyway. You may wonder why we disassemble the cell provided and fetch the one at that cell’s location: folx used to be creating Cells willy-nilly and we treated them as if they were identical. Now I’m not sure that no one is ginning up cells on their own, so that’s in there to be sure we check the real one.

Anyway, doing in on Room is naff. Let’s turn that off and see what break.

Six tests fail.

class PathFinder:
    def can_use(self, neighbor) -> bool:
        if neighbor in self.bank:
            return True
        for room in self.room_list:
            if neighbor in room:
                return True
        return False

Fix this guy.

    def can_use(self, neighbor) -> bool:
        if neighbor.is_available():
            return True
        for room in self.room_list:
            if neighbor in room:
                return True
        return False

Green. Remove the __contains__ entirely and commit: CellBank no longer supports in.

Reflection

I need to slow down. I feel the nervousness that goes with moving too quickly, which leads to small mistakes. Most of them, I catch, but some, I do not.

However I noticed as I removed __contains__ that in CellBank, we have this:

    def take_xy(self, x, y):
        cell = self.cell_array[x][y]
        self.cells.discard(cell)
        cell.take()

    def take(self, cell: Cell):
        self.take_xy(cell.x, cell.y)

Our new scheme has it that CellBank, soon to be renamed CellSpace, always holds on to all the cells, and the cells maintain their particular status information such as what room they are in. So that discard should not be there: we should never remove a cell once it is created.

I remove it. All the tests pass. Commit: remove improper discard from take_xy in CellBank.

I think we’ve done enough good, let’s sum up:

Summary

We saved 23 percent in room creation at size 100, which is the current nominal room size. Nothing to sneeze at, but I’d like to have seen more. That said, main creates and displays a map with a dozen rooms quickly enough to seem fast. Even with path finding turned on, it’s quite quick now, so our changes have improved PathFinder as well. I’m not sure why it’s faster but it is very discernible when watching. Probably most of the improvement came when CellBank’s in was changed not to search the cell lists. I hadn’t run main for a while, and didn’t notice.

If we were deeply concerned about performance, we could set up some ongoing performance tests, but a) we’re not concerned and b) they tend to slow down testing and then we don’t run them and then they have no value. Anyway, no.

I had been planning to work on PathFinder for performance, but that seems no longer required. But there is work I’d like to do there. Here is a typical map:

typical map, rooms connected lots of right angles

There are some improvements I’d like to see in the paths:

  1. They should not be shown in the room, only in the open space;
  2. If rooms are adjacent, do not draw paths between them;
  3. Ensure that the room “islands” are all connected;
  4. Draw lines that aren’t all big right angles: make them more diagonal where possible;
  5. Indicate “doors” on the map, showing where the paths connect to the rooms or where the rooms interconnect;
  6. Go a bit beyond “minimally connected”: some extra paths but not too many.

Probably more things will come to mind.

I think that next time we’ll do a quick scan of the code, see what needs improvement, and then see what we can do about improving the paths. One thing almost certainly needed: cells know they are not available any more but they do not know why. We might want to fix that.

Anyway good progress today. A bit more performance, some of which we had attained without realizing, and some useful little code improvements, plus removal of a bit of dunder code.

I am satisfied. See you next time!