The Small Stuff

Python Asteroids+Invaders on GitHub

We’ll make sure that our tests check the production generator function, but mainly I want to talk about small steps and TDD.

Yesterday we put in the very nice—according to me—generator function that Bill Wake suggested. We have a couple of tests that test the idea, but no test that actually checks the one installed in InvaderFleet. Let’s sort that out first thing.

The function in InvaderFleet is this:

def generate_y():
    def convert(y_8080):
        return 0x400 - 4 * y_8080

    yield convert(u.INVADER_FIRST_START)
    index = 0
    while True:
        yield convert(u.INVADER_STARTS[index])
        index = (index + 1) % len(u.INVADER_STARTS)

We have a corresponding test that tests a local version of this function:

    def test_y_generator(self):
        def gen_y():
            def convert(y_8080):
                return 0x400 - 4*y_8080
            yield convert(u.INVADER_FIRST_START)
            index = 0
            while True:
                yield convert(u.INVADER_STARTS[index])
                index = (index + 1) % len(u.INVADER_STARTS)

        y_generator = gen_y()
        assert next(y_generator) == 1024 - 4*u.INVADER_FIRST_START
        for i in range(8):
            assert next(y_generator) == 1024 - 4*u.INVADER_STARTS[i]
        assert next(y_generator) == 1024 - 4*u.INVADER_STARTS[0]

I see two main options: change this one to refer to the prod version, or add a new test using the prod version. I think the latter choice is better, since the generator function is a bit deep in the bag of tricks, and it might be nice to have it here in front of us in the event that we come back to study it later. So …

    def test_prod_y_generator(self):
        from invaders.invaderfleet import generate_y
        y_generator = generate_y()
        assert next(y_generator) == 1024 - 4*u.INVADER_FIRST_START
        for i in range(8):
            assert next(y_generator) == 1024 - 4*u.INVADER_STARTS[i]
        assert next(y_generator) == 1024 - 4*u.INVADER_STARTS[0]

I don’t generally put import statements down in the code, but this one seems helpful for the day we start to wonder where that function we’re testing might be.

I have one more bit of curiosity: just what is the type of y_generator? Turns out:

<generator object generate_y at 0x1054c9620>

So, through some Python magic, defining what looks like an ordinary function, except that it contains a yield statement, causes calling that function to return, not a result, but a generator object. That’s some magic right there. Just one of the things one has to learn, I think. As Riddick put it, “I might have gone another way”. It’s compact, we’ll give it that. Just a bit less obvious than one might like.
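To make that magic a bit more visible, here is a minimal sketch. The function `count_up` and the `events` list are hypothetical, invented for illustration and not part of the Invaders code. It shows that calling a function containing `yield` runs none of its body: we just get a generator object back, and the body only starts when we ask for the first value with `next`.

```python
import inspect
import types

events = []  # hypothetical log, used only to show when the body runs


def count_up():
    # Hypothetical example function, not from the Invaders code.
    events.append("body started")
    yield 1
    yield 2


gen = count_up()

# Calling the function ran none of its body; we only got a generator object.
assert events == []
assert inspect.isgeneratorfunction(count_up)
assert isinstance(gen, types.GeneratorType)

first = next(gen)   # the body runs up to the first yield
second = next(gen)  # execution resumes right after the first yield
```

After the two `next` calls, `events` holds a single "body started" entry, and `first` and `second` are 1 and 2. The same thing is happening with `generate_y`: the call hands back the generator, and each `next` runs the body forward to the following `yield`.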

Commit: add test for production generator function.

The Small Stuff

Prompted by Kent Beck’s Canon TDD article a few days ago, I want to make some observations about Test-Driven Development in the light of step size. If you do not already subscribe to Beck’s substack, I would advise you to do so: it’s quite good. (Of course I try never to give advice, but in this case I just couldn’t help myself: the value is high enough that I don’t want you to miss it.)

I recall that quite some time ago, Kent engaged another famous internet denizen, whom I will not name, in some kind of joint exercise where the denizen tried TDD. The denizen did not like it. As I was reading the articles it seemed clear to me why said denizen would not like TDD: he wrote code that was not amenable to TDD.

Denizen’s code tended toward very long methods, with more than one internal step, lots of temporary variables, and hidden changes taking place in the database, all in these long methods. There were no affordances for TDD to grab onto, and the only tests one could even imagine writing would be story-based tests, and they would mostly have had to read and inspect database changes to test anything at all.

In another of his excellent articles, Kent says “TDD isn’t design — it’s design feedback”. If I had to use only six words, I might well use those six. With a few more words—well, probably lots of words, you know me—I might say:

TDD, for me, is not a design technique. It is a particular practice that, used well, provides feedback that pushes my design in a direction that I have come to prefer. It works best, for me, with a design tending toward small objects and small methods.

I don’t know. Consider that a first draft. I never tried to express that notion before.

Anyway, MMMSS …

I want to mention GeePaw Hill’s notion of Many More Much Smaller Steps.

In an important sense, this idea is more fundamental than TDD. It applies whether you’re doing TDD or not: programming goes better when we proceed from working, through not quite working, back to working, in very small steps.

And TDD works better and better as our steps get smaller and smaller. Because the denizen mentioned above used very large steps, TDD did not work well for him. As the steps get larger, TDD has less and less to grab onto.¹

So the question is …

If small steps are better, as Hill claims and as I am here to claim, why is that the case? I think it comes down to defect injection rates.

Seen from far enough away, a programmer’s rate of defect insertion probably looks like any other random arrival rate, such as hits on the database or phone calls. If we cared, we would probably start by modeling defect injection as a Poisson distribution, which, I wish to assure you, I remember almost nothing about. Simply put, the longer the interval we measure, the more defects we’ll find randomly sprinkled in there.
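For the curious, here is a rough sketch of what that Poisson model says. The defect rate is a made-up number chosen only for illustration, and the function name is hypothetical. Under a Poisson model with a constant rate, the chance of at least one defect in an interval is 1 - exp(-rate * time), which grows as the interval gets longer.

```python
import math


def p_at_least_one_defect(rate_per_minute, minutes):
    """Chance of one or more arrivals in a Poisson process over `minutes`."""
    return 1.0 - math.exp(-rate_per_minute * minutes)


# Assumption for illustration only: one defect per 100 minutes, on average.
ASSUMED_RATE = 0.01

small_step = p_at_least_one_defect(ASSUMED_RATE, 5)    # about 0.05
large_step = p_at_least_one_defect(ASSUMED_RATE, 100)  # about 0.63
```

With these invented numbers, a five-minute step hides a defect about one time in twenty, while a hundred-minute step hides at least one more often than not.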

Let’s first consider a single defect in a fairly large step. It’s somewhere in here:

[----------?---------]

When we finally finish that large step and get around to testing it, it doesn’t work. And now we have to find the bug. It’s somewhere in those many statements.

What if we implemented that same capability in lots of small steps, with tests for each one?

[-][-][-][-][?][-][-][-][-][-]

In this case we know exactly where the bug is, down to a very few statements. We find out about it sooner and when we do find out it’s easier to find and fix the defect.

Small steps work better in the presence of defects, so long as they don’t slow us down. There are 20 dashes in my large step above. If starting and stopping a step were to take as long as coding it, then the time comparison might be intolerable:

[----------?---------]
[-][-][-][-][-][-][-][-][-][-][?][-][-][-][-][-][-][-][-][-]

So we know that for small steps to be most valuable, the overhead of each step needs to be small. The overhead comes in a few places. One is the extra typing to define and call those little methods. I don’t think that’s large, but it’s certainly non-zero. Another bit of overhead is in running the tests. My tests run automatically when I stop typing, so the testing overhead is close to zero.
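The pictures above can be turned into a toy calculation. This is only a sketch: the time units are arbitrary, the function `total_time` is hypothetical, and it counts only coding time plus a fixed overhead per step, leaving debugging time out entirely.

```python
import math


def total_time(work_units, step_size, overhead_per_step):
    """Coding time plus a fixed overhead paid once per step."""
    steps = math.ceil(work_units / step_size)
    return work_units + steps * overhead_per_step


# Twenty units of work, matching the diagrams above.
one_big_step = total_time(20, 20, 1)        # 20 coding + 1 overhead
costly_small_steps = total_time(20, 1, 1)   # overhead equal to the coding
cheap_small_steps = total_time(20, 1, 0.05) # overhead close to zero
```

When the overhead per step equals the coding time, the small steps take 40 units against 21 for the single large one. When the overhead is nearly zero, as with tests that run automatically, the small steps cost about the same 21 units, and we still get the benefit of knowing where the defect is.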

What about the programming overhead? There’s the typing of a method name, but typing surely isn’t the bottleneck. If it is, you need to work on harder problems. You’re wasted if you can program as fast as you can type.

The time in programming, I would hope, is mostly time spent thinking. (Unfortunately, it is sometimes time spent waiting for the computer, and even more unfortunately, time spent scratching one’s head and/or cursing.) And in fact, that’s where the small steps move ahead: we spend less time debugging because our defects are generally quite quickly detected, and the flaw spotted among only a few statements.

Now if I were arguing against small methods, I’d want to say, “Yeah, but even though my defects are in larger chunks of code, I still usually find them quickly”. And that is probably true … except for the ones that take hours or days. I think the mode of debug time is probably small, but the mean is quite a bit higher because of the occasional ones that take a lot of debugging.
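A small sketch of the mode-versus-mean point, using invented debugging times (in minutes) that are not measurements of anything: most bugs are found quickly, and one takes a full day.

```python
import statistics

# Made-up debugging times in minutes: mostly quick, with one eight-hour hunt.
debug_minutes = [5, 5, 5, 10, 5, 15, 5, 480]

typical = statistics.mode(debug_minutes)  # the most common value: 5
average = statistics.mean(debug_minutes)  # 530 / 8 = 66.25
```

The usual experience is five minutes, but the average cost per bug is over an hour, driven almost entirely by the one long hunt.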

Enough theorizing. My experience is that the smaller and more granular my tests are, the smaller and more granular my code must become. And my experience is that although I have a long heritage of writing large functions, the smaller I make my objects and methods, the faster I go.

I believe that it is the smallness that makes things good. Certainly the tests add a lot, in confidence, rapid feedback, and pressure to keep things small … but I think it’s the “small” that really matters.

It has been TDD that got me much of the way there and it has been Hill’s Many More Much Smaller Steps mantra that has really driven it home for me.

Does it really work for you, Ron?

Often in these articles, I’ll spend days working on some tiny thing, often fiddling around with code that already works. Surely that’s the opposite of fast? Yes. What I’m doing there is deliberate practice at making things better, mostly by breaking them into smaller steps.

I have the luxury that I can play with code for days if I care to: there is no one cracking the whip over me to get something done. So I have the joyous opportunity to keep my skills sharp, and to make them more sharp, every day.

You may be wondering, not just “How did I get here?” and “Where is that large automobile?”, but “How am I supposed to find time to practice smaller steps?”

The answer may be “we need to slow down”. One of auto racing’s greats, Sir Jackie Stewart, was famous for the smoothness of his driving. Even while he was going faster than anyone, his driving was notably smooth. We learn to be smooth, paradoxically, by going slower for a while, so that we can learn to sense the forces that influence our code.

Maybe we and our team can make a bit of time to study code, but we may do better just to relax a bit, take our time, do a bit more refactoring before we commit, and a bit more refactoring before we start a new thing. At first, we would need to be careful not to slow down too much, but as we work in a more relaxed fashion, focusing on smaller steps, I think we’ll find that our pace of delivering working code will remain the same or even speed up.

I know that if I were ever again condemned to program for money, that’s how I’d work. I’d try never to rush. When we rush we go off the road, and then there’s all that pushing and shoving and debugging to get us going again.

I wish you the best. And I try to offer you things that will lead you to discover your best.

See you next time!

¹ It occurs to me that much of the work on test doubles, mocks, stubs, fakes, and so on, might in large part be an attempt to “make TDD work” by giving it a view inside of code that is otherwise too large to test. I don’t want to debate that seriously just now, but I think I could make a good showing if not win the debate.²

² Recall, though, that my parents left me out for the wolves, and I was found and raised by a roving band of Jesuits, so my argumentation skills are sharp, and not necessarily fact-based. But I digress.