Why Does This Work?

A reader wonders why I seem never to dig a hole I can’t readily get out of. It is curious, isn’t it? If this article raises questions for you, please ask them! (More in subsequent article.)

Q&A continuation can be found here.

Reader Rob comments:

As I’ve been following your journey, I’ve thought more than once, this TDD seems almost random. It won’t be long before Ron codes himself into a hole he can’t get out of. Yet each time you recover and keep marching along. I’ve concluded TDD must be doing something more than seems possible at first glance.

Let’s talk about what happens in these articles, why it happens, how it happens, and what, if any, conclusions you might draw about your own work.

If this article raises questions for you, please ask them via email or Mastodon. I’ll enhance this article or write a followup.

Note: The article got long enough and time ran out, so there is more to come on this topic.

The Effect

Let’s be clear: what you see in my programming articles is what really happens. I’m not working all afternoon to plan tomorrow’s article so that it looks like I almost bury myself and then don’t. I start somewhere, think about how to do the thing, try to figure out a very small step toward the thing, then usually write a new test for the small step, write the small step, see that it works, rinse, repeat. After a few small steps, we generally have the thing we were looking for—or a useful part of it.

As that process goes on, the code gets worse and worse. Large methods turn up. Jagged methods with conditionals and loops rise up. Chunks of code arise that seem to refer only to other objects’ internal parts, not their own object’s. It gets messy.

Then, apparently at random, I seem to notice messiness and decide to do something about it. And, almost always, in a series of small steps that seem not to take very long, the code gets better.

Throughout all of this, I have a growing suite of tests that show that things are working. The tests grow with capability and when I’m working to make the code better, the tests confirm that the things tested are still working. Sometimes tests need to be changed. More often, they do not.

The overall effect, to me, looks as if I never consciously figure out a decent overall design for the program, but somehow the program evolves both toward more and more capability and toward a better overall design anyway.

I hope it looks like that to you. If not, you probably gave up reading these articles long ago. But it should look like that, because that is exactly what happens:

The program grows in small steps. Often, the code starts to get messy, or the design isn’t good enough for what the program needs to do. Then the code gets better and the design gets stronger.

Let’s talk about how all this happens. It’s not magic, it’s not a trick. It is a matter of a few simple practices, practices that I find to be quite enjoyable.

Most Important Practice

I think about my code’s design a lot.

I think about it for a few moments before I start coding. I generally write those thoughts down at the beginning of the article. I think about design while I’m working, and generally try to mention those thoughts in the article. I pause during the work for “reflection”, where I think about what has just happened, why it happened, and what we might want to do about it. At the end of every session, I reflect and sum up my thoughts on the whole session.

But wait, there’s more!

I think about the program and its design as I’m falling asleep. When I’m waking up, I think about what I might work on, and how I might do it.

All this is design thinking and I do a lot of it. I don’t generally write things down that you don’t see, and if I draw a picture (rarely) I’ll mention it here or more likely display it.

In essence, what we need to recognize is that while I do less design “up front” than is often recommended, I’m thinking about design all the time. I do try always to mention those thoughts in my articles. I’m not designing in secret. I’m not trying to trick you into thinking we don’t have to think about design. I’m trying to show you that attention to design flaws, and consideration of design changes, goes on all the time. Let me emphasize that:

Attention to design flaws, and consideration of design changes, goes on all the time.

Ongoing Practices

I hesitate to list these ideas as “practices”, because that sounds more formal and prescriptive than I mean to be. These are the things that I try to do “all the time”, by which I mean “most of the time”, by which I mean “when I don’t do these things, which is often, I believe I’m actually screwing up”.

Here are some of the ideas that I move between as I work.

Small Steps

I’m not sure where the sweet spot is, but I am pretty sure of this: if I try to do something and if after twenty minutes I have not accomplished anything that I am willing to commit to the HEAD of the repo, I’m in trouble.

On my best days, the commits occur ten or twenty minutes apart, for the whole session, counting from when I type the first character into the code until the last. (There is often a warmup as I write the article introduction, and when I think about what I’m going to do. There is often a cool-down as I write the summary, not infrequently followed by an additional update as something comes to mind during reflection.)

Yesterday was an outstanding day and I had 14 commits between 0915 and 1044, about 6 minutes per commit on the average. However, it’s fair to point out that those were simple repetitive changes. But it is also important to point out that I could commit after each tiny change.

My small steps are from green to green, all tests running. I can commit and push after each small step, and I try to remember to do so. I don’t break a lot of stuff and then fix it and then commit. I work, whenever I can, to evolve the code from one shape to another, to morph it, rather than ripping it apart and putting the broken bits back together after a lot of work.

You might ask if this is always possible. See the questions below.

Test First (TDD part 1)

I try to build new capability in the smallest pieces I can (see Small Steps). When I’ve chosen the small thing to do, I find that I do best when I first write a small test for that thing.

In recent articles, I’ve been implementing the Forth language in Python. Forth has a word ROT that rotates the top three items of its stack, moving the third one to the top, pushing the other two down. I wrote this test:

    def test_rot(self):
        forth = Forth()
        forth.stack.extend([0, 1, 2, 3, 4, 5, 6])
        forth.find_word('ROT').do(forth)
        assert forth.stack == [0, 1, 2, 3, 5, 6, 4]

That test failed a few times, each of which brought me closer to what I needed.

It first failed because there was no word ‘ROT’ to find. I wrote the code to define ROT, giving it an empty behavior, doing nothing. The test then failed because it didn’t change the stack to the new shape. So I wrote the code that would do that. I probably made a mistake in doing it, and if I did, the test would fail again. Finally, I got it right and the test ran. Commit.

I tell a lie. Well, I tell a story. I didn’t log what happened with ‘ROT’: I did it at a time when I wasn’t writing an article. So I don’t really know what happened. But that’s surely a lot like what happened, because it’s a lot like what always happens.
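Here is a minimal sketch of where that sequence of failures might end up. Only Forth, stack, find_word, and do come from the test above. The Word class, the lexicon dictionary, and the rot function are assumptions made up for illustration, not the code from the actual Forth articles.

```python
class Word:
    """Hypothetical word: a name plus a function that acts on a Forth instance."""

    def __init__(self, name, action):
        self.name = name
        self.action = action

    def do(self, forth):
        self.action(forth)


def rot(forth):
    # Move the third item from the top to the top: [.. a b c] -> [.. b c a]
    third = forth.stack.pop(-3)
    forth.stack.append(third)


class Forth:
    def __init__(self):
        self.stack = []
        # Assumed storage for defined words; a plain dict keyed by name.
        self.lexicon = {'ROT': Word('ROT', rot)}

    def find_word(self, name):
        return self.lexicon[name]


def test_rot():
    forth = Forth()
    forth.stack.extend([0, 1, 2, 3, 4, 5, 6])
    forth.find_word('ROT').do(forth)
    assert forth.stack == [0, 1, 2, 3, 5, 6, 4]
```

In the story above, the first failure would go away once 'ROT' is in the lexicon, even with a do-nothing action. The second would go away once rot actually moves the item.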

Sometimes I’ll only write part of a test, just enough to fail. For example, if I were going to create a new object, say, Heap, I might just write enough of a test to require the class:

    def test_heap_exists(self):
        heap = Heap()

That wouldn’t even compile, which counts as failing, and then I’d type the code to create the Heap class. I might not write the __init__ at that time, or I might.
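A small sketch of that fail-then-pass step, written as a plain function rather than a test method so it can run on its own. In Python the failure shows up as a NameError when the test runs, not as a compile error. The first_run and second_run variables are just bookkeeping for this illustration.

```python
def test_heap_exists():
    heap = Heap()
    assert heap is not None


# First run: Heap does not exist yet, so the test fails with NameError.
try:
    test_heap_exists()
    first_run = "passed"
except NameError:
    first_run = "failed"


# The least code that makes the test pass: an empty class, no __init__ yet.
class Heap:
    pass


# Second run: the class exists, so the test passes.
test_heap_exists()
second_run = "passed"
```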

Important Note: You can read articles and books on how to do Test-First and TDD and all that, and you can find those articles and books prescribing exactly what you should do, such as “write only enough code to pass the failing test”. I am not here to prescribe, but I am here to tell you that if I have the init in mind, I might write it, rather than slap my wrist and say “Bad Ron, no writing code that isn’t needed by the test, no biscuit for you!”

If I know what the full test will be, I might write it even though the first line is enough to fail.

Suppose you had a great idea for what we should do, and just as you opened your mouth I showed you the hand and said “Stop! Tell me in seven words, no more and no less!” You could probably do it, but it would surely stop the orderly flow of ideas in your mind and force you to start counting on your fingers. It’s the same with tests and code: I write as little or as much as is in my mind, not limiting myself to some arbitrary rule.

But I do try to adhere to this rule: write as little as is reasonable to write, maybe a little less. If I do start writing the __init__, as soon as I find myself stopping to think, I want to go back to the test and make it begin to fail again, so that I’m always working on making the test pass.

So the pattern is, small idea, write enough test to empty my mental buffer and to fail, write enough code to empty my mental buffer and make the test pass, make the test harder or—probably preferably—write a new harder test, and continue until the small thing is done.

You might ask how much of a class I’ll write with direct Test-First TDD style. See the questions below.

Reflect Frequently (TDD part 2)

When a small step is done, I take a break to think about what has happened. (Yes, sometimes that break seems infinitesimally small. Things seem to go better when I take a discernible break.)

I think about what went well, what didn’t go well. I try to assess whether my neck is getting stiff, a really good sign that I’m getting tense, which is a really good sign that something isn’t going well.

I look at the code just written and see whether I like it. As I mentioned, I like small methods, and I do not like jagged methods. When there’s something I don’t like I make a decision and quite likely refactor it:

Small Refactoring (TDD Part 3)

After a small step, if I’ve made a small mess, I will often refactor to clean it up. At this point, that will amount to renaming things, and probably extracting a method or variable, giving it a name for better clarity. I might notice two patches of code that are similar and work to make the similar parts identical, so as to do them only once, and then make the different parts discernibly different, giving them names that describe their individual essence.
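A hedged before-and-after sketch of that kind of tidying. The functions here are hypothetical stack operations invented for this example, not code from the articles. The two "before" functions differ only in the operator. The "after" version does the identical part once and gives the differing part a name.

```python
import operator


# Before: two patches of code that are similar but not identical.
def plus_before(stack):
    top = stack.pop()
    under = stack.pop()
    stack.append(under + top)


def minus_before(stack):
    top = stack.pop()
    under = stack.pop()
    stack.append(under - top)


# After: the shared pop-pop-push shape lives in one place...
def binary_op(stack, operation):
    top = stack.pop()
    under = stack.pop()
    stack.append(operation(under, top))


# ...and the parts that differ are visibly different, each with its own name.
def plus(stack):
    binary_op(stack, operator.add)


def minus(stack):
    binary_op(stack, operator.sub)
```

The existing tests for the two operations should stay green through each of these steps, which is what makes the change safe to commit.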

In short, I tidy up the code, after almost every small step is complete.

Again I tell a lie: I try to remember to tidy up the code after every small step. Sometimes I’m on a roll and don’t think of it. Sometimes I see an issue and decide to let it slide. Sometimes I remember to come back shortly thereafter and tidy it. Sometimes I don’t, and the code deteriorates. I’m a human, Jim, not a robot.

Larger (but still small) Cycle

It seems to me that since we usually type only one line at a time, all programming consists of nested sessions making up larger and larger changes.

What I want to do is to pause at the end of each of those changes, reflect on what has happened, and refactor as needed:

(
    (
        () reflect
        () reflect
        () reflect
    ) Reflect
    (
        () reflect
        () reflect
    ) Reflect
) REFLECT

After small changes, we mostly consider small improvements. After larger changes, which consist of a group of smaller ones, we mostly consider larger issues, at the scale of the outer bracket. And so on.

In these reflective moments, we think about how things have gone, how we are feeling, and how the code looks, whether it seems readable, expressive, attractive.

And when we notice things, we try to do something about them immediately.

You might question: What if we can’t do something now, or if we don’t see the problem until later? See the questions below.

One thing to be aware of right now: if we do not do a bit of reflection, or if we don’t respond to what we see, or if we just don’t spot something … the code is worse than it could be, and if this goes on long enough we are likely to get stuck in that hole that Rob was worried about when he asked the question that inspired this article.

Large Comes Down to Small

When we work in small steps, everything is usually quite nice. We write a little test, we make it work, we look around and improve the code a bit, rinse, repeat, all very nice.

But at each larger, outer cycle, things seem not to be so nice. If when we reflect about the last five things we did, we realize that there is a class missing, one that would have made those five things easier, it seems like a big deal to write that class now, and to put it in place.

Often, we’ll decide, well, the five things work now, and yes, it would have been nice to have had that Foo class but we’re OK without it, everything is working fine here, let’s back away slowly.

There are two important futures to consider at this point. One, future A, is the future in which those five things are the only five things we'll ever do that could benefit from a Foo class. In that future, backing away slowly is probably OK. Another, future B, is the one in which there are six or seven or ten things like our original five, and where two of the original five need just a bit of changing.

Those are not the right names for those futures. Let’s refactor them:

There are two important futures to consider at this point. One, future Almost Never Happens, is the future in which those five things are the only five things we’ll ever do that could benefit from a Foo class. In that future, backing away slowly is probably OK.

Another, future Nearly Always Happens, is the one in which there are six or seven or ten things like our original five, and where two of the original five need just a bit of changing.

If we find ourselves in a situation where we suddenly realize there’s work that “should have been” done, adding a helper Foo class or fixing up those five methods, cleaning up whatever mess, my thinking may surprise you.

When we discover a need for improvement only after the code has been done for a while, I think it’s OK to let the bad code be. Assume that it’s future Almost Never Happens. We might get lucky.

However, at the moment that it becomes clear that this is future Nearly Always Happens, and there is more work to do in this area, I believe that we will do best if we begin to fix the situation as part of doing the additional work that is called for. I emphasize begin to fix.

We’re going to work in small steps anyway. Make some of those small steps improve the code so that our changes will be easier to make, then make those easier changes¹. Putting it another way, try always to leave the campground better than we found it². If we revisit this code often, it will get better and better, moving toward the design we now see as desirable. If we do not revisit it often, our investment in improving it will be limited to about the right amount.

That is not to say that I am recommending putting off refactoring: I am not. I am, however, saying that once it has been put off, we might not want to try to schedule work just aimed at improving existing code: we will do better to focus code improvement on code that is under active change.

For more on this, see Refactoring – Not on the Backlog.

Note: In my articles I very often go back to “old” code and improve it. I do that, not because I would recommend that a real team with real deadlines and objectives should stop doing scheduled work and spend a few days just improving code. Sure, if you have free time, you might do that, but once the code is in place and not being actively changed, I think it makes economic sense to leave it unless and until we come back to it for “business reasons”. And then we improve it.

In my articles, I want to show how, once the code has become a bit nasty, we can improve it in small tested steps. My refactoring sessions involve frequent commits, just as new work does. The point is that small steps are good, and small steps are always³ possible.

Process Summary

I’m trying to write an article here, not a book, so let’s try to draw toward a closing.

The overall pattern of work, when I’m at my best, is roughly this:

In the smallest steps I can imagine;
Write a test for an even smaller part of the thing;
Make the test run;
Make all tests run;
Commit and push;
Reflect;
Improve code;
Commit and push.

Outside that loop? It’s really just the same. I try never to take a big leap. I try always to have all my tests running. I try always to have my code neat, tidy, clean, and clear. I try always to keep my methods small and avoid jagged ones.

Minute by minute, hour by hour, day by day. Small steps, green to green, commit, then clean, then commit.

Tools

I think it’s important to mention at least two tools that I find very valuable in this work: a decent code manager, and a powerful refactoring IDE.

I can and do follow the above approach even when programming with nothing but a text editor. But it’s harder, it’s more error-prone, and I seem inevitably to start taking bigger steps, making bigger mistakes, and leaving the code in worse order.

I am a professional programmer, albeit well and truly retired. But even today, my tool kit includes Git, GitHub, PyCharm, and IDEA. There are free, or nearly free, versions of these tools, though I happen to use the paid versions. Why do I use these?

Code Management

The code management tools let me lock in changes in tiny steps. Every time I make progress, I can commit it and lock it in. When things go wrong, I can roll back quickly. If things go terribly wrong, and sometimes they do, I can reach back and pull down a version from before the disaster and work forward from there.

I rarely cherry-pick, rarely go back to pick up history. I roll back a tiny bit almost every day at some point. I roll back an hour’s work or more once in a while. I can’t remember the last time I pulled an old version and started from there.

Because I am practiced in making small changes, I can almost always start from wherever I am and head over toward wherever I want to be. Once in a while I might want to back up and start from there, but the better I get at moving forward in small steps, the less I ever need to dip deep into history.

Without the code manager, I would have to resort to saving copies and trying to give them reasonable names and all that … and even then, I’d avoid some improvements because I become more tentative about changing code, just because I don’t have that safety net of all the old versions. This is odd, because I so seldom use those old versions, but they give me a kind of comfort that keeps me relaxed.

Refactoring IDE

Much of what makes my practice work is that I do things in small steps and keep things as clean as I can manage, all the time. The IDE helps me with that, in some very key ways:

It runs all my tests all the time. If I stop typing for a few seconds, PyCharm runs the tests. The results are summarized in a small unobtrusive popup that changes color when a test is red. Sometimes this alerts me to a problem, because I wasn’t expecting anything to break. More important is that when I’m working to make a test pass, I can keep working without thinking when to run the tests: they basically run all the time.

It makes refactoring easy. It’s not hard to do an Extract Variable by hand. With practice you can do a pretty smooth select, cut, type, type, paste and get the job done. But with a refactoring IDE it’s Option+Command+V, type the new name, Enter.

By hand, renaming a method is a pain. You have to do some kind of global search to find all the senders and change them and then change the name. Or in the other order: you do you. With a refactoring IDE, it’s Shift-F6, type new name, Enter … and it fixes them all!

The power tool, with practice, lets me do things very easily that are just hard enough that I’d put off doing them if I were programming without the tool. How do I know this? I know this because I program on my iPad, where I do not have these tools, and I program in another space where I do not have these tools, and I can see the difference that it makes in my work.

Debugger

Well, actually, not much. To my shame, I do know how to work the PyCharm debugger, at least at the level of setting breakpoints and stepping. I am happy to say, however, that I rarely do that. I do it only when something so weird is going on that I can’t find it with a test or with a few prints.

But but but you might say, if I’m going to print something, it would be faster to set a breakpoint there and just inspect. Yes, it would. It would also be far too tempting to set a breakpoint and step around looking to see what is going on. I do know how to do that. We used to do it in assembler decades ago.

The thing is, for me, stepping isn’t thinking. I do better, day in and out, to put a judicious print somewhere, run a failing test, and see what that judiciously chosen value was. I find that when I start using the debugger I spend more time than when I am more judicious in my probes. YMMV, but the debugger is not one of my primary tools. It’s not even secondary.
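As a hedged illustration of a judicious print, here is a deliberately buggy, made-up version of ROT operating on a bare list. The function names and the bug are invented for this sketch. One temporary print of one chosen value is often enough to show what went wrong.

```python
def buggy_rot(stack):
    """Hypothetical ROT with a planted bug: it pops the wrong item."""
    moved = stack.pop(-2)
    print("rot moved:", moved)  # temporary probe; remove once the bug is found
    stack.append(moved)


def check_rot():
    # Stands in for the failing test: ROT on [4, 5, 6] should give [5, 6, 4].
    stack = [4, 5, 6]
    buggy_rot(stack)
    return stack == [5, 6, 4]


result = check_rot()  # prints "rot moved: 5" and returns False
```

The probe says 5 was moved when it should have been 4, which points straight at the index in the pop.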

Time has elapsed

This has taken more time than I had available, so I’ll quickly list some things here, and write about them either in a revision to this article, or in the next.

Questions

Q&A: Q&A continuation can be found here.

¹ Thanks to Kent Beck for this phrasing.
² This is supposedly a rule of the Boy Scouts. I have been unable to verify that, though General Baden-Powell, one of the founders of the scouts, said something similar.
³ Almost always? I don’t know. I think it’s so close to always that that’s how I’d bet. You do you.