Just One Word

The Robot World Repo on GitHub
The Forth Repo on GitHub

Yesterday’s pleasant discovery suggests that reducing our two word classes to one should be easy. Then, if it works as I expect, we’ll talk about woulda coulda shoulda.

Our quest was to explore having just one kind of word class, instead of our two, PrimaryWord and SecondaryWord. After a long discussion of how classical Forth implementations do things, and some exploration of what needs to happen when word definitions are made up of other word definitions are made up of other word definitions …¹

Yesterday, I started with a simple experiment in a test file. That opened my eyes to the possibility that, in essence, what we needed might actually be in place already. A “primary” word could just be a “secondary” word with only one function in the list.
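To make that idea concrete, here is a minimal sketch. It is not the code from the repo: MiniForth and MiniWord are hypothetical stand-ins, assuming nothing more than a data stack and a word that runs a list of callables in order.

```python
class MiniForth:
    """Hypothetical stand-in for the real Forth object: just a data stack."""
    def __init__(self):
        self.stack = []


class MiniWord:
    """Hypothetical word: a name plus a list of callables, run in order."""
    def __init__(self, name, functions):
        self.name = name
        self.functions = functions

    def __call__(self, forth):
        for function in self.functions:
            function(forth)


# A "primary" is just a word whose list holds one Python function.
dup = MiniWord('DUP', [lambda f: f.stack.append(f.stack[-1])])
star = MiniWord('*', [lambda f: f.stack.append(f.stack.pop() * f.stack.pop())])

# A "secondary" is a word whose list holds other words.
square = MiniWord('SQUARE', [dup, star])

forth = MiniForth()
forth.stack.append(7)
square(forth)
print(forth.stack)  # [49]
```

Nothing in SQUARE knows or cares whether DUP is "primary" or "secondary". It just calls whatever is in its list.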

And then, a simple patch to the SecondaryWord’s do method resulted in an important “primary” word, DUP, being defined as a SecondaryWord, not a PrimaryWord. Voila! It was so easy that I could hardly believe it, and if it was that easy, how could it be that I didn’t see it before? If today’s work succeeds, as I expect it will, we’ll come back to that question. But first, did it work?

Today, I plan a radical experiment to find out. Most of our PrimaryWord instances, perhaps all, are created in this convenience method:

class Lexicon:
    def pw(self, name, code, immediate=False):
        self.append(PrimaryWord(name, code, immediate=immediate))

My thought was just to change this code to create a SecondaryWord and see if everything still works. But I had forgotten the immediate flag. Does SecondaryWord include that? A quick check tells me that it does. Then can’t we just do this?

    def pw(self, name, code, immediate=False):
        self.append(SecondaryWord(name, [code], immediate=immediate))

[Expletive Deleted]! 38 tests fail. Ah! They are all failing for the same reason:

        while self.pc < len(self.words):
            w =  self.next_word()
            try:
>               w(forth)
E               TypeError: 'SecondaryWord' object is not callable

However, we then get a new series of exceptions raised while handling that one.

I thought we had dealt with that:

    def do(self, forth):
        forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            try:
                w(forth)
            except Exception:
                w.do(forth)
        forth.end()

I really expected this to work just fine. It seems to me that one or the other of those calls has to work.

I probably would be wise² to roll back this change and work out what’s going on. But I have a glimmer of what it might be, and I want to try something.

The SecondaryWord expects to find a list of PrimaryWord or SecondaryWord instances. Our new scheme also puts plain functions into the list: the lambdas that we use to define primaries. I try this:

    def do(self, forth):
        forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            if callable(w):
                w(forth)
            else:
                w.do(forth)
        forth.end()

That doesn’t really help. I think that what is happening is that our active_word logic is getting in the way. That’s the begin and end stuff in do. I suspect that we’re stacking things that should not be stacked.

I also wonder, just a bit, whether there is a lurking issue with the active word in general. I make a note.

Making the note tells me something that I think is the issue. The active word, maintained by the begin and end, is there so that words that patch the code can find the place to patch. I think that we are pushing all words onto the active word stack and really should not do that in the case of a primary, or perhaps just an immediate word.

I try doing the begin and end only if the word is not immediate. No improvement. I might do well to roll back, but I’m not done thrashing yet.

One more quick hack: let’s have a new primary flag:

class SecondaryWord:
    def __init__(self, name, word_list, immediate=False, primary=False):
        self.name = name
        self.words = word_list
        self.immediate = immediate
        self.primary = primary
        self.pc = 0

    def do(self, forth):
        if not self.primary:
            forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            if callable(w):
                w(forth)
            else:
                w.do(forth)
        if not self.primary:
            forth.end()

class Lexicon:
    def pw(self, name, code, immediate=False):
        self.append(SecondaryWord(name, [code], immediate=immediate, primary=True))

And we are all green. That was it. Why was that it? Honestly, I cannot answer that question fully.

Let’s see who uses the active word capability. I’ll just list them and we’ll consider them as a batch.

    self.pw('DOES>', lambda f: f.active_word.copy_to_latest(f.lexicon))

    def define_skippers(self, forth):
        def _next_word(forth):
            return forth.active_word.next_word()
        ...
        self.pw('*#',     lambda f: f.stack.push(_next_word(f)))
        self.pw('*ELSE',  lambda f: f.active_word.skip(_next_word(f)))
        self.pw('DUMP',   lambda f: f.stack.dump(f.active_word.name, f.active_word.pc))

        def _star_loop(forth):
            beginning_of_do_loop = _next_word(forth)
            index = forth.return_stack.pop()
            limit = forth.return_stack.pop()
            index += 1
            if index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(index)
                forth.active_word.skip(beginning_of_do_loop)

        def _zero_branch(forth):
            branch_distance = _next_word(forth)
            if forth.stack.pop() == 0:
                forth.active_word.skip(branch_distance)

        self.pw('*LOOP',  _star_loop)
        self.pw('*IF',    _zero_branch)
        self.pw('*UNTIL', _zero_branch)

All these uses of active_word are dealing with references into the word that is currently running. Some of them are looking up a skip distance. One is fetching a literal value from the next word.

So the active_word property is supposed to return the (secondary) word that is currently running, so that words within that word can access its contained elements out of order. A primary word is atomic, never needs to do that, and cannot do that. But before this change, we were pushing every word onto the active_word stack, not just the secondaries.
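Here is a small sketch of why that matters, under stated assumptions. MiniForth, MiniWord and fetch_literal are hypothetical stand-ins for the real classes and for the '*#' word, and I am assuming that next_word simply returns the item at pc and steps past it. With those assumptions, a literal-fetching word only finds its value when it is not itself the active word:

```python
class MiniForth:
    """Hypothetical stand-in: a data stack plus a stack of running words."""
    def __init__(self):
        self.stack = []
        self.active_words = []

    @property
    def active_word(self):
        return self.active_words[-1]

    def begin(self, word):
        self.active_words.append(word)

    def end(self):
        self.active_words.pop()


class MiniWord:
    def __init__(self, name, words, secondary=True):
        self.name = name
        self.words = words
        self.secondary = secondary
        self.pc = 0

    def next_word(self):
        # Assumed behavior: return the item at pc, then step past it.
        word = self.words[self.pc]
        self.pc += 1
        return word

    def __call__(self, forth):
        if self.secondary:
            forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            w(forth)
        if self.secondary:
            forth.end()


def fetch_literal(forth):
    # Like '*#': push whatever follows in the word that is currently running.
    forth.stack.append(forth.active_word.next_word())


def run_answer(literal_is_secondary):
    literal = MiniWord('*#', [fetch_literal], secondary=literal_is_secondary)
    answer = MiniWord('ANSWER', [literal, 42])
    forth = MiniForth()
    answer(forth)
    return forth.stack


print(run_answer(literal_is_secondary=False))  # [42]

try:
    run_answer(literal_is_secondary=True)
except IndexError:
    print("the literal word looked for its value inside itself")
```

When the literal word pushes itself as the active word, fetch_literal asks the wrong word for its next item. In this sketch that shows up as an IndexError; the real tests may well have failed in other ways.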

I am convinced that this fix is righteous, but while it works, it isn’t right. Let’s improve it a bit before we commit.

Since our only references to the primary flag are saying not self.primary, we’d do better with a variable named secondary.

class SecondaryWord:
    def __init__(self, name, word_list, immediate=False, secondary=True):
        self.name = name
        self.words = word_list
        self.immediate = immediate
        self.secondary = secondary
        self.pc = 0

    def append(self, word):
        self.words.append(word)

    def do(self, forth):
        if self.secondary:
            forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            if callable(w):
                w(forth)
            else:
                w.do(forth)
        if self.secondary:
            forth.end()

Corresponding change to the pw method of course:

class Lexicon:
    def pw(self, name, code, immediate=False):
        self.append(SecondaryWord(name, [code], immediate=immediate, secondary=False))

That’s better.

Now, the if in do could be changed back to the try/except form, or we could make SecondaryWord callable. Let’s put that off until after our next commit.

I think that “next commit” can be right now. We’re green and the code is decent if not good. It’s good to have a save point. Commit: Everything defined with pw produces a SecondaryWord, not a PrimaryWord.

Are there references to PrimaryWord? There are not, other than an unused import. Remove the class. Commit: Removed PrimaryWord class.

Let’s rename SecondaryWord to Word. Green. Commit: rename SecondaryWord to Word.

Make Word callable:

class Word:
    def __call__(self, *args, **kwargs):
        self.do(*args, **kwargs)

    def do(self, forth):
        if self.secondary:
            forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            w(forth)
        if self.secondary:
            forth.end()

Green. Commit: Word is callable.

Let’s eliminate do. Find all the senders. We could do a global replace of .do(whatever) with (whatever).

Let’s try that.

It nearly worked. It broke my experimental tests from yesterday, which also used do for their own purposes, and it broke the __call__ method by making it recur upon itself. Quickly fixed. And now we can remove do:

    def __call__(self, forth):
        if self.secondary:
            forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            w(forth)
        if self.secondary:
            forth.end()

Green. Commit: remove Word.do.
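The recursion slip along the way is easy to reproduce. This is a hedged sketch with a hypothetical BrokenWord class, assuming the global replace turned self.do(*args, **kwargs) inside __call__ into self(*args, **kwargs):

```python
class BrokenWord:
    """Hypothetical class showing __call__ after a careless global replace."""
    def __call__(self, *args, **kwargs):
        # Was: self.do(*args, **kwargs). After replacing '.do(' with '(',
        # the method calls itself and never reaches any real work.
        self(*args, **kwargs)


try:
    BrokenWord()(None)
    outcome = "returned"
except RecursionError:
    outcome = "RecursionError"

print(outcome)  # RecursionError
```

The cure is the one shown above: move the body of do into __call__, so there is nothing left to forward to.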

We’ve been moving very swiftly. Let’s pause and reflect.

Reflection

We’re at an interesting place. A Word in our Forth is a callable object. It is, in essence, a function. The functions we have in our Word instances are all manipulating the fundamental Forth objects, which are just the stack and return stack, and the Words themselves.

That is very much like how classical Forth works. There are differences. The main one is that we needed a flag to indicate a secondary word. Classical Forth does not need that flag, because it branches into its words, and the primary words just don’t enter themselves onto the return stack at all. In our case, we have to call things, since branching into random code is right out in Python. Since our primary words are called, not branched into, we have to suppress putting them on our “return stack”, which is the stack we call active_words.

So at this point, we have a single Word type, as does classical Forth. The Word is a collection of functions, which are either other Words or the lambdas that we use for our primitive Words. We only ever put a lambda in the first slot of a Word, because only our pw operation ever uses a lambda in that fashion. We could, in principle, put a lambda anywhere into a word. I can think of no reason to do that, but it would work. I plan not to explore that possibility.
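For the curious, here is a sketch of a bare function sitting in a later slot. Again, MiniForth and MiniWord are hypothetical stand-ins modeled on the __call__ shown earlier, with next_word assumed to return the item at pc and step past it:

```python
class MiniForth:
    """Hypothetical stand-in for the real Forth object."""
    def __init__(self):
        self.stack = []
        self.active_words = []

    def begin(self, word):
        self.active_words.append(word)

    def end(self):
        self.active_words.pop()


class MiniWord:
    """Hypothetical single word class, modeled on the article's __call__."""
    def __init__(self, name, words, secondary=True):
        self.name = name
        self.words = words
        self.secondary = secondary
        self.pc = 0

    def next_word(self):
        # Assumed behavior: return the item at pc, then step past it.
        word = self.words[self.pc]
        self.pc += 1
        return word

    def __call__(self, forth):
        if self.secondary:
            forth.begin(self)
        self.pc = 0
        while self.pc < len(self.words):
            w = self.next_word()
            w(forth)  # works for a MiniWord and for a plain function alike
        if self.secondary:
            forth.end()


dup = MiniWord('DUP', [lambda f: f.stack.append(f.stack[-1])], secondary=False)

# A secondary whose second slot is a bare function rather than a word.
double = MiniWord('DOUBLE', [dup, lambda f: f.stack.append(f.stack.pop() + f.stack.pop())])

forth = MiniForth()
forth.stack.append(21)
double(forth)
print(forth.stack)         # [42]
print(forth.active_words)  # []
```

It works because the loop only ever calls what it finds. Whether that is a good idea is another matter.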

I think this is a legitimate improvement and simplification to the scheme. It pleases me.

At this moment, I don’t see anything additional that I want to do. The code seems to have scrunched down to a nice form, and we’re at a good stopping point. I have two things to discuss in summary:

Summary: Broke Everything

When I made the simple change to create only instances of SecondaryWord, 30-odd tests broke. The “proper” reaction to such a thing is to roll back, rethink, do again better. At least my betters and my better self tell me that.

I did not do that. Some of my reasons were good, some perhaps not so good. One good reason not to immediately roll back is that when a raft of tests break all at once, it is very likely that there is a single cause for all the failures. If that cause is some kind of simple typo, rolling back and doing over gives us a very good chance of not making the same mistake again.

The immediate rollback is a very fancy move, in my opinion. I recall the first time I saw it done. Kent Beck was programming on screen with a bunch of us in the audience. He made some mistake typing. We all saw it. Before we could even express what it was, he had reverted the code and just typed in the change again, this time without the mistake. We were all awestruck by this move. I am serious. It was just so cool: made mistake, erase everything he did, do over. No apparent thought, reading, debugging. Roll back and do over.

It’s a bloody power move. Watch this, peons! This is how the master does it! Blammo! (Mind you, I don’t think Kent meant it as a power move: it’s just a good habit that he has. Nonetheless we were all blasted back in our seats.)

If the mistake we’ve made was a thoughtless typo kind of thing, then the odds are good that doing it over will avoid the mistake. We will, after all, automatically proceed a bit more thoughtfully the second time.

But if it isn’t that kind of mistake, rolling back won’t help. With the change I was making, I felt intuitively that the issue was more fundamental, not something local to the one line of code I had just typed. So I wanted to look around, to see what it was.

Then I really did just try a few things. It was clear very early on that the active_word notion was involved, and a couple of attempts led me quickly to get the idea that primary words should not be doing begin and end. If I were thinking more of classical Forth, I would have realized that, in those Forths, the primary words do not manipulate the return stack: only secondaries do. I didn’t see it that clearly, but I saw it clearly enough to decide to try flagging words as primary or secondary, and it worked.

When all the breaking tests suddenly work, it’s a sign that you’ve found your fundamental error, just as it reflects some very central error when they all break. So as soon as things went green, I could be sure we had nailed the issue … or at least as sure as we can ever be in this business.

“Should” I have rolled back? Looking at it from here, I don’t think it would have made things better. So perhaps my “power move” was to break the roll back rule and use my precious brain instead. Anyway, it worked out and leads me to my second summary topic:

Summary: Woulda Coulda Shoulda

This current definition of a single Word class, which is callable, seems like a very clean and obvious way of implementing a Forth word in Python. It echoes the design of classical Forth rather well, and the fact that there’s just one class instead of two, with the one winding up simpler and shorter than the original, makes it seem all the more right.

So it’s simpler and better. I shoulda seen it. I coulda seen it, I woulda seen it except … except what?

Was I bad? Am I a bad programmer? Am I a bad person? Is there some rule that only a fool like me would ignore? If only I had spent more time in design, would this all have gone so much better? Would a few UML diagrams, like everyone says we should do, have avoided this?

It’s so easy to look at where we are now and conclude something about what we should have done in the past. However, we cannot go back on our own time line, and if we did, it would just branch reality at that past point, and the poor devil in this branch would still feel just as bad, even if in the other branch reality works out much better.

Two days ago the program was working just fine with two word classes. Today, it is working just fine with only one, simpler class. That’s better. Better is what we want.

My friend and colleague Diana Larsen once said to me “Don’t should all over yourself”. Darn good advice.

What we can do, when we’re inclined to think about what we “should” have done in the past, is to turn our eyes to the future. What would we like to do differently in the future? Not what should we do: we don’t know what the future will hold, and it’s better to wait until we have more information before we decide what to do. But what we might like to do is to have another option at our fingertips.

When my tests break, I have two options: “roll back immediately” and “try to figure out what you broke”. It is better, I think, to have options.

But with this change, from two classes to one simpler class, I don’t even see something I would like to do differently in the future. I didn’t see the option of having just one kind of word. At the time, I was aware that there were two kinds of words, and there really are two kinds. They just happen to be able to be represented in one class with a flag.

And I just happen to think that’s better. I could even be wrong: maybe there is a two-class solution that would be better. But was it a mistake not to see the possibility? Don’t know, don’t care. Our program was and is well-enough designed to be amenable to changes in this area. That’s good.

We develop code incrementally and iteratively. On some iterations we find opportunities to make it more compact and generally better. We take those opportunities. Things improve.

I choose not to should all over myself. It helps me be happy about today, instead of borrowing sorrow from the past. The day the two classes worked was a good day. The day we got it down to one class was a good day.

Summary Summary

Two good days. See you soon for another one, with any luck at all!

¹ No, ChatGPT doesn’t write my articles or my code. I make these mistakes all by myself. The loop above was humor, not a glitch in the Matrix.
² I’m going to talk about “should” anyway, in the Summary. So it’s convenient that this extra possible “should” arises. And I ignore it: I don’t think I “should” roll back. If I thought I “should”, I would.