Review

The Robot World Repo on GitHub
The Forth Repo on GitHub

#ElonaldDelendaEst! Let’s review the conditionals and looping. Let’s think about what seems to happen when we do as we do. Let’s think about more Forth and less Python.

Over the past articles, we’ve improved our implementation of conditionals and looping. We’ve implemented the CASE-OF-ENDOF-ENDCASE words. We’ve created a more robust compile stack structure that allows for detecting stack errors. We’ve centralized and simplified the code we use to compile those loops and conditionals. We have more capability in our Forth, and have reduced the code needed on a per-statement basis. Shortly, we’ll review the current implementations to see whether there is more to be done to improve things.

But first: How does all this work?

As I work, it seems to me that I just view things through a very small lens most of the time, working on single methods, or a couple of methods that together support some feature. It seems that I look for things that are similar, and then make them more similar, and after a while, there seems only to be one thing that is used in multiple places. All that seems to me to happen without a great deal of planning or big-picture thinking.

It really does seem that way to me, and I rather hope that it seems that way to you as well. I work in small spots, finding similarity, making the similarities simpler, and then isolating the similar code so that all the similar things can use it.

Now, I freely grant that I think about my code and design often, not while I’m sitting here writing the article and code, but other times, resting or musing or even sometimes studying the “literature” about whatever I’m working on. I don’t often do more than muse about it, and when I do draw a diagram, it’s generally very simple and I will usually include it in the next article. But it’s pretty close to true that while I think about design a lot, I don’t actually create “big” designs for things. Having thought about what the design might look like, I turn to the code and start improving it, whether adding a feature or just refactoring. I sort of lean in the direction of the design I was thinking about, but mostly I let the code guide what I do.

And it works. It works almost every time. Over time, the design stays flexible and at intervals it actually improves, becoming simpler, more clear. It’s almost magical, even to me, and I’m doing it.

What’s really going on here?

I think there’s a deep reason why this works. As I am not really a deep thinker, I’m not going to try to get down too deep, but it seems to me that all programming is made up of chunks of code that do similar things. Maybe it’s windows and database calls. Maybe it’s accumulating transactions. Maybe it’s calculating line items on tax forms or defining words in Forth. Whatever it is, the program is usually dealing with many cases of the same kind of thing.

There is almost always some conceptual integrity, some central ideas, behind the desire to have a program, and behind the particular way a program is built. If there isn’t such an integrity, the program is going to be nearly impossible to write and impossible to use: it just won’t “make sense”.

If there are these central ideas in the program, there will be only a few ways of dealing with those ideas in code, and most of those ways will have to be similar, because they’re doing similar things. We may not fully understand good ways of doing what needs doing, so we may write code that kind of wanders around in the concepts until it manages to produce something we want. But as we do it more and more, patterns start to arise. We begin to copy and paste code, or to use one bit as a prototype for writing another.

I’ve seen a lot of code in my life that just goes that far. Every module looks like every other module, except for some differences that are interspersed and sprinkled around in the similar-looking code.

Those differences are trying to be parameters. The similar code is trying to be a function, a method, a module.

Often, we just can’t quite see what to do, and the pressure of time seems to keep us copying and pasting.

Can we draw a conclusion?

If my work here ever “shows” anything, I think it shows that if we take a little extra time to make similar code more similar, to keep the differences isolated and the similarities together, functions, methods, classes, and modules “emerge”. And when we make those things explicit, we can implement our next feature by calling the code, rather than by copying it and trying to modify it to do what we need.

Good design emerges from creating similarity, and then centralizing it. Or so it seems to me.
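Here is a tiny sketch of what I mean by differences wanting to be parameters. The tax-form functions and their rates are made up for illustration; they are not from any real program. The first two functions are the copy-and-paste stage, and the third is what emerges once the similar parts are made identical and the differences are passed in.

```python
# Hypothetical example: two copy-pasted functions that differ only
# in a label and a rate.
def sales_tax_line(amount):
    tax = round(amount * 0.06, 2)
    return f'Sales tax: {tax:.2f}'


def city_tax_line(amount):
    tax = round(amount * 0.015, 2)
    return f'City tax: {tax:.2f}'


# The differences (label, rate) become parameters;
# the similar code becomes one function.
def tax_line(label, rate, amount):
    tax = round(amount * rate, 2)
    return f'{label}: {tax:.2f}'


if __name__ == '__main__':
    print(tax_line('Sales tax', 0.06, 100))
    print(tax_line('City tax', 0.015, 100))
```

The next kind of tax line is then a call to tax_line, not another pasted copy.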

The Code

OK, sermon over, let’s take a look at the conditionals and loops and see if there are still things that could be improved. There certainly are: the trick is spotting them. We may spot a few now. If not, probably later.

Here are our main conditional and looping constructs:

    def _define_begin_until(self):
        self.pw('BEGIN', lambda f: f.push_compile_info('BEGIN'), immediate=True)
        self.pw('UNTIL', lambda f: f.compile_branching_word('0BR', 'BEGIN'), immediate=True)

    def _define_do_loop(self):
        def _do(forth):
            forth.compile_word('*DO')
            forth.push_compile_info('DO')
            # : DO SWAP >R >R ;

        self.pw('DO', _do, immediate=True)
        self.pw('LOOP',
                lambda f: f.compile_branching_word('*LOOP', 'DO'),
                immediate=True)

    def _define_if_else_then(self):
        def _if(forth):
            forth.compile_branch('0BR', 'IF')

        def _else(forth):
            forth.compile_branch('BR', 'IF')
            forth.compile_stack.swap_pop().patch('IF')

        def _then(forth):
            forth.compile_stack.pop().patch('IF')

        self.pw('IF',   _if,   immediate=True)
        self.pw('ELSE', _else, immediate=True)
        self.pw('THEN', _then, immediate=True)

    def define_skippers(self, forth):
        def _next_word(forth):
            return forth.next_word()

        def _star_loop(forth):
            beginning_of_do_loop = _next_word(forth)
            index = forth.return_stack.pop()
            limit = forth.return_stack.pop()
            index += 1
            if index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(index)
                forth.active_word.branch(beginning_of_do_loop)

        self.pw('*LOOP',  _star_loop)

Of course _star_loop pops right out at us as very different from the others. We’ll come back to that one, looking for some smaller wins if we can find them.

I think our current practice is to inline one-line defs:

    def _define_if_else_then(self):
        def _else(forth):
            forth.compile_branch('BR', 'IF')
            forth.compile_stack.swap_pop().patch('IF')

        self.pw('ELSE', _else, immediate=True)
        self.pw('IF',   lambda f: f.compile_branch('0BR', 'IF'),   immediate=True)
        self.pw('THEN', lambda f: f.compile_stack.pop().patch('IF'), immediate=True)

Commit: convert one line functions to lambda.
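For anyone who wants to see the back-patching idea behind these compile-time words in one place, here is a minimal, self-contained sketch. It is not the code from the repo: ToyCompiler, pop_info, and this cut-down CompileInfo are hypothetical stand-ins, and I am assuming absolute branch targets and that swap_pop returns the entry just under the top of the compile stack. The real compile_branch, patch, and compile_branching_word may well differ in detail.

```python
class CompileInfo:
    """Hypothetical stand-in: which construct opened, and a code location."""
    def __init__(self, name, location):
        self.name = name
        self.location = location


class ToyCompiler:
    def __init__(self):
        self.code = []           # compiled words and branch targets
        self.compile_stack = []  # open constructs awaiting their closers

    def pop_info(self, expected, index=-1):
        # Checking the name is what lets us detect mismatched constructs.
        info = self.compile_stack.pop(index)
        if info.name != expected:
            raise SyntaxError(f'expected {expected}, found {info.name}')
        return info

    def compile_branch(self, op, name):
        # Forward branch: emit the op plus a placeholder target to patch later.
        self.code.extend([op, None])
        self.compile_stack.append(CompileInfo(name, len(self.code) - 1))

    def patch(self, info):
        # Fill the placeholder with the address of the next word compiled.
        self.code[info.location] = len(self.code)

    def compile(self, tokens):
        for token in tokens:
            if token == 'IF':
                self.compile_branch('0BR', 'IF')
            elif token == 'ELSE':
                self.compile_branch('BR', 'IF')
                # Assumed swap_pop behavior: take the entry under the top.
                self.patch(self.pop_info('IF', index=-2))
            elif token == 'THEN':
                self.patch(self.pop_info('IF'))
            elif token == 'BEGIN':
                self.compile_stack.append(CompileInfo('BEGIN', len(self.code)))
            elif token == 'UNTIL':
                # Backward branch: the target is already known.
                self.code.extend(['0BR', self.pop_info('BEGIN').location])
            else:
                self.code.append(token)
        return self.code


if __name__ == '__main__':
    print(ToyCompiler().compile(['IF', 'A', 'ELSE', 'B', 'THEN', 'C']))
    # → ['0BR', 5, 'A', 'BR', 6, 'B', 'C']
```

In this sketch, closing a BEGIN with THEN raises SyntaxError, which is the kind of stack error the tagged compile stack is meant to catch.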

The _next_word in define_skippers isn’t carrying its weight. Let’s inline that and reorder, to see if it suggests anything further.

    def define_skippers(self, forth):
        def _star_loop(forth):
            index = forth.return_stack.pop()
            limit = forth.return_stack.pop()
            index += 1
            beginning_of_do_loop = forth.next_word()
            if index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(index)
                forth.active_word.branch(beginning_of_do_loop)

Commit: inline and reorder

I think we could combine the + 1 into the index fetching, and use a better name:

    def define_skippers(self, forth):
        def _star_loop(forth):
            new_index = forth.return_stack.pop() + 1
            limit = forth.return_stack.pop()
            beginning_of_do_loop = forth.next_word()
            if new_index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(new_index)
                forth.active_word.branch(beginning_of_do_loop)

Feature Envy much? Every line of this operation refers to forth, except for the index comparison.

Now the fact is, all of our words use the forth class extensively, because it is the source and repository of all knowledge, all the stacks and so on. So we expect to see a reference or two to forth everywhere: it’s central to our implementation.

I suspect that if we make this one into a method on Forth, we may not find a lot of commonality but we might well get a bit of simplification. Even if not, 6 out of 7 lines referencing forth is a pretty clear sign.

class Forth:
    def star_loop(self):
        new_index = self.return_stack.pop() + 1
        limit = self.return_stack.pop()
        beginning_of_do_loop = self.next_word()
        if new_index < limit:
            self.return_stack.push(limit)
            self.return_stack.push(new_index)
            self.active_word.branch(beginning_of_do_loop)

class Lexicon:
    def define_skippers(self, forth):
        self.pw('*LOOP',  lambda f: f.star_loop())
        self.pw('*#',     lambda f: f.stack.push(f.next_word()))
        self.pw('DUMP',   lambda f: f.stack.dump(f.active_word.name, f.active_word.pc))

Green. Commit: move star_loop to Forth class.
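To see what star_loop does on each trip around the loop, here is a small simulation. Only the body of star_loop is copied from above. FakeForth, FakeWord, the minimal Stack, and count_passes are hypothetical stand-ins for this sketch, not the classes from the repo, and next_word here just returns a fixed loop-start address.

```python
class Stack:
    """Minimal stand-in for the real Stack: push and pop only."""
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()


class FakeWord:
    """Stand-in for the active word: records the last branch target."""
    def __init__(self):
        self.branch_target = None

    def branch(self, target):
        self.branch_target = target


class FakeForth:
    def __init__(self, limit, index, loop_start=0):
        # DO leaves limit, then index, on the return stack (index on top).
        self.return_stack = Stack()
        self.return_stack.push(limit)
        self.return_stack.push(index)
        self.active_word = FakeWord()
        self.loop_start = loop_start

    def next_word(self):
        return self.loop_start

    def star_loop(self):
        new_index = self.return_stack.pop() + 1
        limit = self.return_stack.pop()
        beginning_of_do_loop = self.next_word()
        if new_index < limit:
            self.return_stack.push(limit)
            self.return_stack.push(new_index)
            self.active_word.branch(beginning_of_do_loop)


def count_passes(limit, index):
    """Count how many times a DO ... LOOP body would run."""
    forth = FakeForth(limit, index)
    passes = 1  # the body runs once before *LOOP is first reached
    while True:
        forth.active_word.branch_target = None
        forth.star_loop()
        if forth.active_word.branch_target is None:
            return passes
        passes += 1


if __name__ == '__main__':
    print(count_passes(3, 0))  # → 3
    print(count_passes(0, 0))  # → 1
```

Because the comparison happens at the bottom of the loop, the body always runs at least once, even when the starting index is already at the limit.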

More Forth, less Python?

We can just about express star_loop in Forth:

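: STAR_LOOP
   R> R> SWAP 1+      ( limit new_index )
   2DUP >             ( limit new_index flag, true when limit > new_index )
   IF
       SWAP >R >R     ( put limit and new_index back on the return stack )
       BR ??
   ELSE
       2DROP          ( loop is done: discard limit and new_index )
   THEN
;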

I’m pretty sure that’s correct, and if not, it’s close. We just don’t know where to branch to. But as compiled now, it’s nearly ready to be done that way. I believe we could make it work.

In a classical Forth, I’m sure that it would be implemented in Forth, because in a classical Forth, everything is implemented either in Forth or in assembler/binary. Forth is easier, and generally speaking one implements as few words as possible in assembler, because all the others are portable from one Forth to the next. That makes getting the next hardware up and running Forth easier and quicker.

Our situation, implementing on Python, is a bit different. Our Python code is easier to write than assembler and easier to understand. So there’s no big saving to be had by writing things like this in Forth—although it certainly would be cool and display a certain style if we were to do it.

It is possible to write the Forth interpreter and compile code in Forth. Again, one just implements a few very rudimentary words in assembler and all the rest is Forth. I would like to know how to do that. Doing that would also be cool and stylish, if we were to do it.

We just might. But not today. Today, let’s reflect and sum up.

Reflective Summary

Again (and again and again) we find common code, combine it, replace multiple lines with one line. Things get simpler.

Our last move, however, was a bit different: there was no duplication, no reduction in lines of code. We just moved some code to a class that is better equipped to support it. But that class, Forth, is 139 lines of code. The Lexicon, which defines all of Forth, is larger at 190 lines. All the other classes that make up Forth are smaller. Stack (76), CompileInfo (30), and Heap (18) are the smallest. Word is 58 lines, including an 11-line __repr__ for convenient debugging display and a five-line index method used only in tests.

These numbers might be OK: 190, 139, 76, 58, 30, 18. But I feel that Forth itself needs a look. It surely has at least two responsibilities, one of them being compiling and executing Forth, and the other being support for the specific needs of words during compilation or operation.

We might find two objects in there. Another clue suggesting more than one object is the very long __init__:

class Forth:
    def __init__(self):
        self.true = -1
        self.false = 0
        self.active_words = []
        self.compile_stack = Stack()
        self.compilation_state = False
        self.c_stack_top = None
        self.heap = Heap()
        self.lexicon = Lexicon()
        self.return_stack = Stack()
        self.stack = Stack()
        self.tokens = None
        self.token_index = 0
        self.word_list = []
        self.lexicon.define_primaries(self)

That’s a bit of a mish-mash, and especially with a couple of naked lists and some odd-seeming names like c_stack_top, I think the Forth class might benefit from a bit of improvement. We’ll look into that in the near future.

For today, we’ve made a bit of improvement and that’s all it takes to make a good day.

#ElonaldDelendaEst! See you next time!