Improving DO-LOOP

The Robot World Repo on GitHub
The Forth Repo on GitHub

#ElonaldDelendaEst! One pair of looping words is now using our new CompileInfo compile stack protocol. There is one more looping construct that needs attention. We create and remove duplication, making the program simpler.

We have DO-LOOP implemented and needing conversion. Well, we want to convert it, as a matter of principle, though it works fine as it stands. Why do we want to convert it? Because, almost mysteriously, when we make two or more patches of code look similar, good things happen: the program gets smaller and simpler.

Without further ado:

class Lexicon:
    def _define_do_loop(self):
        def _do(forth):
            forth.compile_stack.push(len(forth.word_list))
            forth.compile_word('*DO')
            # : DO SWAP >R >R ;

        def _loop(forth):
            jump_loc = forth.compile_stack.pop()
            forth.compile_word('*LOOP')
            forth.append_word(jump_loc - len(forth.word_list))

        self.pw('DO', _do, immediate=True)
        self.pw('LOOP', _loop, immediate=True)

    def define_skippers(self, forth):
        def _next_word(forth):
            return forth.next_word()

        def _star_loop(forth):
            beginning_of_do_loop = _next_word(forth)
            index = forth.return_stack.pop()
            limit = forth.return_stack.pop()
            index += 1
            if index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(index)
                forth.active_word.skip(beginning_of_do_loop)

        self.pw('*LOOP', _star_loop)

Interestingly enough, *DO is defined as a secondary Forth word, not a primary:

class Lexicon:
    def define_secondaries(self, forth):
        ...
        forth.process_line(': *DO SWAP >R >R ;')

Let’s take a moment to understand what’s going on, looking at a test.

    def test_do_loop(self):
        f = Forth()
        f.compile(': TEST 5 0 DO I 10 * LOOP ;')
        f.process_line(' TEST ')
        assert f.stack.stack == [0, 10, 20, 30, 40]

The word DO expects to see the limit of the loop and the starting index on the stack. It executes the code between DO and LOOP, and at LOOP, increments the index and loops back if index is less than the limit. The stack is unchanged except for whatever the user’s loop code puts on it or removes. The user code can get the current index with the word I. So our loop above puts 10*0, 10*1, …, 10*4 on the stack.

DO-LOOP uses the return stack, R. At the end of each iteration, LOOP expects the return stack to contain the limit and the current index, with the index on top. Our *DO word sets up that condition: SWAP brings the limit to the top of the data stack, and the two >R words then move the limit and the index, in that order, to the return stack. >R pops the data stack and pushes the popped value onto the return stack.
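Here is a small sketch of that return-stack dance, just to watch the values move. It is not the real code: plain Python lists stand in for the data stack and the return stack, and the names star_do and star_loop_step are made up for the illustration.

```python
# Hypothetical stand-ins: plain lists model the data stack and the return stack.
def star_do(stack, return_stack):
    """Model of ': *DO SWAP >R >R ;' on a data stack holding limit, index."""
    stack[-1], stack[-2] = stack[-2], stack[-1]   # SWAP: limit is now on top
    return_stack.append(stack.pop())              # >R moves the limit over
    return_stack.append(stack.pop())              # >R moves the index over, on top


def star_loop_step(return_stack):
    """Model of the decision in *LOOP: True means 'go around again'."""
    index = return_stack.pop()
    limit = return_stack.pop()
    index += 1
    if index < limit:
        return_stack.append(limit)
        return_stack.append(index)
        return True
    return False                                  # done: both values are dropped


stack, return_stack = [5, 0], []                  # as left by '5 0' before DO
star_do(stack, return_stack)
after_do = list(return_stack)                     # limit below, index on top

results = []
while True:
    results.append(return_stack[-1] * 10)         # body 'I 10 *': I reads the top of R
    if not star_loop_step(return_stack):
        break
```

After star_do the return stack holds the limit with the index above it, and when the loop finishes both stacks are empty again, which is the "stack is unchanged" property described above.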

Reviewing the compilation of DO and LOOP, we see DO saving a location, and LOOP setting up a branch, using the old scheme of a skip distance:

class Lexicon:
    def _define_do_loop(self):
        def _do(forth):
            forth.compile_stack.push(len(forth.word_list))
            forth.compile_word('*DO')
            # : DO SWAP >R >R ;

        def _loop(forth):
            jump_loc = forth.compile_stack.pop()
            forth.compile_word('*LOOP')
            forth.append_word(jump_loc - len(forth.word_list))

        self.pw('DO', _do, immediate=True)
        self.pw('LOOP', _loop, immediate=True)

_do saves the location, old style, and _loop unstacks it and patches the word after *LOOP. And reviewing _star_loop:

class Lexicon:
        def _star_loop(forth):
            beginning_of_do_loop = _next_word(forth)
            index = forth.return_stack.pop()
            limit = forth.return_stack.pop()
            index += 1
            if index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(index)
                forth.active_word.skip(beginning_of_do_loop)

Note that it fetches the next word after itself, the patched location, before doing anything else. I wasn't sure why it fetches it there. Could we move that line inside the if? A quick test tells us why not: we need to skip that word when we exit, and unless we call next_word unconditionally, we fall into the integer and try to execute it. So we’ll leave that alone.

We can certainly adjust this code to use the word address instead of the skipping scheme. In fact, this will suffice to do that:

class Lexicon:
        def _loop(forth):
            jump_loc = forth.compile_stack.pop()
            forth.compile_word('*LOOP')
            forth.append_word(jump_loc)

        def _star_loop(forth):
            beginning_of_do_loop = _next_word(forth)
            index = forth.return_stack.pop()
            limit = forth.return_stack.pop()
            index += 1
            if index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(index)
                forth.active_word.branch(beginning_of_do_loop)

Arrgh! What I just tried does not work. Roll back. Do again, fail better:

class Lexicon:
    def _define_do_loop(self):
        def _do(forth):
            forth.compile_word('*DO')
            forth.compile_stack.push(len(forth.word_list))
            # : DO SWAP >R >R ;

        def _loop(forth):
            jump_loc = forth.compile_stack.pop()
            forth.compile_word('*LOOP')
            forth.append_word(jump_loc)

        def _star_loop(forth):
            beginning_of_do_loop = _next_word(forth)
            index = forth.return_stack.pop()
            limit = forth.return_stack.pop()
            index += 1
            if index < limit:
                forth.return_stack.push(limit)
                forth.return_stack.push(index)
                forth.active_word.branch(beginning_of_do_loop)

I moved the compile_stack push down one line in the DO. I’m not sure why the skip dealt with it and the branch didn’t, but in any case we want to branch to after the *DO, not into it.
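One possible reading of the difference, offered as a guess rather than a diagnosis. The sketch assumes, and this is only an assumption since neither method is shown here, that skip adds its offset to a program counter that has already moved past the operand, while branch sets the counter outright. The cell positions are those of the TEST word from the test above, counting each literal as one cell; skip_target and branch_target are hypothetical names.

```python
# Cell positions in the compiled TEST word, counting each literal as one cell:
# two literals at 0 and 1, *DO at 2, I at 3, *LOOP at 6, its operand at 7.
STAR_DO_AT = 2
LOOP_BODY_AT = 3
STAR_LOOP_AT = 6
PC_AFTER_OPERAND = STAR_LOOP_AT + 2      # *LOOP and its operand both consumed


def skip_target(pc, offset):
    """Assumed behaviour of skip: relative to the current program counter."""
    return pc + offset


def branch_target(pc, address):
    """Assumed behaviour of branch: absolute, the current counter is ignored."""
    return address


# Old scheme: location saved before *DO is compiled, offset computed after
# *LOOP is compiled but before the operand is appended (list length is 7).
old_offset = STAR_DO_AT - (STAR_LOOP_AT + 1)
old_landing = skip_target(PC_AFTER_OPERAND, old_offset)

# First try at branch: the same saved location, now used as an absolute address.
first_try_landing = branch_target(PC_AFTER_OPERAND, STAR_DO_AT)

# Working version: location saved after *DO is compiled.
fixed_landing = branch_target(PC_AFTER_OPERAND, LOOP_BODY_AT)
```

Under those assumptions the old relative offset happened to land one cell past the saved location, on the I, while the first absolute attempt landed on the *DO itself and would have re-run SWAP >R >R on every pass.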

We are green. Commit: _star_loop uses branch instead of skip.

We’d like to make our current loop stacking and patching code look more like the other uses of CompileInfo. Here’s the BEGIN-UNTIL code to compare:

class Lexicon:
    def _define_begin_until(self):
        def _until(forth):
            forth.compile_word('0BR')
            info = forth.compile_stack.pop()
            forth.append_word(info.locations[0])

        self.pw('BEGIN', lambda f: f.push_compile_info('BEGIN'), immediate=True)
        self.pw('UNTIL', _until, immediate=True)

I’d like to see the code we’re compiling for the loop, just to be sure where the branch address is going. I expect it to be at the end.

: TEST *# 5 *# 0 *DO I *# 10 * *LOOP 3 ;
        0    1    2  3  4    5  6    7

We see there why *LOOP must consume the next word, the 3, to skip over it on exit.
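To convince myself, here is a toy model of that word list being executed. Everything in it is a stand-in invented for the illustration (MiniWord, lit, program and the rest); literals are single cells, matching the numbering in the listing, and the operand of *LOOP is the plain integer 3.

```python
class MiniWord:
    """Hypothetical stand-in for a compiled word: cells plus a program counter."""

    def __init__(self, cells):
        self.cells = cells
        self.pc = 0
        self.stack = []
        self.return_stack = []

    def next_word(self):
        cell = self.cells[self.pc]
        self.pc += 1
        return cell

    def run(self):
        while self.pc < len(self.cells):
            self.next_word()(self)       # every cell reached here must be callable
        return self.stack


def lit(n):
    return lambda w: w.stack.append(n)

def star_do(w):
    index, limit = w.stack.pop(), w.stack.pop()
    w.return_stack.extend([limit, index])

def i_word(w):
    w.stack.append(w.return_stack[-1])

def times(w):
    w.stack.append(w.stack.pop() * w.stack.pop())

def star_loop(w):
    target = w.next_word()               # always consumed, so exit resumes past it
    index = w.return_stack.pop() + 1
    limit = w.return_stack.pop()
    if index < limit:
        w.return_stack.extend([limit, index])
        w.pc = target

def lazy_star_loop(w):
    """Variant that fetches the operand only when looping back."""
    index = w.return_stack.pop() + 1
    limit = w.return_stack.pop()
    if index < limit:
        w.return_stack.extend([limit, index])
        w.pc = w.next_word()

def program(loop_word):
    return [lit(5), lit(0), star_do, i_word, lit(10), times, loop_word, 3]


result = MiniWord(program(star_loop)).run()

try:
    MiniWord(program(lazy_star_loop)).run()
    lazy_failure = None
except TypeError as error:               # the model tries to call the integer 3
    lazy_failure = type(error).__name__
```

In this model the lazy variant runs the loop correctly and then, on the final pass, tries to execute the operand as if it were a word. The real system may fail differently, but the shape of the problem is the same.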

So it seems to me that if we have DO push the compile info, and use compile_word as we do with _until, our code for DO-LOOP will look much the same as for BEGIN-UNTIL.

class Lexicon:
    def _define_do_loop(self):
        def _do(forth):
            forth.compile_word('*DO')
            forth.push_compile_info('DO')
            # : DO SWAP >R >R ;

        def _loop(forth):
            forth.compile_word('*LOOP')
            info = forth.compile_stack.pop()
            forth.append_word(info.locations[0])

And in fact we are green. Commit: making do-loop look like begin-until

Compare that with this:

class Lexicon:
    def _define_begin_until(self):
        def _until(forth):
            forth.compile_word('0BR')
            info = forth.compile_stack.pop()
            forth.append_word(info.locations[0])

        self.pw('BEGIN', lambda f: f.push_compile_info('BEGIN'), immediate=True)
        self.pw('UNTIL', _until, immediate=True)

The three-line bodies of _until and _loop are the same except for the parameter, the word compiled, and they all refer to forth, not to Lexicon. That's Feature Envy, calling for a method on forth. I think its name might be compile_branching_word, at least for now.

    def compile_branching_word(self, branch_name, info_name):
        self.compile_word(branch_name)
        info = self.compile_stack.pop()
        assert info.name == info_name, f'{info.name} != {info_name}'
        self.append_word(info.locations[0])

Now it seems that I could call that:

    def _define_begin_until(self):
        def _until(forth):
            forth.compile_branching_word('0BR', 'BEGIN')

Green. Commit: create new compile_branching_word in Forth, use in BEGIN-UNTIL.

Do again:

    def _define_do_loop(self):
        def _do(forth):
            forth.compile_word('*DO')
            forth.push_compile_info('DO')
            # : DO SWAP >R >R ;

        def _loop(forth):
            forth.compile_branching_word('*LOOP', 'DO')

Green. Commit: use compile_branching_word in DO-LOOP
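For anyone who wants to see the whole protocol in one place, here is a self-contained sketch. CompileInfo and MiniForth below are simplified guesses at the shape of the real classes, not copies of them: the sketch assumes push_compile_info records the current length of the word list, and compile_word just appends a name instead of looking the word up. Only compile_branching_word is taken as written above.

```python
class CompileInfo:
    """Hypothetical minimal version: a name plus the locations recorded for it."""

    def __init__(self, name, location):
        self.name = name
        self.locations = [location]


class MiniForth:
    """Hypothetical stand-in for Forth, holding only what compiling needs."""

    def __init__(self):
        self.word_list = []
        self.compile_stack = []

    def compile_word(self, name):
        self.word_list.append(name)      # the real method presumably looks the word up

    def append_word(self, cell):
        self.word_list.append(cell)

    def push_compile_info(self, name):
        # Assumption: the info remembers where the next compiled cell will go.
        self.compile_stack.append(CompileInfo(name, len(self.word_list)))

    def compile_branching_word(self, branch_name, info_name):
        self.compile_word(branch_name)
        info = self.compile_stack.pop()
        assert info.name == info_name, f'{info.name} != {info_name}'
        self.append_word(info.locations[0])


# Compile the cells of ': TEST 5 0 DO I 10 * LOOP ;' by hand.
forth = MiniForth()
for cell in ['*#5', '*#0']:
    forth.append_word(cell)
forth.compile_word('*DO')                # what _do does ...
forth.push_compile_info('DO')            # ... in that order
for cell in ['I', '*#10', '*']:
    forth.append_word(cell)
forth.compile_branching_word('*LOOP', 'DO')
compiled = forth.word_list

# A mismatched pair, BEGIN closed by LOOP, trips the name check.
mismatched = MiniForth()
mismatched.push_compile_info('BEGIN')
try:
    mismatched.compile_branching_word('*LOOP', 'DO')
    mismatch_message = None
except AssertionError as error:
    mismatch_message = str(error)
```

The branch address lands in the last cell, pointing at the I, and the name check is what lets one method serve both BEGIN-UNTIL and DO-LOOP without quietly accepting a BEGIN closed by a LOOP.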

Both of those can now be lambdas, I believe:

    def _define_do_loop(self):
        def _do(forth):
            forth.compile_word('*DO')
            forth.push_compile_info('DO')
            # : DO SWAP >R >R ;

        self.pw('DO', _do, immediate=True)
        self.pw('LOOP',
                lambda f: f.compile_branching_word('*LOOP', 'DO'),
                immediate=True)

Commit: convert function to lambda.

    def _define_begin_until(self):
        self.pw('BEGIN', lambda f: f.push_compile_info('BEGIN'), immediate=True)
        self.pw('UNTIL', lambda f: f.compile_branching_word('0BR', 'BEGIN'), immediate=True)

Commit ditto.

Let’s sum up: there’s bacon nearly ready to be eaten.

Summary

Again and again, we find code that appears to be quite different. We manipulate it in small simple ways and soon enough it looks the same as some other code. We take that similar-looking code, turn it into a method (or sometimes a function) and put it on the class it seems to belong to. That class is almost always obvious from the code itself, and surprisingly often, not the class we find it in.

The code becomes smaller and simpler, and now contains fewer methods that have to be understood, and fewer methods that are almost the same but might differ in ways we can’t quite see.

And just about all we do is to try to make things look the same, and to figure out what class a few lines might best be placed in.

And the code improves, while nothing breaks. This is a good thing.

#ElonaldDelendaEst! See you next time!