The Repo on GitHub

Today, before the celebration begins, we’ll do ELSE. Then we refactor the code. It takes me three tries to get code that I like. Long article, mostly just pasted and re-pasted code with small changes. Unwrap when ready, if you celebrate in a truly odd fashion by reading this.

Let’s do ELSE. In Forth one could write:

1 IF 5 ELSE 50 THEN

And that would wind up with a 5 on the stack, while this:

0 IF 5 ELSE 50 THEN

We wind up with 50 on the stack.

Perhaps we should have a test or two:

    def test_else(self):
        f = Forth()
        s = ': TEST IF 5 ELSE 50 THEN ;'
        test_word = f.compile(s)
        f.stack.push(1)
        test_word.do(f)
        assert f.stack.pop() == 5
        f.stack.push(0)
        test_word.do(f)
        assert f.stack.pop() == 50

This fails, of course, for want of ELSE:

>               raise SyntaxError(f'Syntax error: "{word}" unrecognized')
E               SyntaxError: Syntax error: "ELSE" unrecognized

ELSE needs to work like THEN, in that when it compiles, it needs to tell the IF how far to skip. The target should be after whatever we compile for the ELSE. And we need to compile something, because if the IF is true, the code will run until it encounters the ELSE’s compiled code, which should skip to the THEN.

This probably means that the THEN code needs to accept a key of IF or ELSE, and that we need to compile a *ELSE, similar to how we compile *IF.

I’m going to try to do this all in one go, although I can imaging belaying the test above and writing a test for *ELSE, as we did for *IF. I think I’m up to it. Let’s find out.

class Forth:
    def define_primaries(self):
        lex = self.lexicon
        lex.append(PrimaryWord('*IF', lambda f: f.star_if()))
        lex.append(PrimaryWord('*ELSE', lambda f: f.star_else()))
        ...
class Forth:
    def star_else(self):
        self.active_word.skip(self.next_word())

The ELSE behavior, when encountered, always skips. If I had written the hand-crafted *ELSE test, it would be passing now, I think. But we have more to do: we have to detect and compile the ELSE, and we have to fix up the THEN.

Here’s what I tried:

    def compile_word_list(self, rest):
        word_list = []
        for word in rest:
            if word == 'IF':
                word_list.append(self.find_word('*IF'))
                self.stack.push( ('IF', len(word_list)))
                word_list.append(0)
            elif word == 'THEN':
                last_loc = len(word_list) - 1
                key, patch_loc = self.stack.pop()
                if key != 'IF' and key != 'ELSE':
                    raise SyntaxError(f'THEN did not find matching if, found: "{key}"')
                word_list[patch_loc] = last_loc - patch_loc
            elif word == 'ELSE':
                word_list.append(self.find_word('*ELSE'))
                last_loc = len(word_list) + 1
                key, patch_loc = self.stack.pop()
                if key !='IF':
                    raise SyntaxError(f'ELSE did not find matching if, found: "{key}"')
                word_list[patch_loc] = last_loc - patch_loc
                self.stack.push( ('ELSE', len(word_list)))
                word_list.append(0)

Test does not run, but I think the + 1 is the problem. It would be + 1 before the append of the *ELSE but not after. Change that line and the test runs.

    elif word == 'ELSE':
        word_list.append(self.find_word('*ELSE'))
        last_loc = len(word_list)
        key, patch_loc = self.stack.pop()
        if key !='IF':
            raise SyntaxError(f'ELSE did not find matching if, found: "{key}"')
        word_list[patch_loc] = last_loc - patch_loc
        self.stack.push( ('ELSE', len(word_list)))
        word_list.append(0)

Kind of a nasty patch of code, but I believe it is righteous. I’ll print the definition of our test so that we can look at it:

[PW: *IF, 4, PW: *#, 5, PW: *ELSE, 2, PW: *#, 50]
      0   1      2   3      4      5      6   7   8
                                          ^-------^
                 ^------------------------^   

If the IF is false, we need to skip *#, 5, *ELSE, and the 2, so 4 is correct for the value after *IF. If it is true, we’ll continue until we encounter the *ELSE, which must then skip 2, the *# and the 5.

I believe this to be working. Now let’s see if we can improve the code a bit:

    if word == 'IF':
        word_list.append(self.find_word('*IF'))
        self.stack.push( ('IF', len(word_list)))
        word_list.append(0)
    elif word == 'THEN':
        last_loc = len(word_list) - 1
        key, patch_loc = self.stack.pop()
        if key != 'IF' and key != 'ELSE':
            raise SyntaxError(f'THEN did not find matching if, found: "{key}"')
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        word_list.append(self.find_word('*ELSE'))
        last_loc = len(word_list)
        key, patch_loc = self.stack.pop()
        if key !='IF':
            raise SyntaxError(f'ELSE did not find matching if, found: "{key}"')
        word_list[patch_loc] = last_loc - patch_loc
        self.stack.push( ('ELSE', len(word_list)))
        word_list.append(0)

I’m going to see if I can arrange this code so that the three sections look more alike.

    if word == 'IF':
        last_loc = len(word_list) + 1

        self.stack.push( ('IF', last_loc))
        word_list.append(self.find_word('*IF'))
        word_list.append(0)
    elif word == 'THEN':
        key, patch_loc = self.stack.pop()
        if key != 'IF' and key != 'ELSE':
            raise SyntaxError(f'THEN did not find matching IF or ELSE, found: "{key}"')
        last_loc = len(word_list) - 1
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        key, patch_loc = self.stack.pop()
        if key !='IF':
            raise SyntaxError(f'ELSE did not find matching IF, found: "{key}"')
        last_loc = len(word_list) + 1
        word_list[patch_loc] = last_loc - patch_loc

        self.stack.push( ('ELSE', last_loc))
        word_list.append(self.find_word('*ELSE'))
        word_list.append(0)

I put in some extra spacing to help us see what we’ve got there. I’m working toward extracting some methods, if I can get the code similar enough. Let’s change the keys we push to the starred versions:

    if word == 'IF':
        last_loc = len(word_list) + 1

        self.stack.push( ('*IF', last_loc))
        word_list.append(self.find_word('*IF'))
        word_list.append(0)
    elif word == 'THEN':
        key, patch_loc = self.stack.pop()
        if key != '*IF' and key != '*ELSE':
            raise SyntaxError(f'THEN did not find matching IF or ELSE, found: "{key}"')
        last_loc = len(word_list) - 1
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        key, patch_loc = self.stack.pop()
        if key !='*IF':
            raise SyntaxError(f'ELSE did not find matching IF, found: "{key}"')
        last_loc = len(word_list) + 1
        word_list[patch_loc] = last_loc - patch_loc

        self.stack.push( ('*ELSE', last_loc))
        word_list.append(self.find_word('*ELSE'))
        word_list.append(0)

I’m dong this in tiny steps, so tiny that I could be committing all this time. In fact, let’s start. Commit: ELSE working, refactoring.

I made those keys start with * so that they are the same as the name we compile. Now extract variables:

            if word == 'IF':
                last_loc = len(word_list) + 1

                keyword = '*IF'
                self.stack.push( (keyword, last_loc))
                word_list.append(self.find_word(keyword))
                word_list.append(0)
            elif word == 'THEN':
                key, patch_loc = self.stack.pop()
                if key != '*IF' and key != '*ELSE':
                    raise SyntaxError(f'THEN did not find matching IF or ELSE, found: "{key}"')
                last_loc = len(word_list) - 1
                word_list[patch_loc] = last_loc - patch_loc
            elif word == 'ELSE':
                key, patch_loc = self.stack.pop()
                if key !='*IF':
                    raise SyntaxError(f'ELSE did not find matching IF, found: "{key}"')
                last_loc = len(word_list) + 1
                word_list[patch_loc] = last_loc - patch_loc

                keyword = '*ELSE'
                self.stack.push((keyword, last_loc))
                word_list.append(self.find_word(keyword))
                word_list.append(0)

Now extract method, then inline the temps:

            if word == 'IF':
                last_loc = len(word_list) + 1

                self.compile_and_stack_keyword('*IF', last_loc, word_list)
            elif word == 'THEN':
                key, patch_loc = self.stack.pop()
                if key != '*IF' and key != '*ELSE':
                    raise SyntaxError(f'THEN did not find matching IF or ELSE, found: "{key}"')
                last_loc = len(word_list) - 1
                word_list[patch_loc] = last_loc - patch_loc
            elif word == 'ELSE':
                key, patch_loc = self.stack.pop()
                if key !='*IF':
                    raise SyntaxError(f'ELSE did not find matching IF, found: "{key}"')
                last_loc = len(word_list) + 1
                word_list[patch_loc] = last_loc - patch_loc

                self.compile_and_stack_keyword('*ELSE', last_loc, word_list)

So close. Let’s change the messages to be the same for a moment and also the key checking:

    elif word == 'THEN':
        check_list = ['*IF', '*ELSE']
        key, patch_loc = self.stack.pop()
        if key not in check_list:
            raise SyntaxError(f'malformed IF-ELSE_THEN, found: "{key}"')
        last_loc = len(word_list) - 1
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        check_list = ['*IF']
        key, patch_loc = self.stack.pop()
        if key not in check_list:
            raise SyntaxError(f'malformed IF-ELSE_THEN, found: "{key}"')
        last_loc = len(word_list) + 1
        word_list[patch_loc] = last_loc - patch_loc

        self.compile_and_stack_keyword('*ELSE', last_loc, word_list)

I think we can extract now. Oops, not yet, need to move the last_loc out.

    elif word == 'THEN':
        check_list = ['*IF', '*ELSE']
        last_loc = len(word_list) - 1
        key, patch_loc = self.stack.pop()
        if key not in check_list:
            raise SyntaxError(f'malformed IF-ELSE_THEN, found: "{key}"')
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        check_list = ['*IF']
        last_loc = len(word_list) + 1
        key, patch_loc = self.stack.pop()
        if key not in check_list:
            raise SyntaxError(f'malformed IF-ELSE_THEN, found: "{key}"')
        word_list[patch_loc] = last_loc - patch_loc

        self.compile_and_stack_keyword('*ELSE', last_loc, word_list)

Still green, still could commit every time I show this.

    if word == 'IF':
        last_loc = len(word_list) + 1
        self.compile_and_stack_keyword('*IF', last_loc, word_list)
    elif word == 'THEN':
        check_list = ['*IF', '*ELSE']
        last_loc = len(word_list) - 1
        self.check_and_patch_skip(check_list, last_loc, word_list)
    elif word == 'ELSE':
        check_list = ['*IF']
        last_loc = len(word_list) + 1
        self.check_and_patch_skip(check_list, last_loc, word_list)
        self.compile_and_stack_keyword('*ELSE', last_loc, word_list)

Inline those temps:

    if word == 'IF':
        self.compile_and_stack_keyword('*IF', len(word_list) + 1, word_list)
    elif word == 'THEN':
        last_loc = len(word_list) - 1
        self.check_and_patch_skip(['*IF', '*ELSE'], last_loc, word_list)
    elif word == 'ELSE':
        last_loc = len(word_list) + 1
        self.check_and_patch_skip(['*IF'], last_loc, word_list)
        self.compile_and_stack_keyword('*ELSE', last_loc, word_list)

I’m not satisfied with that. I don’t like the ‘and’ in either method. I like that we’ve combined the common elements, but I’m just not satisfied.

Roll Back:

    if word == 'IF':
        last_loc = len(word_list) + 1

        self.stack.push( ('*IF', last_loc))
        word_list.append(self.find_word('*IF'))
        word_list.append(0)
    elif word == 'THEN':
        key, patch_loc = self.stack.pop()
        if key != '*IF' and key != '*ELSE':
            raise SyntaxError(f'THEN did not find matching IF or ELSE, found: "{key}"')
        last_loc = len(word_list) - 1
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        key, patch_loc = self.stack.pop()
        if key !='*IF':
            raise SyntaxError(f'ELSE did not find matching IF, found: "{key}"')
        last_loc = len(word_list) + 1
        word_list[patch_loc] = last_loc - patch_loc

        self.stack.push( ('*ELSE', last_loc))
        word_list.append(self.find_word('*ELSE'))
        word_list.append(0)

We’ll try another angle. Extract the name we use in the two stack-append blocks.

    if word == 'IF':
        last_loc = len(word_list) + 1

        keyword = '*IF'
        self.stack.push((keyword, last_loc))
        word_list.append(self.find_word(keyword))
        word_list.append(0)
    elif word == 'THEN':
        ...
    elif word == 'ELSE':
        ...
        keyword = '*ELSE'
        self.stack.push((keyword, last_loc))
        word_list.append(self.find_word(keyword))
        word_list.append(0)

Extract and inline the temp:

            if word == 'IF':
                last_loc = len(word_list) + 1
                self.compile_conditional('*IF', last_loc, word_list)
            elif word == 'THEN':
                key, patch_loc = self.stack.pop()
                if key != '*IF' and key != '*ELSE':
                    raise SyntaxError(f'THEN did not find matching IF or ELSE, found: "{key}"')
                last_loc = len(word_list) - 1
                word_list[patch_loc] = last_loc - patch_loc
            elif word == 'ELSE':
                key, patch_loc = self.stack.pop()
                if key !='*IF':
                    raise SyntaxError(f'ELSE did not find matching IF, found: "{key}"')
                last_loc = len(word_list) + 1
                word_list[patch_loc] = last_loc - patch_loc

                self.compile_conditional('*ELSE', last_loc, word_list)

    def compile_conditional(self, keyword, last_loc, word_list):
        self.stack.push((keyword, last_loc))
        word_list.append(self.find_word(keyword))
        word_list.append(0)

No again. Why not? Because I think we can avoid the separate last_loc calculations.

Roll Back.

The original code is actually good enough as it stands, in the sense that it passes all the tests. It’s just too much code in a row, and too similar for my tastes, while not yet similar enough. We’ll try again:

    if word == 'IF':
        skip_adjustment = 1
        last_loc = len(word_list) + skip_adjustment

        self.stack.push( ('*IF', last_loc))
        word_list.append(self.find_word('*IF'))
        word_list.append(0)
    elif word == 'THEN':
        key, patch_loc = self.stack.pop()
        if key != '*IF' and key != '*ELSE':
            raise SyntaxError(f'THEN did not find matching IF or ELSE, found: "{key}"')

        skip_adjustment = -1
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        key, patch_loc = self.stack.pop()
        if key !='*IF':
            raise SyntaxError(f'ELSE did not find matching IF, found: "{key}"')
        skip_adjustment = 1
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc

        self.stack.push( ('*ELSE', last_loc))
        word_list.append(self.find_word('*ELSE'))
        word_list.append(0)

I’m trying to make the last_loc calculations more similar, only varying by a constant. I think I can avoid passing last_loc around, passing the adjustment constant instead.

    if word == 'IF':
        skip_adjustment = 1
        self.stack.push( ('*IF', len(word_list) + skip_adjustment))
        word_list.append(self.find_word('*IF'))
        word_list.append(0)
    elif word == 'THEN':
        skip_adjustment = -1
        key, patch_loc = self.stack.pop()
        if key != '*IF' and key != '*ELSE':
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        skip_adjustment = 1
        key, patch_loc = self.stack.pop()
        if key !='*IF':
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc

        self.stack.push( ('*ELSE', len(word_list) + skip_adjustment))
        word_list.append(self.find_word('*ELSE'))
        word_list.append(0)

Extract the word being compiled:

    if word == 'IF':
        skip_adjustment = 1
        word_to_compile = '*IF'
        self.stack.push((word_to_compile, len(word_list) + skip_adjustment))
        word_list.append(self.find_word(word_to_compile))
        word_list.append(0)
    elif word == 'THEN':
        ...
    elif word == 'ELSE':
        skip_adjustment = 1
        ...
        word_to_compile = '*ELSE'
        self.stack.push((word_to_compile, len(word_list) + skip_adjustment))
        word_list.append(self.find_word(word_to_compile))
        word_list.append(0)

Extract method, then inline the word. (I just noticed both cases have skip = 1, we’ll deal with that shortly.)

    if word == 'IF':
        skip_adjustment = 1
        self.compile_conditional('*IF', skip_adjustment, word_list)
    elif word == 'THEN':
        skip_adjustment = -1
        key, patch_loc = self.stack.pop()
        if key != '*IF' and key != '*ELSE':
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        skip_adjustment = 1
        key, patch_loc = self.stack.pop()
        if key !='*IF':
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc

        self.compile_conditional('*ELSE', skip_adjustment, word_list)

Change the method and its signature:

    def compile_conditional(self, word_to_compile, word_list):
        self.stack.push((word_to_compile, len(word_list) + 1))
        word_list.append(self.find_word(word_to_compile))
        word_list.append(0)

Remove skip_adjustment from IF but not the other two:

    if word == 'IF':
        self.compile_conditional('*IF', word_list)
    elif word == 'THEN':
        skip_adjustment = -1
        key, patch_loc = self.stack.pop()
        if key != '*IF' and key != '*ELSE':
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        skip_adjustment = 1
        key, patch_loc = self.stack.pop()
        if key !='*IF':
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc

        self.compile_conditional('*ELSE', word_list)

Change the error check to expect a list and provide it:

    elif word == 'THEN':
        skip_adjustment = -1
        expected = ['*IF', '*ELSE']
        key, patch_loc = self.stack.pop()
        if key not in expected:
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        skip_adjustment = 1
        expected = ['*IF']
        key, patch_loc = self.stack.pop()
        if key not in expected:
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc

        self.compile_conditional('*ELSE', word_list)

Now even PyCharm sees the duplication. Extract method. Inline the temps back into the parameter lists:

    def compile_word_list(self, rest):
        word_list = []
        for word in rest:
            if word == 'IF':
                self.compile_conditional('*IF', word_list)
            elif word == 'THEN':
                self.patch_the_skip(['*IF', '*ELSE'], -1, word_list)
            elif word == 'ELSE':
                self.patch_the_skip(['*IF'], 1, word_list)
                self.compile_conditional('*ELSE', word_list)
            elif (definition := self.find_word(word)) is not None:
                word_list.append(definition)
            elif (num := self.compile_number(word)) is not None:
                definition = self.find_word('*#')
                word_list.append(definition)
                word_list.append(num)
            else:
                raise SyntaxError(f'Syntax error: "{word}" unrecognized')
        return word_list

    def patch_the_skip(self, expected, skip_adjustment, word_list):
        key, patch_loc = self.stack.pop()
        if key not in expected:
            raise SyntaxError(f'malformed IF-ELSE-THEN, found: "{key}"')
        last_loc = len(word_list) + skip_adjustment
        word_list[patch_loc] = last_loc - patch_loc

    def compile_conditional(self, word_to_compile, word_list):
        self.stack.push((word_to_compile, len(word_list) + 1))
        word_list.append(self.find_word(word_to_compile))
        word_list.append(0)

That’s what I’m talkin about! commit: finally, a refactoring that I like!

There are goodies in the stockings and a breakfast to be had. Let’s sum up.

Summary

Our ELSE code worked on the second try. The error was an off-by one.

I then wanted to clean up the code and reduce the duplication in the cases. It took three tries, two ending in a rollback. Each time through I learned about something that would help for next time.

The pattern here, which I first learned from Kent Beck, is to make code that seems similar look more and more similar, until it’s clear how to extract the duplicate code into separate methods. It can take a few tries, as it did here.

The first attempt was actually very close to what we wound up with, and possibly we could have refactored from there to the better place. But I wasn’t seeing it and I wasn’t feeling it. Better to roll back and try again.

We have to ask ourselves: was it worth this much effort? When we look at the starting point and ending point of the compile method, I think the answer is clear, at least to me:

Before:

    if word == 'IF':
        word_list.append(self.find_word('*IF'))
        self.stack.push( ('IF', len(word_list)))
        word_list.append(0)
    elif word == 'THEN':
        last_loc = len(word_list) - 1
        key, patch_loc = self.stack.pop()
        if key != 'IF' and key != 'ELSE':
            raise SyntaxError(f'THEN did not find matching if, found: "{key}"')
        word_list[patch_loc] = last_loc - patch_loc
    elif word == 'ELSE':
        word_list.append(self.find_word('*ELSE'))
        last_loc = len(word_list)
        key, patch_loc = self.stack.pop()
        if key !='IF':
            raise SyntaxError(f'ELSE did not find matching if, found: "{key}"')
        word_list[patch_loc] = last_loc - patch_loc
        self.stack.push( ('ELSE', len(word_list)))
        word_list.append(0)

After:

    if word == 'IF':
        self.compile_conditional('*IF', word_list)
    elif word == 'THEN':
        self.patch_the_skip(['*IF', '*ELSE'], -1, word_list)
    elif word == 'ELSE':
        self.patch_the_skip(['*IF'], 1, word_list)
        self.compile_conditional('*ELSE', word_list)
    elif (definition := self.find_word(word)) is not None:
        word_list.append(definition)

There is possibly even some way to remove the +1 -1 parameter from patch_the_skip, but I’m not seeing it just now.

And it wasn’t really all that much effort. Listing all the changes, and talking about them here makes it look like this was a lot of work. But almost every step was handled by PyCharm, with a few little edits done by hand, plus some judicious swapping of lines up and down. At every point, the tests ran on auto-run, so I was made immediately aware if I had moved the wrong line or done something else wrong.

I am happy with the result, and look forward to seeing what happens when we do some other patching type words such as the looping construct. I hope you’ll join me!