The Repo on GitHub

Ready for the next step in implementing IF-THEN. Is this a debacle, or a successful spike? Very long. Somewhat confused.

We have a rudimentary implementation of *IF, the run-time part of IF that pops and checks the flag and then does or does not branch around the code. When it skips, it always skips 2 cells:

    def star_if(self):
        flag = self.stack.pop()
        if flag == 0:
            self.active_word().skip(2)

Hey, it’s enough to pass the test. Them’s the rules.

What we need now includes:

  • A flag in a Word to indicate that it is an “immediate” word, that is, instead of being compiled in when encountered in text, it is executed immediately. We’ll set that in IF and THEN.
  • A parameter on every Word, initialized to None. Wasteful but we have memory to burn.
  • IF will compile a *IF into the current word being compiled, and will save its index on a stack.
  • THEN will pop that stack, compute the number of words compiled since then, and put that value into the *IF’s parameter.
  • *IF will use its parameter to decide how far to skip.

That’s quite a few changes but fortunately most of them can be made without harming anything. The “real” question is what to test and how to test it. GeePaw Hill speaks of testing “branching logic”. And we will do sufficient tests to ensure that this all works, so maybe it’s OK to put in the flag and the parameter directly?

I’m going to do that. For now, only a PrimaryWord can be immediate. If we want a Secondary to be immediate, we’ll deal with it then.

class PrimaryWord:
    def __init__(self, name, code, immediate=False):
        self.name = name
        self.code = code
        self.immediate = immediate
        self.parameter = None

Green, of course.

Let’s set up *IF to use the parameter. In our test:

    def test_star_if(self):
        f = Forth()
        s = ': TEST *IF DUP + ;'
        f.compile(s)
        star_if = f.find_word('*IF')
        star_if.parameter = 1
        f.stack.extend([2, 0])
        test_word = f.find_word('TEST')
        test_word.do(f)
        assert f.stack.pop() == 2
        f.stack.extend([2, 1])
        test_word.do(f)
        assert f.stack.pop() == 4

I’ve set the parameter to 1. It should be 2. So, when we change *IF to use it, we should fail the test. And—yikes:

class Forth:
    def star_if(self):
        flag = self.stack.pop()
        if flag == 0:
            self.active_word().skip(2)

The design starts crumbling at this point.

This code, like all the code, is in the Forth instance. We need the parameter from the PrimaryWord. We do not have that. OK, we might be OK, where it its definition?

        lex.append(PrimaryWord('*IF', lambda f: f.star_if()))

We have no way to write code in that constructor that accesses the instance.

Are we missing a critical connection? Are we in big trouble, guys? We might be. However … let’s see .. active_word is the (secondary) word we are now executing. That word contains the *IF we care about and we are executing it now. Therefore active_word.ix is the index of the word we’re executing. Therefore … maybe …

No. That won’t do. It’s possible that I could find the *IF, but it’s unbearably messy.

Can we conclude and decide right here and now that a PrimaryWord needs access, not just to the forth instance (to get the stack) but also to itself, to get the parameter? We could just change the way they’re called. One thing seems clear: the definition of *IF does not belong in Forth class. It’s a PrimaryWord and its definition should be in its own class.

Design Rethinking

Unless I’m missing some obvious fact, the design we have does not hold water. Specifically, we need to associate a parameter with some words, and the word’s code needs to be able to see that parameter.

As things stand, PrimaryWord instances hold code that looks like this:

        lex.append(PrimaryWord('*IF', lambda f: f.star_if()))
        lex.append(PrimaryWord('DROP', lambda f: f.stack.pop()))
        lex.append(PrimaryWord('DUP', lambda f: f.stack.dup()))
        lex.append(PrimaryWord('OVER', lambda f: f.stack.over()))
        lex.append(PrimaryWord('ROT', lambda f: f.stack.rot()))
        lex.append(PrimaryWord('SWAP', lambda f: f.stack.swap()))

They access all their information from the Forth instance f, which made some sense since mostly they just need access to the stack. *IF is special, the first of its kind.

What options have we? We can easily pass another parameter to our words. But wait. The do method is on the Primary:

class PrimaryWord:
    def do(self, forth):
        self.code(forth)

The code receiving that call is in fact the lambda above. So what if we pass the parameter at this point?

Let’s try that.

    def do(self, forth):
        self.code(forth, self.parameter)

Tests break. Now we need to change the lambdas for all the primaries:

    def define_primaries(self):
        lex = self.lexicon
        lex.append(PrimaryWord('*IF', lambda f, p: f.star_if(p)))
        lex.append(PrimaryWord('DROP', lambda f, p: f.stack.pop()))
        lex.append(PrimaryWord('DUP', lambda f, p: f.stack.dup()))
        lex.append(PrimaryWord('OVER', lambda f, p: f.stack.over()))
        lex.append(PrimaryWord('ROT', lambda f, p: f.stack.rot()))
        lex.append(PrimaryWord('SWAP', lambda f, p: f.stack.swap()))
        lex.append(PrimaryWord('+', lambda f, p: f.stack.push(f.stack.pop() + f.stack.pop())))
        lex.append(PrimaryWord('-', lambda f, p: f.stack.push(f.stack.swap_pop() - f.stack.pop())))
        lex.append(PrimaryWord('*', lambda f, p: f.stack.push(f.stack.pop() * f.stack.pop())))
        lex.append(PrimaryWord('/', lambda f, p: f.stack.push(f.stack.swap_pop() / f.stack.pop())))
        lex.append(PrimaryWord('SQRT', lambda f, p: f.stack.push(math.sqrt(f.stack.pop()))))

One test breaks, because I sent the parameter to star_if. Change it:

    def star_if(self, parameter):
        flag = self.stack.pop()
        if flag == 0:
            self.active_word().skip(parameter)

Still breaks but I am hoping for the right reason:

    def pop(self):
>       return self.stack.pop()
E       IndexError: pop from empty list

Probably correct. Change the test:

    def test_star_if(self):
        f = Forth()
        s = ': TEST *IF DUP + ;'
        f.compile(s)
        star_if = f.find_word('*IF')
        star_if.parameter = 2   # 2 is the correct value
        f.stack.extend([2, 0])
        test_word = f.find_word('TEST')
        test_word.do(f)
        assert f.stack.pop() == 2
        f.stack.extend([2, 1])
        test_word.do(f)
        assert f.stack.pop() == 4

Test is green. We need to reflect.

Reflection

We solved one issue fairly easily, passing the primitive’s parameter to the code. I am not convinced that the idea, simple though it is, holds water.

It might be better to pass the word itself to the code lambda. And we still haven’t really seen whether we can correctly compile this stuff. Let’s think about “real” Forth a moment.

In a 1970s-style Forth, everything is in one assembly program (or even worse, hand-assembled machine code program, but generally one had an assembler). So everything was global and available. And you could get your current address in the code and look four cells ahead to find your parameter if you needed to. Ah, those were the days. And the words each contained code that you basically just branched into as things happened.

We’re trying to implement a forth-like language using a more modern technology, and we’d like to wind up with a decent modern design.

Pause
Right here in the middle of this design reflection, I want to try passing the word itself to the lambda, not the parameter. And I might pass more than that, if need be. For now, we’ll pass the word because that’s more general than passing the parameter.
class PrimaryWord:
    def do(self, forth):
        self.code(forth, self)

And in the definitions:

    def define_primaries(self):
        lex = self.lexicon
        lex.append(PrimaryWord('*IF', lambda f, pw: f.star_if(pw.parameter)))
        lex.append(PrimaryWord('DROP', lambda f, pw: f.stack.pop()))
        lex.append(PrimaryWord('DUP', lambda f, pw: f.stack.dup()))
        lex.append(PrimaryWord('OVER', lambda f, pw: f.stack.over()))
        lex.append(PrimaryWord('ROT', lambda f, pw: f.stack.rot()))
        lex.append(PrimaryWord('SWAP', lambda f, pw: f.stack.swap()))
        lex.append(PrimaryWord('+', lambda f, pw: f.stack.push(f.stack.pop() + f.stack.pop())))
        lex.append(PrimaryWord('-', lambda f, pw: f.stack.push(f.stack.swap_pop() - f.stack.pop())))
        lex.append(PrimaryWord('*', lambda f, pw: f.stack.push(f.stack.pop() * f.stack.pop())))
        lex.append(PrimaryWord('/', lambda f, pw: f.stack.push(f.stack.swap_pop() / f.stack.pop())))
        lex.append(PrimaryWord('SQRT', lambda f, pw: f.stack.push(math.sqrt(f.stack.pop()))))

OK, I feel better about that. I have not committed yet this morning, however, and we’ll decide on that later. Back to reflecting:

I’m not convinced that this is quite the right design. I think there is a chance that it is quite wrong. In particular, I do not like the fact that the star_if code is in the Forth instance, since it is a word definition. Let’s move it:

    def define_primaries(self):
        lex = self.lexicon
        lex.append(PrimaryWord('*IF', lambda f, pw: pw.star_if(f)))

That fails, of course, but we can put star_if on PrimaryWord, just so:

class PrimaryWord:
    def star_if(self, forth):
        flag = forth.stack.pop()
        if flag == 0:
            forth.active_word().skip(self.parameter)

Green. That makes more sense I think.

Decision …

I think we’ll push on. I’m concerned, though, that it’s going to take more than this morning to sort out whether this IF-THEN will work or not. We need to do the immediate operator thing. That’s in the compiler:

class Forth:
    def compile(self, text):
        # why don't we just store the word in the list, it's no larger than the index
        words = text.split()
        match words:
            case ':', defining, *rest, ';':
                word_list = [ix for word in rest if (ix := self.find_word_index(word)) is not None]
                self.lexicon.append(SecondaryWord(defining, word_list))
            case _:
                raise SyntaxError(f'Syntax error: "{text}". Missing : or ;?')

The work is all done in that first case. Prior to there being immediate words, we just look up the words in the lexicon and string their indices together to make up the definition. With immediates in play, we need to look at each word before we do that.

Extract method:

    def compile(self, text):
        # why don't we just store the word in the list, it's no larger than the index
        words = text.split()
        match words:
            case ':', defining, *rest, ';':
                self.compile_word_list(defining, rest)
            case _:
                raise SyntaxError(f'Syntax error: "{text}". Missing : or ;?')

    def compile_word_list(self, defining, rest):
        word_list = [ix for word in rest if (ix := self.find_word_index(word)) is not None]
        self.lexicon.append(SecondaryWord(defining, word_list))

Now we have a place to stand. Expand this code into an actual loop. I think we’ll want the word and the index.

    def compile_word_list(self, defining, rest):
        word_list = []
        for ix, word in enumerate(rest):
            if self.find_word(word) is not None:
                word_list.append(ix)
        self.lexicon.append(SecondaryWord(defining, word_list))

Tests fail. Revert that, do again. Right. ix is the result of the find, not the index of the word in the original list.

    def compile_word_list(self, defining, rest):
        word_list = []
        for word in rest:
            ix = self.find_word_index(word)
            if ix is not None:
                word_list.append(ix)
        self.lexicon.append(SecondaryWord(defining, word_list))
Note
I could have looked at the tests to see what they were saying. Instead I just flushed the change and did it again, more thoughtfully. Got it right. I recall the first time I saw Kent Beck do that. He was programming in front of a class, on the screen. We all saw the bug he typed. He didn’t listen, he just flushed the code and did it again right. We were enlightened.

OK, now what I really want to do is look at the word to see if it is immediate and if it is, I guess we want to call its definition right then.

We need a test.

    def test_compile_if(self):
        f = Forth()
        s = ': TEST IF DUP + ;'
        f.compile(s)
        test_word = f.find_word('TEST')
        assert test_word.code[0] == 0

Honestly I’m not sure about that assert. This is a sign that I need a break. I’ll push on a bit further: I think I’ve got this.

    def compile_word_list(self, defining, rest):
        word_list = []
        for word in rest:
            ix = self.find_word_index(word)
            if ix is not None:
                word = self.lexicon[ix]
                if not word.immediate:
                    word_list.append(ix)
        self.lexicon.append(SecondaryWord(defining, word_list))

Getting ugly but so far so good. I had to implement immediate as False on SecondaryWord to get this to fail only our current test. No surprise.

    def compile_word_list(self, defining, rest):
        word_list = []
        for word in rest:
            ix = self.find_word_index(word)
            if ix is not None:
                word = self.lexicon[ix]
                if not word.immediate:
                    word_list.append(ix)
                else:
                    match word:
                        case 'IF':
                            pass
                        case _:
                            raise SyntaxError(f'Syntax error: "{word}" is unknown. Missing : or ;?')
        self.lexicon.append(SecondaryWord(defining, word_list))

Still failing, of course. Oh right:

>       raise ValueError(f'cannot find word "{word}"')
E       ValueError: cannot find word "IF"

We don’t have code for IF yet. This is getting out of hand. I would be wise to roll back. I’ll push a bit further. It comes to mind that we don’t need this case, we will just call the word.

I am committed to rolling this entire morning back, but I want to learn a bit more first.

        lex.append(PrimaryWord('IF', lambda f, pw: pw.if_immediate(f)))
>       assert test_word.code[0] == 0
E       AttributeError: 'SecondaryWord' object has no attribute 'code'

Right, that should be word_indices. Now it fails:

tests/test_compile.py:83 (TestCompile.test_compile_if)
6 != 0

Six is the index of the IF primary. How did that get in there? I have not marked it immediate.

        lex.append(PrimaryWord(
            'IF', 
            lambda f, pw: pw.if_immediate(f), 
            immediate=True))

Now I get the error I expected. And:

    def compile_word_list(self, defining, rest):
        word_list = []
        for word in rest:
            ix = self.find_word_index(word)
            if ix is not None:
                word = self.lexicon[ix]
                print(f'{ix=} {word=} {word.name=}')
                if not word.immediate:
                    word_list.append(ix)
                else:
                    word.do(self)
        self.lexicon.append(SecondaryWord(defining, word_list))

Now I’m getting this, as intended:

tests/forth.py:57: in compile_word_list
    word.do(self)
tests/word.py:11: in do
    self.code(forth, self)

>   lex.append(PrimaryWord('IF', lambda f, pw: pw.if_immediate(f), immediate=True))
E   AttributeError: 'PrimaryWord' object has no attribute 'if_immediate'. Did you mean: 'immediate'?

I would like to know whether this is going to be possible, even though I feel that I’ve just about lost the thread. Let’s see if we can make this method on PW insert an if:

    def if_immediate(self, forth):
        star_if = forth.find_word('*IF')
        forth.active_word().add_to_list(star_if)

The add_to_list method does not exist and it almost cannot, because we are building the list inside a method:

    def compile_word_list(self, defining, rest):
        word_list = []
        for word in rest:
            ix = self.find_word_index(word)
            if ix is not None:
                word = self.lexicon[ix]
                print(f'{ix=} {word=} {word.name=}')
                if not word.immediate:
                    word_list.append(ix)
                else:
                    word.do(self)
        self.lexicon.append(SecondaryWord(defining, word_list))

But we could make it a member …

    def compile_word_list(self, defining, rest):
        self.word_list = []
        for word in rest:
            ix = self.find_word_index(word)
            if ix is not None:
                word = self.lexicon[ix]
                print(f'{ix=} {word=} {word.name=}')
                if not word.immediate:
                    self.add_to_word_list(ix)
                else:
                    word.do(self)
        self.lexicon.append(SecondaryWord(defining, self.word_list))

    def add_to_word_list(self, ix):
        self.word_list.append(ix)

Note that I need the index not the word, so:

    def if_immediate(self, forth):
        star_if_ix = forth.find_word_index('*IF')
        forth.active_word().add_to_list(star_if_ix)

But the test is not green. I actually thought it might be.

    def active_word(self):
>       return self.active_words[-1]
E       IndexError: list index out of range

Hm, we have no active word … when do we get that?

Ah … active_word is a run-time thing, not a compile-time thing. But the active word list is right there in the Forth instance. So what if we did this:

    def if_immediate(self, forth):
        star_if_ix = forth.find_word_index('*IF')
        forth.add_to_word_list(star_if_ix)

The test would run, that’s what.

I’ve lost the thread and my compass is spinning and the map is wet. But the test has inserted a *IF where it should. Let’s sum up.

Summary

I’m two hours in and my brain is tired. And I think the IF has correctly installed a *IF in the compiled word, which is what it’s supposed to do,

I’ve been feeling increasingly bad about the design since about line 88 and this is line 500. But each time something seemed impossible, a simple and sensible-seeming change got us through. Let me review the diff, since I have neither committed nor rolled back this morning.

The first thing I notice is that I failed to commit last time, and there is code in here that I really want to retain.

After that …

  1. Changed the lambdas in PrimaryWords to receive the forth instance and the word itself.
  2. Added immediate flag to both Words, and a parameter cell to PrimaryWord.
  3. Moved star_if to PrimaryWord, which makes sense.
  4. Changed compile so that Forth has a word-list member, which seems a bit off.
  5. Implemented if_immediate on PrimaryWord, to add a *IF to the word list in Forth. This seems a bit iffy to me, no pun intended.

That’s really all that has been done. I do think that publishing the word list in Forth so that Words can get at it might be a bit odd. That is a consequence of putting the code for the immediate into its lambda, and calling a method on PrimaryWord for the immediate. If we were to implement the immediate to call Forth, which is, after all, doing the compiling, access to the word list would make more sense, since it would be a local method in Forth. We had a match/case for a while that could do that, then decided to put the code in Primary.

I wouldn’t go so far as to say that this is right, but it does work and can possibly be made right.

I recall a quote in the Loeliger book: “If all of this sounds confusing, don’t panic — it is”. The relationship between the words and the compiler in Forth is intricate. It’s called Threaded Code for a reason, and so it is perhaps OK if what we have here is a bit confusing as well.

Perhaps. I’m going to let this ride for the morning, think a bit, maybe even draw a picture. I recall one thing that is interesting about compilation in Loeliger: the compiler opens a new word definition in the lexicon and leaves it open, appending things to it. In our current scheme, we have a word list that’s open, but the lexicon entry is not open, it’s created at the last minute.

But it is confusing … and I’m not going to panic yet. We have a spike open in the editor. We’ll let it live, for now.

See you next time!