Forth: Numbers

The Repo on GitHub

I haven’t forgotten about IF. It’s just that tests would be a lot easier if I had literal numbers. This goes really well.

Yesterday, I made the secret word ‘*#’ work. It consumes the cell after its location in the word, and pushes whatever it finds there onto the stack. Typically we expect a number. Since it has consumed the number, execution continues 2 cells after the *# rather than the usual one.
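
The behavior described above can be sketched in a few lines. This is a minimal illustration, not the code in the repo: the `run` function, the `primitives` dictionary, and the choice to store word names rather than lexicon indices in the cell list are all assumptions made to keep the example self-contained.

```python
# Minimal sketch of a literal-consuming word. Names here are hypothetical;
# the real implementation stores lexicon indices and uses Word classes.

def run(cells, primitives):
    """Execute a compiled word, represented as a flat list of cells."""
    stack = []
    pc = 0  # index of the cell currently being executed
    while pc < len(cells):
        cell = cells[pc]
        if cell == '*#':
            stack.append(cells[pc + 1])  # push the following cell as data
            pc += 2                      # continue 2 cells on, past the literal
        else:
            primitives[cell](stack)      # ordinary word: execute it
            pc += 1                      # advance the usual one cell
    return stack


def add(stack):
    stack.append(stack.pop() + stack.pop())


primitives = {'+': add}
print(run(['*#', 3, '*#', 4, '+'], primitives))  # [7]
```

The point to notice is that the cell after `*#` is never executed. It is treated purely as data, which is why the program counter has to step over it.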

Today, the plan is to cause the compiler to recognize numbers and compile them into the word. This will go quite easily, I think, except that surely I’ve just jinxed it. This effort will be aided by the fact that I’ve done much of it before, but it got deleted in the last rollback.

Numeric recognition needs to take place in here:

class Forth:
    def compile(self, text):
        # why don't we just store the word in the list, it's no larger than the index
        words = text.split()
        match words:
            case ':', defining, *rest, ';':
                word_list = [ix for word in rest if (ix := self.find_word_index(word)) is not None]
                self.lexicon.append(SecondaryWord(defining, word_list))
            case _:
                raise SyntaxError(f'Syntax error: "{text}". Missing : or ;?')

Note: I do see that comment to myself about what goes in the word list. Perhaps we’ll deal with that at another time. Here our purpose is to extend what we have, not change it.

In particular, it has to take place in the first line of the first case, which just scarfs all the words out of the input line, looks them up in the lexicon, and stuffs their indices into the Word. We extract a method to give ourselves a place to stand:

    def compile(self, text):
        # why don't we just store the word in the list, it's no larger than the index
        words = text.split()
        match words:
            case ':', defining, *rest, ';':
                word_list = self.compile_word_list(rest)
                self.lexicon.append(SecondaryWord(defining, word_list))
            case _:
                raise SyntaxError(f'Syntax error: "{text}". Missing : or ;?')

    def compile_word_list(self, rest):
        word_list = [ix for word in rest if (ix := self.find_word_index(word)) is not None]
        return word_list

Now we need to unwind that one-liner into a loop that creates one item at a time. Amazingly, PyCharm offers “Convert comprehensions to for loop” and gives us this:

    def compile_word_list(self, rest):
        word_list = []
        for word in rest:
            if (ix := self.find_word_index(word)) is not None:
                word_list.append(ix)
        return word_list

So far I’ve only done the thinking, and PyCharm has done all the work. And, remember, this is without the cursed “AI” turned on. Remind me to rant about “AI” some time real soon now.

OK, so if find_word_index returns None, it might be a number. I think I have to code something now.

    def compile_word_list(self, rest):
        word_list = []
        for word in rest:
            if (ix := self.find_word_index(word)) is not None:
                word_list.append(ix)
            elif (num := self.compile_number(word)) is not None:
                ix = self.find_word_index('*#')
                word_list.append(ix)
                word_list.append(num)
            else:
                raise SyntaxError(f'Syntax error: "{word}" unrecognized')
        return word_list

Tests are failing for want of that compile_number method. And I suddenly realize that in my excitement I forgot to write a test. First a null method:

    def compile_number(self, word):
        return None

Right. Tests are green. Let’s write a test for numbers:

    def test_lit_compiled(self):
        f = Forth()
        s = ': TEST 3 4 + ;'
        f.compile(s)
        test_word = f.find_word('TEST')
        test_word.do(f)
        assert f.stack.pop() == 7

This fails. I expect the syntax error. I get an error I did not expect:

    def find_word_index(self, word):
        lex = self.lexicon
        for i in range(len(lex)):
            if lex[i].name == word:
                return i
>       raise ValueError(f'cannot find word "{word}"')
E       ValueError: cannot find word "3"

Right. We should return None from that now.

    def find_word_index(self, word):
        lex = self.lexicon
        for i in range(len(lex)):
            if lex[i].name == word:
                return i
        return None

Now do I get my expected error?

            else:
>               raise SyntaxError(f'Syntax error: "{word}" unrecognized')
E               SyntaxError: Syntax error: "3" unrecognized

Perfect. However, now another test is breaking, one that expected that ValueError. We’ll deal with it second, sticking to our current path just now.

We need to make compile_number somewhat more robust. For now, we’ll just accept integers.

    def compile_number(self, word):
        try:
            num = int(word)
            return num
        except ValueError:
            return None
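
It may help to see what "just accept integers" means in practice. The sketch below is a stand-alone copy of the method (written as a plain function, which is an assumption made for easy experimenting) run against a handful of tokens. The token list is made up for illustration.

```python
def compile_number(word):
    """Stand-alone copy of the method above, for experimenting."""
    try:
        return int(word)
    except ValueError:
        return None


# Tokens that Python's int() accepts come back as integers; the rest as None.
for token in ['3', '-17', '+4', '1_000', '3.5', '0x10', 'DUP']:
    print(repr(token), '->', compile_number(token))

# '3'     -> 3
# '-17'   -> -17
# '+4'    -> 4
# '1_000' -> 1000   (int() allows underscores between digits)
# '3.5'   -> None   (no floats yet)
# '0x10'  -> None   (base 10 only, so no hex prefix)
# 'DUP'   -> None
```

Because compile_word_list tries the lexicon lookup first, a bare `+` is compiled as the addition word and never reaches int(). By the same logic, a word that happened to be defined with a numeric name would take precedence over the number.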

Our 3 + 4 = 7 test runs. Numbers work. Amazing, isn’t it? We have this other test that expected a specific error. Let’s fix it up:

    def test_undefined_word(self):
        f = Forth()
        s = ': SQUARE DUMB + ;'
        with pytest.raises(ValueError) as e:
            f.compile(s)
        assert str(e.value) == 'cannot find word "DUMB"'

That is surely failing, liking neither the type of error nor the message. After I change it to expect SyntaxError, it still complains, as expected:

Expected :'cannot find word "DUMB"'
Actual   :'Syntax error: "DUMB" unrecognized'

Paste the correct message. Test runs. Green. Commit: now compiling integers correctly.

As it is Sunday and there is bacon in the offing, let’s do a reflective summary and call it a morning.

Summary

This really went swimmingly. The only actual mistake that I made, not mentioned above, was that I left the semicolon off my test expression the first time I typed it in.

I find, reflecting, that the word “DUMB” is somewhat offensive. Change it.

    def test_undefined_word(self):
        f = Forth()
        s = ': SQUARE UNKNOWN_WORD + ;'
        with pytest.raises(SyntaxError) as e:
            f.compile(s)
        assert str(e.value) == 'Syntax error: "UNKNOWN_WORD" unrecognized'

Much better. Sincere apologies to anyone who was offended by the prior word. I never wish to offend unintentionally.

The changes we made were simple and very much aided by PyCharm. It went about like this:

Extract the method that parsed all the words; (PyCharm)
Convert the comprehension to a for; (PyCharm)
Add elif calling compile_number;
Add else raising exception; (Could have deferred this.)
Add empty compile_number;
Suddenly remember to write a test;
Fill in compile_number;
Fix up old test to expect new error message.

Along the way I thought of something that I think would be useful. If the compile method were to return the word, tests would be simpler. Like this:

    def compile(self, text):
        # why don't we just store the word in the list, it's no larger than the index
        words = text.split()
        match words:
            case ':', defining, *rest, ';':
                word_list = self.compile_word_list(rest)
                word = SecondaryWord(defining, word_list)
                self.lexicon.append(word)
                return word
            case _:
                raise SyntaxError(f'Syntax error: "{text}". Missing : or ;?')

Then given this test, for example:

    def test_lit_compiled(self):
        f = Forth()
        s = ': TEST 3 4 + ;'
        f.compile(s)
        test_word = f.find_word('TEST')
        test_word.do(f)
        assert f.stack.pop() == 7

We can do this:

    def test_lit_compiled(self):
        f = Forth()
        s = ': TEST 3 4 + ;'
        test_word = f.compile(s)
        test_word.do(f)
        assert f.stack.pop() == 7

Or, even this:

    def test_lit_compiled(self):
        f = Forth()
        s = ': TEST 3 4 + ;'
        f.compile(s).do(f)
        assert f.stack.pop() == 7

We could even inline the text like this:

    def test_lit_compiled(self):
        f = Forth()
        f.compile(': TEST 3 4 + ;').do(f)
        assert f.stack.pop() == 7

I rather like that. We’ll commit it: compile returns the word compiled, useful in testing.

So. We have numbers, and it went well. We have blessed the notion of putting a value into the word that is not executed but is instead consumed by the word ahead of it. Given the constraints of Python, I think this is a pretty good way of doing what we needed.
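
To make the layout concrete, here is a stand-alone sketch of what ends up in a word list. The three-entry `LEXICON_NAMES` list and the module-level functions are assumptions for illustration. The real lexicon holds Word objects and has different indices.

```python
# Hypothetical miniature lexicon: only the names and their positions matter.
LEXICON_NAMES = ['+', 'DUP', '*#']


def find_word_index(word):
    for i, name in enumerate(LEXICON_NAMES):
        if name == word:
            return i
    return None


def compile_number(word):
    try:
        return int(word)
    except ValueError:
        return None


def compile_word_list(words):
    word_list = []
    for word in words:
        if (ix := find_word_index(word)) is not None:
            word_list.append(ix)                       # ordinary word: its index
        elif (num := compile_number(word)) is not None:
            word_list.append(find_word_index('*#'))    # the literal word...
            word_list.append(num)                      # ...then the value it consumes
        else:
            raise SyntaxError(f'Syntax error: "{word}" unrecognized')
    return word_list


print(compile_word_list('3 4 +'.split()))    # [2, 3, 2, 4, 0]
print(compile_word_list('2 DUP +'.split()))  # [2, 2, 1, 0]
```

In the second result the first 2 is the index of `*#` and the second 2 is the literal value. Nothing in the cell itself says which is which. Only its position after `*#` marks it as data, which is exactly the convention being blessed here.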

We still have IF to do, and now we have at least two ways it might work: IF could leave a cell in the word to be patched when THEN is found, or, when THEN is found, it could patch the skip count into the *IF word’s parameter field. We’ll consider those options another time. For now, this smooth morning makes up for some of my thrashing of the previous few days.
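
As a rough illustration of the first option only, the sketch below leaves a placeholder cell after `*IF` and patches it when THEN arrives. Everything here is an assumption and not a decided design: the function name `compile_if_then`, the use of word names instead of lexicon indices, and the meaning of the patched value (the number of cells to skip when the condition is false).

```python
def compile_if_then(words):
    """Compile tokens into a flat cell list, resolving IF ... THEN pairs."""
    cells = []
    pending = []  # compile-time stack of placeholder positions awaiting THEN
    for word in words:
        if word == 'IF':
            cells.append('*IF')
            pending.append(len(cells))  # position of the cell to patch later
            cells.append(None)          # placeholder for the skip count
        elif word == 'THEN':
            if not pending:
                raise SyntaxError('THEN without IF')
            slot = pending.pop()
            # Count the cells between the placeholder and the current end.
            cells[slot] = len(cells) - (slot + 1)
        else:
            cells.append(word)          # everything else compiles as itself
    if pending:
        raise SyntaxError('IF without THEN')
    return cells


print(compile_if_then('DUP IF DROP SWAP THEN +'.split()))
# ['DUP', '*IF', 2, 'DROP', 'SWAP', '+']
```

Using a stack for the pending placeholders means nested IF ... THEN pairs resolve innermost first. The placeholder cell is the same trick as the literal after `*#`: a value in the word that is consumed rather than executed.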

Is there a lesson for me in this? I think so. It might go something like this:

When things go well, it is a tribute to the quality of the design and code near where the changes are made. Yes, it’s also a tribute to one’s powerful brain, but when that powerful brain is thrashing, or near thrashing, it’s because the design and code, and the imagined new design and code, are not yet clear, clean, and good. Small steps, experiments, and a willingness to throw away code that isn’t right are valuable in these difficult times.

OK, that doesn’t go nicely into seven succinct and poetic words, but that’s what I’ve got. I thrash when I don’t know what I’m doing, and when I don’t know what I’m doing, experiments and throwing away code generally pay off.

See you next time!