Further Review

The Robot World Repo on GitHub
The Forth Repo on GitHub

Well, our creation of Heap yesterday went smoothly, but it was all we did. Let’s review the Forth code a bit more and see what it might be wishing for.

I do wish that I had remembered about the little Heap experiment. No harm done, but it is a bit embarrassing to have done an experiment and then not remember that I had done it. I did look it over and prefer the one we have now, because it does only the things we actually need, and it is more supportive of the things we want to do, such as allot space or fetch or store a value anywhere inside.

Anyway, what else do we see? One issue is the Lexicon. It was created to contain all the words and to facilitate creating the words that we have built so far. With few exceptions, we have created the fundamental words that one expects to find in Forth. We presently have 49 words defined: I’ll put them in a list at the end of the article just as a record.

Historical Note: It’s possible to build a Forth system based on only a handful of hand-crafted words, written in assembler (or binary), and to bootstrap the system from there. Many more words are then written, this time in Forth itself, based on the initial handful, and after a while there you are.

Where were we? Oh, right, the Lexicon. The file is now over 200 lines long, basically all construction methods, and its organization isn’t very useful to me. Here’s a bit of it:

    def define_primaries(self, forth):
        self.define_immediate_words(forth)
        self.define_stack_ops()
        self.define_skippers(forth)
        self.define_arithmetic()
        self.define_comparators()
        self.pw('SQRT', lambda f: f.stack.push(math.sqrt(f.stack.pop())))
        self.pw('.', lambda f: print(f.stack.pop(), end=' '))
        self.pw('CR', lambda f: print())
        forth.compile(': CONSTANT CREATE , DOES> @ ;')
        forth.compile(': VARIABLE CREATE ;')

    def define_immediate_words(self, forth):
        self._define_begin_until()
        self._define_colon_semi()
        self._define_create_does()
        self._define_do_loop()
        self._define_if_else_then()

    def _define_begin_until(self):
        def _begin(forth):
            forth.compile_stack.push(len(forth.word_list))

        def _until(forth):
            until = forth.find_word('*UNTIL')
            forth.word_list.append(until)
            jump_loc = forth.compile_stack.pop()
            forth.word_list.append(jump_loc - len(forth.word_list) - 1)

        self.pw('BEGIN', _begin, immediate=True)
        self.pw('UNTIL', _until, immediate=True)

I find that while the organization here makes some sense, it’s difficult to find a particular word’s definition when I’m looking for it. Well, I say “difficult”. Usually I click the lexicon tab and scroll around a bit looking for the word. Then, if I don’t find it, I use PyCharm’s Command-F find to search for the word’s name. That usually gets me close, but look at the code right above and suppose I was looking to see how BEGIN works.

If I search with case sensitivity on, I get to the BEGIN there at the bottom. In this case, everything I want is nearby, but in a longer _define method, I might still have to scroll up to find the _begin code. But if I search with case sensitivity off, I’ll get at least one extra hit before I find what I want.

No big deal, but it’s irritating, and almost everything we do involves working on the lexicon.

Additionally, up there in the define_primaries method, there are two secondaries being compiled, CONSTANT and VARIABLE. Those are not primaries. We could argue strongly that they deserve their own method.
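Here is a rough sketch of what that separation might look like. The method name define_secondaries and the FakeForth stand-in are my assumptions for illustration only; FakeForth just records what it is asked to compile, so the sketch runs without the real Forth class.

```python
class FakeForth:
    """Hypothetical stand-in that only records what it is asked to compile."""

    def __init__(self):
        self.compiled = []

    def compile(self, text):
        self.compiled.append(text)


class Lexicon:
    def define_primaries(self, forth):
        # Primaries only: words implemented directly in Python would go here.
        pass

    def define_secondaries(self, forth):
        # Secondaries: words defined in Forth, in terms of existing words.
        forth.compile(': CONSTANT CREATE , DOES> @ ;')
        forth.compile(': VARIABLE CREATE ;')


forth = FakeForth()
lexicon = Lexicon()
lexicon.define_primaries(forth)
lexicon.define_secondaries(forth)
print(len(forth.compiled))  # 2
```

The point is only that the two compile calls get a home of their own, so define_primaries can live up to its name.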

But wait, while I’m grumbling, there’s more. Check this out:

    def define_stack_ops(self):
        def _2dup(forth):
            top = forth.stack[-1]
            bot = forth.stack[-2]
            forth.stack.push(bot)
            forth.stack.push(top)

        def _at(forth):
            index = forth.stack.pop()
            forth.stack.push(forth.heap.at(index))

        def _put(forth):
            index = forth.stack.pop()
            value = forth.stack.pop()
            forth.heap.put(index, value)

        def _allot(forth):
            number_to_allot = forth.stack.pop()
            forth.heap.allot(number_to_allot)

        def _comma(forth):
            value_to_store = forth.stack.pop()
            forth.heap.comma(value_to_store)

        self.pw(',', _comma)
        self.pw('ALLOT', _allot)
        self.pw('@', _at)
        self.pw('!', _put)
        self.pw('2DUP', _2dup)
        self.pw('DROP', lambda f: f.stack.pop())
        self.pw('DUP', lambda f: f.stack.dup())
        self.pw('OVER', lambda f: f.stack.over())
        self.pw('ROT', lambda f: f.stack.rot())
        self.pw('SWAP', lambda f: f.stack.swap())
        self.pw('>R', lambda f: f.return_stack.push(f.stack.pop()))
        self.pw('R>', lambda f: f.stack.push(f.return_stack.pop()))
        self.pw('R@', lambda f: f.stack.push(f.return_stack.top()))

At some small loss in clarity, all of those other than 2DUP could be done with lambda. As they are very base-level operations, and unlikely to change, the compactness of the lambda form might be better. Let’s try it and see.

I’ll proceed by turning a given underbar method into a one-liner and then into a lambda. Like this:

        def _comma(forth):
            forth.heap.comma(forth.stack.pop())

That’s done with PyCharm’s inline refactoring, so it’s very likely to work, and anyway the tests run green.

Then it’s a simple edit to get this:

    self.pw(',', lambda forth: forth.heap.comma(forth.stack.pop()))

And a rename from forth to f, because that is our pattern.

    self.pw(',', lambda f: f.heap.comma(f.stack.pop()))

All green, all good. Commit? Sure, why not, it’s good practice. Commit: converting possible one-liners to lambdas.
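To convince ourselves that the two forms really are interchangeable, here is a small stand-alone check. The Heap and Forth classes below are stripped-down assumptions with just enough behavior to run the comma word; they are not the classes from the repo.

```python
class Heap:
    """Hypothetical minimal heap: comma appends a value."""

    def __init__(self):
        self.cells = []

    def comma(self, value):
        self.cells.append(value)


class Forth:
    """Hypothetical minimal Forth: a list for the stack, plus a heap."""

    def __init__(self):
        self.stack = []
        self.heap = Heap()


def _comma(forth):
    value_to_store = forth.stack.pop()
    forth.heap.comma(value_to_store)


# The same word in both styles: a named function and a lambda.
words = {
    'named': _comma,
    'lambda': lambda f: f.heap.comma(f.stack.pop()),
}

results = {}
for label, word in words.items():
    forth = Forth()
    forth.stack.append(7)
    word(forth)
    results[label] = (forth.stack, forth.heap.cells)

print(results['named'] == results['lambda'])  # True
```

Either callable can be handed to pw, since all a primary word needs is something that accepts the forth instance.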

Same for allot, a little more smoothly. Commit. And put. Commit. And at. Something went wrong, undo, do again, green, commit.

We’re left with this:

    def define_stack_ops(self):
        def _2dup(forth):
            top = forth.stack[-1]
            bot = forth.stack[-2]
            forth.stack.push(bot)
            forth.stack.push(top)

        self.pw('2DUP', _2dup)
        self.pw(',', lambda f: f.heap.comma(f.stack.pop()))
        self.pw('ALLOT', lambda f: f.heap.allot(f.stack.pop()))
        self.pw('@', lambda f: f.stack.push(f.heap.at(f.stack.pop())))
        self.pw('!', lambda f: f.heap.put(f.stack.pop(), f.stack.pop()))
        self.pw('DROP', lambda f: f.stack.pop())
        self.pw('DUP', lambda f: f.stack.dup())
        self.pw('OVER', lambda f: f.stack.over())
        self.pw('ROT', lambda f: f.stack.rot())
        self.pw('SWAP', lambda f: f.stack.swap())
        self.pw('>R', lambda f: f.return_stack.push(f.stack.pop()))
        self.pw('R>', lambda f: f.stack.push(f.return_stack.pop()))
        self.pw('R@', lambda f: f.stack.push(f.return_stack.top()))

I can’t do 2DUP, as written, in a one-liner. However, it is only referencing the stack. This is a classic case of Feature Envy. Stack can help:

    class Stack:
        def two_dup(self):
            top = self[-1]
            bot = self[-2]
            self.push(bot)
            self.push(top)

    class Lexicon:
        def define_stack_ops(self):
            self.pw('2DUP', lambda f: f.stack.two_dup())
            self.pw(',', lambda f: f.heap.comma(f.stack.pop()))
            self.pw('ALLOT', lambda f: f.heap.allot(f.stack.pop()))
            ...

I think I like that. It’s very compact and while a few of the implementations require a bit of effort to read, we won’t be reading them often.
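For anyone following along without the repo, here is a minimal runnable sketch of the two_dup idea. It assumes Stack is a thin subclass of list with push added; the real Stack may well be built differently, so treat the class body as an illustration rather than the actual code.

```python
class Stack(list):
    """Hypothetical stand-in for the Forth data stack (assumed list-like)."""

    def push(self, value):
        self.append(value)

    def two_dup(self):
        # ( a b -- a b a b ): copy the top two items, preserving their order.
        top = self[-1]
        bot = self[-2]
        self.push(bot)
        self.push(top)


stack = Stack()
stack.push(1)
stack.push(2)
stack.two_dup()
print(stack)  # [1, 2, 1, 2]
```

With the logic living on the stack, the word definition in the lexicon has nothing left to do but delegate, which is what makes the lambda possible.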

Note: Yesterday I raised the concern that inlining this might not be safe:

    def _put(forth):
        index = forth.stack.pop()
        value = forth.stack.pop()
        forth.heap.put(index, value)

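My concern was that inlining would only work if Python evaluates function arguments left to right, so that the first pop supplies the index and the second supplies the value. It does: the language defines left-to-right evaluation, and if it didn’t, our tests would fail. So I decided to go with the one-liner -> lambda thing here.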
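A tiny demonstration of that evaluation order, using a plain list as the stack and a made-up Heap with only a put method (both are assumptions for the sake of a runnable example):

```python
class Heap:
    """Hypothetical minimal heap: just enough to show put(index, value)."""

    def __init__(self, size):
        self.cells = [0] * size

    def put(self, index, value):
        self.cells[index] = value


# Forth order for `42 3 !`: the value is pushed first, the address ends on top.
stack = [42, 3]
heap = Heap(5)

# Python evaluates call arguments left to right, so the first pop() (the
# address, 3) becomes `index` and the second pop() (42) becomes `value`.
heap.put(stack.pop(), stack.pop())
print(heap.cells)  # [0, 0, 0, 42, 0]
```

If the arguments were evaluated right to left, 42 would be used as the index and the put would fail on a five-cell heap, which is the kind of breakage the tests would catch.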

There is no question here that readability is reduced by the changes I just made. Scanability, if that were a word, is increased. I’ve made my choice and am prepared to live with it until it irritates me into changing again.

I change a few more one-liners to lambda, and discover that the word 2PC@ is used only in a test for 2PC@, so I remove the word and the test.

OK, enough cleansing, let’s sum up.

Summary

I chose to trade off individual readability of some word definitions to get a more compact and scannable file. This may or may not have been a wise decision, but if I am correct that we will rarely look at those definitions again, it will have been wise enough.

The Lexicon file is shorter, down to about 175 lines from over 200. Nearly all of that is the define methods that initialize it: the actual working methods of the class come to only about 15 lines, in five methods counting the __init__.

The defines are not called from Lexicon: Forth calls them. Therefore—this just occurred to me—they do not belong in Lexicon at all, but instead should be in some kind of factory method or object belonging to Forth. Perhaps we’ll think about that later, but it’s not really bugging us now.
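Purely as a thought experiment, the factory idea might look something like the sketch below. Every name here (LexiconBuilder, build, append) is invented for illustration, and the Lexicon is reduced to storage and lookup; none of this is in the repo today.

```python
class Lexicon:
    """Reduced to storage and lookup in this sketch (names are assumptions)."""

    def __init__(self):
        self.words = {}

    def append(self, name, code):
        self.words[name] = code

    def find_word(self, name):
        return self.words.get(name)


class LexiconBuilder:
    """Hypothetical factory: owns the define_* methods instead of Lexicon."""

    def build(self):
        lexicon = Lexicon()
        self.define_stack_ops(lexicon)
        return lexicon

    def define_stack_ops(self, lexicon):
        lexicon.append('DROP', lambda f: f.stack.pop())
        lexicon.append('DUP', lambda f: f.stack.append(f.stack[-1]))


class Forth:
    def __init__(self):
        self.stack = []  # a plain list stands in for the real Stack
        self.lexicon = LexiconBuilder().build()


forth = Forth()
forth.stack.append(5)
forth.lexicon.find_word('DUP')(forth)
print(forth.stack)  # [5, 5]
```

The attraction would be that Lexicon shrinks to its handful of working methods while all the construction code moves somewhere that Forth owns.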

Anyway, for a leisurely Sunday morning, we’ve reduced the code size at no loss in capability and have arguably made things a bit easier to handle. I think converting to one-line lambdas will be OK, and if not, we’ll do something better.

See you next time!

Forth words as of 2025-01-12-0920:

! * *# *DO *ELSE *IF *LOOP *UNTIL + , - . / 
1+ 1- 2DUP 2PC@ : ; < <= = > >= >R @ ALLOT 
BEGIN CONSTANT CR CREATE DO DOES> DROP DUMP 
DUP ELSE I IF LOOP OVER R> R@ ROT SQRT SWAP 
THEN UNTIL VARIABLE