The Repo on GitHub

I think we’ll do variables and constants today. I also feel a pivot coming on. Unless I’m just hungry.

Last night, while watching the Lions roar with one eye, I was reading fantasy with the other eye, and occasionally fiddling with the py4fun Forth with the other eye[1]. I implemented Forth’s VARIABLE word with a line of py4fun Forth, and today I think we’ll try something similar.

But before we begin, I’m considering a pivot for this Forth work study play[2]. I am considering a) making an actual Forth that I can run at the command line and b) making it much more compact and tightly coded than the current one. Whether I’ll refactor to that point or start over, I haven’t decided: I haven’t even decided to do it at all … yet.

Anyway, Forth has words VARIABLE and CONSTANT, and the related word ALLOT[3].

These two words will be implemented somewhat similarly and somewhat differently.

A Forth VARIABLE is a memory address, so that

VARIABLE X 1 ALLOT
X     ( is the address of the variable )
X @   ( is the value of the variable )
3 X ! ( sets X to 3 )

Let’s talk about how VARIABLE and CONSTANT will be represented in our Python Forth.

To define a constant we might say

2025 CONSTANT YEAR

That will be represented by a word named YEAR, containing the code *# 2025. *# is our word that pushes the next word of the definition onto the stack. So that word definition makes a constant. We could, therefore, define YEAR by hand, thus:

: YEAR 2025 ;
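To make the `*#` idea concrete, here is a minimal sketch of a definition that is just a list of cells, with a literal marker that pushes the cell after it. The names `LITERAL`, `run`, `plus`, `YEAR` and `NEXT_YEAR` are hypothetical stand-ins for illustration, not this Forth's actual SecondaryWord machinery.

```python
# Toy model only: LITERAL and run() are hypothetical stand-ins,
# not the article's actual SecondaryWord machinery.
LITERAL = '*#'


def run(definition, stack):
    """Execute a definition: a list of callables and literal markers."""
    pc = 0
    while pc < len(definition):
        item = definition[pc]
        if item == LITERAL:
            # *# pushes the cell that follows it, then skips over it.
            stack.append(definition[pc + 1])
            pc += 2
        else:
            item(stack)  # any other entry is a callable word
            pc += 1
    return stack


def plus(stack):
    """A stand-in for Forth's + word."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


YEAR = [LITERAL, 2025]                      # roughly what 2025 CONSTANT YEAR builds
NEXT_YEAR = [LITERAL, 2025, LITERAL, 1, plus]

print(run(YEAR, []))
print(run(NEXT_YEAR, []))
```

Seen this way, a constant is nothing more than a two-cell definition, which is why the hand-written colon definition does the same job.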

The constants can just be words containing the constant’s value. We could allow variables to work similarly, storing dynamic data in the lexicon, but we’re not going to do that. Why? It just seems wrong to me, though I am sure we could do it. Instead, we’ll have a new internal list, the “heap”, and we will allocate space in that list for our variables. The “address” of the variable will be the index of the variable in the heap list. This is much like what we implemented yesterday in our little spike.

We will implement @ and ! much as we did yesterday, fetching and storing into the heap. In the fullness of time, we’ll make the heap smart enough to grow, and to protect itself against people reaching outside of it. Probably.
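As a sketch of what a heap that grows and protects itself might look like, here is one possible shape, assuming a small wrapper class around a Python list. The class `Heap` and its method names are hypothetical and are not part of the code in this article. One detail worth guarding against: a bare Python list accepts negative indices, so an address of -1 would quietly reach the last cell rather than fail.

```python
class Heap:
    """A growable list of cells that refuses out-of-range addresses.

    Hypothetical sketch; not the heap used in the article's Forth class.
    """

    def __init__(self):
        self._cells = []

    def next_address(self):
        # The index that the next allotted cell will receive.
        return len(self._cells)

    def allot(self, count):
        if count < 0:
            raise ValueError('cannot allot a negative number of cells')
        self._cells.extend([0] * count)

    def _check(self, address):
        # Python lists accept negative indices, so reject those along
        # with addresses past the end.
        if not isinstance(address, int) or not 0 <= address < len(self._cells):
            raise IndexError(f'heap address {address!r} is out of range')

    def fetch(self, address):           # Forth's @
        self._check(address)
        return self._cells[address]

    def store(self, address, value):    # Forth's !
        self._check(address)
        self._cells[address] = value


heap = Heap()
x = heap.next_address()   # address for a hypothetical VARIABLE X
heap.allot(1)             # 1 ALLOT
heap.store(x, 3)          # 3 X !
print(heap.fetch(x))      # X @
```

The address of a variable is just its index in the list, which is all that `@` and `!` need.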

Forth typically has words CREATE and DOES, with various punctuation attached such as :CREATE or DOES>, which are used to help build up words in the lexicon. So far, we have not done it that way. We may wind up with CREATE and DOES, or we may not. Let’s find out.

CREATE

I think I’ll try to implement the CREATE-DOES words. Here’s a test:

    def test_create_does(self):
        f = Forth()
        f.compile('CREATE FOO 666 DOES>')
        assert f.stack.stack == []
        f.compile('FOO')
        assert f.stack.stack == [666]

I think that CREATE and DOES need to be immediate. Also, I think it is well past time to start accepting lower case words and up-casing them. Soon.

Here are CREATE and DOES>:

    def _define_create_does(self):
        def _create(forth):
            forth.compile_stack.push(('CREATE', forth.next_token()))

        def _does(forth):
            key, definition_name = forth.compile_stack.pop()
            word = SecondaryWord(definition_name, forth.word_list[:])
            forth.lexicon.append(word)
            forth.word_list.clear()

        self.append(PrimaryWord('CREATE', _create, immediate=True))
        self.append(PrimaryWord('DOES>', _does, immediate=True))

I made them immediate, but I am not sure that this is going to hold. But the current test passes.

Bah![4]

I’m on the wrong track here. We can’t really use CREATE for our constants and variables. If we are to be able to do that, our word : has to be built to use CREATE: a true Forth CREATE lies closer to the bottom than does :.

Let me go ahead and just create CONSTANT. We’ll let CREATE and DOES> sit here for now.

    def test_constant(self):
        f = Forth()
        f.compile('666 CONSTANT FOO')
        f.compile('777 CONSTANT BAR')
        f.compile('888 CONSTANT BAZ')
        assert f.stack.stack == []
        f.compile('BAZ BAR FOO')
        assert f.stack.stack == [888, 777, 666]

And just this:

        def _constant(forth):
            name = forth.next_token()
            value = forth.stack.pop()
            literal = forth.find_word('*#')
            word = SecondaryWord(name, [literal, value])
            forth.lexicon.append(word)

        self.append(PrimaryWord('CONSTANT', _constant))

OK, let’s push on to VARIABLE. But first a grumpy reflection:

Grumpy Reflection

By now, if this Forth were any good at all, I should be able to define words like CONSTANT and VARIABLE with colon definitions. The fact that I cannot leads me to only a couple of possible conclusions:

  • My bottom level Forth definitions are too abstract and need even more primitive underlying words;
  • I do not understand well enough how to implement Forth.

I think both of these are correct and it is part of what makes me want to start over with a new implementation of Forth, aimed at being far more primitive at base, and relying on more colon definitions where here I have primary words.
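For comparison, standard Forth usually builds both words from three lower-level pieces, CREATE, `,` (comma) and DOES>, roughly as `: CONSTANT CREATE , DOES> @ ;` and `: VARIABLE CREATE 0 , ;`. What follows is a rough Python model of that idea, not a proposal for how this Forth should do it. Every name in it (`Dictionary`, `create`, `comma`, `does` and the helper functions) is hypothetical. Note that in this model, as in standard Forth, VARIABLE reserves its own cell, which differs from the ALLOT convention used in this article.

```python
# A rough model of CREATE, comma and DOES>; every name here is hypothetical.
class Dictionary:
    def __init__(self):
        self.data = []       # data space; "HERE" is len(self.data)
        self.words = {}      # word name -> callable taking the stack
        self.latest = None   # (name, address) of the newest CREATEd word

    def create(self, name):
        # A freshly CREATEd word simply pushes its data address.
        address = len(self.data)
        self.words[name] = lambda stack: stack.append(address)
        self.latest = (name, address)

    def comma(self, value):
        # Forth's , appends one cell to data space.
        self.data.append(value)

    def does(self, action):
        # DOES> replaces the newest word's behavior: push the address,
        # then run the given action.
        name, address = self.latest

        def run(stack):
            stack.append(address)
            action(self, stack)

        self.words[name] = run


def fetch(d, stack):                  # @
    stack.append(d.data[stack.pop()])


def store(d, stack):                  # !
    address = stack.pop()
    value = stack.pop()
    d.data[address] = value


def constant(d, name, value):         # : CONSTANT CREATE , DOES> @ ;
    d.create(name)
    d.comma(value)
    d.does(fetch)


def variable(d, name):                # : VARIABLE CREATE 0 , ;
    d.create(name)
    d.comma(0)


d, stack = Dictionary(), []
constant(d, 'YEAR', 2025)
variable(d, 'X')
d.words['YEAR'](stack)                # YEAR pushes its value
stack.append(3)
d.words['X'](stack)
store(d, stack)                       # 3 X !
d.words['X'](stack)
fetch(d, stack)                       # X @
print(stack)
```

The point of the sketch is that once CREATE, comma and DOES> exist as primitives, CONSTANT and VARIABLE each shrink to a single line built on top of them.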

VARIABLE

OK, I feel better now. Thanks. Let’s review our little heap spike:

    def test_rudimentary_heap(self):
        f = Forth()
        f.compile('666 4 !')
        assert f.heap[4] == 666
        f.compile('4 @')
        assert f.stack.pop() == 666

    def test_rudimentary_heap_arithmetic(self):
        f = Forth()
        f.compile('666 4 !')
        f.compile('1 3 + @')
        assert f.stack.pop() == 666

    def test_rudimentary_heap_overflow(self):
        f = Forth()
        with pytest.raises(IndexError):
            f.compile('666 10 !')

class Lexicon:
        def _at(forth):
            index = forth.stack.pop()
            forth.stack.push(forth.heap[index])

        def _put(forth):
            index = forth.stack.pop()
            value = forth.stack.pop()
            forth.heap[index] = value

        self.append(PrimaryWord('@', _at))
        self.append(PrimaryWord('!', _put))

Let’s pretend that this heap is strong enough, and imagine what VARIABLE and ALLOT must do.

VARIABLE FOO should define a CONSTANT whose name is FOO and whose value is the index of the next available word in the heap. (Our heap does not have that information just now. Aside: what’s the difference between a heap and a stack? A: To get to the other side. We never pop the heap.)

We’re going to follow the rule that until you say n ALLOT, the next value in the heap is still the next value. So VARIABLE A VARIABLE B will leave A and B pointing to the same value. Identical same, not just equal same.

So we’ll init our heap to have one cell. We’ll implement n ALLOT to add n more words to the heap.

class Forth:
    def __init__(self):
        self.active_words = []
        self.compile_stack = Stack()
        self.heap = [0]

That breaks our tests because they do not do ALLOT. (Nor do they do a lot.)

    def test_rudimentary_heap(self):
        f = Forth()
        f.compile('9 ALLOT')
        f.compile('666 4 !')
        assert f.heap[4] == 666
        f.compile('4 @')
        assert f.stack.pop() == 666

    def test_rudimentary_heap_arithmetic(self):
        f = Forth()
        f.compile('9 ALLOT')
        f.compile('666 4 !')
        f.compile('1 3 + @')
        assert f.stack.pop() == 666

And:

        def _allot(forth):
            forth.heap.extend([0]*forth.stack.pop())

        self.append(PrimaryWord('ALLOT', _allot))

Now we can do VARIABLE, I think. I think it has to be immediate.

I forgot to write a test but here goes:

    def test_variable(self):
        f = Forth()
        f.compile('VARIABLE FOO 1 ALLOT')
        f.compile('VARIABLE BAR 1 ALLOT')
        f.compile('VARIABLE BAZ 1 ALLOT')
        f.compile('666 FOO !')
        f.compile('777 BAR !')
        f.compile('888 BAZ !')
        f.compile('BAZ @ BAR @ FOO @')
        assert f.stack.stack == [888, 777, 666]

And the code:

        def _variable(forth):
            name = forth.next_token()
            value = len(forth.heap)
            literal = forth.find_word('*#')
            word = SecondaryWord(name, [literal, value])
            forth.lexicon.append(word)

        self.append(PrimaryWord('VARIABLE', _variable))
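The rule that a variable owns no cell until ALLOT is called can be seen in a toy model that mirrors the `_variable` and `_allot` logic above. The module-level `heap` and the functions `variable` and `allot` are hypothetical simplifications for illustration, not the real Forth class.

```python
# Toy model mirroring the _variable and _allot logic; names are hypothetical.
heap = [0]                 # the initial one-cell heap


def variable():
    # A new variable's address is wherever the heap currently ends.
    return len(heap)


def allot(count):
    heap.extend([0] * count)


a = variable()             # VARIABLE A
b = variable()             # VARIABLE B   (no ALLOT in between)
allot(1)                   # 1 ALLOT
c = variable()             # VARIABLE C
allot(1)                   # 1 ALLOT

heap[a] = 666              # 666 A !
print(heap[b])             # B @  sees the same cell as A
print(a, b, c)
```

In this model A and B share one address because nothing was allotted between them, while C gets a cell of its own. Note also that the shared address lies past the end of the heap until the first ALLOT runs, so a store before that point would raise IndexError.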

Let’s call a break, and sum up. Commit: CONSTANT and VARIABLE initial implementations.

Summary

From the outside, the new CONSTANT and VARIABLE words are pretty decent, at least if you use them properly. The Forth language is itself very much like a double-edged razor blade: quite useful, but you can cut yourself if you don’t use it with great care. So our implementation meets that spirit pretty well, though I would prefer it to be different in a few regards:

  • It would be nice to build it from fewer true primitives, even if we then choose to implement more primitives for “efficiency”.
  • If it’s going to be written in Python it can be a lot safer than it is. Maybe we could get it down to a single-edged razor blade.
  • Despite the ease of writing our new primitives, as we saw above, I’d like it to be more compact. I think we can accomplish a more compact formulation without losing much clarity.

All that said, the addition of VARIABLE and CONSTANT will surely be useful for our eventual robot users, in the unlikely event that there ever are any. And for me, if I ever build the thing as a Forth to run on my Mac.

The thing I miss most right now, I think, is that I don’t see a way to implement something like VARIABLE as a colon definition. I’d like to resolve that, ideally without starting over.

But we have CONSTANT and VARIABLE in a simple form, working fine if you use them with care. That’s a good thing.

See you next time!



  1. Wait, what? How many eyes? 

  2. We might as well face facts about the significance of the things I work on: there is very little. Where I hope you can find value is in observing how I think about code, and what I do to manage my progress through it. 

  3. I think my earlier understanding and explanation of ALLOT was mistaken. VARIABLE words start with zero length, so that VARIABLE FOO 1 ALLOT is the right way to specify a single-cell variable. 

  4. Despite the fact that this seems like a quick “Curses, foiled again”, soon forgotten in our next nefarious scheme, I actually pay attention to these feelings of frustration and use them to inform my decisions about refactoring and other design improvements. YMMV, but maybe you could do the same.