The Repo on GitHub

I come before you today armed with information, if not knowledge. We’ll try some small steps toward what I think I somewhat understand.

Information

Since last I said “Bah!” here, I have read more in Loeliger, I have experimented with and read the code in the py4fun Forth, I have obtained and read quite a bit of the Forth standard, and have shared some useful lines with GeePaw Hill, who has dredged up some ancient memories of his days as a Forth Wizard. Here is a very lightly edited view of our exchange:

GeePaw Hill
So, in today’s article, I don’t think there’s any immediacy anywhere. CREATE is definitely not immediate, and I don’t think DOES> is.

What DOES> does in a regular memory map is this: CREATE has just dropped a CFA (Code Field Address: the address of the primitive that interprets the rest of the word) for the word it just named. DOES> takes that CFA and makes it unconditionally jump to the code that follows it.

Ron Jeffries
Thing is in : CONSTANT CREATE , DOES> @ ; some of the code goes in CONSTANT and some in whatever constant you create, and I don’t see how to get it to do those two things.

Is it branching back into the definition of CONSTANT from DOES>?

GeePaw Hill
It looks like DOES> is compiling @ into the word that’s just been made, but it’s not. @ or any other code post-DOES> is compiled just like normal. DOES> makes the newly minted word jump (not call) to that code.

Ron Jeffries
so the @ is inside the def of CONSTANT and is branched to from wherever CONSTANT builds a new word?

GeePaw Hill
Yes, that’s it exactly. As I recall, different Forths do this differently, but that’s what fig-Forth did.

Ron Jeffries
and meanwhile CONSTANT itself somehow skips out without hitting what’s in the DOES>?

GeePaw Hill
Ahhhh, I see your question. So, yeah. DOES> must do a premature return. But there’s no immediacy, it just ends the function right there.

Ron Jeffries
Hm that’s helpful, thanks. but since my code isn’t threaded … I would need to call the CONSTANT word from the new word with a non-zero program counter. not a feature that I have but I could have, I suppose.

Maybe I have to manage the program counter explicitly.

But the key idea is compile CONSTANT to have the DOES> code in it and (then a miracle occurs) get to it from the CREATE’s word, via a branch kind of thing not a call kind of thing. Nothing to it … 🙂

I think my current structure is wrong in some key way that I don’t quite understand, revolving around compile mode and state ideas, vs the sort of p-baked way the py4fun version works. And Loeliger is saying the loop words only work in compile mode, which I don’t know if that was true in real forth or not, because Loeliger isn’t really doing Forth, he’s doing a tune of his own invention.

GeePaw Hill
Good luck with it. Don’t know that I’ve ever seen you have two shitty days in a row. And feel free to de-brief here if it helps.

Ron Jeffries
Thank you kindly, that may sort me out. At least it makes me hopeful. 🙂

Vague Summary of Weak Understanding

Summing that up, what I have gleaned is that when we say:

: CONSTANT CREATE , DOES> @ ;

None of those words are immediate, so that the word CONSTANT has exactly those words in its definition, so that when we say:

2025 CONSTANT YEAR

Exactly those words will be executed: CREATE, ,, DOES> … and, we believe, DOES> will skip over the @ during the execution of CONSTANT, but will arrange for YEAR to do the @. According to Hill, the Forths he knew would branch back into the actual CONSTANT code, but I see no reason why we couldn’t just copy the rest of the CONSTANT’s code into the new word (YEAR) that we’re creating.

We “know” that CREATE is supposed to define a word that pushes the address of the next available cell on the heap (at the time CREATE runs, not each time the word YEAR runs). So, supposing the next available cell is at 5, it should be as if we had said:

: YEAR 5 @ ;

At this point I’m thinking “why don’t people just say that and spare me this trouble?”

Primary Concern

My main concern is that our compile method doesn’t really know that it’s compiling a definition. It is compiling a list of words, and if it is in “compile mode”, it keeps building that list until it exits compile mode. If it is not in compile mode after compiling a word, it returns the list to be executed.

The colon and semicolon are “immediate” words. The colon operator reads the next word from the input, which will be the name of the word to be defined, and pushes that name onto the compile stack. The semicolon, also immediate, pops the name off the compile stack and creates a SecondaryWord with that name and the current list of words. I think then we’ll return an empty list of words to be executed, before things continue.
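
To make that description concrete, here is a minimal sketch of the idea, not the real compile method. The names MiniCompiler, definitions, and immediate are made up for illustration, words are kept as plain strings rather than Word objects, and the sketch ignores anything typed before a colon.

```python
class MiniCompiler:
    """Hypothetical toy: compiles a token string into lists of word names."""

    def __init__(self):
        self.definitions = {}      # name -> list of word names (stand-in for SecondaryWord)
        self.compile_stack = []    # holds the name of the word being defined
        self.tokens = []
        self.immediate = {':': self._colon, ';': self._semi}

    def compile(self, text):
        self.tokens = text.split()
        words = []                 # the list being built
        while self.tokens:
            token = self.tokens.pop(0)
            if token in self.immediate:
                # immediate words run during compilation instead of being listed
                words = self.immediate[token](words)
            else:
                words.append(token)
        return words               # whatever is left is to be executed

    def _colon(self, words):
        # read the name of the word to be defined and remember it
        # (simplification: words seen before the colon are dropped)
        self.compile_stack.append(self.tokens.pop(0))
        return []

    def _semi(self, words):
        # pop the name and record the definition under it
        name = self.compile_stack.pop()
        self.definitions[name] = words
        return []                  # nothing left to execute


c = MiniCompiler()
leftover = c.compile(': CONSTANT CREATE , DOES> @ ;')
print(c.definitions['CONSTANT'])   # ['CREATE', ',', 'DOES>', '@']
print(leftover)                    # []
```

Notice that nothing in this loop knows it is compiling a definition: the only evidence is a name sitting on the compile stack.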

In a “real” Forth, the colon word (or a word that it calls) would open a new word definition at the end of the word list. Those Forths often have a partially-complete word under construction, so that when they are in compile mode, they just look up the word and “enclose” it in the definition at the end of the list. When they hit the semicolon, they might compile a SEMI operation into the word, if it’s needed to manage the return from that word. It’s a detail of the particular implementation.

So when CONSTANT runs, we know that CREATE should create a word and start its definition with the literal word *# followed by the current heap address. Then CONSTANT should execute the comma word, storing the stack top into the heap (and allocating the word so that it won’t be reused). Then CONSTANT will execute the DOES> word. And then a miracle occurs:

[Cartoon: two men at a blackboard. One points to "and then a miracle occurs" and says, "I think you should be more explicit here in step two."]

Somehow we have to cause the code in CONSTANT that follows the DOES> to be executed as part of the new word we’re creating, and we have to skip over it or otherwise exit the CONSTANT code. I see two ways to do that “somehow”:

  1. We could just copy everything from DOES> to the end of CONSTANT into the YEAR word verbatim;
  2. We could compile something into YEAR that would call back to CONSTANT, entering at the location after DOES>, and run to the end.
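
Option 1 is easy enough to sketch in a toy. This is not our Forth: TinyForth and its latest attribute are invented for illustration, words are just lists of tokens, the heap is a growing Python list, and CONSTANT is installed by hand instead of being compiled from a colon definition.

```python
class TinyForth:
    """Hypothetical toy interpreter, only enough to show option 1."""

    def __init__(self):
        self.stack = []
        self.heap = []
        self.words = {}        # name -> list of tokens
        self.tokens = []       # remaining input
        self.latest = None     # name of the word CREATE most recently made

    def run(self, text):
        self.tokens = text.split()
        while self.tokens:
            self.execute(self.tokens.pop(0))

    def execute(self, token):
        if token in self.words:
            self.run_body(self.words[token])
        elif token == '@':
            self.stack.append(self.heap[self.stack.pop()])
        elif token == ',':
            self.heap.append(self.stack.pop())
        else:
            self.stack.append(int(token))   # anything else is a number

    def run_body(self, body):
        pc = 0
        while pc < len(body):
            token = body[pc]
            pc += 1
            if token == 'CREATE':
                # new word starts with the heap address current at CREATE time
                name = self.tokens.pop(0)
                self.words[name] = [str(len(self.heap))]
                self.latest = name
            elif token == 'DOES>':
                # copy the rest of this definition into the new word, then exit
                self.words[self.latest].extend(body[pc:])
                return
            else:
                self.execute(token)


f = TinyForth()
f.words['CONSTANT'] = ['CREATE', ',', 'DOES>', '@']   # stands in for the colon definition
f.run('2025 CONSTANT YEAR YEAR')
print(f.words['YEAR'])   # ['0', '@']
print(f.stack)           # [2025]
```

Option 2 would instead store something in YEAR that refers back to CONSTANT plus a starting index, which is where the non-zero program counter comes in.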

Easy Start

We can certainly define CREATE and DOES> to do almost nothing and at least compile our definition of YEAR. Let’s quit talking and write some code. Begin with a test:

    def test_compile_create_does(self):
        f = Forth()
        s = ': CONSTANT CREATE , DOES> @ ;'
        f.compile(s)

This fails. I imagine that it will be whining about most of those words, one after another.

E               SyntaxError: Syntax error: "CREATE" unrecognized

I’m just going to give these words primary word definitions that do nothing.

        def _create(forth):
            pass

        self.pw('CREATE', _create)

It doesn’t matter what the words do until we try to execute CONSTANT, so pass will do. We should be objecting to comma now:

E               SyntaxError: Syntax error: "," unrecognized

The comma word is supposed to store the stack into the heap, allocating that word. For now we’ll just keep stubbing, to get the compile to work.

        def _comma(forth):
            pass

        self.pw(',', _comma)
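
For reference, the eventual behavior of comma might look something like this sketch. It assumes the heap is a plain Python list that grows by appending, which may not be how our real heap and ALLOT work; Machine is a made-up stand-in for the Forth object.

```python
class Machine:
    """Hypothetical stand-in for the Forth object: just a stack and a heap."""

    def __init__(self):
        self.stack = []
        self.heap = []


def comma(forth):
    # pop the top of the stack and append it to the heap;
    # appending is what "allocates" the cell so that it won't be reused
    forth.heap.append(forth.stack.pop())


m = Machine()
m.stack.append(2025)
address = len(m.heap)    # the cell that comma is about to fill
comma(m)
print(address, m.heap)   # 0 [2025]
```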

Now I expect the error on DOES>:

E               SyntaxError: Syntax error: "DOES>" unrecognized

Same deal:

        def _does(forth):
            pass

        self.pw('DOES>', _does)

The test is passing: we can compile the CONSTANT definition. I think I’ll commit this: initial create, comma, does, all no-op primary words.

Harder Part

I am inclined to sort of hack this. Here are some idle thoughts:

We need to be creating a new SecondaryWord, but we can’t enter compile mode, can we? Or can we? We’re in the middle of doing a word that is already compiled. Let’s try something:

        def _create(forth):
            name = forth.next_token()
            print(f'create {name}')
            forth.compile_stack.push(name)

        def _does(forth):
            name = forth.compile_stack.pop()
            print(f'does {name}')

I’m going to extend our test to try to execute a constant definition:

    def test_compile_create_does(self):
        f = Forth()
        s = ': CONSTANT CREATE , DOES> @ ;'
        f.compile(s)
        f.compile('2025 CONSTANT YEAR')

This fails. Why?

    def _at(forth):
        index = forth.stack.pop()
>       forth.stack.push(forth.heap[index])
E       IndexError: list index out of range

Was that the final @ that failed? Let’s put some more things on the stack. Still fails. The prints come out:

create YEAR
does YEAR

So we stacked the word YEAR and got it back.

Ah! The error is coming from the heap, not from the stack: it is an IndexError on the heap, not a stack underflow. The final @ should not be executed at all inside CONSTANT. How can we make DOES> finish the word?

We’ll give SecondaryWord a new method:

class SecondaryWord:
    def finish(self):
        self.pc = len(self.words)

And in DOES>:

        def _does(forth):
            name = forth.compile_stack.pop()
            print(f'does {name}')
            forth.active_word().finish()

I had to build the active_word on Forth:

class Forth:
    def active_word(self):
        return self.active_words[-1]

I change it to a property:

    @property
    def active_word(self):
        return self.active_words[-1]

And use it in a few places that were doing it the hard way. Commit: DOES> ends current word execution. no other action yet.
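
Here is a self-contained sketch of why setting the pc to the end works as an early exit. It assumes a secondary word runs its list with a simple pc loop; SketchWord, SketchForth, and the log are made up for illustration, and the real classes surely differ in detail.

```python
class SketchWord:
    """Hypothetical secondary word: runs a list of callables using a pc."""

    def __init__(self, name, words):
        self.name = name
        self.words = words
        self.pc = 0

    def do(self, forth):
        forth.active_words.append(self)
        self.pc = 0
        try:
            while self.pc < len(self.words):
                word = self.words[self.pc]
                self.pc += 1
                word(forth)
        finally:
            forth.active_words.pop()

    def finish(self):
        # jump the pc past the last word so the loop in do() exits
        self.pc = len(self.words)


class SketchForth:
    def __init__(self):
        self.active_words = []
        self.log = []          # records which words actually ran

    @property
    def active_word(self):
        return self.active_words[-1]


def comma(forth):
    forth.log.append(',')

def does(forth):
    forth.log.append('DOES>')
    forth.active_word.finish()   # end the word that is currently running

def fetch(forth):
    forth.log.append('@')


f = SketchForth()
SketchWord('CONSTANT', [comma, does, fetch]).do(f)
print(f.log)   # [',', 'DOES>']
```

The @ after DOES> never runs, which is the behavior we wanted from CONSTANT.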

Tests are green as far as they go. I’m going to take the partial win and take a break as well. I think I’ve been failing to blink for the past two hours.

Reflective Summary|yrammuS

We definitely have a partial win here. The colon-definition of CONSTANT compiles correctly, though we know that by inspection: we don’t have a good testing scheme for checking the actual compiled word. And the definition of CONSTANT runs well enough to avoid crashing when we use it. It does not do what it should, but it gets through all the words and skips out at the DOES>.

Small steps toward what we need. I don’t quite see how it all hangs together, but we are inching toward the goal and so far I haven’t quite lost the thread.

I’m not sure of next steps. We’ll figure them out next time, but we’ll need to do at least these items:

  • comma, which should be similar to or use ALLOT;
  • the copying part of DOES>;
  • the whole creation of the word implied by the name after CONSTANT;

I’m inclined to brute-force the Word and then see how to plug it into things. I think, but am not certain, that we would be better off having the compile mode for colon work the same way as whatever we do for CREATE.

Oh and this question just came to mind: is CREATE allowed to be used outside a colon definition? What if it is not followed by a DOES>?

But at least this far, we’ve found small steps in which I feel fairly confident. Far better than the situation for the past two days.

Failed better! See you next time!