Down to CASEs
The Robot World Repo on GitHub
The Forth Repo on GitHub
Today I plan to actually begin the coding of the CASE-Of-ENDOF-ENDCASE phrases. And I have a glimmer of a better way to do it. Will the glimmer become light … or darkness? Wisely, for once, I stop while I’m ahead.
Rhythm
As I think about compiling things like the CASE words, I’m beginning to feel the process as mostly linear, in a smooth rhythmic kind of sense. We’ll put this over here, we’ll put this in the word, we’ll put that in the word, we’ll reach over here and put this in the places we find over there, etc.
I don’t think I can express that sense well enough for you to feel it, but I hope you have sometimes had a similar sense, that the task before you will just kind of flow out, as opposed to the sense of fumbling and finding our way and cutting and trying and such that goes with our early relationship with programming some new thing.
The upshot of this is that, if I’m in fact getting a better sense of how these things should be done, is that we’ll probably create some new words that better express the smooth flow of creating new words.
I’ve come by this better feeling from doing things here, but also from reading about classical Forth. Remind me to list all the things I’ve been reading. At least three books and lots of web pages and here and there some Reddit and Stack Overflow. It’s not like I just rest when I’m not writing these things. (I just mostly rest.)
My sense of classical Forth is that it does most of its definitional work by appending things to the dictionary, which is really just a long list of cells that is mainly appended to. There are bits of back-patching for branches, but otherwise you just add stuff. So that means that defining some word mostly comes down to append this append that append this other thing, with the occasional reach over to put something on a shelf or to grab something off a shelf. The pattern for all the words is very similar: it all comes down to figuring out what to add to the word, and appending it.
Details
-
I believe that the text diagram I provided yesterday was a bit off. If I determine that to be the case, I’ll provide a better version here, when we’re done or more nearly done.
-
At one point early in writing this Forth, we had a structured object that we pushed onto our compilation stack. I think that now we’re just pushing numbers. I have come to better understand what that stack is good for, and I plan to begin to put structured objects onto it again. This can be done incrementally, since everything on that stack is a conversation, typically between the beginning of a structure and the end, so as long as what we put in matches what we take out, everything is fine. It’s when someone messes up that process that the typed object can help, because we can verify that we got what we wanted.
-
There will be two layers to the CASE phrases, one for the CASE and one for the OF. Each of these will stack some information. At the CASE level, we’ll record the cells that need to branch out of the whole structure, namely after completing an OF that we have selected. At the OF level, we’ll keep track of the cell in the zero-branch at the top, that takes us to the next OF if this one is not selected.
-
I think we’ll start by defining CASE-ENDCASE, which will get the basic structure in place. Most of the action is in OF-ENDOF, but with the structure in place, that should be easier.
-
I plan to write mostly tests that assert on the actual code compiled, not just results of running the code.
-
I could be wrong about any or all of the above, but I am not wrong about this line.
Down to CASE
Begin with a test. This might be interesting. I’ll start writing it and see if I can figure out what to assert. Right now I am not sure. Let’s make a new test class, the TestCompile is getting long and kind of random.
class TestCase:
def test_hookup(self):
assert False
That fails. I’m hooked up properly. Always a good place to start.
def test_trivial_case(self):
f = Forth()
f.process_line(': TEST 3 CASE ENDCASE ;')
I believe that this should compile to a literal 3, and a DROP, and nothing else.
I’m not sure how to express this, so we’ll go with this for now:
def test_trivial_case(self):
f = Forth()
f.process_line(': TEST 3 CASE ENDCASE ;')
w = f.find_word('TEST')
assert str(w) == ''
This isn’t going to drive out much, but let’s go for it.
class Lexicon:
def define_case_of_endof_endcase(self):
def _endcase(f):
f.word_list.append(f.find_word('DROP'))
self.pw('CASE', lambda f: None, immediate=True)
self.pw('ENDCASE', _endcase, immediate=True)
The test gives an answer that I do not expect:
Expected :''
Actual :': TEST DROP ;'
Where is the 3? Is this a problem with the data or with the __repr__
that I wrote a while back to print words?
I print the word list and get this:
[: *# 3 ;, DROP: <code>]
I think my __repr__
hack needs work. Turn it off.
- GRRR
- I’ve been messing with this a bit, just a few minutes. I think that what’s at issue is that when we define a literal, we don’t just plunk the two cells into the current list, we compile a nameless word and put that into the current word. Let’s divert and see if that’s true.
def get_literal(self, token):
try:
return Word('', [self.find_word('*#'), int(token)])
except ValueError:
return None
Yes, it’s true. For a literal, instead of just compiling it into the word list, we create a new word and push that into the list. That should be improved. For now, I can live with it, but I’m going to give the word a name:
def get_literal(self, token):
try:
return Word(f'*# {int(token)}', [self.find_word('*#'), int(token)])
except ValueError:
return None
And now the test fails nicely:
Expected :''
Actual :': TEST *# 3 DROP ;'
Close enough. Put that into the string then we should break for a moment.
def test_trivial_case(self):
f = Forth()
f.process_line(': TEST 3 CASE ENDCASE ;')
w = f.find_word('TEST')
words = w.words
assert str(w) == ': TEST *# 3 DROP ;'
We could commit now. Let’s do. Commit: first trivial CASE-ENDCASE
Reflect
I was derailed and diverted as well as pushed off track by that issue with the display, which is actually, if not a defect, a flaw, because a literal can be compiled in line, and should be, whereas I’m creating a new word for that particular literal and then calling it. So, having worked around that issue, it makes sense to take a little break and shake off the confusion.
OK, I feel better now.
Back to it
What should happen in CASE is that it should put a structure on the compile stack, which will ultimately contain a list of all the cells that need to be patched to jump out of the case phrase. Then, in ENDCASE, we patch them up.
I want to take this occasion to create a trivial object to be placed on the compile stack. I plan to name that class Sys, because the Forth standard refers to the things on the compile stack with words like “case-sys” and “of-sys”. You might think that Sys is a bad name for a class, but I will argue that it is a term of art in the Forth community and therefore it is OK. Having made that argument, I find it inadequate and agree with you: we need a better name.
But we’ll go with Sys for now.
I’m going to do this “by intention”, or, as I like to call it “programming by wishful thinking”. I write the code that I want to have work and then make it work.
But how can we test this? Could we create a special testing word that we can put in a definition and have it help us out somehow? Let’s suppose that we can, and write a new test:
def test_c_stack_set_up(self):
f = Forth()
f.process_line(': TEST 3 CASE GET_C_STACK ENDCASE ;')
assert f.c_stack == Sys("CASE", [])
I’m supposing that GET_C_STACK will put the value on the top of the compile stack into f.c_stack and that we’ll find it to be a Sys as shown. This will take a bit of work but not too much.
It didn’t take long at all, and I have a revised test that works:
def test_c_stack_set_up(self):
f = Forth()
f.process_line(': TEST 3 CASE GET_C_STACK ENDCASE ;')
assert f.c_stack.name == 'CASE'
assert f.c_stack.cells == []
And:
class Lexicon:
def define_case_of_endof_endcase(self):
def _get_c_stack(f):
f.c_stack = f.compile_stack[-1]
def _case(f):
sys = Sys('CASE')
f.compile_stack.push(sys)
def _endcase(f):
f.word_list.append(f.find_word('DROP'))
f.compile_stack.pop()
self.pw('CASE', _case, immediate=True)
self.pw('ENDCASE', _endcase, immediate=True)
self.pw('GET_C_STACK', _get_c_stack, immediate=True)
class Sys:
def __init__(self, name):
self.name = name
self.cells = []
We’re green and commit: CASE-ENDCASE stack and unstack new Sys instance.
Reflection
OK, so far so good. A bit of confusion there with the repr not looking as I expected, but we recovered quickly. And now we have a nice little class. My theory on it is this: we know that the CASE construct needs to patch up an arbitrary number of branches, so we made the Sys contain a list. As we convert or create other words that use the compile stack, we’ll use that list, even if we only have one thing to patch. If we are reasonably clever, we’ll fix up the alignment so that we don’t have those weird +1 -1 adjustments that occur now.
You don’t recall those, do you? I hardly do myself. No worries. If we never encounter them again, they’re fine, and if we do encounter them, we’ll sort them out.
What’s next? Let’s write a partial test and talk about it.
def test_for_discussion(self):
f = Forth()
test = (': TEST'
' 2 CASE'
' 1 OF 111 ENDOF'
' 2 OF 222 ENDOF'
' ENDCASE'
';')
expected_stack = [222]
I think we’re going to break before we implement any of this but let’s see what we can say about the code we expect in the word TEST. Informally, writing (z) as the address of cell z:
*# 2
*# 1 OVER = 0BR a DROP *# 111 BR x
(a)
*# 2 OVER = 0BR b DROP *# 222 BR x
(b)
DROP
(x)
Let me explain that OVER = 0BR DROP sequence. We enter the case with the particular case value integer on the stack, call it CV. Each OF stacks the case value it wants to check, call it V. So we enter the sequence with stack (CV V) and in our test, it goes like the following. Remember that true is -1 in Forth.
label | stack | op |
---|---|---|
2 1 | OVER | |
2 1 2 | = (false) | |
2 0 | 0BR a (done) | |
DROP (skipped) | ||
*# 111 (skipped) | ||
BR x (skipped) | ||
a | 2 2 | OVER |
2 2 2 | = (true) | |
2 -1 | 0BR b (skipped) | |
2 | DROP | |
_ | *# 222 | |
222 | br x | |
b | 222 | DROP (skipped) |
x | 222 | (result) |
So the sequence OVER = 0BR leaves the case value on the stack, so inside the OF we drop it and if we skip to the next, it’s there waiting.
Tricky to think about, but simple once you get your head around it.
But we’ll break here. We have a bit of the feature working and I’ll sum up by describing what I think we’ll want to do next time.
Summary
I see two ways we could go. One would be just to write two new word definitions _of
and _endof
and brute it out much as we’ve always done. The other, which may be possible, is to refactor a bit so that extending a word definition is a bit more like just add this, add that, add other. I don’t see right now how to do that, but if we can do it it will make writing the actual new words easier and should make them more clear.
I have the feeling that I’m close to some simple refactoring that will make things go more smoothly. I’d like to rest and let that perk. On a real team, I might just ask to pair with someone on whatever they’re doing. Or I might sneak off to the broom closet and grab a nap.
Either way, so far so good and I have no doubt that we can sort out something next time.
One more thing: it seems clear that if we’re going to properly test these words, we’re going to want to use the new GET_C_STACK test word a lot, which suggests to me that we’ll want to extend how it works, probably having it create a separate testing stack of results that we can push to and then check at the end of the test.
And one more one more thing: I think I’ll look up how to do fixtures in pytest. I’m tired of seeing f=Forth()
at the beginning of each one.
A good first few steps. Stop while we’re ahead. See you next time!