Defect!?!?

The Robot World Repo on GitHub
The Forth Repo on GitHub

Reader Laurent points out a problem. I was going to fix it today, by accident. Let’s do it on purpose. And maybe we can get down to cases.

Laurent pointed out that Python and and or will probably early-out when they can, and that therefore, my current implementations of AND and OR may leave spurious and undesired information on the stack. Thanks, Laurent, I think you’re right!

I was already planning to change how those words work, along with some others, and I think that what I was planning will fix the issue. But if there are issues, why haven’t our tests found them?

    def test_or(self):
        f = Forth()
        f.compile('0 0 or')
        assert f.stack.pop() == 0
        f.compile('1 0 or')
        assert f.stack.pop() == 1
        f.compile('0 1 or')
        assert f.stack.pop() == 1
        f.compile('1 1 or')
        assert f.stack.pop() == 1

    def test_and(self):
        f = Forth()
        f.compile('0 0 and')
        assert f.stack.pop() == 0
        f.compile('1 0 and')
        assert f.stack.pop() == 0
        f.compile('0 1 and')
        assert f.stack.pop() == 0
        f.compile('1 1 and')
        assert f.stack.pop() == 1

These tests, and many others, check the top of the stack. They do not check the entire stack, so if they are leaving unwanted items on the stack, we’ll never hear about it. I’ll change all of those:

    def test_or(self):
        f = Forth()
        f.compile('0 0 or')
        assert f.stack.stack == [0]
        f.compile('1 0 or')
        assert f.stack.stack == [1]
        f.compile('0 1 or')
        assert f.stack.stack == [1]
        f.compile('1 1 or')
        assert f.stack.stack == [1]

    def test_and(self):
        f = Forth()
        f.compile('0 0 and')
        assert f.stack.stack == [0]
        f.compile('1 0 and')
        assert f.stack.stack == [0]
        f.compile('0 1 and')
        assert f.stack.stack == [0]
        f.compile('1 1 and')
        assert f.stack.stack == [1]

Two tests fail. Let’s make sure the reason is what we think it is, extra items on the stack.

        f.compile('1 0 or')
>       assert f.stack.stack == [1]
E       assert [0, 1] == [1]

        f.compile('0 0 and')
>       assert f.stack.stack == [0]
E       assert [0, 0] == [0]

Yep! Well-spotted, Laurent!

Here’s the current code that is failing us, and that I intended to change anyway:

    def define_logical_operators(self):
        self.pw('OR',
                lambda f: f.stack.push(0 if f.stack.pop() == 0 and f.stack.pop() == 0 else 1))
        self.pw('AND',
                lambda f: f.stack.push(1 if f.stack.pop() != 0 and f.stack.pop() != 0 else 0))
        self.pw('INVERT', lambda f: f.stack.push(1 if f.stack.pop() == 0 else 0))

Over the weekend, I was reading the Forth Standard, as one does, and I realized that Forth AND and OR work on integers, and that Forth’s standard “true” value is -1, although it generally accepts any non-zero value as true.

So I was going to change our code here to use -1 for true, instead of 1 as we now do, and I was going to change AND, OR, and INVERT to be what they need to be to make integers properly support AND and OR.

I am a little trepidacious and also nervous about this, because I know that Python uses variable length integers and so at least some ANDs and ORs might do weird things. My reading assures me that Python makes it all work out. We’ll find out.

Let’s add a few tests of AND, OR, and INVERT on integers.

    def test_and_or_invert_on_integers(self):
        f = Forth()
        f.compile('7 2 and')
        assert f.stack.stack == [2]
        f.compile('3 7 and')
        assert f.stack.stack == [3]

These fail, of course. I am impatient and want to cut to the chase here. I’ll grit my teeth and do a couple more tests first.

    def test_and_or_invert_on_integers(self):
        f = Forth()
        f.compile('7 2 and')
        assert f.stack.stack == [2]
        f.compile('3 7 and')
        assert f.stack.stack == [3]
        f.compile('2 5 or')
        assert f.stack.stack == [7]
        f.compile(' 2 2 = invert')
        assert f.stack.stack == [0]
        f.compile(' 2 3 = invert')
        assert f.stack.stack == [-1]

I think that’s enough to drive out what I plan to do. But I can make these better and drive out a tiny bit:

        f.compile(' 2 2 = invert')
        assert f.stack.stack == [f.false]
        f.compile(' 2 3 = invert')
        assert f.stack.stack == [f.true]

I want those values anyway. Those tests are failing to compile now, so:

class Forth:
    def __init__(self):
        self.true = -1
        self.false = 0

Now to fix up a whole lot of 0 and 1 values in Lexicon. First I just plug in true and false:

    def define_comparators(self):
        self.pw('=',  lambda f: f.stack.push(f.true if f.stack.pop() == f.stack.pop() else f.false))
        self.pw('>',  lambda f: f.stack.push(f.true if f.stack.pop() > f.stack.pop() else f.false))
        self.pw('<',  lambda f: f.stack.push(f.true if f.stack.pop() < f.stack.pop() else f.false))
        self.pw('>=', lambda f: f.stack.push(f.true if f.stack.pop() >= f.stack.pop() else f.false))
        self.pw('<=', lambda f: f.stack.push(f.true if f.stack.pop() <= f.stack.pop() else f.false))

    def define_logical_operators(self):
        self.pw('OR',
                lambda f: f.stack.push(f.false if f.stack.pop() == f.false and f.stack.pop() == f.false else f.true))
        self.pw('AND',
                lambda f: f.stack.push(f.true if f.stack.pop() != f.false and f.stack.pop() != f.false else f.false))
        self.pw('INVERT', lambda f: f.stack.push(f.true if f.stack.pop() == f.false else f.false))

Lots of tests newly fail. They will be checking literal 1, for true I expect.

I’ll spare you all the replacements of 0 with f.false and so on. I’m back to my three tests failing because of the early out and the numeric and / or / invert.

Now I think we can do this:

        self.pw('OR',
                lambda f: f.stack.push(f.stack.pop() | f.stack.pop()))

Hm, I really expected that to fix some stuff.

Oh #$%@@!, if I’m going to check the stack explicitly, I need to clear it. There’s that rule about just one assert in a test. Life is so short, though. Clear between checks:

    def test_or(self):
        f = Forth()
        f.compile('0 0 or')
        assert f.stack.stack == [f.false]
        f.stack.clear()
        f.compile('-1 0 or')
        assert f.stack.stack == [f.true]
        f.stack.clear()
        f.compile('0 -1 or')
        assert f.stack.stack == [f.true]
        f.stack.clear()
        f.compile('-1 -1 or')
        assert f.stack.stack == [f.true]

Do that in the other tests as well. Maybe we need a test method to return the stack and empty it?

    def test_or(self):
        f = Forth()
        f.compile('0 0 or')
        assert f.stack.flush() == [f.false]
        f.compile('-1 0 or')
        assert f.stack.flush() == [f.true]
        f.compile('0 -1 or')
        assert f.stack.flush() == [f.true]
        f.compile('-1 -1 or')
        assert f.stack.flush() == [f.true]

class Stack:
    def flush(self):
        copy = self.stack[:]
        self.stack.clear()
        return copy

The OR test is green now. Let’s fix AND.

        self.pw('AND',
                lambda f: f.stack.push(f.stack.pop() & f.stack.pop()))

And now we are down to just the new and or invert test, so:

        self.pw('INVERT', lambda f: f.stack.push(~f.stack.pop()))

And we are green. Commit: fix AND, OR, INVERT, and comparators to use proper Forth true and false, and to work on all integers.

Reflection

I’m glad we wrote those additional tests, and I am happy with the new flush helper on Stack, which makes it easier to write tests and encourages checking the whole stack, not just the top. And I am glad that Laurent caught me out, because that made me just a bit more careful, which generally pays off.

And, curiously, it appears from the line count here on my screen that we have more than enough lines to make an article. But a couple of things first.

Think! You Better Think!

It is good advice, if you’re the sort of person who is interested in advice, to think, after fixing a problem, about what similar problems might exist in the program. In this instance, I wonder whether there are other tests that do not check whether the stack is correct “all the way down” and that might thus be weak … or even leaving defects in the system.

I do a bit of that but didn’t look at all the tests. I’m getting bored and tired. Make a note, just in case I become ambitious.

Hey, Wait! What about “craftsmanship”?

Isn’t it unprofessional of me to skip out without checking every test to see if it can be improved? Should I turn in my craftsman badge? I don’t think so. I’ve taken a decent look around and made some improvements, and I’ve left myself a note to increase my chances of noticing this kind of issue in the future. I’m not some perfect god of programming, and I don’t expect anyone else to be. I try to notice things, try to bump up my attention to things, try to make the code better than it was yesterday.

My craft is small steps in the direction of better, not some rote adherence to some old dude’s rules of programming. Even this old dude’s rules.

One more thing, I said there were a couple.

Getting down to cases

I’ve been thinking about how I might program a bot in Forth, a bot of the kind we have in the Python robot world program, and in particular, thinking about the state machine we have, which has three states, Walking, Looking, and Laden.

I’m not sure just how I’d represent those states, but a good guess is that there is a word for each, and a variable somewhere saying what the current / next state is, something like

0 CONSTANT LADEN
1 CONSTANT LOOKING
2 CONSTANT WALKING

VARIABLE STATE 1 ALLOT 
    \ need to check whether 1 is assumed in the standard

WALKING STATE !

We have the states defined as integers and we have set state to WALKING, or 2.

How do we run the bot? Presumably there is some agreed name between us and the World about what word will be repeatedly called to give a bot a turn. Let’s pretend it is YOUR_TURN. What will we want YOUR_TURN to do? We’ll want it to run whichever state is active, I suppose. We could do that with IF and ELSE, but it wouldn’t be pretty. (If Forth is ever pretty.)

So I am planning to implement CASE, which would go, I think, like this:

STATE @ CASE
  LADEN OF DO_LADEN ENDOF
  LOOKING OF DO_LOOKING ENDOF
  WALKING OF DO_WALKING ENDOF
ENDCASE

This will be, I think, more than a little tricky, because if the OF does not match we have to skip to the next OF, and we have to have the right stuff still on the stack, and when we hit an ENDOF, we have to skip to the end, and, because it is in the standard, if none of the OF clauses hits, we should execute the code between the final one and the ENDCASE. Throughout, there are clear statements of what you’ll find on the stack as all this happens.

So there will be two different kinds of things on the compilation stack, one for the OF-ENDOF jump, and another for CASE-ENDCASE, that will link up all the ENDOF jumps to the end. (I suppose one could jump from the first to the second to the third .. which might be easier in some ways, albeit kind of tacky and slightly inefficient. Obviously the “right” think is from each directly to the end.)

Our handling of IF-ELSE_THEN and the looping constructs has been a bit ad hoc, as I learned about what Forth does and worked out ways of doing it. I think I have a better understanding now, and after I hand-compile a few more CASE-ENDCASE examples, I think I’ll have an even better set of ideas. So I am guessing that we’ll have a bit of overall improvement, reducing “technical debt” as Ward Cunningham originally defined it, as a spate of improvement that follows naturally as we learn more and more about our design and how to improve it.

(As opposed to the common use of the term today to mean “crappy code that we wrote knowing full well that it was crap”.)

We’ll see about that, probably tomorrow, unless I go wild and come back this afternoon. It could happen.

I should mention that there is another way to handle the state machine. We could define the STATE variable to contain the executable link to the state code we want. We would set STATE and use it like this:

' DO_LADEN STATE !

STATE @ EXECUTE

The first line gets the “address” (called execution token in Forth) of the DO_LADEN word, and stores the token in STATE. The second line fetches that token and executes it. We don’t yet have the “tick” and EXECUTE words, but we could. We probably will. But I still think we’ll do CASE, although it is tempting not to: I expect it to be a bit of a pain.

But it’ll be fun, too. Wait and see.

Summary

There was a serious defect in AND and OR. In a real program, they would have sometimes left the stack in a bad state, which would have manifested as strange behavior at best and possibly as a stack overflow late on in running of a program. By chance, I was going to fix it accidentally, because of the requirement for those words to work on regular numbers. I was fortunate to have a remote partner, Laurent, who spotted the issue. Had I been able to pair or mob with folx, one of my partners might have spotted the issue.

Now, we are more alert to the possibility and I think we’ll find that we tighten up our testing, and may add a few new capabilities to help ensure that the stack winds up as we intend. Certainly the largest issue with Forth is getting the stack into a state that isn’t quite what someone expects, so we’ll do well to pay more attention to that.

And I cannot begin to express how weird it seems to me to represent true with -1. Minus one seems so … negative. And no is negative. Weird.

That aside, a good morning. And a better one than it would have been without Laurent! See you next time!