Field Values

FAFO on GitHub

We need to work on assignments and field values. I was planning to start with values, but we do assignments instead. That’s where the path looked best.

Good morning, Friends!

Recall that our expression tokens include literals, operators, and scopes. Our expressions are intended to be applied to records whose scopes are, in essence, alphanumeric field names. Our lexing creates the tokens, the parser arranges them in RPN order, and our Expression object interprets them, producing a value.

As written, the Expression is created with a special field, ‘scope’, that was intended to be the field name into which the expression value should be stored. At the time that decision was made, I was thinking that we’d strip the Foo = part off the input expression and save Foo as the field name. I am now thinking that we can let the lexer and parser do that job.

If we do that, then there will be two “meanings” for a scope token in the RPN. Most of them will be intended to fetch their value from the record being processed, but the final one would represent the name being stored into.

Not so fast. We may wish to create a new record with the new field in it, but we also may be creating a virtual record that calculates the field on the fly so that we do not have to create an entirely new set just because we want a calculated field. In that case, we will surely want to have the final name, the field name being computed, known before we even begin calculating.
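One way to picture the "virtual record" idea is a thin wrapper that answers for the computed field by calculating it on demand and passes every other name through to the underlying record. This is only a sketch: CalculatedView, its constructor arguments, and the use of a plain dict as the record are all assumptions for illustration, not anything in the FAFO code.

```python
# Hypothetical sketch: a view that adds one calculated field to a record
# without copying it. All names here are assumptions for illustration.
class CalculatedView:
    def __init__(self, record, field_name, compute):
        self._record = record          # underlying record, here a plain dict
        self._field_name = field_name  # known before any calculation happens
        self._compute = compute        # callable that takes the record

    def field_names(self):
        return list(self._record) + [self._field_name]

    def get(self, name):
        if name == self._field_name:
            return self._compute(self._record)  # calculated on the fly
        return self._record[name]


record = {'pay': '1000', 'bonus': '50'}
view = CalculatedView(record, 'total',
                      lambda r: str(int(r['pay']) + int(r['bonus'])))
```

Note that the view has to be told 'total' up front. That is the reason the field name needs to be available before evaluation starts.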

I think we can arrange for all the right things to happen. I believe that our parsing should produce RPN that shows the assignment, and that our Expression object should inspect the RPN and find its name.

That’s not entirely clear, is it? It’s not clear to me either. Our mission is to make it clear, by making it work.

I think we’ll start with the parsing. If we allow ‘=’ as an operator, and we give it the lowest priority, we should get what we want. Let’s find out.

Processing Assignment

Begin with a test.

    def test_assignment(self):
        text = 'four = 3 + 1'
        rpn = Parser(text).rpn()
        values = [t.value for t in rpn]
        assert values == ['3', '1', '+', 'four', '=']

That fails, of course:

Expected :['3', '1', '+', 'four', '=']
Actual   :['four', '=', '3', '1', '+']

I believe, but am far from certain, that all we have to do is provide a low-priority operator for =.

    def make_token(self, string_item):
        if string_item in ['*', '/']:
            return Token('operator', string_item, 2)
        elif string_item in ['+', '-']:
            return Token('operator', string_item, 1)
        elif string_item == '=':
            return Token('operator', string_item, 0)
        elif string_item[0].isalpha():
            return Token('scope', string_item, None)
        else:
            return Token('literal', string_item, None)

This did not pass the test. I am disappointed but let’s see what happened.

Expected :['3', '1', '+', 'four', '=']
Actual   :['four', '3', '1', '+', '=']

I think that’s actually correct and that my test is wrong. When we evaluate that, we’ll push four, then 3, then 1, we’ll add and push 4, and then hit =, which would, in principle, pop the name and value and store the value in the name.
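To see why the name comes out first and the = last, here is a minimal shunting-yard sketch. It is not the article's Parser: it works on whitespace-separated strings, and the names to_rpn and PRIORITY are made up for the illustration. The priority numbers mirror the ones in make_token.

```python
# Minimal shunting-yard sketch, NOT the article's Parser. Tokens are plain
# strings split on whitespace; the priority table mirrors make_token.
PRIORITY = {'=': 0, '+': 1, '-': 1, '*': 2, '/': 2}


def to_rpn(text):
    output = []
    pending = []  # operators waiting for their right-hand operand
    for item in text.split():
        if item in PRIORITY:
            # Emit waiting operators of equal or higher priority first.
            while pending and PRIORITY[pending[-1]] >= PRIORITY[item]:
                output.append(pending.pop())
            pending.append(item)
        else:
            output.append(item)  # operands go straight to the output
    while pending:
        output.append(pending.pop())
    return output


rpn = to_rpn('four = 3 + 1')
```

Operands are emitted as soon as they are seen, so four lands at the front. The = has the lowest priority, so nothing pushes it out of the pending stack until the input is exhausted, and it lands at the end.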

However, this bungs up my theory of how to deal with the = in our case. I was thinking I could just check for ‘=’ at the end and pop off two values, saving the name in the Expression’s scope. Now it’s a bit more tricky. Let’s try a further test, after fixing up this one.

    def test_expression_gets_scope(self):
        text = 'four = 3 + 1'
        rpn = Parser(text).rpn()
        expr = Expression('wrong', rpn)
        assert expr.scope() == 'four'

Fix Expression. It’s currently like this:

class Expression:
    def __init__(self, scope, operations):
        self._scope = scope
        self._operations = operations[::-1]

We want to look for the = and if we find it, edit the operations, extracting the assignment name. (We should test the operations as well.)

We code:

class Expression:
    def __init__(self, scope, operations):
        self._scope = scope
        if operations and operations[-1].value == '=':
            self._scope = operations[0].value
            operations = operations[1:-1]
        self._operations = operations[::-1]

That’s kind of creepy but it works. Commit: Parser parses assignment operator = and Expression extracts its name from rpn if provided.
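The parenthetical about testing the operations is worth a quick check, because the slice bounds are easy to get wrong. The sketch below uses a namedtuple as a stand-in for Token (the field name priority is an assumption) and looks only at what a slice leaves behind: for five tokens, [1:-1] keeps the three in the middle, while [1:-2] would also drop the +.

```python
from collections import namedtuple

# Stand-in for the article's Token; the field names are assumptions.
Token = namedtuple('Token', ['kind', 'value', 'priority'])

rpn = [
    Token('scope', 'four', None),
    Token('literal', '3', None),
    Token('literal', '1', None),
    Token('operator', '+', 1),
    Token('operator', '=', 0),
]

# Dropping the leading name and the trailing '=' leaves the calculation.
remaining = [t.value for t in rpn[1:-1]]

# Stopping at -2 instead would lose the '+' as well.
too_short = [t.value for t in rpn[1:-2]]
```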

Now let’s clean that code up a bit:

class Expression:
    def __init__(self, scope, operations):
        self._scope = scope
        if operations:
            last_token = operations[-1]
            if last_token.is_assignment():
                first_token = operations[0]
                self._scope = first_token.value
                operations = operations[1:-1]
        self._operations = operations[::-1]

And in Token, of course:

class Token:
    def is_assignment(self):
        return self.value == '='

I still don’t love that code. I want to extract at least one meaningful method from it. Let’s reorder things, first setting up the member variable and then using it:

class Expression:
    def __init__(self, scope, operations):
        self._scope = scope
        self._operations = operations[::-1]
        if self._operations:
            last_token = self._operations[0]
            if last_token.is_assignment():
                first_token = self._operations[-1]
                self._scope = first_token.value
                self._operations = self._operations[1:-1]

Note that I had to reverse the 0 and -1 in the if statement, because we’ve already reversed the expression for convenience. Now we can extract:
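Here is a small illustration of why the indices swap, using plain strings rather than Token objects (an assumption made for brevity). Reversing the list lets pop() hand back tokens in RPN order, but it also moves the = to the front and the name to the back.

```python
rpn_values = ['four', '3', '1', '+', '=']
reversed_values = rpn_values[::-1]

first = reversed_values[0]    # the '=' that used to be last
last = reversed_values[-1]    # the name that used to be first

# pop() takes from the end, so tokens come back in their RPN order.
work = list(reversed_values)
popped = [work.pop() for _ in range(len(work))]
```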

class Expression:
    def __init__(self, scope, operations):
        self._scope = scope
        self._operations = operations[::-1]
        self.handle_assignment()

    def handle_assignment(self):
        if self._operations:
            initial_token = self._operations[0]
            if initial_token.is_assignment():
                final_token = self._operations[-1]
                self._scope = final_token.value
                self._operations = self._operations[1:-1]

I recognized that the token names were wrong after the reversal. And then I realized that _operations is also not a good name: they are tokens. So:

class Expression:
    def __init__(self, scope, tokens):
        self._scope = scope
        self._tokens = tokens[::-1]
        self.handle_assignment()

    def handle_assignment(self):
        if self._tokens:
            initial_token = self._tokens[0]
            if initial_token.is_assignment():
                final_token = self._tokens[-1]
                self._scope = final_token.value
                self._tokens = self._tokens[1:-1]

Good enough. Commit: refactoring.

While doing that, I noticed something that’s incomplete. Here’s a test that should fail:

    def test_float(self):
        text = '20.5 * 2 + 1'
        rpn = Parser(text).rpn()
        result = Expression('Ignored', rpn).result(None)
        assert result == '42.0'

Huh. I thought that would fail. It does not. I expected it to fail, because I saw this code:

class Expression:
    def result(self, record):
        def to_number(string):
            return int(string)

        stack = []
        while self._tokens:
            op = self._tokens.pop()
            if op.kind == 'literal':
                stack.append(op.value)
            elif op.kind == 'operator':
                # The right-hand operand is on top of the stack, so pop it first.
                op2 = self.to_number(stack.pop())
                op1 = self.to_number(stack.pop())
                match op.value:
                    case '+':
                        res = op1 + op2
                    case '-':
                        res = op1 - op2
                    case '*':
                        res = op1 * op2
                    case '/':
                        res = op1 / op2
                    case _:
                        res = f'Unknown operator {op.value}'

                stack.append(str(res))
        return stack.pop()

How is it that we’re getting the right answer here? Ah. We are calling self.to_number, and that is correct. The internal function was left over from a copy-paste. Bad Ron, no biscuit. Remove the unused function. Here’s what’s used, and it works, as we’ve previously tested and discussed:

class Expression:
    def to_number(self, string):
        try:
            return int(string)
        except ValueError:
            return float(string)

So we are good. Commit: new test that didn’t fail after all.
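For the record, here is the arithmetic from that test walked through by hand. It is a standalone copy of to_number as a plain function (the method in Expression takes self), with the intermediate names product and total invented for the illustration.

```python
def to_number(string):
    # Try the narrower type first; '20.5' is not an int, so fall back to float.
    try:
        return int(string)
    except ValueError:
        return float(string)


product = to_number('20.5') * to_number('2')  # float times int gives a float
total = product + to_number('1')
result = str(total)
```

Because one operand is a float, the whole result is a float, which is why the test expects '42.0' rather than '42'.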

I think we’re at a good stopping point, so let’s sum up.

Summary

I thought we would work on getting the values out of an associated XSet record and using them in our Expression. A bit of consideration of the larger problem, which includes assignment to a name as well as fetching a value given a name, caused us to step not straight to fetching values, but instead to discriminating between a name that holds a value and a name that is to be assigned a value.

We could have gone to fetching the value first and then dealt with the assignment, and the end result might be the same, at the end of the next session or two. But it seemed that how we handled the assignment might matter, and I had thought about how it would fit into the parser.

Possibly, it’s the difference, given A + B = C, between doing A first or B first, which seems to make no difference at all. But when we’re standing at zero, A looks like A, and B looks like B. Once we step to A, A itself usually looks a little different, A’, and we have a different view of B, B’. We absolutely know that we’re going to learn something about A, and we can be nearly certain that with that done, our view of B will be different, generally better. (I don’t mean that B will be easier: it might not. I mean that we will see it from another angle and thus have a better sense of how to do it.)

It seemed to me that if we changed parsing to deal with the assignment, it would have some impact on the Expression. In the back of my mind, I was thinking about unwinding the expression and ultimately coming to the = and doing it. But then I remembered that we want our expression to know in advance what scope it produces, so that we can use it as a view, not just to make a real field. So I stopped short of executing = and instead inspected the token list to see if it had an assignment, and if it did, set the Expression’s scope to that value.

Try as I might, I can’t write down my every thought in these articles. You may think I try too hard, but in fact I always fall short of describing every design thought that I have. The point of showing as many as I can is that we are always designing, always thinking, and we allow what we see and think to affect what we do, and to affect the direction we move in. We try to keep as much of our thinking actually in the code as we can.

Well, the good thinking. We try to keep the bad ideas out, which is why we think all the time, so that we’ll more often recognize the difference between a good idea and a bad one.

This morning, I think we had good ones. Our tests are improved and they demonstrate that we can recognize an assignment and make note of it.

What about errors?

I’m glad you asked. Our Parser and Expression do not handle errors at all. We can readily make bad things happen.

    def test_too_many_operators(self):
        text = '2 + - *'
        rpn = Parser(text).rpn
        with pytest.raises(TypeError):
            result = Expression('Ignored', rpn).result(None)

That raises a TypeError. I have no idea why and don’t intend to explore it just now. (One thing to check when I do: the test as written says .rpn without the parentheses, so Expression may be receiving a method rather than a list of tokens, and slicing a method raises a TypeError.) The point is, malformed input will misbehave and could crash us. So we do need to deal with some error situations. My tentative and ill-formed notion of a suggestion of a plan is to cause the expression to return an error message as its result. We’re leaving that for a future time, but yes, we do need to deal with errors in the expression, without crashing the system.
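As a sketch of what "return an error message as its result" might look like, here is a tiny standalone evaluator over plain RPN strings. It is not the Expression class: the function name evaluate, the 'Error: ' prefix, and the choice of which exceptions to catch are all assumptions, and the real design is still to be worked out.

```python
def evaluate(rpn_values):
    """Evaluate a list of RPN strings; return an error message on failure."""
    def to_number(string):
        try:
            return int(string)
        except ValueError:
            return float(string)

    stack = []
    try:
        for item in rpn_values:
            if item in ('+', '-', '*', '/'):
                right = to_number(stack.pop())  # top of stack is the right operand
                left = to_number(stack.pop())
                if item == '+':
                    res = left + right
                elif item == '-':
                    res = left - right
                elif item == '*':
                    res = left * right
                else:
                    res = left / right
                stack.append(str(res))
            else:
                stack.append(item)
        return stack.pop()
    except (IndexError, ValueError, ZeroDivisionError) as error:
        # Too few operands, a non-numeric operand, or division by zero.
        return f'Error: {error}'
```

With this shape, evaluate(['3', '1', '+']) returns '4', while a list such as ['2', '+'] runs out of operands and comes back as a string starting with 'Error: ' instead of raising.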

So that’s the good news: there’s always something to do. I think next time we’ll probably extract values from records, on the way to making our expressions at least somewhat useful.

For this morning, we’ve made a step in the direction of calculated values and it feels like a pretty solid step.

See you next time!