Errors

I have a bit of time, with distractions. Let’s see about error conditions in our expressions.

Good afternoon!

I have this test, improved to make sense from the one I wrote this morning:

    def test_too_many_operators(self):
        text = '2 + - *'
        rpn = Parser(text).rpn()
        result = Expression('Ignored', rpn).result(None)

My plan is to create errors and then turn them into results of the RPN, where possible. The current error is:

>       elif string_item[0].isalpha():
E       IndexError: string index out of range

This is a zero-length item in the input. I didn’t think my lexer could produce such a thing. It turns out that it puts null spacers between adjacent operators. We’ll ignore them:

    def lex(self):
        no_space = self._expr.replace(' ', '')
        rx = '([^a-zA-Z0-9.])'
        split = re.split(rx, no_space)
        return [item for item in split if item]

Now we get a different error:

>               op2 = self.to_number(stack.pop())
E               IndexError: pop from empty list

Right. Too many operators. New test:

    def test_too_many_operators(self):
        text = '2 + - *'
        rpn = Parser(text).rpn()
        result = Expression('Ignored', rpn).result(None)
        assert 'Too many operators' in result

And code:

    def result(self, record):
        stack = []
        while self._tokens:
            op = self._tokens.pop()
            if op.kind == 'literal':
                stack.append(op.value)
            elif op.kind == 'operator':
                try:
                    op1 = self.to_number(stack.pop())
                    op2 = self.to_number(stack.pop())
                except IndexError:
                    return f'Too many operators'
                match op.value:
                    case '+':
                        res = op1 + op2
                    case '-':
                        res = op1 - op2
                    case '*':
                        res = op1 * op2
                    case '/':
                        res = op1 / op2
                    case _:
                        res = f'Unknown operator{op.value}'

                stack.append(str(res))
        return stack.pop()

I think I’d like to have the actual expression in the message. To do that, we’ll have to pass it in as part of setting up the Expression.

What if we create a tiny object to be the result of passing a text to the Parser, that contains the original text and the RPN? Then the Expression would have access to the text in the rare case that it was needed. Or, we could cache the tokens in RPN form and display those.

Let’s do that first.

    def test_too_many_operators(self):
        text = '2 + - *'
        rpn = Parser(text).rpn()
        result = Expression('Ignored', rpn).result(None)
        assert result == "Too many operators: ['2', '*', '-', '+']"

And:

class Expression:
    def __init__(self, scope, tokens):
        self._scope = scope
        self._cached_tokens = [t.value for t in tokens]
        self._tokens = tokens[::-1]
        self.handle_assignment()

        ...
            elif op.kind == 'operator':
                try:
                    op1 = self.to_number(stack.pop())
                    op2 = self.to_number(stack.pop())
                except IndexError:
                    return f'Too many operators: {self._cached_tokens}'

So that’s nice. Commit: adding error messages to expression.

The other possible error is that there is more than one thing on the stack when we exit.

As I work on that, I get a surprising lexing result that shouldn’t really surprise me:

    def test_lex_2(self):
        text = 'abc def + 5'
        lexed = Parser(text).lex()
        assert lexed == ['abcdef', '+', '5']

I think I need to stop removing the spaces, and instead clean up after. Because this is the result I want:

    def test_lex_2(self):
        text = 'abc   def  +    5'
        lexed = Parser(text).lex()
        assert lexed == ['abc', 'def', '+', '5']

Fix up lexing:

    def lex(self):
        rx = '([^a-zA-Z0-9.])'
        split = re.split(rx, self._expr)
        return [item for item in split if item and item != ' ']

Now for the message, I have this test, which I plan to use to capture the error detailed print:

    def test_malformed(self):
        text = 'abc def + 5'
        rpn = Parser(text).rpn()
        result = Expression('Ignored', rpn).result(None)
        assert result == "operand/operator mismatch: xyz"

But instead I get this:

Expected :'operand/operator mismatch: xyz'
Actual   :"Too many operators: ['abc', 'def', '5', '+']"

I do not see off hand how that happens. Oh. We do not handle scopes yet. Change my test.

    def test_malformed(self):
        text = '6 7 + 5'
        rpn = Parser(text).rpn()
        result = Expression('Ignored', rpn).result(None)
        assert result == "operand/operator mismatch: xyz"

And this code:

    def result(self, record):
        stack = []
        while self._tokens:
            op = self._tokens.pop()
            if op.kind == 'literal':
                stack.append(op.value)
            elif op.kind == 'operator':
                try:
                    op1 = self.to_number(stack.pop())
                    op2 = self.to_number(stack.pop())
                except IndexError:
                    return f'Too many operators: {self._cached_tokens}'
                match op.value:
                    case '+':
                        res = op1 + op2
                    case '-':
                        res = op1 - op2
                    case '*':
                        res = op1 * op2
                    case '/':
                        res = op1 / op2
                    case _:
                        res = f'Unknown operator{op.value}'

                stack.append(str(res))
        if len(stack) != 1:
            return f'operator/operand mismatch: {self._cached_tokens}'
        return stack.pop()

Just those two lines before the final return, and we get a message such that this will pass:

    def test_malformed(self):
        text = '6 7 + 5'
        rpn = Parser(text).rpn()
        result = Expression('Ignored', rpn).result(None)
        assert result == "operator/operand mismatch: ['6', '7', '5', '+']"

Commit: error message for operator operand mismatch.

Summary

With just a couple of lines, we’ve added some semi-reasonable error messages to our expressions, and ensured that exceptions are, if not impossible, at least less likely. We may find more to do. It would be “better” if we showed the original expression in the error message. We can rationalize that our users are programmers and can cope with RPN. Perhaps we’ll change it as we go forward.

We found some issues with our very quick and dirty lexing, and modified that code to improve it a bit. Putting two symbols adjacent is never legit but at least now they will come out as separate.

A noticeable improvement!

See you next time!