FAFO on GitHub

Let’s look around and see what we might do. I’ll even make a list. Jira is an abomination upon the land.

Good morning, friends!

Let’s review the notes on my cards, and whatever else comes to mind. In no particular order, just what comes off my cards:

  • Cardinality: optimize.
  • Fix Flat symbol table use XST (We tried this once already.)
  • a, b, c = an_xset. returns tuples. Is this good?
  • think more about pop. Use scope set? union a default?
  • re_scope method could use comprehension?
  • big sets?
  • spill storage for large sets? tuples easy to store?
  • is_null?
  • use reduce in XFlat fields?
  • use reduce elsewhere? do we need map?
  • improve class creation methods (mostly OK now?)
  • XRelation - one set of symbols for entire set
  • record (verb) scopes in project operator?
  • weird bound method display in assert
  • duplicates in restrict output?
  • think about complex scopes e.g. <symbol, type>

I’ve been scribbling these, more cryptic than even shown above, on 3x5 cards, more than one note per card. I need a bit more room for these notes than will fit on a 1.5x1 sticky note. BUt then I cross a few off the card and soon there’s like one tiny note per card.

“You should use Jira.”
Thanks for that suggestion. Jira is an abomination upon the land. Its primary effect is to reduce human-to-human communication, reducing it to note passing. This is exactly what we were not talking about when we wrote “Individuals and interactions over processes and tools”.

But I digress. What else might we work on?

  • Create a decent README page for the Repo
  • Devise a way to print sets more reasonably
  • Work a bit with more highly nested sets, just in case you’re not suffering enough already
  • See whether there are more operations defined by Childs that might be useful

Pick One

I think the first thing I’d like to do is to look at the weird display I get when asserts fail. I’ve been working around them long enough. Let’s see if I can get one to happen. Easily done. I invert the sense of one test and get this display:

>       assert not employee.includes('jeffries', 'last')
E       AssertionError: assert not True
E        +  where True = <bound method XSet.includes of XSet(XFlat('jeffries    ron         coder          12000'))>('jeffries', 'last')
E        +    where <bound method XSet.includes of XSet(XFlat('jeffries    ron         coder          12000'))> = XSet(XFlat('jeffries    ron         coder          12000')).includes

The first thing I want to try is to extract the includes as a variable and see how that changes the result.

        has_jeffries = employee.includes('jeffries', 'last')
>       assert not has_jeffries
E       assert not True

Problem goes away. I wonder if this is an artifact of pytest, or something about my code. Let’s write a specialized test.

    def test_bound_method_thing(self):
        class Foo:
            def __init__(self, arg):
                self.arg = arg

            def includes(self, arg):
                return arg == self.arg

        foo = Foo(3)
        assert foo.includes(5)

This gives me the same kind of result:

>       assert foo.includes(5)
E       assert False
E        +  where False = <bound method TestMisc.test_bound_method_thing.<locals>.Foo.includes of <test_misc.TestMisc.test_bound_method_thing.<locals>.Foo object at 0x103d0f350>>(5)
E        +    where <bound method TestMisc.test_bound_method_thing.<locals>.Foo.includes of <test_misc.TestMisc.test_bound_method_thing.<locals>.Foo object at 0x103d0f350>> = <test_misc.TestMisc.test_bound_method_thing.<locals>.Foo object at 0x103d0f350>.includes

Conclusion: it’s just something weird that pytest does, and nothing about my XSet code.

I check whether adding a message helps:

>       assert foo.includes(5), "did not include 5"
E       AssertionError: did not include 5
E       assert False
E        +  where False = <bound method TestMisc.test_bound_method_thing.<locals>.Foo.includes of <test_misc.TestMisc.test_bound_method_thing.<locals>.Foo object at 0x1039d76d0>>(5)
E        +    where <bound method TestMisc.test_bound_method_thing.<locals>.Foo.includes of <test_misc.TestMisc.test_bound_method_thing.<locals>.Foo object at 0x1039d76d0>> = <test_misc.TestMisc.test_bound_method_thing.<locals>.Foo object at 0x1039d76d0>.includes

Not really. I think what is happening is that pytest’s attempt to dig into the assertion isn’t as helpful as it might be. In any case, it’s not my code, and since I can always improve the display if I need to, we’ll call this issue closed.

Commit: testing to check out pytest bound method displays. not our bug.

Pop, and Returning Tuples

I accidentally learned that this works:

    def test_tuples_from_set(self):
        s = XSet.n_tuple(['a', 'b', 'c'])
        try:
            a, b, c, d = s
        except ValueError:
            pass
        try:
            a, b = s
        except ValueError:
            pass
        a, b, c = s
        a_e, a_s = a
        assert a_e in ['a', 'b', 'c']
        assert a_s in [1, 2, 3]

You can assign an XSet to a series of tuples, if you know exactly how many there will be. You can even break them up:

    def test_tuple_split(self):
        s = XSet.n_tuple(['a', 'b', 'c'])
        (a0, a1), (b0, b1), (c0, c1) = s
        assert a0 in ['a', 'b', 'c']
        assert b0 in ['a', 'b', 'c']
        assert c0 in ['a', 'b', 'c']

That’s almost nifty.

What is not nifty is that even though s is an n-tuple, like this:

{ a1, b2, c3 },

its elements do not come out in numeric order. It seems, intuitively, that they should, but aside from our own expectations, there’s no difference between that set ones like this:

{ ax, b42, chello }
{ jeffriesfirst, ronfirst, serfjob }

We have no solid expectation for the order of those sets, although two candidate ideas come to mind:

  • same order as they went in
  • ordered by sorted scope

Formally, however, there is no “order”, there are only scopes. We conventionally make use of the integers to define an ordered n-tuple, but even so:

<a, b, c> == { c3, a1, b2 }

Our basic memory storage scheme for sets is XFrozen, which is a frozenset of 2-tuples t, with the element as t[0] and the scope as t[1]. That structure was chosen because the frozen set automatically removes duplicates, and while in set theory {a, a} == {a}, inside the computer, duplicates are a pain. This was a conscious design decision on my part, and it could have been a poor one. There are other ways of dealing with duplicates, certainly. Generally we like to get rid of them in output, and even just tagging along before display they take up time and space. But there are other ways. I chose this one on purpose.

There’s a human design issue here that I’m concerned about. It would be easy to make n-tuples produce their elements in numeric order, so long as the system knows it has an n-tuple in hand. We will probably do that shortly: I have an XTuple implementation under way.

But if we do that, then we will get used to that happening, and we may bias our tests in that direction or even real program output. Then, when we least expect it, some set will either not be an n-tuple, or won’t be recognized as one, and things will come out in “the wrong order”.

To the best of my knowledge, there is no formal definition of the way that an XST system produces values that can be displayed in the real world. The developers of the system just come up with some conventions, sorting, winnowing, whatever, and then take advantage of an artifact.

What do you mean, artifact?

Thanks for asking. Suppose that we modify our set creation method n_tuple such that it stores elements in actual order from first to last, and suppose we make that “official”: it’ll always be true. Then, to get some set into a desired order, if we can associate its records with the integers, we can stuff it into an n-tuple and when we print from the n-tuple, we’ll get what we wanted.

That’s not an inherent property of the math: the math allows any order, even in an n-tuple. It’s just the mapping between element and the integers that counts. It happens to work because we crafted it that way. It’s a good thing, but it’s because we built it that way, not because the fundamental model of things, extended sets, support it. It’s not naturally present: it happens as a result of our code.

It’s not necessarily bad, but while you could legitimately send me a bug report if a union came up with too few elements, you can’t legitimately complain if it contains two copies of the same thing. Same with this.

Maybe I’m quibbling, but I’m trying to be precise about what is in the set theory and what is not. Output order is not. We can build it, but it’s outside the theory. That means we have to be careful with it. It may confuse us, it may confuse our users if we have any, and it might change.

Could we get down to earth here?

Well, maybe closer to earth. We have this method pop, that returns one element-scope pair from a set:

    def pop(self):
        it = iter(self)
        return next(it, (None, None))

One objection to this method is its name, pop, which usually suggests that the element is removed from the collection, off the top. We’ll rename this as soon as I get an idea that appeals to me more. I’ve considered choose or element or fetch, but nothing is singing to me.

Another objection is that it returns None, None if there is no element-scope pair in the set. We might want to consider what that means. We could redefine pop so that you pass in the default you want back if there is no element in the set, must as the Python next does right there in front of us.

I had been thinking about trying to define pop as processing the union of two sets, the set you pop, unioned with another set { NoneNone> }, but then you have to describe the fact that it prefers an item from the original.

What is your point, even?

My point is that designing an information system, even one with a formal foundation, gets tricky at the edges, where the formality meets the code and the code meets the human. I believe, and I could be wrong, that some careful thought about these boundaries pays off, like what does it mean to grab an element from a set, and what does the order of iteration mean. IF we can get these edges smooth, no one will snag themselves on them, and programming with out system will be easier.

I could be going too far. Don’t worry, it all comes back to code.

XTuple

I’ve mentioned XTuple a few times. It’s an XImplementation that I’ve been working on off and on. Here’s a test that usually fails:

    def test_n_tuple_string(self):
        n = XSet.n_tuple(['a', 'b', 'c'])
        output = [f'{e}^{s}' for e, s in n]
        result = ', '.join(output)
        assert result == 'a^1, b^2, c^3'
Expected :'a^1, b^2, c^3'
Actual   :'a^1, c^3, b^2'

The test could, in principle, pass, because the order of those items is whatever frozenset picks. In practice, I think the items always hash the same. Anyway, we really want 1, 2, 3 order here.

I have one comprehensive test for my new XTuple, and I’ve just added the formatting test to the end of it:

    def test_hacked_n_tuple(self):
        n_tuple = ("a", "b", "c")
        test_set = XSet.n_tuple(n_tuple)
        impl = XTuple(n_tuple)
        test_set.implementation = impl  # Jam it in there
        assert test_set.includes("a", 1)
        assert test_set.includes("c", 3)
        assert test_set.excludes("a", None)
        assert test_set.excludes("a", 3)
        assert test_set.excludes("d", 4)
        assert test_set.excludes("c", 0)
        assert test_set.excludes("b", 0)
        assert test_set.excludes("a", 0)
        assert test_set.excludes("c", -1)
        assert test_set.excludes("b", -1)
        assert test_set.excludes("a", -1)
        tally = 0
        for e,s in test_set:
            assert e in ['a', 'b', 'c']
            tally += s
        assert tally == 6
        output = [f'{e}^{s}' for e, s in test_set]
        result = ', '.join(output)
        assert result == 'a^1, b^2, c^3'

The test runs green. As you can see, the test is just jamming the implementation into the XSet, which is my usual practice until I release a new implementation. Let’s change XSet.n_tuple to use it.

    @classmethod
    def n_tuple(cls, a_list) -> Self:
        return cls(XTuple(a_list))

All the tests go green, after I provide __hash__ and __repr__. Here is XTuple:

class XTuple(XImplementation):
    def __init__(self, data):
        self.data = data

    def __contains__(self, t):
        if not isinstance(t, tuple):
            return False  # should raise?
        e, s = t
        return isinstance(s, int) and 0 < s <= len(self.data) and self.data[s-1] == e

    def __iter__(self):
        scope = 1
        for e in self.data:
            yield e, scope
            scope += 1

    def __hash__(self):
        return hash(self.data)

    def __repr__(self):
        return 'XTuple(' + repr(self.data) + ')'

Even my failing test about the print being out of order runs now, because our new n_tuple class delivers its elements in numeric order.

Commit: class method n_tuple uses XTuple implementation, saving scope storage and ensuring element production in numeric order.

Reflection

I’m thinking about sorting and order. Suppose we have a record r

r = { jeffries^last, ron^first, serf^job }

Suppose that we want to display ‘ron jeffries, serf’ in the salutation to the email we send out, informing the employees that their bonuses will be zero again this year, because the cost of fuel for the president’s yacht has gone up. We might set up the re_scoping set:

s = { first^1, last^2, job^3 }

then r.re_scope(s) is:

{ jeffries^2, ron^1, serf^2 }, or
< ron, jeffries, serf >

And — if the system knew that was an n_tuple — we could iterate it and get the desired result.

Furthermore … is it the case that if a re-scoping set is an n_tuple, then any set re-scoped by that set is an n_tuple? No, not quite, because the input set might have two jobs. So we can’t quite just assume that we can rescope to an n_tuple …

Sorting and output needs more thinking. We may just have to ad-hoc it and hope for the best.

Enough for now, what have we wrought?

Summary

We have wrought a new form of set implementation which behaves as an n-tuple “should”, while storing only the elements and computing the scopes on the fly. So that’s nice, and it will surely play a role in our work later. Note that the XFlatFile also represents an n-tuple by the way, since it numbers its elements 1 through n as well.

Today, I’ve done a bit more speculating than seems ideal, and a bit less coding. But the trick is to decide what to code, and part of that really does come down to seeing clearly what these sets are, and how one would like to access them.

I am reminded of a method I had in the Codea XST that I worked on for a while, named at. a_set.at(a_scope) would return whatever was in the set at that scope. I don’t recall whether it insisted there be only one thing, whether it returned a collection, or what. Doesn’t really matter: I think we’re on the cusp of wanting something like that here, so that we can compute a set and then display it in some human-readable form.

Maybe I have to write a little app. That might be amusing. We’ll see.

Be well, friends. See you next time!