FAFO on GitHub

Let’s have another try at wrapping our frozenset in an object that will work for us. Things go much better … so far.

I suppose I should be asking myself if the effort of wrapping frozenset is necessary. But I feel sure that it is, if we are to allow new kinds of classes to serve as alternate implementations for XSets. It’s probably possible to leave the frozenset as is, but if we do, there will almost certainly be methods of frozenset that we rely on, and therefore must implement in any new implementation class.

In view of this morning’s um experience, I will try to proceed more judiciously, which mostly means with smaller steps, but perhaps also with smaller tests. And this time, let’s prepare the ground a bit by narrowing the methods that we use on frozenset, at least initially.

Theory, and theory

I believe that we can, in principle, implement set theory using only includes and iteration. I think we will stick with the current notion of includes, which takes two parameters, the element and the scope under which it is to be found.

Let’s begin by inspecting what methods of frozenset we use. PyCharm will give us all references to its implementation member, which right now is always a frozenset instance. It seems that there are 13:

Two are in tests:

    assert len(bosses.implementation) > 0
    ...
        test_set.implementation = impl 

We’ll deal with those later. The rest, all in XSet, are:

class XSet:
    def __bool__(self):
        return bool(self.implementation)

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.implementation == other.implementation
        else:
            return NotImplemented

    def __hash__(self):
        return hash(self.implementation)

    def __init__(self, a_list):
        def is_2_tuple(a):
            return isinstance(a, tuple) and len(a) == 2

        self.implementation = frozenset(a_list)
        if not all(is_2_tuple(a) for a in self.implementation):
            raise AttributeError

    def __iter__(self):
        return self.implementation.__iter__()

    def __repr__(self):
        if self == self.null:
            return "∅"
        else:
            return f"XSet({self.implementation})"

    def includes(self, element, scope) -> bool:
        if not scope:
            scope = self.null
        try:
            return self.implementation.includes(element, scope)
        except AttributeError:
            return (element, scope) in self

    def is_subset(self, other) -> bool:
        if isinstance(other, self.__class__):
            return self.implementation.issubset(other.implementation)
        else:
            return NotImplemented

Can we do without __bool__? There is a direct test of it and all the other tests run. We could mark the test to be skipped and clean it up later. Let’s remove the notion entirely for now. It is surely pythonic to have empty sets be Falsey, but let’s simplify the space for now. I’ve made a note on a card to put it back and I’ll make other notes as well. Commit: remove bool.

Reading on down here, I think we’re going to want to implement is_subset as the formal definition, and then use it to implement __eq__. That will be inefficient, but we’ll see if we can improve it later. For now, let’s be correct and minimize use of powerful frozenset operations.

    def is_subset(self, other) -> bool:
        if isinstance(other, self.__class__):
            return all(other.includes(e, s) for e,s in self)
        else:
            return NotImplemented

We do not have much in the way of testing for that. In fact, we have none, we just use it in restrict. Let’s do a bit of testing for this one.

    def test_classical_is_subset(self):
        c1 = XSet.classical_set((1, 2, 3, 4, 5))
        c2 = XSet.classical_set((2, 4))
        c3 = XSet.classical_set((1, 6))
        assert c2.is_subset(c1)
        assert not c3.is_subset(c1)

    def test_is_subset(self):
        r1 = XSet([("jeffries", "last"), ("ron", "first")])
        r2 = XSet([("jeffries", "last")])
        assert r2.is_subset(r1)
        assert not r1.is_subset(r2)

Green. Commit: use set theoretic definition of is_subset in XSet.

What’s next? Oh, right, eq. How well tested is that?

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.implementation == other.implementation
        else:
            return NotImplemented

If I just not the condition, 8 tests fail. Good enough. Plug in the formal definition, nasty but correct.

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.is_subset(other) and other.is_subset(self)
        else:
            return NotImplemented

Tests are all green. Commit: use set theoretic definition of eq.

What’s left now?

class XSet:
    def __init__(self, a_list):
        def is_2_tuple(a):
            return isinstance(a, tuple) and len(a) == 2

        self.implementation = frozenset(a_list)
        if not all(is_2_tuple(a) for a in self.implementation):
            raise AttributeError

    def __repr__(self):
        if self == self.null:
            return "∅"
        else:
            return f"XSet({self.implementation})"

    def __hash__(self):
        return hash(self.implementation)

    def __iter__(self):
        return self.implementation.__iter__()

    def includes(self, element, scope) -> bool:
        if not scope:
            scope = self.null
        try:
            return self.implementation.includes(element, scope)
        except AttributeError:
            return (element, scope) in self

I think that repr, hash, and iter are as they should be: We’ll defer those to the implementation, which must always provide them. As for includes, I think we should define that all implementations implement in, and expect a 2-tuple, (element, scope) as the parameter. I am not solidly convinced about this, but I think we’ll try it for now. It should work with frozenset in any case:

    def includes(self, element, scope) -> bool:
        if not scope:
            scope = self.null
        return (element, scope) in self
Note
The above is wrong. I should have said self.implementation. Discovered below.

One test fails:

    def test_hacked_n_tuple(self):
        n_tuple = ("a", "b", "c")
        test_set = XSet.tuple_set(n_tuple)
        impl = X_tuple(n_tuple)
        test_set.implementation = impl  # Jam it in there
        assert test_set.includes("a", 1)
        assert test_set.includes("c", 3)
        assert test_set.excludes("a", None)
        assert test_set.excludes("a", 3)
        assert test_set.excludes("d", 4)

That is our partially-completed X_tuple class. It looks like this:

class X_tuple:
    def __init__(self, data):
        self.data = data

    def includes(self, e, s):
        return isinstance(s, int) and s <= len(self.data) and self.data[s-1] == e

I don’t want to override includes in the implementations. I think that part of my confusion this morning was that the implementations used too many words and notations that are similar to those in XSet. We’ll try to use different terms at the two levels, at least for now. I do think we’ll need __contains__, and I am sure we’ll need __iter__. Let’s try __contains__ on X_tuple.

I thought this was supposed to work:

    def __contains__(self, t):
        e, s = t
        return isinstance(s, int) and s <= len(self.data) and self.data[s-1] == e

However, the error I get is this:

    def __iter__(self):
>       return self.implementation.__iter__()
E       AttributeError: 'X_tuple' object has no attribute '__iter__'

Oh. That’s referring to the __iter__ in XSet. XSet does not know __contains__. It’s includes that’s wrong. Should be:

    def includes(self, element, scope) -> bool:
        if not scope:
            scope = self.null
        return (element, scope) in self.implementation
Note
Bug / typo / omission / mistake discovered. Life is good!

We are green. Commit: includes implemented as (element, scope) in self.implementation. This is probably canon.

Good point to stop. Let’s sum up.

Summary

Much better than this morning. Instead of banging in a new class wrapper and fixing tests, I’ve narrowed the span of the API substantially, which means that when (if) we do bang in the new wrapper, fewer things will break. The changes we have to make should now be substantially easier.

We can line out that note about deferring includes: we have decided always to defer it, using in. Replace it with a note saying

Need Abstract Class: __contains__, __iter__, __hash__, __repr__(?).

That’s to remind us to define canonically what operations an XSet implementation must implement. We’ll use Python’s abstract class capability to set up that interface.

I think we have definitely smoothed out the landing field for our new class, and that next time, probably tomorrow, we’ll almost certainly have an easier time of things than we did this morning. I don’t regret the morning at all, however, because we learned some things explicitly and surely built up some tacit knowledge as well.

A good outcome for the afternoon. I wonder if we’ll have nachos with the game.

See you next time!