FAFO on GitHub

Childs has defined two “re-scope” operations. Let’s see if we can implement one of them, and if we’re glad we did.

I’ll spare you the Childs definition of these if I possibly can. Let’s take Re-Scope by Scope. Informally, given a set S of scoped elements, written element^scope,

S = { old1^new1, old2^new2 }

and some set Z:

Z = { abc^old1, def^old2, ghi^old3 }

Z.re_scope(S) will be:

{ abc^new1, def^new2 }

The operation both selects elements from Z, picking only those whose scopes are the elements of S, and renames the scopes of the selected elements to the corresponding scope from S.
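The select-then-rename reading can be sketched with plain Python sets of (element, scope) pairs. This is only an illustration under that assumption: it does not use the XSet class, and the names new_names, selected, and re_scoped are made up for the example.

```python
# Sketch only: scoped sets are modeled as plain Python sets of
# (element, scope) pairs; this is not the XSet class from the article.
from collections import defaultdict

S = {("old1", "new1"), ("old2", "new2")}
Z = {("abc", "old1"), ("def", "old2"), ("ghi", "old3")}

# Index S by its elements (the old scope names), collecting the new names.
new_names = defaultdict(set)
for old, new in S:
    new_names[old].add(new)

# Step 1 (select): keep only the pairs of Z whose scope is an element of S.
selected = {(e, scope) for e, scope in Z if scope in new_names}

# Step 2 (rename): replace each kept scope with the corresponding scope from S.
re_scoped = {(e, new) for e, scope in selected for new in new_names[scope]}

print(sorted(selected))   # [('abc', 'old1'), ('def', 'old2')]
print(sorted(re_scoped))  # [('abc', 'new1'), ('def', 'new2')]
```

The ghi element drops out because its scope, old3, is not an element of S.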

Let’s write a test.

    def test_re_scope(self):
        s = XSet.from_tuples((('first', 'first_name'), ('last', 'last_name')))
        z = XSet.from_tuples((('ron', 'first'), ('jeffries', 'last'), ('serf', 'job')))
        r = z.re_scope(s)
        assert r.includes('jeffries', 'last_name')
        assert r.includes('ron', 'first_name')
        count = 0
        for _e, _s in r:
            count += 1
        assert count == 2

I expect the operation to find the first and last elements of z and rename them to first_name and last_name. The test fails for want of the operator.

Now to write the thing. Hm, that wasn’t hard at all:

    def re_scope(self, other):
        new_tuples = []
        for e, s in self:
            for old, new in other:
                if old == s:
                    new_tuples.append((e, new))
        return XSet.from_tuples(new_tuples)

Test is green. Commit: re-scope (by element).

Now let’s ask ourselves, selves, what would a re-scope do if the old names and new names were the same? And we answer after a little thought: it would select all the elements with those scopes out of the set and no others.
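Here is a small sketch of that thought experiment, again on plain (element, scope) pairs rather than real XSets. The helper re_scope_pairs is a hypothetical stand-in that mirrors the nested loops of re_scope.

```python
# Sketch only: re_scope_pairs is a hypothetical stand-in for XSet.re_scope,
# working on plain Python sets of (element, scope) pairs.

def re_scope_pairs(z, s):
    new_pairs = set()
    for element, scope in z:
        for old, new in s:
            if old == scope:
                new_pairs.add((element, new))
    return new_pairs

record = {("ron", "first"), ("jeffries", "last"), ("serf", "job")}

# Old and new names are the same, so nothing is actually renamed.
identity = {("first", "first"), ("last", "last")}

projected = re_scope_pairs(record, identity)
print(sorted(projected))  # [('jeffries', 'last'), ('ron', 'first')]
```

The result is a subset of the original record: the named fields survive unchanged and the job field is gone.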

And then we’d ask ourselves, isn’t that exactly what a project operator does to each record? And again after a little thought, we would answer: you know, I think you’re right. And then we would look at project to see if we can use re_scope to do it.

    def project(self, field_selector: Self) -> Self:
        projected = [self.project_one_record(record_element, field_selector)
                     for record_element, record_scope in self]
        return XSet.classical_set(projected)

    def project_one_record(self, record_element, field_selector):
        new_atoms = [(field, field_name)
                     for field, field_name in record_element
                     for desired_field_name, _ in field_selector
                     if desired_field_name == field_name]
        return XSet.from_tuples(new_atoms)

Now if we read that carefully, I think we find that the scopes in the field_selector set don’t matter, any scopes will do, and that its elements are the scopes to be selected (projected).

Let’s make a set of { x^x } from that, each field name scoped by itself, and rescope with it. Like this:

    def project(self, field_selector: Self) -> Self:
        projector = XSet.from_tuples([(e, e) for e, _ in field_selector])
        projected = [self.project_one_record(record_element, projector)
                     for record_element, record_scope in self]
        return XSet.classical_set(projected)

    def project_one_record(self, record_element, projector):
        return record_element.re_scope(projector)

We are green. Inline:

    def project(self, field_selector: Self) -> Self:
        projector = XSet.from_tuples([(e, e) for e, _ignored in field_selector])
        projected = [record_element.re_scope(projector)
                     for record_element, record_scope in self]
        return XSet.classical_set(projected)

We could inline projected but let’s not.

There is another issue here, which is the fact that we return a classical set from project, no matter what the scopes were on the original input set. That’s fairly convenient, but a serious case can be made for returning the records with their original scopes. I’ll make a note of that.

The restrict and select operators, which return a subset of records from a set, do preserve the original scopes:

    def restrict(self, restrictor) -> Self:
        def other_has_match(e, s):
            return any((e_r.is_subset(e) for e_r, s_r in restrictor))

        if not isinstance(restrictor, self.__class__):
            return NotImplemented
        return self.select(other_has_match)

    def select(self, cond) -> Self:
        tuples = list((e, s) for e, s in self if cond(e, s))
        return XSet.from_tuples(tuples)

Formally speaking, that’s better. Clearly we can define operators that do anything we want, and our project, given any set, returns a classical set, deal with it. But “deal with it” isn’t quite the ideal way to do things.

One possibly surprising property of this project is this one:

    def test_project(self):
        path = '~/Desktop/job_db'
        fields = XFlat.fields(('last', 12, 'first', 12, 'job', 12, 'pay', 8))
        ff = XFlatFile(path, fields)
        flat_set = XSet(ff)
        fields = XSet.classical_set(["last"])
        projected = flat_set.project(fields)
        count = 0
        for person, s in projected:
            count += 1
        assert count == 5

We project the 1000 records of the flat file on ‘last’, and get 5 records, because there are only five last names in the set:

        lasts = ["jeffries", "wake", "hill", "hendrickson", "iam"]

If we were to include the scopes for each record in our project, we’d have returned 1000 records, 200 for each last name.
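The two counts can be reproduced with a hypothetical stand-in for the flat file. The assumptions are loud ones: records are frozensets of pairs, each record's original scope is its record number, None plays the role of XSet.null, and the five names are spread evenly.

```python
# Sketch only: a hypothetical stand-in for the flat file, not the real XFlatFile.
lasts = ["jeffries", "wake", "hill", "hendrickson", "iam"]
NULL = None  # assumption: plays the role of XSet.null

# 1000 projected records, each holding just its 'last' field.
# Assumption: each record's original scope is its record number, 1..1000.
projected = [(frozenset({(lasts[i % 5], "last")}), i + 1) for i in range(1000)]

# project as written: every record gets the same null scope, so equal
# records become equal pairs and the set keeps one of each.
classical = {(record, NULL) for record, _scope in projected}

# Alternative: keep the original scopes, so every pair stays distinct.
scoped = set(projected)

print(len(classical))  # 5
print(len(scoped))     # 1000
```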

Where did the set get compressed? I think it happened at the last minute:

class XSet:
    @classmethod
    def classical_set(cls, a_list) -> Self:
        null = cls.null
        wrapped = [(item, null) for item in a_list]
        return cls.from_tuples(wrapped)

    @classmethod
    def from_tuples(cls, tuples):
        def is_2_tuple(a):
            return isinstance(a, tuple) and len(a) == 2

        concrete = list(tuples)  # ensure we didn't get passed a generator
        if not all(is_2_tuple(a) for a in concrete):
            raise AttributeError
        return cls(XFrozen(frozenset(concrete)))

The frozenset will have collapsed the duplicates when it was built from the list. Since we do want a set here, let’s do that conversion a bit sooner in the code:

    @classmethod
    def from_tuples(cls, tuples):
        def is_2_tuple(a):
            return isinstance(a, tuple) and len(a) == 2

        frozen = frozenset(tuples)  # materialize once (handles generators) and drop duplicates
        if not all(is_2_tuple(a) for a in frozen):
            raise AttributeError
        return cls(XFrozen(frozen))

That will actually be a bit more efficient all around, I think. Can’t be worse, might be better.
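One small behavioral difference seems worth noting. The sketch below copies the two validation orders into standalone functions; the names validate_list_first, validate_frozen_first, and error_name are invented for the comparison, and the XFrozen wrapping is left out. Both accept a generator, but an unhashable item such as a list now fails inside frozenset with a TypeError before the 2-tuple check gets a chance to raise AttributeError.

```python
# Sketch only: standalone copies of the two validation orders, without XFrozen.

def is_2_tuple(a):
    return isinstance(a, tuple) and len(a) == 2

def validate_list_first(tuples):
    concrete = list(tuples)
    if not all(is_2_tuple(a) for a in concrete):
        raise AttributeError
    return frozenset(concrete)

def validate_frozen_first(tuples):
    frozen = frozenset(tuples)
    if not all(is_2_tuple(a) for a in frozen):
        raise AttributeError
    return frozen

def error_name(fn, items):
    """Return the name of the exception fn raises on items, or None."""
    try:
        fn(items)
    except Exception as exc:
        return type(exc).__name__
    return None

# A generator with a duplicate: both versions return the same two pairs.
names = ["hill", "hill", "wake"]
from_gen = validate_frozen_first((name, "last") for name in names)
print(sorted(from_gen))  # [('hill', 'last'), ('wake', 'last')]

# A list where a tuple was expected: the error type changes.
bad = [["hill", "last"]]
print(error_name(validate_list_first, bad))    # AttributeError
print(error_name(validate_frozen_first, bad))  # TypeError
```

Hashable non-tuples, such as a bare string, still raise AttributeError in both versions.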

Commit: Use re_scope in project. Add test for project. Improve from_tuples. Commit more often, you tiny fool.

This has been fun and productive. Let’s sum up.

Summary

I got so interested in what I was doing that I missed a couple of places where I could have committed. It is only slightly tempting to commit on every fully green test run. Only slightly. I’ll continue to try to do better.

The re_scope operation went in readily: I literally typed it in correctly on the first try. I think it could be done with a comprehension, but I’ll save that for a future occasion.
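For the curious, a comprehension version might look something like the sketch below. MiniSet is a hypothetical, much-reduced stand-in for XSet, with just enough of from_tuples, iteration, and includes to run the same checks as the test; it is not the real class.

```python
# Sketch only: MiniSet is a hypothetical stand-in for XSet.

class MiniSet:
    def __init__(self, pairs):
        self.pairs = frozenset(pairs)

    @classmethod
    def from_tuples(cls, tuples):
        return cls(tuples)

    def __iter__(self):
        return iter(self.pairs)

    def includes(self, element, scope):
        return (element, scope) in self.pairs

    def re_scope(self, other):
        # Same nested iteration as the loop version, as one generator expression.
        return MiniSet.from_tuples(
            (e, new)
            for e, s in self
            for old, new in other
            if old == s)

s = MiniSet.from_tuples((('first', 'first_name'), ('last', 'last_name')))
z = MiniSet.from_tuples((('ron', 'first'), ('jeffries', 'last'), ('serf', 'job')))
r = z.re_scope(s)
print(sorted(r))  # [('jeffries', 'last_name'), ('ron', 'first_name')]
```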

The way that project collapsed down into a call of the more general re_scope is gratifying and the sort of thing that happens when you get your underlying operations right or nearly so. Complicated things call other things. There may still be some complicated things in there, but they are fewer and we don’t need to think of their details, just what they accomplish.

An interesting question occurred to me, however. Suppose that we have a set, like one of our people records, with several fields, and we want to rename just some of the fields. How can we readily construct the correct re-scoping set to do the job?

I think the answer may turn out to be interesting. We’ll put that high on the list of things to do soon.
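One possible starting point, offered as a guess rather than as where this is headed: build the re-scoping set from the record itself, mapping every scope to itself unless a rename is supplied. The sketch works on plain (element, scope) pairs, and both renaming_set and re_scope_pairs are hypothetical helpers, not part of XSet.

```python
# Sketch only: plain sets of (element, scope) pairs, hypothetical helpers.

def re_scope_pairs(z, s):
    """Mirror of re_scope on plain pairs."""
    return {(e, new) for e, scope in z for old, new in s if old == scope}

def renaming_set(record, renames):
    """Map every scope in record to itself unless renames gives a new name."""
    return {(scope, renames.get(scope, scope)) for _element, scope in record}

record = {("ron", "first"), ("jeffries", "last"), ("serf", "job")}

s = renaming_set(record, {"first": "first_name", "last": "last_name"})
print(sorted(s))
# [('first', 'first_name'), ('job', 'job'), ('last', 'last_name')]

renamed = re_scope_pairs(record, s)
print(sorted(renamed))
# [('jeffries', 'last_name'), ('ron', 'first_name'), ('serf', 'job')]
```

The job field keeps its name because it maps to itself, so nothing is dropped.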

See you then!