XST 9: More cowbell.

Reflection leads me to focus a bit more on set operations, and less on internal methods. Does this call for a new layer? Also: real technical debt!

The new select function works well, and looks pretty good too.

function XSet:select(predicate)
    return coroutine.wrap(function()
        for e,s in self:elements() do
            if predicate(e,s) then
                coroutine.yield(e,s)
            end
        end
    end)
end

An issue with it is that it’s a conditional iterator, not a set operation. The corresponding set operation would return a new XSet. We could still iterate it if we need to, with elements. As it stands, we use select like this:

function XSet:restrict(B)
    -- return all elements (a) of self such that
    --  there exists a record (b) in B such that
    --      b:subset(a)
    local result = XSet()
    for e,s in self:select(function(e,s) return B:hasSubsetElement(e,s) end) do
        result:addAt(e,s)
    end
    return result
end

function XSet:copyIntoSuchThat(result, predicate)
    local added = false
    for e,s in self:select(predicate) do
        result:addAt(e,s)
        added = true
    end
    return added
end

(We also have a test for it.)

The first case, we’re creating a set. In the second, we use in this sequence:

function XSet:elementsSuchThat(predicate)
    local result = XSet()
    local added = self:copyIntoSuchThat(result, predicate)
    return added and result:lock() or XSet.NULL
end

function XSet:intersect(anXSet)
    return self:elementsSuchThat(function(e,s)
        return anXSet:hasAt(e,s)
    end)
end

In the case of intersect, we’re going three rails to create a set that select could create all on its own. And the added flag there is an optimization, intended to mirror the one we do in union, which has actually been moved to addAt:

function XSet:addAt(element, scope)
    if self.locked then error("set is locked") end
    if self:hasAt(element,scope) then return end
    if not self.contents[scope] then
        self.contents[scope] = {}
    end
    self.contents[scope][element] = true
    return self
end

We check to see whether the target set already has the element we’re adding, and don’t add it.

Reflection

What we’re seeing here is true “technical debt”, a deviation between the program we’d write today and the program we wrote yesterday. It’s not dirty code or untested code. It’s just that the best code we could see to write in the past isn’t as good as what we can see now, with all the insights of the past adding up.¹

As a programmer, one of my fears in writing this program has to do with performance. I “know” that making a set and then iterating it is less efficient than iterating the elements and saving them if I want them. So I tend, sometimes, to write code that is a bit too focused on performance as I imagine it. A far better outlook would be to think:

If my sets are too slow to use, I’ll speed them up.

So let’s change our test and move to a select that returns an XSet, and unwind and remove that stuff in intersect.

But wait, there’s more. It seems to me that there are coming to be some increasingly clear distinctions among the methods of XSet. Some are really set operations, like union, intersect, restrict. Others are useful on the interface between an app and the sets, like elements, which would be needed if you wanted to display an answer or the like. And then there are purely internal operations that we probably don’t want app writers using, such as addAt and copyInto. These are really intended to be used to build up the somewhat complex internal structure of sets.

In some languages, it might suffice to make these methods private. In Lua and other languages, there’s no such facility. All the methods are there just asking to be used. Now my usual reaction if someone uses some internal method and breaks things is to say “Doctor, it hurts when I move my arm like this”. But there’s more.

When we start having feelings like this, it’s a likely sign that there is some kind of object trying to be born. There’s probably some kind of top-level object with the app-oriented methods, and the interface methods, and some lower-level object containing the detailed mechanisms.

At this moment, I don’t see what those objects are. But I see them, vaguely, trying to emerge. We’ll try to stay alert for some way to express this understanding in our code.

For now, the select method wants to return an XSet.

Select

Here’s my current select test:

        _:test("Coroutine select", function()
            local s = XSet():addAt("1","a"):addAt("2","b"):addAt("3","c")
            local pred = function(e,s) return e=="1" or s=="c" end
            for e,s in s:select(pred) do
                _:expect(e=="1" or s=="c").is(true)
            end
        end)

Let’s just recast this to expect select to return an XSet, then make it so.

        _:test("Coroutine select", function()
            local s = XSet():addAt("1","a"):addAt("2","b"):addAt("3","c")
            local pred = function(e,s) return e=="1" or s=="c" end
            local selected = s:select(pred)
            for e,s in selected:elements() do
                _:expect(e=="1" or s=="c").is(true)
            end
        end)

This will fail, not understanding elements, I reckon.

12: Coroutine select -- Tests:184: attempt to index a function value (local 'selected')

Yes, that’s what that means. select returns a coroutine, i.e. a function.

Let’s just fix select that will break its other users.

My first move was this:

function XSet:select(predicate)
    local provider = coroutine.wrap(function()
        for e,s in self:elements() do
            if predicate(e,s) then
                coroutine.yield(e,s)
            end
        end
    end)
    local result = XSet()
    for e,s in provider do
        result:addAt(e,s)
    end
    return result
end

We’ve got the coroutine, just use it. But that’s just silly. We can do this:

function XSet:select(predicate)
    local result = XSet()
    for e,s in self:elements() do
        if predicate(e,s) then result:addAt(e,s) end
    end
    return result
end

This makes me sad. My cool coroutine from yesterday is no longer needed. And, of course, the intersect tests are failing.

Well, let’s look at intersect:

function XSet:intersect(anXSet)
    return self:elementsSuchThat(function(e,s)
        return anXSet:hasAt(e,s)
    end)
end

And so on. Just write it here:

function XSet:interset(anXSet)
    return self:select(function(e,s) return anXSet:hasAt(e,s) end)
end

This seems right to me but the tests continue to fail. Let’s look more deeply.

They’re all failing saying “attempt to call a table value”. Two at line 242, one at line 358.

Oh look:

function XSet:copyIntoSuchThat(result, predicate)
    local added = false
    for e,s in self:select(predicate) do
        result:addAt(e,s)
        added = true
    end
    return added
end

We can fix that, though I think in the end we plan to get rid of it. Let’s get green first.

function XSet:copyIntoSuchThat(result, predicate)
    local added = false
    local temp = self:select(predicate)
    for e,s in temp:elements() do
        result:addAt(e,s)
        added = true
    end
    return added
end

I expect the two complaining about 242 to work or at least improve. They work. Now the other line:

function XSet:restrict(B)
    -- return all elements (a) of self such that
    --  there exists a record (b) in B such that
    --      b:subset(a)
    local result = XSet()
    for e,s in self:select(function(e,s) return B:hasSubsetElement(e,s) end) do
        result:addAt(e,s)
    end
    return result
end

Yes, well, we can just use the new select now:

function XSet:restrict(B)
    -- return all elements (a) of self such that
    --  there exists a record (b) in B such that
    --      b:subset(a)
    return self:select(function(e,s) return B:hasSubsetElement(e,s) end)
end

Nice. Comment longer than the code. Sweet.

Let’s look at hasSubsetElements.

function XSet:hasSubsetElement(a)
    -- return true if some element b of this set 
    --    is a subset of a
    --    i.e. b:isSubset(a)
    return self:exists(function(b,s) return b:isSubset(a) end)
end

Let’s inline that in restrict and see how we feel about it.

function XSet:restrict(B)
    -- return all elements (a) of self such that
    --  there exists a record (b) in B such that
    --      b:subset(a)
    return self:select(function(a,s) return B:exists(function(b,s) return b:isSubset(a) end) end)
end

That works. It’s not as readable as it might be. Let’s first try a different layout:

function XSet:restrict(B)
    -- return all elements (a) of self such that
    --  there exists a record (b) in B such that
    --      b:subset(a)
    return self:select(function(a,s) 
        return B:exists(function(b,s) 
            return b:isSubset(a) 
        end) 
    end)
end

That says what it does and does what it says. Tests are green. Commit: select returns a set. restrict uses new select. inlined restrict` predicate.

The three-sentence commit message suggests that there were a couple of other commits that I could have done.

Now that has subset method must be unused.

Uh oh! I’ve made a mistake. First revert.

In the intersect method above, I spelled it interset. That means that the method wasn’t used, and as it happens, I left the old one in for reference. So they were using:

function XSet:intersect(anXSet)
    return self:elementsSuchThat(function(e,s)
        return anXSet:hasAt(e,s)
    end)
end

function XSet:interset(anXSet)
    return self:select(function(e,s) return anXSet:hasAt(e,s) end)
end

And I’m not sure just why but the new one is failing.

Ah … I bet it’s this:

function XSet:elementsSuchThat(predicate)
    local result = XSet()
    local added = self:copyIntoSuchThat(result, predicate)
    return added and result:lock() or XSet.NULL
end

The original maps all empty sets back to NULL. That’s likely at least part of the problem.. Let’s run the tests and find out the truth.

Intersect  -- Actual: nothing thrown, Expected: set is locked
Null Intersect  -- Actual: XSet(0 elements), Expected: NULL
Null Intersect  -- Actual: nothing thrown, Expected: set is locked

Yep. We need to return a proper NULL and we need to lock the set. The latter should happen in select.

function XSet:select(predicate)
    local result = XSet()
    for e,s in self:elements() do
        if predicate(e,s) then result:addAt(e,s) end
    end
    return result:lock()
end

Right. That made the locked messages go away.

By the way, I am thinking that lock isn’t really needed by users, as they will ultimately have no way to add elements to sets other than by legitimate set operations. But I do like having it for my own purposes. It hasn’t caught any errors yet, but the month is young.

Now the NULL thing. I could do this:

function XSet:intersect(anXSet)
    local result = self:select(function(e,s) return anXSet:hasAt(e,s) end)
    return result:card()>0 and result or XSet.NULL
end

That works, the tests are green. Commit: fix problem with intersect.

I have a concern. This code to check things for being null and if they are, returning the canonical null set needs to be done everywhere that a set could be null. I haven’t tested a restrict that gets no results, but the code we just looked at tells me that it won’t return null.

Why do I care? Well, mostly I don’t. We could have a method isNull that people could use if they cared. I do have a concern about–possibly–the fact that NULL is the scope of things like my record collections, nested sets. The concern is that if we’re going to compare them, they might not match.

There’s that optimization thinking again. Let’s continue to provide NULL, but let’s relax the constraint that all empty sets must == NULL.

I’ll remove all the or XSet.NULL bits and fix the tests.

I implement this:

function XSet:isNull()
    for e,s in self:elements() do
        return false
    end
    return true
end

This is a sneaky efficient check for null. I apologize, but I couldn’t resist. If there are elements, the loop immediately exists early, returning false. If there are none, the loop immediately completes, returning true.

I’ll test for that.

        _:test("Null Intersect", function()
            local A = XSet()
            A:addAt("Jeffries", "last")
            A:addAt("Ron", "first")
            _:expect(A:card(), "#A").is(2)
            local B = XSet()
            B: addAt("Ron", "notFirst")
            _:expect(B:card(), "#B").is(1)
            local C = A:intersect(B)
            _:expect(C:card()).is(0)
            _:expect(C:isNull()).is(true)
            _:expect(function() C:addAt("x", 1) end).throws("set is locked")
        end)

function XSet:__tostring()
    if self:isNull() then return "NULL" end
    return "XSet("..self:card().." elements)"
end

This one’s deleted as unused:

function XSet:elementsSuchThat(predicate)
    local result = XSet()
    local added = self:copyIntoSuchThat(result, predicate)
    return added and result:lock() or XSet.NULL
end

This:

function XSet:intersect(anXSet)
    local result = self:select(function(e,s) return anXSet:hasAt(e,s) end)
    return result:card()>0 and result or XSet.NULL
end

Goes back to this:

function XSet:intersect(anXSet)
    return self:select(function(e,s) return anXSet:hasAt(e,s) end):lock()
end

(I added the lock(). I propose to do that all over.)

Method copyIntoSuchThat removed as unused.

The cool but unused method goes:

function XSet:elementsDo(f)
    for e,s in self:elements() do
        f(e,s)
    end
end

The for will remain our standard for now, as it is less obscure. Better to have one decent way than two to chose between. If we need it, we’ll revive it.

hasSubsetElement is unused. Remove.

Green. Commit: removed huge tracts of unused code.

I ran across this:

function XSet:element_co()
    for scope,elements in pairs(self.contents) do
        for element, _true in pairs(elements) do
            coroutine.yield(element,scope)
        end
    end
end

function XSet:elements()
    -- return an iterator for contents
    return coroutine.wrap(function() self:element_co() end)
end

Let’s inline that.

function XSet:elements()
    -- return an iterator for contents
    return coroutine.wrap(function() 
        for scope,elements in pairs(self.contents) do
            for element,_true in pairs(elements) do
                coroutine.yield(element,scope)
            end
        end
    end)
end

I’m of two minds about that but I think it’s just barely better. Tomorrow I may change my mind.

Green. Commit: inline coroutine in elements.

Let’s sum up.

Summary

We’ve removed 45 or 50 lines of code with no reduction in capability. If I were being paid by the line, I’d be losing money. We’ve moved toward creating sets even if all we want to do is throw them away, rather than trying to write tricky faster internal code. We’ve centralized set iteration around a single method, elements() rather than a few.

Over the past few days, we’ve implemented some quantifiers, exists and every. Every might perhaps be better named forAll and exists thereExists, but we’ll let that ride.

We’ve left the NULL literal available but no longer require created empty sets to turn magically into NULL. We provide isNull as a convenient (and faster) way of finding out whether a set is null.

The program has a smaller surface, and that’s good. I suspect there are still layers waiting to be created. We’ll think about that in future articles.

Today we saw a very nice example of reduction of technical debt. We had perfectly reasonable code, and we saw a way to make it better, a way that we couldn’t quite see even a few days ago.

I just love when that happens. I’m going to include the whole program today, because I want it on my other iPad. You’re welcome to unzip it if you wish. Improve it and send me ideas. It’s all good.

XST.zip

Postscript

Carl Manaster tweeted:

If I’m understanding right, it seems like isNull() could be implemented in terms of exists(), fwiw.

I think Carl is correct, and I’m gonna try it.

We have:

function XSet:isNull()
    for e,s in self:elements() do
        return false
    end
    return true
end

Let’s try:

function XSet:isNull()
    return not self:exists(function(e,s) return true end)
end

Tests run but let’s enhance a few more, there’s only about one test for it.

        _:test("Is element", function()
            local A = XSet()
            A:addAt("Jeffries", "last")
            A:addAt("Chet", "first")
            _:expect(A:hasAt("Jeffries", "last"), "has").is(true)
            _:expect(A:hasAt("Chet", "last"),"hasnt").is(false)
            _:expect(A:isNull()).is(false)
            _:expect(A:card()).is(2)
        end)

And I tossed in a couple of others. Tests green.

Commit: isNull uses exists per Carl Manaster.

The notion of technical debt is due to the great Ward Cunningham, and has to do with learning, and not to do with writing dirty code. Using the term to mean the latter is a misuse of the term. ↩