A look at the code. Maybe a bit more on stats. P.S. I learn something and ditch almost all the code I wrote this morning.

It’s my birthday and I’ll code if I want to. At least until breakfast. It’s also Sunday, so we have a ritual to follow.

Anyway, I wanted to glance over the code and see if there are things to improve. There usually are. Then, if time permits, we might do some more with the stats method. Like rename it to statistics?

To the code, me hearties!

Here’s what we have for the newest method, stats:

function XSet:stats(controlTable)
    local sum = function(input,accumulator, control)
        local fields = control.sum
        for i,field in ipairs(fields) do
            local accum = accumulator:at(field) or 0
            accum = accum + input:at(field) or 0
            accumulator:putAt(tostring(accum),field)
        end
    end
    local result = XSet()
    local groupFields = controlTable.group or {}
    for record,scope in self:elements() do
        local accumulator = result:findOrCreateAccumulator(record, groupFields)
        sum(record, accumulator, controlTable)
    end
    return result
end

function XSet:findOrCreateAccumulator(record, groupFields)
    local accumulator
    local matchRecord = self:createMatchRecord(record, groupFields)
    local matchSet = XSet():addAt(matchRecord,NULL)
    local accumulatorSet = self:restrict(matchSet)
    if accumulatorSet:isNull() then
        accumulator = matchRecord
        self:addAt(matchRecord,NULL)
    else
        accumulator = accumulatorSet:at(NULL)
    end
    return accumulator
end

function XSet:createMatchRecord(record,groupFields)
    local matchRecord = XSet()
    for i,field in ipairs(groupFields) do
        local value = record:at(field) or "MISSING"
        matchRecord:addAt(value, field)
    end
    return matchRecord
end

My eyes went first to the sum function, which makes stats look more complicated than it is.

And they went second to the findOrCreateAccumulator method, whose name is a code scent, and whose implementation is a bit odd. I think we should either move creating the match set down to the third method, or at least package its use better. But there’s a problem.

When we don’t find an existing accumulator record in the result set, we use the match record as the initializer. So we need both of them as things are currently laid out. Let’s try creating a new set operation, restrictElement which takes a single element to use in the restrict, rather than a collection. Then we can write this:

function XSet:findOrCreateAccumulator(record, groupFields)
    local accumulator
    local matchRecord = self:createMatchRecord(record, groupFields)
    local accumulatorSet = self:restrictElement(matchRecord)
    if accumulatorSet:isNull() then
        accumulator = matchRecord
        self:addAt(matchRecord,NULL)
    else
        accumulator = accumulatorSet:at(NULL)
    end
    return accumulator
end

Tests will fail for want of that method.

5: Grouping -- XSet:250: attempt to call a nil value (method 'restrictElement')
etc.

Write it:

function XSet:restrictElement(E)
    return self:restrict(XSet():addAt(E,NULL))
end

I just wrapped it and called the original. Tests should be green. They are. Commit? Sure, why not. Commit: Create and use restrictElement.

I still don’t exactly love this:

function XSet:findOrCreateAccumulator(record, groupFields)
    local accumulator
    local matchRecord = self:createMatchRecord(record, groupFields)
    local accumulatorSet = self:restrictElement(matchRecord)
    if accumulatorSet:isNull() then
        accumulator = matchRecord
        self:addAt(matchRecord,NULL)
    else
        accumulator = accumulatorSet:at(NULL)
    end
    return accumulator
end

Let’s see if this is better:

function XSet:findOrCreateAccumulator(record, groupFields)
    local matchRecord = self:createMatchRecord(record, groupFields)
    local accumulatorSet = self:restrictElement(matchRecord)
    local accumulator = matchRecord
    if accumulatorSet:isNull() then
        self:addAt(matchRecord,NULL)
    else
        accumulator = accumulatorSet:at(NULL)
    end
    return accumulator
end

It is, but not much better. But wait. It seems like what we really want1 is to find a matching element in the accumulator set, even if we have to put it in there ourselves. There must be a better way to describe that.

I’m going to try typing in what I want, i.e. expressing my intention.

I wind up with this:

function XSet:stats(controlTable)
    local sum = function(input,accumulator, control)
        local fields = control.sum
        for i,field in ipairs(fields) do
            local accum = accumulator:at(field) or 0
            accum = accum + input:at(field) or 0
            accumulator:putAt(tostring(accum),field)
        end
    end
    local result = XSet()
    local groupFields = controlTable.group or {}
    for record,scope in self:elements() do
        local accumulator = result:getAccumulator(record, groupFields)
        sum(record, accumulator, controlTable)
    end
    return result
end

function XSet:getAccumulator(record, groupFields)
    local matchRecord = self:createMatchRecord(record, groupFields)
    return self:findMatchOrInsert(matchRecord)
end

function XSet:findMatchOrInsert(matchRecord)
    local match = self:findMatch(matchRecord)
    return match or self:addAndReturnElement(matchRecord,NULL)
end

function XSet:findMatch(matchRecord)
    local found = self:restrictElement(matchRecord)
    if found:isNull() then
        return nil
    else 
        return found:at(NULL)
    end
end

function XSet:addAndReturnElement(e,s)
    self:addAt(e,s)
    return e
end

function XSet:createMatchRecord(record,groupFields)
    local matchRecord = XSet()
    for i,field in ipairs(groupFields) do
        local value = record:at(field) or "MISSING"
        matchRecord:addAt(value, field)
    end
    return matchRecord
end

What I’m starting to like here is that most of these methods are small and not full of a lot of conditions. What I am not liking is that they are a bit more ad hoc than I’d prefer. They don’t seem to have clear “set theoretic” definitions. As such, they rather contaminate the XSet namespace, which is mostly about set operations.

I think we can do better. I’m going to revert this back to the restrictElement commit, and I’m not entirely sure I like that one, but it is at least a sensible set operation.

I think I need another layer or something, to provide for the legitimate need of operations like stats to have a workspace of their own to define things in.

We’ll address that next time. This morning, we tried a few things, got some results worth considering, and treated the whole thing as a learning experience. Not in a bad way. There wasn’t really any way that I could have foreseen that I wouldn’t like what I did quite enough.

An odd result for the morning, but a legitimate one.

See you tomorrow, if all goes as I plan.


  1. … what we really really wanna zigazig ah …