Today I plan to get grouping and summing working. Who knows, it might happen. If not, there’s always tomorrow or my birthday.

The really good thing about my situation here with summing and grouping is that I actually have a plan for how it should work, and even better, that plan is rather set-theoretic in nature. In my first cut at the problem, I didn’t have much of a plan. Oh, sure, I knew I was going to build up a table structure and then somehow munge it into sets for returning the result, but I didn’t know what the table should look like. In addition, I had chosen an output format that was just too complex to be allowed to live.

Now we have a very simple scheme. The grouped sum records will each contain the specific key fields for that group (Michigan, Livingston; Ohio, Butler; and so on) plus whatever summed fields are saved. Currently those are saved under the same name as the input. The sum of pay is named pay, the sum of bonus is named bonus. I suspect that we’ll want to change that rule when we do additional statistics.

The summing function was created as a function right in the test that checked it:

        _:test("Summing", function()
            local control = {sum={"pay","bonus"}}
            local summingSet = XSet()
            local sum = function(input,accumulator, control)
                local fields = control.sum
                for i,field in ipairs(fields) do
                    local accum = accumulator:at(field) or 0
                    accum = accum + (input:at(field) or 0) -- parens needed: + binds tighter than or
                    accumulator:putAt(tostring(accum),field)
                end
            end
            sum(p1,summingSet,control)
            _:expect(summingSet:at("pay")).is("1000")
            _:expect(summingSet:at("bonus")).is("50")
            sum(p2,summingSet,control)
            _:expect(summingSet:at("pay")).is("1100")
            _:expect(summingSet:at("bonus")).is("100")
            sum(p3,summingSet,control)
            sum(p4,summingSet,control)
            _:expect(summingSet:at("pay")).is("3300")
            _:expect(summingSet:at("bonus")).is("200")
        end)

It’s pretty simple. The function is given an input record, an accumulator record, and a control set. It goes over all the field names to be summed, and fetches the accumulated value of that field from the accumulator record, which is, of course, a set. If there is no value, it assumes zero. Then it fetches the same field from the input (or zero), and adds. Then it puts the accumulated value back into the accumulator set. (This last bit is a bit special, because we haven’t really invented updating yet.)

So the loop is just three lines and if I wanted to be fancy it could be two or one. I think it communicates better as it is.
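For anyone who wants to poke at the summing idea without XSet, here is a minimal sketch using plain Lua tables as stand-in records. The name sumInto and the sample records are made up for illustration; this is not the XSet code. One detail to watch: in Lua, + binds tighter than or, so the default of zero needs its own parentheses to apply to a missing input field.

```lua
-- Sketch only: plain tables stand in for XSet records.
-- sumInto is a hypothetical name, not part of XSet.
local function sumInto(input, accumulator, fields)
    for _, field in ipairs(fields) do
        -- Missing accumulator value counts as zero.
        local accum = tonumber(accumulator[field]) or 0
        -- Parenthesize the default: "a + x or 0" parses as "(a + x) or 0".
        accum = accum + (tonumber(input[field]) or 0)
        -- Store the total back as a string, as the test above expects.
        accumulator[field] = tostring(accum)
    end
end

local acc = {}
sumInto({pay = "1000", bonus = "50"}, acc, {"pay", "bonus"})
sumInto({pay = "100"}, acc, {"pay", "bonus"})  -- bonus missing, treated as 0
print(acc.pay, acc.bonus)
```

The second call leaves bonus unchanged rather than raising an error, which is the behavior the "or zero" wording describes.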

So the mission of the group part now becomes simple. My rough plan is:

Create a result set, a set of accumulators;
For each record in the input set:
  Find the corresponding accumulator in the result set;
  If none is found, create one and put it in;
  Call the summing function,
    passing input, accumulator, and control;
Return the result set.
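The plan above can be sketched in a few lines of plain Lua, with ordinary tables standing in for sets. This is only an illustration under assumptions: groupKey, groupedSums, and the sample data are hypothetical, and the string key is a shortcut that the set-theoretic version does not need.

```lua
-- Sketch only: plain Lua tables stand in for XSets, and the names
-- groupKey and groupedSums are hypothetical, not part of XSet.

-- Build a string key from the record's group fields.
local function groupKey(record, groupFields)
    local parts = {}
    for _, field in ipairs(groupFields) do
        parts[#parts + 1] = tostring(record[field] or "MISSING")
    end
    return table.concat(parts, "|")
end

local function groupedSums(records, control)
    local byKey = {}    -- key -> accumulator, for finding
    local result = {}   -- accumulators in first-seen order
    for _, record in ipairs(records) do
        local key = groupKey(record, control.group)
        local accumulator = byKey[key]
        if not accumulator then
            -- None found: create one holding the group's key fields.
            accumulator = {}
            for _, field in ipairs(control.group) do
                accumulator[field] = record[field] or "MISSING"
            end
            byKey[key] = accumulator
            result[#result + 1] = accumulator
        end
        -- Sum each requested field into the accumulator.
        for _, field in ipairs(control.sum) do
            accumulator[field] = (accumulator[field] or 0)
                + (tonumber(record[field]) or 0)
        end
    end
    return result
end

-- Made-up sample data, not the People set from the tests.
local people = {
    {state = "MI", pay = 10},
    {state = "OH", pay = 1},
    {state = "MI", pay = 20},
}
local sums = groupedSums(people, {sum = {"pay"}, group = {"state"}})
for _, acc in ipairs(sums) do
    print(acc.state, acc.pay)
end
```

The "|" separator would collide if a field value contained that character. Matching on a record of key fields, rather than on a concatenated string, sidesteps that problem.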

Of course I need a test, but I think we have one that is pretty close. Let’s get started.

        _:test("Grouping", function()
            local control = {sum={"pay"}, group={"state"} }
            local result = People:stats(control)
            _:expect(result:card(),"result").is(2)
        end)

Well, that’s not much, but honestly it’s a nice starting point. I think I’d best see what we have now for the stats method, since we did start on it.

Oh yes there’s a bunch. I should remove it but I’ll just rename it to XXX for now. No. Put on my big boy pants and remove the old stuff.

6: Grouping -- TestSum:71: attempt to call a nil value (method 'stats')

Perfect. I removed that heinous group string method and test as well.

function XSet:stats()
    local result = XSet()
    return result
end

Now I should get the wrong result for the card call.

5: Grouping result -- Actual: 0, Expected: 2

Sweet. Now I need to do a little work. I could fake this, but I’d rather start some of the actual function.

function XSet:stats(controlTable)
    local result = XSet()
    local groupFields = controlTable.group -- deal with missing
    for record,scope in self:elements() do
        local accumulator = findOrCreateAccumulator(result, record, control)
        sum(record, accumuulator, control)
    end
    return result
end

Look, he’s actually working by intention a bit: I intend to have a function findOrCreateAccumulator. Test says I need it:

5: Grouping -- XSet:223: attempt to call a nil value (global 'findOrCreateAccumulator')

Best write it.

This is getting pretty deep. I should have written a finer-grained test. I may have to do that in a moment. Anyway here’s what I’ve got:

function XSet:findOrCreateAccumulator(record, groupFields)
    local accumulator
    local matchRecord = XSet()
    for i,field in ipairs(groupFields) do
        local value = record:at(field) or "MISSING"
        matchRecord:addAt(value, field)
    end
    local matchSet = XSet():addAt(matchRecord,NULL)
    local accumulatorSet = self:restrict(matchSet)
    if accumulatorSet:isNULL() then
        accumulator = matchRecord
        self:addAt(matchRecord,NULL)
    else
        accumulator = accumulatorSet:choose()
    end
    return accumulator
end

This could use some refactoring even before it works. Let’s just do that, making the intention a bit more clear.

function XSet:findOrCreateAccumulator(record, groupFields)
    local accumulator
    local matchRecord = self:createMatchRecord(record, groupFields)
    local matchSet = XSet():addAt(matchRecord,NULL)
    local accumulatorSet = self:restrict(matchSet)
    if accumulatorSet:isNULL() then
        accumulator = matchRecord
        self:addAt(matchRecord,NULL)
    else
        accumulator = accumulatorSet:choose()
    end
    return accumulator
end

function XSet:createMatchRecord(record,groupFields)
    local matchRecord = XSet()
    for i,field in ipairs(groupFields) do
        local value = record:at(field) or "MISSING"
        matchRecord:addAt(value, field)
    end
    return matchRecord
end

You may not have noticed the call to choose down near the end. The restrict function returns a set of matching records. We know that there will be at most one, by construction. But we have no existing way to get an element from a set if we don’t …

Oh … but we HAVE … Let me recast that right now. I’m glad we had this little chat.

function XSet:findOrCreateAccumulator(record, groupFields)
    local accumulator
    local matchRecord = self:createMatchRecord(record, groupFields)
    local matchSet = XSet():addAt(matchRecord,NULL)
    local accumulatorSet = self:restrict(matchSet)
    if accumulatorSet:isNULL() then
        accumulator = matchRecord
        self:addAt(matchRecord,NULL)
    else
        accumulator = accumulatorSet:at(NULL)
    end
    return accumulator
end

I can use :at(NULL) to get that record. Should probably check cardinality but we’re already in over my head. I need finer-grained tests. But let’s run now to see what explodes.

5: Grouping -- XSet:234: attempt to call a nil value (method 'isNULL')

The function is isNull. OK.

5: Grouping -- XSet:224: attempt to call a nil value (global 'sum')

Well, in for a penny, let’s move that function over from the test.

function XSet:stats(controlTable)
    local sum = function(input,accumulator, control)
        local fields = control.sum
        for i,field in ipairs(fields) do
            local accum = accumulator:at(field) or 0
            accum = accum + (input:at(field) or 0) -- parens needed: + binds tighter than or
            accumulator:putAt(tostring(accum),field)
        end
    end
    local result = XSet()
    local groupFields = controlTable.group -- deal with missing
    for record,scope in self:elements() do
        local accumulator = result:findOrCreateAccumulator(record, groupFields)
        sum(record, accumuulator, control)
    end
    return result
end

Well, I called it with “control” and there ain’t no such thing.

5: Grouping -- XSet:223: attempt to index a nil value (local 'accumulator')

Uh oh, there’s a misspelled name there. Too many u’s in accumulator.

The test runs: it actually returned two records. Let me catch my breath before I even look to see what’s in them. Let’s relax and reflect.

Reflection

That was a lot of code to write with no real testing support, 35 or 40 lines. Three different loops, creation of a few sets, and a few set operations. I’d like to have had tests for those individual bits. The thing is, I didn’t know what the bits were going to be until I wrote them.

Now, a more strict master than I am might have insisted that we think or draw on the whiteboard or throw tarot cards until we did know, but I am not that master.

My view is that we give what we have, when we have it. And what I had was a vision of some looping and some other looping. I did manage, once the code was sketched in, to pull out a couple of functions. Had I seen those as part of my intention at the beginning, I could have written tests for them and then written them with more confidence.

And in fact, looking back, it simply must be the case that right before I wrote that match-creating loop, I had in mind creating a match record. I could have right then and there typed the call to createMatchRecord, and dropped into writing a test for it. I didn’t. My mind didn’t bubble the idea up at the right moment, so I gave what I had.

It worked out OK. I suspect we have the right values in there right now. Shall we find out?

pay=3000
state=MI

pay=300
state=OH

That’s the result of adding this code:

            local report = ""
            for record,_ignored in result:elements() do
                for value,field in record:elements() do
                    report = report..field.."="..value.."\n"
                end
                report = report.."\n"
            end
            print(report)

So I’d say that’s a success, because those are in fact the right answers.

But the thing is this. There were bugs in my code, but fortunately they were all easy ones. The misspelling of accumulator, typing control when I meant controlTable, calling isNULL when the method is isNull. That’s just about all there was. That’s a pretty low error density for me, and my existing test was certainly enough to give a failure, if not to point directly to the error.

I was on pretty thin ice here, but I got away with it. You’ve seen me do similar things, even simpler things than this, and then fall into an hour or two of debugging. So I’d have done better with better tests.

That reminds me of something I wanted to write about. Excuse me while I write another article.

We’ll call this one done. We need more testing, and we need to deal with the non-grouping case, but we’re clean enough to commit: group summing first test runs.

See you next time, and in today’s other article, No Blame, No Forgiveness.


XSet2.zip

CodeaUnit2.zip