No, I’m not hearing voices. But the code does tell us things, just like any working material. We need to learn to listen. Today, we listen and the result is good.

I’m pretty sure I first heard the phrase “listening to the code” from Kent Beck. I recall also the notion of “letting the code participate in our design sessions”. These ideas are not some new form of animism or digital mysticism. They express the common experience that a maker has in working with their materials, that the materials have a say in what happens.

The carpenter knows that you saw and polish different woods differently, and even depending on whether you’re working with or across or against the grain. The knife maker knows that different metals work differently. The chemist knows that pressure and temperature change how chemicals react. The artist knows that different paints mix differently, and different materials take paint differently. The child knows that some snows make good snowballs and snowmen, and some do not.

As we work with any material, and as we improve in that work, we begin to feel the material responding to the work. It’s relatively easy to get that feeling with physical materials: they do have differing hardness and friction, which influences how our tools, and our hands, respond.

The metaphor remains strong in writing and programming. The work flows well when we’re approaching it well, and gets all sticky and messy when we aren’t going about things well enough. We’re here for the code, of course, so let’s look at our statistics making code again today, and see if we can see what it’s trying to tell us.

## Stats Operation

Here, again, is our `stats` operation:

``````function XSet:stats(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
local groupFields = controlTable.group or {}
for record,scope in self:elements() do
local accumulator = result:findOrCreateAccumulator(record, groupFields)
sum(record, accumulator, controlTable)
end
return result
end

function XSet:findOrCreateAccumulator(record, groupFields)
local accumulator
local matchRecord = self:createMatchRecord(record, groupFields)
local accumulatorSet = self:restrictElement(matchRecord)
if accumulatorSet:isNull() then
accumulator = matchRecord
else
accumulator = accumulatorSet:at(NULL)
end
return accumulator
end

function XSet:createMatchRecord(record,groupFields)
local matchRecord = XSet()
for i,field in ipairs(groupFields) do
local value = record:at(field) or "MISSING"
end
return matchRecord
end
``````

It makes some sense that `stats` could be a set operation, although it is pretty custom-sized to a specific kind of set. But `createMatchRecord`? `findOrCreateAccumulator`? These do not seem like set operations at all.

Set operations, mostly, have fallen into two groups. There are what we might call primitives, which are low-level operations, like `hasAt`, which answers whether a particular element occurs at a particular scope, or `elements`, which enumerates the elements and scopes of a set. And then there are general operations, `union`, `restrict`, and so on, which carry out quite general operations on sets.

I guess there’s a third group, I suppose, like `isNull`, or `isSubset`. Should I count the various iterators, `reduce`, `select`, `exists` and `every` as another group?

Be that as it may, all those methods apply to all sets. More or less. Go with me here. There’s always a grey area. Or a gray area. Whichever.

These stats methods just aren’t like that at all. They are telling us something. What are they telling us? They’re telling us “I don’t belong here: I belong somewhere else”.

Let’s do a tiny bit of design thinking.

## What’s Going on Here?

What’s going on here is that we have a set with pretty much the second simplest structure we use. It’s a set of “records”, where we mean by that “all the elements of this set have pretty much the same scopes throughout”, and we probably also mean “and most of the elements are simple values”.

And we want an output set that has a record for each combination of some grouping fields in those records, and that record should contain the sum (or other statistic) of certain fields that we’ll name.

That’s pretty darn specific.

I think the code is telling us that we want some new class here, to contain all this code. We can think of it, perhaps, as like a factory, given some input it produces a set containing the requested statistics. Maybe it’s a kind of “Factory” pattern, but I’m not going to push that notion. I am, however, going to try putting this code into a set of its own.

I think I’d like to leave the method `stats` on XSet, though we’ll move all the work elsewhere. I think we have sufficient tests to support this refactoring.

``````function Xset:stats(controlTable)
return XStatistics(self,controlTable)
end
``````

I’ll just start by creating the new class and moving the methods over. I think we’ll want a new tab for this.

``````function XStatistics:init(input, control)
self.data = input
self.control = control
return self:getStatistics()
end
``````

Now I can almost just move everything, with a few adjustments.

``````
function XStatistics:getStatistics(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
local groupFields = controlTable.group or {}
for record,scope in self.data:elements() do
local accumulator = result:findOrCreateAccumulator(record, groupFields)
sum(record, accumulator, controlTable)
end
return result
end
``````

Oh, I’d better fix the references to controlTable …

``````function XStatistics:getStatistics(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
local groupFields = self.control.group or {}
for record,scope in self.data:elements() do
local accumulator = result:findOrCreateAccumulator(record, groupFields)
sum(record, accumulator, self.control)
end
return result
end
``````

This is rather a large method all of a sudden, since it’s not trivial to scan it for places where I need to rename or add `self`. Moving along …

``````function XStatistics:findOrCreateAccumulator(record, groupFields)
local accumulator
local matchRecord = self:createMatchRecord(record, groupFields)
local accumulatorSet = self.data:restrictElement(matchRecord)
if accumulatorSet:isNull() then
accumulator = matchRecord
else
accumulator = accumulatorSet:at(NULL)
end
return accumulator
end

function XStatistics:createMatchRecord(record,groupFields)
local matchRecord = XSet()
for i,field in ipairs(groupFields) do
local value = record:at(field) or "MISSING"
end
return matchRecord
end
``````

I found one more place where I needed to change `self` to `self.data`. Let’s run the tests and see what breaks, let them lead the way if they need to.

First thing they do is find the typo in the initial call, where I said `Xset`, not `XSet`. Then:

``````5: Grouping -- XStatistics:21: attempt to call a nil value (method 'findOrCreateAccumulator')
``````

I get three occurrences of that. It’s here:

``````function XStatistics:getStatistics(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
local groupFields = self.control.group or {}
for record,scope in self.data:elements() do
local accumulator = result:findOrCreateAccumulator(record, groupFields)
sum(record, accumulator, self.control)
end
return result
end
``````

Ah. Yes. Our weird method `findOrCreateAccumulator` is not called on the data but on a specific record. We can either pass in result as a parameter, or make that into an actual set operation. We’re in the mode of not creating weird single purpose set ops, so let’s draw it inside `XStatistics` for now.

A bit of reading carefully and we have this:

``````function XSet:stats(controlTable)
local stat = XStatistics(self,controlTable)
return stat:getStatistics()
end
``````

That’s because we can’t return anything but an XStatistics from the `init` so we have go ask it to give us the stats.

And the implementation, now working:

``````-- XStatistics
-- 20211227

XStatistics = class()

function XStatistics:init(input, control)
self.data = input
self.control = control
end

function XStatistics:getStatistics(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
local groupFields = self.control.group or {}
for record,scope in self.data:elements() do
local accumulator = self:findOrCreateAccumulator(result,record, groupFields)
sum(record, accumulator, self.control)
end
return result
end

function XStatistics:findOrCreateAccumulator(result, record, groupFields)
local accumulator
local matchRecord = self:createMatchRecord(record, groupFields)
local accumulatorSet = result:restrictElement(matchRecord)
if accumulatorSet:isNull() then
accumulator = matchRecord
else
accumulator = accumulatorSet:at(NULL)
end
return accumulator
end

function XStatistics:createMatchRecord(record,groupFields)
local matchRecord = XSet()
for i,field in ipairs(groupFields) do
local value = record:at(field) or "MISSING"
end
return matchRecord
end
``````

First let’s commit: Moved stats method to new XStatistics class.

Now that it works, let’s make it more nearly right.

## Making It Right

What’s not to like about this code? Well. First of all, the methods hardly ever refer to the input data at all. They are processing other sets. Let’s start with the worst offender:

``````function XStatistics:createMatchRecord(record,groupFields)
local matchRecord = XSet()
for i,field in ipairs(groupFields) do
local value = record:at(field) or "MISSING"
end
return matchRecord
end
``````

This thing, given a list of scopes and a record, produces the set containing the specific values under those scopes.

I think we have a name for that. I believe it is `scopeRestrict`, though it expects a set as its second parameter, not a table:

``````function XSet:scopeRestrict(B)
return self:select(function(_ignored,aScope)
return B:exists(function(bElement,_ignored)
return bElement==aScope
end)
end)
end
``````

Let’s see if we can use that method, and if we can, whether we can get code that we actually like:

``````function XStatistics:createMatchRecord(record,groupFields)
local scopeSet = XSet()
for i,field in ipairs(groupFields) do
end
return record:scopeRestrict(scopeSet)
end
``````

Here I create a set of {scope@scope} for each of the scopes in `groupFields` and use the `scopeRestrict` to get the match record. Commit: use scopeRestrict to create matchRecord.

Now let’s compute that scopeSet once and for all. First extract the method:

``````function XStatistics:createMatchRecord(record,groupFields)
self.scopeSet = self:getScopeSet(groupFields)
return record:scopeRestrict(self.scopeSet)
end

function XStatistics:getScopeSet(groupFields)
local scopeSet = XSet()
for i,field in ipairs(groupFields) do
end
return scopeSet
end
``````

Commit: extract method for scopeSet.

Now move the call to the init.

``````function XStatistics:init(input, control)
self.data = input
self.control = control
self.scopeSet = self:getScopeSet(groupFields)
end
``````

Now groupFields isn’t defined, so:

``````function XStatistics:init(input, control)
self.data = input
self.control = control
self.scopeSet = self:getScopeSet(self.control.group or {})
end
``````

Tests green. Commit: move creation of scopeSet to init.

Now these methods don’t use groupFields:

``````function XStatistics:getStatistics(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
local groupFields = self.control.group or {}
for record,scope in self.data:elements() do
local accumulator = self:findOrCreateAccumulator(result,record, groupFields)
sum(record, accumulator, self.control)
end
return result
end

function XStatistics:findOrCreateAccumulator(result, record, groupFields)
local accumulator
local matchRecord = self:createMatchRecord(record, groupFields)
local accumulatorSet = result:restrictElement(matchRecord)
if accumulatorSet:isNull() then
accumulator = matchRecord
else
accumulator = accumulatorSet:at(NULL)
end
return accumulator
end

function XStatistics:createMatchRecord(record,groupFields)
return record:scopeRestrict(self.scopeSet)
end
``````

Remove the parameter throughout.

``````function XStatistics:getStatistics(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
for record,scope in self.data:elements() do
local accumulator = self:findOrCreateAccumulator(result,record)
sum(record, accumulator, self.control)
end
return result
end

function XStatistics:findOrCreateAccumulator(result, record)
local accumulator
local matchRecord = self:createMatchRecord(record)
local accumulatorSet = result:restrictElement(matchRecord)
if accumulatorSet:isNull() then
accumulator = matchRecord
else
accumulator = accumulatorSet:at(NULL)
end
return accumulator
end
``````

Green, commit: Remove unused groupFields parameter.

Now we can remove the parameter from the match record creation as well.

``````function XStatistics:init(input, control)
self.data = input
self.control = control
self.scopeSet = self:getScopeSet()
end

function XStatistics:getScopeSet()
local groupFields = self.control.group or {}
local scopeSet = XSet()
for i,field in ipairs(groupFields) do
end
return scopeSet
end
``````

Green. Commit: push group field fetch down to method that needs it.

So that’s nice.

## What Else?

Is that all? Not remotely.

``````function XStatistics:getStatistics(controlTable)
local sum = function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
local result = XSet()
for record,scope in self.data:elements() do
local accumulator = self:findOrCreateAccumulator(result,record)
sum(record, accumulator, self.control)
end
return result
end
``````

I don’t like including the details of the sum function here in the method that uses it. We can give ourselves a method for that now.

``````function XStatistics:getStatistics(controlTable)
local result = XSet()
for record,scope in self.data:elements() do
local accumulator = self:findOrCreateAccumulator(result,record)
self:sumFunction()(record, accumulator, self.control)
end
return result
end

function XStatistics:sumFunction()
return function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
end
``````

Is that too obscure, with the `sumFunction()(record...)`? I think I’ll allow it, but it is a bit uncommon. Let’s see what else we can find and then check to see whether that sticks out as a problem.

Green. Commit: sumFunction returned from a method in XStatistics.

This …

``````function XStatistics:getStatistics(controlTable)
local result = XSet()
for record,scope in self.data:elements() do
local accumulator = self:findOrCreateAccumulator(result,record)
self:sumFunction()(record, accumulator, self.control)
end
return result
end
``````

Isn’t that last bit just a `reduce` in wolf’s clothing? Let’s see if we can make it so.

Well, my first attempt crashed. Fortunately, we just committed. So I’ll revert. Try again:

``````function XStatistics:getStatistics(controlTable)
return self.data:reduce(XSet(), function(result,record,scope)
local accumulator = self:findOrCreateAccumulator(result,record)
self:sumFunction()(record, accumulator, self.control)
return result
end)
end
``````

Green: use `reduce` in `getStatistics`.

We could debate how much better or worse that is. It’s not the cleanest reduce ever. For now, I’ll allow it: I’m trying to make the operations like `reduce` part of our official coding standard, so we’ll build up our comfort with them as time goes on.

My thinking is that `reduce` is solid, and ad-hoc loops are less so, so that we should push things into `reduce` and its associates when we can.

Let’s glance at this:

``````function XStatistics:sumFunction()
return function(input,accumulator, control)
local fields = control.sum
for i,field in ipairs(fields) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
end
``````

I think we can inline a bit here to our benefit. After reflection, just one tiny change, removing that local.

``````function XStatistics:sumFunction()
return function(input, accumulator, control)
for _i,field in ipairs(control.sum) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
end
``````

I’d like to reduce that three line loop content, but I don’t see quite how to do it. No sense walking out on thin ice. Green, commit: Inline temp in `sumFunction`.

## And Now …

``````function XStatistics:findOrCreateAccumulator(result, record)
local accumulator
local matchRecord = self:createMatchRecord(record)
local accumulatorSet = result:restrictElement(matchRecord)
if accumulatorSet:isNull() then
accumulator = matchRecord
else
accumulator = accumulatorSet:at(NULL)
end
return accumulator
end
``````

This one’s a real [DELETED], isn’t it? I just hate methods with “Or” or “And” in their names. You know they’re not doing just one thing.

What could we call it that wouldn’t have a boolean operator in the name? `produceAccumulator`? Maybe.

What could we write that wouldn’t be so messy? Something like this, maybe?

``````function XStatistics:getStatistics(controlTable)
return self.data:reduce(XSet(), function(result,record,scope)
local accumulator = self:produceAccumulator(result,record)
self:sumFunction()(record, accumulator, self.control)
return result
end)
end

function XStatistics:produceAccumulator(result, record)
return XStatistics:produceAccumulatorMatching(result, self:createMatchRecord(record))
end

function XStatistics:produceAccumulatorMatching(result, matchRecord)
local accumulatorSet = result:restrictElement(matchRecord)
if accumulatorSet:isNull() then
return matchRecord
end
return accumulatorSet:at(NULL)
end
``````

I extracted the `createMatchRecord` to make that one-liner, removing one line from the `produceAccumulatorMatching` method. Green. Commit: refactoring `getStatistics` to use `produceAccumulator`.

Now this:

``````function XStatistics:produceAccumulatorMatching(result, matchRecord)
local accumulatorSet = result:restrictElement(matchRecord)
if accumulatorSet:isNull() then
return matchRecord
end
return accumulatorSet:at(NULL)
end
``````

We can do this:

``````function XStatistics:produceAccumulatorMatching(result, matchRecord)
local accumulator = result:restrictElement(matchRecord):at(NULL)
if not accumulator then
return matchRecord
end
return accumulator
end
``````

``````function XStatistics:produceAccumulatorMatching(result, matchRecord)
return result:restrictElement(matchRecord):at(NULL) or self:createOne(result,matchRecord)
end

function XStatistics:createOne(result,matchRecord)
return matchRecord
end
``````

Now that’s what I’m talkin’ about. Here’s where we are:

``````-- XStatistics
-- 20211227

XStatistics = class()

function XStatistics:init(input, control)
self.data = input
self.control = control
self.scopeSet = self:getScopeSet()
end

function XStatistics:createMatchRecord(record)
return record:scopeRestrict(self.scopeSet)
end

function XStatistics:createResultRecord(result,matchRecord)
return matchRecord
end

function XStatistics:getScopeSet()
local groupFields = self.control.group or {}
local scopeSet = XSet()
for i,field in ipairs(groupFields) do
end
return scopeSet
end

function XStatistics:getStatistics(controlTable)
return self.data:reduce(XSet(), function(result,record,scope)
local accumulator = self:produceAccumulator(result,record)
self:sumFunction()(record, accumulator, self.control)
return result
end)
end

function XStatistics:produceAccumulator(result, record)
return XStatistics:produceAccumulatorMatching(result, self:createMatchRecord(record))
end

function XStatistics:produceAccumulatorMatching(result, matchRecord)
return result:restrictElement(matchRecord):at(NULL) or self:createResultRecord(result,matchRecord)
end

function XStatistics:sumFunction()
return function(input, accumulator, control)
for _i,field in ipairs(control.sum) do
local accum = accumulator:at(field) or 0
accum = accum + input:at(field) or 0
accumulator:putAt(tostring(accum),field)
end
end
end
``````

Now then. Most of that is pretty short and simple.

We started off this morning with 40 lines. We now have 54, of which around ten are whitespace or commentary. And what we have is overall simpler, with fewer if statements and loops, though of course the conditions are still met and the right sets are still enumerated. We’ve just hidden that complexity away, pushed down to existing methods.

Most of what we have is pretty coherent with the notion of creating statistics. I have some interest in this function, however:

``````function XStatistics:getScopeSet()
local groupFields = self.control.group or {}
local scopeSet = XSet()
for i,field in ipairs(groupFields) do
end
return scopeSet
end
``````

That seems perilously close to doing something generically useful. We do have a function now, `scopeSet`, that returns the scopes from a given set, under NULL. Let’s change the definition to put them under themselves, and see if the tests mind.

``````function XSet:scopeSet()
return self:reduce(XSet(), function(r,e,s)
end)
end
``````

This test fails:

``````        _:test("restrictToScope", function()
local S = XSet:on(CSVSet(CSVnames,CSVdata))
local miFolksScopesHardWay = S:restrict(restrictor):scopeSet()
local miFolksScopes = S:restrictToScope(restrictor)
_:expect(miFolksScopes:card(),"scopes").is(14)
_:expect(miFolksScopesHardWay:card(),"hard").is(14)
_:expect(miFolksScopesHardWay:equals(miFolksScopes),"sets equal").is(true)
end)
``````

Another one needed the results changed to match. I think we need to look at the `restrictToScope` and see if it’s OK for it to return scope@scope.

``````function XSet:restrictToScope(B)
local scopes = XSet()
for a,s in self:elements() do
if B:exists(function(b,_ignored) return b:isSubset(a) end) then
end
end
return scopes
end
``````

Let’s do it.

``````function XSet:restrictToScope(B)
local scopes = XSet()
for a,s in self:elements() do
if B:exists(function(b,_ignored) return b:isSubset(a) end) then
end
end
return scopes
end
``````

Test. Green. Commit: scope sets are now scope at scope.

And now:

``````function XStatistics:getScopeSet()
local groupFields = self.control.group or {}
local scopeSet = XSet()
for i,field in ipairs(groupFields) do
end
return scopeSet
end
``````

Well, darn. I was thinking I could now call `scopeSet` but this is not to be. The input to `scopeSet` needs to be a set of records with some scopes. To create that is just as much code as we have here now. So, boo.

The code is telling me that it doesn’t want to do that.

But it’s also telling me that XSet is much simpler without all the statistics stuff woven in, and XStatistics is cleaner than a pure extraction of the stats stuff into a new class.

## Summary

We listened to the code telling us that it didn’t belong in XSets, then telling us that bits of it were too complex. We moved things around, at least once finding we were working against the grain, and smoothed out much of the code. We deferred some operations back to XSet, as one would hope would happen. Is there more we could do? Almost always, and here, yes. Is it better? By my lights, it’s much better, if only because we’ve split the XSet_stats blob into two separate things.

And the new thing’s code is simpler as well.

Is it perfect? No, it’s not perfect. But it’s better. That’s a good thing: if it were perfect, there’d be nothing to do tomorrow.

But there are plenty of things to do tomorrow. See you then!

XSet2.zip