I was thinking before I got up about median and mode. Then I had a truly marvelous idea.

What if a set, a record, could contain a function as an element?

I came upon this idea around the Horn. I was thinking about median. Since the median is the “middle” value in a sorted list of all the values of a variable, to compute a grouped median, we’d have to accumulate all the values for each group. That seemed easy enough, but then we’d have to do some final operation, after all the other summing and such was done, to find the median. Mode would be similar except that it’s a more complicated and expensive process.

Still, it didn’t seem too hard: we could “just” add a pass over the grouped set before we return it, plugging in the median and mode. (Median, I could imagine doing incrementally. Mode, not so much. Possible but way expensive.)
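Here is a minimal sketch of that extra pass, using plain Lua tables rather than XSets. The names groupValues and median, and the dept and pay fields, are made up for illustration and are not part of the project.

```lua
-- Collect every value of `field` for each distinct value of `groupField`.
function groupValues(records, groupField, field)
    local groups = {}
    for _, rec in ipairs(records) do
        local key = rec[groupField]
        groups[key] = groups[key] or {}
        table.insert(groups[key], rec[field])
    end
    return groups
end

-- Median of a list of numbers: sort a copy, then take the middle value,
-- or the mean of the two middle values when the count is even.
function median(values)
    local sorted = {}
    for i, v in ipairs(values) do sorted[i] = v end
    table.sort(sorted)
    local n = #sorted
    if n == 0 then return nil end
    local mid = math.floor(n / 2)
    if n % 2 == 1 then
        return sorted[mid + 1]
    end
    return (sorted[mid] + sorted[mid + 1]) / 2
end

local people = {
    {dept = "dev", pay = 10},
    {dept = "dev", pay = 30},
    {dept = "dev", pay = 20},
    {dept = "ops", pay = 5},
    {dept = "ops", pay = 15},
}

-- The final pass: one median per accumulated group.
for dept, pays in pairs(groupValues(people, "dept", "pay")) do
    print(dept, median(pays))
end
```

The cost shows up in groupValues: every value is held until the end, which is exactly the memory concern with statistics that can't be computed incrementally.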

So that led inevitably to the understanding that after the grouping is done, we could apply a function to the grouped set, and in fact if we had the grouped details, we could even do the summing and averaging there at the end.

The upside is that it would be more powerful than doing the summing inline. The big disadvantage is that it generates a larger output set, so if we only wanted statistics that can be computed incrementally, it would be costly in memory and quite possibly time.

Somehow that led me to imagine a function as a first-class element of a set, that would be called whenever the set was enumerated. Presumably the function would get the whole set as an argument. Then it could do things like return total compensation:

return (rec:at("pay") or 0) + (rec:at("bonus") or 0),"total_comp"

(I think the function needs to return an element and a scope, which is why I put the “total_comp” there.)
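To make the two return values concrete, here is a small standalone sketch. makeRecord is a hypothetical stand-in for a record that answers at(name); it is not the real XSet, and totalComp is just the one-liner above given a name.

```lua
-- Hypothetical stand-in for a record: a table with an `at` lookup method.
function makeRecord(fields)
    return { at = function(self, name) return fields[name] end }
end

-- Returns two values: the computed element and the scope to file it under.
function totalComp(rec)
    return (rec:at("pay") or 0) + (rec:at("bonus") or 0), "total_comp"
end

local value, scope = totalComp(makeRecord{pay = 1000, bonus = 50})
print(scope, value)
```

The `or 0` guards mean a record with no bonus still produces a total rather than an arithmetic error on nil.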

But that’s not terribly interesting because in most cases, the result of the function is a constant in each record. Set elements don’t generally change value, so most any computation on the elements of a set is constant. (Exceptions exist, often time-based: “days_until_birthday”.)

In addition, it would be a pain to inject the function into all the records. Unless …

It’s the same function for all the records. The only thing that changes from record to record is the other values in the record.

Therefore, we can have a new kind of XData subclass that applies a provided function to each element of the set, and returns it just as if it were another element.

This could change everything! Even if it doesn’t, it seems rather neat. Let’s do it.

XFunction

We’ll need a test.

        _:test("Function set", function()
            local S = XSet:fromTable{
                {a=5,b=50},
                {a=40,b=4}
            }
            local F = XSet:withFunction(S,function(r)
                return r:at("a") + r:at("b"), "total"
            end)
            for rec,s in F:elements() do
                for fld,name in rec:elements() do
                    print(name,fld)
                end
            end
        end)

I don’t love this because it doesn’t check anything, it just prints, but it’ll get me started. It fails for lack of withFunction.

function XSet:withFunction(S,F)
    return XSet:on(XFunction(S,F))
end

This just imagines that there’s an XFunction and defers the problem to it. Will fail looking for that. Now I have to do some work.

Well, there’s good news and bad news. I did this:

XFunction = class(XData)

function XFunction:init(Set, Function)
    self.set = Set
    self.fcn = Function
end

function XFunction:elements()
    -- return iterator
    return coroutine.wrap(function()
        for record,_s in self.set:elements() do
            for value,name in record:elements() do
                coroutine.yield(value,name)
            end
            coroutine.yield(self.fcn(record), "fcn")
        end
    end)
end

And my test failed trying to iterate its inner loop. So, to see what was going on, I changed the test:

        _:test("Function set", function()
            local S = XSet:fromTable{
                {a=5,b=50},
                {a=40,b=4}
            }
            local F = XSet:withFunction(S,function(r)
                return r:at("a") + r:at("b"), "total"
            end)
            local output = ""
            for rec,s in F:elements() do
                output = output..s.."="..rec.."\n"
            end
            print(output)
        end)

And the result was fascinating.

b=4
a=40
fcn=44
b=50
a=5
fcn=55

So. That’s almost what I expected, but not quite. The set is acting as if it was only one level. Clearly I don’t understand quite what I did … nor quite what I want.

Let’s think this out.

Thinking

We are trying to create a new kind of set that knows a function and applies it. Now what I have in mind is that the set is a set of records, not a set of values. The output of elements() should be a set of records, and each record should look as if it contains all its existing members, plus a member consisting of the value of the function, applied to the record, named “fcn” for now.

What I’ve built seems to be a double loop, appending together all the inner elements as one big thing. Which is interesting but not what we wanted.
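The flattening is easy to reproduce outside the project. This sketch assumes a plain list of plain tables instead of an XSet of records, and the name flattened is made up. It has the same shape as the nested loops above: one coroutine, two levels of iteration, a single stream of pairs.

```lua
-- Yields (value, name) for every field of every record, then the function
-- result after each record: two levels collapsed into one stream.
function flattened(records, fcn)
    return coroutine.wrap(function()
        for _, record in ipairs(records) do
            for name, value in pairs(record) do
                coroutine.yield(value, name)
            end
            coroutine.yield(fcn(record), "fcn")
        end
    end)
end

local count = 0
local records = {{a = 5, b = 50}, {a = 40, b = 4}}
for value, name in flattened(records, function(r) return r.a + r.b end) do
    count = count + 1
end
print(count)  -- 6: two records, each contributing two fields plus "fcn"
```

The caller never sees a record boundary, which is why the result looks like a one-level set.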

Meh. This says to me that we have to push the magic down one level.

I’m going to change my test, to first build the thing that applies the function directly.

        _:test("Function set", function()
            local S = XSet:fromTable{a=5,b=50}
            local F = XSet:withFunction(S,function(r)
                return r:at("a") + r:at("b"), "total"
            end)
            local output = ""
            for value,name in F:elements() do
                output = output..name.."="..value.."\n"
            end
            print(output)
        end)

This will fail in some interesting way.

7: Function set -- XData:81: XData:154: attempt to call a nil value (method 'elements')

Yes, makes sense. Let’s redo elements from this:

function XFunction:elements()
    -- return iterator
    return coroutine.wrap(function()
        for record,_s in self.set:elements() do
            for value,name in record:elements() do
                coroutine.yield(value,name)
            end
            coroutine.yield(self.fcn(record), "fcn")
        end
    end)
end

There’s the inner loop, which is the fail. We want to do all the elements and yield those, and then yield the function.

function XFunction:elements()
    -- return iterator
    return coroutine.wrap(function()
        for value,field in self.set:elements() do
            coroutine.yield(value,field)
        end
        coroutine.yield(self.fcn(self.set),"fcn")
    end)
end

That gives me this output:

a=5
b=50
fcn=55

Perfect. Now what about hasAt?

        _:test("Function set", function()
            local S = XSet:fromTable{a=5,b=50}
            local F = XSet:withFunction(S,function(r)
                return r:at("a") + r:at("b"), "total"
            end)
            local output = ""
            for value,name in F:elements() do
                output = output..name.."="..value.."\n"
            end
            print(output)
            _:expect(F:hasAt("5","a")).is(true)
        end)

That was the easy part:

function XFunction:hasAt(element,scope)
    return self.set:hasAt(element,scope)
end

I’ll add the other field, and fcn. fcn will fail.

            _:expect(F:hasAt("5","a")).is(true)
            _:expect(F:hasAt("50","b")).is(true)
            _:expect(F:hasAt("55","fcn"),"fcn ok").is(true)
            _:expect(F:hasAt("22","fcn"),"fcn wrong").is(false)

7: Function set fcn ok -- Actual: false, Expected: true

As expected. We can do this:

function XFunction:hasAt(element,scope)
    if scope == "fcn" then
        return tostring(self.fcn(self.set)) == element
    end
    return self.set:hasAt(element,scope)
end

I had to plug in that tostring because I’m expecting all my numeric outputs to be converted to strings. This would not work if the function were to return a set, and there’s no reason why it couldn’t.
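A quick standalone illustration of the concern, using a plain table as a stand-in for a returned set. matchesElement is a hypothetical helper that mirrors the comparison in hasAt.

```lua
-- Hypothetical helper mirroring the comparison in hasAt.
function matchesElement(value, element)
    return tostring(value) == element
end

print(matchesElement(5 + 50, "55"))    -- a whole number converts cleanly
print(tostring({a = 5, b = 50}))       -- something like "table: 0x..."
print(matchesElement({a = 5}, "55"))   -- so a table never matches
```

Numbers have a quieter version of the same problem: in Lua 5.3 and later a float result such as 110 / 2 converts to "55.0", not "55".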

This idea is more complicated than I thought. This is so often the case that I should get a tattoo.

The tests now run. Commit: initial XFunction applies function named fcn to the set, so that it appears that fcn’s value is a field of the set at scope “fcn”.

Arrgh. WorkingCopy has lost its link to the repo and I can’t get this project, or a duplicate of it, back into WorkingCopy. I’ve lost my git repo. This is mostly not a big deal because I really only use it to revert one step back, but that’s of great value all on its own. I have a request into Anders to see if he can sort this out.

OK, between that and a late start, this may have to do for the day. But it is a good thing. I’ll take a break and come back.

After Break

OK, well. I am now without source control on this project until further notice. I’ll just have to be very careful to run the tests often and revert manually if I need to. Definitely took the wind out of my sails.

Anyway, where are we? We have a new kind of XData subclass that holds a function and applies it, such that the function appears to be just another named element of the set. Let’s enhance that a bit: right now the name is always fcn. Let’s allow for a different name. Later, probably not today, we might allow for more than one function. It seems like it’ll be needed.

The test already seems to think that the function can return its name:

        _:test("Function set", function()
            local S = XSet:fromTable{a=5,b=50}
            local F = XSet:withFunction(S,function(r)
                return r:at("a") + r:at("b"), "total"
            end)
            local output = ""
            for value,name in F:elements() do
                output = output..name.."="..value.."\n"
            end
            print(output)
            _:expect(F:hasAt("5","a")).is(true)
            _:expect(F:hasAt("50","b")).is(true)
            _:expect(F:hasAt("55","total"),"fcn ok").is(true)
            _:expect(F:hasAt("22","total"),"fcn wrong").is(false)
        end)

Let’s allow that and then test one that doesn’t return the name. First I’ll fix the test comments:

...
            _:expect(F:hasAt("5","a")).is(true)
            _:expect(F:hasAt("50","b")).is(true)
            _:expect(F:hasAt("55","total"),"total ok").is(true)
            _:expect(F:hasAt("22","total"),"total wrong").is(false)
        end)

Should fail:

7: Function set total ok -- Actual: false, Expected: true

Fix … Ah. This is more tricky than I thought. The hasAt function does not know what the function will return. I guess I have to try the function unconditionally.

function XFunction:hasAt(element,scope)
    local value,name = self.fcn(self.set)
    return tostring(value)==element and name==scope or self.set:hasAt(element,scope)
end

function XFunction:elements()
    -- return iterator
    return coroutine.wrap(function()
        for value,field in self.set:elements() do
            coroutine.yield(value,field)
        end
        local result,name = self.fcn(self.set)
        coroutine.yield(result,name)
    end)
end

The change to hasAt is unfortunate, as we have to call the function on every oh wait no we don’t …

function XFunction:hasAt(element,scope)
    if self.set:hasAt(element,scope) then return true end
    local value,name = self.fcn(self.set)
    return tostring(value)==element and name==scope
end

Now we check only if we don’t find the regular element,scope pair. So this is good.
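To see the saving, here is a sketch that counts calls. It assumes a plain table of scope-to-element strings in place of the wrapped set, and makeHasAt is a made-up name.

```lua
-- Hypothetical stand-in: `fields` maps scope -> element string.
function makeHasAt(fields, fcn)
    return function(element, scope)
        -- Cheap lookup first; the function runs only if this misses.
        if fields[scope] == element then return true end
        local value, name = fcn(fields)
        return tostring(value) == element and name == scope
    end
end

local calls = 0
local hasAt = makeHasAt({a = "5", b = "50"}, function(r)
    calls = calls + 1
    return tonumber(r.a) + tonumber(r.b), "total"
end)

print(hasAt("5", "a"), calls)        -- found directly, function not called
print(hasAt("55", "total"), calls)   -- not stored, so the function runs once
```

A miss on both paths still costs one function call, so this only helps for elements that really are stored.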

Commit … oh. I can’t. Bummer.

I no longer have a test for an implicit name: the function must return a value and a name (element and scope). A common error will probably be to return nil for the scope, so let’s write a test that requires us to return an automatic scope:

        _:test("Function set auto name", function()
            local S = XSet:fromTable{a=5,b=50}
            local F = XSet:withFunction(S,function(r)
                return r:at("a") + r:at("b")
            end)
            local output = ""
            for value,name in F:elements() do
                output = output..name.."="..value.."\n"
            end
            print(output)
            _:expect(F:hasAt("5","a")).is(true)
            _:expect(F:hasAt("50","b")).is(true)
            _:expect(F:hasAt("55","fcn"),"fcn ok").is(true)
            _:expect(F:hasAt("22","fcn"),"fcn wrong").is(false)
        end)

This should fail looking for “fcn ok” … well, no, it fails deeper in:

8: Function set auto name -- XData:98: attempt to concatenate a nil value (local 'name')

But that’s what it is. We must do these two things:

function XFunction:hasAt(element,scope)
    if self.set:hasAt(element,scope) then return true end
    local value,name = self.fcn(self.set)
    return tostring(value)==element and (name or "fcn") ==scope
end

function XFunction:elements()
    -- return iterator
    return coroutine.wrap(function()
        for value,field in self.set:elements() do
            coroutine.yield(value,field)
        end
        local result,name = self.fcn(self.set)
        coroutine.yield(result,name or "fcn")
    end)
end

I have two places where I have to plug in “fcn”. That won’t do.

function XFunction:evaluate()
    local value,name = self.fcn(self.set)
    return tostring(value),name or "fcn"
end

function XFunction:hasAt(element,scope)
    if self.set:hasAt(element,scope) then return true end
    -- evaluate already stringifies the value and supplies the default name
    local value,name = self:evaluate()
    return value==element and name==scope
end

function XFunction:elements()
    -- return iterator
    return coroutine.wrap(function()
        for value,field in self.set:elements() do
            coroutine.yield(value,field)
        end
        local result,name = self:evaluate()
        coroutine.yield(result,name)
    end)
end

The evaluate method packages up the creation of the name if the function doesn’t provide it. Nice.
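The defaulting relies on how Lua handles multiple returns: a function that returns only one value leaves the second local nil, and `name or "fcn"` fills it in. A standalone sketch, with evaluate written as a free function over plain tables and the tostring step left out to keep it short:

```lua
-- Calls fcn on the record and supplies "fcn" when no name comes back.
function evaluate(fcn, record)
    local value, name = fcn(record)
    return value, name or "fcn"
end

local record = {a = 5, b = 50}
print(evaluate(function(r) return r.a + r.b, "total" end, record))
print(evaluate(function(r) return r.a + r.b end, record))
```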

I’m still troubled with the tostring stuff. That may require a bit of thought to unwind.

Tomorrow, or next time should tomorrow not be next time, we’ll create a new thing that holds the function and applies it one level inside. That’s what we’re really after.

Let’s sum up. What have we here?

Summary

On the face of it, we have a set that knows a function and applies the function when the set is iterated, making the function value look like an additional field.

Clearly we can produce another kind of set that knows a function and applies it one level down. We’ll do that by returning any set elements of the outer set as XFunctions, holding the function, i.e., wrapping on the fly.
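One way that wrapping might look, sketched with plain tables and coroutines rather than XSet and XFunction. The names withFunction and eachWithFunction here are illustrative only, and this is a guess at the shape of the idea, not the eventual implementation.

```lua
-- Iterator over one record's (value, name) pairs plus the computed pair.
function withFunction(record, fcn)
    return coroutine.wrap(function()
        for name, value in pairs(record) do
            coroutine.yield(value, name)
        end
        local value, name = fcn(record)
        coroutine.yield(value, name or "fcn")
    end)
end

-- Iterator over an outer list of records. Each record is handed back
-- already wrapped, so the caller sees the extra field one level down.
function eachWithFunction(records, fcn)
    return coroutine.wrap(function()
        for index, record in ipairs(records) do
            coroutine.yield(withFunction(record, fcn), index)
        end
    end)
end

local records = {{a = 5, b = 50}, {a = 40, b = 4}}
local sum = function(r) return r.a + r.b, "total" end
for fields, index in eachWithFunction(records, sum) do
    for value, name in fields do
        print(index, name, value)
    end
end
```

The outer iterator does no computing of its own. It only wraps each record as it goes by, so the function runs when, and only when, that record is enumerated.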

But I wonder if there may be some more exotic / interesting things that we could do with this idea. Could the function perhaps return multiple elements? We’ll have to think about that. Certainly seems like a possibility that there’s more power here than is immediately obvious.

In any case, it has gone well once I got over my confusion that caused the first implementation to unwrap the set two levels down. That might be a useful set operation but it wasn’t the one I had in mind.

Happy new year! See you next time!