Let’s see if we can find a decent way to provide filter, map, reduce functions in Codea Lua.

Modern languages provide “functional programming” operations like filter, map, and reduce. I’ve provided analogues of these applying to Extended Sets, in my XST series. Just for exercise, and because we’ll probably learn something, and because I think they’ll be generally useful in my “work”, but mostly because I think it’ll be fun, let’s see about providing capabilities like those in Codea Lua, operating on Lua’s table type.

Looking Forward

I am not one for “big” up front planning, but I do always try to think before doing something, and today is no exception. There are some issues that we’ll need to deal with. Coming to mind:

Lua Tables
Languages like Python have separate types for arrays, dictionaries, and the like. Lua does not: it just has tables. A table can act like an array, if its indexes are consecutive integers starting from 1. (Yes, one-based, not zero.) A table can act like a dictionary with arbitrary keys and values.

Lua has two ways of iterating a table, ipairs and pairs. The former assumes that the table has only consecutive integer indexes and iterates from 1 up to the first missing index. (If a table has indexes 1,2,3 and 5, it’s going to iterate over 1, 2, and 3. (It’s really even a bit worse: there is no guarantee as to what will happen with a table has missing integer keys and you use ipairs. Just don’t.)) The pairs iterator will produce all key-value pairs from the table, in arbitrary order, just like you generally get if you iterate a dictionary.

This means that we will probably want versions of some of our functions that consider arrays and some that think in terms of dictionaries. We’ll see.

Naming and Name Spaces
We’ll need to come up with a reasonable way to name our functions, to make them convenient and useful, and yet to avoid stepping on other names that the programmer might have used. In past experiments, I’ve named them as globals, like map, which seems risky.

Lua’s built in types are named in lower case, string and table and such. Functions are often grouped with a prefix. There are functions like math.atan, table.sort, and so on. We might want to follow their lead in some way. But we probably also want our names to be as short as possible, for convenience. I expect to pick a way and live with it, and I expect that any choice will be a compromise.

Metatables
Lua has a sort of “secret table”, called a metatable, that can be associated with any table. The metatable can contain “metamethods”, which can change the behavior of a table. For example, we could set up a table that returns a default value for any missing key in it. We can even arrange for the table to understand operators like +.

It’s possible, even probable, that we can set up a metatable on any table that causes it to understand operations like filter and so on. What is not possible is to arrange for all tables to have a standard automatically provided metatable. At least as far as I know.

I want to experiment with easy ways of attaching a “functional programming” metatable to any table. As we’ll see in the details, that might give us a nicer syntax for these operations.

Because of these issues and other issues that have not come to mind, I expect that we’ll experiment with a few different ideas before we settle on an “official” answer. Accordingly, we’ll put our new project under Working Copy for versioning, and we’ll work with TDD tests. We’ll arrange the Codea project so that it can be used as a dependency in other projects.

Let’s get started.

Starting Out

I create a Codea project named fp, on my CodeaUnit base. I plan to have only two tabs, Main and the fp tab, with the tests in Main, which will allow a dependency reference to get just the code without the tests, since dependencies don’t bring in their Main tab.

Then I create a Working Copy repo and connect it to my fp program. Then I commit the first version with nothing useful in it yet:

-- fp

function setup()
    CodeaUnit_Detailed = false
    --CodeaUnit_Lock = false
    if CodeaUnit then 
        _.execute()
    end
end

function draw()
    background(40)
    if CodeaUnit and CodeaUnit_Lock then 
        _.showTests()
    else
        background(128)
        stroke(0)
        fill(0)
        text("fp", WIDTH/2,HEIGHT/2)
    end
end

function touched()
    CodeaUnit_Lock = false
end

-- RJ 20220225

function testfp()
    
    _:describe("fp", function()
        
        _:before(function()
        end)
        
        _:after(function()
        end)
        
        _:test("HOOKUP", function()
            _:expect("Foo", "testing fooness").is("Bar")
        end)
        
    end)
end

Unfortunately, now we have to do some work. I’m going to try to go in very small steps here. I think it’ll help me explain what I’m thinking about, and besides, small steps are the way.

Of filter, map, and reduce, I think map is the easiest to code, so let’s work on that first. We want an operation map that accepts a table and a function, and produces a new table where the function has been applied to every member of the input table. We want the output to match the input. That is, for every key-value pair k,v, we want the output to contain k,f(v).

Begin with a test:

        _:test("map", function()
            local input = {a=5,z=6}
            local f = function(x) return 2*x end
            local result = map(input,f)
            _:expect(result.a).is(10)
            _:expect(result["z"]).is(12)
        end)

I used two different access methods in the expects, just to remind us that there are two. In the input table, the keys a and z are strings. result.a is exactly result["a"].

The test should fail looking for map.

1: map -- Main:42: attempt to call a nil value (global 'map')

I love when a plan comes together.

For now, I think I’ll just define map right in line:

        _:test("map", function()
            local function map(t,f)
                local result = {}
                for k,v in pairs(t) do
                    result[k] = f(v)
                end
                return result
            end
            local input = {a=5,z=6}
            local f = function(x) return 2*x end
            local result = map(input,f)
            _:expect(result.a).is(10)
            _:expect(result["z"]).is(12)
        end)

So that works as advertised. I think we’ll find that all these functions are pretty easy to write.

I am about to try something potentially weird. I want to try the metatable idea and see how it might work. To do that, I’m going to refactor this test and implementation in place. Best commit the code first: initial map works.

Metatable Version

My starting idea is to create a metatable that understands map and attach it to my input table, which will allow me to write the call differently …

Curiously, this works quickly, modulo one mistake that I’ll mention below:

        _:test("map", function()
            local mt = {}
            mt.map = function(t,f)
                local result = {}
                for k,v in pairs(t) do
                    result[k] = f(v)
                end
                return result
            end
            mt.__index = function(tab,key)
                return mt[key]
            end
            local input = {a=5,z=6}
            setmetatable(input,mt)
            local f = function(x) return 2*x end
            local result = input:map(f)
            _:expect(result.a).is(10)
            _:expect(result["z"]).is(12)
        end)

We define a new table, mt. We define two functions in it. The first is our map function, as we had it but now plugged into me instead of into a local. The second function in mt is __index. This is a metamethod which is called when there is an access to the base table that is not found. So when we do

            local result = input:map(f)

We do not find map in our input table, so Lua calls __index, passing the table with the problem, input in this case, and the missing key, “map”. We just return whatever mt has for “map”, which turns out to be just the function we want. The test runs green.

My mistake? Oh, well, I forgot to implement __index, so the first time I ran the test it couldn’t find “map”.

You may be surprised by the :map(f) there. Recall that Lua has a syntax trick. That statement is exactly the same as

            local result = input.map(input,f)

Or even:

            local result = input["map"](input,f)

These all mean exactly the same thing. I hope it’s clear why I prefer:

            local result = input:map(f)

If I were less of an OO person, I might prefer the dot notation.

Head’s Up, Ron

Popping my mind up a level, I realize that I may have finessed my naming problems, at least in part. To explain, let’s do our implementation another way. First I want to commit this: map in metatable.

Now consider this test, which also runs green:

        _:test("map alternate notation", function()
            local fp = {}
            function fp.map(t,f)
                local result = {}
                for k,v in pairs(t) do
                    result[k] = f(v)
                end
                return result
            end
            
            local f = function(x) return 2*x end
            local input = {a=5,z=6}
            local result = fp.map(input,f)
            _:expect(result.a).is(10)
            _:expect(result["z"]).is(12)
        end)

Here, we’ve put the function map into a table fp. The idea would be that fp would be a global repository for all the fp functions and when you wanted to call one, you would just say, as shown there, something like fp.map(input,f).

That’s pretty analogous to what Lua does now, with things like table.sort(myTable). My guess is that many Lua programmers would prefer that notation. If it can be carried through, I prefer the input:map(f) style.

But why not both?

There’s not much difference between the mt table and the fp table, just __index, which would be harmless in the fp table. So perhaps we can set things up so that you can refer to a table as it if were an object, by setting its metatable to the same table that you can use in the fp.map style, with little or no additional code. That would be nice.

Something odd has happened here …

On another day, I might have gone ahead and implemented some more functions, filter and so on. There are problems to be addressed there, as we’ll see. I might have set up a fairly complex structure to deal with those problems, about which I do have some ideas, and certainly I’d have more tests and functions to deal with. Then, only then, I might have tried the metatable route.

We can’t know now which way would have been “better”, but my intuition is that this way might be better, because we might have committed to a style that was hard to map into a metatable solution. Or we might not.

As it happens, I didn’t try to figure out which path was better. I just found myself thinking, after making map work, that I’d like to try the metatable. Now that we’re here, I’m not going back. Always move forward if you can, that’s my motto1.

But now that we’re here, a question that interests me is figuring out how we might deal with a function that wants to differ between an array-style table and a dictionary-style table. One such example is filter.

Naively, filter takes an input table and a boolean function, and produces and output table of the values where the boolean is true. But what we likely want may differ between array-like and dictionary-like tables.

For the dictionary, I think the desired result is pretty obvious. Let’s write a test:

        _:test("filter dictionary style", function()
            local filter = function(t,pred)
                local result = {}
                for k,v in pairs(t) do
                    if pred(v) then result[k]=v end
                end
                return result
            end
            
            local p = function(v) return v%2==0 end
            local input = {one=1, two=2, three=3, four=4}
            local result = filter(input,p)
            _:expect(result.one).is(nil)
            _:expect(result.two).is(2)
            _:expect(result.three).is(nil)
            _:expect(result.four).is(4)
        end)

Here we have a dictionary-style table and we wind up with a table with the matching keys. But think about what this version of filter will do to an array-style table. It will produce a dictionary-style result. What we would probably expect is a shorter array:

{1,2,3,4} > {2,4}

But our current function will return a table with two keys, 2 and 4, with values 2 and 4 respectively. Weird, to say the least.

Let’s write that test:

        _:test("filter array style", function()
            local even = function(v) return v%2==0 end
            local input = {1,2,3,4}
            local result = filter(input,even)
            _:expect(result[1]).is(2)
            _:expect(result[2]).is(4)
        end)

It’s easy enough to write that version of filter:

        _:test("filter array style", function()
            local filter = function(t,pred)
                local result = {}
                for _i,v in ipairs(t) do
                    if pred(v) then table.insert(result,v) end
                end
                return result
            end
            
            local even = function(v) return v%2==0 end
            local input = {1,2,3,4}
            local result = filter(input,even)
            _:expect(result[1]).is(2)
            _:expect(result[2]).is(4)
        end)

We just use ipairs to iterate in array style, and table.insert to append the values to result compactly.

The question now is to devise a decent way to accommodate these two conflicting needs, dealing differently with arrays and tables.

Arrays vs Dictionaries

There is no practical way to determine whether a given Lua table is array style or key-value, dictionary style. So we can’t just check and then call the right function. I only have two ideas, which isn’t enough, but it’s what I’ve got. For both of them the user has to decide.

Different Function Names
We could have filter_array and filter_dictionary functions defined and let the user decide that way
Different Name Space
We could have two separate name spaces, maybe array and dict and you say array.filter(ary,pred) or dict.filter(dict,pred), depending on which you mean.

Between just these two, I much prefer the second, as it is consistent with Lua’s math, table and string functions, as I’ve mentioned above. My cunning plan is to build two tables with the desired functions, and you can use whichever one applies. In addition, I’ll try to rig them up to serve as metatables, which seems easy enough. Slightly less easy will be removing the duplication, but maybe we won’t even care about that.

I think it’s time to start working on the “real” version. First I’ll commit what we have: two versions of filter.

Now I’ll create a new tab, named fp, I guess, to contain the real stuff. And I’ll redo my few tests to refer to whatever is over there. My tentative plan is to create two name spaces ar and kv. We’ll want to discuss whether those are good names or not. They have good aspects and bad. For now, that’s where we’ll go.

New Tab

I’m going to keep our existing tests and write new ones, because the existing ones show me how to do the work. I’ll delete them as they become obsolete, that is, after I copy-paste what I need over into the “fp” tab. And I’ll start by using the name spaces, and then later attach the metatable. I think my spike will make that easy, but I’ll still hook it up as early as I can.

Let’s begin with filter, because it requires both name spaces.

        _:test("name space filter", function()
            local even = function(v) return v%2==0 end
            local input = {101,102,103,104}
            local ra = ar.filter(even)
            _:expect(ra[1]).is(102)
            _:expect(ra[2]).is(104)
            local rd = kv.filter(even)
            _:expect(rd[1]).is(nil)
            _:expect(rd[2]).is(102)
            _:expect(rd[3]).is(nil)
            _:expect(rd[4]).is(104)
        end)

I decided to put both versions in one test to emphasize the common and differing aspects. This should fail trying to index nil.

5: name space filter -- Main:113: attempt to index a nil value (global 'ar')

Let’s code:

-- fp
-- functional programming support
-- RJ 20220225

ar = {}
kv = {}

ar.filter = function(t,pred)
    local result = {}
    for _i,v in ipairs(t) do
        if pred(v) then table.insert(result,v) end
    end
    return result
end

This should pass the array bit and fail on looking for filter in kv, I think.

Er …

5: name space filter -- attempt to index a function value

Oh. I didn’t pass the tables in on the test.

        _:test("name space filter", function()
            local even = function(v) return v%2==0 end
            local input = {101,102,103,104}
            local ra = ar.filter(iinput, even)
            _:expect(ra[1]).is(102)
            _:expect(ra[2]).is(104)
            local rd = kv.filter(input, even)
            _:expect(rd[1]).is(nil)
            _:expect(rd[2]).is(102)
            _:expect(rd[3]).is(nil)
            _:expect(rd[4]).is(104)
        end)

Now then.

5: name space filter -- attempt to index a nil value

I do not know what that is. Perhaps ra is nil?

Oh. Two i’s in input. Nice. Fix that and get the error I expect:

5: name space filter -- Main:116: attempt to call a nil value (field 'filter')

Back on track. Implement kv.filter:

kv.filter = function(t,pred)
    local result = {}
    for k,v in pairs(t) do
        if pred(v) then result[k] = v end
    end
    return result
end

The test runs. Commit: filter implemented in ar and kv name spaces.

Now I think I can remove my old tests for filter. Commit: remove old filter tests.

Now for map. We could take the easy way out and implement it twice, once in each name space. But if we do, the two implementations will be identical and we’re at risk of changing one and not the other when we get some clever idea. Let’s write the test, duplicate the code, and then see if we can improve it.

        _:test("name space map", function()
            local double = function(v) return 2*v end
            local input = {12,20}
            local ra = ar.map(input,double)
            _:expect(ra[1]).is(22)
            _:expect(ra[2]).is(40)
        end)

Here’s the array one. Should fail not finding map.

4: name space map -- Main:92: attempt to call a nil value (field 'map')

Implement.

ar.map = function(t,f)
    local result = {}
    for k,v in pairs(t) do
        result[k] = v
    end
    return result
end

I expect this to work. Well, no, I forgot to apply the function:

ar.map = function(t,f)
    local result = {}
    for k,v in pairs(t) do
        result[k] = f(v)
    end
    return result
end

Also no, because I did the arithmetic wrong:

        _:test("name space map", function()
            local double = function(v) return 2*v end
            local input = {12,20}
            local ra = ar.map(input,double)
            _:expect(ra[1]).is(24)
            _:expect(ra[2]).is(40)
        end)

What really happened was that I meant to type 11 in the input, and 22 was right. I fixed it the other way. Test runs.

Now extend the test to show the other way:

        _:test("name space map", function()
            local double = function(v) return 2*v end
            local input = {12,20}
            local ra = ar.map(input,double)
            _:expect(ra[1]).is(24)
            _:expect(ra[2]).is(40)
            local rd = kv.map(input,double)
            _:expect(rd[1]).is(24)
            _:expect(rd[2]).is(40)
        end)

Should fail not knowing map.

4: name space map -- Main:95: attempt to call a nil value (field 'map')

We could have committed a moment ago, with ar.map working. Too late now until we make it work:

ar.map = function(t,f)
    local result = {}
    for k,v in pairs(t) do
        result[k] = f(v)
    end
    return result
end
kv.map = ar.map

It just occurred to me that I can just stuff the function into both tables, avoiding code duplication (with a bit of in-memory duplication, a key and pointer in the kv table).

Test should run. It does. Commit: ar and kv map implemented.

Now, if you’ll permit me, I want to take a break. Let’s do a mini retrospective first.

Mini-Retro

This is going nicely. It looks like we’ll wind up with two small name spaces, ar and kv, each implementing whatever functional programming functions we want to provide, including at least one more, reduce, to be written.

The two spaces will have the same functions, and the identical function under the same name where that’s feasible, and a unique version where one is needed. I anticipate that with the addition of an __index function, these tables will serve directly as metatables for any tables we want to use that way.

And I foresee implementing metatables and constructors allowing something like this:

local even = function(v) return v%2==0 end
local d = Dictionary{a=2, b=3}
d:filter(even)

We’ll see how that goes. I’m going on break. You can break too, and think about what might come next. Or not: I’m not the boss of you.

Later: Back from break!

Reduce

I think the next thing to do is reduce. This is a very powerful function–in fact, both filter and map can be implemented using reduce–and it’s good for things like summing up arrays and such. We’ll begin with a test, of course, which will help describe what it has to do.

        _:test("reduce", function()
            local sum = function(result, value) return result+value end
            local input = {1,2,3,4}
            local answer = ar.reduce(input, 0, sum)
            _:expect(answer).is(10)
        end)

The reduce function takes a table, an initial value, and a function that returns an updated value and applies the function to all the elements of the table to produce its result.

Let’s build it.

ar.reduce = function(t, init, f)
    local result = init
    for k,v in pairs(t) do
        result = f(result,v)
    end
    return result
end

Our test runs. Let’s make sure it works on kv as well.

        _:test("reduce", function()
            local sum = function(result, value) return result+value end
            local input = {1,2,3,4}
            local answer = ar.reduce(input, 0, sum)
            _:expect(answer).is(10)
            answer = kv.reduce(input,0,sum)
            _:expect(answer).is(10)
        end)

Will fail not knowing reduce:

5: reduce -- Main:105: attempt to call a nil value (field 'reduce')

We implement:

kv.reduce = ar.reduce

That’s all there is to it: use the same function. Test runs. Commit: ar and kv reduce.

I think we’ll sum up and print this.

Summary

This went quite smoothly, and honestly, I like what we have so far. As provided, we have two little name spaces, ar and kv, that implement filter, map, and reduce for arrays and key-value tables respectively. We have just a few tests, but I’m confident that the functions are good.

In the next release or two, I will address some or all of these issues:

  • Verify use of reduce in creating collections, more for documentation than checking;
  • Provide wrappers such as Array and Dictionary to provide the functions via metatables;
  • Add some additional functions such as exists and every.

Then I plan to look at XST and see whether importing it can improve some code. I expect that it will.

Overall, though, just a fun exercise in the face of reality.

I hope it distracted you for a moment. It has done so for me.

See you again soon!



  1. A motto for every occasion, that’s my real motto.