Aberration, Probably

Save me, I’m thinking about Extended Set Theory again.

Probably because Craft has ticked me off after the Dungeon hex problem has ticked me off, and probably because every few years it comes around on the piano of my mind, I was thinking about Extended Set Theory, this time in Codea Lua.

I last wrote anything serious about extended set theory in 2005. Around the 198x-1990 time frame I was deeply into it. Extended Set Theory (XST) was invented by Dave Childs, then at the University of Michigan. It is a reformalization of set theory that resolves the problem of ordered pairs.

Typically the ordered pair <a,b> is defined in set theory as {a, {a,b}}. Generally you are supposed to tilt your head at that and then move on. But this definition, and all the definitions provided up until Dave lead to problems. For example, what is the intersection of <a,b> and <a,c>? Yes, well, let’s not bother ourselves with that.

Childs’ formulation was a mathematical exercise, sort of, but it also seems that it was aimed at computer database software, as it seems to offer a more formal and compact way of dealing with data structures. If we look at relational databases, they are commonly implemented with structures that cannot be usefully defined in terms of relational algebra, indexes and the like. With XST, it can be sets all the way down, which is “better”.

XST has been a bête noire for me throughout my career. I’ve tried, with various teams, to implement it at least three times with what I might call “limited” results. Once I was invited to leave the company before the product got done, though it did in fact get done. Once the company folded before we could get done. And once, a pal and I wrote a very powerful set program for the PC and sold, if I correctly recall, six copies. Oh, and there is a series of articles on this site where I played with Ruby and set theory for a while.

You’d think I’d learn better. But Lua is interesting and I think it might be fun. Let’s talk about what it is.

Objectives

Have Fun: I think implementing whatever we implement here will be fun, because we’ll encounter some interesting issues and probably have to invent some things that are deep in the bag.
A Product: No. I see no possibility of a product here.
OO Learning: Yes. I think we’ll wind up with an interesting hierarchy of objects.
FP Learning: Also yes. I think we’ll probably want to create some functional programming concepts around our collections.
Useful: This is barely possible. The iOS file capability is pretty weak, so it’s not clear that anything useful will come of this.
Whimper: This series will probably die with a whimper, not with a bang. Like much of my art, it will be abandoned, not finished.
Net Net Net: This is likely to be an interesting way to observe how I try to solve ill-defined problems, and to laugh at the way that I fall into obvious traps.

Let’s just get started. I’ll try to avoid as much deep math as I can.

Getting Started

XST begins at the bottom of set theory. In set theory we say that “x is an element of Y”, when we mean that the set Y has x in it. In XST we have two notions, element and scope. This is usually written with the scope as a superscript not to be confused with exponentiation).

The format is

x ∈₁ Y

when we’re saying that x is in y at scope 1. but when we show the set contents, we write:

{ x¹, y², z³ }

By the way, that happens to be the format definition of the ordered triple <x,y,z>. So now you know extended set theory.

However, typing that into my articles is a pain, and we have no possibility of super and subscripts in the code, so we’ll be devising a syntax that’s easier to type as we go.

I will typically write x elt-a Y for x ∈<sub>a</sub> Y, and probably will write the above set as

{ x^1, y^2, z^3 }

Again, scope is not exponentiation. In case of confusion, I’ll try to be clear. Or, this just occurs to me, how about

{ x@1, y@2, z@3 }

Let’s try that. Belay the ^.

We generally use the scope as the “name” of a record or element. We could write:

{ Jeffries@last, Ron@first, Pinckney@city }

Formally, that just means the set with those three elements and three scopes, but you and I realize right away that this is a database record about me. Or at least part of one.

One caveat that I want to remind myself of. Formally, there is no meaning to this:

x@1

or to this

Jeffries@last

The element-scope thing can only appear inside a set. It’s not a free standing thing. There is an element x or Jeffries, and scopes 1 or last, but the combination doesn’t stand alone: it only refers to a particular occurrence of the element with a particular scope in a particular set.

This probably turns out to be important and it certainly means that whatever objects we build to represent the element and its scope inside a set will be somehow “under the covers”.

I mention it now because I think it’ll come up.

Before we get started suffice to say that there are lots of set operations, the good old union and intersection, of course, but many more, intended to make it possible to manipulate more highly-structured sets like this one:

{
  { Jeffries@last, Ron@first, Pinckney@city}@1  
  { Hendrickson@last, Chet@first, Atlanta@city }@2
  { Smith@last, John@first, Atlanta@city }@3
}

What will we implement? I’m not sure. We need some stories and a program.

The Program

I create a new Codea program and, with some difficulty, put it under source control with Working Copy. (Working Copy seems to have made some changes that cause it sometimes not to be able to create HEAD.) Anyway we’re there now.

I’d like to get the product to doing something that seems real ASAP.

Stories

Given a set of people, select elements based on a Restrict operation using another set of partial records.
Provide at least two alternate storage structures for the above operation.

Those two stories seem to be rather large, but I’m hoping that I’m smarter than I probably am. Let me explain the first one.

Given the sets below:

people = {
  { Jeffries@last, Ron@first, Pinckney@city}
  { Hendrickson@last, Chet@first, Atlanta@city }
  { Smith@last, John@first, Atlanta@city }
  { Jones@last, John@first, Pinckney@city }
}
sel = {
  { Jeffries@last }
  { Atlanta@city }
}

We want people:restrict(sel) to produce:

{
  { Jeffries@last, Ron@first, Pinckney@city }
  { Hendrickson@last, Chet@first, Atlanta@city }
  { Smith@last, John@first, Atlanta@city }
}

Note that the shape of the set sel is particularly interesting, because it’s not a relation. It’s a perfectly good set, however.

I think this story is a big one and that we’ll be slicing rather a lot before we get there, but we’ll try to work in as lean a fashion as we can.

Let’s get started.

Getting Started

I’m going to TDD this thing, of course. What can I possibly test?

Well, the definition of restrict is that A:restrict(B) returns all the elements of A such that there is an element b in B and A :intersect(B) is B. That is, there’s a B all of whose “fields” match the record in A.

So we’re going to need intersect. And, of course, we’ll need sets with elements and scopes.

I’ll start with this:

        _:test("Is element", function()
            local A = XSet()
            A:addAt("Jeffries", "last")
            A:addAt("Chet", "first")
            _:expect(A:hasAt("Jeffries", "last"), "has").is(true)
            _:expect(A:hasAt("Chet", "last"),"hasnt").is(false)
        end)

I’m using addAt and hasAt with the sense that the first argument is the element and the second the scope.

This test should explode for a while. I predict it fails looking for XSet, and it does. Code

XSet = class()

Expect fail on addAt.

2: Is element -- Tests:20: attempt to call a nil value (method 'addAt')

Implement:

function XSet:addAt()
    
end

Expect fail on hasAt.

2: Is element -- Tests:22: attempt to call a nil value (method 'hasAt')

Implement:

function XSet:hasAt()

end

Expect the test to fail expecting true and getting nil.

2: Is element has -- Actual: nil, Expected: true

Return true, expect fail on false:

function XSet:hasAt()
    return true
end

2: Is element hasnt -- Actual: true, Expected: false

OK, what happened there? I wrote just about the minimum code to get the test to pass a bit better each time. Now I have the full framework for the methods in place, but of course they don’t receive arguments or use them. Let’s try to do a little bit of actual coding.

I swear I just typed this in and it worked:

function XSet:init()
    self.contents = {}
end

function XSet:addAt(element, scope)
    if not self.contents[scope] then
        self.contents[scope] = {}
    end
    table.insert(self.contents[scope], element)
end

function XSet:hasAt(element, scope)
    local tab = self.contents[scope]
    if not tab then return false end
    for _i,elt in ipairs(tab) do
        if elt == element then return true end
    end
    return false
end

At this writing, the contents of a set is a table. The table is indexed by the scopes provided, whatever they are, and the contents at each scope are another table, and that table contains all the elements that were added at that scope location.

I think this is problematical. In particular, if we add something twice at the same scope, it will appear in the contents[scope] table twice. There is a deeper issue as well, having to do with nesting. Our people table above is a set containing sets. We will surely have to check sets for equality.

I think we’ll burn that bridge when we come to it.

Let’s push next toward intersect, now that we can check whether a set has given elements. I think that will lead us to the equality bridge soon enough.

        _:test("Intersect", function()
            local A = XSet()
            A: addAt("Jeffries", "last")
            A: addAt("Ron", "first")
            local B = XSet()
            B: addAt("Ron", "first")
            local C = A:intersect(B)
            _:expect(C:hasAt("Ron", "first")).is(true)
            _:expect(C:cardinality()).is(1)
        end)

Expect a fail looking for intersect:

3: Intersect -- Tests:32: attempt to call a nil value (method 'intersect')

Implement. I’d like to write this:

function XSet:intersect(anXSet)
    local result = XSet()
    for scope, element in pairs(anXSet) do
        if self:hasAt(element, scope) then
            result:addAt(element, scope)
        end
    end
    return result
end

(I put scope first there, because I think of it as the key. That may cause me trouble. We’ll see.

A bigger problem is that I don’t see how to implement pairs on an XSet. I think we’ll really need to work out how to do iterators. For now let’s write it out longhand. And for now, I’ll violate the parameter a bit and pull out its contents.

I expanded my non-working first cut to this:

function XSet:intersect(anXSet)
    local result = XSet()
    local cont = anXSet.contents
    for scope,elements in pairs(cont) do
        for element in pairs(elements) do
            if self:hasAt(element, scope) then
                result:addAt(element, scope)
            end
        end
    end
    return result
end

It fails, returning false, that Ron@first isn’t in there.

Let’s implement cardinality (size), and see if, as I suspect, there’s nothing in the result.

function XSet:cardinality()
    local card = 0
    for scope,elements in pairs(self.contents) do
        card = card + #elements
    end
    return card
end

I can use # there because I use table.insert to create the elements tables. At least that’s my story.

3: Intersect  -- Actual: 0, Expected: 1

Does cardinality really work? I don’t know, but I can find out.

        _:test("Intersect", function()
            local A = XSet()
            A: addAt("Jeffries", "last")
            A: addAt("Ron", "first")
            _:expect(A:cardinality(), "#A").is(2)
            local B = XSet()
            B: addAt("Ron", "first")
            _:expect(B:cardinality(), "#B").is(1)
            local C = A:intersect(B)
            _:expect(C:hasAt("Ron", "first")).is(true)
            _:expect(C:cardinality()).is(1)
        end)

The other checks work, so I think we can be sure that what’s failing is that intricate intersect. I think I’ll just do some prints.

That quickly tells me that I’m checking 1@first, not Ron@first. Yes, I forgot the index in my call to pairs:

function XSet:intersect(anXSet)
    local result = XSet()
    local cont = anXSet.contents
    for scope,elements in pairs(cont) do
        for _i, element in pairs(elements) do
            if self:hasAt(element, scope) then
                result:addAt(element, scope)
            end
        end
    end
    return result
end

My tests now run. Well past time to commit. But first, I’m already tired of mistyping “cardinality”. Rename that to card. Commit: XSet class has addAt, hasAt, card, and intersect methods.

But Now …

But now, I think we’d like to do something about that double loop iterating the set. I have two ideas.

The “right” idea, is probably to learn (again) how to do iterators in Lua and then implement next, which I believe is how one would do it.

There’s a semi-right idea as well, however, which would be to iterate a function call. I think I can do that right now, and then be in shape for studying iterators (again).

        _:test("Iterate", function()
            local A = XSet()
            A:addAt("Jeffries", "last")
            A:addAt("Ron", "first")
            A:addAt("Hughes", "last")
            local count = 0
            A:iterate(function(element, scope)
                count = count + 1
            end)
            _:expect(count).is(3)
        end)

That’ll be enough to get me started. Expect fail on iterate not existing.

4: Iterate -- Tests:45: attempt to call a nil value (method 'iterate')

Implement empty:

function XSet:iterate(f)
    
end

Expect count wrong.

4: Iterate  -- Actual: 0, Expected: 3

Code it:

function XSet:iterate(f)
    for scope,elements in pairs(self.contents) do
        for _i, element in ipairs(elements) do
            f(element, scope)
        end
    end
end

That passes. Recode card:

function XSet:card()
    local card = 0
    self:iterate(function(e,s) card = card+1 end)
    return card
end

This is slower, since it no longer uses the # trick but it is also safer, I think.

Record intersect to use iterate.

function XSet:intersect(anXSet)
    local result = XSet()
    local cont = anXSet.contents
    anXSet:iterate(function(element,scope)
        if self:hasAt(element,scope) then
            result:addAt(element, scope)
        end
    end)
    return result
end

So that’s a bit nicer. Commit: implement iterate, use in card and intersect.

And Now?

And now we’re facing an issue. Suppose we were to do union. The union of two sets contains all the elements of the one and all the elements of the other. What if they each have an element in common?

If we implement that naively, we could wind up with a set like this:

{ Jeffries@last, Ron@first, Ron@first }

Wouldn’t that be bad? Well, formally, it isn’t bad, because if we ask whether Ron is at first in the set, we’ll get the answer yes and there you go. The fact that it’s in there twice isn’t mathematically interesting.

But if They were to use iterate to print the set, well then, They’d see Ron@first twice, and They might not like that. There are two well-known solutions to this concern.

One is to implement all the operations so that they never insert duplicates. Depending on how things are stored, that can be easy or hard, fast or slow. The other solution is to implement a function, say weed, that removes all duplicates if there are any, and to use that function when it matters.

Both solutions have merit, and of course both invite a sort of status bit on an XSet saying whether it is already weeded.

Today, I don’t propose to choose either of these approaches, but depending on how we implement our restrict test, I can see it coming up. Suppose we select on Ron@first and Pinckney@City. Depending on the specifics of how the loops are written we might find my record and insert it into the answer twice.

We’ll see what happens. I’m sure it’ll be interesting.

Summary

That’ll do for today. I tried to be really low-key with my tests, predicting the failures, implementing just enough to get the next failure, and testing everything I could think of. (I’m sure there are more things to test but I feel that what we have is pretty solid.)

I picked the first storage scheme that came to mind, a table of elements stored by scope. There are certainly other possibilities, including scopes stored by elements, and some kind of pair stored once and done. We would like to wind up not caring about how things are stored, except that addAt, hasAt, and iterate need to all make the same assumptions about how things are stored.

We might even have multiple storage schemes. We might store things à la JSON or CSV. I’m hoping we’ll get to at least one alternate scheme, since I already have a story for it.

OK, I had fun today, unlike beating my head against hexagons or physics fails.

See you next time!