No, not smiles and frowns. Algebra. At this moment I don’t think much code will be done today. Feel cute, might delete later.

I am facing an operational concern. I want to draw some pictures about set expressions and transformations, and I want to tell you about it as I do it. However, to draw I do better to remove the iPad from its “Magic” keyboard, and to type I do better to have it connected. Let me show you what happens when I draw with the iPad mounted to the keyboard:

horrible drawing of three boxes A RESTRICT B

The angle of the screen is wrong, tilted back maybe 30 degrees, and when you draw it wobbles. Together with my uncertain drawing, well you see what you get. I guess I’ll go into drawing mode and perhaps type notes to be expanded when I’m done drawing. I only want a few pictures anyway, I think.

Here’s what I drew:

complex diagram showing optimization

Not much prettier, but I can read it and it wasn’t so awkward. If I were an artist I’d have written a bookk full of pictures. Oh, wait, I did.

As I indicated with the red marks, I’m not sure that this all quite lines up, but the idea is this:

Given a set A of records, in a tuple (i.e. indexed by integers), to be restricted by a set B of city names, if there is an index of A, consisting of a set of sets of record numbers with city name scopes, we can get the answer by finding the record numbers for all the cities in B, and then doing a Scope Restrict operation, which selects elements based on their scopes.

If the scopes are integers, and the set A is a tuple, indexed by integers, then the desired records can be delivered by an indexing operation if the set is in memory, or a direct read if the set is on a drive.

We can get that set of integers by scope-restricting the index set by city name, and then (somehow) munging those sets all together into a suitable form for the Scope Restrict against A. (I’ve indicated that operation by “Reduce Rank”, but I’m not sure that’s what’’s needed.)

I hope that it is vaguely clear, if that’s a concept, that with a fairly simple operation or two, we can get the right numbers out of the index, and another simple set operation can fetch the records. The trick, in Extended Set Theory, is getting the “shapes” of the sets to be the shape expected by the various set operations, or, conversely, providing set operations that expect the sets we have.

If we were just coding this up, we’d look at our restricting set, see what field (City) we were restricting on, look in our indexes to see whether we have a City index, and if we have, restrict out the cities we need, resulting in a bunch of record numbers, which we’d use to fetch the records.

We’d have a bunch of code to do that, and because we’re good programmers, with a bit of testing, we could make it work.

(I have in the past made such things work, and I’m sure you have as well. I have sometimes had a very hard time getting the last bug out of such things. Perhaps you’ve had that experience as well. I hope not, but maybe you can relate.)

In XST, our set operations are generally quite small and tight and well tested. If we need another slightly different one, we know how to test and write it. The operations all fit together like Lego blocks, and they typically produce either exactly the right answer or total trash, depending on whether they are given inputs of the proper shape or not.

So we expect, with XST, that the series of operations we produce will work perfectly when it’s the right series, and fail utterly until we get it right. And … it’s only three or four operations, not many lines of custom-written code.

So we expect, if we’re doing XST right, that it’ll be quite easy.

An Existing Starting Example

We do have a starting example of that already, represented in this code:

function XSet:selectFieldValue(scope,value, index)
    local ix = self:findSelectIndex(scope,value,index)
    if ix then
        return self:scopeRestrict(ix)
    else
        return self:select(function(record, recordScope)
            return record:hasAt(value,scope)
        end)
    end
end

function XSet:findSelectIndex(scope,value,index)
    if not index then return nil end
    local path = XSet:tuple{"INDEXES",scope,value}
    return index:elementProjectPath(path)
end

function XSet:tuple(tab)
    local result = XSet()
    result.data = XTuple(tab)
    return result
end

function XSet:elementProjectPath(aTuple)
    local result = self
    for pathElement,_scope in aTuple:elements() do
        result = result:elementProject(pathElement)
        if not result then return nil end
    end
    return result
end

Now that code isn’t crystal clear, I’ll be the first to say that. That’s in part because finding the right index is a tricky problem, and in part because it is working down a nested path of elements, and in part, because it’s a first ad-hoc implementation.

We do have a test for it. Let’s examine the test.

        _:test("indexedRestrict", function()
            local r1 = XSet():addAt("Ron","name"):addAt("MI","state")
            local r2 = XSet():addAt("Bill","name"):addAt("VA","state")
            local r3 = XSet():addAt("Pat","name"):addAt("MI","state")
            local r4 = XSet():addAt("Chet","name"):addAt("GA","state")
            local people = XSet:tuple{r1,r2,r3,r4}
            local mi = XSet():addAt(1,NULL):addAt(3,NULL)
            local va = XSet():addAt(2,NULL)
            local ga = XSet():addAt(4,NULL)
            local indexes = XSet():addAt(mi,"MI"):addAt(va,"VA"):addAt(ga,"GA")
            local stateSet = XSet():addAt(indexes,"state")
            local allIndexes = XSet():addAt(stateSet,"INDEXES")
            local descoped = people:scopeRestrict(mi)
            _:expect(descoped:card()).is(2)
            local desiredIndex = indexes:elementProject("MI")
            local usedIndex = people:scopeRestrict(desiredIndex)
            _:expect(usedIndex).is(descoped)
            local oldSR = people.scopeRestrict
            local done = false
            people.scopeRestrict = function(A,B)
                done = true
                return oldSR(A,B)
            end
            local selected = people:selectFieldValue("state","MI", allIndexes)
            people.scopeRestrict = oldSR
            _:expect(done,"index not used").is(true)
            _:expect(selected,"wrong result").is(descoped)
        end)

So the size of this thing tells us that what we’re doing is tricksy.

Let me put in some commentary so that we can talk about what’s going on here.

        _:test("indexedRestrict", function()
        -- create four person records
            local r1 = XSet():addAt("Ron","name"):addAt("MI","state")
            local r2 = XSet():addAt("Bill","name"):addAt("VA","state")
            local r3 = XSet():addAt("Pat","name"):addAt("MI","state")
            local r4 = XSet():addAt("Chet","name"):addAt("GA","state")
            
        -- create set of the four people as a tuple
        -- indexed 1,2,3,4
            local people = XSet:tuple{r1,r2,r3,r4}
            
        -- hand-craft indexes, 
        -- record numbers for MI, VA, GA records
            local mi = XSet():addAt(1,NULL):addAt(3,NULL)
            local va = XSet():addAt(2,NULL)
            local ga = XSet():addAt(4,NULL)
            
        -- build index control set
            -- indexes { {1,3}@MI, {2}@VA, {3}@GA }
            local indexes = XSet():addAt(mi,"MI"):addAt(va,"VA"):addAt(ga,"GA")
            
            -- { { {1,3}@MI, {2}@VA, {3}@GA }@state }
            local stateSet = XSet():addAt(indexes,"state")
            
            -- { { { {1,3}@MI, {2}@VA, {3}@GA }@state }@INDEXES }
            local allIndexes = XSet():addAt(stateSet,"INDEXES")
            
            -- just fetch people using the index directly
            local descoped = people:scopeRestrict(mi)
            _:expect(descoped:card()).is(2)
            
            -- get the desired index, MI, directly
            local desiredIndex = indexes:elementProject("MI")
            -- use it ensuring same result
            local usedIndex = people:scopeRestrict(desiredIndex)
            _:expect(usedIndex).is(descoped)
            
            -- monkey patch to see where we went
            local oldSR = people.scopeRestrict
            local done = false
            people.scopeRestrict = function(A,B)
                done = true
                return oldSR(A,B)
            end
            
            -- the real test: 
            -- selectFieldValue should find and use index
            local selected = people:selectFieldValue("state","MI", allIndexes)
            -- remove monkey patch
            people.scopeRestrict = oldSR
            
            -- check answers
            _:expect(done,"index not used").is(true)
            _:expect(selected,"wrong result").is(descoped)
        end)

Most of the horribilitude of that test is in the setup, building the data and indexes. Note the complexity of the index set: it is highly nested:

{ { { {1,3}@MI, {2}@VA, {3}@GA }@state }@INDEXES }

It needs to be that complicated, because a given set might have auxiliary structures that are not INDEXES, because it might have INDEXES by more keys than state, and because each index is a set of tuple numbers indexed by the specific value of the key (state).

Hard to Think About

This stuff is hard to think about in any form, and set theory may make us more precise, but it doesn’t make the thinking any easier. It’s not easy to keep the structures in mind, even though it’s actually fairly simple:

MI-state-indexes = info["INDEXES"]["state"]["MI"]

All that path-select stuff is basically just figuring out what path we want (info->INDEXES->state->our-state), and then following it, returning nil (no such index) if we don’t find it.

What we have in this example is a hand-crafted ad-hoc tested but not proven refactoring, done on the fly. We refactor a specific selectFieldValue into a scopeRestrict if we have an index, and leave it as a long-hand search otherwise:

function XSet:selectFieldValue(scope,value, index)
    local ix = self:findSelectIndex(scope,value,index)
    if ix then
        return self:scopeRestrict(ix)
    else
        return self:select(function(record, recordScope)
            return record:hasAt(value,scope)
        end)
    end
end

What we’d like to have, and at this writing I don’t know how to get it, what we’d like to have is a way of expressing a “plan” for getting a specific result, such as

Restrict(A,B)

We might express it as a table or a list of some kind and then ultimately we want to rewrite that plan as a better plan, something like

b = Scopes(B)
if Singleton(b) then
    Z = info[INDEXES][b]
    Y = ScopeRestrict?(Z,B)
    X = ReduceRank?(Y)
    ScopeRestrict(A,X)
else
    Just do it hard way
end

For various kinds of top-level operations that can be sped up by auxiliary structures, we would like to have a rather formal rewrite rule. As we see in the “rule” above, the rule will do set operations on the auxiliary structures (called “info” here), and make hopefully simple decisions and then emit standardized new tables, which are then finally interpreted and voila! the right answer comes out.

Approach

I think I’ll approach this part of the XST exercise, to begin with, in a separate Codea program that’s focused on expressions in set theory, and transformations thereof. I’ll continue that until I understand enough to do it for real, or, ideally, until what we have in the new program is ready to “just plug in” to the XSet program.

Note the “just” in the above. Assume trouble ahead whenever anyone, including yourself, says “just”.

Perhaps we’ll start that tomorrow, or perhaps I’ll manage to think of some new set operations that need exploring before working on the expressions. I’m expecting to be confused for a while with this next step, and while part of me wants to dive in, another part is enjoying the experience of things going smoothly.

Maybe we can find a way to go smoothly with the new scheme. To do that, we probably have to come up with a simpler rewrite rule than the one we have in hand.

And there is another path that may need to be explored: flat data files. It would be great if there were a way to create a large flat-formatted file on the iPad, and use this smart rewrite stuff to access it. To do that, I’d have to learn how to use Lua’s file I/O, and the iPad’s. Or, of course, I could probably fake it with copy/paste, as I did for the CSV data.

We’ll see. For now, the thinking and code review are enough. Now I need to do some first of the year admin activities. We’ll see tomorrow what tomorrow brings.

I hope to see you then!