I’ve had an idea about a change–perhaps an improvement–to the class/method reporter. As I write this, even I don’t know what happens next.

It’s 0730. Since I had this idea around 4 AM, I’ve been quite reasonable about not leaping up at oh dark thirty to work on it. But now its time has come.

The idea is this: the class-method reporter we’re working on has some interesting code dealing with Codea’s two forms of comments. The original code looked like this:

            sc1,ec1=string.find(str1,"%-%-%[%[")  
            if sc1 ~= nil then
                startComment=true
            end            
            --sc2,ec21=string.find(str1,"%-%-%]%]")  
            sc2,ec21=string.find(str1,"%]%]")  
            if sc2 ~= nil then
                startComment=false
            end  
            
            comment=false
            sc3,ec3=string.find(str1,"%-%-")            
            if sc3 ~= nil then
                comment=true
                sc4,ec4=string.find(str1,"class")
                if sc4~=nil and sc4<sc3 then
                    comment=false
                end
                sc5,ec5=string.find(str1,"function")
                if sc5~=nil and sc5<sc3 then
                    comment=false
                end
            end  
            
            process=true
            if comment or startComment then
                process=false
            end 

It’s pretty easy to see what’s going on here. Ah, that reminds me. Dave wanted me to mention that he writes most of his programs while watching TV. He doesn’t preplan the programs, he just writes them. When the program works, he’s done. He was concerned that people would think he’s a terrible coder.

I don’t think, that, and I can tell you why. When we look at the code above, it is broken into small blocks, and each block does one conceptual thing. Most of the variables used in that block are unique to the block, and there are a few that are used in the next block or blocks. He doesn’t even reuse obvious temp variables like the scX ecX ones: they are all one-time assign, check, never to be looked at again.

Only the ones with true forward purpose are used again, such as comment and startComment, which are state flags telling us whether to process or not.

When we code in a recreational fashion–well, when I do–a thing commonly happens. Things are going well, and then suddenly they aren’t, and it takes me a long time to figure out why. When that happens enough–and with my brain it happens quite often–I come up with ways of programming that tend to prevent those mistakes.

When I’m programming recreationally, and I do that sometimes, I actually usually wind up with code that has much more to object to than the code we’re seeing here. I think the reason is that my good habits, lots of functions, longer names, and so on, are difficult to use recreationally: they require too much concentration to do them while watching TV. And I use tiny variable names, because it’s a pain to type on the iPad’s on-screen keyboard.

But the code above? It’s clear, it’s divided into sections that express ideas, and it doesn’t have many weird, long connections between variables at the top and the bottom.

If Dave were to ask me “given how I program, what one thing would make it better to read”, I’d probably suggest coming up with a simple naming convention for variables that have forward meaning and those that do not. An underbar on the truly local ones or vice versa. Some textual cue that this variable will be seen again.

Anyway, this isn’t bad code at all. It’s procedural and linear, but it is quite clear. Nothing I’m saying here should be taken as dissing the code, much less the person who wrote it. I’ve learned lessons from this code, and from other programs from the same author.

What I’m doing here is putting this perfectly reasonable code into a form that I prefer, a form that I believe is generally better for code that is going to be read and maintained, for a long time. For these small programs, what I do is probably too much. Doing what I do for large programs is more valuable, and may not even be enough.

Where Was I?

Oh yes, I had an idea about comments, probably triggered by the discovery yesterday of a comment issue where my code may not exactly reproduce what the original code did. Both programs get the same answer but they do it a bit differently, and the original program might well have had a case where the refactored one would have broken.

Anyway, it came to me at 0400 that the real thing with comments is that we just wish they weren’t there. Everything to the right of a “–” comment should be ignored, and everything between “–[[” and “]]” should be ignored.

Then I thought: could the Lua gsub, the global substitution function, get rid of comments for us? If so, couldn’t we just strip out the comments at the beginning, and then carry on as if there had never been any, simplifying everything?

I’m here to find out. I’m going to write at least two tests. The first one is just to get warmed up.

        _:test("gsub removes -- comments", function()
            local prog = "abcd--efgh ijkl\n"
            local res = prog:gsub("%-%-.*\n","\n")
            _:expect(res).is("abcd\n")
        end)

This test runs. Note that it substitutes anything starting with “–”, up to a newline, with a newline. I am thinking of doing this substitution on the entire tab contents at once. We can get a bit more information this way:

        _:test("gsub removes -- comments", function()
            local prog = "abcd--efgh ijkl\n"
            local res,numberOfSubs = prog:gsub("%-%-.*\n","\n")
            _:expect(numberOfSubs).is(1)
            _:expect(res).is("abcd\n")
        end)

The gsub returns a second result, the number of substitutions it has made.

At this point, I conclude that we can run this substitution on the whole tab contents and it will remove all the – comments. Let’s beef up this test to be sure it doesn’t go too far:

        _:test("gsub removes -- comments", function()
            local prog = "abcd--efgh ijkl\nMNOP\nQRST\n"
            local res,numberOfSubs = prog:gsub("%-%-.*\n","\n")
            _:expect(numberOfSubs).is(1)
            _:expect(res).is("abcd\nMNOP\nQRST\n")
        end)

This fails! The result is just “abcd\n”. Why? Because gsub matches are greedy: they find the longest possible match. There is a way to make them not greedy. Let’s see if I can find it. Ah:

        _:test("gsub removes -- comments", function()
            local prog = "abcd--efgh ijkl\nMNOP\nQRST\n"
            local res,numberOfSubs = prog:gsub("%-%-.-\n","\n")
            _:expect(numberOfSubs).is(1)
            _:expect(res).is("abcd\nMNOP\nQRST\n")
        end)

The change is “.-“ in the pattern rather than “.*”. That causes Lua’s pattern matching to stop on the first match, rather than the longest one it can find.

Let’s make the test a bit harder, with two comments.

        _:test("gsub removes -- comments", function()
            local prog = "abcd--efgh ijkl\nMNOP--XXXX\nQRST\n"
            local res,numberOfSubs = prog:gsub("%-%-.-\n","\n")
            _:expect(numberOfSubs).is(2)
            _:expect(res).is("abcd\nMNOP\nQRST\n")
        end)

Still works fine. Now the substitution for the “–[[” comments promises to be more tricky. First I’ll concoct an example:

        _:test("gsub removes --[[ comments", function()
            local prog = "abc--[[def\nghi\njkl]]mno\npqr\n"
            local res,numberOfSubs = prog:gsub("","")
            _:expect(res).is("abcmno\npqr\n")
        end)

Let me see if I can create a pattern for that.

I’m going to try this:

        _:test("gsub removes --[[ comments", function()
            local prog = "abc--[[def\nghi\njkl]]mno\npqr\n"
            local res,numberOfSubs = prog:gsub("%-%-%[%[.-%]%]","")
            _:expect(res).is("abcmno\npqr\n")
        end)

I’m replacing the comment with an empty string. I don’t expect this to work, though I am hopeful. My expectations are based on my long experience with patterns: I usually get them wrong.

6: gsub removes --[[ comments  -- OK

Well how about that?

Let’s make the test harder. (I think it is possible to nest these comments, and lord knows what happens if you put one inside a block string. We’re not going to deal with people who do things like that. But more than one block comment in a program is quite a reasonable thing to see.

        _:test("gsub removes --[[ comments", function()
            local prog = "abc--[[def\nghi\njkl]]mno\npqr--[[stu\nvwx\nyza]]bcd\n"
            local res,numberOfSubs = prog:gsub("%-%-%[%[.-%]%]","")
            _:expect(res).is("abcmno\npqrbcd\n")
            _:expect(numberOfSubs).is(2)
        end)

This test runs. I think I now know two gsub calls that will remove all the comments from a tab. If that’s the case in anger, we can simplify our program rather a lot.

We need a new function, that removes all comments from a string. And, we really owe it to posterity to test it. But these test strings are really hard to write. So let’s use our sample tab as the input:

-- Sample
-- RJ 20250525

--[[ some kind of comment
more comment
--]]

function sample(a,b,c)
    local x = 15
end

Barker = class()

function Barker:init(dog, name)
    self.dog = dog
    self.name = name
end

function Barker:speak()
    SoundMachine:sound(self:woof())
end

function Barker:woof()
   return "woof" 
end

Porker = class()

function Porker:init(piggie, piggieName)
    
end

--[[
function shouldIgnore()
end
--]]

math.nothing = function(value)
    return value
end

function nothing(value)
    return value
end

function bizarreCommentNotFound() --[[ comments aren't that clever.
    trailing comment --]]
end

This has a couple of comments of each type, so we should be OK testing against it.

We’ll have to figure out a way to get the proper result though.

One problem at a time. Ah, I know. I’ll save it to another tab, inspect it, and use it as the answer. Another golden result. OK, here goes.

        _:test("comment removal", function()
            local prog = readProjectTab("Sample")
            local res = removeComments(prog)
            saveProjectTab("Uncommented", res)
            _:expect(2).is(1)
        end)
    end)
end

function removeComments(prog)
    local step1 = prog:gsub("%-%-.-\n","\n")
    local step2 = step1:gsub("%-%-%[%[.-%]%]","")
    return step2
end

I’ll let this rip and see what I get.

Here’s what’s in “Uncommented”:




more comment


function sample(a,b,c)
    local x = 15
end

Barker = class()

function Barker:init(dog, name)
    self.dog = dog
    self.name = name
end

function Barker:speak()
    SoundMachine:sound(self:woof())
end

function Barker:woof()
   return "woof" 
end

Porker = class()

function Porker:init(piggie, piggieName)
    
end


function shouldIgnore()
end


math.nothing = function(value)
    return value
end

function nothing(value)
    return value
end

function bizarreCommentNotFound() 
    trailing comment 
end

That “more comment” at the top concerns me. Oh and we clearly have to run the –[[ first, lest the other guy delete them.

function removeComments(prog)
    local step1 = prog:gsub("%-%-%[%[.-%]%]","") -- must be first
    local step2 = step1:gsub("%-%-.-\n","\n")
    return step2
end

Now we get this:





function sample(a,b,c)
    local x = 15
end

Barker = class()

function Barker:init(dog, name)
    self.dog = dog
    self.name = name
end

function Barker:speak()
    SoundMachine:sound(self:woof())
end

function Barker:woof()
   return "woof" 
end

Porker = class()

function Porker:init(piggie, piggieName)
    
end



math.nothing = function(value)
    return value
end

function nothing(value)
    return value
end

function bizarreCommentNotFound() 
end

And that is exactly what I would hope for. Note that it even unwound that last one, which now WILL be found. We have a more compact and more capable comment remover.

I’m tempted to remove multiple blank lines but there’s no real point. Fix the test now to use the Uncommented rather than save it.

        _:test("comment removal", function()
            local prog = readProjectTab("Sample")
            local res = removeComments(prog)
            --saveProjectTab("Uncommented", res)
            local expected = readProjectTab("Uncommented")
            _:expect(res).is(expected)
        end)

Tests all run. Let’s plug in our new function and expect the golden to need to be changed.

function processTabContents(str, extra)
    local tab = {}
    comment=false
    startComment=false
    local uncommented = removeComments(str)
    for str1 in uncommented:gmatch("[^\r\n]+") do
        
        --startComment = commentState(startComment, str1)
        --comment = oneLineCommentState(str1)
        
        if not comment and not startComment then
            local s1,e1,s2,e2, s3,e3
            s1,e1=string.find(str1,"class")
            if s1~=nil then
                s2,e2=string.find(str1,")")
                if s2~= nil then
                    str1=removeSpaces(string.sub(str1,1,e2))                            
                    table.insert(tab,extra.."    "..str1)
                end
            else
                s2,e2=string.find(str1,"function ")
                if s2~=nil then
                    s3,e3=string.find(str1,")")
                    if s3~= nil then
                        table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
                    end
                end
            end
        end
    end
    return tab
end

I’ve kept the two flags, for now, to minimize the changes. Let’s run and check the result. We do expect a fail because of our new function being discovered.

1: Sample  -- Actual: 



                    Making : Sample
                            function sample(a,b,c)
                        Barker = class()
                            function Barker:init(dog, name)
                            function Barker:speak()
                            function Barker:woof()
                        Porker = class()
                            function Porker:init(piggie, piggieName)
                            function nothing(value)
                            function bizarreCommentNotFound(), Expected: 



                    Making : Sample
                            function sample(a,b,c)
                        Barker = class()
                            function Barker:init(dog, name)
                            function Barker:speak()
                            function Barker:woof()
                        Porker = class()
                            function Porker:init(piggie, piggieName)
                            function nothing(value)

Yes! Perfect! Everything the same except we discovered that very obfuscated example. A noticeable improvement. Press the Make Golden button and we’re good to go. Tests all run.

Let’s refactor to remove the flags and related ifs:

function processTabContents(str, extra)
    local tab = {}
    local uncommented = removeComments(str)
    for str1 in uncommented:gmatch("[^\r\n]+") do
        
        do
            local s1,e1,s2,e2, s3,e3
            s1,e1=string.find(str1,"class")
            if s1~=nil then
                s2,e2=string.find(str1,")")
                if s2~= nil then
                    str1=removeSpaces(string.sub(str1,1,e2))                            
                    table.insert(tab,extra.."    "..str1)
                end
            else
                s2,e2=string.find(str1,"function ")
                if s2~=nil then
                    s3,e3=string.find(str1,")")
                    if s3~= nil then
                        table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
                    end
                end
            end
        end
    end
    return tab
end

I just converted the if to do, to preserve my locals, not that the block structure matters, but it’s a bit more tidy for the moment.

Now there are two functions that we don’t need any more, so we can delete them:

function commentState(startComment, str1)
    -- may change or just return input
    local sc1,ec1,sc2,ec21
    sc1,ec1=string.find(str1,"%-%-%[%[")  
    if sc1 ~= nil then
        startComment=true
    end            
    --sc2,ec21=string.find(str1,"%-%-%]%]")  
    sc2,ec21=string.find(str1,"%]%]")  
    if sc2 ~= nil then
        startComment=false
    end
    return startComment
end

function oneLineCommentState(str1)
    local sc3,ec3,sc4,ec4,sc5,ec5
    local comment=false
    sc3,ec3=string.find(str1,"%-%-")            
    if sc3 ~= nil then
        comment=true
        sc4,ec4=string.find(str1,"class")
        if sc4~=nil and sc4<sc3 then
            comment=false
        end
        sc5,ec5=string.find(str1,"function")
        if sc5~=nil and sc5<sc3 then
            comment=false
        end
    end
    return comment
end

We can also remove the tests for those:

        _:test("commentState", function()
            local cs = false
            cs = commentState(cs, "foo--[[bar")
            _:expect(cs).is(true)
            cs = commentState(cs, "mumble")
            _:expect(cs).is(true)
            cs = commentState(cs, "]] baz")
            _:expect(cs).is(false)
        end)

        _:test("comment removal", function()
            local prog = readProjectTab("Sample")
            local res = removeComments(prog)
            --saveProjectTab("Uncommented", res)
            local expected = readProjectTab("Uncommented")
            _:expect(res).is(expected)
        end)

Tests should still run. And they do.

Sweet. Let’s commit: replaced state-driven comment code with gsub replacement.

Reflection

So. What happened here? It seems that the gsub approach is somewhat better than the old one, in that it handles one more difficult case. It is certainly more compact, and likely less error-prone, though it was not trivially easy to write: pattern matching is like that.

It’s compact. However it’s arguably a bit more obscure. If I were really adept with patterns, I could imagine going that way from the beginning, but I think Dave’s original approach is more likely as a starting point. It was OK as an ending point, as well, even though I’d argue this is “better”.

The new scheme surely uses more memory, somewhere. It creates a new string almost as long as the original tab. And, as written, it doesn’t even release that old string, until the tab processing function returns. We could address that if we needed to, but it might be a bit tricky and would look odd.

The new scheme is quite likely to be slower, again because it copies essentially the entire tab. That’ll be done at library speed, but it’s still not going to be zero.

So it is better? I think so, but then, I did it. If I didn’t think so, I’d remove it.

How about the whole program? Are we finished with it?

Let’s take a look at the remaining big chunk and see if there’s anything we can do with it.

One More Question – Columbo

function processTabContents(str, extra)
    local tab = {}
    local uncommented = removeComments(str)
    for str1 in uncommented:gmatch("[^\r\n]+") do
        
        do
            local s1,e1,s2,e2, s3,e3
            s1,e1=string.find(str1,"class")
            if s1~=nil then
                s2,e2=string.find(str1,")")
                if s2~= nil then
                    str1=removeSpaces(string.sub(str1,1,e2))                            
                    table.insert(tab,extra.."    "..str1)
                end
            else
                s2,e2=string.find(str1,"function ")
                if s2~=nil then
                    s3,e3=string.find(str1,")")
                    if s3~= nil then
                        table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
                    end
                end
            end
        end
    end
    return tab
end

We have here a function that calls a function, and then does stuff. We prefer a pattern where a function either does a thing, or just calls other functions, but not both. We can do a quick extract. Here’s that:

function processTabContents(str, extra)
    local uncommented = removeComments(str)
    tab = processUncommentedTab(uncommented, extra)
    return tab
end

function processUncommentedTab(uncommented,extra)
    local tab = {}
    for str1 in uncommented:gmatch("[^\r\n]+") do
        
        do
            local s1,e1,s2,e2, s3,e3
            s1,e1=string.find(str1,"class")
            if s1~=nil then
                s2,e2=string.find(str1,")")
                if s2~= nil then
                    str1=removeSpaces(string.sub(str1,1,e2))                            
                    table.insert(tab,extra.."    "..str1)
                end
            else
                s2,e2=string.find(str1,"function ")
                if s2~=nil then
                    s3,e3=string.find(str1,")")
                    if s3~= nil then
                        table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
                    end
                end
            end
        end
    end
    return tab
end

Tests run. Commit: extracted processUncommentedTab function.

Now we can remove that do, moving the locals upward …

function processUncommentedTab(uncommented,extra)
    local tab = {}
    local s1,e1,s2,e2, s3,e3
    for str1 in uncommented:gmatch("[^\r\n]+") do
        
        s1,e1=string.find(str1,"class")
        if s1~=nil then
            s2,e2=string.find(str1,")")
            if s2~= nil then
                str1=removeSpaces(string.sub(str1,1,e2))                            
                table.insert(tab,extra.."    "..str1)
            end
        else
            s2,e2=string.find(str1,"function ")
            if s2~=nil then
                s3,e3=string.find(str1,")")
                if s3~= nil then
                    table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
                end
            end
        end
    end
    return tab
end

But we still have a lot going on here. What’s happening inside that loop? We’re processing one uncommented line.

function processUncommentedTab(uncommented,extra)
    local tab = {}
    for str1 in uncommented:gmatch("[^\r\n]+") do
        processOneLine(str1,tab, extra)
    end
    return tab
end

function processOneLine(str1, tab, extra)
    local s1,e1,s2,e2, s3,e3
    
    s1,e1=string.find(str1,"class")
    if s1~=nil then
        s2,e2=string.find(str1,")")
        if s2~= nil then
            str1=removeSpaces(string.sub(str1,1,e2))                            
            table.insert(tab,extra.."    "..str1)
        end
    else
        s2,e2=string.find(str1,"function ")
        if s2~=nil then
            s3,e3=string.find(str1,")")
            if s3~= nil then
                table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
            end
        end
    end
end

I had to move the locals. Should have left them where they were, perhaps. But who knew?

Now we’re down to just one kind of, well, complicated function. Can we improve it?

There are some things that make that tricky. The fact that we have to try the find and then do the if makes the code a bit extra complicated and hard to refine. We could perhaps do better with two finds, one for the if and one for the data, but that seems like a poor idea.

Can we perhaps just find the string we’re looking for? Find does find a pattern, I think.

I’m confident in my tests, so let me just do this. The first branch of that if finds “class” and then “)” and if it does, outputs a line. Let’s see …

What I did didn’t work, and I think I’m missing a commit. Yes. Darn. Got to undo.

Easy enough. Commit: pulled out processOneLine function

One issue with what I did was that it changed the output from …

Whoa!

I hate to report this but while the tests do run, the D2 report does not. And I bet I know what’s happened. Let’s checkout some prior versions and see where the change occurred.

The defect “crept in” when I pulled out the process uncommented tab function, here:

function processTabContents(str, extra)
    local uncommented = removeComments(str)
    tab = processUncommentedTab(uncommented, extra)
    return tab
end

function processUncommentedTab(uncommented,extra)
    local tab = {}
    for str1 in uncommented:gmatch("[^\r\n]+") do
        processOneLine(str1,tab, extra)
    end
    return tab
end

The defect isn’t quite obvious, but what it is is that tab is ot local in processTabContents. And we’re still using the global tab up top.

I think this is the fix:

function processTabContents(str, extra)
    local uncommented = removeComments(str)
    return processUncommentedTab(uncommented, extra)
end

Test. Yes! All better. Commit: fix defect using global tab in processTabContents.

We need a stronger test suite. And we need to get rid of that global as well.

Reflection II

When we’re refactoring, globals can really be a problem. That’s exacerbated in Lua, whose default is that if it doesn’t recognize a variable, it’s a global. So when we forget to say local, bad things can happen.

Moving Right Along

Let’s first see if we can remove that global entirely.

It all starts here:

function report()
    project="D2"
    font("Courier")
    textMode(CORNER)
    dy=0
    tab={}   
    r=listProjectTabs(project)
    local tabData
    for a,b in pairs(r) do
        str=readProjectTab(project..":"..b)
        tabData = processTab(str,b)
        appendTable(tab,tabData)
    end
    table.insert(tab,"\n")
    table.insert(tab,"\n")
    table.insert(tab,"END of "..project)
end

If we were to make the table local, and return it, that would help a lot. And this function hasn’t been generalized yet, and it includes a lot of weird stuff about font and so on.

One problem at a time.

Make it local and return it.

function report()
    project="D2"
    font("Courier")
    textMode(CORNER)
    dy=0
    local tab={}   
    r=listProjectTabs(project)
    local tabData
    for a,b in pairs(r) do
        str=readProjectTab(project..":"..b)
        tabData = processTab(str,b)
        appendTable(tab,tabData)
    end
    table.insert(tab,"\n")
    table.insert(tab,"\n")
    table.insert(tab,"END of "..project)
    return tab
end

Change the button:

function setup()
    GlobalTable = {}
    dy = 0
    --report()
    parameter.action("Make Golden", function()
        makeGolden()
    end)
    parameter.action("report D2", function()
        GlobalTable = report()
    end)
end

function draw()
    background(40, 40, 50)
    fill(255)
    for a,b in pairs(GlobalTable) do
        text(b,100,HEIGHT-a*20+dy)
    end
end

I renamed the darn thing to “GlobalTable” and used that throughout. Now I wonder if tab is even defined.

When I type print(tab) in the console, I do get a table. Someone is still working with the tab. The table is empty.

Hm, this exists:

        _:test("Sample", function()
            local str = readProjectTab("Sample")
            local result = processTab(str,"Sample")
            tab = result
            local tabString = table.concat(result, "\n")
            local tstString = readLocalData("tab")
            _:expect(tabString).is(tstString)
        end)

I don’t like that. Remove it, see what breaks. Nothing breaks. tab still exists.

        _:before(function()
            project = "Making"
            font("Courier")
            textMode(CORNER)
            dy=0
            tab={}   
        end)

Ha! I did it to myself! Remove that. Test. Works. Commit: remove tab global everywhere.

Let’s see what other globals exist. That can be a long list. But we have a program here that displays long lists. Let’s add a button:

    parameter.action("dump globals", function()
        GlobalTable = {}
        for k,v in pairs(_G) do
            table.insert(GlobalTable,"                    "..k)
        end
        table.sort(GlobalTable)
        print(#GlobalTable)
    end)

There are 348 globals in _G. One of them is dy, which we use in our program. I don’t see any short names starting with s or b or e. So that’s good.

Commit: added Dump Globals button.

OK, where were we? Oh, yes, looking at this last function:

function processOneLine(str1, tab, extra)
    local s1,e1,s2,e2, s3,e3
    
    s1,e1=string.find(str1,"class")
    if s1~=nil then
        s2,e2=string.find(str1,")")
        if s2~= nil then
            str1=removeSpaces(string.sub(str1,1,e2))                            
            table.insert(tab,extra.."    "..str1)
        end
    else
        s2,e2=string.find(str1,"function ")
        if s2~=nil then
            s3,e3=string.find(str1,")")
            if s3~= nil then
                table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
            end
        end
    end
end

My first attempt to simplify that failed, not giving me the parens and superclass at the end of the class lines in the output.

We don’t have a decent test for that. Let me change the sample:

Porker = class()

To:

Porker = class(Animal)

Test should fail.

1: Sample  -- Actual: 



                    Making : Sample
                            function sample(a,b,c)
                        Barker = class()
                            function Barker:init(dog, name)
                            function Barker:speak()
                            function Barker:woof()
                        Porker = class(Animal)
                            function Porker:init(piggie, piggieName)
                            function nothing(value)
                            function bizarreCommentNotFound(), Expected: 



                    Making : Sample
                            function sample(a,b,c)
                        Barker = class()
                            function Barker:init(dog, name)
                            function Barker:speak()
                            function Barker:woof()
                        Porker = class()
                            function Porker:init(piggie, piggieName)
                            function nothing(value)
                            function bizarreCommentNotFound()

New result is correct. Make it golden.

Now let’s see what we can do about those if statements.

function processOneLine(str1, tab, extra)
    local s1,e1,s2,e2, s3,e3
    
    s1,e1=string.find(str1,"class%(.-%)")
    if s1~=nil then
            str1=removeSpaces(string.sub(str1,1,e1))                            
            table.insert(tab,extra.."    "..str1)
    else
        s2,e2=string.find(str1,"function ")
        if s2~=nil then
            s3,e3=string.find(str1,")")
            if s3~= nil then
                table.insert(tab,extra.."        "..string.sub(str1,s2,e3))
            end
        end
    end
end

This runs the tests. My error before was not escaping the leading paren, which made it into a group in the pattern.

Let’s do that sort of thing again. And let’s not store our result back into str1 while we’re at it.

This passes the tests:

function processOneLine(str1, tab, extra)
    local s1,e1,s2,e2, s3,e3
    
    s1,e1=string.find(str1,"class%(.-%)")
    if s1~=nil then
        local unsp = removeSpaces(string.sub(str1,1,e1))                            
        table.insert(tab,extra.."    "..unsp)
    else
        s2,e2=string.find(str1,"function .-%(.-%)")
        if s2~=nil then
            table.insert(tab,extra.."        "..string.sub(str1,s2,e2))
        end
    end
end

That function pattern is a little tricky. Note that it requires at least one space before the left paren. That had to be there because the original code looks for “function “ with a space. That keeps it from discovering functions like this one:

math.nothing = function(value)
    return value
end

I’m not sure how intentional that is. But when we’re refactoring, we’re trying to preserve behavior and this seems to do it. My first attempt, without the explicit space, did try to find the math.nothing line, but of course only said “function(value)” which wasn’t terribly interesting.

We can commit this: using different patterns in processOneLine.

Let’s review processOneLine one more time.

function processOneLine(str1, tab, extra)
    local s1,e1,s2,e2, s3,e3
    
    s1,e1=string.find(str1,"class%(.-%)")
    if s1~=nil then
        local unsp = removeSpaces(string.sub(str1,1,e1))                            
        table.insert(tab,extra.."    "..unsp)
    else
        s2,e2=string.find(str1,"function .-%(.-%)")
        if s2~=nil then
            table.insert(tab,extra.."        "..string.sub(str1,s2,e2))
        end
    end
end

Well, there are a couple of unused locals there. Remove s3,e3. I think I want to move the local declarations thusly:

function processOneLine(str1, tab, extra)
    local s1,e1=string.find(str1,"class%(.-%)")
    if s1~=nil then
        local unsp = removeSpaces(string.sub(str1,1,e1))                            
        table.insert(tab,extra.."    "..unsp)
    else
        local s2,e2=string.find(str1,"function .-%(.-%)")
        if s2~=nil then
            table.insert(tab,extra.."        "..string.sub(str1,s2,e2))
        end
    end
end

Tests run. Commit: adjust locals in processOneLine.

I don’t see a reasonable improvement to that function, because of the interaction of the finds returning a result, and the need to do the second check only if the first one didn’t happen.

Unless … unless we could just do the one and the other without checking to see if we had done anything.

That might leave us open to things like

Mysteriousfunction = class()

I suspect that will confuse us anyway. Should we write a test just to see? This test fails:

        _:test("mysteriousfunctionclass", function()
            local tab = {}
            local prog = "function mysteriousclass(xyz)"
            processOneLine(prog, tab, "")
            _:expect(tab[1]).is("        "..prog)
        end)
6: mysteriousfunctionclass  -- Actual:     function mysteriousclass(xyz), Expected:         function mysteriousclass(xyz)

It has four spaces in front of “function” rather than the expected eight. That’s because it was recognized as a class definition. I reckon the original program would do that as well.

Let’s beef up our pattern.

I tried this:

    local s1,e1=string.find(str1,"%w-=%w-class%(.-%)")

That fails to recognize the class names at all. I wonder why.

Replacing the %w- with single spaces works for our example. Of course it wouldn’t work for other numbers of spaces in the input. Do I not know what %w is? Right. Whitespace characters are %s. Try that.

All the tests pass, even my mysterious one:

        _:test("mysteriousfunctionclass", function()
            local tab = {}
            local prog = "function mysteriousclass(xyz)"
            processOneLine(prog, tab, "")
            _:expect(tab[1]).is("        "..prog)
        end)

Commit: class detection now checks for equal sign.

Enough for the morning. Let’s sum up.

Summary

For what it’s worth, we now have our class and function report tool running under test, and using no globals. We’ve gone from 96 lines including internal spaces, to about 79. We’ve gone from one function to nine. And in getting that information I found an amusing defect:

making

Look at that one line in there:

function -.%(.-%)

Where did that come from?

It comes from here:

        local s2,e2=string.find(str1,"function .-%(.-%)")

Sure enough, that pattern matches itself. Probably we shouldn’t be searching for .- there but for characters that can appear in a name. Is %w that? Does it include underbar? No, it’s upper and lower A-Z and digits 0-9, according to a random page on the Internet.

        _:test("w matches underbar", function()
            local line = "_"
            local s = line:find("%w")
            _:expect(s).is(nil)
            s = line:find("[%w_]")
            _:expect(s).is(1)
        end)

The test above confirms that theory. And shows us something that will match any legal lua name.

In for a penny, let’s beef up our patterns.

This is too strong:

        local s2,e2=string.find(str1,"function [%w_]-%(.-%)")

It fails to accept function Foo:bar(). Should we go all the way here? I think yes.

OK, I’ve got this:

        local s2,e2=string.find(str1,"function [%w_][%w_]-:?[%w_][%w_]-%(.-%)")

That matches … no, that will require more than one character in a function name. Not good. Need a test? Also, we’ve moved beyond the things I’ve checked in my small tests for patterns.

What I need is to be sure that test the same patterns that are used in the program, at least where I’m trying tricky matches.

Funny how I thought I was summing up, and suddenly I’m not as done as I might like to be.

No. Time’s up for today. Revert out that most recent change, leave the odd function discovery. More next time.

I kept the underbar test and removed use of the new pattern.

Returning to summing up …

The actual reporter file is smaller by a tiny margin. However, we also have 80 lines of test, and 50 lines of sample, and 50 lies of “Uncommented”, the Golden file for the comment removal code.

Getting this program under test control has essentially tripled the total lines of code!

Is this typical of what it takes to get a program arranged the way I like it? Probably not: usually I don’t have large golden master files. However, the tests being almost as large as the program? That’s not far off. Nor should it surprise us: if we want automated tests of all our code, we should expect the tests to be proportional to the code size.

Then why do we do this, if we’re going to double the size of things?

Curiously, there’s a simple answer: we do this because we have found that day in and day out, working this way lets us go faster.

Dave’s program is an example of a program one gets when not working this way. It’is a perfectly good program, works fine, is not very hard to understand. It wasn’t quite ready to be plugged into another application, because of its use of globals–but that wasn’t what Dave, or anyone, had in mind for it. Had that been in mind, it wouldn’t have been written quite that way.

Getting the program under control really only took us one tests, and the space required for a golden master. Th additional tests were written to provide a closer focus on details of the program, and only because we were consciously teasing it apart into smaller methods. On a real project, we would do that extraction over a longer period of time, and only when we actually had a need to change the program.

Nothing I’ve done in terms of adding capability to the program justifies the work I’ve put into it. I’ve added probably one very obscure case to what it handles.

What does justify it? For me, the learning from doing it, and the opportunity to build a useful analysis tool for the dungeon program, and the chance to write about it.

But it is fair to point out that I could have just used the program as is, and pointed it at D2 to get the information. I have more in mind for it, and even so, the effort of the past two days … and the next one or two … may well be more than the real problem deserves.

But I deserve to learn from this, and so do you. Maybe we learn the same things, maybe not. Let me know!

See you next time!


Making.zip