Estimation and a tiny bit of code. ScopeMaster took up my challenge.

Colin Hammond, of ScopeMaster, took up my estimation challenge and, based just on what I had written, generated a report estimating the COSMIC Function Points for the story. Now if I were cynical, or trying to win a debate, I could tear the original analysis apart: it was based on a number of assumptions that just aren’t true. But when I read it, it seemed strangely similar to my understanding of the actual problem.

The original analysis, viewed as a sort of myth about something not well understood, nonetheless seemed to have identified some of the bones of the skeleton in the dungeon. If I put myself in the shoes of someone who didn’t know how this program works, but did know how programs in general work, I could see how the report was a decent cut at guessing what must be going on behind the curtain.

Frankly, I was surprised and rather fascinated. In addition, the ScopeMaster program generates some really interesting graphics automatically, including even a CRUD-based “gap analysis” identifying areas that might possibly be missed in the spec.

Colin and I will be having a Zoom meeting later this morning. I imagine that I’ll tell him a bit about how the program actually works, and I hope he’ll take another cut at calculating the CFPs for this story. It took him about half a day last time, so it’s asking for a lot, but it’s interesting, and surprisingly cool.

That’s not to say that I’ll be recommending ScopeMaster, or any kind of estimation any time soon. I’m not saying that estimation is bad, nor that ScopeMaster is bad. I think that estimation is a weak tool in a real Agile effort, and so I prefer to keep focus on steering actively, not predicting where we might go. Combine that with my current work, which is, I guess, something about exploring incremental iterative development and writing about it, and we can’t expect me to get into the estimation biz any time soon.

But it is interesting. I’ll write more about it after our chat, and probably a few more times if we go forward with another cut at CFPs for my Dungeon story.

For now, let’s look at where we are with my Spike, and see what we might want to do in the half hour or so before the Zoom.

Spike, Zipping

I created a little Codea project, SpikeTableZipper, and sketched out a couple of operations in it. I think I’ll work in that Spike a bit longer, and so I’d better get it under source control. Hold on a moment while I do that.

It’s rare that I put a Spike into source control, but even just yesterday I tried two different versions of a function, so we’ll probably be glad it’s in WorkingCopy, and it’s harmless if we don’t make use of the git stuff.

Let’s see what we have here:

        _:test("zip without replacement", function()
            local control = {"a", "b", "c", "d", "e", "f"}
            local without = { "y", "y", "z"} -- not used yet
            local kinds = makeTables("kind", control)
            local d = dump(kinds)
            local expected = [[kind: a$$kind: b$$kind: c$$kind: d$$kind: e$$kind: f$$]]
            _:expect(d, "wrong kinds dump").is(expected)
            local zipin = {"u", "v", "w", "x", "y", "z"}
            local zipped = zip("loot", kinds, zipin)
            _:expect(#zipped, "wrong number of elements").is(#zipin)
            local zipexp = [[kind: a$loot: u$$kind: b$loot: v$$kind: c$loot: w$$kind: d$loot: x$$kind: e$loot: y$$kind: f$loot: z$$]]
            local zipdump = dump(zipped)
            _:expect(zipdump, "wrong zipped dump").is(zipexp)
        end)

        _:test("one zip step", function()
            local kind = {kind="a", pain="intense"}
            local loot = "aLoot"
            local zipped = cloneWithNewKeyedValue(kind, "loot", loot)
            _:expect(zipped.kind).is("a")
            _:expect(zipped.loot).is(loot)
        end)
The first test does demonstrate the basic notion that I’ve got in mind, which is that we’ll have arrays of the various parameters to the creation of a dungeon object, and we’ll “zip” them together. So if you wanted an A, a B, and a C, and you wanted them to be associated with X, Y, and Z, you’d do something like:

zipped = zip({A,B,C}, {X,Y,Z})

And then zipped would be

{ {A,X}, {B,Y}, {C,Z} }
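In Lua, that pairwise zip might be sketched like this. The name zipPairs is mine, and the Spike’s real zip is keyed rather than pairwise, but the underlying idea is the same:

```lua
-- A pairwise zip sketch, just to pin down the idea.
-- zipPairs is my name; the Spike's actual function differs.
function zipPairs(a, b)
    assert(#a == #b, "arrays must have same length")
    local result = {}
    for i,v in ipairs(a) do
        result[i] = {v, b[i]}
    end
    return result
end

-- zipPairs({"A","B","C"}, {"X","Y","Z"})
-- returns { {"A","X"}, {"B","Y"}, {"C","Z"} }
```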

The idea is that we’ll just zip together the combinations we want and then feed the final resulting table into a loop that creates all the things.

I imagine that the zipping operations will all wind up in some kind of helper object, and I suspect we’ll need a number of helper methods to get arrays lined up, and to do random selections, and so on.

For example, suppose we want to have 15 objects which can be A or B or C, and we want seven As, three Bs, and five Cs. We might write a small call that takes the kinds and their counts, and get out:

{ A,A,A,A,A,A,A, B,B,B, C,C,C,C,C}
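A helper along these lines could produce that expansion. The name and signature here are my invention, not the Spike’s:

```lua
-- Hypothetical helper: expand each kind according to its count.
function replicate(kinds, counts)
    local result = {}
    for i,kind in ipairs(kinds) do
        for _ = 1, counts[i] do
            table.insert(result, kind)
        end
    end
    return result
end

-- replicate({"A","B","C"}, {7,3,5})
-- returns {"A","A","A","A","A","A","A","B","B","B","C","C","C","C","C"}
```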

Then we could zip that, say, with 15 rooms selected according to some rule, and we’d be ready to create the As, Bs, and Cs into their assigned rooms.

Essentially what I’m spiking is a series of operations that seem useful to me, given what I know about the problem, which amounts to taking various lists of keys and a few standard selection operations, and creating the arrays that we need to create the objects.

Frequent readers will recognize that this is more “bottom-up” than I often recommend. I am more inclined, as a rule1, to write a test with the objects I have, and to work out what needs to be done in the context of my real objects.

In this case, there isn’t much “real object” support. We’re just going to call our basic object creation methods, which accept all (or most) of the parameters that our Level Designers want to control. At this moment, I can “see” zipping arrays together. I don’t see any domain objects to help us … yet.

You pays your money and you takes your choice. This time, my choice is to work out the operations on arrays.

One more observation. We just have two tests so far, and they are quite different. The first one is creating whole arrays and checking them. The checking is quite painful, owing to the limitations of CodeaUnit, but no matter what framework I have, creating whole arrays and checking them is going to be tedious.

The second test just tests one step of the “zip” operation, cloneWithNewKeyedValue, which copies its first argument (perhaps unnecessary, but safer) and then injects a key-value pair provided to the function. That’s the central operation in zip, and testing it here gives me reasonable confidence that I could perhaps get away with less testing at the array level.
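From that description, cloneWithNewKeyedValue might look roughly like this. This is my reconstruction from the prose, not necessarily the Spike’s exact code:

```lua
-- Shallow-copy the table, then inject the new key/value pair,
-- leaving the original table untouched.
function cloneWithNewKeyedValue(tab, key, value)
    local result = {}
    for k,v in pairs(tab) do
        result[k] = v
    end
    result[key] = value
    return result
end
```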

I chose the phrase “get away with” intentionally. The zip function is pretty simple:

function zip(key, keyedTableArray, valueArray)
    assert(#keyedTableArray==#valueArray, "tables must have same length")
    local result = {}
    for i,keyedTable in ipairs(keyedTableArray) do
        local element = cloneWithNewKeyedValue(keyedTable, key, valueArray[i])
        table.insert(result, element)
    end
    return result
end

After checking that the inputs are of equal length, it just loops over the first one, and inserts the result of the clone... operation, returning the batch.

I might argue that we can “see” that this works and that we don’t really need to test it at all. I might also point out two or three possible mistakes that I am known to make in writing such functions, like failing to return the result.

So we probably can’t really “get away with” no tests, but testing the inner operations may give us enough confidence to keep the array-level tests simpler, maybe just checking a couple of elements or something.

OK, time to get ready for my Zoom call.

Post Zoom

Colin and I had a nice hour’s chat. We plan to connect again toward the end of next week and see whether a second assessment of my story makes sense. I hope we’ll do that, but there’s some danger that I’ll be done by then.

What I’ll try to do with the existing assessment, and would really like to do with a second better one, is to show why it works, to the extent that it does. And I think that in fact it does work.

In a very real sense, all software is alike. When we’re working at the programming language level, from assembler to C to Java to Smalltalk, programs are made up of chunks of data and chunks of procedure relating to that data. Modern programs, of course, use objects to bind data together with the procedures that manipulate it, and we connect the objects with messages that ask them to perform their functions for us. And the objects can only do a few things, really, looking things up, storing things away, fetching information, and so on. And the procedures, of course, all come down to assignments, conditionals, loops. Different syntax, but in an important sense, all the same.

When we think of something like Alexa or Siri, we may at one glance have no idea how it could work, but then we think a bit more and we start to figure out the basics of what must be going on. Siri has a language recognition capability. It’s surely immense, but we can guess that it winds up providing some kind of table of information, and has probably identified keywords like “define” or “play” or “weather”. We then imagine that somehow, based on the keywords, it dispatches off to some kind of expert that deals with the other words in the request. “Play WUOM” goes to the play handler and sooner or later it searches out WUOM and gets 91.7FM and connects to it. Or something like that. And if the play handler returns a string, the voice response function says it out loud.

We might not be able to write any of those modules … and yet in a very important sense we know almost exactly how it must work.
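That guessed architecture, keyword lookup plus dispatch to a handler, might be sketched like this. Every name and behavior here is invented for illustration:

```lua
-- Toy keyword dispatch: find a keyword, hand the rest of the
-- request to the matching expert. All invented for illustration.
local handlers = {
    play = function(rest) return "Playing " .. rest end,
    define = function(rest) return "Definition of " .. rest end,
}

function dispatch(request)
    local keyword, rest = request:match("^(%S+)%s+(.*)$")
    local handler = handlers[keyword]
    if handler then
        return handler(rest)
    end
    return "Sorry, I don't understand."
end
```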

COSMIC Function Points identify four kinds of “data movements”, Entry, Exit, Read, and Write. A bit of training and conversation, and we’re all likely to get the drift of those notions and it seems credible to me that people who understand software fairly well, or even requirements fairly well, would tend to count up roughly the same number of COSMIC Function Points.

If that were to happen, and I believe that it would (even old-style function point counters often, though not always, tended to get roughly the same numbers), then those numbers would surely have something important to do with the size of the thing analyzed.

Now, clearly, we could come up with a measure that was useless. Perhaps the number of words in the spec. (Although even that might be somewhat indicative of the problem’s size …) But suppose that it turned out that CFPs were well-correlated with things like the size of the components in lines of code (in some language) or that a given team tended to produce roughly the same number of CFPs per unit time.

If that were to happen … CFPs would be rather clearly measuring some essential dimensions of the thing.

Now, one’s initial reaction to this story may include the notion of airborne swine, but if we remember that we really do have a decent notion of how Siri and Alexa work, maybe we can suspend disbelief for a bit.

My Observation So Far

From the article a day or so ago, I got a sort of “echo” of what I knew about my program. I had to figure out from his words how they fit what I know, but as I did that, I could see that the ScopeMaster program was relating functional bits of the problem to each other in roughly the same way that they are really related in my program, and that the bits seemed to be sized roughly in accord with how big I consider them to be as stories.

That “echo” was quite interesting to me. I knew that Colin didn’t know how my program worked, and I knew that my description of what I was doing was really quite vague if you didn’t know details, and yet, somehow, the ScopeMaster program had things a little bit right.

So I hope we’ll do another pass, with Colin drawing a bit more detail from my “requirements”. Then I hope to look at the ScopeMaster analysis and try to show where it corresponds well with my actual project, and where it deviates. My gut feel is that it will correspond fairly well, and that the deviations will be due to mistaken assumptions about details of my situation that were unknown.

An analysis like this is obviously quite sensitive to what we include and what we exclude. If we’re using SQL, our data access CFPs will be very different from what they’d be if we assume that our program has to read individual records from a flat file, parse them into fields, fetch fields, match them, and so on and so on. We’d have many Entries, Exits, Reads and Writes to do what SQL does behind the scenes.

I expect that Colin’s analysis may get some boundary conditions wrong, and I expect that I’ll be able to look at the ScopeMaster output and draw a circle somewhere and say “the system just does this already, we can ignore it” or to put a mark somewhere and say “this bit is much harder, because we have to code up blah and mumble”.

The question in my mind is, and will be, “how good can this thing be”, not “how can I prove this thing wrong”. I’m curious and interested.

That’s not to say that I plan to start selling ScopeMaster. I do believe that Agile efforts work best with much less mechanism involved—individuals and interactions over processes and tools—and that tracking any kind of progress focuses us on prediction over steering, and that’s not ideal.

Further, I believe that pulling a metric like this out of the team leads to all the well-known abuses of estimation, which you can read about at your leisure. So in many situations, ScopeMaster might be putting razor blades into the hands of babies, and I am not in favor of that.

But it surprised me in a favorable way. I’ll report further when I know more.

  1. Maybe more of a guideline than an actual rule …