On the plane to Florida, I got an idea for how to clump adjacent pixels to represent a single hit. Apres moi, le rat hole. Here's a report ...

The Problem

Our code reads the pixels from the input file, and stores them in objects called Hit, each with an x,y pair representing where the pixel was found. But a given shot pellet, blasting through the paper, results in more than one pixel in the picture, almost every time. The average pixels per real hit, to the naked eye, appears to be about 5.

The Customer (Chet with his Customer Hat perched jauntily on his head) wants to analyze actual hits, not pixels. He may even come up with a scheme where if there are, say, ten pixels all together, that hole will be identified as coming from two hits … and so on.

Therefore, we need to write some code that translates our many pixels into some smaller number of hits, as identified by the program. As a first story, we want to identify all the contiguous pixels, no matter how many, as a single hit.

The Scheme

Our code reads and records single pixels as a list of Hit objects. We could post-process that list, somehow consolidating adjacent ones. In principle that wouldn’t be hard, but since our first picture has about 5,000 pixels in it, I’m worried that we’d wind up doing about twelve million compares, which seems like a lot. (Now if I were wise, and the comparison were easy to code, I would go ahead and try it, because for all I know it’ll take only a tenth of a second to do all those compares. But I’m not wise, and the algorithm isn’t as obvious to me as it might be.)

Another approach is to create the new clumped hits as we process the raw file. On the flight from Michigan to Florida Sunday, I came up with one idea on how to do that. Any given pixel has (up to) eight immediately adjacent pixels. As we read through the file, whenever we find a pixel, we could, right then, look at all the adjacent ones, and record them into one big hit. It seems that, to do this, we need to remove the pixels from the file at that time, so that we don’t keep processing the same ones over and over.

So my cunning scheme is that when I discover a pixel, I’ll create a new Hit object, and put that pixel in it. Then I’ll create a list of all the adjacent pixel locations, and check them. If I find more pixels, I’ll add them to my Hit. Naturally, this process needs to continue: for each new pixel we add to the Hit, we need to check its adjacent pixels, until finally we’ve processed all the pixels in the Hit and found no more. And we’ll be removing these pixels from the original list, so that we don’t keep doing them over.

Some Experiments

Now while I’m pretty sure that this is a sensible and straightforward algorithm, I’m not certain, so I decided that while I was in Florida last week, I’d do an experiment to see how it feels. And that was where things started to go bad.

What I had in mind was a list – the good old linked list kind of list, but any kind would do, except for one problem. I imagine my code to look like this (just sketching):

public HitList newHit(x,y) {
  HitList list = new HitList(x,y);
  for(Hit hit: HitList) {
    for (Hit candidate: adjacentHits(x,y)) {
      if (isPixel(candidate) {
        list.add(candidate);
        removeFromSource(candidate);
      }
    }
  }
}

When I tried to code up a little experimental loop that way, I ran into problems. It turns out that for all the List types I could find, Java thinks it’s bad form to add something to a list while you’re looping over it. Very narrow-minded, if you ask me. Since I didn’t have access to any Java intelligence (certainly counting any that I might be thought to have), I had to write a test to see if I could make this work, finding that I got a concurrency exception when I tried to add to a list I was looping on. Bummer.

The Blind Watchmaker

Now, there I was, working alone, beating my head against the silliness of a loop that can’t deal with the structure it’s looping over having something added to it. I got focused more and more on finding a structure that I could process the way I wanted to.

What I missed, and probably you see it, is that a more ordinary loop, like a C-style for loop, or a while loop with explicit indexing, would almost certainly have worked. (And in fact, as I figured out later, it WILL work.)

If I had had a pair, or an Internet connection, odds are that I would have gotten hooked back up with reality right away. Instead, I had a little fun.

Can't Stop Me, I Can Go Entirely Off the Rails

I wrote a Master’s thesis on lists, back about 1963 or ‘64, so I decided to just roll my own. It’s obviously safe to append to a linked list while you’re looping over it, although not to a LinkedList, whatever that is, in Java. So I sketched the following code. As you’ll see, at this point, the code doesn’t store hits, it’s just storing integers, but that’s just a detail of the implementation. The important part was making the looping work right.

Here’s my first (and at this writing only) test:

package com.hendricksonxp.patterning.test;

import static org.junit.Assert.*;
import org.junit.Test;

public class ClumpingTest {

    @Test
    public void hitlist() {
        HitList hitlist = new HitList();
        hitlist.append(1);
        hitlist.append(2);
        int count = 0;
        while (hitlist.hasNext()) {
            if (count == 1) {
                hitlist.append(3);
                hitlist.append(4);
            }
            count++;
            hitlist = hitlist.next();
        }
        assertEquals(4,count);
    }
}

See what happens? I start looping over a list of two items, and on the second (count == 1) element, I add two more items to the list … wow, right in the middle of looping over it, isn’t that just so very scary … and count them too. Two plus two is … four, so that’s the answer.

The implementation is nearly trivial:

package com.hendricksonxp.patterning.model;

public class HitList {
    private int value;
    private HitList next;

    public HitList() {
    }

    public HitList(int value) {
        this.value = value;
    }

    public void append(int value) {
        if (next == null) {
            next = new HitList(value);
        }
        else {
            next.append(value);
        }
    }

    public boolean hasNext() {
        return next != null;
    }

    public HitList next() {
        return next;
    }
}

So the good news is that it’s really a fairly simple and nice implementation. The bad news is that it’s completely off the rails, because even the world’s simplest implementation with a regular for loop would have worked just fine. Once I realized that, I threw the above code away, the only record remaining in this article.

So … 45 minutes on the plane, and hour or two in the hotel after work. Some fun coding, possibly somethat will be useful later, but more likely not.

My Point, and I May Have One

One point, of course, is just to keep you up to date on progress in building this application, so that you’ll know everything that happens, not just the happy bits. But there is another point as well.

I spent a couple of frustrating hours trying to guess what the concrete implementors of List are (until I discovered that Eclipse can help somewhat) and testing each one to see if it would allow this trivial notion of appending to the end of a list that is being processed. In this case, had I not been completely isolated, it was a complete waste of time.

Writing this code, on the other hand, took me just about as long as it would take you to type it in. It’s clear, obvious, and exactly how lists worked back early last century in 1964. The complete implementation is 30 lines of code counting whitespace, and it’s right there in front of us so that we can see what it is.

Now I don’t really want any more people to ask me whether I grow my own silicon in the back yard to make chips to build computers out of, rather than just buying one, but this experience is an example of why I don’t just jump for the nearest downloadable framework every time I have some little problem. By being willing to work close to the metal, as I do here, I can get tight, reliable code quickly, and get on with what really matters, in this case, clumping the pixels into Hits. It didn’t pay off much this time, but often it does.

I wouldn’t do it every time … in fact I tried pretty hard not to do it this time. But the tradeoffs to surfing up an object and learning how to use it do not always favor the reuse side of the equation. Even here, the list version is quite simple, and while the for-loop version will be just as simple, there may be something of value here.

In general, do we really learn more about our application by researching the Web and the bookstore, or by moving a little closer to the metal and just producing the objects we need? I’m not at all sure, but in this app, in a couple of cases, I feel that I’ve learned more by doing than I would by reading and trying to figure out someone else’s code.

Again, your mileage may vary, and I’m certainly not saying you have to do what I do. I don’t even have to do what I do! In this case, I’ve deleted this code and anticipate using a regular for-loop when I work on this next, either later today, or Monday with Chet. But if things get tricky … I know where I can get an object that will do what I want.

Just something to think about. Next we’ll move a little closer to actually clumping the hits …