Shooting in the Dark

While we wait for the estimates to come rolling in, and before I forget, here's a report on our work so far on the Shot Pattern Analysis program.

Estimates Not Pouring In

Probably in the next article, I’ll be talking about Chet and my estimates for this project. So far no one has written with theirs … I’ll be interested to see how they compare in size to ours.

Let’s look at what we’ve done so far, a few spikes.

Initial Work

Chet took a picture of a target he had shot at, and converted it to a BMP file for analysis. What we knew of BMP format was that it was pretty simple, so we figured it would be a good place to start. It looks like this (click the picture to see the whole bitmap):

If you zoom around in there for a while, you notice that there are few, if any, one-pixel dots. Most of the shot hits cover more than one pixel, and quite a few cover four or more. Some of the larger ones probably represent hits by two shot close together, and some are probably just places where the paper tore. As we move forward, we’ll do some human analysis on the holes to guess what they represent.

We each worked for an hour or so separately, looking up the definition of the BMP file (which has some header fields, followed by the bits), and looking at the file in hex, to see what it looked like. Quickly we sorted out that this file is just black and white (which we intended), which means that each pixel is represented by exactly one bit in the file. I remembered from ancient history that the first bit of the file is the lower left hand corner in the picture. I wonder who thought of that. So we each separately kind of probed through the hex, looking for breaks in the pattern of FF that represented white paper. Here’s a line from TextPad, showing some bits representing a hole:

5FF80: FF FF FF FF FF FF FE 7F  FF FF FF FF FF FF FF FF

FE 7F in binary is of course 1111 1110 0111 1111. So there are two black bits amid all the white at that location. And if you look at the corresponding place in the file (this is the last black in the file, therefore the first black in the picture, you find two dots. Whee! We were beginning to understand the file format.

Chet started writing a little Java code to open his BMP file and try to figure out the header. That was trickier than he thought, because (as all here except us know), the numbers in the file are stored little-endian, and he was thinking that the various two-byte values would be stored high/low, not low/high. So he wound up with some code that read the file into a byte array, but was a bit stumped on what to do next. Besides, it was time for dinner.

Meanwhile, in Adrian, Michigan, where I had figured out how to get to the Internet via Bluetooth to my cell phone, I was (slowly) downloading similar information and looking at the file. I decided right away that the header was boring and the bits were interesting, so I was thinking about how to access them, figuring out we could hard code the starting bit location and such, for the time being, and do the boring part of decoding the header later. Released from my duties in Adrian, I drove home, and headed for bed.

Wednesday's Child is Full of Woe

We met Wednesday at Brighton Borders, which is turning out to be our standard location, on the grounds that it is less bad than any of the alternatives, even though it doesn’t have free Internet. Chet had a test program that was reading in the file and looking at a few values, but not much else. We probed around in our file, but decided it was too big to write meaningful tests. So we decided to create a little tiny file, 7 by 15 or something, with just one pixel set. We looked at the file, and noticed right away that it wasn’t what we expected. We were figuring, in the bitmap part of the file, we’d see something like FF_FF_FF_7F_FF_FF, all F’s except for one bit missing somewhere. Instead we saw FE_00_00_00_FE_00_00_00. It turns out BMP files’ rows are padded to a multiple of four bytes, with zeros. Another little something to take into account.

We fiddled around, doing hex conversions badly in our heads or on paper, and figuring out that weird code that you have to use to calculate which byte bit 37 is in. We wrote something like three tests, which I’ll show you below.

The bottom line is that we spent a whole session, about two hours, learning how to calculate byte positions and bit ands and the like. It was quite a trick finding out the bit operations: the Java book we had with us was useless on the topic. Finally I went and got Head First Java and found what we needed on page 660. A few more pages and it wouldn’t have been there. (And there’s no Internet at Borders, though we could have gotten around that, and would have, had Head First not had it for us.)

But now we’re ready to go. We’ve done the math by hand, written a few tests that find bytes and bits, and we’re ready to start digging around in a BMP file in earnest. That’ll be for tomorrow.

Thursday, Early

I woke up early and decided to get a start on things. Should I be counting this time against my estimates? I wanted to play with bit packing and unpacking, and decided to use the int type instead of byte. I developed these tests:

package shotgun;

import static org.junit.Assert.*;
import org.junit.Before;
import org.junit.Test;

public class MappingTests {
    Bitmap bitmap;

    @Before public void setUp() throws Exception {
        int[] bits = new int[] {0,0,0,0,0,0,0x8002,0,0,0};
        // bits 0 and 14 in word 6, i.e. bits 6*16 and 6*16+14
        bitmap = new Bitmap(bits);
    }

    @Test public void zeroBitIsZero() throws Exception {
        assertEquals(0,bitmap.bit(0));
    }

    @Test public void oneBitInIntSeven() throws Exception {
        assertEquals(1, bitmap.bit(6*16));
    }

    @Test public void anotherBitInSeven() throws Exception {
        assertEquals(1, bitmap.bit(6*16+14));
    }

}

With this implementation of Bitmap:

package shotgun;

public class Bitmap {
    int[] bits;

    public Bitmap(int[] bits) {
        this.bits = bits;
    }

    public int bit(int bitNumber) {
        int target = bits[bitNumber/16];
        return selectBit(target,bitNumber%16);
    }

    private int selectBit(int target, int bit) {
        return ((target<<bit) & 0x8000) >> 15;
    }

}

I’m just counting bits from left (high order) as zero, to right, the one bit, as 15, and parsing the bit number into the target int, and the bits. I was taught that the sign bit, the high-order bit, was bit zero. If that’s not current thinking, well, let me know and we’ll fix it. I couldn’t find a reference. You could certainly make a case that bit n should have the int value 2^n, so that bit zero would be the low order bit, and bit 15 the high. Does it matter? Is there a standing convention? I frankly don’t know. This is close enough for now.

News Flash: Well, it’s not news to you, of course. The int type in Java is 32 bits, not 16. Some page I was reading made me think 16. Not that it matters much to my experiment, it just makes me feel ignorant. But then, ignorance is correctible, as I’ve just shown! I was tempted to pass over this whole topic and change the program to make it look like I have a clue, but that wouldn’t be fair.

I could move on to what I perceive as the next couple of issues, namely a fixed-size header at the beginning of the array, and the four-byte rounding in the row length in a BMP file. But in that case we’ll probably be using a two-part subscript, row and column, which we can probably implement on top of what we have here.

I was just trying to get warmed up … enough for now. See you in Brighton.

Thursday in Brighton

We begin with a review of what we’ve done, which is shown above plus a few other items Chet worked on. He has a test or two written and not working, as he left to come here before making them work. We have some test classes, and a model class Pattern that we were working on yesterday. Pattern, we’ve decided, should be a “domain” class that understands notions of being a shotgun pattern, while we’ll use Bitmap to manage the bits. We’re going to start from where we are, combining the projects into one, and refactor until it starts looking right.

I’ll not burden you with the code for these spikes, except perhaps for a bit of it. Much of our problems had to do with the necessity to read this little-endian binary file into Java. If there are really useful methods for reading binary stuff we couldn’t find them. Java is still about 20 years behind Smalltalk … and it’s no great shakes at binary either.

We tried a number of things … one was to process the file in ints instead of bytes as Chet had done it. It seemed like a good idea at the time … and had we done it in short instead of int, it might have been. We made it work, but since we had been thinking in bytes for a day, we kept getting the multiplication and division by two wrong, so the code was rife with silly mistakes. Our tests caught them all, but we couldn’t see what was wrong.

We even resorted to the debugger a couple of times, which immediately showed us important facts like “the header isn’t really 54 bytes long as documented, it’s 62”.

We made our spike work: basically we can read the file, strip off the header, find the bit at (x,y), all that stuff. So it was a successful spike, with a bit more than we’d have liked of that very common learning pain that spikes can have. But we learned a lot about binary files in Java.

We also learned that we have spent more than enough time doing bit addressing. We’ll work on this tomorrow, but here’s today’s thought, which we formulated at lunch. (Lu and Carl’s, Brighton)

We’re interested in a shotgun pattern, not a bitmap of a photograph. As you can see from the picture, the pattern is a few black spots in a big white ocean. (Or, depending how we take the photo, it might be a few white spots in a big black ocean.) Either way, it’s a few little spots in a vast expanse of nothingness. Kind of like western Nebraska.

Therefore, what we’re thinking, subject to clearer heads tomorrow, is that we should process the file in one pass, to a list of “hits”, which will be little objects knowing the x,y coordinates of a dot. Now most of the actual hits make a bigger hole than just one pixel, so what we’ll do then is consolidate the list, finding all the pixels that touch each other, making bigger hits. Then we’ll have a model of the actual pattern, instead of a bunch of bits in a two-D array.

We think that’ll be better, and we can almost certainly do it in one quick pass over the file, direct from bytes to objects.

Makes much more sense, doesn’t it? Move to a real domain object as quickly as possible. So we’ll probably do that, if it doesn’t snow and strand Chet at home. (He needs new tires, unless he got them today. I already have mine: I’m ready for most anything.)

I’ve reviewed the code, and there’s nothing interesting there to show you. Tomorrow we’ll have something for you codeheads … I promise!

Be sure to tune in!