We need a test. We discuss why we work that way, then we write the test and make it run. In so doing, we learn something very important: maybe we don't want ArticleList to be a typed collection after all!

That Nagging Test

When last we left our heroes, they were in the course of discovering better code for the ArticleList class. The basic notion was to make it more like a list of Article objects, rather than a list of FileInfo objects. We changed the tests to work create lists of Articles instead of FileInfo, then made the tests run again, mostly by extending the Article object with a few methods to return information.

However, in ticking through ReSharper’s “Find Usages” list of references to ArticleList, we found this one in the Article class:

    private ArticleList ContentsList() {
      DirectoryInfo info = new DirectoryInfo(articleFiler.DirectoryName);
      return new ArticleList(info.GetFiles("*."));
    }

What I found interesting about this is that there is no test that turned red after we changed ArticleList to expect to be a list of Articles. Here’s the code that “should” have failed:

    public ArticleList(ArrayList list) {
      this.list = list;
      this.list.Sort(new ArticleDateComparer());
    }

    public ArticleDateComparer() {
    }

    public int Compare(Object d1, Object d2) {
      return DateTime.Compare( 
        ((Article) d2).LastWriteTime, 
        ((Article) d1).LastWriteTime);
    }
  }

The ArticleList is sorted by decreasing LastWriteTime, by first casting the items to Article. (This code used to cast to FileInfo, as you probably recall from our last episode.) Since the cast exception is not thrown during the tests, that tells us that there is no test for the Article’s ContentsList method. That’s a bad thing by our standards, since the code could clearly possibly break – it is broken now – yet we do not have a test for it. Now let’s join our heroes, as they deal with this concern.

A Colleague's Suggestion

Our esteemed new colleague, Jonathan Pierce, watched the last episode and commented on the refactoring list:

Thanks for crediting me in your article. I took a quick look at it and noticed that even in your updated version, your constructor for your ArticleList class is still not strongly typed.

Jonathan also sent a proposed version of the ArticleList that was strongly typed. I replied that ArticleList is not strongly typed yet because we aren’t done, and that I’d look at his example only after we were done, so as not to contaminate my mind.

Now on another day, I might well have looked at the idea. But we’re in mid-operation here, and it seemed that Jonathan was going to provide a fairly complete canned solution. That wouldn’t be very incremental, nor very much like discovery, nor yet again very much like pair programming. So I still haven’t looked at the idea.

The discussion continued a bit. Jonathan commented:

I think it is a mistake for you to say that you don't need all of the help that the C# compiler provides, since a lot of your runtime errors that you have been writing tests for would never have even compiled if all of your collections were strongly typed. You should also be careful when you insert explicit casts to think about why you need it since you may be violating the strong typing that you intended in the first place. I think it is always better to let the compiler find as many errors as you can rather that waiting until runtime or running your tests to detect them since you wouldn't even need to write some of the tests if the code didn't even compile in the first place.

I replied:

Yes. As a person who prefers and is familiar with the use of dynamically typed languages, I use the untyped style in early implementations even in typed languages. I suppose that when the template stuff comes out in C# I might use it just because it's there, since my collections tend to be typed in practice but not declared. On the other hand, I encounter very few errors that are due to untyped collections. The one in the current article is unusual in that regard, and I found it by inspection, in the middle of a refactoring, not due to a program failure. I think that is a sign for a test, not a sign for a typed collection, although I do plan to get there, more for the fun of it than for any real purpose: the much simpler untyped implementation works just fine.

Well, it’s clear that Jonathan and I differ on the value of strongly typed languages. Oversimplifying, he feels that they find important errors without the need for writing tests, and I feel that they get in the way and that I would want the tests anyway. I used to share his view rather closely. Before I started working in Smalltalk, I had used only very strict languages like C++ and Pascal. Particularly in Pascal, I had the feeling that by the time I convinced the compiler to accept my program, there were very few defects left in it. Smalltalk has no compile-time type checking at all. I was afraid that without all that wonderful checking, my Smalltalk code would be full of type errors that the compiler would have caught.

That turned out not to be the case. What really happened was that I made very few type errors, and when I did, they usually appeared immediately. It’s true that once in a while I would make an error that Pascal would have caught, and probably some of those errors even slipped through my tests into the production code.

But that’s not the end of the story. Smalltalk is substantially more productive than languages like Pascal, C++, or C#, in my hands, and in the hands of almost everyone who uses both kinds of languages with facility. I think there are a couple of reasons why that’s so:

Development Environment

The Smalltalk environment is very powerful, even compared to very hot IDEs like Eclipse or IntelliJ, and certainly compared to Visual Studio. It’s quite different, as well, so that when you first come to it from something like Visual Studio you’re not likely to get it, but if you’ll watch experts in both kinds of IDEs, you’ll see what I mean.

Compact Language

On the average, Smalltalk takes about 1/4 the code to say the same thing as is needed in C#. This has some big advantages. To the extent that number of defects is proportional to program size, the smaller program has advantages. Now someone, perhaps Jonathan, pointed out in our discussion that a sufficiently cryptic language, like Perl, might have a very different defect density. True, but Smalltalk is not at all cryptic, although its syntax is different from what people may be useful. I would even place a small bet that the defect density might be lower in Smalltalk.

Real Polymorphism

In Smalltalk, if an object understands a message, you can send it that message without begging permission from the compiler. Our present dilemma is an example of that, although an unintended one.

The problem we’re looking at is that the ArticleList now expects Articles to come through it. Setting aside for a moment that we have other reasons for making that the case, remember that the way we made ArticleList accept articles to begin with was by implementing methods like LastWriteTime on Article, which already existed on FileInfo.

If C# would let those FileInfo objects run through ArticleList, the code would work as it stands! By insisting that we declare in the comparison and in the loops that the items are Articles, C# stops FileInfo objects from coming through, even though they would work just fine.

Now, C# offers a way to deal with this issue – sometimes. We could define an interface, perhaps IArticleListElement, saying what all the messages are that will be sent to the elements of ArticleLists. Then we could cause everything we want to send in to “inherit” that interface. Then the compiler would kindly permit us to do what would already work if it would just step out of the way.

To do this would require us to create a new entity, IArticleListElement, and to declare our objects as inheriting from it. Except, of course, that we cannot do that. FileInfo isn’t our object, and we cannot declare a new interface and then say that FileInfo implements it. (At least I don’t know a way to do that.) We could perhaps wrap a FileInfo in an object of our own or something, but that’s even more code to make the system do what already works. Gack. Anyway, this is all a digression. We’re in C# and we have to make it work the way C# requires us to, like it or not.

Should We Go Typed?

Jonathan’s point is correct: if the ArticleList was strongly typed, then the code we’re whining about would not compile, and we wouldn’t need the test we don’t have to know it. The compiler would have “saved” us. And, based on what he sent me, Jonathan would solve the current problem by making the ArticleList strongly typed. But we can’t do that.

We can’t do that – by which I mean choose not to do it for a really good reason – because we have discovered broken code which is not covered by a test. Our discipline requires us to do two things: First, we must write the missing test to demonstrate the defect and to teach us something about testing that we apparently don’t know. Second, we must fix the defect. Using the TDD discipline as I understand it, we can’t go replacing some entire class with a new one, in the absence of tests that show that it works, nor can we implement new code in the presence of a broken test. Another programmer, using other methods, might go another way. We will not.

So, to show the bug, we need a test. That test is easy to write:

    [Test] public void Contents() {
      Article article = Articles.GetArticle("HotNeedleOfInquiry","../TestPages/");
      string contents = article.Text;
      AssertEquals("", contents);
    }

We just create an article, pointing to our existing test pages folder, and ask for its text. But not just any article: one named Hot Needle Of Inquiry. Why? Because that’s the magic article that always contains the current contents. I think we’ve seen that code before, but here it is again:

    public virtual string Text {
      get { 
        if (articleFiler.ArticleName == "HotNeedleOfInquiry")
          return TableOfContentsText();
        else
          return ArticleText();
      }
    }

    private string TableOfContentsText() {
      return ContentsList().Contents();
    }

    private ArticleList ContentsList() {
      DirectoryInfo info = new DirectoryInfo(articleFiler.DirectoryName);
      return new ArticleList(info.GetFiles("*."));
    }

The Text property for this article will contain TableOfContentsText(), which is ContentsList().Contents(), which will execute the Contents() method, which is the one where we have noticed the defect. Sure enough, we execute this test and it fails on the cast exception just as we expect. This confirms that we have a defect and that we understand it at least this well. Notice that when we fix that issue, the test will fail again because the contents won’t be an empty string. That’s just a placeholder to save me from figuring out manually what the table will look like. We already have a test elsewhere that checks that. We’ll fill in that string assertion later on. Right now, we have a broken test to fix.

I mentioned in the previous article, er, episode, that we could fix it quickly by building another constructor in ArticleList that takes a collection of FileInfo objects. Jonathan objected to that solution on the grounds that ArticleList shouldn’t have to create Articles, just add them. He’s certainly correct that we wouldn’t like that as the end of the path, but it might be a good first step. Jonathan, I think, likes to design and build things to be perfect right away. What we’re doing here is vey different: we are discovering good code. It’s a different style of working, one that I think is valuable enough to practice and write about. Anyway, we have to decide what to do. We have one suggestion on the table and it’s pretty simple. Offhand, I can’t think of anything simpler, and it is therefore within my personal ruleset to just Do The Simplest Thing That Could Possibly Work. But I am allowed to think, and I don’t really like having ArticleList creating Article objects much more than Jonathan does, so let’s see, what else could we do?

We could change the Article to create the Articles for the ArticleList directly, creating each one in a loop over the DirectoryInfo. That would be a bit worse than doing it inside ArticleList for at least one reason: it is the kind of code that once you write it starts happening all over. But we could defer that concern, since writing it all over would create duplication and we have to remove duplication. So we could create the Article objects in Article.

Reviewing the tests, we see that in essence they do create the Article objects themselves:

    [Test ] public void FormatList() {
      ArticleList list = new ArticleList(SampleArticles());
      string contents = list.Contents();
      string expected = "9/7/2004 ''MyBusinessIsToCreate''\r\n\r\n9/2/2004 ''BlogStories''\r\n\r\n";
      AssertEquals(expected, contents);
    }

    private ArrayList SampleArticles() {
      ArrayList result = new ArrayList();
      result.Add(Articles.GetArticle("MyBusinessIsToCreate", "../TestPages"));
      result.Add(Articles.GetArticle("BlogStories", "../TestPages/"));
      return result;
    }

So if we do it that way, we’ll create the duplication needed to solve the concern about the grubbiness. But we should consider a wider issue.

I’m not sure whether Jonathan’s style is what I’m about to describe but certainly one way to work with an idea like ArticleList is to create it empty and then add the articles to it. I liked adding an ArrayList or Array to it, for these reasons:

First, I like to create objects complete. This pattern is called Complete Constructor Method, I believe. It says that it’s better if an object is complete on creation rather than having to have a bunch of messages sent to it until finally it’s ready to go.

Second, I’m thinking of ArticleList not as a simple list object that you can add things to and subtract them from. Instead, it’s the object that knows how to create article descriptions, RSS feeds, and tables of contents.

Hey! We have just learned something. ArticleList has a confusing name. It’s not just a list like ArrayList. It’s not really just a typed collection. It’s an object that knows a collection of articles and provides information about the collection as a totality. This is actually more important than any issue of whether it should contain Article objects or FileInfo objects, or whether it should support Add() and Remove(). This is about what the object really wants to be, what the program really wants to be.

Reflection

At this moment an up-front design person would be saying “Aha! I told you that up front design is better. You just discovered that your whole theory of the universe is wrong and you have to start all over, and I would have designed the program up front, Mooohahahahaha …” Maybe they wouldn’t do the Moohaha part. Yes, well, they’re much smarter than I am, I guess. On the other hand, my program has been running just fine for a long time now, and I’ll bet you a day’s pay that we don’t have to start over either. We’ve been going forward, adding a little functionality, making it better, and I’ll bet that we can just keep doing that with no problem. Watch and see. You know that if it doesn’t work out, you’ll see it right in these pages.

Right now, though, we’ll just open our minds to what the ArticleList really is. My guess is that it might be a TableOfContents. I’m also noticing a closer relationship to the ArticleFiler, which knows about the directory or folder containing the Articles. Let’s keep that in mind too.

Getting This Test to Run

That’s got to be on our mind right now. We can’t go enhancing the code when it’s broken. We have to make this work, then refine it. Our other usage of ArticleList is creating its own articles, so that’s a good candidate for what to do, and while putting it inside the ArticleList would encapsulate it, it’s the wrong abstraction. So we’ll do it in Article. Here again is the starting code:

    private ArticleList ContentsList() {
      DirectoryInfo info = new DirectoryInfo(articleFiler.DirectoryName);
      return new ArticleList(info.GetFiles("*."));
    }

This method needs to return an ArticleList on a list of Articles, not a list of FileInfo, which comes back from info.GetFiles. Let’s just express that idea and then code it:

    private ArticleList ContentsList() {
      DirectoryInfo info = new DirectoryInfo(articleFiler.DirectoryName);
      FileInfo[] articleFiles = info.GetFiles("*.");
      ArrayList articles = MakeArticles(articleFiles);
      return new ArticleList(articles);
    }

So we’ll take the FileInfo objects and “convert” them to an ArrayList of Article objects, and then create the ArticleList. That begs MakeArticles:

    
private ArrayList MakeArticles(FileInfo[] files) {
      ArrayList result = new ArrayList();
      foreach (FileInfo file in files) {
        Article article = Articles.GetArticle(file.Name, file.DirectoryName);
        result.Add(article);
      }
      return result;
    }

Well, that might work. Let’s run the test and find out. And it does: we get the error we expect next, that the contents isn’t an empty string. Let’s enhance the test and get on with it all. We’ll just check the beginning of the Contents(), since we have other tests for it:

    [Test] public void Contents() {
      Article article = Articles.GetArticle("HotNeedleOfInquiry","../TestPages/");
      string contents = article.Text;
      Assert(contents.StartsWith("9/7/2004 ''MyBusinessIsToCreate"));
    }

And that, of course, runs just fine. All tests are green.

Reflection

Up above, I wrote the MakeArticles method and ran the tests to find out whether it worked. Does that strike you as odd? Why didnt I figure out why I was uncertain, look things up in the help files, or whatever needed doing to give myself confidence that it would work?

Because it’s not a contest between me and the computer. It’s a partnership. The computer knows whether the code works, because it has all those tests and that neat compiler to tell me. So I ask the computer. And the computer says that things are good so far. So I just enhanced the comparison part of the test and moved on.

Now this may seem like a lot of work to deal with our little FileInfo / Article mismatch, but if you’ll look back at what really happened, we have written less than 20 lines of code. The rest of all this was me thinking out loud about what those 20 lines were, and how I do what I do, and why.

That, of course, is the point of all this stuff. I’m showing you some things that I do. If parts of it make sense to you, take them home with you.

Until Next Time ...

The tests are green. ArticleList is still not a strictly-typed list, but what we have discovered is that maybe it isn’t a list at all. That’s interesting. I’m kind of glad I didn’t spend a lot of time yet making it typed. That might not be the right thing at all! Anyway, it’s 6 AM here at the outpost, and I’ve been up since 3. I think I’ll take a nap. We’ll look at discovering more about this program later on.