In which we set out to drain the swamp, encounter some alligators, and quite possibly add a little water to the swamp in the process.

Too Tightly Bound to the ArticleFiler

The Article, when it’s created, is passed a directory name and an article name. From these it creates an ArticleFiler for itself:

    public Article(string directoryName, string articleName) {
      this.pageName = articleName;
      this.articleFiler = new ArticleFiler(directoryName, articleName);
    }

While it’s appropriate for some objects to create the objects they work with, it’s often a code smell, and can lead to problems. I’m guessing that if we look around a bit, we’ll find that we’ve got the Article and the ArticleFiler too tightly hooked together. Let’s find out, and if so, see what we can do about it. I’ve been complaining about not trusting the tests, so let’s start there. We have a class called TestArticle, and it looks like this, in part:

  [TestFixture] public class TestArticle : Assertion {
    private string newLine = Environment.NewLine;
    private static Regex splitter = new Regex(Environment.NewLine);

    [Test] public void Hookup() {
      Assert(true);
    }

    [Test] public void Paragraph() {
      ArticleForTesting t = new ArticleForTesting("big long string");
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine, t.Html());
    }

    [Test] public void TwoParagraph() {
      ArticleForTesting t = new ArticleForTesting("big long string\r\nsomething");
      string[] lines = SplitLines(t.Html());
      AssertEquals("<B>No Page</B>", lines[0]);
      AssertEquals("<p>big long string</p>", lines[1]);
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine +
        "<p>something</p>" + newLine, t.Html());
    }

    [Test] public void BlankParagraph() {
      ArticleForTesting t =
        new ArticleForTesting("big long string\r\n\r\nsomething");
      string[] lines = SplitLines(t.Html());
      AssertEquals("<p>big long string</p>", lines[1]);
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine +
        "<p>something</p>" + newLine, t.Html());
    }

    [Test] public void HorizontalRule() {
      ArticleForTesting t =
        new ArticleForTesting("----");
      AssertEquals("<B>No Page</B>\r\n<hr/>" + newLine, t.Html());
    }
    ...

You may be wondering what ArticleForTesting is. Here it is:

  public class ArticleForTesting : Article{
    private string savedText;

    public ArticleForTesting(string text) 
      : base ("noDir", "NoPage") {
      savedText = text;
    }

    public override string Text {
      get { return savedText; }
    }
  }

We see that ArticleForTesting is a subclass of Article, and that its big claim to fame is that it overrides the property Text, which in Article, looks like this:

    public virtual string Text {
      get { 
        if (articleFiler.ArticleName == "HotNeedleOfInquiry")
          return TableOfContentsText();
        else
          return ArticleText();
      }
    }
    protected string ArticleText() {
      if (! articleFiler.Exists )
        return "Describe " + pageName + " here.";
      else
        return articleFiler.ArticleText();
    }

So what we see here is that the regular Article goes to the ArticleFiler to get the Text, and that the ArticleForTesting is given its text, which it just returns. This makes sense for testing, of course, but the implementation is a bit stinky. While overriding a method is a time-tested way of making a subclass behave differently from its superclass, it’s not quite right to have the simpler implementation (return this constant text) in the subclass.

We could, of course, invert the relationship between the two classes, or we could implement an interface and have two Article classes, a sort of “stub” or “mock” one for testing. But it seems that the real issue is that for testing purposes, we would like to use the real Article class. Since all we want to do is to get the text into him more easily, and since he gets his text from his ArticleFiler, it makes sense that we’d give him a different kind of ArticleFiler, containing literal text, for testing purposes.

But we can’t. He creates his own ArticleFiler, so we can’t get at it. We could give him a setter and use it only for testing. Or we could give him a setter and use it everywhere. But then, the Article wouldn’t be complete upon construction, but only after his Filer setter was called. Objects’ constructors should be complete. Therefore, it seems that the Filer should be passed in to the Article’s constructor rather than created internally. Let’s see how we might do that. There are a zillion creations of ArticleForTesting in the test we just looked at. We can see that we can reduce that to one by extracting the creation to a method in the test if we want to, so we’ll ignore that for now. What of the other creations of Article? Here they are:

In ArticleList class, which I think we haven’t looked at yet, we use instances of Article to get information about the files in the contents, and to get an article’s “description”. Here’s the code:

    public string Contents() {
      StringWriter s = new StringWriter();
      foreach (FileInfo f in list) {
        // TODO do we hate the FileInfo here?
        Article article = new Article(f.DirectoryName, f.Name);
        s.WriteLine("{0} ''{1}''", f.LastWriteTime.ToShortDateString(), f.Name);
        string precis = article.Precis();
        if (precis != "")
          s.WriteLine(precis);
      }
      return s.ToString();
    }

    private string Description(FileInfo f) {
      // TODO uses FileInfo??
      Article article = new Article(f.DirectoryName, f.Name);
      string description = article.Description();
      if (description != "") return description;
      return WikiNameTagger.SpacedWikiWord(f.Name);
    }

Description is used in the RSS writer and is supposed to consist of the Article’s HTML. That one actually requires the article. What about Precis? That’s the first text line of the file, if it starts with the word “precis:”, the signal that the first paragraph should be italicized and indented in the HTML. So Precis really just needs to talk to the Filer, not to the Article.

The only other use of Article is in the OnInit() method of WikiPage, namely in the initialization of each of the aspx pages, like Page.aspx, which displays the blog articles. The code is:

    override protected void OnInit(EventArgs e) {
      base.OnInit(e);
      article = new Article(RequestPath(), RequestFileName());
    }

This one, of course, is where the action is. This is the construction that really makes the Article work, hooked up to the operating file directory on the web site.

Trying a Change

I’ve recently been exposed to two ideas that make me want to remove the creation of the ArcticleFiler from inside the Article. One was a web-based slide lecture by Scott Bain, talking about Factory patterns. Scott points out that objects should be cohesive – they should only do one kind of thing – and that therefore an object should do just one of:

  1. Creating objects;
  2. Using objects;
  3. Implementing atomic leaf-level operations.

The Article does two of these things right now: it creates the ArticleFiler, and it uses it. So that’s a bit of a code smell.

The other idea came from J. B. Rainsberger. He suggested, based on little more than a description of the situation, that perhaps the ArticleFiler should be passed in to the Article for its operations that actually need it.

I’m not entirely swayed by either of these notions, but there’s enough weird stuff happening that I’m thinking we should try something different and see what happens. It won’t take more than a few minutes and it will probably tell us something interesting. Chet is available for pairing this morning, so we’ll see what we get.

The basic plan will be to create the ArticleFiler outside the Article, and pass it in. Chet’s not sure why we should do that, and I agree that it’s going to cause some duplication, but I’d like to carry it through, so we’ll go ahead. We want to get rid of the existing Article constructor that takes a directory name and article name, and replace it with one that takes an ArticleFiler:

    public Article(string directoryName, string articleName) {
      this.pageName = articleName;
      this.articleFiler = new ArticleFiler(directoryName, articleName);
    }

… will change to:

    public Article(ArticleFiler filer) {
      this.pageName = filer.ArticleName;
      this.articleFiler = filer;
    }

Doing this is all pretty straightforward. The most important creation of an Article is in the WikiPage class, the base class for the operational part of the blog. That construction turns out this way:

    override protected void OnInit(EventArgs e) {
      base.OnInit(e);
      ArticleFiler filer = new ArticleFiler(RequestPath(), RequestFileName());
      article = new Article(filer);
    }

There are uses of Article in the ArticleList class, a class that supports displaying the blog table of contents, where we construct the Article this way:

    public string Contents() {
      StringWriter s = new StringWriter();
      foreach (FileInfo f in list) {
        // TODO do we hate the FileInfo here?
        ArticleFiler filer = new ArticleFiler(f.DirectoryName, f.Name);
        Article article = new Article(filer);
        s.WriteLine("{0} ''{1}''", f.LastWriteTime.ToShortDateString(), f.Name);
        string precis = article.Precis();
        if (precis != "")
          s.WriteLine(precis);
      }
      return s.ToString();
    }

Notice that TODO. I’m concerned there that the use of the FileInfo object isn’t sufficiently abstract. The ArticleList is just a list of file names, which is tied perhaps more tightly to the current directory-based structure of the blog data. I’m sure this is one of the forces pushing me to change things, but it’s not germane just now. I mention it just to keep you up to date on what’s bugging me. This series of articles, after all, is about discovering improved design by following our sense of good code, and not so good code.

Also in ArticleList, we have a Description method, updated to the new Article constructor like this:

    private string Description(FileInfo f) {
      // TODO uses FileInfo??
      ArticleFiler filer = new ArticleFiler(f.DirectoryName, f.Name);
      Article article = new Article(filer);
      string description = article.Description();
      if (description != "") return description;
      return WikiNameTagger.SpacedWikiWord(f.Name);
    }

Here again, I’m a bit bugged by the FileInfo connection. It just feels like a list of articles should be a list of articles, not a list of file names.

Duplication Rears Its Ugly Head

Notice that by requiring the Article to be given an ArticleFiler in its constructor, we have created duplication in our code. We have just looked at three places where we create an ArticleFiler prior to creating an Article. In each of those we only had an Article constructor call prior to this change. But duplication is bad; what’s this about?

Chet and I talked about this a bit. I was suggesting that we had taken an obscure design problem, the combination of Creation and Usage in one object, and made it into a more visible problem, duplication. Therefore we have changed a hidden issue into a visible issue, which is a good thing to do.

Chet took the position that code that is obviously bad, and spread all around, is worse than code that is only perhaps bad, and isolated to one location. It’s a compelling argument.

Since it’s my practice in these articles to show what really happened, I’m almost unconcerned about whether this is a righteous change or not. In Discovering Better Code, I accept the possibility that a change I’ll make will not improve the code, and that I’ll take it back out. That’s OK. We are trying to learn, trying to build sensitivity to code smells, and trying to get a sense of which direction to go when something happens.

Right now, I’m not convinced that we are in a bad state, and I’m going to finish this report and press forward. One reason why I’m not concerned is this:

Factory Object

A standard response to complex constructor code, such as we’re beginning to create here, is the Factory Object. This is a separate class containing methods used to create other objects. In essence, rather than put the combined code creating the Article and ArticleFiler back into Article class, we would create a new class, ArticleFactory and have a method on it that returns a new article, given the path and article name.

Why is this better? Well, in sufficiently simple cases, it might not be. But the separation of Creation and Usage is of some value, and the more different kinds of Filers there are, the more this separation becomes important. Either way, we know how to get rid of this duplication if we want to: we can move it back inside Article, or we can create an Article Factory. Before we concern ourselves with that, let’s look at the third kind of creation of Articles that we have.

Article For Testing

I think I mentioned that in the TestArticle class, we wanted to provide the text for an article literally, as for example in this test:

    [Test] public void HorizontalRule() {
      ArticleForTesting t =
        TestingArticle("----");
      AssertEquals("<B>No Page</B>\r\n<hr/>" + newLine, t.Html());
    }

This is a test for the wiki notation for a horizontal rule, “—-“. When this appears in an Article, the output is supposed to have an HR tag. The TestingArticle helper method (new as of today) is used to create an instance of ArticleForTesting:

    private ArticleForTesting TestingArticle(string text) {
      ArticleFiler filer = new ArticleFiler("noDir", "NoPage");
      return new ArticleForTesting(filer, text);
    }

ArticleForTesting is a subclass of Article that, instead of using the ArticleFiler to read its text, just contains the text. To save you looking back, it looks like t his:

  public class ArticleForTesting : Article{
    private string savedText;

    public ArticleForTesting(ArticleFiler filer, string text) 
      : base (filer) {
      savedText = text;
    }

    public override string Text {
      get { return savedText; }
    }
  }

In the course of making this change to the Article constructor, we consolidated the creation of ArticleForTesting into the one method shown above. It still has some code duplication in it, creating a Filer, but it’s not exactly the same, since it creates an instance of ArticleForTesting, not Article. Still, as I mentioned above, it seems like it would be better not to create a specialized kind of Article, but instead a special kind of ArticleFiler. That’s where I plan to go next. But first, let’s talk about other ways this testing could be done.

Testing the Article

Most of the testing for the Article class is about checking the HTML generation. The tests all look much the same as the one above. They set up an Article’s contents, then check the result of the Html() method. We use a special subclass of Article that holds its own text. That seems like one good way. But there are others.

We don’t have many tests that exercise writing Articles to files; it’s possible that we don’t have any. So one possibility would be to enhance the TestArticle code to create the Articles as actual file-resident Articles, writing them out to a special directory, then reading them back in and checking their HTML. That would be a much more robust kind of test.

Another possibility is to test the HTML generation directly. An Article gets its HTML using an HtmlCalculator, this way:

    public string Html() {
      HtmlCalculator calculator = 
        new HtmlCalculator(pageName, Text, articleFiler.DirectoryName);
      return calculator.Html();
    }

The calculator needs to know the page name, so that it can create the header line, the text, so that it can format the article, and the folder name from the Filer, so that it can set up links to articles, represented as wiki names. Then it does all the work.

Now it turns out that when I started creating the HTML, it was done right inside the Article, because it was trivial. So I tested it inside the Article, using the ArticleForTesting class. Then as the HTML calculation got more complex, I moved it outside, into HtmlCalculator class. A case could be made that the current TestArticle method should instead talk directly to HtmlCalculator.

Which way should we go? If we continue to think of ourselves as testing Article, we need to consider whether a more round-trip approach is better, and we also need to deal with creating all these text Articles, either on the disc, or as in-memory ArticleForTesting instances. And I have this really cool idea about that latter way!

We could create an Interface for ArticleFiler. IFiler interface would include at least the Text method, and perhaps others. Then ArticleFiler would implement IFiler, and it would be used inside Article as an IFiler (perhaps throughout the class, perhaps just in certain methods). Then we could build another class implementing IFiler, namely a TextFiler, that contains literal text. We would use that in the TestArticle class, changing the TestingArticle method to look something like this:

    private ArticleForTesting TestingArticle(string text) {
      IFiler filer = new TextFiler("noDir", "NoPage", text);
      return new Article(filer);
    }

That would eliminate the ArticleForTesting class with its odd overriding of the Text method. It would somewhat validate the use of Article with its Filer parameter, since there would now be two kinds of Filer. It might lead to an ArticleFactory.

And it would be cool. We’d get to build an Interface, and two concrete classes implementing it. We’d get to use a neat object-oriented feature, polymorphism, putting two different classes into the Article and having it use them all unknowing. Very cool.

Well, darn it, I’ve been thinking, and even dreaming about that change for two or three days now. But the only justification is in the TestArticle class, and most of that is just testing the Html() method, and thereofre could be redirected to the HtmlCalculator object. I can’t justify that cool change if I can change the test to be simpler. But maybe there’s more to TestArticle than I realize. Well, there is one test that isn’t just an HTML test, this one:

    [Test] public void Precis() {
      ArticleForTesting withPrefix = TestingArticle("precis:Here it is\r\nThis is not");
      string precis = withPrefix.Precis();
      AssertEquals("precis:Here it is", precis);
    }

The Article has the notion of a precis, that italicized section that I put at the top of my articles here and in the blog. And pulling the precis out of the text is done in Article, not in HtmlCalculator. In Article, it looks like this:

    public string Precis() {
      string firstLine = FirstTextLine();
      if (firstLine.StartsWith( "precis:"))
        return firstLine;
      else
        return "";
    }

    private string FirstTextLine() {
      StringReader reader = new StringReader(Text);
      return reader.ReadLine();
    }

So I could justify my cool Interface solution because this one test talks to Article and not to HtmlCalculator. Or, I could implement the Precis method on HtmlCalculator, delegate from Article, and then test it in HtmlCalculator’s test. Now I can rationalize that the precis is part of the text, and not part of the job of the HtmlCalculator, and there is some truth to that. But is that sufficient to justify my cool Interface idea?

No, I’m afraid not. It’s not a righteous change. Instead, let’s change the TestArticle class to TestHtml, adjust it to create HtmlCalculators instead of Articles and go from there.

Reflection

It’s interesting what just happened. Through just a little consideration of the objects and what they were doing, I’ve managed to avoid building something that would be overly complicated. Let’s face it, that’s Design Up Front. Isn’t that evil in XP? No, not in my opinion: you have to think about what you’re going to do before doing it. But you don’t have to think very long, and you can think at the very last minute. I think of it as Design All the Time, not Design Up Front.

TestHtmlCalculator

I begin with an experiment to see how hard it is to change these guys to use HtmlCalculator instead of ArticleForTesting. It turns out to be easy:

    [Test] public void Paragraph() {
      ArticleForTesting t = TestingArticle("big long string");
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine, t.Html());
    }

    [Test] public void HtmlParagraph() {
      HtmlCalculator t = new HtmlCalculator("NoPage", "big long string", "noDir");
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine, t.Html());
    }

That’s the big change if we did it line by line. But all the tests use the same page name and directory. So we can change the whole file this way:

    private HtmlCalculator MakeCalculator(string text) {
      return new HtmlCalculator("NoPage", text, "noDir");
    }

    [Test] public void HtmlParagraph() {
      HtmlCalculator t = MakeCalculator("big long string");
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine, t.Html());
    }

Now it’s simple enough to change all the tests, and we wind up with this:

  [TestFixture] public class TestArticle : Assertion {
    private string newLine = Environment.NewLine;
    private static Regex splitter = new Regex(Environment.NewLine);

    [Test] public void Hookup() {
      Assert(true);
    }

    private HtmlCalculator MakeCalculator(string text) {
      return new HtmlCalculator("NoPage", text, "noDir");
    }

    [Test] public void HtmlParagraph() {
      HtmlCalculator t = MakeCalculator("big long string");
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine, t.Html());
    }

    [Test] public void Paragraph() {
      HtmlCalculator t = MakeCalculator("big long string");
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine, t.Html());
    }

    [Test] public void TwoParagraph() {
      HtmlCalculator t = MakeCalculator("big long string\r\nsomething");
      string[] lines = SplitLines(t.Html());
      AssertEquals("<B>No Page</B>", lines[0]);
      AssertEquals("<p>big long string</p>", lines[1]);
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine +
        "<p>something</p>" + newLine, t.Html());
    }

    [Test] public void BlankParagraph() {
      HtmlCalculator t =
        MakeCalculator("big long string\r\n\r\nsomething");
      string[] lines = SplitLines(t.Html());
      AssertEquals("<p>big long string</p>", lines[1]);
      AssertEquals("<B>No Page</B>\r\n<p>big long string</p>" + newLine +
        "<p>something</p>" + newLine, t.Html());
    }

    [Test] public void HorizontalRule() {
      HtmlCalculator t =
        MakeCalculator("----");
      AssertEquals("<B>No Page</B>\r\n<hr/>" + newLine, t.Html());
    }

    [Test] public void DualSubstitution() {
      HtmlCalculator t = MakeCalculator("Here are ''two'' different ''emphasized'' strings");
      string[] lines = SplitLines(t.Html());
      AssertEquals("<p>Here are <em>two</em> different <em>emphasized</em> strings</p>", lines[1]);
    }

//    [Test] public void Precis() {
//      HtmlCalculator withPrefix = MakeCalculator("precis:Here it is\r\nThis is not");
//      string precis = withPrefix.Precis();
//      AssertEquals("precis:Here it is", precis);
//    }

    [Test] public void EasyImage() {
      HtmlCalculator image
        = MakeCalculator("This costofchange.jpg is an image.");
      string[] lines = SplitLines(image.Html());
      AssertEquals
        ("<p>This <img src=\"images/costofchange.jpg\"/> is an image.</p>",
        lines[1]);
    }

    [Test] public void EasyImageGif() {
      HtmlCalculator image
        = MakeCalculator("This costofchange.gif is an image.");
      string[] lines = SplitLines(image.Html());
      AssertEquals
        ("<p>This <img src=\"images/costofchange.gif\"/> is an image.</p>",
        lines[1]);
    }

    [Test] public void EasyImageGifGarbled() {
      HtmlCalculator image
        = MakeCalculator("This costofchange.gifgarbled is an image.");
      string[] lines = SplitLines(image.Html());
      AssertEquals
        ("<p>This <img src=\"images/costofchange.gif\"/>garbled is an image.</p>",
        lines[1]);
    }

    [Test] public void EasyImageEndOfLine() {
      HtmlCalculator image
        = MakeCalculator("This costofchange.jpg\r\n");
      string[] lines = SplitLines(image.Html());
      AssertEquals
        ("<p>This <img src=\"images/costofchange.jpg\"/></p>",
        lines[1]);
    }

    public static string[] SplitLines(string html) {
      return splitter.Split(html);
    }
  }

Notice that I commented out the Precis test. These tests all run now. I’ll turn on the Precis test and make it run:

    public string Precis() {
      throw new NotImplementedException();
    }

Implemented this way (thanks to ReSharper) I get the requisite red bar. Now for a real implementation:

    public string Precis() {
      string firstLine = FirstTextLine();
      if (firstLine.StartsWith( "precis:"))
        return firstLine;
      else
        return "";
    }

    private string FirstTextLine() {
      StringReader reader = new StringReader(text);
      return reader.ReadLine();
    }

This is just copied over from the version in Article. The tests run.

However, I really would like to have Article use the HtmlCalculator now to return its precis for it. And I have no test to require that change. Options are: do it, because after all that’s what I’m doing, or write the test and make it work.

Now if I had started from Precis in Article, I could have driven the whole thing that way, using the Article test and then replacing it later with an HtmlCalculator test. Now I have to create a test, and I’d better do it before I forget.

    [Test] public void ArticlePrecis() {
      ArticleFiler filer = new ArticleFiler("noDir", "NoPage");
      ArticleForTesting t = new ArticleForTesting(filer, "precis:Here it is\r\nThis is not");
      string precis = t.Precis();
      AssertEquals("precis:Here it is", precis);
    }

This runs. Now I’ll break it in Article:

    public string Precis() {
     return "hello";
    }

And then fix it to forward:

    public string Precis() {
      HtmlCalculator calculator = 
        new HtmlCalculator(pageName, Text, articleFiler.DirectoryName);
      return calculator.Precis();
    }

That runs, and justifies my change. Interestingly enough, it also leaves me with a testing reference to ArticleForTesting. Does this mean I still get to do my Interface trick???

Maybe later. For now, I’m going to rename the TestArticle class to TestHtmlCalculator and then move this one article test into a new class named TestArticle. OK, that’s done. The new file:

using NUnit.Framework;

namespace Blog {
  [TestFixture] public class TestArticle : Assertion {

    [Test] public void Precis() {
      ArticleFiler filer = new ArticleFiler("noDir", "NoPage");
      ArticleForTesting t = new ArticleForTesting(filer, "precis:Here it is\r\nThis is not");
      string precis = t.Precis();
      AssertEquals("precis:Here it is", precis);
    }
  }
}

Where Are We???

Well now. We have improved the Article tests largely by making them into HtmlCalculator tests. We have the Article no longer creating its Filer, but there is still only one kind of Filer and the result of our change is that there are four patches of code that duplicate the creation of an ArticleFiler in construction of an Article or ArticleForTesting.

Overall the code is better, owing to the test change and to general smoothing as we went along. But the thing we set out to do, separate out the Creation and Usage of ArticleFiler has not improved the situation. Does this mean that Scott Bain and J. B Rainsberger were wrong? Well, yes. Well, no, just kidding. What it means is that their ideas don’t apply in this case, most likely because it is so simple.

Chet and I will work on this later today, I expect. Stay tuned …