YAGNI Prohibits Data-Driven Code -- or Does It?

The YAGNI principle says that we should not build any code that we do not currently need. But some parts of our systems benefit from being driven by data rather than written out as code. It would seem to follow that if we obey YAGNI, we could never build such code. But wait. What about the rules of simplicity?

It's a Bad Rap

There’s an XP saying, You Aren’t Gonna Need It (YAGNI). I fear that I may have coined that saying, in response to hearing someone say “we’re going to need it someday so we might as well build it now” for about the ten-thousandth time. Whenever we hear ourselves saying “we’re going to need it”, YAGNI asks us to consider that perhaps we are NOT going to need it. YAGNI permits us to think as much as we want about the future, but asks us to implement code only for what we need today.

But doesn’t that mean that we could never build a data-driven piece of software? We would build one object after another, procedurally, instead of stepping back to make the simple data-driven program that would make it all easy. Well, no, YAGNI doesn’t mean that, and here’s why:

When we code, we follow more rules than YAGNI. In particular, we follow Kent Beck’s rules for simple code. In priority order, the rules for simple code are these:

1. It runs all the tests correctly;
2. It contains no duplication;
3. It expresses all the ideas that we had about the program;
4. It minimizes the number of classes and methods.

Remember, these rules are in priority order. (Some people would argue that we should reverse #2 and #3 but in fact they rarely come into conflict.)

In many ways, rule number 2, which requires us to remove duplication from our code, is the most fascinating. I’ve found that as I focus on recognizing and removing duplication, I produce very fine code with less and less speculative design effort. This short paper gives an example. I won’t go into a great deal of detail here – the entire story is told in a couple of chapters of my forthcoming C# book. Here we’ll just see a sketch of what happens.

Menus in Evolution

In my XML Notepad application, a Windows editor for XML, I needed menus. An early implementation of menus looked like this:

      insertSection = new MenuItem (
        "Insert &Section",
        new EventHandler(MenuInsertSection));
      insertPre = new MenuItem (
        "Insert &Pre",
        new EventHandler(MenuInsertPre));

Each menu creation looked like the two above. Supporting code included the methods, like MenuInsertSection, and further data and code:

    void MenuInsertSection(object obj, EventArgs ea) {
      CallModel(insertSectionAction);
    }

    private void CallModel(ModelAction modelAction) {
      GetText();
      modelAction();
      PutText(textbox, model.LinesArray(), model.SelectionStart);
    }

    private void InitializeDelegates(TextModel model) {
      enterAction = new ModelAction(model.Enter);
      shiftEnterAction = new ModelAction(model.InsertReturn);
      insertSectionAction = new ModelAction(model.InsertSectionTags);
      insertPreTagAction = new ModelAction(model.InsertPreTag);
      saveAction = new ModelAction(this.SaveFile);
      loadAction = new ModelAction(this.LoadFile);
    }
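The declaration of ModelAction is not shown in these listings. As a minimal sketch, assuming it is a parameterless delegate returning void (which is what the calls above suggest), the pieces fit together roughly like this. RecordingModel and DelegateSketchForm are hypothetical stand-ins for the real TextModel and form, and the "GetText" and "PutText" entries merely mark where the real calls would go:

```csharp
using System;
using System.Collections.Generic;

// Assumed declaration; the article does not show it.
public delegate void ModelAction();

// Hypothetical stand-in for TextModel that only records what was called.
public class RecordingModel {
    public readonly List<string> Calls = new List<string>();
    public void InsertSectionTags() { Calls.Add("InsertSectionTags"); }
    public void InsertPreTag() { Calls.Add("InsertPreTag"); }
}

// Hypothetical stand-in for the form that owns the menus.
public class DelegateSketchForm {
    private readonly RecordingModel model;
    private readonly ModelAction insertSectionAction;
    private readonly ModelAction insertPreTagAction;

    public DelegateSketchForm(RecordingModel model) {
        this.model = model;
        // Each delegate wraps one method on the model.
        insertSectionAction = new ModelAction(model.InsertSectionTags);
        insertPreTagAction = new ModelAction(model.InsertPreTag);
    }

    // Same shape as CallModel above: fixed steps around a varying action.
    private void CallModel(ModelAction modelAction) {
        model.Calls.Add("GetText");
        modelAction();
        model.Calls.Add("PutText");
    }

    // One handler per menu item, each differing only in the action it passes.
    public void MenuInsertSection(object obj, EventArgs ea) {
        CallModel(insertSectionAction);
    }

    public void MenuInsertPre(object obj, EventArgs ea) {
        CallModel(insertPreTagAction);
    }
}
```

Even in this reduced form, the two handlers differ only in the name of the action they pass along.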

The ModelAction delegate just calls a method like InsertSectionTags, which in turn calls InsertTags, using some static data items:

    private static string[] newParagraph = { "<P></P>" };
    private static string paragraphSkip = "<P>";
    private static string[] newSection = {"<sect1><title></title>","</sect1>" };
    private static string sectionSkip = "<sect1><title>";
    private static string[] newPre = { "<pre></pre>" };
    private static string preSkip = "<pre>";
    private static string[] emptyLine = { "" };
    private static string emptyLineSkip = "";
    private static object[] noArgs = {};
    private ArrayList lines;
    private int selectionStart;

    public void InsertSectionTags() {
      InsertTags(newSection, sectionSkip);
    }

    private void InsertTags(string[] tagsToInsert, string tagsPrecedingCursor) {
      int cursorLine = LineContainingCursor();
      lines.InsertRange(cursorLine+1, tagsToInsert);
      selectionStart = NewSelectionStart(cursorLine + 1, tagsPrecedingCursor);
    }

The process of adding a new menu item that inserts tags is pretty simple: we add the insert strings, then the Insert handler, then the Insert method, which calls InsertTags. At first glance, there is no duplication to be concerned about; certainly there are no two lines of code that are the same.

However, the entire process is repetitive: we do those half-dozen things every time. And while no two lines are duplicated, there is huge duplication of partial lines. Look at those private static strings: they’re all just the same except for the strings and the names. Look at the action methods: they all say InsertTags, except with different arguments. We need to get rid of that duplication.

The first thing I did was to get rid of the variable declarations and some of the other duplication, by combining everything into the menu setup, like this:

      insertSection = new NotepadMenuItem (
        "Insert &Section",
        new EventHandler(MenuInsertTags),
        new string[] {"<sect1><title></title>","</sect1>" },
        new string[] { "<sect1><title>" } );

      insertPre = new NotepadMenuItem (
        "Insert &Pre",
        new EventHandler(MenuInsertTags),
        new string[] { "<pre></pre>" },
        new string[] { "<pre>" } );

Here, I’ve just moved the data from the strings above into a new object, NotepadMenuItem, and we use the same event handler for all the insert items. There are some formatting changes along the way that I’m skipping over but you can surely see the similarity. For the overly curious, here’s a bit of the NotepadMenuItem class:

  class NotepadMenuItem : MenuItem {
    private string[] tagsToInsert;
    private string[] tagsToSkip;

    public NotepadMenuItem
      (String menuString, EventHandler handler, string[] inserts, string[] skips)
      :base(menuString, handler) {
        tagsToInsert = inserts;
        tagsToSkip = skips;
    }

    public string[] Inserts {
      get { return tagsToInsert; }
    }

    public string[] Skips {
      get { return tagsToSkip; }
    }
  }

This class just encapsulates all those strings. We no longer declare each one as a named static variable; we write them as literals in the “new” statements up above, and that removes the duplication of the declarations.
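The shared MenuInsertTags handler is not shown here. A minimal sketch of how one handler could serve every insert item follows, assuming the sender passed to the handler is the menu item that was clicked and that InsertTags is changed to accept the skip strings as an array. SketchMenuItem, SketchTextModel, and HandlerSketch are hypothetical stand-ins, written without the MenuItem base class so that the sketch compiles on its own:

```csharp
using System;

// Hypothetical stand-in for NotepadMenuItem; the real class derives from MenuItem.
public class SketchMenuItem {
    private readonly string[] tagsToInsert;
    private readonly string[] tagsToSkip;

    public SketchMenuItem(string[] inserts, string[] skips) {
        tagsToInsert = inserts;
        tagsToSkip = skips;
    }

    public string[] Inserts { get { return tagsToInsert; } }
    public string[] Skips { get { return tagsToSkip; } }
}

// Hypothetical stand-in for TextModel that remembers its last request.
public class SketchTextModel {
    public string[] LastInserts = new string[0];
    public string[] LastSkips = new string[0];

    // Assumed signature: the skips arrive as an array, like the inserts.
    public void InsertTags(string[] tagsToInsert, string[] tagsToSkip) {
        LastInserts = tagsToInsert;
        LastSkips = tagsToSkip;
    }
}

public class HandlerSketch {
    public readonly SketchTextModel Model = new SketchTextModel();

    // One handler serves every insert item, because the clicked item
    // carries the data that used to live in separate static fields.
    public void MenuInsertTags(object obj, EventArgs ea) {
        SketchMenuItem item = (SketchMenuItem) obj;
        Model.InsertTags(item.Inserts, item.Skips);
    }
}
```

With the data riding on the menu item, the per-item handlers and per-item model methods are no longer needed.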

But there is still duplication! Look at all the common elements in the insertSection and insertPre definitions above. Again, no line is duplicated, but there are common substrings. That’s duplication! We have to do something about it!

We don’t like the need to send in two strings, and we would prefer that the strings be defined by the TextModel (which is the class that does the actual insert), not in the Form code. So we posit a new way of writing the code:

      insertPre = new NotepadMenuItem (
        "Insert &Pre",
        new EventHandler(MenuInsertTags),
        TextModel.Tags.Pre);

Tags is just an enumeration that gives a unique value to every kind of insert, and inside the constructor for the NotepadMenuItem we look that up to find the strings we need. So we have eliminated a little more duplication, and we have moved some responsibility to the class that should have it. But we still have duplicated code:

      insertSection = new NotepadMenuItem (
        "Insert &Section",
        new EventHandler(MenuInsertTags),
        TextModel.Tags.Section);

      insertPre = new NotepadMenuItem (
        "Insert &Pre",
        new EventHandler(MenuInsertTags),
        TextModel.Tags.Pre );

      insertUnorderedList = new NotepadMenuItem (
        "Insert &UL",
        new EventHandler(MenuInsertTags),
        TextModel.Tags.UnorderedList );

There’s still too much duplication in the above. Again, the lines are all different but they contain too many characters that are the same. If we have the same patch of code repeated with just a few changes, how do we reduce the duplication? A loop! We should be looping over a data structure that reflects the variability in the above statements.

We build the table:

    private static InsertAction[] insertActions = new InsertAction[] {
      new InsertAction("Insert &Pre", 
        new string[] { "<pre></pre>" }, 
        new string[] { "<pre>" }),
      new InsertAction("Insert &Section", 
        new string[] {"<sect1><title></title>","</sect1>" }, 
        new string[] {"<sect1><title>" }),
      new InsertAction("Insert &UL", 
        new string[] {"<UL>","<LI></LI>","</UL>"}, 
        new string[] {"<UL>", "<LI>" }),
      new InsertAction("Insert &OL", 
        new string[] {"<OL>","<LI></LI>","</OL>"}, 
        new string[] {"<OL>", "<LI>" })
      };

And we loop over it:

      foreach (InsertAction action in model.InsertActions) {
        insertMenus.Add(new NotepadMenuItem ( 
          action.MenuString, 
          new EventHandler(MenuInsertTags),
          action));
      }

Guess what! The menu creation is now data driven. To add a new menu item, all we do is add another InsertAction to the array.
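The InsertAction class itself is not shown here. A minimal sketch of what it might look like follows, assuming it simply holds the menu string plus the two string arrays. Only the MenuString property appears in the loop above; the Inserts and Skips property names and the MenuTableSketch class are assumptions made for illustration, and the loop collects captions rather than building real menu items:

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of InsertAction; only MenuString appears in the article.
public class InsertAction {
    private readonly string menuString;
    private readonly string[] tagsToInsert;
    private readonly string[] tagsToSkip;

    public InsertAction(string menuString, string[] inserts, string[] skips) {
        this.menuString = menuString;
        tagsToInsert = inserts;
        tagsToSkip = skips;
    }

    public string MenuString { get { return menuString; } }
    public string[] Inserts { get { return tagsToInsert; } }
    public string[] Skips { get { return tagsToSkip; } }
}

// Hypothetical holder for the table and the loop over it.
public static class MenuTableSketch {
    public static readonly InsertAction[] InsertActions = new InsertAction[] {
        new InsertAction("Insert &Pre",
            new string[] { "<pre></pre>" },
            new string[] { "<pre>" }),
        new InsertAction("Insert &UL",
            new string[] { "<UL>", "<LI></LI>", "</UL>" },
            new string[] { "<UL>", "<LI>" })
    };

    // Stands in for the menu-building loop: one pass over the table,
    // collecting the captions that would become menu items.
    public static List<string> BuildMenuCaptions() {
        List<string> captions = new List<string>();
        foreach (InsertAction action in InsertActions) {
            captions.Add(action.MenuString);
        }
        return captions;
    }
}
```

Adding a row to InsertActions is the only change needed to get another caption out of the loop.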

We could remove more duplication if we cared to. Since all the inputs to the InsertAction creation are strings, we could just create some kind of simple array of strings, and write a method that parses them to make up the inserts. I was satisfied with this much data-driven character, but you might want to push further in your own code.
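For readers who do want to push further, here is one way the string-parsing idea might go. This is only a sketch under stated assumptions: the row format (fields separated by '|', lines within a field separated by ','), the ParsedInsertAction type, and the InsertTableParser class are all invented for illustration, and the format would break down for tags that themselves contain those separator characters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result type with the same three pieces an InsertAction holds.
public class ParsedInsertAction {
    public string MenuString { get; private set; }
    public string[] Inserts { get; private set; }
    public string[] Skips { get; private set; }

    public ParsedInsertAction(string menuString, string[] inserts, string[] skips) {
        MenuString = menuString;
        Inserts = inserts;
        Skips = skips;
    }
}

public static class InsertTableParser {
    // Assumed row format: "caption|line,line|skip,skip".
    public static ParsedInsertAction Parse(string row) {
        string[] fields = row.Split('|');
        if (fields.Length != 3) {
            throw new FormatException("Expected three '|'-separated fields: " + row);
        }
        return new ParsedInsertAction(
            fields[0],
            fields[1].Split(','),
            fields[2].Split(','));
    }

    // Turns the whole table of rows into actions, preserving order.
    public static List<ParsedInsertAction> ParseAll(string[] rows) {
        List<ParsedInsertAction> actions = new List<ParsedInsertAction>();
        foreach (string row in rows) {
            actions.Add(Parse(row));
        }
        return actions;
    }
}
```

The table then shrinks to one string per menu item, for example "Insert &UL|<UL>,<LI></LI>,</UL>|<UL>,<LI>", at the cost of a small parser and a format that has to be documented.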

Table Driven Code and YAGNI are Not Contradictory

The example above shows what will always happen. Wherever table-driven code is a possibility, we know that the code will wind up as a loop over a table. If we imagine that loop unrolled, we get code like one of the earlier versions above: statements of the same form repeated over and over, even though few entire lines are duplicated.

We learn to recognize this kind of duplication as soon as it appears, and we learn to extract it into data. The pattern of code to look for is pretty simple:

Look for code with the same kinds of statements, often in the same order, but where the details of the statements are different.

This kind of code is asking to be simplified by using a data-driven approach.

It turns out that YAGNI doesn’t prohibit data-driven code at all. YAGNI plus the rules of simplicity will in fact require data-driven code, if only we pay attention.