Kotlin 75 - NFSCD

NFSCD¹: Yesterday the FGNO² / Zoom Ensemble played with the Gilded Rose. This morning we’ll look at it here.

The Gilded Rose is a very enjoyable “refactoring” “kata”, provided by the marvelous Emily Bache, with many other contributors, from a kata originally created, if I have my history right, by Terry Hughes. I’ve written about it before, using Ruby, in a dozen or so articles.

You can check out the README and read the “requirements”.

Yesterday the FGNO / Zoom Ensemble played with the Gilded Rose. This morning we’ll look at it here. I did some work yesterday, and got some ideas from the group, so this isn’t a first look by any means.

Let’s get started. Our mission according to the requirements is to implement a new feature, Conjuring items. Our larger mission, I’d say, is to get this program under our control, and perhaps even to refactor it into good shape.

At this writing, I don’t know how far we’ll go with this. As always, my purpose is to mine these exercises for gold, or silver, or even lead. We’ll stick with this one so long as I find it interesting and don’t see anything any more fascinating.

most fascinating object in the universe, the shiny chrome squirrel

The most fascinating object in the universe, the shiny chrome squirrel.

OK, I resisted it that time. Let’s get to it. I’ve cloned the project and it loaded nicely into IDEA. Here’s the Item class, which we are admonished to leave alone:

open class Item(var name: String, var sellIn: Int, var quality: Int) {
    override fun toString(): String {
        return this.name + ", " + this.sellIn + ", " + this.quality
    }
}

I’m not sure why it’s marked open, whether that’s a trap or what. Anyway it has no useful behavior.

All the work is done in the simple class below. Once we understand this, we’re golden, or at least gilded:

class GildedRose(var items: Array<Item>) {

    fun updateQuality() {
        for (i in items.indices) {
            if (items[i].name != "Aged Brie" && items[i].name != "Backstage passes to a TAFKAL80ETC concert") {
                if (items[i].quality > 0) {
                    if (items[i].name != "Sulfuras, Hand of Ragnaros") {
                        items[i].quality = items[i].quality - 1
                    }
                }
            } else {
                if (items[i].quality < 50) {
                    items[i].quality = items[i].quality + 1

                    if (items[i].name == "Backstage passes to a TAFKAL80ETC concert") {
                        if (items[i].sellIn < 11) {
                            if (items[i].quality < 50) {
                                items[i].quality = items[i].quality + 1
                            }
                        }

                        if (items[i].sellIn < 6) {
                            if (items[i].quality < 50) {
                                items[i].quality = items[i].quality + 1
                            }
                        }
                    }
                }
            }

            if (items[i].name != "Sulfuras, Hand of Ragnaros") {
                items[i].sellIn = items[i].sellIn - 1
            }

            if (items[i].sellIn < 0) {
                if (items[i].name != "Aged Brie") {
                    if (items[i].name != "Backstage passes to a TAFKAL80ETC concert") {
                        if (items[i].quality > 0) {
                            if (items[i].name != "Sulfuras, Hand of Ragnaros") {
                                items[i].quality = items[i].quality - 1
                            }
                        }
                    } else {
                        items[i].quality = items[i].quality - items[i].quality
                    }
                } else {
                    if (items[i].quality < 50) {
                        items[i].quality = items[i].quality + 1
                    }
                }
            }
        }
    }

}

Space and time, and for that matter space-time, do not permit listing all the things not to like about this class and its one method.

Are there any tests, you may ask. Why, yes, there certainly are. Here they are:

internal class GildedRoseTest {

    @Test
    fun foo() {
        val items = arrayOf<Item>(Item("foo", 0, 0))
        val app = GildedRose(items)
        app.updateQuality()
        assertEquals("fixme", app.items[0].name)
    }
}

So that’ll be a load off our mind, knowing that we have comprehensive tests available.

There is also a main program that is interesting, in a file called TexttestFixture:

fun main(args: Array<String>) {

    println("OMGHAI!")

    val items = arrayOf(Item("+5 Dexterity Vest", 10, 20), //
            Item("Aged Brie", 2, 0), //
            Item("Elixir of the Mongoose", 5, 7), //
            Item("Sulfuras, Hand of Ragnaros", 0, 80), //
            Item("Sulfuras, Hand of Ragnaros", -1, 80),
            Item("Backstage passes to a TAFKAL80ETC concert", 15, 20),
            Item("Backstage passes to a TAFKAL80ETC concert", 10, 49),
            Item("Backstage passes to a TAFKAL80ETC concert", 5, 49),
            // this conjured item does not work properly yet
            Item("Conjured Mana Cake", 3, 6))

    val app = GildedRose(items)

    var days = 2
    if (args.size > 0) {
        days = Integer.parseInt(args[0]) + 1
    }

    for (i in 0..days - 1) {
        println("-------- day $i --------")
        println("name, sellIn, quality")
        for (item in items) {
            println(item)
        }
        println()
        app.updateQuality()
    }
}

Insider knowledge tells me that the purpose of this main program is to print out the result of running a few days of the update, so as to look at the text output with our own very eyes and inspect it for correctness or the lack thereof.

This is advanced stuff, kids, so if your mind is boggling at the depth and breadth of the tests, and the elegance of the code, well, I can just say sit tight, everything will be OK.

I’ll run the program main, just to see what it does. The console(!) contains, in addition to a lot of really fun file names, this:

OMGHAI!
-------- day 0 --------
name, sellIn, quality
+5 Dexterity Vest, 10, 20
Aged Brie, 2, 0
Elixir of the Mongoose, 5, 7
Sulfuras, Hand of Ragnaros, 0, 80
Sulfuras, Hand of Ragnaros, -1, 80
Backstage passes to a TAFKAL80ETC concert, 15, 20
Backstage passes to a TAFKAL80ETC concert, 10, 49
Backstage passes to a TAFKAL80ETC concert, 5, 49
Conjured Mana Cake, 3, 6

-------- day 1 --------
name, sellIn, quality
+5 Dexterity Vest, 9, 19
Aged Brie, 1, 1
Elixir of the Mongoose, 4, 6
Sulfuras, Hand of Ragnaros, 0, 80
Sulfuras, Hand of Ragnaros, -1, 80
Backstage passes to a TAFKAL80ETC concert, 14, 21
Backstage passes to a TAFKAL80ETC concert, 9, 50
Backstage passes to a TAFKAL80ETC concert, 4, 50
Conjured Mana Cake, 2, 5

Can We Get to Work Yet?

I think that’s nearly all the warming up we need to do. We’re just about ready to get to work. We just have to decide how to proceed.

Characterization Test: One idea is what some call a “characterization test”, which amounts to nothing more than saving the program’s output, as seen above, and then after making changes, run the program and compare the output. If it has not changed, then we can assume that we haven’t broken the program.; This is a big assumption. What we know is that we didn’t change the output of the program for that particular run. If we were to run it longer, or with different input, who knows what it might do. But it’s better than nothing, and we can of course run it on as much data, and for as many days, as gives us comfort.; There is a minor issue in this, which is that we may have to devise something clever to check the output, since comparing two pages to two pages and saying “something wrong” isn’t very helpful.; There is also the notion of “Approval Tests”, much the same idea, wrapped in some tools and ritual, which I believe comes from LLewellyn Falco. See approvaltests.com.
POUT: Plain Old Unit Tests: We can, probably should, and likely will create our usual small tests to check our work. One of my colleagues created unit tests from the spec, for every case he could think of. That may well be a very good idea. If I decide to do it, I’ll do it offline, except for a few for demonstration purposes, to spare you the inevitable tedium.
Refactoring Tools: Intellij IDEA has many powerful and useful refactoring tools, and if we trust them, we can use them without much testing, with reasonable confidence that we haven’t broken anything. I should add a few more caveats here, but in fact I do trust them to do a decent job without injecting mistakes. I know people who claim to have seen refactoring transformations break things, so we should be cautious.
Go For It!?!?: We could just go for it. Read the program, figure out the if nest, and slot in our changes where they seem to fit.; We could just go for it, carefully pulling bits out, hoping to make the program simple enough to understand.; I think we know what would happen if we did that. We’d go slowly, forever, every time we had to change something. We’d make mistakes, and some of them would slip out into the released version, making mistakes in actual use. It wouldn’t go well, but we might just wallow in the mire, telling ourselves, “Well, that’s legacy code for you”.; Let’s not do that.

A Tentative Plan

I’m going through all this intro because I’m not here to pull a refactored gilded Rose out of my orifice and shout “Behold!”. I’m here to share with you the process that I’d go through whenever faced with taking over some legacy program.

I think I’ll start with a characterization test. I’ll save the current output and figure out a way to check it against current output. I’ll do that in the ludicrous test fixture we have, fixing the “fixme” along the way.

My first cut at the test is this:

    @Test
    fun `check golden master`() {
        main(arrayOf())
        assertEquals(golden, asTested)
    }

I’ve captured the output that I showed above and pasted it into a variable, golden, and I plan to get the main to put the result into asTested. I expect that I’ll have to fiddle around a bit to get visibility of the variables.

First, I write a new function to replace the println calls:

var asTested = ""

fun myPrintln(s: String) {
    asTested+=s+"\n"
    println(s)
}

There’s surely some more clever way to do this, but I’m not here to be clever. This will work for strings, and we’ll deal with other matters as IDEA informs us. Now I’ll replace all the calls to println with mine.

IDEA objects to two:

    for (item in items) {
        myPrintln(item)
    }
    
    myPrintln()

Fine, I’ll override:

fun myPrintln(item: Item) {
    myPrintln(item.toString())
}

fun myPrintln() {
    myPrintln("")
}

Let’s run the test and see what it says. It runs green. Commit: set up simple golden master check.

Now I want to make sure that it fails, so I make an error in updateQuality:

    fun updateQuality() {
        for (i in items.indices) {
            if (items[i].name != "Aged Brie" && items[i].name != "Backstage passes to a TAFKAL80ETC concert") {
                if (items[i].quality > 0) {
                    if (items[i].name != "Sulfuras, Hand of Ragnaros") {
                        items[i].quality = items[i].quality - 12
                    }
                }

I changed a -1 to -12. Test. The test fails, as advertised. I can see that I won’t like this: it just says Expected (many text lines) but was (many more text lines).

We’d like to have better visibility into what changed. Is that worth doing now? I think we’ll do it.

    @Test
    fun `check golden master`() {
        main(arrayOf())
        val goldLines = golden.lines()
        val testLines = asTested.lines()
        val zipped = goldLines.zip(testLines)
        zipped.forEach{
            assertEquals(it.first, it.second)
        }
    }

Test. Fails:

expected: <+5 Dexterity Vest, 9, 19> but was: <+5 Dexterity Vest, 9, 8>
Expected :+5 Dexterity Vest, 9, 19
Actual   :+5 Dexterity Vest, 9, 8

Perfect. I’ll fix the bug, run the test, Commit: improved golden master test to check lines.

I also decide to comment out the actual println, which just gloms up my output.

fun myPrintln(s: String) {
    asTested+=s+"\n"
//    println(s)
}

Test, Commit: Commented out println.

Reflection: We now have a simple but obviously at least somewhat effective Characterization / Approval Test in place. We can begin to work, changing things, if we wish. It’s not great, but it’s decent.

Let’s make some simple changes to give us a place to work. I’ll use IDEA’s refactoring tools. Using those simply, plus my test, gives me sufficient confidence for what I’m about to do.

This code is so ugly:

    fun updateQuality() {
        for (i in items.indices) {
            if (items[i].name != "Aged Brie" && items[i].name != "Backstage passes to a TAFKAL80ETC concert") {
                if (items[i].quality > 0) {
                    if (items[i].name != "Sulfuras, Hand of Ragnaros") {
                        items[i].quality = items[i].quality - 1
                    }
                }
            } 
            ... and on and on ...

Let’s do something about all those item[i] references. IDEA will extract a variable for us:

    fun updateQuality() {
        for (i in items.indices) {
            val item = items[i]
            if (item.name != "Aged Brie" && item.name != "Backstage passes to a TAFKAL80ETC concert") {
                if (item.quality > 0) {
                    if (item.name != "Sulfuras, Hand of Ragnaros") {
                        item.quality = item.quality - 1
                    }
                }
            }
            ... and on and on ...

Nicer. Test. Green. Commit: Extract item variable from main loop.

Now I’d really like to be able to update a single item rather than all of them. So let’s extract all that code as a method.

    fun updateQuality() {
        for (i in items.indices) {
            val item = items[i]
            updateItem(item)
        }
    }

I’ll spare you the long extracted method. Our test will tell us if we are OK, to the extent that we question IDEA’s refactoring. Green. Commit: extract updateItem method.

Let’s take a moment to reflect and sum up where we are.

R/SU Where We Are

We started with no useful test: we now have one that is at least capable of detecting some errors. It might be fun to experiment with other changes to the code, to get a sense of how robust the test is.

We have improved the code so that there is a method to update a single item, and improved the code inside so that at least it doesn’t say item[i] a thousand times.

We could take a look at implementing the new feature now. I think we’d be smart enough not to try to interweave it with that briar patch of code. My pal GeePaw did it by prefixing the code in updateItem with a if for the new item, updating it according to the spec, and then returning. That allowed the old code to do everything but the conjured item. Then he continued, and enhanced, that idea.

There is a pattern called “Strangler”, named by Michael Feathers, if I’m not mistaken, in his excellent book Working Effectively with Legacy Code. The idea is that we surround the bad code, replacing bits of it with good code, until there is no bad code left. What we might do here is like that … although I think I wouldn’t really try to replace that old code. Instead, I’d just do all the item types, one at a time, until nothing was going through the old code. Then I’d delete it.

Suppose we choose to do that. Then, surely, what we should do is to write a TDD-style microtest (or a few) for each item as we adopt it. We could even test our new code against the old code similarly, should we care to.

There is another thing that we could do, and that would be to try to refactor, literally refactor, with small changes, that nasty update method. I’ve looked at that code. You can look at it. I don’t see many crevices in there where we can find useful things to improve.

Yet another idea occurred to me yesterday, and we refined it last night in the Zoom meeting. There are a number of patches of code like this:

    if (item.quality < 50) {
                item.quality = item.quality + 1

This code is enforcing one of the rules, which is that the quality of an item is never more than 50. (There is at least one exception, Sulfuras always has value 80, unchanging.(The code does not enforce this, it just seems to ignore Sulfuras entirely.))

Yesterday, I extracted small methods for situations like this, and plugged them in where they fit. Last night, the notion of a “bounded object” was suggested, perhaps first by Bill Wake. That would be an object containing, say, a quality value, and supporting methods to adjust it up and down. But the object would know its bounds and would ensure that its value always stayed within bounds.

To do that, we’d probably have to violate the rule about not modifying the Item class. Because we’re here at my house³, we can violate that rule if we think it’ll be educational or fun.

So … what should we do? How should we proceed?

I think how we’ll proceed is to wrap this article here, take a break, and then create a few microtests to get a feeling for how those might serve us.

Summing Up …

We’ve begun to get this weird program under control. We’ve improved it, safely, with the aid of our IDE and a simple but reasonably effective test.

We’ve made a bit of progress. That makes it a good morning.

See you soon!

Now For Something Completely Different. ↩
Hill’s name for our Tuesday evening Zoom Ensemble⁴: Friday Geek’s Night Out. ↩
Why didn’t he say “chez Ron”?? ↩
My more elegant name for the same chaotic evening as FGNO, q.v.⁵ ↩
“Quod vide”. “Which see”. We’ve talked about this. ↩