Kate Talks Testing
Kate was asked to talk about testing at the monthly meeting of the Ann Arbor Kanban Agile Lean society. She set up her iPad to display on the screen while the room settled down.
After the KALe host introduced her, Kate said, “So, I was given the topic ‘Testing’. That’s pretty wide open. Does anyone have a question to start us off?”
Someone in the back said, “Yeah, I do. I’m told you hate testers. Why is that? What’s wrong with us?”
Kate laughed. “Well, some of you go beyond assertive nearly all the way to rudeness. But no, I don’t hate testers. I do dislike separate testing groups, testing after development, and defect databases. Let me talk about those things a bit.
“At Oak River, where I work, we use an Agile software development process. It’s like Scrum, in many ways. We have what we call a Product Champion, like a Product Owner. The difference is that the Product Champion helps the team recognize that it is the team’s product, not hers.
“We don’t usually identify a ScrumMaster, because we believe that every individual has occasions to lead, every day. We don’t want to embody servant leadership in a specific role: it’s a responsibility of each of us.
“We have fully cross-functional teams, as Agile methods including Scrum recommend. We make sure that every team has all the skills needed to build running tested features every week. Those skills include testing.
“That brings us to testing. Maybe more details about our approach will come out. But take a look at what an approach like ours needs to do.”
Kate drew a picture on her iPad. “We work in one-week iterations, Sprints for you Scrum people. Every week we go from our Product Champion’s current most-valued ideas to packaged features that are ready to be given to customers. That means all the new features are tested and integrated into the existing product. At the end of any iteration, we could give the product to our customers or prospects, and often, we do.”
Kate’s antagonist called out, “You must ship a lot of bugs, then. There’s no time for testing, and all code has bugs in it!”
Kate smiled. “Yes, I suppose all code has bugs in it. We call them defects, by the way. I stole that idea from a Manifesto author I know. His point is that it isn’t bugs creeping into the code at night while we are all sleeping. Every single defect is a result of something typed in, or left out, by an actual living programmer. We consider defects something to be avoided, not some inevitable fact of life. And we ship approximately one defect to our customers every six months, and we consider even that to be too many.”
“How do you find and fix them, then, in such a short time?”
“Ha! We don’t find defects, we don’t fix them: we prevent them. Let’s look at ways we can ship code with very few defects, and I do mean very few.”
Kate drew another picture. “We could code and then test, like this. That would be bad. Here are a few reasons why.
“First, since we work in one-week iterations, testing would pile up, because when you test after the fact, you can’t start until the code is done. So at the beginning of the iteration the testers would be idle, and at the end either the programmers would be idle or they’d be coding things that couldn’t be tested. Either would be bad.
“Second, when the testing phase finds a problem, it has to send the software back to the coding phase. Since the programmer has already kicked that can over to testing and moved on to something else, he has to drop what he’s doing and fix the problem.”
“Why couldn’t he wait and fix it after he’s done with whatever he’s now working on?” someone asked.
“Well, the can he kicked down the road is more important, so fixing it is higher priority. He wasn’t done with the first thing, so he shouldn’t even have started the second.”
Kate went on. “But whether the programmer works on the defect right away or later, the real problem is that the defect exists at all. Once the code leaves the programmer’s hands, if it comes back, it is rework, and rework is waste. It costs us time, as the code cycles back and forth from code to test to code, and time, I’m told, is money.
“Finally, it should be pretty clear that if we work this way we can’t possibly get features done in a week. That means that the Product Champion won’t know how things are going, nor will management. When features are flowing out, we know what’s going on and we are better able to guide the effort, just by providing new feature stories next iteration. When features are not coming out, we need to look at other ways to find out how things are going, and most of those don’t work.”
Kate said, “Let me admit right here that at heart I’m a programmer. So I know that when someone asks me how I’m doing, I should say something like ‘ninety percent done, just got this little widget thing to do’. That way they’ll go away and I can get back to coding.
“But suppose I really do believe I’m ninety percent done, and yet a third of my features come back to me from the test phase. I find that, most of the time, it takes me nearly as long to fix a defect as it did to write the thing in the first place. And of course, the defect doesn’t even come back to me until the testing people finally get around to testing my code. So when I think I’m ninety percent done, I’m really only about sixty percent done. And that doesn’t count the delay in finding out.”
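(A back-of-the-envelope check of Kate’s arithmetic. All the numbers are assumptions for illustration: the one-third return rate is hers; the unit costs are invented.)

```python
# Back-of-the-envelope check of Kate's "about sixty percent" claim.
# Assumed for illustration: ten features of one unit of work each, a
# third of them returned from test, and each fix costing about as much
# as the original writing.

believed_total = 10.0                    # units the programmer planned for
work_done = 0.9 * believed_total         # "ninety percent done"
rework = (1 / 3) * believed_total * 1.0  # a third come back, full-cost fixes

real_total = believed_total + rework
print(f"real progress: {work_done / real_total:.0%}")  # about 67%,
# and that is before counting the delay in finding out
```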
Another audience member asked, “But we can’t be sure there aren’t defects without testing, can we? And even then we can’t be certain.”
Kate reached into the bowl of candies on the table and tossed a Tootsie Roll toward the person who asked the question. It landed directly in front of them, of course. “Thanks for the lead-in,” she said.
“Yes, all our code needs to be tested, so that we can be as sure as possible that there are no defects. But we can’t afford to test in a separate phase, which is why I don’t like separate testing groups. They always require a second phase, and when they find defects, they always cause rework.”
Someone in the back said, “We testers don’t cause rework. The programmer caused the rework by writing the bug. Sorry, defect.”
Kate said, “Hold your hand out and don’t move it.” The questioner did and Kate tossed a Tootsie Roll into her hand. “Thanks. You have hit on the solution: we prevent defects rather than detect them. Here’s how:
“When our Product Champion decides on a new feature to request, we work with her to be clear on what the feature has to do. We consider what it should look like on the screen, what it should do when it works, and we consider how it might fail.” She drew another picture.
“When we ready a story for implementation, obviously we have to define what it has to do. We express most of what it has to do with examples of its inputs and outputs. We convert those examples to tests, typically automated ones. We call them the Champion’s tests, because they are there to assure the Product Champion that she’s getting what she asked for. An XP shop might call them Customer tests. Some people call them Acceptance Tests. It’s all good: the point is to have them.
“When we do the development, the programmer runs the Champion’s tests before declaring the feature done. Our rule is that if they don’t run, you’re not done — and if they do run, you are done.”
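As a sketch of what a Champion’s test might look like in practice, here is a hypothetical example in Python, pytest style. The shipping-cost feature and every name in it are invented for illustration; Kate doesn’t describe any particular feature. The point is only the shape: each example the Champion gives becomes an automated check.

```python
# A sketch of Champion's tests, pytest style. The shipping_cost
# feature and all names are invented for illustration. Run with:
#   pytest champion_tests.py

def shipping_cost(order_total: float) -> float:
    """Toy stand-in for the real feature under test."""
    return 0.00 if order_total >= 100.00 else 5.00

def test_standard_order_ships_flat_rate():
    # Champion's example: "a thirty-dollar order ships for five dollars"
    assert shipping_cost(order_total=30.00) == 5.00

def test_large_order_ships_free():
    # Champion's example: "orders of a hundred dollars or more ship free"
    assert shipping_cost(order_total=100.00) == 0.00
```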
Someone called out, “Those are just happy path tests. What about the sad paths? Testers are good at finding those.”
Kate tossed a Tootsie Roll. It landed on the questioner’s copy of The Nature of Software Development.
“Did you know that the redhead in that book is supposed to be me? Frankly, I think Ron has something to learn about drawing. Good book otherwise. But I digress.”
Kate continued, “Anyway, we don’t actually have a role called Tester, which may be the source of the foul and base canard that I hate testers. We do have team members who have a lot of skill at testing and who use that skill to think about what you call sad paths. We call them Paths of Sorrow, mostly. Anyway, our rule is that if whatever tests we have run correctly, the feature is done, and if they don’t, it isn’t.
“You might think that would lead to cases where there is an obvious bug — excuse me, defect — that we didn’t test for. And it does, sometimes. Our rule calls that feature done, and we don’t like it. But we don’t change the rule: we take the occasion to learn what kind of test we missed. What we learn, typically, is that there was a Path of Sorrow that we didn’t specify a test for. So we learn something about testing.
“More importantly, we turn an unexpected sad path into something we do intentionally: ‘When such and such happens, do thus and so.’ Every path becomes a happy path that we’ve planned for and catered to.”
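Continuing the invented shipping example, here is a sketch of how an escaped Path of Sorrow might become a planned path: decide what should happen, capture the decision as a test, and make the code honor it.

```python
# Sketch: an escaped sad path (a negative order total) becomes a
# planned path. Same invented shipping example as above.

import pytest

def shipping_cost(order_total: float) -> float:
    if order_total < 0:
        # The new, intentional behavior: reject bad input explicitly.
        raise ValueError("order total cannot be negative")
    return 0.00 if order_total >= 100.00 else 5.00

def test_negative_total_is_rejected():
    # New Champion's example: "a negative total is an error,
    # not free shipping"
    with pytest.raises(ValueError):
        shipping_cost(order_total=-10.00)
```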
Another KALe member asked, “Doesn’t all that testing slow you down?” and received a Tootsie Roll for her trouble.
Kate put up her earlier picture. “It’s not slow compared to testing after the coding, for a few reasons. First, testing after the fact has built-in delays while we wait for the hand-off between code and test people. Second, we find that fixing after the fact still requires us to write and check a test, and the fix itself takes longer, so our way is actually less work. Finally, the focus on concrete examples helps developers understand what’s needed, and they seem to make fewer missteps this way.
“Oh, and whatever comes after finally, since we never call anything done until we’re sure it really is, we don’t have a big list of defects. That means that we don’t need a defect database, and I hate those.
“Overall, we find that we go faster. But even if we didn’t, we’d prefer our way because it has built-in learning and keeps us going at a steady pace instead of seeming to speed up and slow down when defects occur.”
Kate said, “We have time for one more question. Anyone?”
Someone back in the corner asked, “May I have a Tootsie Roll?”
Kate tossed one. It bounced off both corner walls and landed in the questioner’s lap. “Sure, but that doesn’t count as a question. Do you have one?”
“Yes. What happens when there’s something wrong and the up-front tests didn’t show it? What about user experience things that you can’t test? What about things no one thought of?”
Kate laughed. “This must be some new kind of ‘one question’ that I wasn’t previously familiar with, but I’ll try:
“Calling a story done because its tests pass is a thing that we do because we used to have a bit of blaming going on. The business people would blame the programmers for defects and the programmers would blame the business people for being unclear. I found it distasteful, and stole the idea that the tests running means done from someone, probably Ron and Chet.
“Basically, we just pretend that we don’t care who screwed up. We just take any such needed change, whether it’s a missed mechanical test or a discovery about the UI, or anything, and we write a new story. That does mean that, in this iteration, there’s something in the system that we really don’t like: a calculation that everyone would consider wrong, or a screen that discerning eyes would consider ugly.
“We don’t like that, of course. So when it happens, we learn from it. We learn to be more clear about what we need to do, we learn to move from testing after the fact to examples before the fact, and we learn to work in tiny little stories so that mistakes, when they do arise, are small and easy to fix.”
Kate paused. “So that’s my story. We love testing skills, because my teams can turn testing ideas into better understanding and into automated tests that tell the developers when they’re done. And we entirely avoid separate testing phases and long lists of defects, and we have a helping culture instead of a blaming culture.
“It works for us. It might well work for you. We’d be happy to visit with you to talk about how to start working this way, if you’re interested. Call me any time.
“I’ll stick around a bit after this session, but it’s 8 PM, so the meeting is over. Thanks!”