What is STAF?
STAF stands for "Software Testing Automation Framework". As its name indicates, STAF is an automation framework. It is intended to make it easier to create and manage automated testcases and test environments.
STAF externalizes its capabilities through services. A service provides a focused set of functionality, such as logging or process invocation. STAFProc is the process that runs on a machine, called a STAF Client, which accepts requests and routes them to the appropriate service. These requests may come from the local machine or from another STAF Client. Thus, STAF works in a peer environment, where machines may make requests of services on other machines.
STAF was designed with the following points in mind:
Minimum machine requirements - This is both a hardware and a software statement.
Easily useable from a variety of languages, such as Java, C/C++, Rexx, Perl, and Tcl, or from a shell (command) prompt.
Easily extendable - This means that it should be easy to create other services to plug into STAF.
You can find more at the following location:
http://staf.sourceforge.net/docs.php
Thursday, November 30, 2006
Tuesday, November 28, 2006
Testing framework
A testing framework gives you a place to organize and run your tests without worrying about the low-level plumbing. We talk about two kinds here: test harnesses and test tools. A test harness is an API that you can use as the basis for writing your own tests. A test tool is a program that you use to create and run tests. Both have their place, depending on what you’re testing.
What’s Available (Test Harnesses):
SUnit- SmallTalk Unit is the original XUnit testing harness. This is the one that all the current unit testing frameworks copied.
JUnit- One of the more popular testing frameworks, written in Java for Java.
JUnitPerf- A nice collection of JUnit extensions that make performance and scalability measurement easier. This is a great example of why you want to use an open tool. Lots of people extend open tools, and the extensions are the sort of thing you’d have to write for yourself if you used a toolkit you wrote.
NUnit- A unit testing harness for .NET, originally ported from JUnit. It supports all the .NET languages.
MbUnit- Built on top of NUnit, MbUnit brings several higher-level testing techniques to the table. Integrated combinatorial testing, reporting, row testing, data-driven testing, and other concepts are all part of the package. Note that you must download TestDriven.NET to get MbUnit.
HTMLUnit- Used inside another test harness (such as JUnit), HTMLUnit simulates a web browser to test web applications.
HTTPUnit- HTTPUnit is a lot like HTMLUnit, but it uses HTTP requests and responses to do its testing.
JWebUnit- JWebUnit sits on top of HTTPUnit to give you a high-level API for navigating a web app.
JSUnit - JSUnit is an incredible framework for testing JavaScript. But it's also got a distributed test runner to execute test scripts on different machines.
DBUnit- This is not a traditional testing framework, but it can be used to save a database to XML, restore it later, or compare the data to the XML file to verify its integrity. This is a great tool to use for creating a clean database for a test run.
What’s Available (Additional Testing Tools):
Cobertura- Cobertura (Spanish for “coverage”) is a code coverage tool. When you run a set of tests, it tells you how much of the tested code those tests exercise. IBM's developerWorks has a good tutorial by Elliotte Harold.
Clover- A commercial Java code coverage tool, Clover has integrated plug-ins for most of the popular IDEs out there.
Watir- A testing tool to drive automated tests inside Internet Explorer. By driving IE, it solves the problem of imitating a specific browser’s interpretation of a web page. It’s based on Ruby and is becoming more popular.
Selenium- Selenium also drives browser testing. It's more complicated than Watir, but it can drive tests against browsers on different platforms.
Fit- Fit takes a unique, user-friendly, table-driven approach to acceptance tests. It’s worth reading about even if you choose not to use it.
Fitnesse- An extension to Fit. Fitnesse is both a stand-alone wiki and an acceptance testing framework.
WinRunner & QTP- WinRunner is an enterprise-class tool for functional and regression testing (and has a price tag to match).
LoadRunner- From the same company as WinRunner, LoadRunner handles performance and stress testing.
Empirix E-Tester- Empirix is a web recorder/playback tool that embeds MS Internet Explorer.
Systir- Another Ruby-based tool. This one is designed to drive tests in other languages.
Key Concepts:
API- The programmatic interface that your test code uses to access the test harness.
Methodology for creating tests- The framework that the test harness exposes.
User Interface- How you create, save, run, and maintain tests.
Test engine- The program that actually runs the tests you create.
Results display- How you find out whether the tests passed or failed.
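To make these concepts concrete, here is a minimal sketch that maps them onto Python's standard unittest module. The add function and the AddTests class are hypothetical stand-ins for your own code and tests; any xUnit-style harness would look broadly similar.

```python
import io
import unittest


def add(a, b):
    """Hypothetical code under test."""
    return a + b


# API: the harness classes and assertion methods that the test code calls.
class AddTests(unittest.TestCase):
    def test_adds_two_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_adds_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)


# Test engine: the loader collects the tests and the runner executes them.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
report = io.StringIO()
result = unittest.TextTestRunner(stream=report, verbosity=2).run(suite)

# Results display: a readable report plus a summary object you can inspect.
print(report.getvalue())
print("all passed:", result.wasSuccessful(), "- tests run:", result.testsRun)
```

The "methodology" here is the xUnit convention of one class per group of tests and one test_ method per check; the "user interface" is simply your editor and the command line.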
How to Choose:
Type of testing- Does the tool or harness let you run the tests you need (e.g., functional, performance)?
Support for the stuff you’re testing- Does the tool let you test your application code? Your web site?
Supported programming languages- Can you test the languages you use natively, or do you have to learn a new technology?
Flexibility- Can you create and run the type of tests that you need for your program?
Open formats- Can you integrate your test tool with other tools?
Monday, November 27, 2006
Why Agile Teams Don’t Need Process QA
Flashback: Quality Week 2001
In 2001, a conference for quality and testing professionals was held. Kent spent the first portion of his talk explaining how QA is a throwback to Tayloristic, Time-and-Motion, Scientific Management practices, and that XP teams just don’t need any of that. He went on to explain the XP practices, and how they are both disciplined and self-supporting.
What Kent didn’t realize is that he had a double time slot. The crowd wasn’t about to let him go after he’d just offended all they knew to be good and holy. He spent the next 45 minutes answering questions posed by extremely angry QA professionals.
The good news is that Kent survived the experience. Despite a few tense moments, the audience did not charge the podium after all. I must say that I was impressed with Kent’s courage and grace under fire.
The insights I gleaned from that Conference may not have been exactly what Kent intended. It seemed to me that:
XP teams still needed Test QA people.
XP teams might not need Process QA people.
Kent didn’t understand (or at least didn’t articulate) the difference between Test QA and Process QA; he seemed to paint all “QA” people with the same brush.
I should learn more about this XP stuff.
Wednesday, November 22, 2006
Agile Testing
Introduction:
What is Agile Testing?
The question came up on the Agile Testing mailing list, and one participant, among others, attempted to define it:
When I use the term 'agile testing' I'm talking about a set of good practices which has helped me and my teams deliver higher quality software, whether or not we were part of an agile team. We know that people, not tools or methodologies, are what makes projects succeed. Anything we can do to allow ourselves to do our best work fits my definition of 'agile'.
'Agile testing' practices come about by applying agile values and principles to testing. For example, communication is essential, and by collaborating with our customers up front to write test cases, we can improve communication. By driving coding with customer-facing tests, we're more likely to deliver what the customer wanted.
Some other examples: Automating our regression tests allows us time to do exploratory testing, so we strive for 100% regression test automation (I've never achieved anything close to this on a team who isn't automating unit tests, but we can still automate some testing). Exploratory testing is inherently agile so we work at getting better at that. Collaborating helps communication, so testers should get together with programmers and with customers from the start of a project, as well as pairing with each other.
Simple design is important in designing and automating tests. Refactoring applies to any type of coding including test scripts. Retrospectives are the single most important practice I can think of for any team, because you can't improve without identifying and focusing on what you need to improve. Dividing work into small chunks, having a stable build at all times (because we have continuous integration and collective code ownership), releasing business value frequently, all that allows us to do a better job of testing.
None of this is really new, and a lot of it I was doing years before I heard of XP, Scrum, agile, etc.
I haven't come up with a good elevator speech for agile testing so I have to resort to examples. I've heard people come up with some great one-liners but then I can't remember them. I think the Agile Manifesto works from a testing point of view as well as from a coding point of view.
I've been told that I spend too much time saying I don't know where Agile testing will be in five years, and not enough pointing in some direction and saying "But let's see if maybe we can find it over there". They're probably right. So this is the start of a series of notes in which I'll do just that.
I'm going to start by restating a pair of distinctions that I think are getting to be fairly common.
If you hear someone talking about tests in Agile projects, it's useful to ask if those tests are business facing or technology facing. A business-facing test is one you could describe to a business expert in terms that would (or should) interest her. If you were talking on the phone and wanted to describe what questions the test answers, you would use words drawn from the business domain: "If you withdraw more money than you have in your account, does the system automatically extend you a loan for the difference?"
A technology-facing test is one you describe with words drawn from the domain of the programmers: "Different browsers implement Javascript differently, so we test whether our product works with the most important ones." Or: "PersistentUser#delete should not complain if the user record doesn't exist."
(These categories have fuzzy boundaries, as so many do. For example, the choice of which browser configurations to test is in part a business decision.)
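Here is a minimal sketch of the two kinds of test side by side. Account and UserStore are hypothetical classes invented for illustration (UserStore stands in for whatever sits behind PersistentUser), and the sketch assumes the answer to the loan question is "yes".

```python
class Account:
    """Hypothetical bank account that lends the shortfall on an overdraft."""

    def __init__(self, balance):
        self.balance = balance
        self.loan = 0

    def withdraw(self, amount):
        shortfall = amount - self.balance
        if shortfall > 0:
            # Assumed business rule: the difference becomes a loan.
            self.loan += shortfall
            self.balance = 0
        else:
            self.balance -= amount


class UserStore:
    """Hypothetical stand-in for the storage behind PersistentUser."""

    def __init__(self):
        self.records = {}

    def delete(self, user_id):
        # pop() with a default does nothing when the record is missing.
        self.records.pop(user_id, None)


def test_overdraft_extends_loan_for_difference():
    # Business-facing: phrased in words a business expert would use.
    account = Account(balance=100)
    account.withdraw(130)
    assert account.balance == 0
    assert account.loan == 30


def test_delete_ignores_missing_record():
    # Technology-facing: phrased in the programmers' vocabulary.
    store = UserStore()
    store.delete("no-such-user")  # must not raise
    assert store.records == {}


test_overdraft_extends_loan_for_difference()
test_delete_ignores_missing_record()
```

Both tests are ordinary code; what differs is who could read the test name and care about the answer.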
It's also useful to ask people who talk about tests whether they want the tests to support programming or critique the product. By "support programming", I mean that the programmers use them as an integral part of the act of programming. For example, some programmers write a test to tell them what code to write next. By writing that code, they change some of the behavior of the program. Running the test after the change reassures them that they changed what they wanted. Running all the other tests reassures them that they didn't change behavior they intended to leave alone.
Tests that critique the product are not focused on the act of programming. Instead, they look at a finished product with the intent of discovering inadequacies.
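Put those two distinctions together and you get a two-by-two matrix: business-facing versus technology-facing on one axis, and tests that support programming (the left side) versus tests that critique the product (the right side) on the other.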
Agile testing directions: tests and examples:
'It all depends on what you mean by home.'
[...]
'Home is the place where, when you have to go there,
They have to take you in.'
'I should have called it
Something you somehow haven't to deserve.'
-- Robert Frost, "The Death of the Hired Man"
Consider the left to right division. Some testing on agile projects, I say, is done to critique a product; other testing, to support programming. But the meaning and the connotations of the word "testing" differ wildly in the two cases.
When it comes to supporting programming, tests are mainly about preparing and reassuring. You write a test to help you clarify your thinking about a problem. You use it as an illustrative example of the way the code ought to behave. It is, fortunately, an example that actively checks the code, which is reassuring. These tests also find bugs, but that is a secondary purpose.
On the other side of the division, tests are about uncovering prior mistakes and omissions. The primary meaning is about bugs. There are secondary meanings, but that primary meaning is very primary. (Many testers, especially the best ones, have their identities wrapped up in the connotations of those words.)
I want to try an experiment. What if we stopped using the words "testing" and "tests" for what happens in the left side of the matrix? What if we called them "checked examples" instead?
Imagine two XP programmers sitting down to code. They'll start by constructing an incisive example of what the code needs to do next. They'll check that it doesn't do it yet. (If it does, something's surely peculiar.) They'll make the code do it. They'll check that the example is now true, and that all the other examples remain good examples of what the code does. Then they'll move on to an example of the next thing the code should do.
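The rhythm described above can be sketched in a few lines. Everything here is hypothetical: the total function, its three successive versions, and the check helper exist only to show the sequence "example fails, code changes, example holds, older examples still hold".

```python
def check(example, implementation):
    """Return True when the implementation satisfies the example."""
    try:
        example(implementation)
        return True
    except (AssertionError, NotImplementedError):
        return False


# Two examples of what a hypothetical total() should do.
def example_empty_cart_costs_nothing(total):
    assert total([]) == 0


def example_total_sums_prices(total):
    assert total([5, 7]) == 12


# Three successive versions of the code, as the pair would write them.
def total_v0(prices):
    raise NotImplementedError  # nothing written yet


def total_v1(prices):
    return 0  # just enough to make the first example true


def total_v2(prices):
    return sum(prices)  # makes the second example true as well


history = []
# 1. A new example; check that the code doesn't do it yet.
history.append(check(example_empty_cart_costs_nothing, total_v0))
# 2. Make the code do it; check that the example is now true.
history.append(check(example_empty_cart_costs_nothing, total_v1))
# 3. The next example; again it should not hold yet.
history.append(check(example_total_sums_prices, total_v1))
# 4. Change the code; check that all examples are still good examples.
examples = [example_empty_cart_costs_nothing, example_total_sums_prices]
all_ok = all(check(example, total_v2) for example in examples)

print(history, all_ok)
```

In real work the versions would be successive edits to one function rather than three functions; they are kept separate here only so the whole sequence can run in one go.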
Is there a point to that switch, or is it just a meaningless textual substitution? Well, you do experiments to find these things out. Try using "example" occasionally, often enough that it stops sounding completely weird. Now: Does it change your perspective at all when you sit down to code? Does it make a difference to walk up to a customer and ask for an example rather than a test? Add on some adjectives: what do motivating, telling, or insightful examples look like, and how are they different from powerful tests? ("Powerful" being the typical adjective-of-praise attached to a test.) Is it easier to see what a tester does on an XP project when everyone else is making examples, when no one else is making tests?
Credit: Ward Cunningham added the adjective "checked". I was originally calling them either "guiding" or "coaching" examples.
Agile testing directions: technology-facing programmer support
As an aid to conversation and thought, I've been breaking one topic, "testing in agile projects," into four distinct topics. Today I'm writing about how we can use technology-facing examples to support programming.
One thing that fits here is test-driven development, as covered in Kent Beck's book of the same name, David Astels's more recent book, and forthcoming books by Phlip, J.B. Rainsberger, and who knows who else. I think that test-driven development (what I would now call example-driven development) is on solid ground. It's not a mainstream technique, but it seems to be progressing nicely toward that. To use Geoffrey Moore's term, I think it's well on its way to crossing the chasm.
(Note: in this posting, when I talk of examples, I mean examples of how coders will use the thing-under-development. In XP terms, unit tests. In my terms, technology-facing examples.)
Put another way, example-driven development has moved from being what Thomas Kuhn called "revolutionary science" to what he called "normal science". In a normal science, people expand the range of applicability of a particular approach. So we now have people applying EDD (sic) to GUIs, figuring out how it works with legacy code, discussing good ways to use mock objects, having long discussions about techniques for handling private methods, and so forth.
Normal science is not the romantic side of science; it's merely where ideas turn into impact on the world. So I'm glad to see we're there with EDD. But normality also means that my ideas for what I want to work on or see others work on... well, they're not very momentous.
I hope future years will see more people with a mixture of testing and programming skills being drawn to Agile projects. Those people will likely neither be as good testers as pure testers, nor as good programmers as pure programmers, but that's OK if you believe, as I do, that Agile projects do and should value generalists over specialists.
I'm one such mixture. I've done limited pair programming with "pure programmers". When I have, I've noticed there's a real tension between the desire to maintain the pacing and objectives of the programming and the desire to make sure lots of test ideas get taken into account. I find myself oscillating between being in "programmer mode" and pulling the pair back to take stock of the big picture. With experience, we should gain a better idea of how to manage that process, and of what kinds of "testing thinking" are appropriate during coding.
There might also be testers on the team who do not act as programmers. Nevertheless, some of them do pair with programmers to talk about the unit tests (how the programmers checked the code). The programmers learn what kinds of bugs to avoid, and the testers learn about what they're testing. For some reason, Calgary Canada is a hotbed of such activity, and I look to Jonathan Kohl, Janet Gregory, and others to teach us how to do it well.
I want to emphasize that this is all about people. Testers traditionally have an arms-length (or oceans-length) relationship to programmers. For the programmer-support half of the matrix, that relationship is, I believe, inappropriate.
I've been using the phrase "checked examples" for programmer support tests. We can split that idea in two. There are new examples that guide decisions about what to do next. And there are automated examples that serve as change detectors to see whether what you just did was what you expected to do.
The common habit is that the change detectors are merely retained code-guiding examples. (You make your unit test suite by saving, one by one, the tests you write as you code.) That's not a logical necessity. I'd like to develop some lore about when to do something else.
For example, consider this maintenance scenario: you develop some code example-first. A month later, someone adds a new example and changes the code to match. Many prior examples for that hunk of code become "bad examples" (the tests fail, but because they're now wrong, not because the code is). The tendency is to fix those examples so that they're essentially the same. What I mean by that is that the left sequence of events in the table below is expected to yield the same tests as the right. (Read the left column, then the right.)
Left column | Right column
Example foo written | Example bar written
Code written to match foo | Code written to match bar
Example bar written (foo invalidated) | Example better-foo written (bar is still a good example)
Code changed to match bar - oops, now foo doesn't check out | Code changed to match better-foo (and bar continues to check out)
Update foo to be better-foo |
That is, newly broken examples are rewritten to match an ideal sequence of development in which no example ever needed to be rewritten. But why? In the left column above, example better-foo is never used to drive development - it's only for checking. What's optimal for driving development might not be optimal for checking.
Let me be concrete. Suppose that software systems develop shearing layers, interfaces that naturally don't change much. For maintainability, it might make sense to migrate broken examples to shearing layers when fixing them. Instead of being an example about a particular method in a particular class, we now have an example of a use of an entire subsystem. That can be bad - think about debugging - but it reduces the maintenance burden and could even provide the benefit of thorough documentation of the subsystem's behavior.
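As a rough illustration of that migration, consider the sketch below. The invoice code is entirely hypothetical, and it simply assumes that render_invoice is the stable interface (the "shearing layer") while format_line is an internal detail that may change.

```python
def format_line(name, cents):
    """Internal helper; its name and signature are likely to change."""
    return f"{name}: {cents / 100:.2f}"


def render_invoice(items):
    """Subsystem entry point, assumed here to be the stable interface."""
    lines = [format_line(name, cents) for name, cents in items]
    total = sum(cents for _, cents in items)
    lines.append(f"TOTAL: {total / 100:.2f}")
    return "\n".join(lines)


def example_helper_level():
    # Pins one method; it breaks whenever the helper is renamed or reshaped.
    assert format_line("tea", 250) == "tea: 2.50"


def example_subsystem_level():
    # Pins only the stable interface; internal refactorings leave it alone.
    expected = "tea: 2.50\ncake: 4.00\nTOTAL: 6.50"
    assert render_invoice([("tea", 250), ("cake", 400)]) == expected


example_helper_level()
example_subsystem_level()
```

The trade-off is visible even at this size: if the subsystem-level example fails, it says less about where the fault lies than the helper-level one would.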
I'm hoping that people who distinguish the two roles - guiding the near future and rechecking the past - will discover productive lore. For example, when might it be useful to write technology-facing change detectors that never had anything to do with guiding programming?
I said above that test-driven development is "one thing that fits" today's topic. What else fits? I don't know. And is EDD the best fit? (Might there be a revolution in the offing?) I don't know that either - I'll rely on iconoclasts to figure that out. I'm very interested in listening to them.
Agile testing directions: Testers on agile projects:
Should there be testers on agile projects?
First: what's the alternative? It is to have non-specialists (programmers, business experts, technical writers, etc.) perform the activities I've identified in this series: helping to create guiding examples and producing product critiques. Or, symmetrically, it's to have testers who do programming, business analysis, technical writing, etc. It's to consider "testing" as only one set of skills that needs to be available, in sufficient quantity, somewhere in the team, to service all the tasks that require those skills.
Why would non-specialists be a bad idea? Here are some possible reasons:
Testing skills are hard to learn. If you try to be a tester and a programmer or a tester and a technical writer, you won't have the minimum required skills to be a good enough tester.
Suppose you're the best basketball player in the world and also the best car washer. You should nevertheless pay someone else to wash your car, because you could earn far more in that hour playing basketball than you'd save washing your own car. That's an example of comparative advantage, what Paul Samuelson advanced as the only proposition in the social sciences that's both true and non-trivial. It's a general argument for specialization: it's to the advantage of both you and the person you hire for both of you to specialize. So why shouldn't a person with a knack for testing do only testing, and a person who's comparatively stronger at programming do only programming?
Testing might not be so much a learned skill as an innate aptitude. Some people are just natural critics, and some people just aren't.
All the other tasks that a tester might take on in a project imply sharing ownership of the end product. Many people have trouble finding fault in their own work. So people who mix testing and other tasks will test poorly. It's too much of a conflict of emotional interest.
A tester benefits from a certain amount of useful ignorance. Not knowing implementation details makes it easier for her to think of the kinds of mistakes real users might make.
Argument
Let me address minimum required skills and comparative advantage first. These arguments seem to me strongest in the case of technology-facing product critiques like security testing or usability testing. On a substantial project, I can certainly see the ongoing presence of a specialist security tester. On smaller projects, I can see the occasional presence of a specialist security tester. (The project could probably not justify continual presence.)
As for the exploratory testers that I'm relying on for business-facing product critiques, I'm not sure. So many of the bugs that exploratory testers (and most other testers) find are ones that programmers could prevent if they properly internalized the frequent experience of seeing those bugs. (Exploratory testers - all testers - get good in large part because they pay attention to patterns in the bugs they see.) A good way to internalize bugs is to involve the programmers in not just fixing but also in finding them. And there'll be fewer of the bugs around if the testers are writing some of the code. So this argues against specialist testers.
Put it another way: I don't think that there's any reason most people cannot have the minimum required exploratory testing skills. And the argument from comparative advantage doesn't apply if washing your own car is good basketball practice.
That doesn't say that there won't be specialist exploratory testers who get a team up to speed and sometimes visit for check-ups and to teach new skills. It'd be no different from hiring Bill Wake to do that for refactoring skills, or Esther Derby to do that for retrospectives. But those people aren't "on the team".
I think the same reasoning applies to the left side of the matrix - technology-facing checked examples (unit tests) and business-facing checked examples (customer tests). I teach this stuff to testers. Programmers can do it. Business experts can do it, though few probably have the opportunity to reach the minimum skill level. But that's why business-facing examples are created by a team, not tossed over the wall to one. In fact, team communication is so important that it ought to swamp any of the effects of comparative advantage. (After all, comparative advantage applies just as well to programming skills, and agile projects already make a bet that the comparative advantage of having GUI experts who do only GUIs and database experts who do only databases isn't sufficient.)
Now let's look at innate aptitude. When Jeff Patton showed a group of us an example of usage-centered design, one of the exercises was to create roles for a hypothetical conference paper review system. I was the one who created roles like "reluctant paper reviewer", "overworked conference chair", and "procrastinating author". Someone remarked, "You can tell Brian's a tester". We all had a good chuckle at the way I gravitated to the pessimistic cases.
But the thing is - that's learned behavior. I did it because I was consciously looking for people who would treat the system differently than developers would likely hope (and because I have experience with such systems in all those roles). My hunch is that I'm by nature no more critical than average, but I've learned to become an adequate tester. I think the average programmer can, as well. Certainly the programmers I've met haven't been notable for being Panglossian, for thinking other people's software is the best in this best of all possible worlds.
But it's true an attack dog mentality usually applies to other people's software. It's your own that provokes the conflict of emotional interest. I once had Elisabeth Hendrickson doing some exploratory testing on an app of mine. I was feeling pretty cocky going in - I was sure my technology-facing and business-facing examples were thorough. Of course, she quickly found a serious bug. Not only was I shocked, I also reacted in a defensive way that's familiar to testers. (Not harmfully, I don't think, because we were both aware of it and talked about it.)
And I've later done some exploratory testing of part of the app while under a deadline, realized that I'd done a weak coding job on an "unimportant" part of the user interface, then felt reluctant to push the GUI hard because I really didn't want to have to fix bugs right then.
So this is a real problem. I have hopes that we can reduce it with practices. For example, just as pair programming tends to keep people honest about doing their refactoring, it can help keep people honest about pushing the code hard in exploratory testing. Reluctance to refactor under schedule pressure - leading to accumulating design debt - isn't a problem that will ever go away, but teams have to learn to cope. Perhaps the same is true of emotional conflict of interest.
Related to emotional conflict of interest is the problem of useful ignorance. Imagine it's iteration five. A combined tester/programmer/whatever has been working with the product from the beginning. When exploring it, she's developed habits. If there are two ways to do something, she always chooses one. When she uses the product, she doesn't make many conceptual mistakes, because she knows how the product's supposed to work. Her team's been writing lots of guiding examples - and as they do that, they've been building implicit models of what their "ideal user" is like, and they have increasing trouble imagining other kinds of users.
This is a tough one to get around. Role playing can help. Elisabeth Hendrickson teaches testers to (sometimes) assume extreme personae when testing. What would happen if Bugs Bunny used the product? He's a devious troublemaker, always probing for weakness, always flouting authority. How about Charlie Chaplin in Modern Times: naïve, unprepared, pressured to work ever faster? Another technique that might help is Hans Buwalda's soap opera testing.
It's my hope that such techniques will help, especially when combined with pairing (where each person drives her partner to fits of creativity) in a bullpen setting (where the resulting party atmosphere will spur people on). But I can't help but think that artificial ignorance is no substitute for the real thing.
Staffing
So. Should there be testers on an agile project? Well, it depends. But here's what I would like to see, were I responsible for staffing a really good agile team working on an important product. Think of this as my default approach, the prejudice I would bring to a situation.
I'd look for one or two people with solid testing experience. They should know some programming. They should be good at talking to business experts and quickly picking up a domain. At first, I'd rely on them for making sure that the business-facing examples worked well. (One thing they must do is exercise analyst skills.) Over time, I'd expect them to learn more programming, contribute to the code base, teach programmers, and become mostly indistinguishable from the people who started off as programmers.
Personality would be very important. They have to like novelty, they shouldn't have their identity emotionally wrapped up in their job description, and they have to be comfortable serving other people.
It would be a bonus if these people were good at exploratory testing. But, in any case, the whole team would receive good training in exploratory testing. I'd want outside exploratory testing coaches to visit periodically. They'd both extend the training and do some exploratory testing. That last is part of an ongoing monitoring of the risk that the team is too close to the product to find enough of the bugs.
To the extent that non-functional "ilities" like usability, security, and performance were important to the product, we'd buy that expertise (on-site consultant, or visiting consultant, or a hire for the team). That person would advise on creating the product, train the team, and test the product.
(See Johanna Rothman about why such non-functional requirements ought to be important. I remember Brian Lawrence saying similar things about how Gause&Weinberg-style attributes are key to making a product that stands out.)
I'd make a very strong push to get actual users involved (not just business experts who represent the users). That would probably involve team members going to the users, rather than vice-versa. I'd want the team to think of themselves as anthropologists trying to learn the domain, not just people going to hear about bugs and feature requests.
Are there testers on this team, once it jells? Who cares? - there will be good testing, even though it will be increasingly hard to point at any activity and say, "That. That there. That's testing and nothing but."
Disclaimers
"I'd look for one or two people with experience testing. They should..."
Those ellipses refer to a description that, well, is pretty much a description of me. How much of my reasoning is sound, how much is biased by self-interest? I'll leave that to you, and time, to judge.
"... the whole team would receive good training in exploratory testing."
Elisabeth Hendrickson and I have been talking fitfully all year about creating such training. Again, I think my conclusion - that exploratory testing is central - came first, but you're entitled to think it looks fishy.
A Roadmap for Testing on an Agile Project:
If I were starting up an agile project, here is how I'd plan to do testing. (But this plan is a starting point, not the final answer.)
I assume the programmers will do test-driven design. That's well explained elsewhere (see the Further Reading), so I won't describe it here (much).
Test-driven programmers usually create their tests in what I call "technology-facing" language. That is, their tests talk about programmatic objects, not business concepts. They learn about those business concepts and business needs through conversation with a business expert.
Nothing will replace that conversation, but it's hard for programmers to learn everything they need through conversation, even the frequent conversations that a collocated business expert allows. Too often, the business expert is surprised by the result of a programming task - the programmer left out something that's "obvious". There's no way to eliminate surprises entirely - and agile projects are tailored to make it easy to correct mistakes - but it's sand in the gears of the project if, too often, a programmer's happy "I'm done with the order-taking task!" results in quick disappointment.
It's better if conversations can be conversations about something, about concrete examples. When those concrete examples are executable, we call them "tests" - specifically, I call them "business-facing" tests.
A business-facing test has to be something a business expert can talk about. Most business experts - not all - will find tests written in Java or C# too painful. A safe and satisfactory choice is to use Ward Cunningham's Fit. In it, tests are written as a variety of HTML tables that look not too dissimilar from spreadsheet tables. Fit is ideal for tests that are data-centric, where each test does the same kind of thing to different kinds of data.
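To make the data-centric idea concrete, here is a minimal sketch in Python. It is not Fit and does not use Fit's API; it only imitates the shape of a table in which each row is one example and every row does the same kind of thing to different data. The discount rule, the table contents, and the names `discount` and `check_table` are all invented for illustration.

```python
def discount(order_total):
    """Hypothetical business rule: 5% off orders of 1000 or more (whole dollars)."""
    if order_total >= 1000:
        return order_total * 5 // 100
    return 0


# Each row is one example a business expert could read aloud:
# (order total, expected discount).
DISCOUNT_TABLE = [
    (0, 0),
    (999, 0),
    (1000, 50),
    (2000, 100),
]


def check_table(rule, table):
    """Apply the rule to every row and mark each row 'right' or 'wrong'."""
    report = []
    for given, expected in table:
        actual = rule(given)
        verdict = "right" if actual == expected else "wrong"
        report.append((given, expected, actual, verdict))
    return report


if __name__ == "__main__":
    for row in check_table(discount, DISCOUNT_TABLE):
        print(row)
```

The point of the shape is that adding another example means adding another row, not writing more code, which is what keeps the table readable to someone who doesn't program.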
Some tests are processing-centric, where each test is composed of a different set of processing steps. Tables are more awkward for that. I would augment the official version of Fit with my own StepFixture or Rick Mugridge's forthcoming DoFixture. Such fixtures make processing-centric tasks more compact.
More important than the format of the tests is how they're created: collaboratively. The conversation begins with the business expert describing a new feature. This is often done in general terms, so it's important to train the team to say, "Can you give me an example of that?" whenever there's a hint of vagueness. Those concrete examples will help the programmers understand, and they may well also make the business expert suddenly call to mind previously overlooked business rules.
Those examples turn into business-facing tests, but I think it's important that they not start that way. I don't want to see people huddled around a screen, editing tables. It's too easy to get distracted by the tools and by making things tidy. I want to see people in front of a white board, scribbling examples there. Those examples can later be put into files.
What's the role of the tester in all this? One part, probably, is clerical. Guess who gets to turn scribbling into tables? But the more important parts are as translator and idea generator.
Experts are characteristically bad at explaining why they do what they do. Their knowledge is tacit, and it's hard for them to make it explicit. It's the tester's responsibility to draw them out. Fortunately, many testers are quick studies of a domain - they've had to be. It's also the testers' responsibility to think of important ideas the business experts and programmers might overlook. For example, both business experts and programmers tend to be focused on achieving return on investment, not on loss. So they concentrate more on what wonderful things a new feature could do, less on what it shouldn't do if people make mistakes (error handling) or intentionally misuse it (security). The tester should fill that gap, make sure the tests describe enough of the whole range of possible uses of the feature.
Quite likely, most of the tests will come after the programmer's started writing the feature. I'd want the initial set of tests to be enough for the programmer to estimate accurately enough and get started quickly. The tester can then produce additional tests in parallel. Always, any doubtful cases - "what should the program do here?" - will be reviewed by the business expert. As time goes on and the whole team learns the domain, fewer and fewer cases will be doubtful. The team will get better, in all ways, at making choices that make sense for the business.
The programmer will use the tests in something like the standard test-first way. When working on technology-facing tests, the programmer watches a new test fail, makes a small change to the code, watches the test now pass (and earlier ones continue to pass), cleans up if necessary, and repeats. The same thing will be done with business-facing tests. When I use Fit, I display its results in the browser. My cycle starts by looking at a table element that isn't right (isn't green, in the case of checks; or white, in the case of actions). I flip back to my programming environment and do what's required to make that element right. If the job is simple, I can do it directly. If it's too big a job to bite off at once, I use technology-facing tests to break it into smaller steps. When I think I've got it right, three keystrokes run the tests, take me back to the browser, and refresh the page so I can see my progress. (It's a measure of the importance of rapid feedback that those three keystrokes feel like an annoying slowdown.)
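The technology-facing cycle described above can be sketched in a few lines of Python. This is only an illustration: the name-validation rule and every function name are hypothetical, and a real project would use its unit-testing framework rather than the tiny `check` helper shown here.

```python
def check(example, code_under_test):
    """Run one example against the code; return True if it checks out."""
    try:
        example(code_under_test)
        return True
    except AssertionError:
        return False


# Step 1: write the example first (hypothetical rule: blank names are invalid).
def example_blank_name_is_rejected(is_valid_name):
    assert is_valid_name("Dawn")
    assert not is_valid_name("   ")


# Step 2: the code as it stands before the change. The new example should fail.
def is_valid_name_before(name):
    return True


# Step 3: a small change to the code, just enough to make the example pass.
def is_valid_name_after(name):
    return name.strip() != ""


if __name__ == "__main__":
    print(check(example_blank_name_is_rejected, is_valid_name_before))
    print(check(example_blank_name_is_rejected, is_valid_name_after))
```

Watching the example fail against the old code before changing anything is the step that shows the example actually checks something.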
All this is an important shift. These tests are not, to the programmer, mainly about finding bugs. They're mainly about guiding and pacing the act of programming. The programmer evolves the code toward a form that satisfies the business expert. The tests make the evolution smooth and pleasant. Now, testers are not traditionally in the business of making programmers' lives pleasant, but the good tester on an agile team will strive to present the programmers with just the right sequence of tests to make programming smooth. It's too common for testers to overwhelm the programmers with a barrage of ideas - too many to keep track of.
I'm edging here into the most controversial thing about testing on an agile project: testing is much more explicitly a service role. The tester serves the business expert and, especially, the programmers. Many testers are, to say the least, uncomfortable with this role. They are used to seeing themselves as independent judges. That will affect hiring: with exceptions, I'm more interested in a helpful attitude, conversational skills, and a novelty-seeking personality than testing skills. It's easier to grow the latter than the former.
By that token I am (with an important exception) content with having no testers on the project. If the programmers and business expert can and will develop testing skills, there may be no need for a person wearing a hat that says "Tester" on it. But, in addition to seeing programmers grow toward testers, I'd be happy to see testers grow toward programmers. I'd be ecstatic if someone I hired as a tester began to contribute more and more code to the code base. Agile is about team capability: as long as the team gets the work done, I don't care who does it.
Now for various exceptions.
Automated business-facing tests are the engine of my agile project, but I'm leery of automating everything. In particular, I will absolutely resist implementing those tests through the GUI. They are written in business terms, not GUI terms. They are about business value, not about buttons. It makes no sense to translate business terms into button presses, then have the GUI translate button presses into calls into the business logic. Instead, the tests should access the business logic directly, bypassing the GUI. The prevalence of automated GUI tests is a historical accident: testers used to have to test through the GUI because the programmers were not motivated to give them any other way to test. But agile programmers are, so our tests don't have to go through the traditional nonsense of trying (often fruitlessly) to make our tests immune to GUI changes while not making them immune to finding bugs.
But what about the GUI? How is it tested? First, realize that testing below the GUI will keep business logic out of the GUI. If there's less code there, less can go wrong. Moreover, a thin GUI offers less opportunity for "ripple effects" than does code deeply buried in the business logic. Selected technology-facing tests (for example, using JsUnit for JavaScript input checking) plus manual tests of both the functionality and usability of GUI changes should suffice. We shouldn't need full automation.
I would expect arguments about this. I'd also expect to win them, at least at the start. But if my simple practice really did let too many bugs past, I'd relent.
Another manual testing practice is derived from my passion for concreteness. Few people buy cars without test driving them. That's because actually using something is different than reading about it or talking about it. Both of those acts are at a level of remove that loses information and narrows perception. An automated test is at the same level. So I would introduce a form of semi-structured manual exploratory testing into the project.
I'd do that by piggybacking on my fondness for end-of-iteration rituals. Some agile projects have a ritual of demonstrating an iteration's results to anyone they can drag into the project room. After that, it'd be natural for people to pair off (team members with team members, team members with observers) to try the product out in ways interesting to the business expert. One pair might look at what the web site feels like over dialup lines; another might emulate a rushed and harried customer-service person who's making mistakes right and left. They're looking not just for bugs but also especially for new ideas about the product - changes that can be made in later iterations that would make it better even though it's not wrong now. They're throwing up potential work for the business expert to decide about.
I would make a final exception for types of testing that are both highly specialized and don't lend themselves to a test-first style. Some of these types of testing are security testing, performance testing, stress/load testing, configuration testing, and usability testing. I don't now see any reason for these to be done differently on agile projects.
Further reading
For test-driven design, I'd start with Kent Beck's Test-Driven Development: By Example and also read either David Astels' Test-Driven Development: A Practical Guide or Hunt and Thomas's Pragmatic Unit Testing. See also testdriven.com.
What is Agile Testing?
What is Agile Testing? The question came up on the Agile Testing mailing list, and one participant, among others, attempted to define it:
When I use the term 'agile testing' I'm talking about a set of good practices which has helped me and my teams deliver higher quality software, whether or not we were part of an agile team. We know that people, not tools or methodologies, are what makes projects succeed. Anything we can do to allow ourselves to do our best work fits my definition of 'agile'.
'Agile testing' practices come about by applying agile values and principles to testing. For example, communication is essential, and by collaborating with our customers up front to write test cases, we can improve communication. By driving coding with customer-facing tests, we're more likely to deliver what the customer wanted.
Some other examples: Automating our regression tests allows us time to do exploratory testing, so we strive for 100% regression test automation (I've never achieved anything close to this on a team who isn't automating unit tests, but we can still automate some testing). Exploratory testing is inherently agile so we work at getting better at that. Collaborating helps communication, so testers should get together with programmers and with customers from the start of a project, as well as pairing with each other.
Simple design is important in designing and automating tests. Refactoring applies to any type of coding including test scripts. Retrospectives are the single most important practice I can think of for any team, because you can't improve without identifying and focusing on what you need to improve. Dividing work into small chunks, having a stable build at all times (because we have continuous integration and collective code ownership), releasing business value frequently, all that allows us to do a better job of testing.
None of this is really new, and a lot of it I was doing years before I heard of XP, Scrum, agile, etc.
I haven't come up with a good elevator speech for agile testing so I have to resort to examples. I've heard people come up with some great one-liners but then I can't remember them. I think the Agile Manifesto works from a testing point of view as well as from a coding point of view.
I've been told that I spend too much time saying I don't know where Agile testing will be in five years, and not enough pointing in some direction and saying "But let's see if maybe we can find it over there". The people who say so are probably right. So this is the start of a series of notes in which I'll do just that.
I'm going to start by restating a pair of distinctions that I think are getting to be fairly common.
If you hear someone talking about tests in Agile projects, it's useful to ask if those tests are business facing or technology facing. A business-facing test is one you could describe to a business expert in terms that would (or should) interest her. If you were talking on the phone and wanted to describe what questions the test answers, you would use words drawn from the business domain: "If you withdraw more money than you have in your account, does the system automatically extend you a loan for the difference?"
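As a rough illustration, the overdraft question could be turned into a checked example like the Python sketch below. The `Account` class is hypothetical, and the sketch assumes the business expert's answer is "yes, extend a loan for the difference"; a different answer would mean different assertions, not a different kind of test.

```python
class Account:
    """Hypothetical account. Assumes overdrafts are covered by an automatic loan."""

    def __init__(self, balance):
        self.balance = balance
        self.loan = 0

    def withdraw(self, amount):
        if amount > self.balance:
            # Cover the shortfall with a loan for the difference.
            self.loan += amount - self.balance
            self.balance = 0
        else:
            self.balance -= amount


def example_overdraft_becomes_a_loan():
    # "If you withdraw more money than you have in your account..."
    account = Account(balance=100)
    account.withdraw(130)
    # "...does the system automatically extend you a loan for the difference?"
    assert account.balance == 0
    assert account.loan == 30


def example_ordinary_withdrawal_needs_no_loan():
    account = Account(balance=100)
    account.withdraw(40)
    assert account.balance == 60
    assert account.loan == 0
```

Every name in the examples (account, balance, withdraw, loan) comes from the business domain, which is what makes the test something a business expert can discuss.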
A technology-facing test is one you describe with words drawn from the domain of the programmers: "Different browsers implement JavaScript differently, so we test whether our product works with the most important ones." Or: "PersistentUser#delete should not complain if the user record doesn't exist."
(These categories have fuzzy boundaries, as so many do. For example, the choice of which browser configurations to test is in part a business decision.)
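For contrast with a business-facing example, here is a minimal Python sketch of the "PersistentUser#delete" case. The class below is a hypothetical stand-in that keeps records in a plain dict rather than a database; only the behavior being checked (deleting a missing record is a quiet no-op) comes from the text.

```python
class PersistentUser:
    """Hypothetical stand-in: 'persists' users in a dict supplied by the caller."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # dict standing in for a database table

    def save(self):
        self.store[self.name] = {"name": self.name}

    def delete(self):
        # pop() with a default does nothing when the record is already gone.
        self.store.pop(self.name, None)


def example_delete_removes_a_saved_record():
    store = {}
    user = PersistentUser("dawn", store)
    user.save()
    user.delete()
    assert "dawn" not in store


def example_delete_does_not_complain_when_record_is_missing():
    store = {}
    PersistentUser("dawn", store).delete()  # must not raise
    assert store == {}
```

Nothing here would interest a business expert: the vocabulary is records, stores, and exceptions, which is what marks it as technology facing.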
It's also useful to ask people who talk about tests whether they want the tests to support programming or critique the product. By "support programming", I mean that the programmers use them as an integral part of the act of programming. For example, some programmers write a test to tell them what code to write next. By writing that code, they change some of the behavior of the program. Running the test after the change reassures them that they changed what they wanted. Running all the other tests reassures them that they didn't change behavior they intended to leave alone.
Tests that critique the product are not focused on the act of programming. Instead, they look at a finished product with the intent of discovering inadequacies.
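Put those two distinctions together and you get a two-by-two matrix. Its columns are "support programming" (left) and "critique product" (right); its rows are "business facing" (top) and "technology facing" (bottom).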
Agile testing directions: tests and examples:
'It all depends on what you mean by home.'
[...]
'Home is the place where, when you have to go there,
They have to take you in.'
'I should have called it
Something you somehow haven't to deserve.'
-- Robert Frost, "The Death of the Hired Man"
Consider the left to right division. Some testing on agile projects, I say, is done to critique a product; other testing, to support programming. But the meaning and the connotations of the word "testing" differ wildly in the two cases.
When it comes to supporting programming, tests are mainly about preparing and reassuring. You write a test to help you clarify your thinking about a problem. You use it as an illustrative example of the way the code ought to behave. It is, fortunately, an example that actively checks the code, which is reassuring. These tests also find bugs, but that is a secondary purpose.
On the other side of the division, tests are about uncovering prior mistakes and omissions. The primary meaning is about bugs. There are secondary meanings, but that primary meaning is very primary. (Many testers, especially the best ones, have their identities wrapped up in the connotations of those words.)
I want to try an experiment. What if we stopped using the words "testing" and "tests" for what happens in the left side of the matrix? What if we called them "checked examples" instead?
Imagine two XP programmers sitting down to code. They'll start by constructing an incisive example of what the code needs to do next. They'll check that it doesn't do it yet. (If it does, something's surely peculiar.) They'll make the code do it. They'll check that the example is now true, and that all the other examples remain good examples of what the code does. Then they'll move on to an example of the next thing the code should do.
Is there a point to that switch, or is it just a meaningless textual substitution? Well, you do experiments to find these things out. Try using "example" occasionally, often enough that it stops sounding completely weird. Now: Does it change your perspective at all when you sit down to code? Does it make a difference to walk up to a customer and ask for an example rather than a test? Add on some adjectives: what do motivating, telling, or insightful examples look like, and how are they different from powerful tests? ("Powerful" being the typical adjective-of-praise attached to a test.) Is it easier to see what a tester does on an XP project when everyone else is making examples, when no one else is making tests?
Credit: Ward Cunningham added the adjective "checked". I was originally calling them either "guiding" or "coaching" examples.
Agile testing directions: technology-facing programmer support
As an aid to conversation and thought, I've been breaking one topic, "testing in agile projects," into four distinct topics. Today I'm writing about how we can use technology-facing examples to support programming.
One thing that fits here is test-driven development, as covered in Kent Beck's book of the same name, David Astels' more recent book, and forthcoming books by Phlip, J.B. Rainsberger, and who knows who else. I think that test-driven development (what I would now call example-driven development) is on solid ground. It's not a mainstream technique, but it seems to be progressing nicely toward that. To use Geoffrey Moore's term, I think it's well on its way to crossing the chasm.
(Note: in this posting, when I talk of examples, I mean examples of how coders will use the thing-under-development. In XP terms, unit tests. In my terms, technology-facing examples.)
Put another way, example-driven development has moved from being what Thomas Kuhn called "revolutionary science" to what he called "normal science". In a normal science, people expand the range of applicability of a particular approach. So we now have people applying EDD (sic) to GUIs, figuring out how it works with legacy code, discussing good ways to use mock objects, having long discussions about techniques for handling private methods, and so forth.
Normal science is not the romantic side of science; it's merely where ideas turn into impact on the world. So I'm glad to see we're there with EDD. But normality also means that my ideas for what I want to work on or see others work on... well, they're not very momentous.
I hope future years will see more people with a mixture of testing and programming skills being drawn to Agile projects. Those people will likely neither be as good testers as pure testers, nor as good programmers as pure programmers, but that's OK if you believe, as I do, that Agile projects do and should value generalists over specialists.
I'm one such mixture. I've done limited pair programming with "pure programmers". When I have, I've noticed there's a real tension between the desire to maintain the pacing and objectives of the programming and the desire to make sure lots of test ideas get taken into account. I find myself oscillating between being in "programmer mode" and pulling the pair back to take stock of the big picture. With experience, we should gain a better idea of how to manage that process, and of what kinds of "testing thinking" are appropriate during coding.
There might also be testers on the team who do not act as programmers. Nevertheless, some of them do pair with programmers to talk about the unit tests (how the programmers checked the code). The programmers learn what kinds of bugs to avoid, and the testers learn about what they're testing. For some reason, Calgary, Canada, is a hotbed of such activity, and I look to Jonathan Kohl, Janet Gregory, and others to teach us how to do it well.
I want to emphasize that this is all about people. Testers traditionally have an arms-length (or oceans-length) relationship to programmers. For the programmer-support half of the matrix, that relationship is, I believe, inappropriate.
I've been using the phrase "checked examples" for programmer support tests. We can split that idea in two. There are new examples that guide decisions about what to do next. And there are automated examples that serve as change detectors to see whether what you just did was what you expected to do.
The common habit is that the change detectors are merely retained code-guiding examples. (You make your unit test suite by saving, one by one, the tests you write as you code.) That's not a logical necessity. I'd like to develop some lore about when to do something else.
For example, consider this maintenance scenario: you develop some code example-first. A month later, someone adds a new example and changes the code to match. Many prior examples for that hunk of code become "bad examples" (the tests fail, but because they're now wrong, not because the code is). The tendency is to fix those examples so that they're essentially the same. What I mean by that is that the left sequence of events in the table below is expected to yield the same tests as the right. (Read the left column, then the right.)
Left sequence | Right sequence
Example foo written | Example bar written
Code written to match foo | Code written to match bar
Example bar written (foo invalidated) | Example better-foo written (bar is still a good example)
Code changed to match bar - oops, now foo doesn't check out | Code changed to match better-foo (and bar continues to check out)
Update foo to be better-foo |
That is, newly broken examples are rewritten to match an ideal sequence of development in which no example ever needed to be rewritten. But why? In the left column above, example better-foo is never used to drive development - it's only for checking. What's optimal for driving development might not be optimal for checking.
Let me be concrete. Suppose that software systems develop shearing layers, interfaces that naturally don't change much. For maintainability, it might make sense to migrate broken examples to shearing layers when fixing them. Instead of being an example about a particular method in a particular class, we now have an example of a use of an entire subsystem. That can be bad - think about debugging - but it reduces the maintenance burden and could even provide the benefit of thorough documentation of the subsystem's behavior.
I'm hoping that people who distinguish the two roles - guiding the near future and rechecking the past - will discover productive lore. For example, when might it be useful to write technology-facing change detectors that never had anything to do with guiding programming?
I said above that test-driven development is "one thing that fits" today's topic. What else fits? I don't know. And is EDD the best fit? (Might there be a revolution in the offing?) I don't know that either - I'll rely on iconoclasts to figure that out. I'm very interested in listening to them.
Agile testing directions: Testers on agile projects:
Should there be testers on agile projects?
First: what's the alternative? It is to have non-specialists (programmers, business experts, technical writers, etc.) perform the activities I've identified in this series: helping to create guiding examples and producing product critiques. Or, symmetrically, it's to have testers who do programming, business analysis, technical writing, etc. It's to consider "testing" as only one set of skills that needs to be available, in sufficient quantity, somewhere in the team, to service all the tasks that require those skills.
Why would non-specialists be a bad idea? Here are some possible reasons:
Testing skills are hard to learn. If you try to be a tester and a programmer or a tester and a technical writer, you won't have the minimum required skills to be a good enough tester.
Suppose you're the best basketball player in the world and also the best car washer. You should nevertheless pay someone else to wash your car, because you could earn far more in that hour playing basketball than you'd save washing your own car. That's an example of comparative advantage, what Paul Samuelson advanced as the only proposition in the social sciences that's both true and non-trivial. It's a general argument for specialization: it's to the advantage of both you and the person you hire for both of you to specialize. So why shouldn't a person with a knack for testing do only testing, and a person who's comparatively stronger at programming do only programming?
Testing might not be so much a learned skill as an innate aptitude. Some people are just natural critics, and some people just aren't.
All the other tasks that a tester might take on in a project imply sharing ownership of the end product. Many people have trouble finding fault in their own work. So people who mix testing and other tasks will test poorly. It's too much of a conflict of emotional interest.
A tester benefits from a certain amount of useful ignorance. Not knowing implementation details makes it easier for her to think of the kinds of mistakes real users might make.
Argument
Let me address minimum required skills and comparative advantage first. These arguments seem to me strongest in the case of technology-facing product critiques like security testing or usability testing. On a substantial project, I can certainly see the ongoing presence of a specialist security tester. On smaller projects, I can see the occasional presence of a specialist security tester. (The project could probably not justify continual presence.)
As for the exploratory testers that I'm relying on for business-facing product critiques, I'm not sure. So many of the bugs that exploratory testers (and most other testers) find are ones that programmers could prevent if they properly internalized the frequent experience of seeing those bugs. (Exploratory testers - all testers - get good in large part because they pay attention to patterns in the bugs they see.) A good way to internalize bugs is to involve the programmers in not just fixing but also in finding them. And there'll be fewer of the bugs around if the testers are writing some of the code. So this argues against specialist testers.
Put it another way: I don't think that there's any reason most people cannot have the minimum required exploratory testing skills. And the argument from comparative advantage doesn't apply if washing your car is good basketball practice.
That doesn't say that there won't be specialist exploratory testers who get a team up to speed and sometimes visit for check-ups and to teach new skills. It'd be no different from hiring Bill Wake to do that for refactoring skills, or Esther Derby to do that for retrospectives. But those people aren't "on the team".
I think the same reasoning applies to the left side of the matrix - technology-facing checked examples (unit tests) and business-facing checked examples (customer tests). I teach this stuff to testers. Programmers can do it. Business experts can do it, though few probably have the opportunity to reach the minimum skill level. But that's why business-facing examples are created by a team, not tossed over the wall to one. In fact, team communication is so important that it ought to swamp any of the effects of comparative advantage. (After all, comparative advantage applies just as well to programming skills, and agile projects already make a bet that the comparative advantage of having GUI experts who do only GUIs and database experts who do only databases isn't sufficient.)
Now let's look at innate aptitude. When Jeff Patton showed a group of us an example of usage-centered design, one of the exercises was to create roles for a hypothetical conference paper review system. I was the one who created roles like "reluctant paper reviewer", "overworked conference chair", and "procrastinating author". Someone remarked, "You can tell Brian's a tester". We all had a good chuckle at the way I gravitated to the pessimistic cases.
But the thing is - that's learned behavior. I did it because I was consciously looking for people who would treat the system differently than developers would likely hope (and because I have experience with such systems in all those roles). My hunch is that I'm by nature no more critical than average, but I've learned to become an adequate tester. I think the average programmer can, as well. Certainly the programmers I've met haven't been notable for being panglossian, for thinking other people's software is the best in this best of all possible worlds.
But it's true an attack dog mentality usually applies to other people's software. It's your own that provokes the conflict of emotional interest. I once had Elisabeth Hendrickson doing some exploratory testing on an app of mine. I was feeling pretty cocky going in - I was sure my technology-facing and business-facing examples were thorough. Of course, she quickly found a serious bug. Not only was I shocked, I also reacted in a defensive way that's familiar to testers. (Not harmfully, I don't think, because we were both aware of it and talked about it.)
And I've later done some exploratory testing of part of the app while under a deadline, realized that I'd done a weak coding job on an "unimportant" part of the user interface, then felt reluctant to push the GUI hard because I really didn't want to have to fix bugs right then.
So this is a real problem. I have hopes that we can reduce it with practices. For example, just as pair programming tends to keep people honest about doing their refactoring, it can help keep people honest about pushing the code hard in exploratory testing. Reluctance to refactor under schedule pressure - leading to accumulating design debt - isn't a problem that will ever go away, but teams have to learn to cope. Perhaps the same is true of emotional conflict of interest.
Related to emotional conflict of interest is the problem of useful ignorance. Imagine it's iteration five. A combined tester/programmer/whatever has been working with the product from the beginning. When exploring it, she's developed habits. If there are two ways to do something, she always chooses one. When she uses the product, she doesn't make many conceptual mistakes, because she knows how the product's supposed to work. Her team's been writing lots of guiding examples - and as they do that, they've been building implicit models of what their "ideal user" is like, and they have increasing trouble imagining other kinds of users.
This is a tough one to get around. Role playing can help. Elisabeth Hendrickson teaches testers to (sometimes) assume extreme personae when testing. What would happen if Bugs Bunny used the product? He's a devious troublemaker, always probing for weakness, always flouting authority. How about Charlie Chaplin in Modern Times: naïve, unprepared, pressured to work ever faster? Another technique that might help is Hans Buwalda's soap opera testing.
It's my hope that such techniques will help, especially when combined with pairing (where each person drives her partner to fits of creativity) in a bullpen setting (where the resulting party atmosphere will spur people on). But I can't help but think that artificial ignorance is no substitute for the real thing.
Staffing
So. Should there be testers on an agile project? Well, it depends. But here's what I would like to see, were I responsible for staffing a really good agile team working on an important product. Think of this as my default approach, the prejudice I would bring to a situation.
I'd look for one or two people with solid testing experience. They should know some programming. They should be good at talking to business experts and quickly picking up a domain. At first, I'd rely on them for making sure that the business-facing examples worked well. (One thing they must do is exercise analyst skills.) Over time, I'd expect them to learn more programming, contribute to the code base, teach programmers, and become mostly indistinguishable from the people who started off as programmers.
Personality would be very important. They have to like novelty, they shouldn't have their identity emotionally wrapped up in their job description, and they have to be comfortable serving other people.
It would be a bonus if these people were good at exploratory testing. But, in any case, the whole team would receive good training in exploratory testing. I'd want outside exploratory testing coaches to visit periodically. They'd both extend the training and do some exploratory testing. That last is part of an ongoing monitoring of the risk that the team is too close to the product to find enough of the bugs.
To the extent that non-functional "ilities" like usability, security, and performance were important to the product, we'd buy that expertise (on-site consultant, or visiting consultant, or a hire for the team). That person would advise on creating the product, train the team, and test the product.
(See Johanna Rothman about why such non-functional requirements ought to be important. I remember Brian Lawrence saying similar things about how Gause & Weinberg-style attributes are key to making a product that stands out.)
I'd make a very strong push to get actual users involved (not just business experts who represent the users). That would probably involve team members going to the users, rather than vice-versa. I'd want the team to think of themselves as anthropologists trying to learn the domain, not just people going to hear about bugs and feature requests.
Are there testers on this team, once it jells? Who cares? - there will be good testing, even though it will be increasingly hard to point at any activity and say, "That. That there. That's testing and nothing but."
Disclaimers
"I'd look for one or two people with solid testing experience. They should..."
Those ellipses refer to a description that, well, is pretty much a description of me. How much of my reasoning is sound, how much is biased by self-interest? I'll leave that to you, and time, to judge.
"... the whole team would receive good training in exploratory testing."
Elisabeth Hendrickson and I have been talking fitfully all year about creating such training. Again, I think my conclusion - that exploratory testing is central - came first, but you're entitled to think it looks fishy.
A Roadmap for Testing on an Agile Project:
If I were starting up an agile project, here is how I'd plan to do testing. (But this plan is a starting point, not the final answer.)
I assume the programmers will do test-driven design. That's well explained elsewhere (see the Further Reading), so I won't describe it here (much).
Test-driven programmers usually create their tests in what I call "technology-facing" language. That is, their tests talk about programmatic objects, not business concepts. They learn about those business concepts and business needs through conversation with a business expert.
Nothing will replace that conversation, but it's hard for programmers to learn everything they need through conversation, even the frequent conversations that a collocated business expert allows. Too often, the business expert is surprised by the result of a programming task - the programmer left out something that's "obvious". There's no way to eliminate surprises entirely - and agile projects are tailored to make it easy to correct mistakes - but it's sand in the gears of the project if, too often, a programmer's happy "I'm done with the order-taking task!" results in quick disappointment.
It's better if conversations can be conversations about something, about concrete examples. When those concrete examples are executable, we call them "tests" - specifically, I call them "business-facing" tests.
A business-facing test has to be something a business expert can talk about. Most business experts - not all - will find tests written in Java or C# too painful. A safe and satisfactory choice is to use Ward Cunningham's Fit. In it, tests are written as a variety of HTML tables that look not too dissimilar from spreadsheet tables. Fit is ideal for tests that are data-centric, where each test does the same kind of thing to different kinds of data.
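To make "data-centric" concrete, here is a minimal sketch of the idea in Python. It does not use Fit or its API; the business rule (`discount`), the table contents, and the `check_table` helper are all hypothetical, invented only to show one kind of check applied to many rows of data.

```python
# Hypothetical business rule, invented for illustration only:
# orders of 1000 dollars or more earn a 5% discount, others earn none.
def discount(order_total):
    return order_total * 5 // 100 if order_total >= 1000 else 0


# The "table": one header row, then one row per concrete example.
# Each row reads "given this order total, expect this discount".
TABLE = [
    ("order total", "discount()"),
    (0, 0),
    (999, 0),
    (1000, 50),
    (2000, 100),
]


def check_table(table, rule):
    """Run the rule on every data row; return (given, expected, actual, passed)."""
    results = []
    for given, expected in table[1:]:  # skip the header row
        actual = rule(given)
        results.append((given, expected, actual, actual == expected))
    return results


if __name__ == "__main__":
    for given, expected, actual, passed in check_table(TABLE, discount):
        verdict = "right" if passed else "WRONG"
        print(f"{given:>6} {expected:>6} {actual:>6}  {verdict}")
```

The point of the shape is that a business expert can add a row without touching the checking code. Every row does the same thing to different data.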
Some tests are processing-centric, where each test is composed of a different set of processing steps. Tables are more awkward for that. I would augment the official version of Fit with my own StepFixture or Rick Mugridge's forthcoming DoFixture. Such fixtures make processing-centric tasks more compact.
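A processing-centric test, by contrast, reads as a sequence of steps. The following Python sketch is not StepFixture or DoFixture and does not imitate their APIs; the `Cart` class, the step format, and `run_steps` are assumptions made up for illustration.

```python
class Cart:
    """Hypothetical business object, invented for this sketch."""

    def __init__(self):
        self.items = {}

    def add(self, name, price):
        self.items[name] = price

    def remove(self, name):
        self.items.pop(name, None)

    def total(self):
        return sum(self.items.values())


# Each step is (verb, name, optional arguments...). "check" steps compare
# the result of calling a method against an expected value; every other
# verb is an action performed on the cart.
STEPS = [
    ("add", "book", 30),
    ("add", "pen", 2),
    ("check", "total", 32),
    ("remove", "pen"),
    ("check", "total", 30),
]


def run_steps(steps, target):
    """Apply each step in order; return descriptions of any failed checks."""
    failures = []
    for verb, name, *args in steps:
        if verb == "check":
            expected = args[0]
            actual = getattr(target, name)()
            if actual != expected:
                failures.append(f"{name}: expected {expected}, got {actual}")
        else:
            getattr(target, verb)(name, *args)
    return failures


if __name__ == "__main__":
    print(run_steps(STEPS, Cart()) or "all checks passed")
```

Each test here is a different script of actions, which is why a fixed-column table fits it less comfortably than it fits the data-centric case.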
More important than the format of the tests is how they're created: collaboratively. The conversation begins with the business expert describing a new feature. This is often done in general terms, so it's important to train the team to say, "Can you give me an example of that?" whenever there's a hint of vagueness. Those concrete examples will help the programmers understand, and they may well also make the business expert suddenly call to mind previously overlooked business rules.
Those examples turn into business-facing tests, but I think it's important that they not start that way. I don't want to see people huddled around a screen, editing tables. It's too easy to get distracted by the tools and by making things tidy. I want to see people in front of a white board, scribbling examples there. Those examples can later be put into files.
What's the role of the tester in all this? One part, probably, is clerical. Guess who gets to turn scribbling into tables? But the more important parts are as translator and idea generator.
Experts are characteristically bad at explaining why they do what they do. Their knowledge is tacit, and it's hard for them to make it explicit. It's the tester's responsibility to draw them out. Fortunately, many testers are quick studies of a domain - they've had to be. It's also the testers' responsibility to think of important ideas the business experts and programmers might overlook. For example, both business experts and programmers tend to be focused on achieving return on investment, not on loss. So they concentrate more on what wonderful things a new feature could do, less on what it shouldn't do if people make mistakes (error handling) or intentionally misuse it (security). The tester should fill that gap, make sure the tests describe enough of the whole range of possible uses of the feature.
Quite likely, most of the tests will come after the programmer's started writing the feature. I'd want the initial set of tests to be enough for the programmer to estimate accurately enough and get started quickly. The tester can then produce additional tests in parallel. Always, any doubtful cases - "what should the program do here?" - will be reviewed by the business expert. As time goes on and the whole team learns the domain, fewer and fewer cases will be doubtful. The team will get better, in all ways, at making choices that make sense for the business.
The programmer will use the tests in something like the standard test-first way. When working on technology-facing tests, the programmer watches a new test fail, makes a small change to the code, watches the test now pass (and earlier ones continue to pass), cleans up if necessary, and repeats. The same thing will be done with business-facing tests. When I use Fit, I display its results in the browser. My cycle starts by looking at a table element that isn't right (isn't green, in the case of checks; or white, in the case of actions). I flip back to my programming environment and do what's required to make that element right. If the job is simple, I can do it directly. If it's too big a job to bite off at once, I use technology-facing tests to break it into smaller steps. When I think I've got it right, three keystrokes run the tests, take me back to the browser, and refresh the page so I can see my progress. (It's a measure of the importance of rapid feedback that those three keystrokes feel like an annoying slowdown.)
All this is an important shift. These tests are not, to the programmer, mainly about finding bugs. They're mainly about guiding and pacing the act of programming. The programmer evolves the code toward a form that satisfies the business expert. The tests make the evolution smooth and pleasant. Now, testers are not traditionally in the business of making programmers' lives pleasant, but the good tester on an agile team will strive to present the programmers with just the right sequence of tests to make programming smooth. It's too common for testers to overwhelm the programmers with a barrage of ideas - too many to keep track of.
I'm edging here into the most controversial thing about testing on an agile project: testing is much more explicitly a service role. The tester serves the business expert and, especially, the programmers. Many testers are, to say the least, uncomfortable with this role. They are used to seeing themselves as independent judges. That will affect hiring: with exceptions, I'm more interested in a helpful attitude, conversational skills, and a novelty-seeking personality than testing skills. It's easier to grow the latter than the former.
By that token I am (with an important exception) content with having no testers on the project. If the programmers and business expert can and will develop testing skills, there may be no need for a person wearing a hat that says "Tester" on it. But, in addition to seeing programmers grow toward testers, I'd be happy to see testers grow toward programmers. I'd be ecstatic if someone I hired as a tester began to contribute more and more code to the code base. Agile is about team capability: as long as the team gets the work done, I don't care who does it.
Now for various exceptions.
Automated business-facing tests are the engine of my agile project, but I'm leery of automating everything. In particular, I will absolutely resist implementing those tests through the GUI. They are written in business terms, not GUI terms. They are about business value, not about buttons. It makes no sense to translate business terms into button presses, then have the GUI translate button presses into calls into the business logic. Instead, the tests should access the business logic directly, bypassing the GUI. The prevalence of automated GUI tests is a historical accident: testers used to have to test through the GUI because the programmers were not motivated to give them any other way to test. But agile programmers are, so our tests don't have to go through the traditional nonsense of trying (often fruitlessly) to make our tests immune to GUI changes while not making them immune to finding bugs.
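As a rough illustration of testing below the GUI, here is a Python sketch. Everything in it is hypothetical: `place_order` stands in for business logic, `on_order_button_clicked` stands in for a thin GUI handler, and the stock rule is invented. The tests call the business logic directly and never touch the handler.

```python
def place_order(inventory, item, quantity):
    """Hypothetical business rule: reject orders that exceed stock."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    if inventory.get(item, 0) < quantity:
        return "rejected: insufficient stock"
    inventory[item] -= quantity
    return "accepted"


def on_order_button_clicked(form_fields, inventory):
    """Thin GUI layer: converts widget text to values and delegates."""
    return place_order(inventory, form_fields["item"], int(form_fields["quantity"]))


# Business-facing tests talk about orders and stock, not about buttons.
def test_order_reduces_stock():
    inventory = {"widget": 5}
    assert place_order(inventory, "widget", 2) == "accepted"
    assert inventory["widget"] == 3


def test_order_beyond_stock_is_rejected():
    inventory = {"widget": 1}
    assert place_order(inventory, "widget", 2) == "rejected: insufficient stock"
    assert inventory["widget"] == 1


if __name__ == "__main__":
    test_order_reduces_stock()
    test_order_beyond_stock_is_rejected()
    print("business-logic tests passed")
```

Because the handler only parses and delegates, a change to the layout of the form leaves these tests untouched.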
But what about the GUI? How is it tested? First, realize that testing below the GUI will keep business logic out of the GUI. If there's less code there, less can go wrong. Moreover, a thin GUI offers less opportunity for "ripple effects" than does code deeply buried in the business logic. Selected technology-facing tests (for example, using jsunit for javascript input checking) plus manual tests of both the functionality and usability of GUI changes should suffice. We shouldn't need full automation.
I would expect arguments about this. I'd also expect to win them, at least at the start. But if my simple practice really did let too many bugs past, I'd relent.
Another manual testing practice is derived from my passion for concreteness. Few people buy cars without test driving them. That's because actually using something is different than reading about it or talking about it. Both of those acts are at a level of remove that loses information and narrows perception. An automated test is at the same level. So I would introduce a form of semi-structured manual exploratory testing into the project.
I'd do that by piggybacking on my fondness for end-of-iteration rituals. Some agile projects have a ritual of demonstrating an iteration's results to anyone they can drag into the project room. After that, it'd be natural for people to pair off (team members with team members, team members with observers) to try the product out in ways interesting to the business expert. One pair might look at what the web site feels like over dialup lines; another might emulate a rushed and harried customer-service person who's making mistakes right and left. They're looking not just for bugs but also especially for new ideas about the product - changes that can be made in later iterations that would make it better even though it's not wrong now. They're throwing up potential work for the business expert to decide about.
I would make a final exception for types of testing that are both highly specialized and don't lend themselves to a test-first style. Some of these types of testing are security testing, performance testing, stress/load testing, configuration testing, and usability testing. I don't now see any reason for these to be done differently on agile projects.
Further reading
For test-driven design, I'd start with Kent Beck's Test-Driven Development: By Example and also read either David Astels' Test-Driven Development: A Practical Guide or Hunt and Thomas's Pragmatic Unit Testing. See also testdriven.com.
Every Bug Deserves Its Own URL
We use TestTrack Pro for bug tracking at my new job. Mostly i find it to be a ho-hum tool. I've seen better, and i've seen worse. But there is one thing about it that really bugs me. I want to be able to reference bugs -- in emails, in wiki pages, in documents -- and i want to make the links clickable: so anyone reading can click to see the actual bug report, along with any updates that have been made since. But TestTrack doesn't seem to support this.
I've gotten used to doing this kind of thing using Bugzilla and Jira and the tracker included in the SourceForge package. They all make it easy to access a bug report via a URL that embeds the bug number. See for yourself with these examples using Bugzilla, Jira, and the SourceForge tracker. When compiling bug lists, whether to group related bugs or to pull together a release plan, i've gotten in the habit of including the URLs so that readers can immediately look up the details themselves. With time, the plan itself becomes a simple tool for tracking progress.
But i can't do this with TestTrack Pro, and it's driving me crazy. I have both a Windows client and the web client for it. The problem with the web client seems to be that it uses a POST to retrieve a particular bug record, and that means that the request can't be summarized in just a URL. The other tools don't have this problem because they use a GET instead. I've been told that TestTrack violates web standards that stipulate that POSTs should only be used for transactions, whereas GETs should be used for queries. All i know is that this is driving me nuts. This issue alone is making me think that we'd be better off switching to another tool.
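To show why a GET makes this easy, here is a small Python sketch that builds clickable bug links from bug numbers. The hostnames are placeholders, and the URL shapes (`show_bug.cgi?id=...` for Bugzilla, `browse/KEY` for Jira) are the ones those tools commonly use; a particular installation may differ, so treat them as assumptions. The helper names are invented.

```python
from urllib.parse import urlencode, urljoin


def bugzilla_url(base, bug_id):
    """Build a Bugzilla-style link; the whole query fits in the URL."""
    return urljoin(base, "show_bug.cgi") + "?" + urlencode({"id": bug_id})


def jira_url(base, issue_key):
    """Build a Jira-style link from an issue key such as PROJ-42."""
    return urljoin(base, "browse/" + issue_key)


def release_plan(base, bug_ids):
    """Turn a list of bug numbers into lines a reader can click through."""
    return [f"Bug {bug_id}: {bugzilla_url(base, bug_id)}" for bug_id in bug_ids]


if __name__ == "__main__":
    # bugs.example.com is a placeholder host, not a real tracker.
    for line in release_plan("https://bugs.example.com/", [1234, 1240]):
        print(line)
```

Nothing like this works when the record can only be fetched with a POST, because the bug number travels in the request body rather than in anything that can be pasted into an email or wiki page.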
I want to make it easy to get people to look into the bug database. I want to get managers looking in there. I want to get everyone on the team looking at it, and fixing the bugs and updating the reports. I want it to be relevant and accurate. All too often documentation that is hard to reference and share and point people to is documentation that goes unread and unupdated. And if no one is looking at it, what's the point of taking care in keeping it up to date?
Am i crazy? What do you think? Does your tool allow you to reference a bug report using a URL? Do you use this "feature"? Put your details regarding the tools you use in comments, and i'll compile a matrix of which tools do and don't provide a URL for each bug report.
Update: December 28, 2005. Several commenters suggested i write a gateway script to convert a GET into a POST, but that didn't work out. In fact, as Danil suggested in comments, I found out that the TestTrack interface actually works just as well with a GET as with a POST. That's good. However, one of the data items that needs to be submitted is a session variable named (misleadingly) "cookie", and there is no way for a gateway script to provide this. I was hoping that if it were omitted, TestTrack would simply ask for your name and password, but it doesn't. And if you are already logged in, that still doesn't help, because there is no easy way to reuse your session id, because it really isn't a cookie, despite the name. So i'm giving up on this one.
(As i write this, i'm realizing that i could write a script that hard-coded my login and password in it, have it log into TestTrack, get a valid and current session ID ("cookie") and then use that for the bug report. That would be hairy, insecure, and a long way off from my original goal of a simple URL.)
Here are additional tool details from comments:
Gives each bug a URL: Gemini, FogBugz
Isolates bugs from the world: PVCS Tracker, Mercury Test Director (previous release)
Collaborative Testing
Modelling Web Applications:
When I am testing web applications, I like to analyze the system to find areas of testability, and to identify potential areas of instability. To help remember areas to investigate, I use this mnemonic: FP DICTUMM
F - Framework (Struts, Rails, Plone, Spring, .NET, etc.)
P - Persistence (Hibernate, Toplink, iBatis, Active Record, DAOs, etc.)
D - Data (database, flat files, etc.)
I - Interfaces (code level, web services, user, etc.)
C - Communication (HTTP(S), web services, SOAP, XMLHTTPRequest, database drivers, etc.)
T - Technology (.NET, J2EE, Ruby, PHP, Perl, etc.)
U - Users (human end users, and other systems or apps that use this system)
M - Messaging (JMS, Enterprise Message Beans, Web Methods, etc.)
M - Markup (HTML, CSS, DHTML, XML, XHTML)
Each of these areas helps inform my testing, or provides areas of testability. Understanding the whole picture from the data model up to the HTML displayed in a web browser is important for my testing.