by Judy McKay

Quality Doesn’t Just Happen

May 24, 2007

If developers and managers both want to create software that users love, why is it that so many shoddy software projects escape from the Quality Assurance department?

Quality in software development projects doesn’t happen on its own. It also doesn’t occur after a small group of heroes rides in on white horses and waves its shiny swords to vanquish the problems. Quality happens only when careful planning is done, when the entire project team maintains a quality-conscious approach every step of the way, and when problems don’t escape from the phase in which they were introduced. A quality product is a team effort. It’s planned and predictable. It’s without heroes, and it’s faster and cheaper than a low-quality effort.

How can this be? Let’s look at some sample projects. The first is a normal, low-quality, late project. We’ll call it project “Hurry Up” (HU for short).


Read a sample chapter from Managing the Test People by Judy McKay.

Project HU got a bit of a late start due to the ongoing maintenance issues of its predecessor project “Just Ship It” (JSI). JSI was handled by a project manager (PM) who felt it was more important to ship on time than to ship a high-quality product. So he did. This PM was rewarded for his ability to “pull it together,” “get it out the door” and “meet the schedule.” The JSI PM was given a bonus for meeting his schedule and is now vacationing in Tahiti while the team deals with the fallout of the numerous bugs and unhappy customers.

Lesson #1: Don’t reward for shipping on schedule. Anyone can ship garbage. Base rewards on quality metrics.

During the last month of the project, the JSI developers worked 80-hour weeks. One heroic fellow was recognized for working 120 hours in one week, stopping only for brief rests. He heroically repaired multiple interfaces between applications. Those interfaces had not been properly specified (there were no design documents), no integration testing was done (no time to do it), and the QA team fought quality issues throughout system test.

Lesson #2: Don’t reward heroes for their Herculean effort late in the project to fix problems that could have—and should have—been fixed by the same people much earlier in the lifecycle.

The entire JSI team is down with the flu now due to lack of sleep.

Lesson #3: If you expect to work your people inordinate hours, you might want to consider corporate-sponsored flu shots!

Project HU was supposed to start three weeks ago, but the lingering effects of the flu, the nagging JSI maintenance problems and general team discord delayed the start. The analysts responsible for writing the requirements are in a rush. They got started late, the customer can’t make up his mind, and the PM is pressuring for completion. The analysts write what they can in an MS Word document and ask for a review. The PM tells development to start coding and schedules a “quick” requirements review between the analysts and the developers.

Lesson #4: Always include QA and other project team members in all reviews to get the most well-rounded input possible.

The requirements specification is sent out via e-mail and questions/responses are requested. Development has already started coding; it doesn’t want any changes to be made. No one responds to the e-mail, so the requirements are signed off as is.

Lesson #5: It’s easy to ignore documents that are sent by e-mail for approval. No response does not equal approval: No response means, “I didn’t have time to read it.”

Development is busy coding. It hits some problems because the interfaces between functions aren’t well-defined. This means that the team has to recode, and it substantially slows down the schedule. When it asks the analyst for clarification, it’s given a new user interface and two new items of functionality. The team decides not to ask any more questions.

Lesson #6: Don’t start coding until the requirements are stable and understood, or else budget time for subsequent rework.

As the development team nears the end of its scheduled time, it’s apparent it won’t complete the project on time. It begins to concentrate on the harder work, leaving the easier user interface and reporting tasks for last. While the team had hoped to do unit testing, only a few developers are doing it, and the effort is spotty at best.

Lesson #7: Code isn’t “complete” until it works. Good unit testing is part of the development effort, not an optional item to be jettisoned when the schedule is tight.

The QA group is summoned by the PM. Having just completed yet another maintenance release for JSI, it is frazzled and grumpy. This is the first the group has heard of Project HU, and it has little domain expertise. It is told that there is a requirements document, but it may be somewhat out of date. No use cases were written. The team has to create its own test data. It is now even grumpier!

Lesson #8: Is your test team always grumpy? Maybe it has good reasons!

The QA group starts writing test cases, but it is interrupted by the arrival of code to test. It hurries to create test data, guided by the developers, and it begins testing. It’s soon obvious that the QA team can’t make much effective progress, because it has only a partial UI and no reporting capabilities. All data verification has to be done directly in the database.

Lesson #9: To maximize team efficiency, the project plan must consider testing efficiency. This may determine feature implementation order.

The software is buggy. The test team tests around the areas that aren’t implemented or aren’t working, but it finds a number of issues that block further testing.

Even worse, when the test team gets a bug fix from development, 30 percent of the time it doesn’t fix the problem. In this state of code churning, the project hurtles past the deadline. The PM is pressured to ship (he wants his trip to Tahiti too!). The developers and testers are told to increase their efforts, work together to achieve the goal, do whatever it takes…

Lesson #10: Buggy software takes longer to ship.

The product ships in an unknown state. Last-minute functionality was added, and it received only cursory testing. A large number of identified bugs are still open, although all known critical problems were either addressed or reclassified as “serious.” The maintenance release is already being planned. The team is exhausted. It worked heroic hours, again, and produced a barely supportable product—again. The customer is unhappy—again. The product has features the customer doesn’t want or doesn’t understand, and it’s missing several major items the customer was expecting. Accolades come down from above for another “on-time” delivery.

What went wrong?

  • Management doesn’t recognize that “on time” doesn’t equal “satisfied customers.”
  • The entire project team is driven by schedule. Every decision shows schedule-, rather than quality-consciousness.
  • Shortcuts taken to improve schedule time (including unfinished requirements, insufficient system design, no unit test) actually made the project take longer to complete.
  • The maintenance release is, in reality, still the primary release, but now the unhappy customer is involved too.
  • People are exhausted, burned out and not utilized effectively. The rewards system is messed up!

Six months after this project shipped (and eight maintenance releases later), an analysis is done to determine the origin of all the bugs. The analysis shows:

  • 50 percent of the bugs were introduced in the requirements. These were due to unclear and vague requirements, as well as functionality that was not defined, and thus had to be introduced in a maintenance release. This also includes data issues and equipment issues where the test team didn’t have the right data or equipment to reflect the customer’s environment. Additionally, all bugs associated with the unwanted features are counted here, since those bugs would not have occurred if the features hadn’t been implemented.
  • 15 percent of the bugs were due to design issues, particularly interfaces between code modules and the database.
  • 25 percent of the problems were coding errors, both in new code and regressions introduced in the fixes.
  • 10 percent of the problems were system integration issues that were visible only in the fully integrated environment.
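This kind of defect-origin analysis is easy to automate once each bug is tagged with the phase that introduced it. A minimal sketch, with the bug records invented to match the counts above (the article gives only the percentages):

```python
from collections import Counter

# Hypothetical bug records, each tagged with the phase that introduced it.
# The counts reproduce the article's 1,000-bug breakdown.
bugs = (["requirements"] * 500 + ["design"] * 150 +
        ["coding"] * 250 + ["integration"] * 100)

origin = Counter(bugs)
total = sum(origin.values())
for phase, count in origin.most_common():
    print(f"{phase}: {100 * count / total:.0f}%")
# requirements: 50%, coding: 25%, design: 15%, integration: 10%
```

In practice the phase tag would come from a root-cause field in the bug tracker rather than a hand-built list.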

A Kinder, Gentler Project

A new PM is brought on board to lead project “Smarter Now” (SN). He listens carefully to the problems encountered by the development manager, QA manager and analysts, and he vows that his project will not suffer the same consequences. To start with, he looks at past “cost of quality” numbers. Assigning costs for each bug, depending on the phase in which it was introduced versus the phase in which it was caught, he learns:

  • 50 percent of the total bugs were found by the test group in the system test phase.
  • The other 50 percent were found by the unhappy customers.

Employing widely used cost numbers, he assigned the following values:

  • $1 for each bug found in the requirements review
  • $5 for each bug found in the design review
  • $10 for each bug found in unit test
  • $100 for each bug found in system test
  • $1,000 for each bug found by the customer

Doing the math, he determined the 1,000 bugs in the product resulted in the following cost:

  • Found in requirements: 0
  • Found in design: 0
  • Found in unit test: 0
  • Found in system test: 500 x $100 = $50,000
  • Found by the customer: 500 x $1,000 = $500,000
  • Total cost of quality: $550,000

According to the bug distribution of the previous project, if testing had been done throughout the lifecycle, the cost of quality should have been as follows:

  • Found in requirements: 500 x $1 = $500
  • Found in design: 150 x $5 = $750
  • Found in unit test: 250 x $10 = $2,500
  • Found in system test: 100 x $100 = $10,000
  • Found by the customer: 0
  • Total cost of quality: $13,750

Doing the proper quality assurance, building quality in and verifying at each phase that there were no escapes could have reduced the cost of quality by more than $500,000.
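The arithmetic behind both scenarios can be sketched in a few lines; the per-phase costs and bug counts are exactly the figures quoted above:

```python
# Widely cited per-bug cost by the phase in which the bug is *found*.
COST_PER_BUG = {
    "requirements": 1,
    "design": 5,
    "unit test": 10,
    "system test": 100,
    "customer": 1000,
}

def cost_of_quality(bugs_by_phase):
    """Total cost given a {phase: bug_count} mapping."""
    return sum(COST_PER_BUG[phase] * count
               for phase, count in bugs_by_phase.items())

# Project JSI: all 1,000 bugs escaped to system test or the customer.
actual = {"system test": 500, "customer": 500}
# The same 1,000 bugs caught in (or near) the phase that introduced them.
ideal = {"requirements": 500, "design": 150,
         "unit test": 250, "system test": 100}

print(cost_of_quality(actual))  # 550000
print(cost_of_quality(ideal))   # 13750
```

The difference, $536,250, is the "more than $500,000" the PM could have saved by catching each bug where it was introduced.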

This is an ideal model, of course, but could it be done in practice? The QA manager stepped forward to be the quality champion. The development group and analysts agreed to put quality first. The knowledgeable PM agreed to put quality first because he knew that building in quality from the start would keep the project on schedule and eliminate the 11th-hour heroism.

Like its predecessors, project SN starts with the requirements, and it starts late. The PM tells the analysts to write the most complete requirements possible and gives them the time they requested at the beginning of the project, even though this now crosses into the allotted development time.

Having never been given a reasonable time frame before, the analysts enthusiastically launch the project with extensive customer meetings. The QA manager invites himself to these meetings, knowing that the more he understands the customer’s environment and needs, the better job he can do to monitor quality throughout the lifecycle. He also knows that getting his team involved early allows it to ferret out design issues, help the developers create good unit tests, and build solid test cases that accurately reflect the customer’s usage.

The QA manager knows he has the best team for the job. He recruited the members, hired them, trained them and motivated them. He learned how to do this, presumably, by reading my book, Managing the Test People.

Lesson #11: Hire the right people. Build a strong QA team from the start in order to create and maintain a strong quality-consciousness in your organization.

During the requirements process, the QA team becomes more and more active as it works to understand the customer’s needs. It helps the analysts ensure that each requirement is testable.

Lesson #12: If requirements are testable, they provide enough details for the developers to accurately implement the functionality.

The requirements result in exact statements: not just what the system has to do (functional), but also how it has to do it (non-functional). The QA team, aware that earlier projects suffered from usability and performance issues, works with the analysts to define exact usability and performance requirements. The requirements effort takes twice as long as on prior projects—almost one-third of the total project time. Management fidgets. The PM stays calm.

A formal requirements review meeting is held with all the project team members and the customer. Each person prepares for the meeting by reading the document. The meeting lasts for three hours, finding 100 bugs. All agree these were the best requirements ever produced by this organization. The 100 bugs are fixed and the requirements approved at the next meeting. QA identifies the majority of problems during the review by repeatedly asking, “How will I test this?” Vague or inaccurate requirements are identified. The customer clarifies several points where the words written by the analyst didn’t accurately convey the customer’s needs.

Lesson #13: A cross-functional requirements review will always save more money by preventing bugs than it costs in time and manpower.

The design phase takes another month with multiple project team reviews as each component is documented. QA actively participates in this phase as well, concerned with developing the test cases and test data. (The best way to verify design specifications is to use them as the basis for test case creation. If they aren’t clear enough to make a test case, they aren’t clear enough for coding.) Fifty bugs are found in this phase. Half the scheduled time has passed; the PM is still calm.

Quality risk analysis is also done at this phase. The QA team examines each feature implementation item and assigns two numerical risk ratings: technical risk and business risk. Technical risk rates the risk inherent in the code or implementation due to complexity, a history of instability, difficulty in creating test data, and other technical risk factors. Business risk is a value measuring the impact to the customer if this item doesn’t work correctly.

Development is asked to review and to provide input into the technical risk rating. The analysts and the customer do the same for the business risk rating. QA then multiplies the risk items together to get one risk number for each testable software component. Testing is prioritized based on risk factors, allowing the team to mitigate the highest risk items first. If there isn’t sufficient time at the end of the schedule, the QA team will be able to talk about risk mitigation achieved versus risk still known to exist in the product as input into the business decision of releasing the software.
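A minimal sketch of that risk scoring follows. The feature names and 1-to-5 ratings are illustrative; the article specifies only that the two ratings are multiplied to yield one score per testable component, and that testing proceeds in descending score order:

```python
# Hypothetical components with (technical_risk, business_risk) ratings, 1-5.
features = [
    ("payment interface", 5, 5),
    ("report export", 2, 4),
    ("login screen", 1, 5),
    ("admin settings", 2, 1),
]

# One score per component: technical risk x business risk.
scored = sorted(features, key=lambda f: f[1] * f[2], reverse=True)
total_risk = sum(t * b for _, t, b in features)

# Testing highest score first lets QA report cumulative risk mitigated
# at any point, should the schedule force an early ship decision.
mitigated = 0
for name, tech, biz in scored:
    mitigated += tech * biz
    print(f"{name}: score {tech * biz}, "
          f"cumulative mitigation {100 * mitigated / total_risk:.1f}%")
```

With these sample ratings, finishing only the top two components already mitigates 33 of the 40 total risk points, which is the kind of number the QA team can bring to a ship/no-ship discussion.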

The SN development team is excited to start coding because it has a clear path ahead. Since it worked with the more technical QA members, it clearly understands code quality expectations and the importance of unit testing. It writes its code and unit tests simultaneously and keeps a rough count of the bugs found. Fifty bugs are found by unit testing. Development feels it’s the best and strongest code it has ever produced; it finishes it in two months, whereas previous efforts of similar size took six months.

System testing begins with a daily, automated smoke test. This automation was prepared while development was still writing the code and is used to verify that no issues creep into the daily build (remember the 30 percent regression rate on the previous project?). The QA team has a strong skill mix, including technical testers who can verify unit tests and build automation, and pure black box testers who specialize in GUI and reporting aspects. The project is apportioned among testers according to their strengths before the final test phase begins.
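One way such a daily smoke suite might be wired up is sketched below. The three checks are placeholders; a real harness would replace them with calls that exercise the fresh build's critical paths (start-up, login, a core transaction):

```python
# Sketch of a smoke-test harness run against each daily build.
def run_smoke_suite(checks):
    """Run each named check; return (passed, failed) lists of names."""
    passed, failed = [], []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a crashing check counts as a failure
        (passed if ok else failed).append(name)
    return passed, failed

checks = [
    ("build starts", lambda: True),            # placeholder: launch the binary
    ("login works", lambda: True),             # placeholder: scripted login
    ("core transaction", lambda: 1 + 1 == 2),  # placeholder: end-to-end path
]
passed, failed = run_smoke_suite(checks)
print(f"smoke: {len(passed)} passed, {len(failed)} failed")
```

A nonempty `failed` list would block the build from reaching the testers, which is what keeps regressions like JSI's 30 percent from ever burning test time.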

All test cases are assigned and prioritized. Feature testing is ordered based on the risk factors assigned. The QA team is prepared to discuss a ship/no-ship decision at any point in terms of risk mitigated.

The QA manager, based on past experience, knows that each bug found during testing costs a tester an average of four hours of investigation time. Fewer bugs means faster testing. He conservatively estimates that the team will find half as many bugs as on the previous project, or 250 bugs. With 1,000 test cases, that means one bug found for every four test cases, so an average of one hour of bug investigation time is budgeted per test case. Adding that to 500 hours of test case execution time (half an hour per case) gives a total of 1,500 hours, or 38 workweeks. Reaching the 80 percent risk mitigation goal would require 30 weeks of test time.

Spread across the five testers who had been involved in the project from the beginning, six weeks of testing is expected. While not ideal, this would still meet the project requirements. As testing commences, the metrics prove the bug estimate was high; testing is actually completed in a month and achieves 90 percent risk mitigation. Thirty bugs did escape to the customer, but these were all lower-risk issues.
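The schedule arithmetic above can be restated in code; the half-hour execution time per test case is implied by the 500-hour figure rather than stated outright:

```python
# The QA manager's schedule estimate, using the article's numbers.
test_cases = 1000
exec_hours_per_case = 0.5   # 500 execution hours across 1,000 cases
expected_bugs = 250         # half the prior project's 500 test-phase bugs
hours_per_bug = 4           # average investigation time per bug

# One bug per four test cases -> one investigation hour per test case.
investigation_hours = expected_bugs * hours_per_bug
total_hours = test_cases * exec_hours_per_case + investigation_hours

workweeks = total_hours / 40          # 40-hour workweeks
weeks_for_80_pct = workweeks * 0.8    # 80 percent risk mitigation goal
testers = 5
calendar_weeks = weeks_for_80_pct / testers

print(total_hours)       # 1500.0
print(round(workweeks))  # 38
print(calendar_weeks)    # 6.0
```

Because the actual bug count came in below the 250-bug estimate, each test case averaged less than its budgeted investigation hour, which is why testing finished in a month instead of six weeks.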

Lesson #14: Cost-of-quality metrics are easy to gather, and they help focus a team on the high-return activities.

Measuring Quality’s Cost

So what was the cost of quality on this release? Let’s look at the numbers:

  • Found in requirements: 100 x $1 = $100
  • Found in design: 50 x $5 = $250
  • Found in unit test: 50 x $10 = $500
  • Found in system test: 150 x $100 = $15,000
  • Found by the customer: 30 x $1,000 = $30,000
  • Total cost of quality: $45,850

While still not perfect, the cost of quality was a vast improvement over the previous effort. And it shows the areas that still need to improve. Too many problems reached system test. We’d like to see no problems get to the customer. But even with these acknowledged areas to improve, the project saved a half-million dollars on cost of quality and shipped on time.

Fiction? No. These cases were created from a composite of real projects with real humans. So, what made the last project successful? More time? No, it actually took less time than previous projects. Fewer features? No, the customer received the functionality that was needed. Heroics? No heroes needed. It was the people—a quality-conscious team guided by a smart PM, and driven by an active, involved and capable QA team.

A quality-focused team produces a better project in a shorter amount of time, every time, but you have to have the right people to make it happen. We won’t need the heroes to ride in at the end to save the project if it’s never in distress. A well-planned project with a quality focus won’t be in crisis. There may still be trade-off decisions, which is why we use risk-based testing to be sure we mitigate the highest risk first, but these can be informed decisions with measurable consequences.

People make our projects happen—regular people who are doing their jobs, not heroes (although some might argue that the PM was heroic to stand up and defend quality practices when the schedule lengthened at the beginning of the project). The QA team must have the skills, personalities and capabilities to perform a high-quality function throughout the project’s lifecycle. Get that team together, give it responsibilities, and integrate it into the project from the beginning. Building a quality team, like building a quality product, takes effort. Quality doesn’t just happen; the right people make it happen.
