Category: Process Improvement

Metric Misuse – Quality Assurance Metrics Gone Awry

I was reading through some posts from Bob Sutton, one of my favorite management gurus, and I ran across a post that contains one of my favorite Dilbert comic strips.

Bob Sutton’s post, as well as the comments that I made on his blog, reminded me of one of my favorite topics: misused Quality Assurance metrics.

Tying Quality Assurance Metrics to Financial Rewards – A Dangerous Game

“Treat monetary rewards like explosives, because they will have a powerful impact whether you intend it or not.” –Mary and Tom Poppendieck, authors of Implementing Lean Software Development: From Concept to Cash

Over the years, many people have asked me what Quality Assurance metrics they should use to evaluate employee performance. My advice is that Quality Assurance metrics should not be used directly to evaluate employee performance. The Dilbert comic strip may seem a bit extreme, but it’s exactly what happens when employee performance is based strictly on metrics. This is true regardless of whether monetary rewards are explicitly tied to the metrics or not.

In my comments on Bob Sutton’s blog, I mentioned three specific metrics that had unintended effects when used for evaluating employee performance:

  1. rewarding testers for the number of test cases they wrote resulted in poorly written test cases;
  2. rewarding testers for the number of bugs they found resulted in a high number of unimportant or duplicate bugs reported; and
  3. penalizing testers for bugs rejected by the test lead or development staff resulted in important bugs going unreported.

Many people think that they have the ability to write a set of metrics that can be used to unequivocally gauge the performance of a Quality Assurance professional, but I have not yet encountered a metric that couldn’t be manipulated to favor the employees.

(If the metric can’t be gamed, it probably isn’t under the control of the employees, so it wouldn’t be effective at driving behavior anyhow.)

Are Metrics Worthless Then?

Actually, metrics are a great tool for identifying coaching opportunities and potential problems. However, in order to get honest metrics, they shouldn’t be used directly for employee evaluations or employee rewards.

When I’ve looked at the metrics that I mentioned earlier with an eye towards coaching, I’ve had excellent results.

  1. Reviewing the number of test cases written helped me identify a tester on my team who was putting much more detail than I wanted into his test cases. After some coaching, he was able to consistently meet my expectations.
  2. Reviewing the number of bugs found by each tester helped me identify a tester who was digging into the root cause of the most difficult-to-reproduce bugs. She didn’t report as many bugs as others, but her work was critical to getting a great product out the door in a timely manner. It turned out that she was the most skilled tester even though she reported the fewest bugs.
  3. Reviewing the number of bugs rejected by the development staff helped me identify a manager who was evaluating his programmers based solely on the number of valid bugs found in their code. The developers were motivated to simply mark bugs as invalid rather than fix the bugs. This insight allowed me to address the problem directly with that manager.

Good Quality Assurance metrics provide powerful tools for managing a Quality Assurance team when used properly. However, they shouldn’t be used in a vacuum. They should just be considered one data point among many.

I was only able to scratch the surface of this topic in this blog post. I plan to discuss specific metrics in future blog posts. In the meantime, if you want to read a much more in-depth review of the pros and cons of employee incentives, you can find one paper here.

Your Experiences

I know that a lot of people feel passionately about Quality Assurance metrics, both pro and con. I’m very interested to hear about your experiences with Quality Assurance metrics. Have you found any that were particularly useful? Have you found any that had unintended consequences?

Influencers – James Bach

There are a number of blogs that I enjoy reading about all areas of the software development process. The software development process includes project management, business analysis, development, and testing, of course. In addition, delivering software that works requires consideration of concepts around management, sales, and business organization.

I’d like to share some of the blogs that I find influential when thinking about how to build software that works.

James Bach and Exploratory Testing

One of my favorite blogs is James Bach’s Blog. He is the creator of Rapid Software Testing. I first became interested in his blog because he wrote about exploratory testing in a way that made sense to me. I knew from experience that exploratory testing was one of the best ways to find important bugs, but I didn’t have a great way to communicate the high value of exploratory testing to teams that were focused heavily on mechanizing the testing process. James Bach has written a lot on the topic of exploratory testing. This particular post about the history of Exploratory Testing captures many of the concepts that drew me to the blog.

I particularly like the concepts of testing vs checking and the “responsible tester.”

Feedback

What do you think of James Bach’s blog and the concepts he writes about? What blogs and resources do you find particularly interesting and useful for building software that works?

Prioritization and Quality – Tales from the Trek – The Hoarders

Impact of Prioritization on Quality

As I’ve mentioned before, a key aspect of building quality software is ensuring that it does what the users need it to do. In my experience, the backlog of feature requests (whether written or held in the stakeholders’ heads) is always much larger than what the development team can build in a short period of time.

When the backlog gets too big, people could spend more time managing the backlog than actually building anything. What is more likely, though, is that most of the backlog is ignored, and the clutter causes great ideas to get lost. I have seen cases where key customer issues ended up unaddressed for months until the customer complained a third or fourth time.

Idea Hoarding

I sat down with one of my clients to look at their backlog, and we found that they had over 400 backlog items that had not even been viewed for more than a year. They had more new, high-priority work coming in than they could deliver, so their backlog was growing. Clearly, nobody was ever going to review, let alone work on, the items that were over a year old. I suggested simply closing the backlog items that hadn’t been touched for over a year, but the client didn’t want to remove any items from the backlog without first reviewing them in a meeting with a team of key people, which was not going to happen.

The discussion reminded me of an episode of Hoarders where they were trying to convince someone to sell most of his 27 tool boxes. He agreed that 27 might be overkill, but he still didn’t want to sell them and insisted that the average person, who wasn’t a handyman like him, would need at least 7 toolboxes.

Time to Declutter

When a backlog gets hopelessly large, you may want to consider declaring backlog bankruptcy (based on the concept of email bankruptcy) and simply closing all items that haven’t been looked at in over a year. If that sounds scary, I can understand. I tend towards hoarding myself, and I hate the thought of getting rid of something that might come in handy later. If you do declare backlog bankruptcy, it may help to keep these ideas in mind:

  • When there are too many backlog items, they are all ignored. The best ideas can’t break through the clutter.
  • Business changes so quickly that most ideas more than a year old aren’t relevant any more.
  • If an idea in the backlog is truly that important, it will get entered as a new entry again. In fact, if it’s that important, it probably was entered multiple times and has already been implemented.
  • You can always search on closed items if you really, really want to!
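
As a rough illustration, the one-year cutoff could be scripted along these lines. This is only a minimal sketch and is not tied to any particular tracking tool: the `BacklogItem` fields, the `declare_bankruptcy` function, the sample items, and the 365-day threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class BacklogItem:
    # Hypothetical fields; a real tracker will have its own schema.
    title: str
    last_viewed: date
    status: str = "open"


def declare_bankruptcy(items, today, max_age_days=365):
    """Close every open item not viewed within max_age_days; return the closed items."""
    cutoff = today - timedelta(days=max_age_days)
    closed = []
    for item in items:
        if item.status == "open" and item.last_viewed < cutoff:
            item.status = "closed"  # closed, not deleted, so it stays searchable
            closed.append(item)
    return closed


# Made-up sample data for illustration only.
backlog = [
    BacklogItem("Export report to PDF", date(2022, 1, 10)),
    BacklogItem("Fix login timeout", date(2023, 5, 2)),
]
closed_items = declare_bankruptcy(backlog, today=date(2023, 6, 1))
print([item.title for item in closed_items])
```

Marking items as closed rather than deleting them keeps the last bullet true: anything that turns out to matter can still be found with a search.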

Keeping the Backlog Under Control

Once you have cleaned up the backlog, you want to try to keep it manageable. It helps to have a weekly triage process where the items are reviewed and prioritized. Some decisions that should be made during the triage process are:

  • Is there any realistic chance this item will get resources in the next year? If not, close it.
  • If the item is going to remain open, assign it to an owner who will take responsibility for getting the item onto the schedule.
  • Enter the date the issue was last reviewed.
  • Assign a priority and effort to the item.

I’ve found that it’s easier to identify which issues to review if you create a report that shows the priority of each item, the date it was entered, and the date it was last reviewed. This type of report helps ensure that older items are addressed.
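
A report like this can be very simple. The sketch below is one possible shape, assuming each backlog item is available as a row of title, priority, date entered, and date last reviewed; the sample rows and the function names `triage_report` and `format_report` are hypothetical and exist only for illustration.

```python
from datetime import date

# Hypothetical backlog rows: (title, priority, date entered, date last reviewed)
items = [
    ("Bulk import fails on large files", 1, date(2023, 1, 5), date(2023, 5, 20)),
    ("Add dark mode", 3, date(2022, 3, 14), date(2022, 9, 1)),
    ("Invoice totals rounded incorrectly", 1, date(2023, 4, 2), date(2023, 4, 3)),
]


def triage_report(rows):
    """Order rows so the items reviewed longest ago come first."""
    return sorted(rows, key=lambda row: (row[3], row[1]))


def format_report(rows):
    """Render the rows as a fixed-width text table."""
    lines = [f"{'Priority':<9}{'Entered':<12}{'Last reviewed':<15}Title"]
    for title, priority, entered, reviewed in rows:
        lines.append(
            f"{priority:<9}{entered.isoformat():<12}{reviewed.isoformat():<15}{title}"
        )
    return "\n".join(lines)


report = triage_report(items)
print(format_report(report))
```

Sorting by the last-reviewed date puts the stalest items at the top of the list, which is where the weekly triage should start.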

Feedback

What challenges have you had with backlog clutter? What actions have you tried to address the challenges? I’d love to hear your stories from the trek.

The User Acceptance Testing Death Spiral

In a past role, I joined a test team that was in a User Acceptance Testing (UAT) “Death Spiral” that had caused the user base to lose confidence in the integration testing team. Based on conversations that I’ve had with others, I believe that the UAT Death Spiral is a common scenario that people encounter, and it can destroy a test team. It took some work, but we were able to pull out of the downward trajectory and regain a functional, productive partnership between the business folks and the integration testing team.

What is the User Acceptance Testing Death Spiral?

Our company had an understaffed integration testing team, aggressive deadlines, and a culture that valued meeting deadlines above all other goals. This meant that software often had known major bugs when UAT started. Even worse, there were typically areas of code that hadn’t been tested at all by the integration testing team before UAT started. It was common for the UAT team to find major, obvious bugs.

Because major bugs made it past the integration test team, the business felt a need to create a more formal, robust UAT team that would catch the numerous errors missed by the integration test team. The business folks assumed that the bugs were missed because the integration team wasn’t very good at their job.

As UAT became more robust, they realized that they needed more time to complete their testing. The company culture of tight deadlines meant that release dates could not be extended to accommodate in-depth UAT. Instead, the business insisted that the integration test cycle be shortened and that UAT start earlier.

I think you can see where this is going…

When UAT started earlier, they found even more bugs and the business lost even more confidence in the integration test team. The business then insisted on testing more of the functionality in UAT and starting UAT even earlier. This death spiral continued until the business had completely lost confidence in the integration test team. However, because the UAT team was not made up of experienced testers, the business was not finding all of the bugs either.

Basically, the integration team didn’t have time to do their job, the business was spending a huge amount of resources to test everything, even features already tested by the integration test team, and the quality of software in production was as poor as ever.

Breaking Free

Breaking free of this downward spiral was more than just a logistical problem. It was a political problem as well. The integration test team feared that the UAT team was trying to take over their jobs, and the UAT team felt that the integration team wasn’t competent. Rebuilding trust was critical for any new process to be successful.

I worked with both the integration and UAT test teams to plan a new strategy. The strategy was that the integration test team would first test anything that did not have a user interface. In addition, the integration test team would write, maintain, and run automated regression tests. Basically, they would test the areas that required their expertise. Only after these areas were tested and any major bugs were fixed would the UAT team start their work. We would divide up the test cases to reduce any overlap of testing between the two teams as much as possible.

Even though the UAT team agreed that this plan made sense theoretically, they feared that removing the redundant testing would mean that bugs were missed and worried that starting UAT later would mean that they wouldn’t have time to complete their work. I convinced them to give the plan a try on a smaller project. If the advantages to the new plan didn’t materialize, it would be easier to adjust for the lost time on a small project.

Fortunately, on the small project, everything fell into place. The integration test team was able to adequately test their portion of the plan before UAT started, and the UAT team knew exactly what parts were and weren’t tested by the integration team. The UAT team had a shorter test cycle both because they didn’t run as many redundant tests and because the initial quality of the code was much better. Each bug takes time to find, fix, and retest, so fewer bugs meant less time spent in UAT.

Best of all, the software went live and had no problems in production.

For the rest of my time with this team, we followed the new process. This resulted in higher quality code with lower cost, and it had the added benefit of greatly improving the working relationship between the two teams.

Your Experiences

I’d like to hear your experiences with User Acceptance Testing issues. Have you been in a situation where the business users lost confidence in the development team or integration test team? If so, what do you think were the root causes of the issues?

Prioritization and Quality – Tales from the Trek – Priority 0

Impact of Prioritization on Quality

A key aspect of building quality software is ensuring that it does what the users need it to do. In my experience, the backlog of feature requests (whether written or held in the stakeholders’ heads) is always much larger than what the development team can build in a short period of time. However, prioritizing these features seems to be difficult for people. Everything seems important, so everything gets priority 1.

Of course, if everything has the same priority, the stakeholders are de facto allowing the development team to prioritize the features. This can be a problem because the development team often doesn’t have visibility into all of the factors that may determine the importance of the feature to the company’s success.

Here’s one example of a client who had an issue with prioritization, and how we arrived at a working solution.

Going Live too Early

A client had a new system replacing an existing business-critical system. Unfortunately, their existing system had reached its technical limits before the new system was fully tested, and management made the decision to go live without much testing. Of course, the results were predictable. There were many errors in the production system that had to be fixed right away.

The “prioritization” method initially was that end users would come into the room of developers and tell them that they needed to drop everything and work on whatever issue the end user mentioned. The problem was that many different users were coming in each hour, and the developers didn’t get a chance to finish any task before being told to drop it and work on something else.

Getting Organized for Quality

The first thing we did was set up a SharePoint list where users could report their issues. Under the new process, users reported their issues in SharePoint, and I would triage the list and assign the work to developers. This simple improvement resulted in a huge increase in productivity for the development team because they could complete tasks without interruption.

However, we weren’t always working on the most important issues. Users were choosing the priority, and every issue was the highest priority to that user. Even when we met with representatives from all departments together and set definitions for priorities, every issue was priority 1 on a 3-priority scale.

Priority Names Matter

Our original 3 priority levels were called “High”, “Medium”, and “Low”. Because all issues were production issues, people didn’t want to minimize any by calling them “Medium” or “Low”. Everyone agreed that the issues were not all the same priority, but they weren’t willing to prioritize using those names.

First, the client came up with a category called “Priority 1 – Urgent”. This was higher than “Priority 1” and the client felt comfortable putting some items in Priority 1, and some in Priority 1 – Urgent. Still, way too many items were in Priority 1 – Urgent, so the development team was still choosing the priority.

Then, the client decided that the most critical items would be in a new priority called “Priority 0”. This was reserved for the top 5-7 issues to be worked on immediately by our 5-person development team.

This worked! The client was completely willing to prioritize into “Priority 0”, “Priority 1 – Urgent”, and “Priority 1” even though they were not willing to prioritize into “Priority 1”, “Priority 2”, and “Priority 3”. Just by changing the names of the priority levels, we were able to accomplish the goal of dividing the issues into 3 different levels.

We could then focus on the issues that brought the most value to the system.
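
To make the idea concrete, here is a small sketch of how the three named levels and the cap on Priority 0 might be enforced in a script. The priority names and the 5-7 limit come from the story above; the `work_queue` function, the `MAX_PRIORITY_0` constant, and the sample issues are hypothetical and only for illustration.

```python
# Priority names from the story above, ordered from most to least urgent.
PRIORITY_ORDER = ["Priority 0", "Priority 1 – Urgent", "Priority 1"]
MAX_PRIORITY_0 = 7  # Priority 0 was reserved for the top 5-7 issues


def work_queue(issues):
    """Sort (title, priority_name) pairs by priority; reject an overfull Priority 0."""
    top = [title for title, priority in issues if priority == PRIORITY_ORDER[0]]
    if len(top) > MAX_PRIORITY_0:
        raise ValueError(
            f"Priority 0 holds {len(top)} issues; the cap is {MAX_PRIORITY_0}"
        )
    return sorted(issues, key=lambda issue: PRIORITY_ORDER.index(issue[1]))


# Made-up sample issues for illustration only.
issues = [
    ("Slow search", "Priority 1"),
    ("Payments failing", "Priority 0"),
    ("Wrong tax rate", "Priority 1 – Urgent"),
]
queue = work_queue(issues)
print([title for title, _ in queue])
```

The cap is the important part: a top level only stays meaningful if something stops it from filling up the way “High” and “Priority 1 – Urgent” did.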

Feedback

What challenges have you had with prioritizing features? What actions have you tried to address the challenges? I’d love to hear your stories from the trek.

Embedded Quality – Building on a Solid Foundation

Embedded Quality is a Quality Assurance (QA) technique that I’ve introduced on many projects with great success. Its benefits include the following:

  • Avoids long test/fix cycles at the end of projects
  • Ensures that projects finish strong
  • Lowers cost
  • Shortens timeline
  • Improves quality
  • Increases customer confidence in IT department
  • Applies easily to any software development methodology

Basic Concepts

The basic concepts of Embedded Quality are nothing earth-shattering. The concepts draw from standard software development techniques and seem like they should be common sense. However, in the 20+ years that I’ve been working in software development, I’ve found that these concepts are not commonly practiced.

QA starts on Day 1

I have found that on many projects, Quality Assurance is an afterthought. Involving QA as part of the core project team from the first day helps prevent nasty surprises at the end of the project.

QA is part of the Core Project Team

Too often, the development team and QA team are completely separate entities who send their work “over the wall” to each other. Quality and speed increase dramatically when we break down the barriers between these groups and work as a team.

QA is performed by qualified experts

One of my clients used to hire people out of high school to test their software. If they did well, they got promoted to doing phone support. Eventually, they realized that they could drastically decrease phone support and increase customer satisfaction by hiring qualified people to do their Quality Assurance work.

Just as building inspectors need special knowledge and experience to do their jobs well, software QA experts need special knowledge and experience to do their jobs well.

General User Acceptance Testing (UAT) does not begin until Core Project Team QA is complete

A common practice on projects is to rely on UAT as the only form of testing. This results in the end users seeing a large number of the flaws in the software. Invariably, I have seen the end users permanently lose confidence in the development team when using this QA strategy.

I liken this to a builder turning over a new house to a buyer without having any inspections and saying, “We don’t really know if the plumbing is solid, all the electricity works, or the roof leaks. Just try everything out and let us know what the problems are.” (In fact, this does happen: http://abcnews.go.com/GMA/Consumer/story?id=2630414&page=1.)

Strong Foundation – No code is “Complete” until it is tested and works correctly

Many project schedules track code as complete when the developer completes their initial delivery. This can give a false impression of the project status.

More importantly, treating code as complete only after it has been tested and works correctly ensures that new code is built on a strong foundation. When developers build new code on untested code, it is very similar to building an office tower on a foundation that has not been inspected. If a problem exists in the foundation and is not identified early, the building will fail.

http://blogs.wsj.com/chinarealtime/2009/06/29/shanghai-building-collapses-nearly-intact/

Embedded Quality – Next Steps

Over the next few months, I will post tips on how to implement Embedded Quality on your next project. I’ll share stories of my experiences over the past 20+ years, and I would like to hear your experiences as well.
