Summary: Elaborate usability tests are a waste of resources. The best results come from testing no more than 5 users and running as many small tests as you can afford.

Some people think that usability is very costly and complex and that user tests should be reserved for the rare web design project with a huge budget and a lavish time schedule. Not true. Elaborate usability tests are a waste of resources. The best results come from testing no more than 5 users and running as many small tests as you can afford.

In earlier research, Tom Landauer and I showed that the number of usability problems found in a usability test with n users is:

N (1 - (1 - L)^n)

where N is the total number of usability problems in the design and L is the proportion of usability problems discovered while testing a single user. The typical value of L is 31%, averaged across a large number of projects we studied. Plotting the curve for L = 31% gives the following result:

[Figure: Diminishing returns for usability testing as more and more users are tested. The curve bends around 5 users, which is the recommended number of test participants.]
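To make the curve concrete, here is a minimal Python sketch (mine, not part of the original study) that plugs the average L = 31% into the formula for a few values of n:

```python
# Minimal sketch of the curve: share of the N total problems found with n users
# is 1 - (1 - L)^n, using the article's average L = 0.31.
L = 0.31  # proportion of problems a single test user reveals (article's average)

for n in (1, 2, 3, 5, 10, 15):
    found = 1 - (1 - L) ** n  # share of all N problems uncovered by n users
    print(f"{n:2d} users -> {found:.1%} of the problems found")
```

With these numbers, one user already reveals roughly a third of the problems, five users find about 85%, and fifteen users find virtually all of them.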

The most striking truth of the curve is that zero users give zero insights.

As soon as you collect data from a single test user, your insights shoot up and you have already learned almost a third of all there is to know about the usability of the design. The difference between zero and even a little bit of data is astounding.

When you test the second user, you will discover that this person does some of the same things as the first user, so there is some overlap in what you learn. People are definitely different, so there will also be something new that the second user does that you did not observe with the first user. So the second user adds some amount of new insight, but not nearly as much as the first user did.

The third user will do many things that you already observed with the first user or with the second user and even some things that you have already seen twice. Plus, of course, the third user will generate a small amount of new data, even if not as much as the first and the second user did.

As you add more and more users, you learn less and less because you will keep seeing the same things again and again. There is no real need to keep observing the same thing multiple times, and you will be very motivated to go back to the drawing board and redesign the site to eliminate the usability problems.

After the fifth user, you are wasting your time by observing the same findings repeatedly but not learning much new.
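The arithmetic behind this diminishing return follows directly from the formula above: the n-th user contributes brand-new findings amounting to L(1 - L)^(n-1) of all problems, a share that shrinks with every additional participant. A small sketch with the same assumed L = 31%:

```python
# Sketch: marginal (first-time) findings contributed by the n-th test user,
# derived from the curve above: user n adds L * (1 - L)^(n - 1) of all problems.
L = 0.31

for n in range(1, 9):
    new_share = L * (1 - L) ** (n - 1)  # problems seen for the first time by user n
    print(f"user {n}: {new_share:.1%} of all problems are new observations")
```

By the fifth or sixth participant, each new session is adding only a few percent of genuinely new findings.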

Iterative Design

The curve clearly shows that you need to test with at least 15 users to discover all the usability problems in the design. So why do I recommend testing with a much smaller number of users?

The main reason is that it is better to distribute your budget for user testing across many small tests instead of blowing everything on a single, elaborate study. Let us say that you do have the funding to recruit 15 representative customers and have them test your design. Great. Spend this budget on 3 studies with 5 users each!

You want to run multiple tests because the real goal of usability engineering is to improve the design and not just to document its weaknesses. After the first study with five participants has found 85% of the usability problems, you will want to fix these problems in a redesign.

After creating the new design, you need to test again. Even though I said that the redesign should "fix" the problems found in the first study, the truth is that you only think the new design overcomes those problems. Since nobody can design the perfect user interface, there is no guarantee that it actually does, so a second test is needed to check whether the fixes worked. Also, a redesign always carries the risk of introducing a new usability problem, even if the old one did get fixed.

Also, the second study with 5 users will discover most of the remaining 15% of the original usability problems that were not found in the first round of testing. (There will still be 2% of the original problems left — they will have to wait until the third study to be identified.)

Finally, the second study will be able to probe deeper into the usability of the fundamental structure of the site, assessing issues like information architecture, task flow, and match with user needs. These important issues are often obscured in initial studies where the users are stumped by stupid surface-level usability problems that prevent them from really digging into the site.
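As a closing sanity check on the percentages quoted above, the same formula can be applied round by round, assuming (as this sketch does, not the original study) that each study of five fresh users uncovers the remaining problems at the same per-user rate L = 31% and ignoring any new problems the redesigns themselves introduce:

```python
# Sketch: share of the original problems still hidden after each round of 5 users,
# assuming every round finds remaining problems at the same per-user rate L = 0.31
# and ignoring new problems introduced by the redesigns themselves.
L = 0.31
users_per_round = 5

remaining = 1.0  # fraction of the original problems not yet found
for round_no in range(1, 4):
    remaining *= (1 - L) ** users_per_round
    print(f"after study {round_no}: {1 - remaining:.1%} found, {remaining:.1%} still hidden")
```

Under these assumptions the first study finds about 85% of the original problems, the second brings the total to roughly 98%, and the last couple of percent wait for the third study.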