How many people do I need to test on? (Answer: It’s not as many as you think.)
An RCT (Randomised Controlled Trial), an online survey (e.g. a Semantic Pairs survey, or Comprehension-Motivation Test©, or Implicit Attitude Map©), and on-street sampling are all insight work ‘at scale’, and the usual assumption is that they require large numbers of people to get solid data.
It must be large numbers of people. It must be so. Large. Larger. More is better, right? Think of a number… no, more. No, more than that. Double it. Top it up a bit more, Jack…
Still more, right? Wrong. It’s not as many as you think, honestly.
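The power law describes an imbalanced relationship between two quantities; here, between the size of a population and the number of people you need to sample in order to extrapolate from the sample to the whole population. And it works in the researcher’s favour: as the population grows, the sample size, well… doesn’t increase by much, because it is driven mainly by the margin of error and confidence level you choose rather than by the size of the population.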
You can see, in the graph and table image below, the sample size flatlining even though the population increases tenfold each time. For a margin of error of 5% (the answers you get will be within ±5% of the true value, with 95% confidence) and a population of 40,000, you need to sample only 381 people. Go from a population of 40,000 to 400,000 and then to 4 million, and you have to climb up to the heady heights of 384 people. And it flatlines there.
This graph/table lets us, ahead of deployment:
- Be confident we are capturing a large enough sample to make robust extrapolations
- Decide on the margin of error ahead of sampling
- Have confidence in the answers (within the margin of error chosen)
- Work out p values
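If you want to check the 381 and 384 figures yourself, here is a minimal sketch. It assumes the standard Cochran sample-size formula with a finite population correction and a worst-case 50/50 split in answers; the graph/table doesn't say which formula it was built from, so treat this as one common way of getting there. The function name `sample_size` and its parameters are our own, chosen for illustration.

```python
from statistics import NormalDist


def sample_size(population, margin=0.05, confidence=0.95, p=0.5):
    """Estimate the sample needed for a given margin of error.

    Assumptions (not stated in the article): Cochran's formula with a
    finite population correction, and p=0.5 as the worst-case split.
    """
    # z-score for a two-sided interval, about 1.96 at 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population
    n0 = z**2 * p * (1 - p) / margin**2
    # Finite population correction shrinks it slightly for smaller populations
    n = n0 / (1 + (n0 - 1) / population)
    # Rounded to the nearest person to match the figures quoted above;
    # a more cautious convention is to always round up
    return round(n)


for population in (40_000, 400_000, 4_000_000):
    print(f"{population:>9,} people -> sample of {sample_size(population)}")
```

Under those assumptions the three populations come out at 381, 384 and 384. Tighten `margin` to 0.03 and the sample size jumps, which is the trade-off that matters far more than the size of the population.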
Generally speaking, if we’re running an RCT (control versus new condition, with no other aspects at play) or a survey, and are not splitting the data too finely (say, down to women under 20 who answered x, y and z, and who answered a question set similarly to another group, at which point only a few respondents are left), we recommend either ~200, ~400, or ~1100 people. There’s little point being in between any of those numbers, say at ~800, because you don’t reach the next error-range threshold and so are just wasting time and money.
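To see roughly where those sample sizes sit, the sketch below turns the calculation around and works out the margin of error for a given number of respondents. It assumes 95% confidence, a worst-case 50/50 split, and a population large enough that no finite population correction is needed; `margin_of_error` is a hypothetical helper name, and the exact thresholds we work to may differ slightly from these.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence=0.95, p=0.5):
    """Approximate margin of error for a sample of n people.

    Assumes a large population and p=0.5 (worst case); both are
    illustrative assumptions rather than figures from the article.
    """
    # z-score for a two-sided interval, about 1.96 at 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


for n in (200, 400, 800, 1100):
    print(f"n={n}: +/-{margin_of_error(n):.1%}")
```

Under these assumptions ~200, ~400 and ~1100 land close to ±7%, ±5% and ±3%, while ~800 gives about ±3.5%: twice the cost of ~400 without reaching ±3%.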
The power law is everywhere: The distributions of a wide variety of physical, biological, and man-made phenomena approximately follow a power law over a wide range of magnitudes. These include the sizes of craters on the moon and of solar flares, the foraging pattern of various species, the sizes of activity patterns of neuronal populations, the frequencies of words in most languages, frequencies of family names, the species richness in clades of organisms, the sizes of power outages, criminal charges per convict, volcanic eruptions, human judgements of stimulus intensity, and many more. It’s not niche, by any means.
So, there’s your solid, robust data using smaller numbers than most assume. The only watch-out with surveys (not RCTs; they’re fine) is that you might miss an entire line of questioning and be ‘blind’ to that aspect, which might be important.
For more speak with Davina (Client Services Director) or Oliver (Founding Director)
+44 (0)843 289 2901