The Law of Small Numbers - article under development 25/11/2018
The law of small numbers is a statistical quirk that is vitally important in the understanding and interpretation of health data. In brief, it points out that when a sample size is small, small random changes have a large apparent effect on the analysis of the data. What follows is of necessity a bit long but it is essential that patients who are reading medical journal abstracts and articles have a good understanding of this principle. Many of the worst examples of healthcare misunderstanding result from a failure to recognise the importance of this principle.
When we measure health outcomes, we generally look to the rate at which events occur. Medically this is referred to as the incidence and is usually expressed as a percentage or a rate per unit population. This allows us to compare one population with another of a different size: for example, the incidence of hip fractures might be compared between Auckland and Nelson. These areas have very different numbers of people so there will certainly be more fractures in Auckland, but the rate per unit population may be the same.
The law of small numbers means that comparing the rates of a relatively uncommon event like hip fracture is fraught with difficulty in New Zealand. Precisely because Nelson has a relatively small population, relatively few hip fractures occur there. Random events occur unpredictably - sometimes there are more and sometimes fewer, by chance alone. In a small population, a small change in the absolute number of events has a large effect on the rate, which we calculate by dividing the number of events in a given time by the population size.
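To make the arithmetic concrete, here is a small sketch with made-up population figures (not real Auckland or Nelson data): the same chance swing of three extra fractures barely moves a large region's rate but shifts a small region's rate noticeably.

```python
def rate_per_100k(events, population):
    """Incidence rate per 100,000 people: events divided by population size."""
    return events / population * 100_000

# Hypothetical figures for illustration only - both regions start at
# exactly 60 fractures per 100,000 people.
big_pop, small_pop = 1_600_000, 50_000
big_events, small_events = 960, 30

for extra in (0, 3):  # three extra fractures, by chance alone
    big = rate_per_100k(big_events + extra, big_pop)
    small = rate_per_100k(small_events + extra, small_pop)
    print(f"+{extra} events: big region {big:.1f}, small region {small:.1f}")
# +0 events: big region 60.0, small region 60.0
# +3 events: big region 60.2, small region 66.0
```

Three chance events move the small region's apparent rate by 10%, while the large region's barely changes.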
This is further compounded by a natural human reaction when faced with insufficient data.
If we look at the picture illustrating this article, we see one red sweet and four black ones. If this is a handful taken from a larger bag of sweets, most people would assume that there are fewer red sweets in the bag than black sweets. Of course it is remotely possible that we picked out the only four black sweets in the bag, or that there was only one red sweet in the bag. Either way, from what we know, our best guess is that 20% of the sweets are red. Daniel Kahneman (Nobel Laureate in Economics 2002) discovered that people tend to generalise from small numbers so will go with this assumption rather than think a bit deeper and consider alternative explanations for the observation.
This combination of a statistical anomaly affecting small numbers and the human tendency to use this information even though we know it may be flawed results in interesting problems.
A good example involved the Bill and Melinda Gates Foundation. It was noted that small schools were more often represented in the top 10% of performance than larger schools. This led to the foundation supporting efforts to make schools smaller. Later it was noted that small schools are over-represented in the bottom 10% as well. This is because a small number of children with notably higher or lower performance markedly affect the rate at which a small school's students appear to perform well or poorly. A closer look still shows that whilst schools with smaller numbers of pupils continue to be over-represented in the top 10%, it is not often the same schools; rather, those that happen to have a few higher-performing students briefly get a turn in the limelight.
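The school effect is easy to reproduce in a toy simulation. In the sketch below every pupil in every school is drawn from the same score distribution, so no school is genuinely better or worse, yet the small schools still crowd both the top and the bottom 10% (all numbers are illustrative assumptions, not real school data):

```python
import random

random.seed(0)

def school_mean(n):
    # Every pupil's score comes from the SAME distribution, whatever the school.
    return sum(random.gauss(100, 15) for _ in range(n)) / n

sizes = [25] * 200 + [400] * 200          # 200 small and 200 large schools
schools = sorted((school_mean(n), n) for n in sizes)

bottom_10pct = schools[:40]
top_10pct = schools[-40:]
small_in_top = sum(1 for _, n in top_10pct if n == 25)
small_in_bottom = sum(1 for _, n in bottom_10pct if n == 25)
print(small_in_top, small_in_bottom)      # small schools dominate both tails
```

Because a small school's average bounces around far more from year to year, chance alone pushes small schools to both extremes of any league table.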
In healthcare this can lead to serious errors, both in government policy and in the management of illness.
In theory, statistical tests take the size of a sample into account and so correct to a degree for the law of small numbers, but when the sample size is small, even ordinary random variation in events can appear to be significant.
In medical studies, law of small numbers errors are more common if:
- The sample size is smaller
- The outcome being measured occurs less commonly
- The outcome being measured is subjective (like pain, which cannot be measured exactly)
Regardless of the effect being studied, this problem leads to many real effects being discounted and many spurious events being recorded. As positive results are much more likely to be published than negative results, this primarily leads to over-estimation of effects in the medical literature.
When you are reading the abstract of a paper, it is therefore very important to bear in mind the size of the sample. As a rule, studies with a sample size of less than 500 should be treated with great caution regardless of who was involved in the study. Using such data requires a good deal of expertise and a thorough understanding of both the complete study methodology and other studies in the field. From the perspective of a patient reading about their healthcare, the safest approach is to treat small studies with considerable scepticism.
When considering a statistical result, it is important to understand that chance has a much more substantial effect on outcomes when sample sizes are smaller.
In this image, for instance, we might say that 1 in 5 sweets are red, as we can see four black sweets and one red sweet. A larger picture might well tell a quite different story - perhaps the red sweet is the only one among many black sweets, perhaps there are many red sweets, or perhaps only two. There is no way to tell for sure from this picture alone. Similarly, it is very important to remember that the people studied in any medical trial can only ever be part of the picture, and that inferring information from small snapshots of people is hazardous.
Daniel Kahneman (Nobel Laureate in Economics 2002) discovered that people tend to generalise from small numbers. His work suggests that, told this picture shows a handful from a bag of sweets, many if not most people would conclude that someone prefers to eat the red sweets, rather than saying it is impossible to tell how many sweets of either colour there may be. This is a key problem when interpreting medical studies - we see the words 'significant difference' and forget to check how large a sample was tested. The group being studied is to the population at large as our picture is to the whole bag of sweets from which it came.
The problem is that there are statistical methods that will determine significance on almost any sample size, as they are simple mathematical tools and not intended in themselves to make judgements of worth. This means that many studies with very small sample sizes claim a significant result even when doing so is plainly illogical. Peer review does not prevent this, as reviewers assume a trained audience who should be aware of the importance of sample size, and may feel a paper is worthy of publication because it might inform further research.
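A quick sketch of why this matters: if we run many tiny "trials" in which there is genuinely no effect at all, a standard 5% significance threshold will still declare roughly one in every twenty of them significant. The test below is a simple known-variance z test, chosen purely for illustration:

```python
import math
import random

random.seed(1)

def fake_trial(n):
    """Two groups drawn from the SAME distribution: any 'effect' is pure chance."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    diff = sum(a) / n - sum(b) / n
    z = diff / math.sqrt(2 / n)       # known-variance z statistic
    return abs(z) > 1.96              # 'significant' at the 5% level

trials = 2000
false_positives = sum(fake_trial(10) for _ in range(trials))
print(f"{false_positives} of {trials} null trials looked significant")
```

Around a hundred of the two thousand null trials come out "significant" - a reminder that a significance test measures chance at a fixed rate, not truth.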
So when you are reading a medical paper, look for the number of patients in the trial. This may be clearly stated, or represented briefly as n= (however many were in the trial). Not all abstracts include this information and we strongly recommend that you do not make use of abstracted data that does not include a sample size. This is often the sign of a lower quality paper in any case.
How many is enough? That depends on what you are looking to find, how different the change is likely to be from the background variation found in the population at large, and how likely the change being sought is to occur. A very obvious single event that is rare in the population at large, but that might be expected to occur frequently with an intervention, might need only a small number in the study; more complex problems need larger populations.
By and large, the more complex the condition being studied, the larger a trial needs to be in order to be relevant. Mood is constantly changing and very susceptible to many influences, so detecting the effect of a new antidepressant, for instance, requires a good-sized sample or very careful case-control matching. Similarly, body weight change is multi-factorial, and so many hundreds of patients are needed for meaningful studies.
It is also important to understand the context of the trial. A study to determine whether a new medication lowers blood pressure might need only a few participants - perhaps as few as 40 to be useful, but the same drug would need a trial of hundreds, possibly thousands to prove that it was better than another drug in actually preventing heart disease or stroke.
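One way to see why the context changes the numbers so much is the standard normal-approximation formula for comparing two group means, which gives roughly n ≈ 2(z_α/2 + z_β)²(σ/δ)² patients per group at 5% significance and 80% power. The figures below are purely illustrative and are not taken from any real trial:

```python
import math

def n_per_group(sigma, delta):
    """Rough normal-approximation sample size per group for comparing two means.

    sigma: background variability of the outcome
    delta: size of the difference we hope to detect
    Uses z = 1.96 (two-sided 5% significance) and z = 0.84 (80% power).
    """
    return math.ceil(2 * (1.96 + 0.84) ** 2 * (sigma / delta) ** 2)

# A large, easy-to-see effect needs few patients...
print(n_per_group(sigma=10, delta=10))   # 16 per group
# ...a small effect against the same background noise needs many.
print(n_per_group(sigma=10, delta=2))    # 392 per group
```

Halving the effect you are looking for roughly quadruples the sample you need, which is why "does it lower blood pressure?" and "does it prevent strokes?" demand such different trial sizes.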
As a useful guide, studies with fewer than 100-200 patients are very unlikely to be of value to non-professional readers, and studies with fewer than 50 patients are seldom of more than passing interest. For more complex conditions with continuous outcome measures (such as weight loss studies) even larger numbers may be needed.
The bottom line is that small scale clinical trials are not best used to inform clinical decisions or to support arguments; they are instead indicators of further research needs and hints at the direction that further research might take. Pointers and possibilities, not certainties. Even really well conducted trials with robust scientific methodology can fall foul of the law of small numbers. They are best treated with extreme caution.
Have a Piece of Candy: Kasia