In Intuitive Biostatistics (1995), Harvey Motulsky writes:
"With large samples, even very small differences will be statistically significant. Even if these differences reflect true differences between the populations, they may not be interesting. You must interpret scientific or clinical importance by thinking about biology or medicine. For example, few would find a mean difference of 1 mmHg in blood pressure to be clinically interesting, no matter how low the P value. It is never enough to think about P values and significance. You must also think scientifically about the size of the difference."
I made some arguments along these lines in the previous post in this series.
Now let's look at some real data. While I was studying the mating of the land snail Oxyloma retusum, I observed the matings of 20 pairs of snails. In 13 pairs the smaller snail was on top, while in 7 pairs the larger snail was on top. Did the snails position themselves randomly with respect to each other or was the difference between the positioning of the snails statistically significant?
I did a chi-square "goodness of fit" test against an expected 50:50 split and got a P value of 0.18. Since the P value was greater than the "magic" cut-off of 0.05 (5%), I could not reject the null hypothesis that the positioning of the snails was random.
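The arithmetic behind that P value can be sketched in a few lines of Python. This is a minimal sketch, not necessarily how the test was originally run: the function name `chi_square_two_groups` is made up for illustration, and it assumes an expected 50:50 split with no continuity correction. With only two categories the test has 1 degree of freedom, so the P value can be computed from the standard library's `math.erfc`.

```python
import math

def chi_square_two_groups(count_a, count_b):
    """Chi-square goodness-of-fit test against an expected 50:50 split.

    Hypothetical helper for illustration; no continuity correction is applied.
    """
    total = count_a + count_b
    expected = total / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # With two categories there is 1 degree of freedom, and the upper-tail
    # probability of a chi-square variable with 1 df is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# 13 pairs with the smaller snail on top, 7 with the larger snail on top.
chi2, p = chi_square_two_groups(13, 7)
print(f"chi-square = {chi2:.2f}, P = {p:.2f}")
```

For the 13 versus 7 split, the chi-square statistic is 1.8 and the P value rounds to 0.18.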
To better understand what Motulsky meant in the above quote, let's assume I had a larger sample of 100 pairs, and that in 65 of them the smaller snail was on top, while in 35 the larger snail was on top (note that the ratio of the two groups is unchanged: 13/7 = 65/35 ≈ 1.857). Now the P value is about 0.003, so I can reject the null hypothesis and conclude that the positioning of the snails was not random.
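To see how the P value depends on sample size alone, the same 13:7 ratio can be tested at several sample sizes. The sketch below is only illustrative: the helper `p_value_for_split` is a made-up name, the sample sizes of 40 and 200 pairs are hypothetical additions of mine, and the test again assumes an expected 50:50 split with no continuity correction.

```python
import math

def p_value_for_split(smaller_on_top, larger_on_top):
    """P value of a 1-df chi-square goodness-of-fit test against 50:50.

    Hypothetical helper for illustration only.
    """
    expected = (smaller_on_top + larger_on_top) / 2
    chi2 = ((smaller_on_top - expected) ** 2
            + (larger_on_top - expected) ** 2) / expected
    return math.erfc(math.sqrt(chi2 / 2))

results = []
# 20 and 100 pairs come from the text; 40 and 200 are made-up extras.
for n_pairs in (20, 40, 100, 200):
    smaller = n_pairs * 13 // 20   # keep the 13:7 ratio at every sample size
    larger = n_pairs - smaller
    p = p_value_for_split(smaller, larger)
    results.append((n_pairs, smaller, larger, p))
    print(f"{n_pairs:>4} pairs: {smaller} vs {larger}, P = {p:.4f}")
```

The ratio never changes, yet the P value keeps shrinking as the number of pairs grows, which is exactly the effect Motulsky describes.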
Now, let's apply the chi-square test to each set of the following made-up numbers.
|Total||Group 1||Group 2||P|
A difference of 100 out of 1000 could indeed have biological significance. But what about a difference of 70 out of 1000? Could that also be biologically significant? I don't know. It depends. When it comes to biological significance, there are no magic numbers, laws or rules. Decisions have to be made on a case-by-case basis.
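For readers who want to check the statistical side of such differences themselves, here is a rough sketch. It rests on an assumption of mine: that "a difference of 100 out of 1000" means a 550 versus 450 split, "70 out of 1000" means 535 versus 465, and "10 out of 100" means 55 versus 45. The helper name `p_value_for_split` is also made up, and the test is the same 1-df chi-square test against a 50:50 split, without continuity correction.

```python
import math

def p_value_for_split(group_1, group_2):
    """P value of a 1-df chi-square goodness-of-fit test against 50:50.

    Hypothetical helper for illustration only.
    """
    expected = (group_1 + group_2) / 2
    chi2 = ((group_1 - expected) ** 2 + (group_2 - expected) ** 2) / expected
    return math.erfc(math.sqrt(chi2 / 2))

# Assumed splits (my reading of the differences discussed in the text).
assumed_splits = [(550, 450), (535, 465), (55, 45)]

p_values = {}
for group_1, group_2 in assumed_splits:
    p = p_value_for_split(group_1, group_2)
    p_values[(group_1, group_2)] = p
    verdict = "below" if p < 0.05 else "above"
    print(f"{group_1} vs {group_2}: P = {p:.4f} ({verdict} the 0.05 cut-off)")
```

Under these assumptions the two splits of 1000 fall below the 0.05 cut-off and the split of 100 does not, but none of that settles whether any of the differences matters biologically.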
And why is a difference of 10 out of 100 not biologically significant? In borderline or ambiguous cases like these, the best thing to do is probably to repeat the study, many times if possible. If you had 10 samples of 100 snails each, and the difference in each sample was about 10, you would be more confident in declaring that the repeatedly observed difference was real and biologically significant.