An article by Howard Wainer [1] in the May-June 2007 issue of American Scientist is about the equation in the title of this post: the standard deviation of a sample (S.D.) divided by the square root of the sample size (N) gives the standard error of the sample mean (S.E.).
If you collect many samples from a population and calculate the mean of a certain variable for each sample, you will probably get as many different sample means as there are samples. For reasonably large samples, the sample means will be approximately normally distributed, and the standard deviation of that distribution is the standard error, which the equation above estimates from a single sample.
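For readers who want to try the calculation, here is a minimal sketch in Python. The function name `standard_error` and the numbers in `sample` are made up for illustration and do not come from Wainer's article.

```python
import math
import statistics

def standard_error(sample):
    """Estimate the standard error of the mean from a single sample."""
    # statistics.stdev uses the N - 1 denominator (sample standard deviation).
    return statistics.stdev(sample) / math.sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up numbers, for illustration only
print(standard_error(sample))
```

Because N sits under a square root, quadrupling the sample size only halves the standard error, all else being equal.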
The standard deviation of a sample is an estimate of the variability of the population from which the sample came, whereas the standard error is not a direct measure of the variability of the population. This is because the standard error depends on sample size: the larger the sample size, N, the smaller the standard error. To demonstrate this, I have carried out a simulation. From a "population" of 500 normally distributed numbers, I randomly picked 10 samples each of 10, 20, 30 and 50 numbers, 5 samples of 100 numbers each and 4 samples of 150 numbers each. The plot below shows the distribution of the sample means as a function of sample size. You can see how the scatter of the sample means around the population mean (9.97, indicated by the red horizontal line; the population S.D. was 1.015) decreases as the sample size increases.
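A simulation along these lines can be sketched in a few lines of Python. This is not the code behind the plot; the generating mean of 10 and S.D. of 1, the seed, and sampling without replacement are all assumptions chosen to be consistent with the population values reported above.

```python
import math
import random
import statistics

# Sample size -> number of samples of that size, as described in the post.
PLAN = {10: 10, 20: 10, 30: 10, 50: 10, 100: 5, 150: 4}

def simulate(seed=0, pop_size=500, mu=10.0, sigma=1.0):
    """Build a finite population and draw samples from it per PLAN."""
    rng = random.Random(seed)  # seed is arbitrary; mu and sigma are assumed
    population = [rng.gauss(mu, sigma) for _ in range(pop_size)]
    means = {
        # rng.sample draws n distinct members, i.e. without replacement
        n: [statistics.mean(rng.sample(population, n)) for _ in range(k)]
        for n, k in PLAN.items()
    }
    return population, means

population, means = simulate()
pop_sd = statistics.pstdev(population)
for n, sample_means in means.items():
    observed = statistics.stdev(sample_means)  # scatter of the sample means
    predicted = pop_sd / math.sqrt(n)          # S.D. / sqrt(N)
    print(f"N={n:3d}  observed spread={observed:.3f}  S.D./sqrt(N)={predicted:.3f}")
```

With only 4 to 10 samples per size, the observed spread is itself a noisy estimate, so it should track S.D./sqrt(N) only roughly. For the largest samples, drawing without replacement from just 500 numbers will tend to pull the spread somewhat below the formula's value.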
The take-home lesson from Wainer's article is that the mean values of small samples are likely to vary more than those of larger samples, and that anyone drawing conclusions based solely on small samples, whether purely scientific or with political and social implications, must keep this in mind.
One example Wainer discusses in detail involves the claims put forward in the 1990s that smaller schools are better than larger schools. In reality, data seem to show that not just the schools with the highest performances, but also those with the lowest performances are more likely to be small, because of increased variation at small sample sizes. Another example is about an insurance company's ranking of the safest and least-safe cities in the U.S. that did not include any of the largest U.S. cities. The reason is that the safety ratings of the largest cities are closer to the national average, while smaller cities, perhaps mostly by chance, are more likely to be much better or much worse than the average.
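The small-schools effect is easy to reproduce with made-up numbers. In the sketch below every student's score comes from the same distribution, so no school is truly better than any other. The school counts, school sizes and the score scale (mean 500, S.D. 100) are hypothetical and are not Wainer's data.

```python
import random
import statistics

def rank_schools(seed=0, n_small=200, n_large=200, small_size=25, large_size=400):
    """Return (mean score, label) pairs sorted from lowest to highest mean."""
    rng = random.Random(seed)
    schools = []
    for label, count, size in (("small", n_small, small_size),
                               ("large", n_large, large_size)):
        for _ in range(count):
            # Every student is drawn from the same hypothetical distribution.
            scores = [rng.gauss(500, 100) for _ in range(size)]
            schools.append((statistics.mean(scores), label))
    schools.sort()
    return schools

schools = rank_schools()
bottom, top = schools[:20], schools[-20:]
print("small schools in bottom 20:", sum(label == "small" for _, label in bottom))
print("small schools in top 20:", sum(label == "small" for _, label in top))
```

Under these assumptions the standard error of a small school's mean is four times that of a large school's, so small schools should crowd both ends of the ranking even though school size has no effect on any student's score.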
[1] Howard Wainer. The most dangerous equation. American Scientist, May-June 2007, pp. 249-256.