In an earlier post I observed that “Seixon does not understand sampling”. Seixon removed any doubt about this with his comments on that post and two more posts. Despite superhuman efforts by several qualified people in comments to explain sampling to him, Seixon has continued to claim that the sample was biased and therefore “that the study is so fatally flawed that there’s no reason to believe it.”

I’m going to show, without numbers, just pictures, that the sampling was not biased and what the effect of the clustering of the governorates was.

Let’s look at a simplified example. Suppose we have three samples to allocate between two governorates. Governorate A has twice as many people as Governorate B, so if they are not paired up, A gets two samples and B gets one sample. (This is called stratified sampling.) If they are paired up using the Lancet’s scheme, then B has a one in three chance of getting all three samples, otherwise A gets them. (This is called cluster sampling.) Seixon claims that this method introduces a bias and that what they should have done was allocate the three samples independently, with B having a one third chance of getting each sample. (So that, for example, B has a (1/3)x(1/3)x(1/3) = 1/27 chance of getting all three. This is called simple random sampling.)

We can see the difference each of these three procedures makes by running some simulations. I used a random number between 1 and 13 as the result of taking a sample in governorate A and one between 1 and 6 for governorate B and ran the simulation a thousand times. The first graph shows the results for stratified sampling. The horizontal lines show the distribution of the results. 95% of the values lie between the top and bottom lines, while the middle one shows the average.

stratified sampling
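For anyone who wants to try something like this themselves, here is a minimal Python sketch of the three procedures described above. It rests on assumptions that the post does not spell out: each "sample" is a uniform random integer, the result of a run is the plain average of the sampled values, and the 95% band is read off the sorted results. The function names are illustrative, and this is a reconstruction rather than the code behind the graphs.

```python
import random

def draw(gov, rng):
    # Assumption: a uniform integer, 1..13 in governorate A and 1..6 in B.
    return rng.randint(1, 13) if gov == "A" else rng.randint(1, 6)

def pick_governorate(rng):
    # B holds one third of the population, so it is picked with probability 1/3.
    return "B" if rng.random() < 1 / 3 else "A"

def stratified(rng):
    # Not paired: A always gets two samples and B always gets one.
    values = [draw("A", rng), draw("A", rng), draw("B", rng)]
    return sum(values) / len(values)

def clustered(rng, n=3):
    # Paired: one governorate is picked once and receives all n samples.
    gov = pick_governorate(rng)
    values = [draw(gov, rng) for _ in range(n)]
    return sum(values) / len(values)

def simple_random(rng, n=3):
    # Each of the n samples is allocated to a governorate independently.
    values = [draw(pick_governorate(rng), rng) for _ in range(n)]
    return sum(values) / len(values)

def summarize(procedure, runs=1000, seed=0):
    # Returns (average, lower line, upper line); about 95% of the
    # simulated results lie between the lower and upper values.
    rng = random.Random(seed)
    results = sorted(procedure(rng) for _ in range(runs))
    lower = results[int(0.025 * runs)]
    upper = results[int(0.975 * runs) - 1]
    return sum(results) / runs, lower, upper
```

Calling `summarize(stratified)`, `summarize(clustered)` and `summarize(simple_random)` gives the average and the 95% band for each scheme; `summarize(lambda rng: simple_random(rng, n=2))` does the same for simple random sampling with two samples.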

The second one shows the result of cluster sampling. Notice that the average is the same as for the first one. An unbiased sample is, by definition, one that gives the right answer on average, so this shows that the sample is not biased. However, the top and bottom lines are further apart: the effect of using cluster sampling instead of stratified sampling is to increase the variation of the samples.

cluster sampling

The third one shows the result of simple random sampling. The average is the same as the previous two. There is less variation than for cluster sampling.

simple random sampling n=3

The last graph shows simple random sampling but with two samples instead of three. The average is the same as for the others, and the amount of variation is about the same as for cluster sampling. In other words, the result of cluster sampling is just like simple random sampling with a smaller sample size. The ratio of the sample sizes for which cluster sampling and simple random sampling give the same variation is called the design effect. In this case it is roughly 3/2 = 1.5. In our example governorate A was quite different from governorate B (samples from A were on average twice as big). If A and B were more alike then the design effect would be smaller. That is why they paired governorates that they believed were similarly violent. If the governorates that they paired were not similar, it does not bias the results as Seixon believes, but it does reduce the precision of the results, increasing the width of the confidence interval.

simple random sampling n=2
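The design effect for this toy example can also be worked out exactly instead of being read off the graphs. The sketch below does so under the same assumptions as the simulation (uniform integers from 1 to 13 for A and 1 to 6 for B, B chosen with probability 1/3, and the estimate taken as the plain average of the samples). The variable and function names are illustrative only. It splits the variance of a single draw into a within-governorate part and a between-governorate part; in cluster sampling the between part is not reduced by taking more samples, because all of them land in the same governorate.

```python
from fractions import Fraction as F

def uniform_mean_var(k):
    # Mean and variance of a uniform integer on 1..k.
    return F(k + 1, 2), F(k * k - 1, 12)

mean_a, var_a = uniform_mean_var(13)  # governorate A
mean_b, var_b = uniform_mean_var(6)   # governorate B
p_a, p_b = F(2, 3), F(1, 3)           # population shares

# Every scheme has the same expected value, so none of them is biased.
overall_mean = p_a * mean_a + p_b * mean_b

# Variance of one randomly allocated draw, split into two parts.
within = p_a * var_a + p_b * var_b
between = p_a * (mean_a - overall_mean) ** 2 + p_b * (mean_b - overall_mean) ** 2

def var_srs(n):
    # n independent draws: both parts shrink with n.
    return (within + between) / n

def var_cluster(n):
    # n draws from one randomly chosen governorate: only the
    # within part shrinks with n.
    return within / n + between

# Two draws from A and one from B, averaged.
var_stratified = (2 * var_a + var_b) / 9

# Ratio of cluster to simple random variance at n = 3: 95/67, about 1.42.
design_effect = var_cluster(3) / var_srs(3)
```

Under these assumptions the variances come out in the same order as the graphs suggest: stratified sampling is smallest, then simple random sampling with three samples, then cluster sampling, with simple random sampling with two samples slightly larger still.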

Seixon offers one more argument against clustering: if clustering is valid, why not put everything into just one cluster? The answer is that although that would not bias the result, it would increase the design effect so much that the confidence intervals would be too wide for the results to mean anything.

This article by Checchi and Roberts goes into much more detail on the mechanics of conducting surveys of mortality. (Thanks to Tom Doyle for the link.)