Thu 22 Sep 2005
Daniel Davies has a new post on Lancet denial, with some particularly egregious examples. The worst example is by Harry of Harry’s Place whose “discussion” of the study is to make a statement that he must surely know to be false:
Dsquared is a serial bullshitter who has never given a straight answer to any question.
Davies also links to a transcript by Seixon of the Hitchens-Galloway debate, where Seixon touts his own debunking of the Lancet study. Seixon’s debunking fails because he makes basic errors in his statistics, but at least they are original, so let’s look at where he goes wrong:
Dr. Les Roberts removed 6 provinces from being in the sample. That means that every single household in those 6 provinces was purposefully given a probability of 0% of being chosen for the sample. This violates the principle of randomness, thus violating the principle of statistics that you have a random sample. … The study’s results are based on a biased sample. Resting upon this fact alone, the study’s results cannot be claimed to be accurate, nor should they be trusted as accurate. Dr. Les Roberts and anyone else cannot argue this simple point, because then they will have to take on the vast body of statistical literature and theory looming over the credibility of this study.
Unfortunately, Seixon does not understand sampling. The sample was not biased by the exclusion of six randomly chosen provinces since each household in Iraq was equally likely to be chosen by the sampling procedure. Seixon could just as well argue that all surveys are biased because, after the sample has been randomly selected, each person outside the sample has a 0% chance of being selected.
In response to the study’s statement that “Most individuals reportedly killed by coalition forces were women and children”, Seixon offers this:
46% men, 46% children, 7% women, and 1% elderly. With this in mind, try to finish this sentence: Most individuals reportedly killed by coalition forces were _______ and __________. In a world where scientists don’t try to hoodwink their readers, the correct answer would be “men” and “children”. Yet they chose “women” and “children”, even though out of the 4 groups, women were #3, and they put two groups into their sentence. I guess they were hoping no one was going to actually read the rest of the study and find out that they are misleading liars. Seriously, what is the point of misleading in this fashion? Could it be… a political agenda?
So even though the statement is true, Seixon insists that it is somehow a lie. The fact that seems to have escaped Seixon is that Roberts et al. grouped women and children together because they are likely to be non-combatants.
September 22nd, 2005 at 3:39 am
I think his confusion is not the confusion you attribute to him. The giveaway is earlier in his post where he says: “The reason is that all of the clusters from the one were transferred to the other, based on one random selection.” He is misreading this:
Incidentally the guy’s name is evidently George Gooding.
September 22nd, 2005 at 7:00 am
Oh boy, you screwed up, mostly because you took what I said out of context.
Look, the Johns Hopkins team did a real random sample to begin with. At that point, 17 of the 18 provinces had households distributed to them, a true random sample.
Then they decided that they needed to cut down the number of provinces they were going to travel to, so they paired up 12 provinces, so that 6 provinces were stripped of their households.
Thus, as I said, the sample is no longer random.
Kevin, you also don’t seem to get this. They didn’t redistribute all those households randomly individually. They took all the households in a pair, then randomly chose one of the provinces, and gave all the households to that one. Thus, they didn’t randomly distribute each household, so it is no longer a random sample.
You can’t have a random sample when you purposely remove 6 provinces from your original random sampling.
Tim, you screwed up most here:
“Seixon could just as well argue that all surveys are biased because after the sample has been randomly selected each person outside the sample has a 0% chance of being selected.”
Well, obviously you didn’t read the study. They did do a random sample to begin with. If they had left it, then I would have nothing to talk about. At that point, all households in Iraq had an equal chance of being selected. THEN, they started tampering with their random sample, and removed households from 6 provinces and put them in 6 others.
Try not taking things out of context, and try not misrepresenting what others are saying.
I understand statistics quite well, and the fact that no statisticians have been asked to comment on this, besides the one I quoted in my post who happens to agree with me, is quite a telling sign.
I’m sorry, but this Lancet study is all washed up, and the only ones in denial are the ones who still believe it is worth the time it takes to read it.
September 22nd, 2005 at 7:14 am
And Kevin, what in the world was the purpose of printing my name? Trying to scare me into silence for my dissent?
How did I misread that passage? I misread it because my name is so and so? Haha. Brilliant.
September 22nd, 2005 at 7:15 am
I could be wrong, but I imagine that reliable numbers are hard to get on this and that any study is bound to be imperfect. I would not be surprised if the study was fairly accurate. However, I am curious to know how “children” was defined. “Reportedly” is a pretty slippery word too.
September 22nd, 2005 at 7:24 am
Please tell me how, regardless of the methodology of the Lancet study, you can do the following:
Discount a study that has 20 times the sample of the Lancet one, and 26% more coverage?
Innumerates? Give me a break Tim, you are grasping at straws. Don’t lump me in with all those who took Kaplan’s word for it, I have already stated that his “debunking” was severely lacking. Anyone who believes his article blindly is innumerate. I have proposed something entirely different, and it is undeniable. You can’t hide from it, so you try to steamroll right past everything.
September 22nd, 2005 at 7:26 am
I see children were defined as under 15, that seems reasonable.
September 22nd, 2005 at 7:44 am
Seixon,
A sample of households is random if, at the outset, each household has an equal probability of being selected. If my raffle ticket is blue and a red ticket is drawn from the hat, it is immediately apparent that my probability of winning has fallen to zero. That does not mean the draw was unfair. Notice that the probability distribution changes while the process of drawing the winning ticket is still going on. We started with all tickets having an equal chance of being the winning ticket. Now all red tickets have an equal probability and all others have probability zero. Finally the distribution collapses to a single ticket with probability one. Nothing is changed by the fact that the probability distribution alters during the selection process. It is nonetheless a random draw. Likewise the Lancet team’s approach.
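The raffle argument can be checked with exact arithmetic. The sketch below is only an illustration: the colours and ticket counts are invented, and the two-stage draw (colour first, in proportion to the tickets sold in that colour, then one ticket within the colour) is an assumed way of modelling "a red ticket is drawn" as a partial reveal of a single fair draw.

```python
from fractions import Fraction

# Hypothetical raffle: ticket counts per colour are invented for illustration.
tickets_per_colour = {"red": 30, "blue": 50, "green": 20}
total = sum(tickets_per_colour.values())

def two_stage_probability(colour):
    """Chance that one particular ticket of `colour` wins when the colour is
    drawn first (proportional to tickets sold) and then a ticket within it."""
    p_colour = Fraction(tickets_per_colour[colour], total)
    p_ticket_given_colour = Fraction(1, tickets_per_colour[colour])
    return p_colour * p_ticket_given_colour

overall = {c: two_stage_probability(c) for c in tickets_per_colour}
# Once "red" is announced, a blue ticket's conditional chance is zero,
# yet its chance at the outset was the same as every other ticket's.
print(overall)
```

Every ticket comes out with the same overall chance of one in `total`, even though the conditional chances change as soon as the colour is known.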
Apologies for dragging your name into it; I should have thought of the possibility that you would prefer commenters to stick to using your pseudonym. Since Tim Lambert has dealt with numerous Lancet critics, it seemed likely that your paths had crossed before you started your blog. No, rest assured that I don’t expect you to be intimidated.
September 22nd, 2005 at 8:14 am
Kevin,
Again, you are misreading me.
The Lancet team did a random sampling to start off. Now, if they had simply left it at that, I would have no qualms about that aspect of the methodology. As you said, from the original sampling they did, all households had an equal chance.
The problem started when they decided to cut 6 provinces out of the sample entirely. They did this because they wanted to cut down the amount of traveling they would have to do, supposedly for security reasons (when in fact, they went to all the most dangerous areas of Iraq, purposefully going to Fallujah, for example, which later was excluded as an outlier).
At this point, they would remove all the clusters from 6 of the provinces, and place them in 6 other provinces.
This meant that Basra’s 2 clusters were moved to Missan, based on one single drawing.
If they were to be truly randomly selected, keeping with the equal chance of being selected, they would have to do one drawing per cluster. They only did one drawing per grouping of clusters.
Thus, the entire sample became biased and 6 provinces were purposefully removed from the sample.
To put it more simply: there was 0% probability of households being in both Basra and Missan.
September 22nd, 2005 at 8:27 am
Another thing:
I said it was misleading the way they said “women and children” instead of “men and children”.
Any objective reading of the statement gives the impression that women and children were most affected. The truth, that men were the most affected of all, gets completely left out.
Thus, the statement is misleading, not a lie, as I said. You know, if you actually read what I wrote instead of shooting off misrepresentations of my work.
September 22nd, 2005 at 8:58 am
Another example, cross-posted from CrookedTimber:
Let me do another example, Lancet essentially did the following…
In a poll of 1,000 Americans, 67 from Texas are chosen to be interviewed via a random sampling of the nation. 114 are chosen in California, and so on.
Now Lancet decides that it wants to cut down on the number of states it wants to call. So they pair up California and Texas and decide that one of them will receive all the people chosen in the other.
A number between 0 and 53,000,000 is chosen at random (the combined populations of Texas and California). If the number is between 0 and 20,000,000 then Texas gets them. If it is between 20,000,000 and 53,000,000 then California gets them.
The number picked turns out to be 45,567. Texas gets all of California’s 114 selected persons.
Now Lancet will call 181 people in Texas, and 0 in California.
You have got to be smoking some serious reefer if you think this results in a randomized sample, and a representative result.
This is exactly what Lancet did in Iraq. Try to weasel your way out of that one, sirs. I’m not in denial, I know what I’m talking about, and will continue to hand you sirs your arses as long as it is necessary.
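The Texas and California numbers in this example can be worked through exactly. The sketch below uses only the figures given in the comment (67 and 114 interviews, populations of 20 and 33 million) and adds one assumption that the comment does not state: within the winning state, the 181 respondents are picked by simple random selection. It checks the per-person chance of being interviewed and nothing else; it says nothing about how variable the resulting estimate is.

```python
from fractions import Fraction

# Figures taken from the example above.
population = {"Texas": 20_000_000, "California": 33_000_000}
initial_sample = {"Texas": 67, "California": 114}
pooled = sum(initial_sample.values())  # 181 interviews go to the winner
total_pop = sum(population.values())

def inclusion_probability(state):
    """Chance that one given resident of `state` is interviewed, assuming
    (hypothetically) simple random selection within the winning state."""
    p_state_wins = Fraction(population[state], total_pop)
    p_picked_if_wins = Fraction(pooled, population[state])
    return p_state_wins * p_picked_if_wins

# Expected number of interviews per state, averaged over the single draw.
expected_interviews = {s: pooled * Fraction(population[s], total_pop)
                       for s in population}
for s in population:
    print(s, float(inclusion_probability(s)), float(expected_interviews[s]))
```

Under these assumptions a resident of either state has the same chance of being interviewed, and the expected interview counts are proportional to population. What the single draw changes is the spread of outcomes around that expectation.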
September 22nd, 2005 at 9:22 am
Seixon,
Before the random numbers were generated, a household in Dehuk, Arbil, Tamin, Najaf, Qadisiyah, or Basrah had the same probability of selection as a household elsewhere in Iraq. That no such household got selected was due to chance; they were not discriminated against. The key statement here is: “the sample remained a random national sample.” The fact that those Governorates were already out of the running when it came to choosing individual communities does not invalidate it. The mere fact that cluster assignment took place in two phases does not introduce bias.
I know it will take a while to convince you of this. I won’t have much time over the next few days, so please don’t interpret silence as assent to your position.
September 22nd, 2005 at 12:57 pm
I would hazard that I know less statistical theory than the other people commenting here, but I can’t see the problem that Seixon thinks is there. If I wanted to do a study on something in the United States, and I randomly chose either the twenty-five states alphabetically from Alabama to Missouri or the twenty-five from Montana to Wyoming, and then I do whatever else I’m planning to do to only one of those two lists, how does this invalidate my results? Of course it’s possible that the twenty-five I didn’t pick will be systematically different, but there’s no reason to assume they will be.
September 22nd, 2005 at 1:27 pm
I’m only going to make a meta-argument. The fact that Seixon thinks no statisticians commented on this is bizarre. Am I misunderstanding him? The Lancet is peer-reviewed and the validity of their statistical methodology would obviously be the focus of any reviewer’s criticism. I also recall reading an article in the Chronicle of Higher Education about this article and I think the experts in statistics all agreed the methodology was statistically sound.
One could criticize the paper’s conclusions or say that based on other evidence one thinks the true death toll falls somewhere in the lower part of their estimated range, but these technical objections seem unlikely to be important. I’m guessing the real experts would have identified these flaws right from the start. In my untutored opinion the biggest problem with the Lancet study and the UN study is precisely that they didn’t focus on the worst-hit areas like Fallujah during the time period when the US was softening it up for the final assault. This, of course, is no one’s fault. But in hindsight I wish they’d sampled several neighborhoods in Fallujah so we’d know whether the loss of life was as massive as suggested by the one neighborhood they did visit. The UN survey is no substitute–it didn’t cover the period in question. Most likely we’ll never know, unless someone does a study many years from now when it won’t have any relevance.
September 22nd, 2005 at 2:44 pm
Kevin,
You continue to obfuscate. Have you even read the study? It sure as hell doesn’t seem like it, and Mr. Lambert is sure very silent on the matter, even though I sent him an e-mail.
When you said:
“Before the random numbers were generated, a household in Dehuk, Arbil, Tamin, Najaf, Qadisiyah, or Basrah had the same probability of selection as a household elsewhere in Iraq. That no such household got selected was due to chance; they were not discriminated against.”
Actually, if you read the study, clusters were selected in all six of those when they did a random sample. So “that no such household got selected was due to chance” is a gargantuan whopper. They DID get selected. Can you go read the study please?
I guess I need to quote from the study:
“Clusters initially assigned at random:
Baghdad: 7
Ninawa: 3
Dehuk: 1
Sulaymaniya: 2
Arbil: 1
Tamin: 1
Salah al Din: 2
Diala: 2
Anbar: 1
Babil: 3
Karbala: 1
Najaf: 2
Wasit: 1
Qadisiyah: 1
Dhi Qar: 2
Muthanna: 0
Basrah: 2
Missan: 1″
See? Can you read? Or does Lambert’s immense tie to the credibility of this study keep you from firing up Adobe Acrobat to read the PDF?
Now, if they had left it like that, THAT would have been a random sample.
But NO! A random sample wouldn’t do, because they didn’t want to travel to all those provinces, for “security” reasons (which I am sure is also the reason they felt it was so necessary to venture into Fallujah).
Next came the “grouping process”.
They chose 12 provinces, on a whim, and decided that, hey, 6 of these aren’t going to have clusters any more, because we want to cut down our travel time. Why not all 18 provinces? Why not 10? Why not 8? Ah, that is left up to the God of Arbitrary, of course.
Now after this “grouping process”, suddenly Basrah, Dehuk, Arbil, Tamin, Najaf, and Qadisiyah had 0 clusters. They originally had clusters assigned to them due to the random sample. After their tampering, they had none.
Of course they said “the sample remained a random national sample”. If they didn’t, then people like you would be like deer in headlights wondering what you were going to tell me when I challenged it.
To make this even more clear: let’s say we continued with this “grouping process” that they did. Eventually, I could have all the clusters end up in Baghdad province.
I could just keep arbitrarily pairing up provinces, doing one random drawing, and give all the clusters assigned to each of the pairs to one of the provinces.
I would eventually end up having Baghdad with all 33 clusters, since Baghdad has the highest population, thus the highest probability of winning this flip-of-a-coin-winner-takes-all nonsense.
I don’t even know how to make this any more clear to you guys. I’m starting to get the feeling that you know I have a point, but don’t want to admit it.
washerdreyer,
You think you would get a representative opinion of Americans by only sampling 25 of the 50 states? Oh man… Take a statistics course, please.
Johnson,
No statisticians have commented on this “grouping process” that is in the study. Or, not that have appeared in the media, anyways. I have this from CNN:
“But Richard Peto, who is professor of medical statistics at Oxford University, cautioned AP the researchers may have zoned in on hotspots that might not be representative of the death toll across Iraq.”
www.cnn.com/2004/WORLD/me…
That is precisely what the result of the “grouping process” was. Almost all the safest areas of Iraq were ripped out of the random sample they already conducted, and the clusters were given to more dangerous areas.
Methinks that none of you actually read my entire analysis. Mr. Lambert pretends in his response to me that there were not two phases in the distribution of clusters, and so does Kevin.
There was 0% probability that both Basra and Missan would end up with a cluster. There was 0% probability that Ninawa and Dehuk both ended up with a cluster. There was 0% probability that Sulaymaniyah and Arbil both ended up with a cluster. There was 0% probability that Tamin and Salah al Din both had a cluster. There was 0% probability that Karbala and Najaf both had a cluster. There was 0% probability that both Qadisiyah and Dhi Qar had a cluster.
Come into the light, stop being in denial.
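The cluster counts and pairings quoted in this comment are enough for a rough check of what the grouping does on average. The sketch below rests on one loudly stated assumption: each pair's single draw is won with probability proportional to the initially assigned cluster counts, used here as a stand-in for population because the thread does not give population figures. It also works out the extreme case raised above, where one draw hands all 33 clusters to a single governorate.

```python
from fractions import Fraction

# Clusters initially assigned, as quoted from the study above.
initial = {"Baghdad": 7, "Ninawa": 3, "Dehuk": 1, "Sulaymaniya": 2, "Arbil": 1,
           "Tamin": 1, "Salah al Din": 2, "Diala": 2, "Anbar": 1, "Babil": 3,
           "Karbala": 1, "Najaf": 2, "Wasit": 1, "Qadisiyah": 1, "Dhi Qar": 2,
           "Muthanna": 0, "Basrah": 2, "Missan": 1}
pairs = [("Basrah", "Missan"), ("Ninawa", "Dehuk"), ("Sulaymaniya", "Arbil"),
         ("Tamin", "Salah al Din"), ("Karbala", "Najaf"), ("Qadisiyah", "Dhi Qar")]

def expected_after_pairing(a, b):
    """Expected cluster counts when one draw hands the pair's pooled clusters
    to a or b. ASSUMPTION: win probability is proportional to the initial
    counts, used here as a stand-in for population."""
    pooled = initial[a] + initial[b]
    return {a: pooled * Fraction(initial[a], pooled),
            b: pooled * Fraction(initial[b], pooled)}

expected = dict(initial)
for a, b in pairs:
    expected.update(expected_after_pairing(a, b))

# Extreme case: one draw hands all 33 clusters to a single governorate.
total = sum(initial.values())
p_baghdad_takes_all = Fraction(initial["Baghdad"], total)
print(expected == initial, p_baghdad_takes_all)
```

Under that assumption the expected count for every governorate equals its initial count, even though any single outcome leaves one member of each pair with zero clusters. The calculation addresses the average only; the cost of using fewer draws shows up as extra variability, not as a shift in the expected allocation.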
September 22nd, 2005 at 4:03 pm
Wonderful, the reply I get from Lambert?
“You don’t understand sampling.”
In my book, this is called stone-walling. Come on Mr. Lambert, show me the error of my ways, and quit misrepresenting my argument like you did in your original post.
I’ve got Kevin saying, “but, but, the study says so, so it must be true!”
Can I get more than a “you don’t understand anything” reply? Usually when you rebut someone, you tell them HOW and WHY they are wrong, not just that they are wrong.
Absent this, as I said earlier, I cannot conclude anything other than that you know I have sunk your battleship and you are lost on how to spin yourself off the deck to save your credibility.
September 22nd, 2005 at 4:48 pm
A gentle suggestion for seixon:
“It is better to remain silent and be thought a fool than to speak and remove all doubt.”
Attributed to Abraham Lincoln among others
September 22nd, 2005 at 4:58 pm
Actually, this is what I emailed Seixon
I did tell him how and why he was wrong, he just seems to have ignored that part. There certainly wasn’t a period after the word “sampling” in my email.
September 22nd, 2005 at 4:59 pm
Mark,
Why don’t you put your money where your mouth is? Am I going to actually get any rebuttals here, or are you all going to HISS me out of here for daring to challenge your “obvious” superior intellect?
Enough with the ad hominem and stone-walling guys. Let’s see you show how I am wrong instead of just sounding like a broken record.
September 22nd, 2005 at 5:25 pm
Tim:
A number between 0 and 53,000,000 is chosen at random (the combined populations of Texas and California). If the number is between 0 and 20,000,000 then Texas gets them. If it is between 20,000,000 and 53,000,000 then California gets them.
The number picked turns out to be 45,567. Texas gets all of California’s 114 selected persons.
“Now Lancet will call 181 people in Texas, and 0 in California.
You have got to be smoking some serious reefer if you think this results in a randomized sample, and a representative result.
This is exactly what Lancet did in Iraq.”
Would you agree that if a poll were taken in this context, the result would be as credible as asking New Yorkers who they are going to vote for at the next election and extending that out to the entire US population? It’s absurd, right?
So if you agree it is absurd then you would have to say the Lancet study does contain a few little problems.
One big gaping issue I have with your prized study is that in a previous post it was mentioned the researchers conducted a household-wide survey of all the inhabitants. I would be real interested in finding out how they eliminated double counting.
September 22nd, 2005 at 7:53 pm
I think the best “critique” of all time was one (perhaps apocryphal) from the late 19th century which said of some statistical study “the results don’t represent the entire population, but only a sample, and a random sample at that!”
100+ years later this is still the essence of most of the objections to the Lancet study including Seixon’s, and the innumerable innumerates who object to extrapolating a sample proportion to an entire population because “it just doesn’t seem right”.
September 22nd, 2005 at 8:45 pm
Joe and Seixon; I think your example about Texas and California would be more analogous if you were talking about grouping Wyoming and Montana, which is something that I would bet serious money at short odds a lot of pollsters do. Or North and South Dakota. Or grouping Oklahoma with Texas.
Secondly, it’s just not true that clusters were systematically moved out of safe areas into dangerous areas; most obviously the city of Najaf was not sampled despite having been the scene of heavy fighting.
September 22nd, 2005 at 9:16 pm
“The sample was not biased by the exclusion of six randomly chosen provinces since each household in Iraq was equally likely to be chosen by the sampling procedure. Seixon could just as well argue that all surveys are biased because after the sample has been randomly selected each person outside the sample has a 0% chance of being selected.”
I’d like to weigh in on this. I think Seixon has a point as regards his criticism of the sampling, though I don’t think he explains it clearly. There are biases, though it is debatable how profound their effects are.
Seixon thinks the sample was biased. Tim thinks it was not, on the basis that each household in Iraq was equally likely to be chosen by the sampling procedure. But equiprobability of each unit being chosen is not the only requirement for generating a random sample: independence is also a prerequisite.
Let me give an example. I decide to generate a sample by selecting a random starting point on a circle and then (a) toss a coin to see which direction to move in, (b) repeatedly throw a die to see how many paces to move, and (c) sample at each point I stop at.
Each point on the circle has equiprobability of being sampled; this is Tim’s point. But I haven’t made a random sample because the choices of points are not independent of each other. They are dependent, and any interdependence between the values at sampled points will bias the result. “Clumping” clusters together will have the same effect for the same reasons.
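The circle example can be made concrete with an exact calculation. The sketch below simplifies it to the first two sampled points and assumes a circle of 20 discrete positions (the size is invented); it checks only the two properties named above, equal marginal probability and lack of independence, and does not by itself settle what that dependence does to an estimate.

```python
from fractions import Fraction
from itertools import product

N = 20                    # hypothetical number of positions on the circle
STEPS = range(1, 7)       # faces of one die
DIRECTIONS = (+1, -1)     # coin toss: clockwise or anticlockwise

# Exact joint distribution of (first sampled point, second sampled point).
joint = {}
for start, direction, step in product(range(N), DIRECTIONS, STEPS):
    second = (start + direction * step) % N
    p = Fraction(1, N) * Fraction(1, 2) * Fraction(1, 6)
    joint[(start, second)] = joint.get((start, second), 0) + p

# Marginal chance that each position is the second sampled point.
marginal_second = [sum(joint.get((i, j), 0) for i in range(N)) for j in range(N)]

# Under independence P(first=0, second=10) would equal 1/N * 1/N.
p_far_pair = joint.get((0, 10), Fraction(0))
print(set(marginal_second), p_far_pair)
```

Every position is equally likely to be the second point, yet a pair of points more than six paces apart can never be drawn together, so the two draws are not independent.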
I would also like to make a point of my own. It is simply untrue that each household in Iraq was equally likely to be chosen by the sampling procedure. Locating each sample within a governorate was performed by: (a) drawing a map, and (b) generating a random coordinate on that map.
This finds a random point in space: but does not find a random household. The method will oversample households in areas of low population density and undersample them in areas of high population density. Any link between mortality and population density will bias the result. Because there are marked differences in population density the sampling will be seriously skewed.
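A toy calculation shows the mechanism being described. Everything in the sketch below is an invented assumption: a one-dimensional "map" (a 10 km road), nine households packed into the first kilometre, three spread over the rest, and a rule that the household nearest to a uniformly random point is the one sampled. It is not a model of the study's actual field procedure.

```python
from fractions import Fraction

# Hypothetical one-dimensional "map": a road from km 0 to km 10.
ROAD_START, ROAD_END = Fraction(0), Fraction(10)
dense = [Fraction(k, 10) for k in range(1, 10)]   # 9 households in the first km
sparse = [Fraction(3), Fraction(6), Fraction(9)]  # 3 households over the rest
households = sorted(dense + sparse)

def selection_probabilities(points):
    """P(household is the nearest one to a uniformly random point on the road).
    Each household 'owns' the stretch of road closer to it than to any other."""
    probs = []
    for i, x in enumerate(points):
        left = ROAD_START if i == 0 else (points[i - 1] + x) / 2
        right = ROAD_END if i == len(points) - 1 else (x + points[i + 1]) / 2
        probs.append((right - left) / (ROAD_END - ROAD_START))
    return probs

probs = selection_probabilities(households)
p_dense = sum(probs[:len(dense)])    # share of draws landing on the dense km
p_sparse = sum(probs[len(dense):])
print(float(p_dense), float(p_sparse))
```

In this toy layout the nine densely packed households, three quarters of the total, receive well under a quarter of the draws. Whether that matters for a mortality estimate depends on whether death rates differ between dense and sparse areas, which the sketch does not address.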
As to how serious these errors are in affecting the result, there’s no way I can quantify this. But bias exists and we have to be careful in interpreting the result. I don’t think this means the Iraq study is worthless; it clearly was and remains a substantial contribution to knowledge as the only study of excess mortality resulting from the Iraq War.
I do think some of you have assumed that because most criticisms of the study are innumerate, then all of them are.
September 22nd, 2005 at 9:55 pm
“The researches of many commentators have already thrown much darkness on this subject, and it is probable that if they continue we shall soon know nothing at all about it” - Mark Twain.
It’s probably an unreasonable expectation of the blogosphere but I’d welcome a bit more rising above the ad hominem from both sides here. I am a lazy bear of very little brain and am having a hard enough time counting on my fingers and toes so I can do without the finger-pointing, tu-quoques and outrage.
I have to say I don’t find Seixon’s criticisms of The Lancet study particularly persuasive. I don’t think their methodology is unreasonable nor do I find their presentation of the data misleading or dishonest. The study is filled with caveats and seems to go out of its way to say that the data sample is limited, may be unrepresentative and calls for further verification.
In the absence of a larger and less extrapolated survey, this doesn’t seem unreasonable. I’d certainly like to hear more about “the study that has 20 times the sample of the Lancet one, and 26% more coverage”.
I also want to hear more about how this relates to the survey “Combat Duty In Iraq And Afghanistan” content.nejm.org/cgi/cont… in the New England Journal of Medicine 1 July 2004, particularly Table 2 concerning responsibility for the death of noncombatants: content.nejm.org/cgi/cont…
In this interview www.socialistworker.co.uk… Dr Les Roberts claims the NEJM results back up The Lancet study. Is this fair?
September 22nd, 2005 at 10:45 pm
Wait, the “randomly” excluded provinces just happened to be in Kurdish and Shi’ite controlled areas where the infrastructure doesn’t get bombed as much and acts of terrorism are few and far between?
September 22nd, 2005 at 10:52 pm
[Wait, the “randomly” excluded provinces just happened to be in Kurdish and Shi’ite controlled areas where the infrastructure doesn’t get bombed as much and acts of terrorism are few and far between?]
Jet, one of those “Shi’ite controlled” areas was Najaf!
There are certainly methodological criticisms one could make of the Lancet study if one was in a statistics classroom and didn’t have to a) explain how the problem could have been corrected and b) revisit one’s suggestions for a) after being reminded that there was a bloody war on. In principle, if there was a very high correlation between death rates and population density, the results might be skewed.
However, if anyone wants to seriously claim that this invalidates the study to such a degree that it should not have been published or should not be quoted, then I would be grateful if they could provide a single (real-world) example of a piece of empirical work of any kind ever having given seriously misleading results because of a problem of this kind and explain why they think that the Lancet study was analogous. I really think that blackboard critiques of this sort are reaching.
September 23rd, 2005 at 12:05 am
dsquared,
The violence in Najaf lasted around 3 months and was limited to contained portions of the city and did not result in major infrastructure damage which is what the Lancet study said caused most (2/3?) of the deaths.
And no one is saying the study is invalid. Only that serious problems exist with it that are ignored or shouted down. The biggest problem I’d like to see addressed is the obvious bias in the study. The points Seixon brings up do support the claim of extreme bias in the survey report’s wording, i.e. 100,000 civilians (didn’t they say 1/3 of those were combat deaths?), and, even though women only accounted for 7% of deaths, grouping them in the category “women and children” so that the phrase can be used with the higher percentage of deaths of children, giving the misleading impression that the women were dying at the higher rates also, etc.
September 23rd, 2005 at 12:06 am
Nikolai, yes the pairing of the governorates means that the samples were not independent, but that does not bias the result. It is just cluster sampling again at the level of governorates and all that does is reduce the effective sample size.
The sample procedure does bias the sample towards households in relatively less densely populated areas and near the edges of towns. Only if there are large density differences within a town and large differences in mortality in dense and less dense areas will that make a significant difference. (And it seems just as likely to bias things up as down.)
I have not assumed that because most of the criticisms are innumerate they all are. I have described Seixon’s criticism as innumerate because it is.
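The claim that pairing leaves the estimate unbiased but shrinks the effective sample size can be illustrated with a small exact calculation. All numbers in the sketch below are invented, and it makes a strong simplifying assumption: every cluster reports its governorate's true death rate, so the only randomness is which governorate the clusters land in.

```python
from fractions import Fraction
from math import comb

# Hypothetical pair of governorates; every number here is invented.
p = Fraction(2, 3)                                # population share of governorate A
rate_a, rate_b = Fraction(4, 1000), Fraction(10, 1000)  # true death rates
k = 3                                             # clusters to place in the pair
true_rate = p * rate_a + (1 - p) * rate_b

def mean_and_variance(outcomes):
    """outcomes: list of (probability, estimate) pairs."""
    mean = sum(pr * est for pr, est in outcomes)
    var = sum(pr * (est - mean) ** 2 for pr, est in outcomes)
    return mean, var

# One draw: all k clusters go to A (prob p) or to B (prob 1 - p).
paired = [(p, rate_a), (1 - p, rate_b)]
# k independent draws: j clusters land in A with binomial probability.
independent = [(comb(k, j) * p**j * (1 - p)**(k - j),
                (j * rate_a + (k - j) * rate_b) / k) for j in range(k + 1)]

mean_paired, var_paired = mean_and_variance(paired)
mean_indep, var_indep = mean_and_variance(independent)
print(mean_paired == mean_indep == true_rate, var_paired / var_indep)
```

In this setup both schemes have the pair's true pooled rate as their expected value, while the single-draw scheme has `k` times the variance of the one-draw-per-cluster scheme. For this between-governorate component, the `k` clusters behave like a single observation.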
September 23rd, 2005 at 12:37 am
“I have describes Sexion’s critism as innumerate because it is.”
I recall the old adage here that about half of statisticians graduated in the bottom 50% of their class. Jeez, and I thought my spelling was shocking, despite a mild case of lysdexia….
Interesting to see that Davies has also taken what I guess we should now call the party line on the ‘24,000′ figure.
I think we can pretty much call the whole debate dead now. No new evidence will alter anyone’s view. It is deceased; ’tis an ex-parrot; ’tis ceased to be……
September 23rd, 2005 at 1:04 am
I may be satisfied that the sampling procedure itself did not necessarily introduce a mathematical bias, but I believe Seixon may have a point that the pairing process could have introduced some unfortunate subjective aspects.
Looking at Seixon’s map, I can certainly accept that a method that ended up excluding all of the northernmost and southernmost provinces of the country, including most of the Kurdish region and the until-recently significantly quieter area of the country patrolled by the British division, could well still have been a randomly derived result, but given what we know about Iraq’s ethnographics and the recent patterns of violence it was at best an unfortunate one for a study that is asserting a national character. And when a study ends up including all 6 out of the 6 most violent provinces, and only 2 out of the 6 least violent (in terms of U.S. casualties, I grant), and bases that choice on an apparently undocumented assessment of relative levels of violence by the survey team, that would also seem to be an unfortunate aspect of the method.
The obvious example would be if the pairing process had somehow ended up excluding Baghdad (which would not have happened, because the survey team apparently concluded that there was no comparable governorate to pair Baghdad with, so it was not subject to randomized elimination). Of course, if Baghdad had been excluded, this would have led to justifiable criticism that the study was not inclusive of the country’s major metropolitan area. But by the same token, a pairing method that guaranteed Baghdad’s inclusion in this way, but introduced an extra chance for exclusion of anyone living in more “pairable” provinces, could well introduce a survey bias, regardless of the mathematical soundness of the pairing technique itself.
This is not to say that the actual number of casualties does not still likely fall within the Lancet study’s probability range, but I believe you could make the case that the aspect of subjectivity introduced by the pairing process would justify the assumption that the true number was somewhere below the stated 100,000 midpoint.
September 23rd, 2005 at 1:08 am
“I have not assumed that because most of the criticisms are innumerate they all are. I have describes Sexion’s critism as innumerate because it is.”
OK. Well this innumerate has decided that the easiest way to break the stone-walling is to cut down my argument to a swift morsel of information. Here it is:
In their “grouping process”, the Lancet used one random event to distribute multiple clusters. This violates the principles of random distribution, thus no longer a random sample. For example, for Ninawa and Dehuk, a total of 4 clusters were distributed according to a single random event. For the sample to remain random, one would have to distribute the clusters at one random event per cluster.
If you deny that this is correct, then you might as well distribute all 33 clusters with a single random event, which would most likely select Baghdad, and thus Baghdad would end up with 33 clusters, the rest of the country none. This is exactly what they did in their “grouping process”, only to a larger degree.
If you keep repeating their “grouping process”, you will most likely end up with all the clusters in Baghdad also.
Mr. Lambert, if you cannot respond to this, everyone can see that you are an emperor without clothes and you will have disgraced your name as being knowledgeable about statistics.
September 23rd, 2005 at 1:30 am
Tim;
“the pairing of the governates means that the samples were not independent, but that does not bias the result. It is just cluster sampling again at the level of governates and all that does is reduce the effective sample size.”
I admit I don’t completely follow this, perhaps you could elaborate?
By my logic, dependent samples don’t systematically bias the estimate in a particular direction, so much as destroy the basis of the “marbles-in-a-jar” logic needed for statistical inference. If clusters within governorates are more similar to, or more different from, each other than clusters in other governorates then estimates and confidence intervals are all thrown off by an unknown and unknowable amount.
Re: Density and Mortality.
To paraphrase: the point has been made that you need (a) density differences, (b) mortality difference between dense and less dense areas to (c) influence the estimate. There’s been an implication that this isn’t likely.
I just want to point out that it’s been well established for a long time in geography that there are large differences in density within towns, and in public health that people in densely populated areas (i.e. the poor) have considerably higher death rates.
Dsquared says that “in principle [given] high correlation between death rates and population density, the results might be skewed.”
It actually goes a bit beyond this. There can be no correlation, but if there is higher variability in death rates between high and low density areas (i.e. densely populated areas are more likely to see multiple people wiped out by a bomb) this will throw the basis of the inference off.
It isn’t just the exact estimate I’m worried about, but how certain we can be of where it is. The reason for the paper was to make an inference about what excess mortality was and to say roughly how confident we are of where it is. Sampling problems destroy the basis for this.
September 23rd, 2005 at 1:46 am
“Most individuals reportedly killed by coalition forces were women and children.”
Actually, the sentence is pretty badly worded, since interpreted literally with the usual logical operators, it would be saying that most of the individuals killed were immature females. It would have been clearer, had they said women or children.
September 23rd, 2005 at 1:49 am
Nikolai, clustering, whether it is at the individual level or the governate level, does not bias the estimate. What it does is increase the variance of the estimate and hence reduce the precision. As is explained in the article:
How likely do you think it is that Seixon has discovered a monstrous blunder in the statistics that was missed by the authors and the referees and all the expert statisticians who have looked at the study?
September 23rd, 2005 at 2:02 am
“You can’t have a random sample when you purposely remove 6 provinces from your original random sampling.”
Well, sure you can, unless your second selection is dependent on some (nonrandom) variable.
It reminds me of the observation:
“Life is unfair. But it’s unfair for everybody; that makes it fair.”
September 23rd, 2005 at 2:03 am
Gee, Mr. Lambert, I don’t know, how likely do you think it was that blogger Brendan Loy was the first person to predict the destruction of New Orleans?
I read you because you have always addressed these kinds of debates on their merits. Arguments from authority (in this case, someone else’s authority) are generally beneath you, and should stay that way.
Nine-tenths of the “refutations” of the Lancet study were utter crap, I concede. But before Seixon’s post, I’d never seen the provinces chosen on a map, or his cross-reference of the provinces with the heaviest U.S. casualties vs the Lancet cluster provinces. These were both valuable contributions, regardless of whether the rest of Seixon’s argument holds up.
September 23rd, 2005 at 2:15 am
Z, I believe this is the portion of the Seixon argument that holds up… that the selection of the 12 provinces from the initial 18 may indeed have been non-random. The study itself never asserts that it was random:
“To lessen risks to investigators, we sought to minimise travel distances and the number of Governorates to visit, while still sampling from all regions of the country. We did this by clumping pairs of
Governorates. Pairs were adjacent Governorates that the Iraqi study team members believed to have had similar
levels of violence and economic status during the preceding 3 years.”
18 provinces were chosen. 6 were not paired, and thus included, whereas 12 were paired, and half of those left out. There is no indication from the above that the choice of whether a province ended up in the 6 or the 12 was randomized… indeed it seems to have been based on undocumented “belief” that there was a comparable adjacent province to pair with. From the above, Baghdad and Anbar clusters may have had no chance of being excluded, while Basra and all of the Kurdish provinces did. That seems an unfortunate choice of method, in retrospect.
September 23rd, 2005 at 2:20 am
Correction to my last. The first sentence, third para should read, “Clusters were chosen in each of Iraq’s 18 provinces. Of those, 6 provinces were unpaired, and their clusters automatically included, and 12 provinces were paired with a similar neighbour province, and half of those substituted out.”
Sorry, should have used preview.
September 23rd, 2005 at 2:32 am
In fact, reading Fred Kaplan’s original critique of the study, he brushes against what I am saying here, although he does not explain it adequately. I quote:
Kaplan says it is unclear how they made this calculation. Actually, it isn’t, and he should have read the study more thoroughly because it spells out how this was done.
Lambert, what you are saying ignores the fact that the clusters in 12 provinces were distributed via 6 random events, instead of one random event per cluster. This had the result of ensuring that around 25% of Iraq would be excluded from the get-go. This is how it went awry, and produced a sample bias.
Also, the last quote in my piece demonstrates that statisticians have talked about this, yet predictably they aren’t getting much press coverage for their dissent. I demonstrate:
Now, here Peto isn’t given any room to explain what he means by this, but he is talking about exactly the thing I have proven for you all right here, and in my piece back in May.
September 23rd, 2005 at 3:01 am
I think it is clear from one of Seixon’s comments at his own blog that he has simply screwed up:
Unless I am much mistaken this is just a blunder. Let B and M represent Basrah’s and Missan’s percentages of Iraq’s population. The expected number of clusters for each (1st round) is 33B and 33M, respectively. Basrah’s expected number after the second round is:
[B/(B+M)][33(B+M)] = 33B.
September 23rd, 2005 at 3:02 am
Tim;
I think I perhaps wrongly used “bias” in a loose fashion. The estimate isn’t biased in the sense of being systematically pushed up or down. But a dependent sample does damage the assumptions needed for statistical inference, and means we have less certainty over confidence intervals and the location of the estimate.
I’m afraid I’m still not sure what you mean by clustering “at the individual level or the governate level”. The technique estimates mortality rates from clusters of individuals (in households); I’m not sure what clustering at the governate level is. If it means non-random allocation of clusters between governates then this could affect the result.
“How likely do you think it is that Seixon has discovered a monstrous blunder in the statistics that was missed by the authors and the referees and all the expert statisticians who have looked at the study?”
This is an argument from authority. Seixon flagged a glitch in the sampling procedure based on an intuitive understanding of randomness. He didn’t express it particularly well, and I’m sure a lot of what he says outside this is wrong. But there have been some very clever people on this thread and others insisting that if all units are equally likely to be chosen then something is a random sample - and this just isn’t true.
I’d bet that this wasn’t missed by the authors and the referees and all the expert statisticians, but they appreciated the paper was important in spite of this, and that the sentence you quote was put in to try and at least reference the problem.
September 23rd, 2005 at 3:03 am
Ah…. going back and rereading the paper, I see what he’s getting at (maybe?). The second selection was not strictly random, since it depends on the assignment into pairs, which was done on the judgement “that the Iraqi study team members believed to have had similar levels of violence and economic status during the preceding 3 years”, and some of them were not paired. Thus, the unpaired Governates, and whatever (presumably somewhat of an outlier since they’re not paired) levels of violence and economic status they represent as well as any other variables, get a free pass through the second selection.
Of course, this adds the complication of the accuracy of the matching by violence and economic status. Any systematic bias there would show up in the end result. This leaves us in the usual position of showing that there was an opportunity for bias to leak in, without demonstrating whether there was or was not any bias.
Aside from that, if we assume the matching was accurately done, how would this affect the results? Offhand I don’t see an effect… it seems to me that, by definition, any dependency on violence and economic status would show no difference; similarly, any dependency on variables totally nondependent on violence and economic status would be affected quasi-randomly, which is not a bias. But I haven’t quite proved to myself beyond any lingering question that this partial second selection, correctly done, wouldn’t result in bias.
Anyway, that’s just my worthless opinion, which I hope somebody who knows what’s what will explain to me.
September 23rd, 2005 at 3:28 am
Z, again with the caveat I don’t accept the Greater Seixon argument, involving intentional duplicity, etc., there’s still something to the Lesser Seixon argument of bias introduced through the selection of the 12 surveyed governorates.
The cross-referencing of violence levels as experienced by American and allied troops with the provinces chosen by the Lancet process is indicative (not conclusive by any means, though) that the matching based on relative violence levels was not, in fact, accurately done (and the Lancet authors seem to concede it was only based on a “belief”.)
In practice, what they ended up with via this method was a survey of Iraq conducted entirely within the central 12 provinces of the country, and wholly excluding its northern and southern extremities. It’s certainly arguable that any significant north-south variations in violence level in the country, assuming they were inaccurately controlled for, could have introduced some measure of error here. One can say that without accepting the larger argument that this was malicious.
September 23rd, 2005 at 3:53 am
Kevin,
Yes, I did make a mathematical error in that comment on my blog. It was late and I thought about it after going to bed and realized that I had goofed up a little bit.
I am now writing a new blog post that will make this very, very clear to all the hold-outs.
nikolai and z,
You are both overlooking that the second phase of sampling was carried out in a winner-takes-all one-off fashion. In order for randomness to be preserved, albeit with a higher resultant variance, each cluster would have to be assigned via its own random drawing. This was not done in the 2nd round, as the province with the highest population was most likely to win the one-off, and thus win all of the pair’s clusters.
In statistical jargon, the expected mean or expected result of clusters for each province in the 2nd round would be either the total number of clusters between the pair, or 0.
For example, in the case of Basrah and Missan, their expected number of clusters would be (populationGovernate/populationTotal)*33 in the 1st round.
In the 2nd round, Basrah would have an expected number of clusters equal to:
(population(Missan+Basrah)/populationTotal)*33
This is because Basrah would be expected to win the one-off between it and Missan, and take all of their clusters. Missan’s expected number of clusters would be 0, since there is only ONE random drawing for their 3 clusters, and Basrah would be expected to win it all.
Hope that clears it up… I’m still working on my blog post…
September 23rd, 2005 at 4:16 am
Missan’s expected number of clusters would be 0, since there is only ONE random drawing for their 3 clusters, and Basrah would be expected to win it all.
This is wrong. The term “expected” has a particular meaning in statistics. In the notation I used earlier (comment no. 39) the expected number of Missan clusters before the first round takes place is:
[M/(B+M)][33(B+M)] = 33M
Now, once Basrah and Missan had their first-round clusters assigned to them, Basrah had 2 and Missan had 1. The expected number of Missan clusters now became:
[M/(B+M)][2+1] = 3M/(B+M).
This is roughly equal to one since Basrah has roughly twice Missan’s population. In the event, against the odds, Missan got the three clusters.
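The two expectations above can be checked with a short sketch. The population shares are illustrative assumptions only (Basrah is taken to have exactly twice Missan's share); they are not figures from the study.

```python
from fractions import Fraction

# Hypothetical population shares of Iraq (assumptions for illustration):
# Basrah is taken to have exactly twice Missan's share.
B = Fraction(2, 30)
M = Fraction(1, 30)
TOTAL_CLUSTERS = 33

def expected_before_round_one(share, pair_share, total=TOTAL_CLUSTERS):
    """Expected clusters before any draw: the province wins the pair's
    single draw with probability share/pair_share, and the pair's
    expected allocation is total * pair_share."""
    return (share / pair_share) * (total * pair_share)

def expected_after_round_one(share, pair_share, pair_clusters):
    """Expected clusters once the pair is known to hold pair_clusters."""
    return (share / pair_share) * pair_clusters

e_basrah = expected_before_round_one(B, B + M)
e_missan = expected_before_round_one(M, B + M)
e_missan_given_3 = expected_after_round_one(M, B + M, 3)
print(e_basrah, e_missan, e_missan_given_3)
```

Under these assumed shares the pairing step leaves each province's expected cluster count at 33 times its population share, and Missan's expectation once three clusters are on the table is exactly one.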
September 23rd, 2005 at 4:42 am
It’s an interesting problem. Assume for the moment that Coalition fatality figures are in fact, correlative with violence levels.
In that case, of the six provinces that were not paired, accounting for 14 of the 33 clusters, five were among the top seven provinces in terms of anti-U.S. violence (the sixth, Muthanna, is relatively peaceful, but did not have enough population to merit a cluster). One of those 14, the Anbar province cluster, would be ultimately excluded.
The remaining 19 clusters were in the paired provinces. The net effect of the choice in pairing here seems to have been relatively small, though… three clusters (in Qadisiyah, Tamim, and Dehuk) were moved to a significantly more violent province but two (those in Basrah) were moved to a significantly less violent one, while the other moves did not make a difference on the violence gradient, for a net non-violent > violent cluster-move of only 1 due solely to the effects of pairing.
However, the cumulative effect of not pairing the violent provinces (particularly Baghdad and Babil) while pairing the less violent ones, may still have been significant. Splitting up the provinces into thirds based on the anti-Coalition violence stat, and discounting Anbar/Fallujah, only six of the 32 clusters ended up being located in Iraq’s six most peaceable provinces, while 18 clusters were located in the six most violent provinces.
September 23rd, 2005 at 4:52 am
Comment 43 has nicely demonstrated Comment 16.
Must go wash brain, it’s been contaminated!
September 23rd, 2005 at 5:58 am
New post at my blog.
September 23rd, 2005 at 6:12 am
Damn thing. HERE.
Oh, and Kevin:
You are again completely ignoring the fact that the clusters were being distributed by ONE random drawing, instead of ONE PER CLUSTER.
Your math assumes that there will be three separate random drawings, one for each cluster, and this is not the case when it comes to the Lancet study.
With Lancet it is:
Basrah: 66% chance of getting all 3, 33% chance of having 0
Missan: 33% chance of getting all 3, 66% chance of having 0
In a random distribution, it would be:
1st drawing:
Basrah: 66% chance of getting one
Missan: 33% chance of getting one
2nd drawing:
Basrah: .4356 chance of having 2; .21 chance of having 1; .1089 chance of having 0
Missan: .1089 chance of having 2; .21 chance of having 1; .4356 chance of having 0
3rd drawing:
Basrah: .287 chance of having 3; .144 chance of having 2; .072 chance of having 1; .036 chance of having 0
Missan: opposite of Basrah
Now let’s compare the end results here:
Lancet:
Basrah - 66% chance of having all 3
Missan - 33% chance of having all 3
Random distribution:
Basrah - 28.7% chance of having all 3
Missan - 3.6% chance of having all 3
Are you telling me that those two are equal? Have a nice day in Denial Town.
September 23rd, 2005 at 7:59 am
So think of all those people claiming it was extremely likely that 100,000 “civilians” had been killed in Iraq citing the Lancet study and then think of Seixon’s analysis showing why the spread was 100,000 +- 100,000. Looks like the right isn’t the only side willing to whore and twist the truth to further their cause. Seixon wins this round.
Seixon’s argument also gets a ton of ethos points since he’s the underdog being attacked and called names for daring to bring this point up.
September 23rd, 2005 at 8:32 am
My analysis doesn’t show that it is 100,000 +/- 100,000. My analysis shows that the finding of the Lancet study is completely meaningless because of the sample bias they introduced by violating simple principles of random sample distribution.
I realize I wasn’t able to formulate myself clearly to begin with, but I hope that is now cleared up for all to see that I am, in fact, not an innumerate.
If anything, Mr. Lambert has (perhaps purposely) poor attention to details. Did Mr. Lambert always know about this flaw and hope that no one would find it? Did Mr. Lambert try to bluster me into conceding because he knew I was right? I sure hope not, that would be intellectual dishonesty of the worst kind.
September 23rd, 2005 at 8:37 am
Your math assumes that there will be three separate random drawings, one for each cluster….
Seixon, if I was assuming that I would have said so. Good night.
September 23rd, 2005 at 8:47 am
Well, because it seemed intriguing, I went back and checked Seixon’s math on coalition fatalities in the surveyed areas, and I’m afraid the results weren’t too impressive.
Seixon (he can correct me if I’m wrong, but I don’t see how he got the numbers he did otherwise) evidently counted all fatalities, both combat and non-combat, and used a date range that extended into mid-2005. For comparison, I took only the set of 684 Coalition combat fatalities prior to Sept. 30, 2004, for which a definitive place of death could be established, which seemed to be the more useful comparator.
Across all of Iraq, with 24.4 million people, the rate of Coalition fatalities per million Iraqis was 28. Across the ten provinces where the Lancet study did its fieldwork (less the six excluded by pairing, Muthanna due to low population, and Anbar due to the Fallujah outlier), there were 459 Coalition fatalities in the period leading up the survey, while occupying a population of 17.03 million Iraqis… a rate per million of 27.
In other words, Coalition military activity in the areas where the Lancet team worked, in the first 18 months of war and occupation, was actually very slightly SAFER than Iraq as a whole… apparently discrediting the statistical backing for Seixon’s “most violent provinces” argument.
The most that one can say on the available data is the non-random choice of which provinces to pair/not pair had the potential to introduce a bias into the Lancet study, but the one piece of evidence that it actually did doesn’t appear to stand up to scrutiny, I’m afraid.
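The two rates quoted above are simple to reproduce. This is only an arithmetic check on the figures given in the comment, not an independent count of fatalities.

```python
# Figures as quoted in the comment above (populations in millions).
fatalities_all, population_all_m = 684, 24.4             # all of Iraq
fatalities_surveyed, population_surveyed_m = 459, 17.03  # ten surveyed provinces

rate_all = fatalities_all / population_all_m
rate_surveyed = fatalities_surveyed / population_surveyed_m

# Coalition combat fatalities per million Iraqis, rounded as in the comment.
print(round(rate_all), round(rate_surveyed))
```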
September 23rd, 2005 at 8:50 am
Perhaps that was too cryptic. Seixon, please consider the following case. An “honest” six-sided die is rolled, once and once only. The number of spots which can appear on the uppermost face is one of the following: 1, 2, 3, 4, 5, 6. Each has probability 1/6.
The number of spots which appears is a random variable. Call it X. What is the expected value of X, also known as E(X)?
Once again, good night.
September 23rd, 2005 at 9:51 am
Kevin,
You forgot that you have x number of clusters to distribute, not just one roll of the dice. You don’t distribute ALL of them via the same roll of the dice. That isn’t randomized distribution. If that was the case, then you could roll the dice on all 33 clusters at the same time, and Baghdad would win it and take em all. Would that be a randomized distribution of Iraq? Of course not.
September 23rd, 2005 at 9:53 am
BruceR,
It is quite possible that I didn’t weed out non-combat deaths, but I still don’t see how you are getting the results you are. As I said on my blog, it’s a while since I compiled those numbers, and I should probably go through it again.
September 23rd, 2005 at 9:54 am
In any case, that doesn’t discount the very fact that the “grouping process” led to a non-random distribution of the clusters. This by itself renders a bias into the sample, and thus undermines any number that pops out of the survey of those households.
September 23rd, 2005 at 12:20 pm
John Quiggin Says:
September 22nd, 2005 at 7:53 pm
I think the best “critique” of all time was one (perhaps apocryphal) from the late 19th century which said of some statistical study “the results don’t represent the entire population, but only a sample, and a random sample at that!”
100+ years later this is still the essence of most of the objections to the Lancet study including Seixon’s, and the innumerable innumerates who object to extrapolating a sample proportion to an entire population because “it just doesn’t seem right”.
Would Quiggin mind explaining in his own words why he thinks Seixon is innumnerate. I don’t mean cutting and pasting like he does with T. lambert’s work on global warming. [Rest of comment deleted. TL]
September 23rd, 2005 at 1:11 pm
I’m a neophyte in statistics, so here’s my clusters for dummies interpretation of this.
Suppose Iraq had three provinces, A, B and C. A has a population of Na, B has a population of Nb and C has a population of Nc. A suffered Da war deaths, B suffered Db war deaths and C suffered Dc war deaths, so the actual number of war deaths is Da + Db + Dc.
You’re going to do a lot of clusters but you decide for practical reasons you can only visit C plus either A or B. You assign a fraction Nc / (Na + Nb + Nc) to province C. Either Province A or Province B will get the remaining fraction (Na + Nb)/(Na +Nb + Nc) of the clusters. The chance that A gets these clusters is Na/(Na +Nb) and the chance that B gets them is Nb/(Na + Nb)
Okay, now suppose (because I’m assuming this is realistic) that the measured death rate you get in whatever province you measure is an unbiased estimate of the actual number of deaths in that province–or on second thought, to keep it simple, you just count the actual number of deaths in the two provinces you sample.
Then the expected value for the total deaths in Iraq that you’ll get is
Dc + (Na/(Na + Nb))(Da)((Na + Nb)/Na) + (Nb/(Na + Nb))(Db)((Na + Nb)/Nb)
which equals Dc +Da +Db.
Of course you won’t actually get that. You’ll either get
Dc + Da(Na + Nb)/Na or else Dc + Db(Na +Nb)/Nb
and if I weren’t so lazy I’d calculate the variance. So if you did a perfect count inside a given province and did each province you’d get the exact answer, but if you pair two provinces arbitrarily as I did and only measure the death rate in one of them and extrapolate to both of the paired provinces, you’ll get a death rate that is either too high or too low, but the expected value will still be the true value. So pairing provinces increases the variance in my toy model, since a perfect count in all three provinces presumably has no variance.
I assume pairing provinces in the actual paper increased the variance and that the authors are smart enough or had really good software that was smart enough to calculate it. But if my toy model is on the right track, the expected value of this approach would be the true value.
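The toy model above, including the variance left uncalculated, can be sketched as follows. All populations and death counts are made-up numbers for illustration; the function name is hypothetical.

```python
from fractions import Fraction

def paired_estimator(Na, Nb, Da, Db, Dc):
    """Return (expected value, variance) of the toy estimate when C is
    always counted, and one of A or B is drawn with probability
    proportional to population and scaled up to stand for both."""
    Na, Nb = Fraction(Na), Fraction(Nb)
    p_a = Na / (Na + Nb)
    p_b = Nb / (Na + Nb)
    est_if_a = Dc + Da * (Na + Nb) / Na   # estimate if A is drawn
    est_if_b = Dc + Db * (Na + Nb) / Nb   # estimate if B is drawn
    mean = p_a * est_if_a + p_b * est_if_b
    var = p_a * (est_if_a - mean) ** 2 + p_b * (est_if_b - mean) ** 2
    return mean, var

# Hypothetical case 1: A and B have different death rates.
mean, var = paired_estimator(Na=2_000_000, Nb=1_000_000, Da=4000, Db=1000, Dc=3000)

# Hypothetical case 2: A and B have the same death rate (1 per 1000).
mean_same, var_same = paired_estimator(Na=2_000_000, Nb=1_000_000, Da=2000, Db=1000, Dc=3000)

print(mean, var, mean_same, var_same)
```

In both cases the expected value equals Da + Db + Dc. The variance is positive when the paired provinces differ in death rate and zero when they match, which suggests why one would want to pair provinces believed to be similar.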
September 23rd, 2005 at 1:45 pm
Yes, that’s correct Donald. Clustering does not change the expected value, but it does increase the variance, which is what they said in the paper. I have some graphs in this post on clustering that attempt to show you what happens.
September 23rd, 2005 at 1:56 pm
Seixon comments:
In my universe probabilities add up to 1. Correct probabilities are 8/27, 12/27, 6/27 and 1/27 thanks to Mr Binomial Distribution. And yes, clustering gives you a different distribution, but if you use the correct probabilities the expected value is the same.
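A minimal sketch comparing the two allocation schemes for three clusters, assuming Basrah wins any single draw with probability 2/3 (that is, assuming it has exactly twice Missan's population). It computes the binomial probabilities for one draw per cluster and the all-or-nothing probabilities for a single draw, then the mean and variance of each.

```python
from fractions import Fraction
from math import comb

p = Fraction(2, 3)  # assumed chance that Basrah wins any one draw
n = 3               # clusters to allocate within the pair

# One independent draw per cluster: binomial distribution for Basrah.
independent = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

# One draw for all three clusters: Basrah gets all of them or none.
grouped = {0: 1 - p, 1: Fraction(0), 2: Fraction(0), 3: p}

def mean(dist):
    return sum(k * pr for k, pr in dist.items())

def variance(dist):
    m = mean(dist)
    return sum(pr * (k - m) ** 2 for k, pr in dist.items())

for name, dist in (("independent", independent), ("grouped", grouped)):
    print(name, sum(dist.values()), mean(dist), variance(dist))
```

Both distributions sum to 1 and have the same mean of 2 clusters for Basrah; only the variance differs (2/3 for independent draws against 2 for the single draw).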
September 23rd, 2005 at 5:36 pm
Yes I was wondering why they were not adding up to 1, thanks for that. The exact numbers aside, the point still stands.
If we were to have used the “grouping process” to begin with, we could end up with Baghdad with zero clusters by the mere chance a province it got paired up with won all of their clusters.
This would not give a correct result, as Baghdad wouldn’t be sampled at all, and the other area would be vastly oversampled.
Your dice rolling example of clustering is nice, but it is a given that two dice have the same probabilities for their results. Two different provinces in Iraq are not like an equal set of dice, I’m afraid.
So yes, if you assume that Basrah and Missan are exactly the same, which the study did with a “belief” that they never substantiated, then it works out.
Unfortunately it is absurd to claim that two provinces are like an equal set of dice. That derails any notion of the real world.
Thanks for those probabilities though, I will double-check them later and update my blog post with them. Point still stands.
September 23rd, 2005 at 5:45 pm
In fact, Lambert, clustering does give different probabilities. Like you showed, those probabilities are if you do a REAL random sample.
The Lancet probabilities are vastly different from these, and claiming that the results will be similar with an entirely different set of probabilities also is absurd.
September 23rd, 2005 at 6:23 pm
Would Quiggin mind explaining in his own words why he thinks Seixon is innumnerate.
I can’t speak for John Quiggin, but I will begin to credit Seixon with numeracy when he solves the problem I set him in comment no. 53: The number of spots which appears when an honest 6-sided die is thrown just once is a random variable. Call it X. What is the expected value of X, also known as E(X)?
Of course he could get the answer with a quick glance at any introduction to probability theory. But just getting him to look at one would be progress. It really is a bit much that somebody who doesn’t know what these terms mean expects his critique of a scholarly paper to be taken seriously.
September 23rd, 2005 at 6:54 pm
Would Quiggin mind explaining in his own words why he thinks Seixon is innumnerate (sic)
I think Seixon’s own words do a better job. For example, those in comment #48 and #60.
In any case, given your past posts I know you have enough statistical knowledge to read a text on cluster sampling, so you know as well as I do that Seixon is talking nonsense.
September 23rd, 2005 at 7:00 pm
Kevin,
Your question is a complete waste of time as it deals with something completely different from what we are talking about. Your attempts to derail the discussion instead of just answering my challenge show that this is nothing but a boondoggle.
As for your stupid little problem, the answer:
(1/6)*6 = 1
(1/6)*5 = 0.8333
(1/6)*4 = 0.6667
(1/6)*3 = 0.5
(1/6)*2 = 0.3333
(1/6)*1 = 0.1667
E(X) would thus be 3.5. So now you’ll have to tell me why in the hell that is relevant to this discussion. I guess you’ll start wandering off into talking about how provinces of Iraq are like equal sets of dice like Mr. Lambert did. Good luck on that.
September 23rd, 2005 at 7:14 pm
E(X) would thus be 3.5. So now you’ll have to tell me why in the hell that is relevant to this discussion.
As you see, it’s not an integer. The fact that X cannot take the non-integer value 3.5 doesn’t alter the fact that the expected value does so. Similarly, the expected number of clusters for Basrah given that there are 3 up for grabs and Basrah’s population is double that of Missan, is 2.
I explained this earlier. You didn’t get it. Do you get it now?
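The point about non-integer expectations can be sketched in a few lines. The helper name is hypothetical, and the 2/3 win probability again assumes Basrah has exactly double Missan's population.

```python
from fractions import Fraction

def expected_value(distribution):
    """Probability-weighted mean of a {value: probability} mapping."""
    return sum(value * prob for value, prob in distribution.items())

# One roll of an honest six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
e_die = expected_value(die)

# All-or-nothing allocation of 3 clusters to Basrah (assumed 2/3 chance).
basrah = {0: Fraction(1, 3), 3: Fraction(2, 3)}
e_basrah = expected_value(basrah)

print(e_die, e_basrah)
```

Neither 7/2 spots nor 2 clusters is an outcome that can actually occur in these two setups, yet each is the expected value.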
September 23rd, 2005 at 8:27 pm
Yes, the expected values will be the same.
The problem is that with the Lancet method, there is no probability of 1 or 2, and the probabilities for 0 and 3 are much higher than with a real random distribution.
The clusters are no longer randomly distributed with the “grouping process”. I’m not aware of any accepted cluster-cluster methodology, but please point me in the direction of one if there is any.
What the Lancet has essentially done is cluster any number of clusters into a single cluster, and distributed this cluster-cluster via a single random drawing.
As you can see on my blog in the examples, this gives very absurd results if repeated numerous times, or if it was used in the initial sampling.
So now you’ll have to explain why that method is only applicable in a 2nd phase and no other phase, and only for certain arbitrary provinces.
As I tried explaining, if we were to use this method for the initial sampling, then Baghdad would more than likely end up with all 33 clusters, or as I showed on my blog, if this method was used more than once, it would have tremendous effects on the result.
In my example of the USA, which you still have not responded to, you cannot simply say “well, Texas and Arizona are similar, so we’ll only interview people in one of them and oversample it to make up for the other state”. That will produce a bias.
However, for the USA, with election numbers and registered voters, if Texas and Arizona are found to be similar demographically, then this method could be acceptable, even though it will still produce a bias. Say we know that 45% of Arizona and Texas are Republicans, and about 35% of them are Democrats, and the rest independents.
If Texas and Arizona are within a few percentage points of each other on those demographics, or equal, then you can defend oversampling just the one of them as the Lancet did.
When we come to Iraq though, this provides huge problems. There is no way to know the violence levels in each province after the invasion. There is no census or registry that can lead us to conclude that two provinces are similar in this respect.
They paired the provinces up by “belief” and nothing else. That is not scientifically or statistically credible.
Thus, they undeniably introduced a bias into their sample that is impossible to gauge, and thus the conclusions of the survey are completely meaningless.
September 23rd, 2005 at 9:37 pm
Thanks Tim. I went back to the paper and found the paragraph where the issue was discussed. This sort of thing is fun–it makes me go back and understand parts of the paper I’d just skimmed over on previous readings.
From my algebra exercise previously I gather that one would try to pair provinces with similar levels of violence in order to reduce the variance. It has nothing to do with the expectation value, which is unaffected. I don’t know how this subjective decision affects the variance they calculate, but the description of how they calculated their variance is in the paragraph about bootstrapping, overdispersion in the regression, and other stuff that I know nothing about.
Seixon, are you saying that the subjective pairing of provinces changes the expectation value of the number of deaths, which is what I think is meant by the term “bias”, or are you saying it increases the variance in a way that you claim is impossible to calculate? And what do you mean when you say the results of the paper are completely meaningless?
I haven’t read your post yet, but I think what you’re saying is that if there had been just one round in the cluster assigning procedure, Baghdad would have had the greatest chance of getting all 33 clusters, though I don’t think it’d be “more than likely”–it’d be a chance of 5 million over 25 million, or one fifth. That wouldn’t change the expectation value for the number of deaths–it would just increase the variance, since you’d now be using one province to represent the entire country. They did what they did because they had limited resources and weren’t stupid.
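A rough sketch of the single-draw scenario, under heavy assumptions: Iraq is collapsed into three hypothetical regions, the death counts are invented, and each region's deaths are taken to be counted exactly. Only Baghdad's 5 million out of 25 million comes from the comment above.

```python
from fractions import Fraction

# region: (population, deaths). Everything except Baghdad's population
# and the 25 million total is a made-up number for illustration.
regions = {
    "Baghdad": (5_000_000, 30_000),
    "North": (8_000_000, 16_000),
    "South": (12_000_000, 24_000),
}
total_pop = sum(pop for pop, _ in regions.values())
true_deaths = sum(deaths for _, deaths in regions.values())

# One draw decides which region receives every cluster; the chosen region
# is picked with probability proportional to population and its deaths are
# scaled up to the whole country.
prob = {name: Fraction(pop, total_pop) for name, (pop, _) in regions.items()}
estimate = {name: Fraction(deaths * total_pop, pop)
            for name, (pop, deaths) in regions.items()}

expected = sum(prob[name] * estimate[name] for name in regions)
variance = sum(prob[name] * (estimate[name] - expected) ** 2 for name in regions)

print(prob["Baghdad"], expected, true_deaths, variance)
```

With these invented figures Baghdad wins the draw one time in five, the expected estimate still equals the true total, and the cost of the single draw shows up entirely in the variance.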
September 24th, 2005 at 12:39 am
I think we may be at a situation where everyone basically agrees. Seixon accepts that “clumping” won’t affect the expected estimate, others accept that the sampling wasn’t random.
Donald gets it right when he says the whole row has been around the word “biased”. In the sense that bias is systematic in a particular direction, there isn’t sample bias caused by clumping clusters. In the long run the estimate will be the same. In a looser sense, that a biased sample is one that isn’t representative of the population, then dependency between clusters is a form of sample bias. I think the difference may be between common and technical language.
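A small sketch of the distinction, under a simplifying assumption that is mine rather than the paper's wording: the clusters of a pair all go to one of the two provinces, chosen with probability proportional to population. The function name and all numbers are hypothetical.

```python
def pair_moments(pop1, rate1, pop2, rate2):
    """Return (true pair rate, expected estimate, variance of estimate)
    when all of a pair's clusters go to one province picked with
    probability proportional to population."""
    w1 = pop1 / (pop1 + pop2)
    w2 = 1.0 - w1
    true_rate = w1 * rate1 + w2 * rate2
    # The estimate equals rate1 with probability w1, rate2 with probability w2.
    expected = w1 * rate1 + w2 * rate2
    variance = w1 * (rate1 - expected) ** 2 + w2 * (rate2 - expected) ** 2
    return true_rate, expected, variance


# Hypothetical pair that really is similar (rates 5 and 6 per 1,000) ...
similar = pair_moments(1.0, 5.0, 3.0, 6.0)
# ... and one that was wrongly "believed" similar (rates 2 and 7 per 1,000).
dissimilar = pair_moments(1.0, 2.0, 3.0, 7.0)

for label, (true_rate, expected, variance) in (
    ("similar pair", similar),
    ("dissimilar pair", dissimilar),
):
    print(f"{label}: true = {true_rate:.2f}, "
          f"expected = {expected:.2f}, variance = {variance:.4f}")
```

In this toy setup a bad guess about which provinces are alike leaves the expected estimate on target and shows up only as extra variance, which is the technical sense of "no bias" used above.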
September 24th, 2005 at 1:18 am
dsquared,
As it turns out, the city of Najaf was not included in the survey. It was included in the grouping of random governorates to be paired and ended up not being selected for polling. The whole thing stinks. 6 very stable provinces were “randomly” excluded and then 3 hot-spots were also “randomly” excluded. Then when Fallujah didn’t fit, it was excluded. It stinks of data manipulation (what a nice round number 100,000 is). But since all we have is their word on the “randomness”, I guess we have to see how politically motivated they might have been. Their choice of pre-war child mortality numbers was suspect at best, and the accepted range of pre-war mortality numbers should have been introduced into the study (how utterly inconvenient to have posted a NEGATIVE number for casualties). The expressed politics of the Lancet group and the timing of the publication certainly don’t help their credibility (I’ll be looking for Lancet Iraq Part Deuce in late 2008). A CI sigma of 2 (or 4, don’t forget Fallujah) and this study is taken as solid gold proof of 100,000 dead and anyone who disagrees gets the Lambert treatment (yes he’s posted a gazillion good refutations, but no one likes the schoolyard bully to win).
September 24th, 2005 at 1:53 am
Mon Dieu!
Was there ever a better blog post title than this one?
D
September 24th, 2005 at 2:01 am
Seixon, I’ve posted the corrected Coalition combat fatality stats on my site if you want to check my work. I believe they confirm that your assertion that the provinces surveyed in the Lancet study were significantly more dangerous to Coalition troops than the national mean during the actual period surveyed is incorrect.
September 24th, 2005 at 2:03 am
Tim:
Can someone who runs a website like you do also be a sock puppet on their own site? Pardon me for saying this but Kevin Donoghue sounds remarkably like a nice version of you despite the broguish name. Now this is too hysterical for words!
Having read this on the post about Hitchens,
Kevin Donoghue Says:
September 22nd, 2005 at 4:53 am
Joe C,
Even if the things you say about Cole are all true (I know that some of them are false and therefore see no reason to trust you on the others), they have no bearing on Cole’s knowledge of Arabic. It’s a red herring.
I can only assume you put on a sock on your own website. Why? Because language, like handwriting, is difficult to disguise.
I say this sincerely, as I think Donoghue is you in disguise, it is also you in personality and to be honest, he is likable. He’s nice, helpful and a teacher type. Please bring him out more.
I’d love to buy Kev lunch. Let me know if you are available one day Tim as I am always flitting around Sydney.
September 24th, 2005 at 2:20 am
Sorry, I don’t understand: how does supplying separate figures including and excluding Fallujah stink of data manipulation?
I’ve read criticisms accusing the survey of being both too random and now of not being random enough. The study clearly states the limitations and difficulties of both the sample and the method of extrapolation. I can see how the variance of the data is affected, but that’s some way short of completely invalidating the entire survey and the more huffing and puffing I hear, the more I suspect that its main sin is to propose a politically unacceptable reality rather than be misleading or fraudulent.
I suppose I should declare my grievous bias: I supported the war at the time, and with some reservations I still do. I certainly hope the war hasn’t caused such a horrific number of extra civilian deaths but it doesn’t seem impossible or fraudulent to suggest it has. The Lancet Survey seems the best available estimate, given limited resources and the unarguably hazardous environment.
I ask again: do the New England Journal of Medicine findings contradict or support The Lancet survey? Would it make any difference to you if they did?
September 24th, 2005 at 2:21 am
Joe Cambria, your trolling has to be a little less blatant if you want somebody to bite. Commenters should read my comment policy on trolls.
September 24th, 2005 at 2:23 am
Jet, you’re basing your argument on Seixon’s Coalition fatality numbers, which don’t appear to add up. Running off the same database, but limiting my results to the same time period and excluding non-combat fatalities, I came up with quite different ones.
I think the case has been made that allowing a degree of subjectivity in the pairing process was not ideal. What hasn’t been proven is that this made any difference.
14 clusters were unaffected by the pairing; 1 (Anbar) was ultimately treated as an outlier. That leaves 18, of which 8 in total ended up being transferred to new provinces. One cluster transfer (Sulimaniyah-Arbil) was between two undisputedly quiet provinces. Three (Qadisiyah, Tamim and Dehuk) were to probably more violent ones. Two (the Basra to Missan transfer) ended up moving to a less violent locale. That leaves the two that were transferred from Najaf to Karbala, which Seixon treats as equally violent provinces, I believe incorrectly (I think you can make the case, based on an accurate count of Coalition combat fatalities, that Najaf was somewhat more violent than Karbala in this period).
The upshot is that the net violent/nonviolent plus/minus is +1 cluster if you believe Seixon’s theory, and -1 if you base it on the actual relevant Coalition fatality numbers. Regardless, once Anbar was excluded, the provinces surveyed seem to have had an average level of anti-Coalition violence close to the national average, for what that’s worth.
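The cluster bookkeeping above can be checked with a few lines. The counts come straight from this comment; the +1/0/-1 coding for "moved to a more violent / equally violent / less violent province" is just a convenience I am assuming for the tally.

```python
TOTAL_CLUSTERS = 33
UNAFFECTED = 14
OUTLIER = 1  # Anbar

remaining = TOTAL_CLUSTERS - UNAFFECTED - OUTLIER

# (number of clusters moved, direction): +1 more violent, 0 same, -1 less violent
FIXED_TRANSFERS = {
    "Sulimaniyah-Arbil": (1, 0),
    "Qadisiyah, Tamim, Dehuk": (3, +1),
    "Basra-Missan": (2, -1),
}
NAJAF_KARBALA_CLUSTERS = 2


def net_shift(najaf_karbala_direction):
    """Net number of clusters moved toward more violent provinces."""
    fixed = sum(count * direction for count, direction in FIXED_TRANSFERS.values())
    return fixed + NAJAF_KARBALA_CLUSTERS * najaf_karbala_direction


transferred = sum(count for count, _ in FIXED_TRANSFERS.values()) + NAJAF_KARBALA_CLUSTERS

print("clusters left after unaffected and outlier:", remaining)
print("clusters transferred:", transferred)
print("net shift, Najaf and Karbala equally violent:", net_shift(0))
print("net shift, Najaf more violent than Karbala:", net_shift(-1))
```

The two readings of the Najaf to Karbala transfer give the +1 and -1 quoted above.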
September 24th, 2005 at 4:15 am
Jet;
…since all we have is their word on the “randomness”, I guess we have to see how politically motivated they might have been.
Your argument seems to be that the study is fraudulent (i.e. the results claim to be random, but are in fact fixed). If you really don’t want to accept the results, why not use the slightly more charitable strategy of suggesting that they got unlucky by pulling an unrepresentative random sample? It’s no wonder that the whole Lancet study debate has generated much more heat than light. Is there anything which would convince you to change your mind regarding mortality in Iraq?
September 24th, 2005 at 5:08 am
I’m doubtful of the premise there is a correlation between coalition combat deaths and Iraqi deaths. September saw ~30 coalition deaths, ~130 Iraqi military/police and ~450 civilians. Coalition deaths happen in relatively large clumps scattered across the country. Iraqi deaths seem to occur more predictably in high profile target areas.
I’m sure too many people have died in Iraq, but I’m not sure the Lancet is skeptical enough about the 2002 2.9% infant mortality rate. Accepting it means claiming that a relaxation of sanctions resulted in a miraculous drop of infant mortality from 10.8% to 2.9% in 2 years, even though the average for 1989-1990 (relatively peaceful years, at least until the invasion) was about 5%. Which is more likely, 10 occupation hating resistance sympathizing self-pollers lied about their babies, or Iraq dropped its infant mortality rate in two years from 10.8% to 2.9%, or half the 1990 rate of 4.7%?
The fact that the Lancet study just used the 2.9% number instead of trying to assign some statistical magic to deal with the discrepancy is all you need to know to realize the study is seriously flawed.
nikolai,
The Lancet lost most of their Iraqi related ethos when they politicized themselves. I’d also be more inclined to give them some credibility if they hadn’t decided to just accept the almost magical infant mortality rate they found. How can anyone believe that in 2 years Iraq went from Somalia infant mortality rates to Brazilian rates, and the evidence is a self-poll where the respondents have a motive to exaggerate their circumstance?
September 24th, 2005 at 5:11 am
Oh, I meant they went from Somalia to better than Brazilian rates.
September 24th, 2005 at 5:28 am
I’m starting to look like a one man show here, but before anyone brings up the ILCS as confirming the Lancet, it may confirm the pre-war Lancet number (but like I said before, that is a magical drop in deaths), but it certainly doesn’t confirm the post-war Lancet number. The ILCS shows a more believable increase from 3.3% to 3.5%, while the Lancet self-pollers say it was 2.9% to 5.9%. The ILCS found an increase of 3 infant deaths per 1,000 and the Lancet found an increase of 30 infant deaths per 1,000. The ILCS doesn’t help the Lancet at all.
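For readers who want to check the unit conversion, here is a minimal sketch that turns the percentage rates quoted in this comment into deaths per 1,000. It uses only the percentages as quoted here and does not check them against the ILCS or Lancet reports themselves; the function name is hypothetical.

```python
def per_thousand_increase(before_pct, after_pct):
    """Change between two percentage rates, expressed per 1,000 births."""
    # One percentage point equals 10 per 1,000; round to suppress float noise.
    return round((after_pct - before_pct) * 10, 1)


ilcs_increase = per_thousand_increase(3.3, 3.5)
lancet_increase = per_thousand_increase(2.9, 5.9)

print("ILCS increase per 1,000:", ilcs_increase)
print("Lancet increase per 1,000:", lancet_increase)
```

Taken at face value, the quoted ILCS percentages give an increase of 2 per 1,000 rather than 3, which may simply reflect rounding in the percentages; the Lancet figure of 30 per 1,000 matches.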
September 24th, 2005 at 6:43 am
I realized my “fun” comment sounds ghoulish, given what we’re talking about. I was caught up in learning a tidbit about statistics I didn’t know before. (Which would include most of statistics beyond the elementary level).
Nikolai said we’re all in agreement. I can’t tell for sure. Seixon has been talking about “bias” and “worthless” and it’d be nice to have clarification of what he means by those terms. What’s odd is that, as dsquared pointed out at Crooked Timber, there’s not really much doubt that there have been tens of thousands of deaths in Iraq, somewhat higher than what Iraq Body Count has found (which would be a minimum) and possibly as high as the Lancet paper suggests. And it’s odd to see all this criticism of a paper for its bias in the colloquial sense when the most obvious bit of data manipulation, if you want to call it that, was when they left out the horrifying statistics of the Fallujah sample and thereby chucked out the Anbar province. And if I remember correctly (my copy of the paper is at home), their sample in Sadr City or someplace where you’d expect a high death toll by chance turned out to have suffered zero violent deaths. So it seems that in practice the Lancet figure of 60,000 violent deaths didn’t include any neighborhoods where people died in large numbers due to intense fighting or bombing in urban settings–in my perhaps naive way I would expect that these comparatively small areas would account for a large fraction of the actual death toll and with a survey that only has 33 samples you’ve got a better than even chance of missing the really hard hit areas unless at least one in 50 neighborhoods suffered such casualties. And much of that intense urban fighting occurred in the summer and fall of 2004, after the UN survey. And yet we’re supposed to think an estimate which excluded the one sample that suffered intense bombing was biased upward. Right now I’m using the word “bias” in its colloquial sense–a study with a small number of samples is going to miss neighborhoods that lost large fractions of their people unless such neighborhoods are relatively common. But I don’t think that’s a bias in the expected value sense.
If Seixon wants to talk about bias in the colloquial sense, then anyone can play and I’ll ride my own hobbyhorse, which is that we don’t know what the intense urban warfare in the summer and fall of 2004 did to civilians–the UN survey ended too soon and the Lancet study had too few clusters.
I realize from earlier rounds of this that a small number of clusters in the Lancet study could also miss neighborhoods where Saddam had gone on some prewar rampage, but I don’t know of any evidence he was going on such rampages just before the war. The UN survey could have picked this up, if they asked.
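The "one in 50" rule of thumb a couple of paragraphs up can be checked with a short calculation. It assumes, as a simplification, that the 33 clusters are independent draws and that a fixed fraction of neighborhoods counts as "hard hit"; the function name is hypothetical.

```python
N_CLUSTERS = 33


def miss_probability(hard_hit_fraction, clusters=N_CLUSTERS):
    """Chance that none of the clusters lands in a hard-hit neighborhood."""
    return (1.0 - hard_hit_fraction) ** clusters


# If exactly one neighborhood in 50 was hard hit:
p_miss_one_in_50 = miss_probability(1 / 50)

# Fraction of hard-hit neighborhoods at which the chance of missing
# them all drops to exactly one half.
break_even_fraction = 1.0 - 0.5 ** (1 / N_CLUSTERS)

print(f"P(miss all) at 1 in 50: {p_miss_one_in_50:.3f}")
print(f"break-even: about 1 neighborhood in {1 / break_even_fraction:.0f}")
```

Under those assumptions the chance of missing every hard-hit neighborhood at a rate of one in 50 comes out slightly above one half, so the figure in the comment holds up.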
September 24th, 2005 at 8:53 am
On the subject of bias, I just read a post in another comment section where my debate opponent used the IBC analysis of their own data to refute the Lancet study–in particular, since the IBC analysis shows that most of the deaths attributed to Americans occurred in the early months of the war, this meant there couldn’t have been massive civilian casualties in Fallujah during the summer of 2004 because it doesn’t show up in the IBC count. I blame IBC’s analysis more than my opponent, because they do calculations of all sorts of percentages and give the results to two or three significant digits, even though their data is based on Western news reporters whose access to most of Iraq became increasingly restricted after the early months. The US and Iraqi governments aren’t going to say how many civilians they’ve killed and if reporters can’t go out in the field and talk to ordinary people, they’re not going to be able to find out very much. They will be able to report on insurgent suicide bombings, however. The IBC analysts seem to have fallen in love with their data and they report on their various calculated percentages as though this came from an unbiased sample of the total number of Iraqi civilian dead. There’s not a hint of any sort of bias problem in their analysis–compare that to the Lancet paper, which spends paragraphs discussing why its numbers might be wrong. Naturally it’s the Lancet paper that attracts criticism.
September 24th, 2005 at 9:07 am
Donald Johnson,
Perhaps that is because most people haven’t heard of the IBC report, but everyone has heard of the Lancet report.
September 24th, 2005 at 11:40 am
Donald Johnson, Perhaps that is because most people haven’t heard of the IBC report, but everyone has heard of the Lancet report.
Wow, that has got to be the best darn reason to be critical of a paper that I have ever heard.
Why worry about accuracy when you can attack it because it is well known!
September 24th, 2005 at 11:08 pm
“The IBC analysts seem to have fallen in love with their data and they report on their various calculated percentages as though this came from an unbiased sample of the total number of Iraqi civilian dead.”
Donald, correct me if I’m wrong, but IBC doesn’t calculate its numbers based on surveys and extrapolations. Last time I checked, their figures were arrived at solely from media or morgue accounts of casualties.
Your comment #81 seems to be suggesting that the Lancet violent excess death extrapolation of some 60,000 is more likely to be an underestimate of the actual death toll, rather than an overestimate. You base this on the exclusion of the Falluja data, and the absence of deaths from clusters in areas of significant violence.
I mentioned this several months ago here, but didn’t get much feedback. The ILCS survey was a vastly broader one than the Johns Hopkins effort, and most seem to agree that the ILCS is much more likely to provide an accurate picture of the war-related mortality rate than the Lancet study. I think it’s a mistake to claim that the ILCS corroborates the Lancet war related death extrapolation, because there are indications that the composition of the two war related totals are quite different.
The Lancet extrapolates approximately 12,000 children killed ex-Falluja, all by the coalition. On the other hand, the ILCS calculated slightly less than 3,000 war-related deaths of children. I realize there has to be an adjustment for the difference in time periods covered by the two studies, but I suspect this is largely offset by two factors; the ILCS cut-off for children is 18, while the Lancet cut-off is 15. The ILCS figure is made up of fatalities inflicted by all sides, which acknowledges the fact that insurgents certainly contributed to the child death toll, something the Lancet extrapolations completely exclude.
There’s one other significant indication that the Lancet and ILCS may have very different compositions of their war-related death estimates. Tim disagreed with me when I raised this at the time, but I’m convinced that the ILCS war related death survey question was intended to capture the deaths of Iraqi military personnel during the initial invasion, and logically must have elicited reports of such deaths from respondents. The Lancet survey failed to capture any deaths of regular Iraqi military servicemen (which was understandable given the Lancet’s survey question).
It would be revealing to see a more detailed breakdown of the ILCS data. What it seems to suggest, from the data relating to the deaths of children, is a much lower civilian death toll from coalition bombing than the Lancet portrays. The ILCS covers the first year of the war, including the invasion phase. If its numbers indicate a dramatically lower civilian death toll from coalition forces during the invasion, then that may be an indication that it’s faulty to assume the fighting post-April 2004 killed tens of thousands of civilians in places like Falluja and Najaf, as you’re implying, Donald.
September 25th, 2005 at 1:00 am
Mike, the Lancet study found 4 children killed by the coalition ex Falluja. This really is too small a number to extrapolate from.
September 25th, 2005 at 5:39 am
Mike, my point was that the IBC count is biased by the fact that the reporters are going to find it easier to count civilians killed by the insurgents. That’s because they get much of their information from US and Iraqi government sources, which will tell them how many civilians are killed by the enemy, but not how many they’ve killed. Also, in Vietnam it was known that the bodycount of enemy soldiers included a fair number of civilians and I suspect the same occurs in Iraq. So yeah, the IBC numbers are an actual count (of sorts), but not only will it necessarily be an undercount, but it will also be a sample of the total civilian death toll that is biased by the limitations placed on the reporters. I used the analogy of the French/Algerian war in another comment section of another blog. If you tried to count the deaths in the French/Algerian war solely relying on press accounts, you’d probably have a fairly accurate count of the civilians killed by the Algerian rebels, but a huge undercount of the number of Algerian civilians killed by the French, especially in areas inaccessible to reporters.
Interesting you brought up the 3000 figure in the UN survey, Mike. That was 3000 dead for the first year or so (or maybe 15 months, someone else said somewhere). IBC counted 1300 dead in the first two years, so at a minimum they missed more than half of the child deaths.
September 25th, 2005 at 5:41 am
Though to be fair, I don’t know the error bars on the 3000 dead children figure.
September 25th, 2005 at 6:57 am
Tim:
The extrapolated war related deaths from children constitute much of the foundation for the Lancet’s most controversial and high profile conclusion, that violence accounted for most of the excess 100,000 death estimate, and most of the violent deaths were suffered by women and children as a result of coalition air strikes.
I think you have to take the good with the bad in this regard, Tim. Much of the defense of the Lancet study has involved defending “small numbers” derived from the study. That sometimes cuts both ways.
September 25th, 2005 at 7:12 am
Donald:
The statement from you that I referred to was a straightforward (at least from my perspective) assertion that the IBC was engaged in a statistical exercise based on survey samples. That’s what I was taking issue with.
I don’t think that’s the case Donald, although I’ll have to do some checking at IBC. If I recall correctly, the IBC has attributed many of the deaths it tabulated to coalition and Iraqi security forces.
Another significant variable that often gets overlooked is the fact that the IBC count most certainly contains insurgent deaths that have been counted as civilian casualties. An IBC estimate of how many fall into this category is very difficult, if not impossible to determine.
Essentially, with the IBC figure for children, you’ve provided further evidence in favour of the accuracy of the ILCS study over the Lancet study, in terms of the discrepancy that exists between them in the context of war related child deaths.
September 25th, 2005 at 11:16 am
Jet,
With regard to the infant mortality figures, did the 5/1,000 figure come from Saddam’s regime?
During the sanctions period, Saddam was conducting extensive propaganda about the impact of sanctions on Iraq’s children (which extended to such incredibly vile acts as deliberately withholding chemotherapy from children with cancer even though the drugs were available).
I believe the recent survey from the IMF showed an even higher figure for infant mortality
“Under five years mortality is estimated at 115 (compared to 33 in Jordan and 107 in Yemen). Infant mortality is currently estimated at 102 per 1,000 live births, compared 105 in sub–Saharan Africa.”
www.imf.org/external/pubs…
September 26th, 2005 at 2:58 am
Ian Gould,
I believe they are UNICEF numbers I got from this very site. Search this site for “iraq child mortality” and Lambert has a pretty good thread covering it. I’m really just rehashing Lambert’s own words about the infant mortality bit.
I still wish someone would explain on this thread if clustering of clustering is really a statistical method and the variance accounts for this method, or is Seixon right that CI does not account for clustering of clustering.
And on the off topic point, could someone clarify why the 100,000 number has so much backing even though the de facto final (blog) word on the matter, Lambert, has pointed out the problems with an extremely large chunk of the data (infant mortality numbers).
September 26th, 2005 at 4:26 am
jet, clustering of clustering is called multistage clustering and yes it’s a valid method as has been explained several times already.