Observational studies are a mainstay of epidemiology. In observational studies, investigators gather data passively rather than manipulating variables. For example, if you want to know whether people who wear tight shoes develop bunions, you would find a group of people who wear tight shoes and a group who don't. You would try your best to make sure the groups are the same in every way besides shoe tightness: age, gender, weight, etc. Then you would follow them for 10 years to see how many people in each group develop bunions. You would then know whether wearing tight shoes is associated with bunions.
Observational data can never tell us that one thing caused another, only that the two are associated. The tight shoes may not have caused the bunions; they may simply have been associated with a third factor that was the true cause. For example, maybe people who wear tight shoes also tend to eat corn flakes, and corn flakes are the real cause of bunions. Or perhaps bunions actually cause people to wear tight shoes, rather than the reverse. Observational data can't resolve these questions definitively.
To establish causality, you have to do a controlled trial. In the case of our example, we would select 2,000 people and randomly assign them to two groups of 1,000. One group would wear tight shoes while the other would wear roomy shoes. After 10 years, we would see how many people in each group developed bunions. If the tight shoe group had substantially more bunions, we could rightly say that tight shoes cause bunions. The reason this works is that randomization (ideally) eliminates all differences between the groups except the one you're trying to study. If the randomization worked correctly, you should end up with about the same number of corn flake eaters in each group.
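To make the balancing effect of randomization concrete, here is a minimal sketch in Python (all numbers invented, purely for illustration) showing that random assignment tends to split an unmeasured trait like corn flake eating roughly evenly between the two groups:

    import random

    random.seed(42)

    # Hypothetical population of 2,000 people; assume, purely for illustration,
    # that about 30% of them eat corn flakes.
    people = [{"eats_corn_flakes": random.random() < 0.30} for _ in range(2000)]

    # Randomly assign half to tight shoes and half to roomy shoes.
    random.shuffle(people)
    tight_group, roomy_group = people[:1000], people[1000:]

    def corn_flake_rate(group):
        """Fraction of a group that eats corn flakes."""
        return sum(p["eats_corn_flakes"] for p in group) / len(group)

    print(f"Tight-shoe group: {corn_flake_rate(tight_group):.1%} corn flake eaters")
    print(f"Roomy-shoe group: {corn_flake_rate(roomy_group):.1%} corn flake eaters")

With 1,000 people per group, the two rates typically land within a couple of percentage points of each other, and the same balancing happens for every trait nobody thought to measure, which is exactly why randomization supports causal conclusions.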
A less convincing but still worthwhile alternative would be to put tight and loose shoes on mice to see if they develop bunions. That's what researchers did in the case of the tobacco-lung cancer link. Controlled studies in animals reinforced the strong suggestion from epidemiological studies that smoking increases the risk of lung cancer.
Finally, another factor in judging whether an association reflects causation is plausibility. In other words, can you imagine a way in which one factor might cause the other, or is the idea ridiculous? For example, did you know that shaving infrequently is associated with a 30% increase in cardiovascular mortality and a 68% increase in stroke incidence in British men? That's a stronger association than you get with some blood lipid markers and most dietary factors! It turns out:
The one fifth (n = 521, 21.4%) of men who shaved less frequently than daily were shorter, were less likely to be married, had a lower frequency of orgasm, and were more likely to smoke, to have angina, and to work in manual occupations than other men.

So what actually caused the increase in disease incidence? That's where plausibility comes in. I think we can rule out a direct effect of shaving on heart attacks and stroke. The authors agree:
The association between infrequent shaving and all-cause and cardiovascular disease mortality is probably due to confounding by smoking and social factors, but a small hormonal effect may exist. The relation with stroke events remains unexplained by smoking or social factors.

In other words, they don't believe shaving influences heart attack and stroke directly, but none of the factors they measured explain the association. This implies that there are other factors they didn't measure that are the real cause of the increase. This is a critical point! You can't determine the impact of factors you didn't measure! And you can't measure everything. You just measure the factors you think are most likely to be important and hope the data make sense.
This leads us to another important point. Investigators can use math to estimate the relative contribution of different factors to an association. For example, imagine the real cause of the increased stroke incidence in the example above was donut intake, and it just so happens that donut lovers also tend to shave less often. Now imagine the investigators measured donut intake. They can then mathematically adjust the association between shaving and stroke to subtract out the contribution of donuts. If no association remains, then this suggests (but does not prove) that the association between shaving and stroke was entirely due to shaving's association with donuts. But the more math you apply, the further you get from the original data. Complex mathematical manipulation of observational data requires certain assumptions, and while it is useful for extracting more information from the dataset, it should be viewed with caution in my opinion.
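To show what "adjusting" means in the simplest possible case, here is a rough sketch in Python of adjustment by stratification: compare stroke rates between frequent and infrequent shavers separately within donut eaters and non-donut-eaters. The toy model and all of its numbers are invented (donuts drive both infrequent shaving and stroke, while shaving itself does nothing), and real studies usually use regression models instead, but the logic is the same:

    import random

    random.seed(0)

    def simulate_person():
        # Invented toy model: donut eating raises the chance of both infrequent
        # shaving and stroke; shaving has no effect on stroke at all.
        eats_donuts = random.random() < 0.4
        shaves_infrequently = random.random() < (0.7 if eats_donuts else 0.2)
        stroke = random.random() < (0.15 if eats_donuts else 0.05)
        return eats_donuts, shaves_infrequently, stroke

    people = [simulate_person() for _ in range(100_000)]

    def stroke_rate(subset):
        return sum(stroke for _, _, stroke in subset) / len(subset)

    # Crude (unadjusted) comparison: infrequent shavers look worse off.
    infrequent = [p for p in people if p[1]]
    frequent = [p for p in people if not p[1]]
    print(f"Crude: {stroke_rate(infrequent):.3f} vs {stroke_rate(frequent):.3f}")

    # Adjusted comparison: within each donut stratum, the shaving "effect" vanishes.
    for eats in (True, False):
        stratum = [p for p in people if p[0] == eats]
        infreq = [p for p in stratum if p[1]]
        freq = [p for p in stratum if not p[1]]
        print(f"Donuts={eats}: {stroke_rate(infreq):.3f} vs {stroke_rate(freq):.3f}")

The crude comparison shows an apparent shaving-stroke association; within each donut stratum it disappears, which is the pattern that suggests (but does not prove) confounding.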
Of course, you can't adjust for things you didn't measure, as the study I cited above demonstrates. If factors you didn't measure are influencing your association, you may be left thinking you're looking at a causal relationship when in fact your association is just a proxy for something else. This is a major pitfall when you're doing studies in the diet-health field, because so many lifestyle factors travel together. For example, shaving less travels with being unmarried and smoking more. Judging by the pattern, it also probably associates with lower income, a poorer diet, less frequent doctor visits, and many other potentially negative things.
If the investigators had been dense, they might have decided that shaving frequently actually prevents stroke, simply because none of the other factors they measured could account for the association. Then they would be puzzled when controlled trials showed that shaving doesn't actually influence the risk of stroke, and that shaving mice doesn't either. At that point they would have to admit that they had been tricked by a spurious association, or stubbornly cling to their theory and defend it with tortuous logic and selective citation of the evidence. This happens sometimes.
These are the pitfalls we have to keep in mind when interpreting epidemiology, especially as it pertains to something as complex as the relationship between diet and health. In the next post, I'll get to the meat of my argument: that modern diet-health epidemiology may in some cases be a self-fulfilling prophecy.
17 comments:
Stephan,
I'm assuming you saw my blog post based on the full text of the "Red meat kills" study, which pointed out that the same study "proved," by the same logic used to create the "red meat kills" headline, that eating red meat makes men die in accidents.
When, exactly, did they stop requiring statistics classes when granting M.D.s and Ph.D.s in nutrition? It must have been a while ago, given the studies that have passed peer review lately.
--Jenny Ruhl
Unfortunately, there was no critical thinking taught in my education program or my dietetic internship. I had to learn it myself (thank you Gary Taubes and Dr Eades). The funny thing is, I do remember my teachers using this "other variable" technique to discount any study that was favorable to a low-carb diet.
The biggest problem with "gold standard" randomized, placebo-controlled trials is that they are soooo expensive to conduct. And then there are always confounding variables, dropout rates, etc. The number of people you need to enroll, and the time you need to follow them in order to find a valid difference between two groups, is often very large. Still, they're the best we've got.
Stephan, help me. I can't find your DART trial. PubMed just gets bogged down in a DART trial involving HIV in Africa. The closest I could find was the Iowa Women's Health Study, which showed an increase in mortality and CV disease in women who ate refined grains as compared to women who ate whole grains.
Stephan,
I look forward to your next post!
I think you could make an additional couple of points here. If A is correlated with B, there are three possibilities, assuming the association is real: A causes B, B causes A, or a third factor C causes both.
The bunions could cause the wearing of tight shoes if people with bunions find that loose shoes are more likely to slip and slide against the bunions and thereby irritate them.
Infrequent shaving absolutely could cause heart attack or stroke indirectly, if it in turn causes the other social variables. Do people who are not married shave less because they are not married, or are they not married because they shave less frequently and it is more difficult to find a mate?
This complicates adjustment. If the real cause was shaving and they adjust for donut intake, they eliminate the correlation with shaving when it was the causal factor. Adjustments can turn true data into false data or false data into true data, depending on whether your hypothesis underlying the adjustment is correct. But while you are adjusting, it's still just a hypothesis!
Another complication is the level of statistical significance. If the threshold is set at 0.05, then about one out of twenty tests of a true null hypothesis will come out "significant" purely by chance. Moreover, if you do multiple comparisons in one study, you need to adjust the p threshold downward. So if you do ten comparisons in one study, each one needs to reach p < 0.005 to keep the overall chance of a false positive near one in twenty. This gets complicated because it depends on what procedure you use, but many researchers get away with using less rigorous tests and failing to adjust the significance level for multiple comparisons, because adjusting would make their data non-significant. So we have the additional problem that many reported correlations are not real correlations at all. Only for the true correlations do we even face the issue of causation, and of course epidemiological evidence cannot address causation!
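To put rough numbers on that, here is a quick, purely illustrative Python simulation of how the chance of at least one false positive grows across ten comparisons, and how a simple Bonferroni correction (dividing the threshold by the number of comparisons) brings it back down:

    import random

    random.seed(1)

    ALPHA = 0.05
    N_COMPARISONS = 10
    N_STUDIES = 100_000

    # Under a true null hypothesis, a p-value is uniformly distributed on [0, 1],
    # so random.random() stands in for the p-value of a comparison with no real effect.
    def has_false_positive(threshold):
        return any(random.random() < threshold for _ in range(N_COMPARISONS))

    uncorrected = sum(has_false_positive(ALPHA) for _ in range(N_STUDIES)) / N_STUDIES
    corrected = sum(has_false_positive(ALPHA / N_COMPARISONS) for _ in range(N_STUDIES)) / N_STUDIES

    print(f"Chance of at least one false positive, uncorrected: {uncorrected:.2f}")  # roughly 0.40
    print(f"Chance of at least one false positive, Bonferroni:  {corrected:.2f}")    # roughly 0.05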
I look forward to your next post as I said!
Chris
Stephan,
I'm going to assign this post to my learning lab class (we're doing a behavioral study with rats). Next week's class is the first in which we will cover scientific writing, and this post is a fantastic example of superb writing. You tell a story. Stephan, you are a remarkable story teller and that's what is wrong with most bad writing, even in science. The authors forgot that what they are doing is telling a story, spinning a yarn, parlaying a narrative. Your posts are consistently well written. I hope you continue on in your profession and continue to publish peer-reviewed papers with the same high quality writing as you use in your blog.
Bravo!
PS. For an excellent piece on scientific writing (though it applies to all writing), see Gopen and Swan's "The Science of Scientific Writing". You can Google it.
Wonderful post. Thank you.
My statistics classes were pretty good, I have to say (enviro biology). However, the stats classes at the neighboring university (with 10 times the tuition rate) sucked and that's putting it mildly. All the epidemiology students came over and took stats at our university. Jenny, I have some bio students going into nutrition and what is worse than not having to take stats is that they don't have to take biochem!!! Can you believe it!?
Controlled trials are best but sometimes observational studies are all we have. Especially in the environmental sciences where controlling for variables in experimental designs in the field can be almost impossible. I guess what I'm trying to say, Jenny, is that IMO the worst thing about that meat paper (I checked out their methods) was that they failed to correct for some very relevant variables. The stat methods weren't all that bad, I didn't think. The fact that they corrected for all kinds of other dietary factors except for types of carbohydrate intake is so telling. Most Americans get their feedlot meat served to them on a big fat white bun with trans fat fries and an HFCS coke... (but I think you mentioned this, too). This was apparently lost on them. I guess everyone knows that bread and HFCS are harmless! Pathetic.
Thank you for another enlightening post. It is no wonder that the conventional wisdom about maintaining optimum health is often wrong and has people like Dr Eades and Jenny (above) so frustrated. It is not surprising that there is so much diversity of thought and interpretation about health matters among even the most critical thinkers such as yourself, Jenny, Dr Eades, Dr William Davis etc.
I'm feeling for those mice having to wear two pairs of tight shoes!
I would like to suggest that each mouse could potentially serve as its own control, wearing two tight shoes and two loose shoes apiece - thus reducing the effect of inter-subject biological variability. The pattern of loose-tight shoe assignment for each mouse should ideally be randomized to control for any possible "lateral" or "symmetry" effect on bunion formation. Compliance may well be an issue with the mice so their level of adherence to the shoe-wearing protocol must be rigorously documented and included as an independent variable during analysis. That said, we should all guard against an overly mousopomorphic view of this (and other) health issues which may be due to increasing pressure from the mouse lobby to leverage NIH resources in its own interest.
:-) @ phanamere!
Great post as usual, Stephan!
The MD says "Budgets are tight. Could we try millipedes?" (-:
Jenny,
I did see your post. Yet another reason to wonder what exactly they were measuring.
Homertobias,
Yes, intervention trials are terribly expensive. I don't think that justifies using observational data to form public health advice though, unless it's a massive link like the one between smoking and lung cancer. The DART trial was published in 1989 in the Lancet. I had to photocopy it at my library to get a copy.
Chris,
Good points. I revised the post to add a couple of the things you mentioned.
Aaron,
I appreciate the encouragement.
Bogartg1,
Yes, there are a million different views on diet and health. That's why it's important to stay grounded and keep in mind what healthy traditional cultures were doing. That's the only proven model.
Phanamere,
Good point, it might be good to test different types of shoes too, like high heels.
"Good point, it might be good to test different types of shoes too, like high heels."
I say we put mice in tiny Crocs.
Put a loose shoe on one foot and a tight one on the other.
Sushil, I agree -- that's why I suggested Crocs. :)
Oops. I'm short, I shave infrequently, I'm unmarried, and...well, you get the idea. I also have coarse hairs growing out of my ears (associated with heart disease in a study from long ago). I don't want to shave--can I pluck my ears, or shave some mice instead?
Dan,
The fact that you read this blog places you outside the category of guys who don't shave and don't care about their health.
But coarse hairs-- you'd better watch out!