A study was reported last week in various news sources with the claim that heavy drinking causes pancreatic cancer. The study results do tend to suggest that conclusion, though there are a handful of issues worth mentioning.
The article was written by employees of the American Cancer Society. While sometimes perceived as a scientific organization, they are primarily a political activist group. For example, they have engaged in efforts to prevent people from learning about tobacco harm reduction, disguising their political goals with sciencey claims. I am not sure whether they deserve the same distrust on the topic of alcohol that they have earned regarding tobacco, but I would not assume they are being honest. They are certainly not doing great epidemiology. For example, they controlled for whether someone had a history of gallstones, but alcohol consumption has a causal relationship with gallstones (in particular, it protects against them), so, as anyone who has read this blog over the last week should know, it should not be controlled for as a confounder. It probably does not matter much, but it might. They also controlled for body mass index, which has the same problem and might matter more. Additionally, they controlled for such variables as marital status, which are not plausibly confounders, though at least they are not caused by the exposure.
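For readers who want to see why that matters, here is a minimal simulation of the general principle, with entirely invented numbers and effect sizes (this is not the ACS model, and the directions and magnitudes of the arrows are arbitrary): if a variable like BMI is itself partly caused by the exposure, then "controlling for" it strips out part of the very effect you are trying to estimate.

```python
# Sketch with made-up numbers: adjusting for a variable on the causal pathway
# (exposure -> BMI -> outcome) removes part of the exposure's total effect.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

alcohol = rng.normal(size=n)                            # exposure (standardized)
bmi = 0.5 * alcohol + rng.normal(size=n)                # partly caused by the exposure
risk = 0.3 * alcohol + 0.2 * bmi + rng.normal(size=n)   # outcome depends on both

def ols_coefs(y, *covariates):
    # Least-squares fit with an intercept; returns the coefficient vector.
    X = np.column_stack([np.ones(len(y)), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

total = ols_coefs(risk, alcohol)[1]          # total effect, roughly 0.3 + 0.5 * 0.2 = 0.4
blocked = ols_coefs(risk, alcohol, bmi)[1]   # adjusting for BMI blocks that pathway, roughly 0.3
print(f"unadjusted effect {total:.2f}; effect after 'controlling for' BMI {blocked:.2f}")
```

In the gallstone example the intermediate effect runs in the protective direction, so the distortion goes the other way, but the lesson is the same: a step on the causal pathway is not a confounder.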
In short, they just threw in whatever variables they happened to have, which is standard ACS practice. At least it might be innocent ignorance of proper methodology, without any biased fishing for which confounders yield the "best" answer, though we cannot be sure. The datasets they use for most of their analyses are old cohorts for whom they gathered exposure data a long time ago (in this case, 1982) and have been watching for health outcomes ever since. Since they cannot collect the optimal information for each exposure-disease combination they want to study, they generally just throw in what they have. That brings up another difficulty with interpreting the results of these studies, though one that is an inherent challenge, not an error: we only know the exposure data from 1982, and even that is limited by the quality of a 1982 survey (there are also some sampling issues with that survey that I will not go into). We also have evidence of measurement error in the alcohol variable (e.g., when they studied smokeless tobacco using this data, they found there was still an association with alcohol-caused liver disease even after controlling for alcohol consumption, which means they apparently did not measure alcohol very accurately). Indeed, most surveys underestimate alcohol consumption because people under-report it.
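To see why a leftover association points to measurement error, consider this sketch, again with made-up numbers and purely illustrative variables (nothing here is the ACS data): when the adjustment variable is measured with error, adjusting for it only partially removes the confounding, and a spurious residual association survives.

```python
# Sketch with made-up numbers: residual confounding when the confounder
# (alcohol) is only measured with error.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

true_alcohol = rng.normal(size=n)
snuff = 0.4 * true_alcohol + rng.normal(size=n)           # correlated with drinking, no real effect on the liver
liver_disease = 0.6 * true_alcohol + rng.normal(size=n)   # caused by drinking only
reported_alcohol = true_alcohol + rng.normal(size=n)      # noisy survey measure of drinking

def ols_coefs(y, *covariates):
    X = np.column_stack([np.ones(len(y)), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

with_truth = ols_coefs(liver_disease, snuff, true_alcohol)[1]        # close to 0: confounding removed
with_report = ols_coefs(liver_disease, snuff, reported_alcohol)[1]   # clearly above 0: residual confounding
print(f"snuff coefficient controlling for true alcohol: {with_truth:.2f}; "
      f"controlling for reported alcohol: {with_report:.2f}")
```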
All that said, the results seem to be legitimately interpretable as supporting the claim. But let's not get quite as excited about it as those reporting it did. The authors were clearly fishing a bit for the result, so their exact claims should not be taken seriously. First, the increase in risk was only about 20%, so even if the result were exactly right, it is a fairly modest contribution to the harm from heavy drinking. The results suggested that the harmful effects start at 3 drinks per day (or more precisely, a reported 3 drinks per day, which probably means more), but we know that if they had seen the results starting at 2 drinks per day, they would have reached a similar conclusion. This matters because it means that the results are biased upwards (that is, they tend to exaggerate the effect) because the hypothesis – drinking at least 3 drinks per day increases pancreatic cancer risk – was designed to fit the data. (See this for more about that concept if you are interested and it is not clear why this would be.) Consider this example to help explain: assume that other studies had suggested that the unhealthy effects start at 2 drinks per day; then this study would have somewhat contradicted the previous belief, but it would have been described as if it supported it, biasing the evidence in favor of the claim.
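If it is not obvious why letting the data pick the cutoff inflates the estimate, a quick simulation makes the point (all of the numbers are invented and are not meant to match the study): even when a real, modest effect genuinely starts at 3 drinks per day, a practice of always reporting whichever cutoff looks worst produces estimates that run high on average.

```python
# Sketch with made-up numbers: reporting the best-looking cutoff in each study
# exaggerates the risk ratio on average, even when a true effect exists.
import numpy as np

rng = np.random.default_rng(2)

def one_study(n=4000, true_rr=1.2, base_risk=0.02):
    drinks = rng.poisson(1.5, size=n)                     # self-reported drinks per day
    p = base_risk * np.where(drinks >= 3, true_rr, 1.0)   # the true effect really does start at 3
    cancer = rng.binomial(1, p)
    rrs = []
    for cutoff in (2, 3, 4):                              # three "looks" at the same data
        exposed = drinks >= cutoff
        rrs.append(cancer[exposed].mean() / cancer[~exposed].mean())
    return rrs

results = np.array([one_study() for _ in range(2000)])
print("mean RR at the prespecified cutoff of 3:", round(results[:, 1].mean(), 2))
print("mean RR when each study reports its best-looking cutoff:", round(results.max(axis=1).mean(), 2))
```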
The finer you fit the worldly conclusions to the specific data, the more bias you get, both in the sense of overstating the support for the generically-phrased worldly hypothesis (E causes D) and in exaggerating the strength of the association. So when some commentators pointed out that women only showed the effect at 4 or more drinks per day (a finer level of mining that the study authors actually chose not to include in their abstract, though obviously it was in the results), they increased this "over-fitting" problem, as it is sometimes called. The worst example of this, though, which the study authors are guilty of and some reports fell for, is separating out the effects of liquor, beer, and wine, claiming that the latter two did not show an association, and (obviously) that liquor showed a stronger association than the average of the three. This is not actually true in the first place – beer and wine showed the association, it was just smaller – but the real problem is chopping up the association to see if it can be made stronger when only some of the dataset is considered and the rest is ignored. It pretty much always can, so doing such fishing and concluding that whatever it generates is right is pretty much a guarantee that the result is biased. It was very nearly guaranteed that one of beer, wine, or liquor would show a substantially stronger association than the other two by chance alone, so picking it and highlighting it tells us very little. It would not be fishing if there were a good theoretical reason to expect the result, but it would be rather odd to expect that women are able to safely drink more than men, or for different sources of alcohol to matter a lot for the pancreas (they plausibly matter for oral cancer, where highly concentrated alcohol in liquor touches the mucosa).
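The same arithmetic applies to subgroup splitting. A sketch (again, hypothetical rates, not the study's data): even if beer, wine, and liquor carried exactly the same true risk, whichever of the three happens to look strongest in a single study will usually look quite a bit stronger than the truth.

```python
# Sketch with made-up numbers: with three subgroups that share an identical
# true risk ratio of 1.2, the best-looking subgroup is biased upward.
import numpy as np

rng = np.random.default_rng(3)

def best_looking_subgroup(true_rr=1.2, base_risk=0.02, n_per_group=3000):
    rrs = []
    for _ in range(3):                                    # beer, wine, liquor: identical truth
        exposed = rng.binomial(1, base_risk * true_rr, size=n_per_group).mean()
        unexposed = rng.binomial(1, base_risk, size=n_per_group).mean()
        rrs.append(exposed / unexposed)
    return max(rrs)

best = [best_looking_subgroup() for _ in range(2000)]
print(f"true RR in every beverage subgroup: 1.2; "
      f"average RR of whichever subgroup looks strongest: {np.mean(best):.2f}")
```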
This reminded me of another aspect of the wind turbine case I wrote about a few times over the last few days (sorry to flog it, but I can only tell the stories that I have). Without going into confidential details, consultants working for the industry were trying to claim that one epidemiology study's results should be dismissed because the authors did not use a Bonferroni correction. You are excused for not knowing what that is, even if you read a lot of epidemiology studies, because it is basically never done. It is a simplistic statistical trick for "penalizing" your statistical tests when you are doing multiple analyses of the same data (and interpreting the results a particular way), making it harder to call something statistically significant. It did not actually apply at all in that case, but to some extent the concept addresses the same problem that I describe as bias resulting from fishing in the data. Unfortunately, no simple rule can correct for that bias, so it is not really very useful in addressing the main problem, and it is based on very simplistic assumptions that do not actually describe the way fishing is typically done. Nevertheless, my recent experience explaining why a particular claim was nonsense was a reminder that there is a simple thought exercise, taught in first-year statistics-for-epidemiologists, that reminds people there is a problem if we are allowed to look at the same data too many different ways (e.g., looking at 2 drinks, and 3, and 4, and men, and women, and beer, and liquor), and they remember hearing about that Bonferroni thing. Anyone who would try to use the specific Bonferroni arithmetic is generally making a host of errors (it is a baby step toward understanding something, not a method that works well in the complicated real world), but everyone who took a year of epidemiology did learn that fishing is a problem.
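For the curious, the Bonferroni arithmetic itself is trivial, which is part of why it is such a blunt instrument. It amounts to this (the p-values below are invented for illustration and come from no study mentioned here):

```python
# The textbook Bonferroni rule: with m analyses of the same data, require
# p < alpha / m for each one (equivalently, multiply each p-value by m).
alpha = 0.05
p_values = [0.012, 0.030, 0.047, 0.20, 0.64]   # five hypothetical looks at the same data
m = len(p_values)
threshold = alpha / m                          # 0.05 / 5 = 0.01

for p in p_values:
    verdict = "significant" if p < threshold else "not significant"
    print(f"p = {p:.3f}: {verdict} at the corrected threshold of {threshold:.3f}")
```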
As the ACS study authors acknowledge in their introduction, there is controversy about the alcohol-pancreatic cancer relationship, and their study obviously does not resolve it. I ran into this controversy a few years ago at a conference talk in which I was insisting, in the context of one study whose authors claimed it supported the (never actually supported) claim that smokeless tobacco causes pancreatic cancer, that they should have controlled for alcohol consumption (we were presenting an exercise about how the reader could do that even though the original authors did not). Someone who I believe was from the U.S. National Cancer Institute pointed out that NCI's position was that alcohol does not cause pancreatic cancer, though she agreed that in the particular study in question it was strongly associated, so a case could be made for what I was claiming. It was kind of interesting that the U.S. government position was that, in this particular controversy, this particular evil vice is not a problem after all. I will be interested to see how that evolves. In any case, there were strongly contradictory results before the new study, and following the study… well, obviously there still are.
This reflects what might be rule number one about interpreting the health news (one that reporters should learn): the new study is not necessarily better than an old study, and it is rare that a new study should move our beliefs very far from where they stood based on all previously existing knowledge.