I have often tried to explain how naive it is for people to believe that the health science journal peer review process provides some magical confirmation of the accuracy of the analysis in a published paper. ("Naive" is a fair description of those outside the process who have been misled about it; it is far too charitable for those who have been part of the process and still pretend – or perhaps worse, actually believe – that it provides such confirmation.) Yes, the process does screen out some of the most absurd junk, unless it conforms to a favored political agenda, in which case it gets published too. But it is unusual for the peer review process to seriously favor good analysis over bad, so long as the bad is basically literate and throws in a few of the right words, and vanishingly rare for it to actually improve an analysis in any substantive way.
So what do reviewers actually do?
In a column this week, Samuel A. Culbert, a management professor and expert on employee performance reviews, provides a good metaphor. Culbert argues against the rhetoric of the union-busting efforts, particularly those directed at teachers' unions, going on in the U.S. right now. That rhetoric holds that unions protect incompetent employees and prevent rewarding those who are best. Culbert counters that the performance reviews that go a long way toward determining whether someone is "incompetent" or "best" are subjective evaluations by supervisors. These supervisors, due to their own agendas or perhaps because they are not actually competent to judge, base ratings not on how much someone contributed to the bottom line, but predominantly on how "comfortable" the boss is with the employee. As he puts it:
Think about it. Performance reviews are held up as objective assessments by the boss, with the assumption that the boss has all the answers.
This fairly obvious observation should also be the one that overcomes the naivety about peer review. Like workplace situations where a subjective review is necessary (i.e., where it is not possible to look at someone's sales figures or some other objective measure), peer review is a subjective exercise with no particular rules or even standards. Papers are reviewed by a few more-or-less random people who likely do not have all the answers. Typically they know something about the subject matter, but they have no expertise in forensic epidemiology – extracting hidden information from the meager details the authors report about what they did – nor expertise in research methods beyond what they use in their own studies. Good graduate students are often excellent reviewers, with their fresh expertise in methods and recent experience thinking hard about the topic. More senior researchers, like more senior managers, have often settled into a rut and tend to review papers based on whether they feel comfortable with them. If they feel uncomfortable with the conclusions or with approaches they do not understand, they give a bad review.
This has the obvious downside that it passes through a lot of bad analyses. But it also has negative effects of the type Culbert is concerned about when he writes,
What employee would ever say that the boss is wrong, and offer an idea on how something might get done better?
Indeed, it is very difficult to get something published that challenges the orthodoxy or that is done in ways that are better than the accepted standard. Every week quite a few papers are published that propose tiny little improvements in the way statistical calculations are done in epidemiology. These are popular with authors because they are fun mathematical games, popular with certain kinds of journals because they give the illusion of seriousness, and popular with reviewers and editors because either they too are the types who like those games, or they do not feel threatened by arcane incremental changes that they will never bother with. On the other hand, if even one useful scholarly analytic paper (not just a bemoaning commentary) is published that challenges one of the real fundamental problems in the science, it is a good month.
Culbert's main thesis is a partial solution he has worked out for the employee evaluation problem:
...boss and subordinate are held responsible for setting goals and achieving results. No longer will only the subordinate be held accountable for the often arbitrary metrics that the boss creates. ...bosses are taught how to truly manage, and learn that it’s in their interest to listen to their subordinates to get the results the taxpayer is counting on. Instead of the bosses merely handing out A’s and C’s, they work to make sure everyone can earn an A. And the word goes out: “No more after-the-fact disappointments. Tell me your problems as they happen; we’re in it together and it’s my job to ensure results.”
This is the type of peer review that I have been trying to encourage and develop, without much success I have to admit, for the last ten years. The idea is to recognize that every paper that contains any unique contribution – particularly new data that would be lost for all purposes if not published, but also fundamentally new analyses of any importance – should be published. So once a paper finds its right home, in terms of subject matter and technical level, the optimal peer review process should be designed to make it good and publish it. Instead of the purely subjective or political model, where someone can simply say "no" with little justification, the reviewer saying "you need to fix it by doing…" is fair without being so regimented that it becomes counterproductive. Indeed, it is good for creating value.
Of course, this can only go so far. Editors and reviewers cannot be asked to fix dismally bad research reports. I have done it a few times, but only when I saw serious value buried beneath a pile of poor analysis and terrible writing. On the other hand, this is what most people think reviewers and journals do, right? They think journals are hands-on, playing an active role, working with the author to make the paper worthy. The reality is that, with the exception of a very few journals, they mostly just sort submissions, rejecting some and telling others to make simplistic, unimportant changes (trying to take away good ideas more often than adding them, in my experience) as the condition for publication.
Moreover, authors who are interested in just cranking out dozens of lame papers will not be interested in taking the time to make them good (because – ironically, given the metaphor – one fairly meaningless but nevertheless often-used objective measure of academics' performance is a count of how many papers they publish, no matter the quality or number of coauthors). But many authors and readers would appreciate and benefit from this approach.
The metaphor should not be carried too far. The damage done to someone's life (and to the institution) by an incompetent or politically hostile manager is much greater than the damage from bad peer review, because the latter offers the solution of simply shopping around for a better review. Thus, union-type protections against arbitrariness are rather more important. I know from personal experience that after the scientific leadership in public health at the University of Alberta was replaced by political types, I suffered exactly the kinds of arbitrary attacks that union rules are designed to prevent. In order to try to force me to stop doing research on tobacco harm reduction (which the new department chair openly admitted was his motivation; others were slightly – but only slightly – cagier about it), the administration gave me the lowest score on my annual evaluation of anyone in the unit, even though by basically every imaginable measure I was among the top few professors in the department. (The first of these was changed on appeal, thanks to the faculty union, and the second one was rendered moot by the severance I accepted in lieu of filing a formal grievance about the full constellation of complaints I had about these and other actions taken against me.)
To finish by moving back on-point for my Sunday series about how to figure out who to believe: the analog of the worst-case stereotype of the union solution is that anyone can write whatever they want and publish it, and no one can ever judge them for competence because we cannot trust those who judge. The opposite is to blindly trust an arbitrary set of judges who have been demonstrated to not do a very good job. Neither of these is any good, and intelligent and creative people who really want to fix the problems should be able to figure out some option that takes some of what is good about both solutions, both for peer review and for public employees. Both cases seem to require a web of trust (which peer review is supposed to be), one that necessarily includes openly stated standards and analysis (which standard peer review lacks) and that still allows the exercise of judgment (unlike overly rigid union rules or simplistically believing whatever someone happened to have already published). But in both cases, there are a lot of vested interests who prefer one of the broken extreme solutions over something better for educating our kids or for the science.