Assessing Science is hard! NSERC bureaucrats should know it, but then so do we!

Dozens of Canadian scientists are now back home from Ottawa after a week of “grant selecting” at NSERC. Many are self-satisfied after their five days of empowerment (the “Ottawa power trip”?). Others are embittered by the ever-tightening bureaucratic grip on Canada’s science evaluation and granting.

It is unfortunate that this post is coming post-competition, because it could have been a useful reminder for panelists of a few valuable lessons from our science history books: Assessing Science is hard; evaluating the research of our peers is a complex process; and beware of anyone advocating or working toward the one “correct” metric for science.

I wrote before about the self-satisfaction that most panelists feel after completing a competition cycle at NSERC. This is to be expected. Panelists are often convinced that they have done their best being as fair and as just as possible under the rules and constraints of the game provided to them. They haven’t been asked to evaluate these rules in the middle of a competition. They are only asked to play by them. They do their best under the circumstances and feel good about having done so.

But now that they are home, it may be high time for them to think through these rules, how they affected their own judgment, and their impact on the research careers of their peers.

But then it hit me that maybe something should have been done much earlier, before their trip to Ottawa, or at least at the beginning of the grant selection process. As part of their initiation to the process, NSERC should have considered holding special workshops for panelists: sessions where a healthy dose of objectivity-inducing humility could be impressed on them. How?

One could remind the panelists that there is a slightly surreal aspect to this activity of assessing their peers. There should be a constant awareness that it may be near-impossible to accurately evaluate scientific work, and that every time a committee decides to award or decline a grant, they are making a judgment that can be fraught with unintended and unexpected consequences.

Scientists should know that Science is filled with examples of major discoveries that were initially either under-appreciated or not understood at all by their peers.

Alfred Wegener deduced continental drift by integrating data from diverse disciplines. He was generally ridiculed at the time (1915), but was vindicated 50 years later when plate tectonics prevailed.

George Boole’s fellow mathematicians thought he was wasting his time comparing the ancient arts of logic and reasoning to mathematical systems. About 80 years later, Claude Shannon stumbled on this obscure “Boolean Algebra” and proposed using it to make electrical switches do logic, which eventually led to the current great leaps in Computer Science.

Satyendra Nath Bose never won the Nobel, though his then-unappreciated work on Bose-Einstein statistics and condensates eventually led to several Nobel prizes, including the one shared by our own Carl Wieman.

Ludwig Boltzmann, who made founding contributions to the fields of statistical mechanics and statistical thermodynamics, eventually committed suicide after enduring ridicule from other scientists for theories that were accepted only posthumously.

Scientists are sometimes even unsure about their own scientific ideas. In 1917, Einstein modified the equations of general relativity that he had introduced two years earlier, adding an extra term involving what is called the “cosmological constant”. He regretted this addition 12 years later, after some work of Edwin Hubble, only to be vindicated seven decades after that, when it was discovered in 1998 that there really is a need for the cosmological constant. In other words, “Einstein’s ‘biggest blunder’ was, in fact, one of his most prescient achievements”.

If even Einstein demonstrably made mistakes in judging his own research, how can the rest of us reliably and systematically measure the value of science, and claim to organize the scientific systems of entire countries around our attempts to measure it?

In his essay “The mismeasurement of science”, Michael Nielsen argues that heavy reliance on a small number of centralized metrics is bad for science: it suppresses cognitive diversity, it creates perverse incentives, and it misallocates resources.

Nielsen’s main argument is essentially “against homogeneity in the evaluation of science: it’s not the use of metrics…. rather the idea that a relatively small number of metrics may become broadly influential.” He argues that “it’s much better if the system is very diverse, with all sorts of different ways being used to evaluate science”. He advocates for the “use of a more heterogeneous system”. In other words, the exact opposite of what NSERC’s bureaucrats are forcing us to do! And he hadn’t heard of the HQP (highly qualified personnel) metric for evaluating science, which would have disqualified, among many other scientific giants, Isaac Newton, Michael Faraday, Henri Poincaré, and John Milnor (whose 80th birthday is being celebrated at BIRS this coming week).

His view is that “the best way to evaluate science is to ask a few knowledgeable, independent- and broad-minded people to take a really deep look at the primary research, and to report their opinion, preferably while keeping in mind the story of Einstein”.

This is as close as one can get to the evaluation system that NSERC just did away with.

4 Responses to Assessing Science is hard! NSERC bureaucrats should know it, but then so do we!

Peter Bell says:

February 21, 2011 at 12:41 am

Bravo! Thanks for great writing.

Daniel Lemire says:

February 21, 2011 at 2:01 am

This post should become a manifesto. I would sign it.

- Ghoussoub says:
  
  March 1, 2011 at 4:18 pm
  
  Thanks Daniel for this huge compliment coming from a pro.
  
Antony Hodgson says:

February 22, 2011 at 8:28 pm

As someone who has just recently served on an NSERC Discovery Grant committee, I completely agree with this assessment. We are moving towards a system in which those who have the largest research ‘machines’ will get virtually all the money, and those who are just starting out, who are working in small groups or who have a broad mix of activities which do not necessarily result in many research publications will have difficulty being funded at all. By concentrating all the resources into fewer hands, rather than providing a modicum of enabling resources to a wide array of researchers, we risk cutting a small number of deep and wide grooves in the research landscape, thereby creating well-worn ruts, rather than exploring the terrain widely. I worry deeply about creating such a narrow focus in our national research efforts.