Reassessing the assessment criteria and techniques traditionally used in evaluating computer science research effectiveness.
Academic culture is changing. The rest of the world, including university management, increasingly assesses scientists; we must demonstrate worth through indicators, often numeric. While the extent of the syndrome varies with countries and institutions, La Fontaine’s words apply: “not everyone will die, but everyone is hit.” Tempting as it may be to reject numerical evaluation, it will not go away. The problem for computer scientists is that assessment relies on often inappropriate and occasionally outlandish criteria. We should at least try to base it on metrics acceptable to the profession.
In discussions with computer scientists from around the world, this risk of deciding careers through distorted instruments comes out as a top concern. In the U.S. it is mitigated by the influence of the Computing Research Association’s 1999 “best practices” report.a In many other countries, computer scientists must repeatedly explain the specificity of their discipline to colleagues from other areas, for example in hiring and promotion committees. Even in the U.S., the CRA report, which predates widespread use of citation databases and indexes, is no longer sufficient.
a For this and other references, and the source of the data behind the results, see an expanded version of this column at http://se.ethz.ch/~meyer/publications/cacm/research_evaluation.pdf.
Informatics Europe, the association of European CS departments,b has undertaken a study of the issue, of which this Viewpoint column is a preliminary result. Its views commit the authors only. For ease of use the conclusions are summarized through 10 concrete recommendations.
Our focus is evaluation of individuals rather than departments or laboratories. The process often involves many criteria, whose importance varies with institutions: grants, number of Ph.D.s and where they went, community recognition such as keynotes at prestigious conferences, best paper and other awards, editorial board memberships. We mostly consider a particular criterion that always plays an important role: publications.
Research evaluation
Research is a competitive endeavor. Researchers are accustomed to constant assessment: any work submitted (even, sometimes, invited work) is peer-reviewed; rejection is frequent, even for senior scientists. Once published, a researcher’s work will be regularly assessed against that of others. Researchers themselves referee papers for publication, participate in promotion committees, evaluate proposals for funding agencies, answer institutions’ requests for evaluation letters. The research management edifice relies on assessment of researchers by researchers.
b See http://www.informatics-europe.org.
Criteria must be fair (to the extent possible for an activity circumscribed by the frailty of human judgment); openly specified; accepted by the target scientific community. While other disciplines often participate in evaluations, it is not acceptable to impose criteria from one discipline on another.
Computer science concerns itself with the representation and processing of information using algorithmic techniques. (In Europe the more common term is Informatics, covering a slightly broader scope.) CS research includes two main flavors, not mutually exclusive: Theory, developing models of computations, programs, languages; Systems, building software artifacts and assessing their properties. In addition, domain-specific research addresses specifics of information and computing for particular application areas.
CS research often combines aspects of engineering and natural sciences as well as mathematics. This diversity is
April 2009 | Vol. 52 | No. 4 | Communications of the ACM
References:
http://www.informatics-europe.org
http://se.ethz.ch/~meyer/publications/cacm/research_evaluation.pdf