Close This site uses cookies. If you continue to use the site you agree to this. For more details please see our cookies policy.

Search

Type your text, and hit enter to search:

Ethical AI: A bane for science?
 

AI holds vast potential, but a growing body of evidence has revealed deep flaws in how it is used in science. An interdisciplinary team of researchers led by Princeton University has published guidelines for responsible AI use in science

18
Image credits: Canva

AI holds the potential to help doctors find early markers of disease and policymakers avoid decisions that lead to war. But a growing body of evidence has revealed deep flaws in how machine learning is used in science, a problem that has swept through dozens of fields and implicated thousands of erroneous papers.

Now, an interdisciplinary team of 19 researchers, led by Princeton University computer scientists Arvind Narayanan and Sayash Kapoor, has published guidelines for the responsible use of machine learning in science.

“When we graduate from traditional statistical methods to machine learning methods, there are a vastly greater number of ways to shoot oneself in the foot,” said Narayanan, director of Princeton’s Center for Information Technology Policy and a professor of computer science. “If we don’t have an intervention to improve our scientific standards and reporting standards when it comes to machine learning-based science, we risk not just one discipline but many different scientific disciplines rediscovering these crises one after another.”

The authors say their work is an effort to stamp out this smouldering crisis of credibility that threatens to engulf nearly every corner of the research enterprise. A paper detailing their guidelines appeared on May 1 in the journal, Science Advances.

Since machine learning has been adopted across virtually every scientific discipline, with no universal standards safeguarding the integrity of those methods, Narayanan said the current crisis, which he calls the reproducibility crisis, could become far more serious than the replication crisis that emerged in social psychology more than a decade ago.

The good news is that a simple set of best practices can help resolve this newer crisis before it gets out of hand, according to the authors, who come from computer science, mathematics, social science and health research.

“This is a systematic problem with systematic solutions,” said Kapoor, a graduate student who works with Narayanan and who organised the effort to produce the new consensus-based checklist.

The checklist focuses on ensuring the integrity of research that uses machine learning. Science depends on the ability to independently reproduce results and validate claims. Otherwise, new work cannot be reliably built atop old work, and the entire enterprise collapses. While other researchers have developed checklists that apply to discipline-specific problems, notably in medicine, the new guidelines start with the underlying methods and apply them to any quantitative discipline.

One of the main takeaways is transparency. The checklist calls on researchers to provide detailed descriptions of each machine learning model, including the code, the data used to train and test the model, the hardware specifications used to produce the results, the experimental design, the project’s goals and any limitations of the study’s findings. The standards are flexible enough to accommodate a wide range of nuance, including private datasets and complex hardware configurations, according to the authors.

While the increased rigour of these new standards might slow the publication of any given study, the authors believe wide adoption of these standards would increase the overall rate of discovery and innovation, potentially by a lot.

“What we ultimately care about is the pace of scientific progress,” said sociologist Emily Cantrell, one of the lead authors, who is pursuing her PhD at Princeton. “By making sure the papers that get published are of high quality and that they’re a solid base for future papers to build on, that potentially then speeds up the pace of scientific progress. Focusing on scientific progress itself and not just getting papers out the door is really where our emphasis should be.”

Kapoor concurred. The errors hurt: “At the collective level, it’s just a major time sink,” he said. That time costs money. And that money, once wasted, could have catastrophic downstream effects, limiting the kinds of science that attract funding and investment, tanking ventures that are inadvertently built on faulty science, and discouraging countless numbers of young researchers.

In working towards a consensus about what should be included in the guidelines, the authors said they aimed to strike a balance: simple enough to be widely adopted, comprehensive enough to catch as many common mistakes as possible.

They say researchers could adopt the standards to improve their own work; peer reviewers could use the checklist to assess papers; and journals could adopt the standards as a requirement for publication.

“The scientific literature, especially in applied machine learning research, is full of avoidable errors,” Narayanan said. “And we want to help people. We want to keep honest people honest.”

    Tweet       Post       Post
Oops! Not a subscriber?

This content is available to subscribers only. Click here to subscribe now.

If you already have a subscription, then login here.