Open access to biological data has long been a cornerstone of scientific progress, enabling researchers around the world to collaborate, verify results and accelerate discoveries. However, a growing group of experts is warning that unrestricted access may not be appropriate for all biological information in the era of artificial intelligence.
A new paper argues that certain types of pathogen-related data could significantly improve the capabilities of advanced AI systems, potentially increasing biosecurity risks if the information is misused. To address this challenge, the authors propose a new framework for regulating access to the most sensitive biological datasets while preserving the principles of open science wherever possible.
The researchers emphasise that governments should remain cautious about imposing broad restrictions on scientific data. Open access promotes innovation, transparency and reproducibility, all of which are essential to scientific advancement. Yet recent studies suggest that a small subset of biological information—particularly data related to human-infecting pathogens—may require additional safeguards.
Evidence from several AI models highlights the concern.
The stakes are high. Scientists and policymakers have increasingly warned that future AI models could help create severe biological threats by making it easier to design pathogens with enhanced transmissibility or pandemic potential. Researchers refer to these abilities as “capabilities of concern”: AI functions that could enable the creation or optimisation of dangerous biological constructs, for example a model’s ability to predict or generate more transmissible variants of pandemic pathogens. They resemble the capabilities that scientists and policymakers have singled out as most concerning after decades of debate about dual-use biological research.
To reduce these risks, the paper proposes a tiered classification system for pathogen-related datasets. Data would be categorised according to its potential contribution to sensitive AI capabilities, with stricter access controls applied to higher-risk information.
However, the authors argue that technical safeguards alone are not enough.
“Reducing AI biosecurity risks requires technical solutions, but technical solutions do not exist in a vacuum,” the paper states. Effective risk reduction, the authors argue, will require a combination of technological safeguards and robust institutional oversight.
As a result, the researchers propose a governance framework specifically designed to oversee newly generated dual-use pathogen data—information that could support legitimate scientific research but might also be exploited for harmful purposes. Although the proposal is rooted in the United States’ regulatory experience, the authors believe its principles could be applied internationally.
The framework draws lessons from decades of biological research oversight. Institutions such as the U.S. National Institutes of Health have developed rules governing human clinical trials, laboratory biosafety, genetic data protection and the dissemination of potentially dangerous research findings. While these systems have achieved varying levels of success, they provide valuable experience in balancing scientific freedom with public safety.
The proposed governance model is built around several key principles.
The first is clarity. The authors argue that data governance rules should be objective and straightforward to apply, leaving little room for subjective interpretation. Previous oversight systems have often relied on vague standards that force researchers or review committees to make difficult judgments outside their areas of expertise. More technically precise rules, the paper says, would improve consistency while reducing the administrative burden on scientists.
The second principle is independent oversight. The researchers suggest that decisions regarding sensitive biological data should be made by neutral authorities rather than organisations with direct research or funding interests. While self-regulatory systems have helped preserve academic independence, they may also create conflicts of interest when assessing potentially high-risk research.
The paper arrives amid growing global debate over how to govern increasingly powerful AI systems. Most AI safety discussions have focused on regulating model training, limiting access to model weights, or controlling outputs. By contrast, this proposal targets an earlier stage of development: the data itself.
The authors compare their approach to privacy regulations, where controls are often applied before sensitive information becomes widely available. They argue that governing pathogen data upstream may offer a more effective way to reduce risks before advanced AI models acquire potentially dangerous biological capabilities.
As artificial intelligence continues to reshape scientific research, the challenge for policymakers will be finding a balance between openness and security. According to the researchers, protecting society from emerging biosecurity risks should not come at the expense of scientific progress—but neither can the potential consequences of unrestricted access to sensitive pathogen data be ignored.
The paper was authored by Doni Bloomfield of the Fordham University School of Law; Allison Berke of RAND; Moritz S. Hanke and James R. M. Black of the Center for Health Security at the Johns Hopkins Bloomberg School of Public Health; Aaron Maiwald and Oliver M. Crook of the University of Oxford; Toby Webster of RAND Europe; Tina Hernandez-Boussard of the Stanford University School of Medicine; and Jassi Pannu of the Center for Health Security at the Johns Hopkins Bloomberg School of Public Health.
10.06.2026.




