Biosecurity Statements Repository

By Shiying. H, ptnhean @ 2025-10-03T14:22 (+14)

We are glad to share the first draft of a Biosecurity Statements Repository, a project developed by @ptnhean and me in response to @Mslkmp and @Tessa A 🔸's post (‘Five Tractable Biosecurity Projects You Could Start Tomorrow’) published earlier this year. 

Aim:

Our repository compiles 10–20 examples of biosecurity statements and practices from biological design tools and AI models. These examples have been analysed and grouped based on how different developers are addressing biosecurity and dual-use risk concerns.

Target Audience:

This repository is designed for developers working on computational biology research who recognise the dual-use risks associated with AI tools. While many are aware of these risks, they may be unsure how to publish statements addressing them, for both tool users and the general public. 

Methodology:

We started by gathering existing statements and practices from organisations building AI biological design tools. We then reviewed these statements and thematically identified common features, including explicit acknowledgement of dual-use risks. 

Following feedback from Tessa and Max, the repository has been updated with further relevant categories and emergent themes to improve readability and user-friendliness. Some of these categories now highlight how users interact with the tool/model, access management, tool subcategories, and more. 

Intended Impact: 

We hope that this repository can help to:

How to Use: 

The first tab provides an overview of all included tools and their corresponding categories. The extractions tab includes the exact wording used in the original biosecurity statements, including the paragraphs we referenced when categorising these tools. 

A section below details a list of categories and values used in the repository, for those who are interested. 

Feedback:

If you know developers of AI biological design tools or models who may benefit from this resource, we would appreciate your help in sharing it with them. 

We also welcome feedback on the repository itself, as well as suggestions on any additional categories, important tools or practices we may have missed. 

Acknowledgements:

Team: Shiying He, PT Nhean

Conceptualisation and feedback: Tessa Alexanian and Max Langenkamp

Tool Subcategories and Values: Adapted from ‘Understanding AI-Facilitated Biological Weapon Development’ by the Centre for Long-Term Resilience (CLTR) and the Global Risk Index for AI-enabled Biological Tools by CLTR and RAND Europe; detailed in the table below. 

Repository Categories and Values:

Each category is listed below with its possible values.

Date (MM/YYYY)
  • MM/YYYY
Countries
  • Country of the author/developer’s host institution, not their nationality
Organization
  • Listed separately from the model name 
User’s interaction with the model

Choose ≥1 value

  • Local execution: running the model directly on a user’s device or local infrastructure, without requiring external servers or internet access.
  • API access: accessing the model through a web-based platform, usually hosted by the developer or a third party (e.g. querying a pathogen prediction model via a cloud-based API)
  • Secure server: hosting and operating the model on a controlled server with strong security and access restrictions to limit unauthorized or unsafe uses (e.g. requiring user authentication and monitoring)
  • Not reported: interactions with the model are unclear because it was released only to a small group of scientists for testing (e.g. Google AI co-scientist)
Paywall

Choose 1 value

  • Yes, with free trials
  • Yes, no free trials
  • Yes, but with a free version
  • No
  • Not reported: presence of a paywall is unclear because the model was released only to a small group of scientists for testing (e.g. Google AI co-scientist)
Access Management

Choose ≥1 value

  • Authentication: verify user identity before granting access
  • Open weights: model parameters are publicly available for download and use
  • Open source: model code is publicly accessible and modifiable

Tool Subcategories (adapted from ‘Understanding AI-Facilitated Biological Weapon Development’ by the Centre for Long-Term Resilience (CLTR) and the Global Risk Index for AI-enabled Biological Tools by CLTR and RAND Europe) 

Choose ≥1 value 

  • Protein design tools: tools that can predict the sequence of proteins with specified structural and/or functional properties (e.g. binding with a given target).
    • Example user input: 3D protein structure
    • Output: Amino acid sequences
  • Small biomolecule design tools: Tools that can predict molecular structures with specific profiles (e.g. generating a drug that provokes a desired biological response and maintains acceptable pharmacokinetic properties)
    • Example user input: ligand structure, target molecule structure or class, and desired property
    • Output: molecular structure
  • Pathogen property prediction: tools that can predict or detect features of a pathogen such as propensity for zoonotic spillover, host tropism, likelihood of infecting humans, virulence, etc.
    • Example user input: genome sequences
    • Output: zoonotic spillover prediction score or classification (e.g. high risk)
  • Host-pathogen interaction prediction tools: tools that can predict the protein-protein interactions between a given host and pathogenic agent (e.g. predicting likelihood of antibody escape for viral mutations, exploitation of host mechanisms, or the virus’ entry mechanism into host cells).
    • Example user input: host protein sequences, viral protein sequences
    • Output: Likely interactions between host and viral proteins
  • Viral vector design tools: tools that can predict the amino acid sequences of virus capsids with the aim of optimizing them as delivery vectors (e.g. capable of assembling and packaging their own genomes, low immunogenicity)
    • Example user input: target capsid amino acid segment for mutation
    • Output: amino acid sequences
  • Immunological system modelling: tools that artificially replicate a component of the human immune system with the aim of predicting immune responses (e.g. predicting T-cell receptor epitope recognition)
    • Example user input: amino acid sequence of TCR CDR3 region and the epitope
    • Output: likely TCR recognition of an epitope, or COVID-19 clinical outcome prediction
  • Experimental design/simulation and autonomous tools: tools that are able to generate and simulate designs given a predefined objective, and predict experimental outcomes. Tools that are able to conduct experiments (including physical tests, modelling, or data mining) without human intervention.
    • Example user input: experimental workflow and variables, and laboratory automation equipment
    • Output: optimized methods or variables, simulated experimental data, or experimental data
Biosecurity Actions

Choose ≥1 value 

  • Statement acknowledging dual-use risk: a formal recognition by developers that their tools could be misused for harmful biological purposes (e.g. a disclaimer noting potential misuse in a model card for a protein design tool, or an ethics statement).
  • Expert engagement: involving biosafety, bioethics, and AI experts in the development and review processes (e.g. forming an external advisory board, qualitative interviews with experts to ensure safe use cases).
  • Curation of training data: using synthetic data, or filtering training data to limit biological data that could enable dangerous capabilities (e.g. excluding pathogen virulence factors from a genomic training dataset).
  • Evaluation for dual-use risk: assessing models or tools for their potential misuse before release (e.g. red teaming exercises on a biological design tool, or adopting frontier evaluation frameworks)
  • Adoption or development of governance frameworks: applying internal policies or aligning with external standards to ensure safe development (e.g. adopting the WHO guidance on dual-use research, or creating a lab-specific AI-biosafety protocol)
  • Community consultations: forums, engagements, discussions or dialogues with potential users and/or stakeholders (e.g. AlphaFold 3: "Building on the external consultations we carried out for AlphaFold 2, we’ve now engaged with more than 50 domain experts, in addition to specialist third parties, across biosecurity, research and industry, to understand the capabilities of successive AlphaFold models and any potential risks. We also participated in community-wide forums and discussions ahead of AlphaFold 3’s launch.")
    • E.g. Co-Scientist’s trusted tester program to gather feedback from researchers using the tool
  • Provision of recommendations: Provides recommendations for future work, safeguards, mitigation strategies.
  • Publication of evaluation tests and/or results: provides additional details on the methodology of evaluations conducted, benchmarks used and/or results of evaluations
Information Hazards

Only flag if present
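For readers who want to work with the repository programmatically, the categories above amount to a simple per-row schema. Here is a minimal sketch in Python; the class name, field names, and the example entry are all illustrative assumptions, not the repository's actual data model:

```python
from dataclasses import dataclass

# Allowed values, taken from the category list above.
INTERACTION_MODES = {"Local execution", "API access", "Secure server", "Not reported"}
ACCESS_MANAGEMENT = {"Authentication", "Open weights", "Open source"}

@dataclass
class RepositoryEntry:
    """One hypothetical row of the repository (illustrative only)."""
    date: str                # MM/YYYY
    country: str             # country of the developer's host institution
    organization: str        # separate from the model name
    model_name: str
    interaction: set         # choose >=1 value from INTERACTION_MODES
    paywall: str             # e.g. "Yes, with free trials", "No", "Not reported"
    access_management: set   # choose >=1 value from ACCESS_MANAGEMENT
    tool_subcategories: set  # e.g. {"Protein design tools"}
    biosecurity_actions: set # e.g. {"Expert engagement"}
    information_hazards: bool = False  # only flag if present

    def validate(self):
        # "Choose >=1 value" categories must be non-empty and within range.
        assert self.interaction and self.interaction <= INTERACTION_MODES
        assert self.access_management <= ACCESS_MANAGEMENT

# Example entry: the values here are placeholders for illustration.
entry = RepositoryEntry(
    date="05/2024",
    country="United Kingdom",
    organization="Example Lab",
    model_name="Example Design Tool",
    interaction={"API access"},
    paywall="No",
    access_management={"Authentication"},
    tool_subcategories={"Protein design tools"},
    biosecurity_actions={"Expert engagement", "Community consultations"},
)
entry.validate()
```

A structure like this would let contributors check new rows for out-of-range category values before adding them to the spreadsheet.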