The 2025 Peregrine Report

208 Expert Proposals for Reducing AI Risk

Maximilian Schons, Samuel Härgestam, Gavin Leech, and Raymund Bermejo 

In collaboration with and supported by Halcyon Futures

Say you are unconstrained by money and can get all the talent in the world – what are the top interventions that will have a substantial impact over the next 2 years?

Executive Summary

Purpose and context. By early 2025, mainstream debates about AI had recognized the possibility of transformative AI coming within just a few years, far faster than most historical forecasts. However, we found no comprehensive list of proposed AI risk mitigations that would be viable in such a scenario. This report addresses that gap, complementing resources like the IAPS AI Reliability Survey (O’Brien, 2025), which identifies the most promising research prospects to guide strategic AI R&D investment, and Risk & Reward, 2024 AI Assurance Technology Market Report (Juniper Ventures, 2024), which explores the landscape of AI risk management from an investment perspective.

Methods. We conducted 48 in-depth interviews with key staff at OpenAI, Anthropic, Google DeepMind, Mila, AMD, the EU AI Office, multiple AI Safety Institutes, METR, RAND, Scale AI, GovAI, Transluce, and ARIA. Participants were explicitly asked to consider interventions that might currently seem cost-prohibitive or politically infeasible – the focus was on fast, positive impact, assuming transformative AI were to arrive within only a few years. Distilled summaries of these interviews then served as the basis for a four-day retreat at which 25 senior participants discussed them further. To ensure participants could speak freely, both the interviews and the retreat were held under the Chatham House Rule.

Results. From the above interviews we distilled two main results: I) a structured portfolio of 208 initiatives, clustered into eight domains; II) a set of four clusters of broader strategic considerations affecting the viability of any such efforts:

    1. A need for readiness: Too many efforts still optimize for polish over time-to-impact. With multi-year research cycles increasingly out of step with AI progress, execution needs to shift toward rapid prototypes, staged pilots, and funding mechanisms that can mobilize substantial capital in weeks or months, rather than quarters or years.

    2. A need for coordination: The ecosystem remains fragmented and often duplicative. Actors working on risk mitigation should be pragmatic – one does not need total alignment with all other actors to have fruitful collaborations.

    3. A need for standardization: Interviewees repeatedly called for shared audit interfaces, interoperable evaluation layers, clear capability surfaces, and operational definitions for terms like “AGI” to prevent institutions from talking past each other.

    4. Finally, a need to address capacity constraints: evaluation capacity is too small, technically grounded leadership is scarce, and the field still relies too heavily on inexperienced talent rather than recruiting seasoned operators from adjacent domains.

Conclusion. The input we captured from key AI stakeholders converged on faster execution, better coordination, shared standards, and stronger operational capacity. The structure of our questionnaire and the timing of the interviews (early 2025) likely shaped what respondents focused on. Numerous concrete initiatives emerged, with varying levels of feasibility and expected impact. Given our sample size, we likely underrepresented some perspectives and missed promising project candidates. The results of this report are therefore best understood as a structured starting point for further in-depth analyses of individual projects and the overall AI security landscape.

      Illustrative Projects From Each Domain

      Technical AI Alignment Research

      Defense-in-Depth Analysis of Post-training: Take an open model produced by an organization like DeepSeek and systematically implement every known safety technique on it, measuring how these approaches stack together and where they might conflict or have gaps.

      Proposal #4
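
As a sketch of how such a stacking study might be structured, here is a minimal Python illustration; all names, the toy "model", and the placeholder metric are hypothetical stand-ins, not an existing pipeline:

    # Illustrative study design for layering safety techniques on one open
    # model. Placeholders only: a real study would swap in actual
    # fine-tuning / filtering stages and an adversarial benchmark.
    from itertools import combinations

    TECHNIQUES = ["refusal_finetune", "input_filter", "output_classifier"]

    def apply_technique(model: dict, technique: str) -> dict:
        # Placeholder: record that a technique was layered onto the model.
        return {**model, "stack": model["stack"] + [technique]}

    def attack_success_rate(model: dict) -> float:
        # Toy metric: pretend each added layer halves attack success.
        return 0.8 * (0.5 ** len(model["stack"]))

    def stacking_study(base_model: dict) -> dict:
        # Evaluate every subset of techniques: sub-additive gains suggest
        # redundant layers; regressions suggest conflicting ones.
        results = {}
        for k in range(len(TECHNIQUES) + 1):
            for subset in combinations(TECHNIQUES, k):
                model = base_model
                for technique in subset:
                    model = apply_technique(model, technique)
                results[subset] = attack_success_rate(model)
        return results

    print(stacking_study({"name": "open-model", "stack": []}))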

      Evaluation & Auditing Systems

Ultra-Reliable AI Evaluation: Develop benchmarking and engineering methodologies that can identify catastrophic failure modes occurring at rates of one in a million or lower.

      Proposal #55
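
To give a sense of scale for such a goal: certifying a failure rate at or below one in a million with straightforward sampling requires millions of clean trials. A minimal sketch using the exact binomial (Clopper-Pearson) upper bound, standard library only; the trial count is illustrative:

    # Exact (Clopper-Pearson) upper confidence bound on a failure rate
    # when zero failures are observed in n independent trials; with zero
    # failures the bound simplifies to 1 - alpha**(1/n) ("rule of three":
    # roughly 3/n at 95% confidence).
    def failure_rate_upper_bound(n_trials: int, alpha: float = 0.05) -> float:
        return 1 - alpha ** (1 / n_trials)

    # Certifying p_fail <= 1e-6 at 95% confidence takes ~3 million clean trials.
    print(failure_rate_upper_bound(3_000_000))  # ~1.0e-6

This is one reason brute-force benchmarking alone is expensive at these reliability levels, and why targeted methods such as adversarial search over suspected failure modes matter alongside it.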

      Intelligence Gathering & Monitoring

      AI OSINT: Provide relatively cheap intelligence without requiring unilateralist action, making it politically feasible while revealing where regulatory or governance levers might be needed.

      Proposal #65

      AI Governance & Policy Development

      Human Verification Systems: Build robust systems for verifying human identity to provide a foundational security layer for protecting critical decision-making processes.

      Proposal #90
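
One standard building block for such a layer is cryptographic challenge-response against a pre-registered key. The sketch below (Python, using the `cryptography` package's Ed25519 API) shows only that primitive; the harder problem of binding a key to a real, unique human in the first place is assumed away here:

    # Minimal challenge-response sketch: a decision-making system checks
    # that an approval came from the holder of a pre-registered key.
    # Key issuance / identity binding is assumed to happen out of band.
    import os
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    # Registration (out of band): the human's public key is stored.
    private_key = Ed25519PrivateKey.generate()
    registered_public_key = private_key.public_key()

    # Verification: the system issues a fresh random challenge ...
    challenge = os.urandom(32)
    # ... the human's device signs it ...
    signature = private_key.sign(challenge)
    # ... and the system checks the signature against the registered key.
    try:
        registered_public_key.verify(signature, challenge)
        print("approved by holder of registered key")
    except InvalidSignature:
        print("verification failed")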

      International Coordination

      Cross-Border Notification Systems: Develop mechanisms for countries to alert each other about out-of-control AI systems, similar to “red phones” during the Cold War.

      Proposal #115
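
A pre-agreed message format would let recipients parse such alerts unambiguously under time pressure. A minimal sketch of what one might contain (all field names are hypothetical, not a proposed standard):

    # Hypothetical minimal schema for a cross-border AI incident alert.
    from dataclasses import dataclass

    @dataclass
    class AIIncidentNotification:
        sender_state: str        # ISO country code of the notifying party
        severity: str            # e.g. "advisory" | "urgent" | "loss-of-control"
        system_description: str  # what is known about the affected system
        observed_behavior: str   # concrete observations, not speculation
        requested_action: str    # e.g. isolate shared network infrastructure
        timestamp_utc: str       # ISO 8601 timestamp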

      Preparedness & Response

      Autoverification (Lean): Develop systems that automate formal verification through Lean theorem proving, addressing the critical shortage of Lean programmers worldwide.

      Proposal #173
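
For readers unfamiliar with Lean, the artifact such a pipeline would generate is a machine-checked proof. A deliberately trivial Lean 4 example of the form (purely illustrative):

    -- A toy Lean 4 theorem: the kind of machine-checked statement an
    -- autoverification system would aim to produce without human experts.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b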

      Public Communication & Awareness

Consensus-Building Evidence for AI Risk: Create compelling empirical evidence of AI risks through large-scale experiments, concrete demonstrations, and graphics, rather than through abstract theory or thought experiments.

      Proposal #185

      Miscellaneous

      Whistleblower Protection Fund: Establish a large, long-horizon fund on the order of several hundred million dollars – enough to secure the livelihoods of a substantial cohort of potential whistleblowers for a decade and to cover major legal exposure – ensuring both financial safety and sustained legal protection.

      Proposal #189
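
As a back-of-envelope illustration using assumed figures (ours, not the interviewees'): supporting 100 potential whistleblowers at $250,000 per year for 10 years comes to $250 million before legal reserves, consistent with the several-hundred-million order of magnitude above.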

      Broad Strategic Considerations

      Below is a list of converging insights and ideas from interviewees that, although not directly related to a specific project, establish broader strategic considerations for successfully executing projects under the assumption of significantly accelerated AI progress.

      Readiness

      Increasing the pace of preparation and research is vital. People and projects should aim to deliver results sooner rather than later, even if more deliberation would yield a cleaner outcome. If your results will only come out in 3 years’ time, build intermediate prototypes or pivot to a different project that yields (possibly-preliminary) outputs earlier.

Some of the most important work will ultimately require substantial, multi-year budgets. The opportunity now is to launch proof-of-work pilots with clear milestones that can start immediately and scale; many are viable at six- to seven-figure levels. A staged, evidence-driven portfolio lets funders de-risk their spending and double down on what works.

      Government agencies with budget, teeth, and technical staff should be created to monitor and, when needed, restrict further development by AI labs. This is politically difficult, for good reason. Conveying the reasoning for these decisions is as necessary as establishing the legal and institutional structure itself.

Political views on AI risk will vary over time. Design programs to be durable across administrations – non-partisan governance, multi-year funding, and risk-triggered escalation criteria.

      Increasing general technical understanding among governments and leading company directors seems valuable but hard. A specific organization dedicated to explaining the current state of AI and likely near-term trajectories to influential people seems immensely valuable if it works, but it runs into the problem of being hard to distinguish from less credible actors.

Some effort should be dedicated to preparing for vital opportunities in the near future. The focus should be on response speed and the ability to quickly evaluate a project's necessity. Examples of organizational structures include funds intended to finance promising projects within days of their being proposed.

An economic transition seems all but inevitable. Its exact nature is somewhat uncertain, but it is very likely that entire areas of the job market will evaporate. The task is to identify which areas are likely to be hit sooner and, as a stretch goal, what to do about it.

      Coordination Across the Ecosystem

      Competent, dedicated, and charismatic leaders are crucial to ensuring projects run smoothly.

      Competitive salaries and benefits, including good living environments for those working on governance and alignment projects, are crucial.

Diversify organizational types – nonprofits, businesses, governmental departments – so that the distinct incentives inherent to each structure can be captured.

Adversarial relationships with labs, such as attempts to draft coercive laws targeting leading AI labs, are strongly discouraged. Working relationships are crucial, including with labs in other countries. (This is somewhat at odds with other recommendations, such as those requiring strict training-data control.)

The current ecosystem is quite disconnected – much work is duplicated for no reason other than a failure to notice it is already being done competently and reliably elsewhere. A system should be created to make parallel work more efficient. As a second-best option, a more defined system for propagating the concrete conclusions of existing work would also be useful, though this is less pressing, as existing mechanisms are partially functional. Internal sharing of alignment and control approaches among all labs might be a viable substitute if other approaches fail.

Currently, no shared global roadmap is in place that is on track to fully address AI risk. Guarantees are not especially useful, but a rough roadmap targeting 80%, 95%, and 99% chances of avoiding catastrophic outcomes would be beneficial. This is very challenging due to the tradeoff between setting realistic goals and having realistic odds of avoiding catastrophe, but it is still worth attempting. This report outlines priority projects, likely owners, and early milestones that can seed such a roadmap.

      To be worthwhile, global coordination does not need to resolve every potential issue, but rather to establish a necessary condition – to reach a state where all major world players, both private and governmental, have some incentives to meet baseline, auditable safeguards.

There is a need to determine how to allocate authority and responsibility for deciding which alignment and control objectives should be optimized for. Questions include how equality is prioritized and how to account for non-human experiences. More broadly, we may ask whether a majority of humans agreeing on an answer to such questions is sufficient for it to become a guiding principle.

      Standards and Common Interfaces

A general interface for auditing, both internal and by third parties, is required. Opinions vary on whether this should be achieved through centralized legislation or voluntary agreements. The results produced through this interface should be intuitive and usable by laypeople.

      While laws passed by governments and international organizations are the default, other avenues should also be considered. Examples include research agreements where labs share novel breakthroughs contingent on alignment measures being reliably followed.

A lot of value lies in a neat, accessible user interface for questions about capabilities. Dynamically answering user questions about what a system can actually do, when asked, is a widely desired property for audit software and even user-facing models.

      There is currently no agreed-upon definition of what exactly AGI means. Getting authorized representatives from leading AI organizations in the same room to agree on the terms that refer to specific real-world consequences would be valuable, turning slogans into operational definitions, enabling consistency of action, and reducing the risk of talking past each other.

      Capacity Constraints

      Current evaluations and red-teaming efforts are inadequate in terms of frequency and sophistication. Massive upscaling is necessary.

Technically skilled people should be in positions of leadership. This need not be ubiquitous, but a majority of those with a hand on the chisel must have a sense of what the sculpture should look like; otherwise, the necessary research taste will be missing.

      It is probably too late to train promising but inexperienced talent. Finding people already experienced in relevant fields who are currently working elsewhere is recommended. This doesn’t necessarily exclude young candidates, but they are much less likely to have the relevant technical skills.

Funding is likely to be constrained in the short term, especially if competitive salaries are offered. A widely shared, though not ubiquitous, opinion was that this will become less of a constraint over time as the dangers posed become more obvious.

A widely visible practical use case for alignment measures would be very useful, but such cases are hard to come by before danger reaches critical levels. More effort should be devoted to finding examples of current failures that are likely indicative of future catastrophic failures.

Download the full Peregrine Report for more details and important considerations, and to explore all 208 proposals.