Google, xAI and Microsoft sign up to US pre-release AI safety reviews
TL;DR:
- The Center for AI Standards and Innovation (Caisi), part of the US Department of Commerce, has signed pre-deployment evaluation agreements with Google DeepMind, Microsoft and xAI, covering cyber, biosecurity and chemical-weapons risks.
- Separately, OpenAI policy chief Chris Lehane confirmed in a LinkedIn post that the company had given the US government early access to GPT-5.5 for national security testing.
- Resultsense view: the deals materially close the gap with the UK’s AI Security Institute (AISI), which already runs comparable assessments. UK firms running frontier models from US-headquartered vendors should expect converging vendor-disclosure requirements on cyber capability.
The Caisi agreements, announced on Tuesday, follow a similar deal struck with Anthropic and OpenAI under the Biden administration two years ago, when the agency was named the US AI Safety Institute. The earlier programme has run more than 40 evaluations to date, often on unreleased models with safety guardrails removed or reduced.
Driven by Mythos
The Financial Times reports that senior US officials have been “spooked” by Anthropic’s Mythos model, which the company says can identify and exploit cyber-security vulnerabilities at substantially greater scale than its predecessors. Anthropic chief executive Dario Amodei met White House chief of staff Susie Wiles last month, easing a stand-off that began when the start-up was labelled a national security threat for refusing to allow the Pentagon unrestricted use of its technology. Anthropic is suing to overturn that designation.
President Donald Trump struck a more conciliatory tone in a recent CNBC interview, saying Anthropic was “shaping up”. Advisers have separately discussed an executive order that would impose pre-deployment assessments more formally, though the discussions are described as early-stage.
UK parallel: Microsoft also signs with AISI
Microsoft confirmed in a parallel blog post that it had signed an equivalent agreement with the UK’s AI Security Institute, also focused on national security and large-scale public safety risks. AISI has had similar arrangements with most major frontier labs since 2024 and supplied much of the procedural template the US has now copied.
For UK enterprise buyers, this convergence matters in two ways. First, it tightens the audit trail vendors must produce for capability claims, particularly around offensive cyber and dual-use scientific risk. Second, UK financial-services firms operating across the Channel are likely to see overlapping pre-deployment testing requirements emerge as US, UK and EU regulators settle into a common evaluation cadence.
Looking forward
The remaining question is whether OpenAI’s testing arrangement is folded into the Caisi framework or stays a separate bilateral arrangement. Either way, with three of the four largest US labs now formally inside the new Caisi regime, and Anthropic still in dispute, the floor on US frontier-model oversight has moved sharply in a single quarter. Procurement teams should expect AISI-style cyber-capability declarations to become standard contract language.