Google, Microsoft and xAI submit AI models for US safety reviews

TL;DR:

  • Google, Microsoft and xAI have agreed to submit new AI models and features to the US Commerce Department’s Center for AI Standards and Innovation (CASI) for pre-deployment evaluation, expanding voluntary arrangements first put in place under the Biden administration in 2024.
  • OpenAI and Anthropic’s existing CASI arrangements will continue in renegotiated form; the move marks a shift for an administration that has otherwise focused on cutting AI red tape.
  • Resultsense view: the trigger here is not abstract safety theatre — it is operational concern about advanced models linking low-level vulnerabilities into working exploits, the very capability Anthropic claims for its Claude Mythos model.

The agreement comes alongside the Pentagon’s separate expansion of its approved AI defence suppliers and amid wider Washington concern over models capable of autonomous cyber-offensive work. The Commerce Department says CASI will “conduct pre-deployment evaluations and targeted research to better assess frontier AI capabilities and advance the state of AI security”.

A shift, but a voluntary one

The Trump administration’s headline AI policy since it took office has been deregulatory. The CASI arrangement is a notable counter-current: voluntary but high-profile, and covering three of the most-deployed US frontier labs. By keeping the OpenAI and Anthropic deals in place and adding Google, Microsoft and xAI, Commerce is bringing nearly all the most-capable US-headquartered foundation-model developers under a common pre-deployment evaluation regime.

The cyber concern

Silicon UK reports that Anthropic is currently working with financial institutions and government bodies to test their systems with Claude Mythos, which the company says is adept at chaining together multiple low-level flaws into advanced exploits. That capability — the same one cited by Politico’s reporting earlier this week on EU pressure for Anthropic to grant access — is the kind of frontier behaviour CASI evaluations are specifically designed to surface before public release.

Google has separately changed its AI rules to permit classified military work, while the US government has sought to remove Anthropic tools from some systems over the start-up’s refusal to grant blanket “any lawful use” permissions.

UK and EU context

For UK enterprises, the practical question is interoperability. Britain’s AI Security Institute (AISI, formerly the AI Safety Institute) already holds bilateral testing arrangements with several US labs, and the ICO is mid-consultation on automated-decision-making guidance under the new Data (Use and Access) Act regime. Meanwhile in Brussels, EU legislators struck a deal this week to delay AI Act high-risk restrictions and exempt industrial AI from the law’s scope. UK firms now face three different frontier-AI evaluation regimes (CASI in the US, AISI in the UK, and a softened EU AI Act) that show no sign of converging.

Looking forward

The shape of the renegotiated OpenAI and Anthropic CASI deals is the next thing to watch. Whether they introduce binding commitments — or remain a voluntary disclosure regime — will tell UK regulators a great deal about where the global frontier-AI evaluation regime is heading.