Anthropic publishes election safeguards results for Opus 4.7

TL;DR:

  • Claude Opus 4.7 and Sonnet 4.6 scored 100% and 99.8% respectively on Anthropic’s 600-prompt election-policy compliance test, and 94% / 90% on simulated multi-turn influence-operation attempts.
  • Anthropic is bringing back election banners on Claude.ai for the US midterms (pointing to TurboVote) and plans similar messaging for Brazil’s elections.
  • For UK readers, the test is whether equivalent safeguards arrive in time for the May 2026 UK local and devolved elections — and whether forthcoming evaluations are published with comparable transparency.

Anthropic has published an update on the safeguards it has built into Claude ahead of the 2026 US midterms and other elections this year, including evaluation scores for its newest models. The post details political-bias measurement, election-related usage policy enforcement, real-time information sourcing, and a new test for autonomous influence operations.

On a 600-prompt benchmark split evenly between 300 harmful requests (e.g. attempts to generate election misinformation) and 300 legitimate ones (e.g. drafting campaign content), Claude Opus 4.7 responded appropriately 100% of the time and Sonnet 4.6 99.8% of the time, the company says. On a separate simulated influence-operations test using multi-turn adversarial conversations, Sonnet 4.6 responded appropriately around 90% of the time and Opus 4.7 around 94%. On political-bias evaluations, which measure whether responses treat opposing viewpoints with comparable depth, the same models scored 95% and 96%.
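To put the 99.8% figure in perspective, the sketch below shows how a score of this kind could be computed. It rests on assumptions that are not in Anthropic's post: that every prompt is weighted equally, and that "appropriate" means declining a harmful request and helping with a legitimate one. The `Outcome` type, the function names and the sample data are hypothetical. Under those assumptions, a single inappropriate response out of 600 would round to 99.8%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    harmful: bool   # True if the prompt belongs to the harmful half
    complied: bool  # True if the model fulfilled the request


def is_appropriate(o: Outcome) -> bool:
    # Assumed scoring rule: refuse harmful prompts, help with legitimate ones.
    return o.complied != o.harmful


def appropriate_rate(outcomes: list[Outcome]) -> float:
    # Equal weighting across all prompts (an assumption, not a published rule).
    return sum(is_appropriate(o) for o in outcomes) / len(outcomes)


# Hypothetical data, not Anthropic's results: all 300 harmful prompts refused,
# and one over-refusal among the 300 legitimate prompts.
outcomes = (
    [Outcome(harmful=True, complied=False)] * 300
    + [Outcome(harmful=False, complied=True)] * 299
    + [Outcome(harmful=False, complied=False)]
)

rate = appropriate_rate(outcomes)
print(f"{rate:.1%}")  # 599 of 600 appropriate
```

The same rounded figure would result if the single miss were a harmful prompt that the model fulfilled, so the headline percentage alone does not say which kind of error occurred.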

A new autonomous-operations test

For the first time, Anthropic also tested whether models could plan and run a multi-step influence operation autonomously, end-to-end, without human prompting. With safeguards and training in place, both models refused nearly every task. With safeguards stripped, a configuration used only to measure raw capability, Mythos Preview and Opus 4.7 each completed more than half the tasks. The company frames that as a reason for continued vigilance even when production safeguards are working.

In product terms, Anthropic is reactivating election banners on Claude.ai pointing US users to TurboVote, with a similar mechanism planned for Brazil. Web-search triggering on election-related questions was measured at 92% (Opus) and 95% (Sonnet) across more than 600 prompts.

Looking forward

The UK has local and devolved elections in May 2026, alongside ongoing scrutiny of how chatbots handle political queries from voters. Anthropic’s published methodology and open-source evaluation dataset set a useful precedent for what UK regulators — including the Electoral Commission and the AI Security Institute — could expect from frontier vendors deploying assistants here. The harder question is whether equivalent UK-specific evaluations exist and whether Anthropic, OpenAI and Google will publish them at the same level of detail. The Met’s separate move into Palantir-driven internal investigations underlines that AI accountability questions are no longer abstract for UK public bodies; the same applies when AI is sitting between voters and election information.