Leaked Claude Code, AI bank account agents, and Gemini’s new cost‑latency API dials

01 Claude Code leak outlines Anthropic’s persistent‑agent and ‘Undercover’ features

What happened: a large set of Anthropic client code for Claude — dubbed “Claude Code” in coverage — was accidentally exposed, and researchers and reporters inspected it. The leaked material contains references to a persistent agent architecture, a stealth “Undercover” mode, and an apparent virtual assistant prototype called Buddy.

Why it matters: the artifacts in the leak make clear Anthropic had been building features that go beyond single-turn chat. Persistent agents imply state and background tasks that run across sessions, which changes how providers must think about safety, data retention, and user control.

Company response and takedown effort: Anthropic pursued DMCA notices to curb the spread of the leaked client code. That enforcement move, however, unintentionally flagged legitimate GitHub forks, illustrating how difficult it is to stop redistribution once code has leaked.

Practical implications for developers: engineers and integrators should assume agent-style capabilities will be prioritized across the industry. Expect more tooling for persistent state, session management, and tighter review of any third-party client code until providers document official SDKs and supported patterns.

Takeaways

Leaked client code shows Anthropic experimented with persistent agents and an “Undercover” stealth mode.
A prototype assistant named Buddy appears in the material, suggesting a consumer-facing assistant concept.
Anthropic’s DMCA takedown push accidentally impacted legitimate GitHub forks, highlighting distribution-control limits.

Sources

Here's what that Claude Code source leak reveals about Anthropic's plans Ars Technica AI [AINews] The Claude Code Source Leak Latent Space

02 Gradient Labs to power automated bank account managers with GPT mini/nano agents

What happened: OpenAI’s Gradient Labs announced a deployment that uses GPT-4.1 and GPT-5.4 mini and nano models to power AI agents that handle banking support workflows for every customer.

Why it matters: the rollout automates routine support tasks with low latency and the reliability targets Gradient Labs emphasizes. Banks integrating these agents can reduce live-agent load and scale standard account-management work across large customer bases.

Practical detail: Gradient Labs pairs larger models for complex reasoning with smaller, faster mini/nano models for low-latency operations. That hybrid architecture is designed to balance cost and responsiveness for production financial workflows.

Takeaways

Gradient Labs uses GPT-4.1 and GPT-5.4 mini/nano agents to automate bank support workflows.
The approach mixes larger reasoning models with smaller, low-latency models to manage cost and speed.
Banks adopting these agents can scale routine account management while aiming to keep latency low.

Sources

Gradient Labs gives every bank customer an AI account manager OpenAI Blog

03 Gemini API gains Flex and Priority dials to balance cost, latency and reliability

What happened: Google introduced new inference controls for the Gemini API that let developers adjust how the system trades off cost, latency, and reliability.

Why it matters: the API dials give product teams clearer levers for production tuning. Instead of a one-size-fits-all endpoint, teams can choose higher-priority inference for time-sensitive requests or lower-cost, flexible options for background jobs.

How developers will use it: expect routing rules that send urgent requests to Priority inference and less critical workloads to Flex. That creates predictable behavior for SLAs and can reduce cloud spend by aligning compute intensity with business needs.

Takeaways

Gemini API now exposes Flex and Priority inference settings to tune cost vs. reliability.
Developers can route urgent requests to Priority and background tasks to Flex to optimize spend.
The dials are aimed at giving production teams clearer SLA and cost-control options.

Sources

New ways to balance cost and reliability in the Gemini API Google AI Blog

Briefs

What moved around the edges

OpenAI buys TBPN to broaden global AI dialogue and back independent media

OpenAI announced it has acquired TBPN to accelerate global conversations about AI and provide support for independent media, aiming to expand engagement with builders, businesses, and broader tech communities.

OpenAI Blog

Mustafa Suleyman refocuses Microsoft AI remit on ‘superintelligence’ tied to business outcomes

Following a mid‑March restructuring, Microsoft’s inaugural CEO of AI, Mustafa Suleyman, has handed off some duties and shifted his public focus toward pursuing ‘superintelligence,’ with the company framing the effort around business and product priorities.

The Verge AI

Google posts March 2026 AI roundup covering Workspace, Home, and developer updates

Google published a March 2026 recap that consolidates recent product updates and new AI features across Workspace, Google Home, and developer tooling.

Google AI Blog

Microsoft’s MAI group unveils three foundational models for speech, audio and image generation

MAI released three new foundational models—announced six months after the group's formation—that can transcribe voice to text and generate audio and images, positioning Microsoft to compete on multimodal and speech-capable foundational models.

TechCrunch AI

Anthropic’s DMCA takedowns meant to stop leaked Claude Code swept up legitimate GitHub forks

Ars Technica reports Anthropic’s DMCA notices targeting leaked Claude Code client code unintentionally affected legitimate forks on GitHub, underscoring how legal takedowns struggle to fully contain widely distributed leaks.

Ars Technica AI

01 Claude Code leak outlines Anthropic’s persistent‑agent and ‘Undercover’ features

02 Gradient Labs to power automated bank account managers with GPT mini/nano agents

03 Gemini API gains Flex and Priority dials to balance cost, latency and reliability

What moved around the edges

Subscribe to AI Digest