01 Claude Code leak outlines Anthropic’s persistent‑agent and ‘Undercover’ features
What happened: a large set of Anthropic client code for Claude — dubbed “Claude Code” in coverage — was accidentally exposed, and researchers and reporters inspected it. The leaked material contains references to a persistent agent architecture, a stealth “Undercover” mode, and an apparent virtual assistant prototype called Buddy.
Why it matters: the artifacts in the leak make clear Anthropic had been building features that go beyond single-turn chat. Persistent agents imply state and background tasks that run across sessions, which changes how providers must think about safety, data retention, and user control.
Company response and takedown effort: Anthropic pursued DMCA notices to curb the spread of the leaked client code. That enforcement move, however, unintentionally flagged legitimate GitHub forks, illustrating how difficult it is to stop redistribution once code has leaked.
Practical implications for developers: engineers and integrators should assume agent-style capabilities will be prioritized across the industry. Expect more tooling for persistent state, session management, and tighter review of any third-party client code until providers document official SDKs and supported patterns.
- Leaked client code shows Anthropic experimented with persistent agents and an “Undercover” stealth mode.
- A prototype assistant named Buddy appears in the material, suggesting a consumer-facing assistant concept.
- Anthropic’s DMCA takedown push accidentally impacted legitimate GitHub forks, highlighting distribution-control limits.
02 Gradient Labs to power automated bank account managers with GPT mini/nano agents
What happened: OpenAI’s Gradient Labs announced a deployment that uses GPT-4.1 and GPT-5.4 mini and nano models to power AI agents that handle banking support workflows for every customer.
Why it matters: the rollout automates routine support tasks with low latency and the reliability targets Gradient Labs emphasizes. Banks integrating these agents can reduce live-agent load and scale standard account-management work across large customer bases.
Practical detail: Gradient Labs pairs larger models for complex reasoning with smaller, faster mini/nano models for low-latency operations. That hybrid architecture is designed to balance cost and responsiveness for production financial workflows.
- Gradient Labs uses GPT-4.1 and GPT-5.4 mini/nano agents to automate bank support workflows.
- The approach mixes larger reasoning models with smaller, low-latency models to manage cost and speed.
- Banks adopting these agents can scale routine account management while aiming to keep latency low.
03 Gemini API gains Flex and Priority dials to balance cost, latency and reliability
What happened: Google introduced new inference controls for the Gemini API that let developers adjust how the system trades off cost, latency, and reliability.
Why it matters: the API dials give product teams clearer levers for production tuning. Instead of a one-size-fits-all endpoint, teams can choose higher-priority inference for time-sensitive requests or lower-cost, flexible options for background jobs.
How developers will use it: expect routing rules that send urgent requests to Priority inference and less critical workloads to Flex. That creates predictable behavior for SLAs and can reduce cloud spend by aligning compute intensity with business needs.
- Gemini API now exposes Flex and Priority inference settings to tune cost vs. reliability.
- Developers can route urgent requests to Priority and background tasks to Flex to optimize spend.
- The dials are aimed at giving production teams clearer SLA and cost-control options.
What moved around the edges
OpenAI buys TBPN to broaden global AI dialogue and back independent media
OpenAI announced it has acquired TBPN to accelerate global conversations about AI and provide support for independent media, aiming to expand engagement with builders, businesses, and broader tech communities.
OpenAI BlogMustafa Suleyman refocuses Microsoft AI remit on ‘superintelligence’ tied to business outcomes
Following a mid‑March restructuring, Microsoft’s inaugural CEO of AI, Mustafa Suleyman, has handed off some duties and shifted his public focus toward pursuing ‘superintelligence,’ with the company framing the effort around business and product priorities.
The Verge AIGoogle posts March 2026 AI roundup covering Workspace, Home, and developer updates
Google published a March 2026 recap that consolidates recent product updates and new AI features across Workspace, Google Home, and developer tooling.
Google AI BlogMicrosoft’s MAI group unveils three foundational models for speech, audio and image generation
MAI released three new foundational models—announced six months after the group's formation—that can transcribe voice to text and generate audio and images, positioning Microsoft to compete on multimodal and speech-capable foundational models.
TechCrunch AIAnthropic’s DMCA takedowns meant to stop leaked Claude Code swept up legitimate GitHub forks
Ars Technica reports Anthropic’s DMCA notices targeting leaked Claude Code client code unintentionally affected legitimate forks on GitHub, underscoring how legal takedowns struggle to fully contain widely distributed leaks.
Ars Technica AI