AI News Digest — April 2, 2026
Highlights
- OpenAI Closes $122B Mega-Round, Unveils ChatGPT Super App: OpenAI officially confirms a $122 billion funding round at an $852 billion valuation, backed by SoftBank, Amazon, and NVIDIA, and signals a hard pivot toward a unified ChatGPT super app.
- Anthropic’s Leaked Claude Code Source Cloned 8,000+ Times: An npm packaging error exposed Claude Code’s source code, which spread to over 8,000 GitHub forks despite mass takedown efforts, revealing details of an unreleased “Mythos” model.
- Google DeepMind Identifies Six “Traps” That Hijack Autonomous AI Agents: A new study documents six environmental manipulation patterns that reliably redirect real-world AI agents browsing the web, handling email, and executing transactions.
- EU Bars AI-Generated Content from Official Communications: The European Commission, Parliament, and Council have prohibited their press teams from publishing fully AI-generated content, citing authenticity and accountability concerns.
- North Korea Linked to Axios npm Supply Chain Compromise: Google formally attributes the supply chain attack on the widely used Axios npm package to North Korean group UNC1069, marking a significant escalation in open-source ecosystem targeting.
News
AI Security
- Claude Code Source Leaked via npm Packaging Error (The Hacker News) — Anthropic confirms internal Claude Code source was inadvertently bundled into a public npm package, exposing proprietary logic and references to unreleased models.
- Claude Chrome Extension “ShadowPrompt” Vulnerability (ITmedia Enterprise) — A critical flaw in Claude’s Chrome extension allows a visited webpage to hijack AI behavior without user interaction, exploiting a design-level gap in extension sandboxing.
- Google DeepMind: Six Traps for Hijacking AI Agents in the Wild (The Decoder) — Researchers categorize six injection patterns — from hidden page instructions to malicious email content — that consistently redirect agent actions in deployed systems.
- Mercor Cyberattack Tied to LiteLLM Open-Source Compromise (TechCrunch AI) — AI recruiting startup Mercor confirms a security incident after attackers compromised the widely used LiteLLM project as an entry vector.
- Google Attributes Axios npm Attack to North Korean Group UNC1069 (The Hacker News) — The attribution implicates a financially motivated North Korean threat actor in a supply chain attack affecting millions of downstream npm dependents.
- CrowdStrike: AI-Accelerated Attacks Now Average 29-Minute Breakout, Fastest 27 Seconds (ITmedia Enterprise) — The 2026 threat report documents how widespread AI tooling has compressed the average attacker breakout time to under 30 minutes, with the fastest observed case completing initial intrusion in just 27 seconds.
- Is “Hackback” Now Official US Cybersecurity Strategy? (Schneier on Security) — Bruce Schneier analyzes the 2026 US Cyber Strategy document and its implications for offensive cyber doctrine.
- New EvilTokens Service Fuels Microsoft Device Code Phishing (BleepingComputer) — A new MaaS kit automates device code phishing to hijack Microsoft OAuth tokens, lowering the bar for enterprise account takeover.
- ‘NoVoice’ Android Malware Infected 2.3 Million Devices via Google Play (BleepingComputer) — Malware hidden in over 50 Play Store apps silently exfiltrates data; apps have been pulled but infection persists on user devices.
- New Chrome Zero-Day CVE-2026-5281 Under Active Exploitation (The Hacker News) — Google patches the fourth Chrome zero-day exploited in the wild so far in 2026; update is immediately available.
- CERT-UA Impersonation Campaign Spread AGEWHEEZE Malware to 1 Million Emails (The Hacker News) — Attackers spoofed Ukraine’s computer emergency response team to distribute new malware at scale.
- Microsoft Warns of WhatsApp-Delivered VBS Malware with UAC Bypass (The Hacker News) — A new campaign uses WhatsApp messages to deliver VBScript payloads that execute with elevated privileges, bypassing User Account Control.
- Perplexity AI Sued Over Alleged Data Sharing with Meta and Google (The Decoder) — A class-action lawsuit accuses Perplexity of sharing personally identifiable user chat data with third-party advertisers.
- FBI Warns Against Chinese Mobile Apps Over Privacy Risks (BleepingComputer) — The FBI issues a formal advisory cautioning Americans about data collection practices in foreign-developed applications.
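Several of the agent-hijacking patterns above (notably DeepMind’s hidden page instructions and the ShadowPrompt extension flaw) rely on text a human reader never sees. As an illustration only, not any vendor’s actual defense, here is a minimal Python pre-processing sketch that strips visually hidden HTML regions before page text reaches an agent; all class and function names are hypothetical:

```python
from html.parser import HTMLParser

# Style fragments that commonly hide text from human readers.
HIDDEN_STYLES = ("display:none", "visibility:hidden", "font-size:0")

class VisibleTextExtractor(HTMLParser):
    """Collect only the text a human reader would plausibly see.

    Assumes reasonably well-formed markup; unclosed tags would
    desynchronize the stack and need extra handling in practice.
    """
    def __init__(self):
        super().__init__()
        self._stack = []         # True for each open element that hides content
        self._hidden_depth = 0   # how many enclosing elements are hidden
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        d = dict(attrs)
        style = (d.get("style") or "").replace(" ", "").lower()
        hidden = (tag in ("script", "style")
                  or "hidden" in d
                  or any(s in style for s in HIDDEN_STYLES))
        self._stack.append(hidden)
        if hidden:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        # Pop the matching start tag's hidden flag.
        if self._stack and self._stack.pop():
            self._hidden_depth -= 1

    def handle_data(self, data):
        if self._hidden_depth == 0 and data.strip():
            self._chunks.append(data.strip())

def visible_text(html: str) -> str:
    """Return page text with hidden regions stripped out."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    return " ".join(parser._chunks)
```

This catches only the crudest hiding tricks; CSS classes, off-screen positioning, and text baked into images all evade it, which is why the research argues for defense in depth rather than input filtering alone.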
USA
- OpenAI Confirms $122B Round and ChatGPT Super App (The Decoder) — The raise, backed by SoftBank, Amazon, and NVIDIA, values OpenAI at $852 billion; the company signals a pivot away from API-first toward an integrated consumer super app.
- Meta’s Hyperion AI Datacenter Powered by 10 New Natural Gas Plants (TechCrunch AI) — Meta’s planned flagship AI datacenter will rely on newly built gas generation capacity equivalent to powering an entire US state.
- Cognichip Raises $60M to Use AI for Chip Design (TechCrunch AI) — The startup claims its AI can cut chip development costs by 75% and halve the design timeline, targeting the next generation of AI accelerators.
- Salesforce Announces 30 New AI Features for Slack (TechCrunch AI) — The overhaul deeply integrates Salesforce’s Agentforce platform into Slack, enabling autonomous agents to surface insights and take actions within conversations.
- Gradient Labs Gives Every Bank Customer an AI Account Manager (OpenAI Blog) — Built on GPT-4.1 and GPT-5.4, Gradient Labs deploys AI agents that automate banking support workflows, targeting sub-second response latency at enterprise scale.
- Holo3: Breaking the Computer Use Frontier (Hugging Face Blog) — H Company releases Holo3, a new computer-use model claiming state-of-the-art performance on desktop and browser automation benchmarks.
- Microsoft Research ADeLe: Predicting and Explaining AI Performance Across Tasks (Microsoft Research Blog) — ADeLe provides a framework to forecast LLM task performance and explain failures, moving beyond aggregate benchmark scores.
- America’s AI Boom Is Leaving the Rest of the World Behind (Rest of World) — Analysis shows US AI investment is widening rather than closing the global gap, with capital and talent concentrating in a handful of American firms.
- Gig Workers Are Now Training Humanoid Robots at Home (MIT Technology Review) — A growing gig economy is emerging around collecting embodied AI training data via teleoperation rigs, with workers in Nigeria and elsewhere earning per-task fees.
- Elgato Stream Deck Gets MCP Support for AI Agent Control (The Verge AI) — Elgato adds Model Context Protocol integration, allowing AI agents to trigger physical Stream Deck buttons and read their state.
- Anthropic Is Having a Month (TechCrunch AI) — A second human-error incident within a single week prompts TechCrunch to chronicle Anthropic’s streak of internal missteps.
- Baidu’s Robotaxis Froze in Traffic, Creating Chaos (The Verge AI) — Multiple Apollo Go robotaxis stalled mid-journey in a Chinese city, trapping passengers and blocking intersections before operators remotely intervened.
- Google Releases March 2026 AI News Roundup (Google AI Blog) — Monthly update covering Gemini, Search, Workspace, and Cloud AI releases from March.
Europe
- EU Bars AI-Generated Content from Official Communications (The Decoder) — All three major EU institutions have instructed press teams to avoid fully AI-generated material, a de facto content policy ahead of formal AI Act enforcement.
Japan (AI & Tech)
- The .claude Folder: Structure and Usage — An Introduction to Claude Code (ITmedia AI+) — A beginner’s guide to Claude Code’s .claude folder structure, explaining how CLAUDE.md and project-level configuration shape agent behavior.
- University of Tokyo Unveils Open-Source Quadruped Robot “MEVIUS2” (ITmedia AI+) — The University of Tokyo unveils MEVIUS2, an open-source quadruped robot with parts orderable online, demonstrated climbing stairs in public.
- NotebookLM Cuts Work Time by 95% — Municipalities and Enterprises “Return to Google” (ITmedia Enterprise) — Japanese municipalities and enterprises report up to 95% reduction in document processing time after deploying NotebookLM and other Google AI tools.
- Slackbot Significantly Strengthens AI Agent Coordination (ITmedia AI+) — Slack announces Slackbot’s expansion into a multi-agent coordinator, including a “Deep Thoughts” strategic analysis feature.
- NTT Data-Affiliated Weather Company Launches MCP Server (ITmedia AI+) — An NTT Data weather subsidiary launches an MCP server exposing real-time meteorological data for AI agent pipelines in retail, distribution, and construction.
- “Off Grid” Review: Running AI Models Locally on a Smartphone (Gigazine) — Review of Off Grid, a free iOS/Android app running open-source LLMs and image generation models fully on-device without cloud dependency.
- Using AI for Stock Investing: Claude and Gemini’s “Personality Differences” (ITmedia AI+) — A University of Tokyo research group finds Claude adopts a conservative incremental investment strategy while Gemini takes bolder positions when automated to trade equities.
- “Explainable AI” Is Key — By 2028, 50% of Companies Deploying Generative AI Will Invest in LLM Observability (ITmedia Enterprise) — Gartner predicts half of generative AI deployments will require explainability and observability tooling by 2028, driven by governance requirements.
- Initial Intrusion in as Little as 27 Seconds — Cyberattacks Accelerate as AI Spreads (ITmedia Enterprise) — CrowdStrike’s threat report finds AI-assisted attackers completing initial access in as little as 27 seconds, with the average dropping to 29 minutes.
- KDDI Discloses ¥246.1 Billion in Overstated Revenue via Fictitious Transactions (The Japan Times) — KDDI discloses ¥246.1 billion in overstated revenues via fictitious transactions; eight executives including the CEO will return portions of their remuneration.
- A Welder Built an App in Six Hours — Shizuoka Factory Invests ¥5 Million in Generative AI Training (ITmedia Business) — A 13-person factory in Shizuoka invested ¥5 million in generative AI training and reports a welder built a business app in six hours with no prior coding experience.
- Apple Co-Founder Wozniak, Who Shuns AI, Voices Unease with Technology on Apple’s 50th Anniversary (ITmedia Business) — On Apple’s 50th anniversary, Steve Wozniak says AI-generated content feels “too perfect to be human,” expressing discomfort with the loss of personality in technology.
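For orientation on the MCP items above: Model Context Protocol messages ride a JSON-RPC 2.0 envelope, and tools are invoked with a `tools/call` request. The sketch below builds such a request in Python; the tool name and arguments are hypothetical, since the article does not document the actual interface the NTT Data subsidiary exposes.

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request (JSON-RPC 2.0 envelope)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical tool name and arguments for a weather MCP server.
req = mcp_tool_call(1, "get_forecast", {"city": "Tokyo", "hours": 24})
```

An agent framework would send this over the server’s transport (stdio or HTTP) and read back a matching JSON-RPC response carrying the tool result.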
Research Papers
Benchmarks & Evaluation
- Beyond pass@1: A Reliability Science Framework for Long-Horizon LLM Agents — Argues that current benchmarks measure single-attempt capability but not production reliability; proposes a framework distinguishing capability from consistency across repeated runs, critical for deployment decisions.
- BenchScope: How Many Independent Signals Does Your Benchmark Provide? — Introduces Effective Dimensionality to measure how many genuinely independent signals a benchmark suite provides, revealing that many popular eval sets are heavily redundant and inflate apparent coverage.
- ELT-Bench-Verified: Benchmark Quality Issues Underestimate AI Agent Capabilities — Demonstrates that errors in benchmark construction itself cause AI performance to appear lower than it is; after correcting labeling flaws, agent scores on ELT pipeline tasks improve substantially.
- SciVisAgentBench: Evaluating Scientific Data Analysis and Visualization Agents — Introduces a benchmark testing agents on end-to-end scientific workflows — from raw data ingestion to publication-quality visualization — revealing consistent failure modes in multi-step reasoning.
- SkillTester: Benchmarking Utility and Security of Agent Skills — A dual-axis evaluation framework that tests both functional utility and security vulnerabilities of AI agent “skills” (tools/plugins), including a curated security probe suite.
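The capability-versus-consistency gap that “Beyond pass@1” targets can be made concrete with a toy calculation. This is my own illustration, not the paper’s formalism: per-attempt success rate versus the chance that every step of a long workflow succeeds, under a simplifying independence assumption.

```python
def pass_at_1(successes: int, runs: int) -> float:
    """Single-attempt success rate over repeated runs of one task."""
    return successes / runs

def long_horizon_reliability(p: float, steps: int) -> float:
    """Probability that `steps` attempts ALL succeed: p ** steps.
    Assumes independent attempts; real failures often correlate."""
    return p ** steps

p = pass_at_1(9, 10)                  # 0.9: a strong-looking benchmark score
r = long_horizon_reliability(p, 20)   # ~0.12 for a 20-step agent workflow
```

A model that clears 90% on a single-attempt benchmark still fails the large majority of 20-step runs, which is why the paper argues deployment decisions need reliability measured across repeated runs, not one-shot capability.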
Security & Adversarial
- Trojan-Speak: Bypassing Constitutional Classifiers via Adversarial Finetuning — Shows that fine-tuning APIs allow adversaries to embed triggers that bypass content classifiers with no perceptible “jailbreak tax” — outputs appear normal until a trigger phrase is present.
- Architecting Secure AI Agents: System-Level Defenses Against Indirect Prompt Injection — Presents architectural patterns — input sanitization pipelines, context compartmentalization, and permission hierarchies — that reduce susceptibility to injected instructions from untrusted environments.
- GUARD-SLM: Token Activation-Based Defense Against Jailbreak Attacks for Small LMs — Proposes a lightweight defense mechanism for edge-deployed small language models that monitors internal token activations to detect and block jailbreak attempts without retraining.
- CivicShield: Defense-in-Depth for Government AI Chatbots vs. Multi-Turn Adversarial Attacks — Documents that multi-turn adversarial attacks succeed over 90% of the time against current government chatbot deployments and proposes a layered defense framework reducing success to under 15%.
- Security in LLM-as-a-Judge: A Comprehensive SoK — Systematizes knowledge on attack surfaces in LLM-based evaluation pipelines, covering prompt injection, output manipulation, and gaming strategies that undermine automated quality assessment.
- Design Principles for a Security Operations Benchmark for Multi-Agent AI Systems — Proposes evaluation criteria for AI systems performing cybersecurity operations, addressing gaps in existing benchmarks that fail to capture adversarial conditions or multi-agent coordination requirements.
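The permission-hierarchy pattern from the secure-agents paper can be sketched with a toy policy. This is a minimal illustration under my own assumptions, not the paper’s design; the tool names and trust levels are hypothetical. The idea: every tool request carries the trust level of the context that produced it, and privileged tools refuse requests originating from untrusted content such as fetched web pages.

```python
from enum import IntEnum

class Trust(IntEnum):
    UNTRUSTED = 0   # e.g. text fetched from an arbitrary web page
    USER = 1        # typed directly by the human operator
    SYSTEM = 2      # the deployer's own configuration

# Hypothetical policy: minimum trust level required to invoke each tool.
TOOL_POLICY = {
    "read_page": Trust.UNTRUSTED,
    "send_email": Trust.USER,
    "transfer_funds": Trust.SYSTEM,
}

def authorize(tool: str, origin: Trust) -> bool:
    """Allow a tool call only if the requesting context is trusted enough.
    Unknown tools default to the highest requirement (default-deny)."""
    required = TOOL_POLICY.get(tool, Trust.SYSTEM)
    return origin >= required
```

Under this gate, an injected instruction inside a fetched page can still trigger harmless reads, but cannot escalate to sending mail or moving money, which is the compartmentalization property the paper’s architectural patterns aim for.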
Alignment & Safety
- Extending MONA: Reward-Hacking Mitigation via Myopic Optimization with Non-Myopic Approval — Reproduces and extends MONA, a technique that limits an agent’s planning horizon while allowing a human-in-the-loop approver to evaluate longer-term consequences, reducing multi-step reward hacking.
- Robust Safety Monitoring of Language Models via Activation Watermarking — Embeds verifiable watermarks into model activations to enable reliable detection of misuse (e.g., weapon instructions, malware generation) even when outputs are obfuscated at the text level.
- SafeClaw-R: Towards Safe and Secure Multi-Agent Personal Assistants — Addresses safety and authorization gaps in multi-agent personal assistant frameworks (e.g., AutoGPT-style systems), proposing sandboxing and inter-agent trust hierarchies.
Guardrails & Robustness
- Towards Policy-Adaptive Image Guardrail: Benchmark and Method — Introduces a content moderation framework that adapts rejection thresholds to context-specific policies rather than applying fixed rules, with an accompanying benchmark for evaluating policy-conditioned refusals.
- Uncertainty Gating for Cost-Aware Explainable AI — Uses epistemic uncertainty as a proxy for when explanations are unreliable, gating expensive explanation generation to cases where it is most needed — particularly relevant for high-stakes clinical and legal deployments.
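The gating idea in the uncertainty paper can be illustrated with a short sketch. This is my simplification: the paper targets epistemic uncertainty specifically, while plain predictive entropy stands in for it here, and the threshold value is an arbitrary placeholder.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (nats) of a class-probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_explain(probs, threshold=0.5):
    """Gate expensive explanation generation to uncertain predictions.
    The 0.5-nat threshold is a placeholder, not a recommended value."""
    return predictive_entropy(probs) > threshold

# A confident prediction skips the explainer; an ambiguous one triggers it.
confident = [0.97, 0.02, 0.01]   # entropy ~0.15 nats
ambiguous = [0.40, 0.35, 0.25]   # entropy ~1.08 nats
```

The cost argument is that explanations are generated only where they carry decision-relevant information, which matters most in the clinical and legal settings the paper highlights.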
Applications
- Symphony for Medical Coding: Agentic System for Scalable and Explainable Medical Coding — Presents an agentic pipeline that translates free-text clinical notes into standardized billing codes with higher accuracy and built-in audit trails, addressing a major bottleneck in healthcare administration.
Key Themes
- AI infrastructure concentration: Meta’s natural gas buildout and the OpenAI mega-round signal that AI’s resource demands are now a macroeconomic factor, with capital and energy access shaping who can compete.
- Agentic security as the new frontier: The Claude Code leak, Google DeepMind’s agent hijacking research, ShadowPrompt, and four new papers on prompt injection underscore that agents operating in the real world introduce a qualitatively different attack surface from static LLMs.
- Benchmark credibility under scrutiny: Multiple papers this cycle challenge the validity of existing benchmarks — from redundancy (BenchScope) to labeling errors (ELT-Bench) to the capability/reliability gap (Beyond pass@1) — signaling growing methodological maturity in AI evaluation.
- Regulatory and institutional guardrails taking shape: The EU content ban and Gartner’s LLM observability forecast indicate that governance requirements are moving from abstract to operational, creating demand for explainability and audit infrastructure.
- North Korea’s escalation in open-source targeting: The Axios npm attack extends a pattern of supply chain operations to the most critical shared infrastructure of AI/ML development, raising the risk profile for the entire open-source ecosystem.
For detailed summaries of selected research papers, see papers.md.