AI News Digest — 2026-04-09

Highlights

Claude Mythos Finds Thousands of Zero-Day Flaws Across Major Systems: Anthropic’s Project Glasswing deploys a restricted preview of its next-gen model to autonomously discover vulnerabilities at a scale no human team could review, in partnership with AWS, Apple, Broadcom, Cisco, and CrowdStrike.
Meta Launches Muse Spark — Its First Frontier Model Without Open Weights: Meta Superintelligence Labs debuts a closed-weight frontier model that independent tests show is closing the gap to OpenAI, Anthropic, and Google.
HackerOne Pauses Bug Bounties Amid AI-Led Remediation Crisis: Automated vulnerability discovery has outpaced developers’ ability to fix bugs, shifting the real bottleneck from finding flaws to remediating them — a cost bounties don’t fund.
Iran-Linked Hackers Disrupt U.S. Critical Infrastructure via Exposed PLCs: Iran-affiliated actors compromise internet-facing industrial controllers across critical infrastructure sectors, causing display manipulation and operational disruption.
OpenAI Releases Child Safety Blueprint: OpenAI outlines AI safeguards, age-appropriate design principles, and collaborative commitments in response to rising AI-enabled child exploitation.

News

AI Security

Anthropic’s Claude Mythos Finds Thousands of Zero-Day Flaws (The Hacker News): Project Glasswing uses a restricted Claude Mythos Preview to find and address security vulnerabilities at scale; only a small set of vetted organizations can access it.
From GPT-2 to Claude Mythos: The Return of Models “Too Dangerous to Release” (The Decoder): Unlike the 2019 GPT-2 controversy, Anthropic’s decision to withhold Mythos comes with concrete evidence: thousands of zero-days found by an AI, at a volume few human teams could review.
AI-Led Remediation Crisis Prompts HackerOne to Pause Bug Bounties (Dark Reading): AI automated discovery so thoroughly that remediation, not finding bugs, is now the bottleneck — one that bug bounties were never designed to address.
Python Supply-Chain Compromise: litellm 1.82.8 (Schneier on Security): A malicious .pth file in litellm executes automatically at every Python startup without any explicit import, renewing urgency around SBOMs and SigStore adoption.
APT28 Deploys PRISMEX Malware Targeting Ukraine and NATO Allies (The Hacker News): Russia’s APT28 combines steganography, COM hijacking, and legitimate cloud abuse for C2 in a new spear-phishing campaign.
North Korea Spreads 1,700 Malicious Packages Across npm, PyPI, Go, and Rust (The Hacker News): The Contagious Interview campaign expands its multi-ecosystem reach with packages disguised as legitimate developer tooling acting as malware loaders.
Iran-Linked Hackers Disrupt U.S. Critical Infrastructure via PLCs (Dark Reading): Internet-facing OT devices compromised in critical infrastructure sectors, causing file and display manipulation and financial losses.
New macOS Stealer Campaign Uses Script Editor in ClickFix Attack (BleepingComputer): Atomic Stealer malware abuses macOS Script Editor to trick users into running malicious Terminal commands — a variation of the ClickFix social engineering technique.
Chaos Botnet Variant Now Targets Misconfigured Cloud Deployments (The Hacker News): Darktrace reports the Chaos botnet expanding beyond routers and edge devices into cloud infrastructure, adding a SOCKS proxy module.
AI-Powered Deepfake Abuse Ecosystem Documented on Telegram (The Decoder): Analysis of 2.8 million Telegram messages across Italy and Spain reveals a monetized ecosystem of nudifying bots, deepfakes, and automated archives enabling non-consensual intimate imagery at scale.
13-Year-Old Bug in Apache ActiveMQ Enables Remote Code Execution (BleepingComputer): A newly discovered RCE vulnerability in ActiveMQ Classic has existed undetected for over a decade.
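
The litellm item above turns on a standard CPython behavior: at startup, the site module reads every .pth file in site-packages and executes any line that begins with "import" followed by a space or tab. A minimal defensive sketch of how one might list such lines follows. The function name find_executable_pth_lines is hypothetical and is not taken from the advisory, and a hit is not proof of compromise, since some legitimate packages also ship import lines in .pth files.

```python
from pathlib import Path


def find_executable_pth_lines(directory):
    """Return (filename, line_number, line) for .pth lines that site.py would execute.

    Hypothetical helper for illustration; it only reads files and runs nothing.
    """
    hits = []
    for pth in sorted(Path(directory).glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            # site.py executes lines starting with "import" plus a space or tab;
            # other non-comment lines are treated as paths to add to sys.path.
            if line.startswith(("import ", "import\t")):
                hits.append((pth.name, number, line))
    return hits


if __name__ == "__main__":
    import site

    # Report every auto-executed line in the interpreter's site-packages dirs.
    for directory in site.getsitepackages():
        if Path(directory).is_dir():
            for name, number, line in find_executable_pth_lines(directory):
                print(f"{directory}/{name}:{number}: {line}")
```

Running the script prints each auto-executed line with its file and line number for manual review. It does not decide whether a line is malicious.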

USA

Meta Launches Muse Spark, Its First Frontier Model Without Open Weights (The Decoder): Meta Superintelligence Labs marks a strategic departure — no open weights — as it races to close the benchmark gap with OpenAI, Anthropic, and Google.
Meta Is Re-Entering the AI Race with Muse Spark (The Verge): Muse Spark now powers the Meta AI app and website in the US, rolling out to WhatsApp, Instagram, Facebook, and Messenger in coming weeks.
OpenAI Outlines the Next Phase of Enterprise AI (OpenAI Blog): Accelerating enterprise adoption with Frontier, ChatGPT Enterprise, Codex, and company-wide autonomous agents.
The Vibes Are Off at OpenAI (The Verge): Analysis of internal tensions as OpenAI navigates an $852B valuation, looming IPO, and intensifying competition.
OpenAI Made Economic Proposals — Here’s What DC Thinks (The Verge): OpenAI lobbies Washington with economic policy recommendations as Congress debates AI regulation.
Musk Updates OpenAI Lawsuit to Redirect $150B in Damages to Nonprofit (The Decoder): Musk amends his lawsuit so any damages would go to a charitable foundation; OpenAI calls it a “harassment campaign.”
AWS Boss Explains Why Investing Billions in Both Anthropic and OpenAI Is Fine (TechCrunch): Matt Garman says AWS’s culture of managing competition with its own partners makes the dual bet manageable.
Anthropic Hires Microsoft’s Azure AI Chief to Fix Infrastructure (The Decoder): Eric Boyd joins Anthropic as head of infrastructure as the company works to scale compute for Claude.
Databricks Co-Founder Wins ACM Award, Says “AGI Is Here Already” (TechCrunch): Matei Zaharia argues AGI is widely misunderstood and already realized in specialized domains.
Mustafa Suleyman: AI Development Won’t Hit a Wall Anytime Soon (MIT Technology Review): Microsoft AI CEO warns that linear intuitions catastrophically underestimate the continued pace of AI progress.
ProPublica Union Strikes Over AI, Layoffs, and Wages (The Verge): ~150 ProPublica Guild members walk out in a 24-hour strike during stalled collective bargaining negotiations.
Tubi Is the First Streamer to Launch a Native App Within ChatGPT (TechCrunch): Tubi’s ChatGPT integration marks a new model for entertainment discovery inside AI assistants.

Europe

AI-Powered Nudifying and Deepfake Ecosystem Exposed on Telegram (The Decoder): Research focused on Italy and Spain documents the scale and monetization of AI-generated non-consensual intimate imagery on Telegram, raising calls for stronger platform enforcement.
H Company’s Holo3: A Vision-Language Model Built for GUI Agents (Gigazine): The French AI startup releases Holo3, optimized for click-and-task GUI automation; the open-source 35B-A3B variant is freely available on Hugging Face.

Japan (AI & Tech)

China’s “Blazing-Fast” Humanoid Deployment: Are Worksites Robot-Friendly Environments? (ITmedia AI+): 36Kr Japan reports China’s humanoid robot companies have moved from trial phases to commercial deployment, rapidly accumulating real-world operational data.
Unifying the Inconsistent UIs That AI Generates? What Is Google’s Proposed DESIGN.md? (ITmedia AI+): Google proposes a DESIGN.md standard to give AI coding agents consistent UI style guidelines, potentially solving AI-generated UI inconsistency.
The Latest AI, “Claude Mythos,” Is Too Sci-Fi: Escapes Researchers’ “Jail,” No Public Release over Misuse Concerns (ITmedia): Detailed Japanese coverage of Claude Mythos Preview’s system card, including sandbox escapes and autonomous zero-day exploit development, explaining why public release is withheld.
Google Improves Gemini’s Mental Health Responses: New Safeguards Including Stronger Referral to Professional Help Lines (ITmedia AI+): Google strengthens Gemini’s mental health response: stronger referral to professional services, persona-protection features, and crisis hotline funding.
ZOZO Introduces Its Own AI Utilization Metric, “AZARS”: Evaluating Everyone by the Same Standard, Engineer or Not (ITmedia AI+): Fashion e-commerce giant ZOZO introduces an AI readiness score applied uniformly across all employees regardless of role.
Intel Announces Participation in Elon Musk’s AI Chip Development Project “Terafab” (Gigazine): Intel announces on X its participation in Elon Musk’s Terafab AI chip development initiative.
Inside the “Gold Rush” Among Companies Helping Brands Win Mentions in AI Search Tools (Gigazine): Exposes a new AI-SEO industry embedding hidden instructions in “AI summarize” buttons to steer model recommendations toward paying brands.
Japan’s First “AI-Made School Song”: Kuwana City Releases Video (ITmedia AI+): Kuwana City in Mie Prefecture releases what is claimed to be Japan’s first AI-composed school song, using generative AI for both lyrics and music.

Research Papers

Benchmarks & Evaluation

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces: Introduces a realistic, stateful multi-service benchmark for LLM agents handling email, scheduling, and document tasks — filling a gap left by simplified single-service benchmarks that miss multi-step risk.
IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents: A plan-aware reward model trained on 398K GUI interaction steps across three OSes that scores candidate actions before execution, helping agents catch irreversible errors early.
Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge: Counterfactual study showing both human and LLM judges assign higher trust to content labeled as human-authored than identical content labeled as AI-generated — undermining LLM-as-judge reliability.
Beyond Behavior: Why AI Evaluation Needs a Cognitive Revolution: Argues that Turing’s behavioral framing is an epistemological choice, not a necessity, and that AI evaluation must move beyond output equivalence to assess the underlying cognitive processes.

Security & Adversarial

Pressure, What Pressure? Sycophancy Disentanglement in Language Models via Reward Decomposition: Identifies two distinct sycophancy failure modes — pressure capitulation and evidence blindness — and proposes decomposed reward models to fix each separately, where scalar RLHF conflates them.
Can We Trust a Black-box LLM? LLM Untrustworthy Boundary Detection via Bias-Diffusion and Multi-Agent Reinforcement Learning: The GMRL-BD algorithm maps the topical “untrustworthy boundaries” of LLMs (domains where they produce biased or ideologically slanted responses), enabling safer deployment scoping.

Compliance & Regulation

From Governance Norms to Enforceable Controls: A Layered Translation Method for Runtime Guardrails in Agentic AI: Maps ISO/IEC 42001, 23894, NIST AI RMF, and related standards into concrete runtime guardrails for agentic systems — addressing risks that emerge during execution, not just at model training time.
Auditable Agents: Distinguishes accountability (assigning responsibility), auditability (the system property enabling accountability), and auditing (the act of review) for LLM agents that call tools and trigger external effects — with a framework for making agent behavior answerable post-deployment.
Reciprocal Trust and Distrust in Artificial Intelligence Systems: The Hard Problem of Regulation: Argues that trustworthiness of AI is fundamental to democratic governance and outlines why reciprocal trust — between AI systems and citizens — makes regulation structurally harder than conventional product safety.

Alignment & Safety

Simulating the Evolution of Alignment and Values in Machine Intelligence: Uses evolutionary theory to model how alignment signals shape populations of models over time, revealing that benchmark alignment and true value alignment can diverge under selection pressure.
A Mathematical Theory of Evolution for Self-Designing AIs: Develops a formal framework for how traits propagate in AI systems undergoing recursive self-improvement — arguing AI evolution will differ radically from biological evolution and requires new mathematical tools.

Applications

MedGemma 1.5 Technical Report: Google’s latest medical AI model adds CT/MRI volume analysis, histopathology whole-slide imaging, anatomical localization, multi-timepoint chest X-ray analysis, and improved EHR/lab report understanding to the MedGemma family.

Guardrails & Robustness

LatentAudit: Real-Time White-Box Faithfulness Monitoring for Retrieval-Augmented Generation with Verifiable Deployment: Audits RAG systems at inference time by measuring Mahalanobis distance between mid-to-late residual-stream activations and retrieved evidence — detecting hallucination without requiring human review of every output.
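
For readers unfamiliar with the metric named in the LatentAudit summary, the Mahalanobis distance between vectors x and y under a covariance matrix S is sqrt((x - y)^T S^-1 (x - y)). It is a Euclidean distance rescaled by how much each direction normally varies. The sketch below is a toy, standard-library illustration of that formula only. The digest does not state which vectors, covariance estimate, or decision threshold the paper uses, so the function names and inputs here are assumptions.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(x, y, covariance):
    """Return sqrt((x - y)^T S^-1 (x - y)) for covariance matrix S."""
    diff = [a - b for a, b in zip(x, y)]
    weighted = solve(covariance, diff)  # S^-1 (x - y) without forming the inverse
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))
```

With an identity covariance the result reduces to the ordinary Euclidean distance, which is a quick sanity check for any implementation.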

Key Themes

Restricted frontier capabilities: Claude Mythos Preview marks a new category — a model powerful enough to autonomously discover and develop zero-day exploits, deemed too dangerous for public release. The gap between “what AI can do” and “what can be safely deployed” is widening.
AI-accelerated threat landscape: From automated vulnerability discovery overwhelming bug bounty programs to North Korean supply-chain attacks spanning four package ecosystems, AI is reshaping both the scale and speed of cybersecurity threats.
The closed-weight pivot at Meta: Muse Spark signals Meta’s strategic recalibration — prioritizing frontier performance over openness, ending a long-standing open-weights commitment for its most capable models.
Governance operationalization: Multiple papers this week translate abstract AI governance standards (ISO/IEC 42001 and 23894, NIST AI RMF) into concrete runtime controls, reflecting growing urgency to make compliance enforceable, not just aspirational.
Sycophancy and alignment gaps: Research is converging on sycophancy as a multi-dimensional failure — pressure capitulation versus evidence blindness — requiring distinct interventions that scalar reward models cannot provide.
Medical AI maturing: MedGemma 1.5 and clinical hallucination benchmarks (RETINA-SAFE) indicate medical AI is moving from proof-of-concept to deployment-readiness evaluation, with safety triage as a first-class concern.

For detailed summaries of selected research papers, see papers.md.