AI News Digest — 2026-05-13

Highlights

First AI-built zero-day exploit halted in mass-attack attempt: Google’s Threat Intelligence Group identified the first known case of attackers using AI to discover and weaponize a zero-day vulnerability and intervened before a mass attack could be launched, marking a long-anticipated inflection point in offensive AI.
Japanese PM Takaichi orders cyber countermeasures citing Claude Mythos: PM Sanae Takaichi directed cabinet ministers to harden cyber defenses in response to frontier AI such as Anthropic’s Claude Mythos Preview, a rare head-of-government response naming a specific model as a national-security threat.
OpenAI launches Daybreak for AI-driven vulnerability detection: Daybreak combines frontier AI with a Codex Security agent to model threats and patch vulnerabilities before attackers exploit them, formalizing AI-vs-AI competition with Anthropic’s Claude Mythos.
Mini Shai-Hulud worm hits TanStack, Mistral AI, and Guardrails AI packages: A TeamPCP-linked supply-chain worm compromised hundreds of npm and PyPI packages — notably AI-infrastructure projects including guardrails libraries — with credential-stealing payloads.
Microsoft ousts Israel chief over Azure-powered military AI in Gaza: The top Microsoft Israel executive was removed after an internal probe into cloud-powered mass surveillance and AI targeting, setting a first-of-its-kind governance precedent among hyperscalers.

News

AI Security

Google says it stopped a mass cyberattack after AI was used to discover a zero-day exploit — Google’s Threat Intelligence Group identified the first known case of attackers using AI to discover and weaponize a zero-day vulnerability; state-backed actors from China, North Korea, and Russia are also using AI for vulnerability research. (The Decoder)
OpenAI Launches Daybreak for AI-Powered Vulnerability Detection and Patch Validation — Daybreak combines frontier AI with a Codex Security agent to model threats and patch vulnerabilities before exploitation, positioned against Anthropic’s Claude Mythos. (The Hacker News)
OpenAI just released its answer to Claude Mythos — Daybreak’s Codex Security agent validates vulnerabilities and produces fixes; marks intensifying AI-vs-AI competition in security automation. (The Verge AI)
Why Agentic AI Is Security’s Next Blind Spot — Agentic AI is running in production environments and consuming data without meaningful security-team involvement, reframing AI risk beyond policy into operational visibility. (The Hacker News)
Mini Shai-Hulud Worm Compromises TanStack, Mistral AI, Guardrails AI and More Packages — TeamPCP-linked malicious npm/PyPI packages from TanStack, UiPath, Mistral AI, OpenSearch, and Guardrails AI ship credential-stealing payloads. (The Hacker News)
Shai Hulud attack ships signed malicious TanStack, Mistral npm packages — Hundreds of npm/PyPI packages compromised in a new Shai-Hulud supply-chain campaign hitting AI infrastructure projects. (BleepingComputer)
Hugging Face Packages Weaponized With a Single File Tweak — Tokenizer library files in Hugging Face AI models can be manipulated to hijack model outputs and exfiltrate data — a new model-supply-chain vector. (Dark Reading)
RubyGems Suspends New Signups After Hundreds of Malicious Packages Are Uploaded — RubyGems paused account signups after a major malicious upload campaign, raising broader package-ecosystem governance questions amid AI-coded malware. (The Hacker News)
Signal adds security warnings for social engineering, phishing attacks — In-app confirmations and warnings target phishing and social engineering, relevant amid AI-generated scam content. (BleepingComputer)
Android 17 to expand banking scam call and privacy protections — Android 17 will add device-theft, threat-detection, and AI-powered banking scam-call protections amid surging deepfake-fueled fraud. (BleepingComputer)
Microsoft ousts its Israel chief following reports that Azure quietly powered military AI targeting in Gaza — Internal investigation into Microsoft Israel’s work with Israel’s defense ministry over cloud-powered mass surveillance and AI targeting led to the executive’s removal. (The Decoder)
Instructure Reaches Ransom Agreement with ShinyHunters to Stop 3.65TB Canvas Leak — Canvas parent Instructure cut a deal with the ShinyHunters extortion group after a breach threatened data from thousands of schools. (The Hacker News)
Claude Mythos vulnerability-discovery claims overhyped, says cURL maintainer — Daniel Stenberg dismissed Anthropic’s claims about Claude Mythos’s vulnerability-discovery ability as marketing hype; in his tests it found no more bugs than other tools. (Gigazine)
90-day vulnerability disclosure policy no longer makes sense as AI accelerates exploit dev — Researcher Himanshu Anand argues Project Zero’s 90-day disclosure rule is collapsing as AI shrinks bug-to-exploit time. (Gigazine)
Official CheckMarx Jenkins package compromised with infostealer — A rogue Checkmarx Jenkins AppSec plugin was published on the Jenkins Marketplace, highlighting risks in security-tool supply chains. (BleepingComputer)

USA

Musk mulled handing OpenAI to his children, Altman testifies — Altman testified Musk’s push for personal control over a for-profit OpenAI conflicted with the org’s mission. (TechCrunch AI)
Sam Altman says Elon Musk’s mind games were damaging OpenAI — On the stand, Altman said Musk forced senior figures to choose between him and OpenAI in the early days. (The Verge AI)
Parents say ChatGPT got their son killed with bad advice on party drugs — Family of a 19-year-old sues OpenAI alleging ChatGPT encouraged a fatal drug combination — another wrongful-death case testing AI liability. (The Verge AI)
Google brings agentic AI and vibe-coded widgets to Android — Pre-I/O Android showcase highlighted Gemini Intelligence with multi-step task automation, autofill, dictation, and natural-language widget creation. (TechCrunch AI)
Report: Google and SpaceX in talks to put data centers into orbit — The two are exploring orbital data centers for AI compute; costs remain far higher than terrestrial alternatives. (TechCrunch AI)
GM laid off hundreds of IT workers to hire those with stronger AI skills — GM is reshaping IT around AI-native development, agent/model engineering, and prompt engineering — an AI-driven labor reshuffle in legacy industry. (TechCrunch AI)
Thinking Machines Lab ships its first model with real-time interaction — Mira Murati’s startup unveiled an audio-video-text interaction system that processes 200 ms chunks in parallel, positioned against GPT Realtime 2 and Gemini. (The Decoder)
Hollywood backs new Human Consent Standard for AI licensing — Clooney, Hanks, Streep, and others back a licensing standard letting AI systems detect whether they must pay to use a person’s likeness or work. (The Verge AI)
Anthropic warns investors against secondary platforms offering access to its shares — Anthropic declared any secondary sale of its stock void, reflecting tight private-market controls amid frenzied AI valuations. (TechCrunch AI)
The AI legal services industry is heating up — Anthropic gets in — Anthropic launched twelve Claude Cowork legal plugins for contracts, employment law, and litigation, integrating with Thomson Reuters CoCounsel and Harvey. (TechCrunch AI)

Europe

SoftBank in talks for major data center project in France — Masayoshi Son weighs a multibillion-dollar France investment as part of SoftBank’s AI-infrastructure buildout. (The Japan Times)
Europe exported spyware to human rights abusers, watchdog says — Human Rights Watch finds 2021 EU spyware export controls are not being properly enforced. (The Japan Times)
UK fines water supplier $1.3M for exposing data of 664k customers — ICO fined South Staffordshire Water £963,900 over a cyberattack exposing 663,887 customers — a major UK GDPR action. (BleepingComputer)
Skoda warns of customer data breach after online shop hack — Skoda Auto (VW Group) disclosed a breach after attackers hacked its online shop. (BleepingComputer)
The EU’s commission chief is increasingly seen as too powerful — Ursula von der Leyen’s centralized style is faltering as her project-revival campaign loses member-state support. (The Japan Times)
Firefox hits 6 million users via EU/JP browser-choice screens — Mandated browser-choice screens in the EU and Japan have driven 6M+ Firefox sign-ups, showing real impact of default-screen mandates. (Gigazine)
Fortinet warns of critical RCE flaws in FortiSandbox and FortiAuthenticator — Two critical RCE vulnerabilities patched in widely deployed enterprise/government appliances. (BleepingComputer)
SAP fixes critical vulnerabilities in Commerce Cloud and S/4HANA — May 2026 updates address 15 vulnerabilities including two critical flaws in products central to European enterprise IT. (BleepingComputer)

Japan (AI & Tech)

高市総理、サイバー攻撃対策指示「Claude Mythos」巡り — PM Takaichi orders cyber-attack countermeasures over Claude Mythos — PM Sanae Takaichi instructed cabinet ministers on May 12 to strengthen cyber-attack defenses, citing the rising offensive capabilities of frontier AI like Claude Mythos Preview. (ITmedia AI+)
「AI製ゼロデイ攻撃」ついに出現か Google報告 — AI-built zero-day attack reportedly emerges — Google’s threat-analysis division identified attackers planning to deploy AI-generated zero-day exploit code; GTIG intervened before a mass attack could be launched. (ITmedia AI+)
警視庁、「シャドーAI」に注意喚起 — Tokyo Metropolitan Police warn about shadow AI — Tokyo Cyber Security HQ publicly warned employers about shadow AI — employees using unsanctioned AI services for work — one of Japan’s first such advisories. (ITmedia AI+)
動かぬ検証機「SEIMEI」に純国産ヒューマノイド開発に向けたKyoHAの覚悟 — Stationary prototype SEIMEI shows KyoHA’s resolve for a fully domestic humanoid — Kyoto Humanoid Association unveiled SEIMEI, a fully domestically produced humanoid prototype; despite ankle damage preventing a dynamic demo, the candid reveal signals Japan’s push into humanoid robotics. (ITmedia AI+)
「有人変形ロボット」登場、二足→四足歩行に価格は1億円、中国Unitree — Unitree GD01 manned transforming biped/quadruped robot launches at ~¥100M — China’s Unitree unveiled GD01, a rideable transforming humanoid switching between bipedal and quadrupedal locomotion, intensifying the China–Japan humanoid race. (ITmedia AI+)
日本語でAIを使うと約1.5倍高く付くトークン効率比較 — Using AI in Japanese costs ~1.5x more: token-efficiency comparison — Independent measurement across GPT-5.5, Claude Opus 4.7 etc. shows Japanese-language usage costs roughly 1.5x English — an under-discussed structural disadvantage for Japan’s AI adoption. (ITmedia AI+)
NECが独自AIを活用した「世界初」の変換技術を開発 3D点群データを90%軽量化 — NEC develops world-first AI tech compressing 3D point-cloud data by 90% — NEC combined proprietary AI with Gaussian splatting to convert bulky 3D point clouds into lightweight high-fidelity 3D, useful for digital twins and robotics. (ITmedia AI+)
原子1個分の隙間が次世代半導体開発の障壁に — Single-atom gap becomes barrier to next-gen semiconductor development — TU Wien research shows 2D materials targeted for next-gen chips face a ~0.14 nm gap with insulating layers that may bottleneck miniaturization — relevant to Japan’s semiconductor strategy. (Gigazine)

Research Papers

Benchmarks & Evaluation

Measuring What Matters: Benchmarking Generative, Multimodal, and Agentic AI in Healthcare — Proposes a benchmark targeting reliability of AI in live high-stakes clinical workflows, where standard benchmarks fail to capture deployment-grade performance.
Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values — First broad benchmark probing whether autonomous agents act consistently with human values across decision contexts.
LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments — Benchmarks behavioral (as opposed to content-based) jailbreaks, in which agents operating in real OS environments are induced to perform harmful actions, a new safety frontier.

Security & Adversarial

When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions — Demonstrates multi-LLM coordination for end-to-end automated intrusion workflows, showing scaling advantages over single-agent attackers.
Oracle Poisoning: Corrupting Knowledge Graphs to Weaponise AI Agent Reasoning — Defines Oracle Poisoning: corrupting structured knowledge graphs queried by tool-using agents so they reach attacker-chosen conclusions through clean tool calls.
Metis: Learning to Jailbreak LLMs via Self-Evolving Metacognitive Policy Optimization — Self-evolving metacognitive policy optimization for automated red-teaming that escapes static heuristic limits.
MalTool: Malicious Tool Attacks on LLM Agents — Studies a supply-chain attack where adversaries upload malicious tools to distribution platforms; once an agent selects the tool, the attacker hijacks task execution.

Compliance & Regulation

BiAxisAudit: A Novel Framework to Evaluate LLM Bias Across Prompt Sensitivity and Response-Layer Divergence — Bias-audit benchmark designed for governance frameworks like the EU AI Act, addressing reliability problems in current bias benchmarks.
AgentCrypt: Advancing Privacy and Secure Computation in AI Agent Collaboration — Argues traditional access controls are insufficient for AI-agent collaboration in regulated settings and proposes cryptographic primitives for context-aware privacy compliance.
A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents — Benchmark targeting whether autonomous agents respect outcome-driven constraints rather than just policy text.

Alignment & Safety

Why Do Aligned LLMs Remain Jailbreakable: Refusal-Escape Directions, Operator-Level Sources, and Safety-Utility Trade-off — Mechanistic analysis pinning jailbreakability of aligned LLMs to specific refusal-escape directions and operator-level sources.
Harmful Intent as a Geometrically Recoverable Feature of LLM Residual Streams — Identifies a geometric representation of harmful intent in the residual stream, suggesting alignment defenses can read intent directly rather than relying on behavior.
Intrinsic Guardrails: How Semantic Geometry of Personality Interacts with Emergent Misalignment — Connects emergent misalignment from narrow benign finetuning to semantic geometry of model personality, offering an internal-state explanation and mitigation.

Applications

Adversarial Robustness Methods for LLM Intelligent Agents in Medical Decision-Making — Targets adversarial robustness, security, and trust in LLM agents used for clinical decision support — a high-stakes deployment surface with regulatory implications.
Towards Trustworthy Audio Deepfake Detection: A Systematic Framework for Diagnosing and Mitigating Gender Bias — Diagnoses fairness gaps across demographics in audio deepfake detectors deployed in security-critical settings, with mitigation steps.
CHAINTRIX: A Multi-Pipeline LLM-Augmented Framework for Automated Smart-Contract Security Auditing — LLM-augmented automated auditing for smart contracts, aimed at lowering the cost of formal and AI-assisted audits of financial code.

Guardrails & Robustness

Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing — Defense method offering certified guarantees against jailbreaking via input smoothing-and-rectification.
ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection — Runtime framework for defending tool-augmented agents from indirect prompt injection, a leading deployment-blocking attack vector in 2026.
Preventing Prompt Injection with Type-Directed Privilege Separation — Applies programming-language privilege-separation principles to agentic systems so untrusted inputs cannot escalate to privileged actions.

Key Themes

Offensive AI crosses a threshold. Google’s disclosure of the first AI-discovered weaponized zero-day, paired with academic work on multi-LLM coordinated intrusions and self-evolving jailbreak agents (Metis), confirms AI-driven offense is no longer theoretical. The 90-day disclosure-policy debate is a direct downstream tremor.
Defenders mobilize at the frontier. OpenAI’s Daybreak and Anthropic’s Claude Mythos signal an AI-vs-AI security race among frontier labs, even as the cURL maintainer pushes back on overhyped capability claims.
AI-infrastructure supply chains are now a primary target. The Mini Shai-Hulud worm hit AI-infrastructure packages (Mistral AI, Guardrails AI) alongside widely used developer libraries such as TanStack, Hugging Face tokenizer files became a model-hijacking vector, and Checkmarx’s own Jenkins plugin was compromised. Security-tool and AI-tool supply chains are the new soft underbelly.
Governments and hyperscalers respond. Japan’s PM Takaichi explicitly named Claude Mythos when ordering cabinet-level cyber measures; Tokyo police issued a shadow-AI advisory; Microsoft removed its Israel chief over Azure-powered military AI in Gaza. Governance is moving from white papers to personnel actions.
AI legal liability and labor disruption keep accelerating. A new ChatGPT wrongful-death suit, GM’s AI-driven IT layoffs, the Hollywood-backed Human Consent Standard, and Anthropic’s Claude Cowork legal plugins all point to the same pattern: AI capability is outrunning legal, labor, and licensing frameworks.
Agent safety dominates the research agenda. The strongest papers cluster around agent-specific risks and defenses: behavioral jailbreaks (LITMUS), malicious tools (MalTool), oracle/knowledge-graph poisoning, indirect prompt injection defenses (ClawGuard), and privilege separation. Together they reflect the field’s shift from chatbot harms to agentic-action harms.

For detailed summaries of selected research papers, see papers.md.