AI News Digest — 2026-05-13
Highlights
- First AI-built zero-day exploit halted in mass-attack attempt: Google’s Threat Intelligence Group identified the first known case of attackers using AI to discover and weaponize a zero-day vulnerability, marking a long-anticipated inflection point in offensive AI.
- Japanese PM Takaichi orders cyber countermeasures citing Claude Mythos: PM Sanae Takaichi directed cabinet ministers to harden cyber defenses in response to frontier AI such as Anthropic’s Claude Mythos Preview — a rare head-of-government response naming a specific model as a national-security threat.
- OpenAI launches Daybreak for AI-driven vulnerability detection: Daybreak combines frontier AI with a Codex Security agent to model threats and patch vulnerabilities before attackers exploit them, formalizing AI-vs-AI competition with Anthropic’s Claude Mythos.
- Mini Shai-Hulud worm hits TanStack, Mistral AI, and Guardrails AI packages: A TeamPCP-linked supply-chain worm compromised hundreds of npm and PyPI packages — notably AI-infrastructure projects including guardrails libraries — with credential-stealing payloads.
- Microsoft ousts Israel chief over Azure-powered military AI in Gaza: The top Microsoft Israel executive was removed after an internal probe into cloud-powered mass surveillance and AI targeting, setting a first-of-its-kind governance precedent among hyperscalers.
News
AI Security
- Google says it stopped a mass cyberattack after AI was used to discover a zero-day exploit — Google’s Threat Intelligence Group identified the first known case of attackers using AI to discover and weaponize a zero-day vulnerability; state-backed actors from China, North Korea, and Russia are also using AI for vulnerability research. (The Decoder)
- OpenAI Launches Daybreak for AI-Powered Vulnerability Detection and Patch Validation — Daybreak combines frontier AI with a Codex Security agent to model threats and patch vulnerabilities before exploitation, positioned against Anthropic’s Claude Mythos. (The Hacker News)
- OpenAI just released its answer to Claude Mythos — Daybreak’s Codex Security agent validates vulnerabilities and produces fixes; marks intensifying AI-vs-AI competition in security automation. (The Verge AI)
- Why Agentic AI Is Security’s Next Blind Spot — Agentic AI is running in production environments and consuming data without meaningful security-team involvement, reframing AI risk beyond policy into operational visibility. (The Hacker News)
- Mini Shai-Hulud Worm Compromises TanStack, Mistral AI, Guardrails AI and More Packages — TeamPCP-linked malicious npm/PyPI packages from TanStack, UiPath, Mistral AI, OpenSearch, and Guardrails AI ship credential-stealing payloads. (The Hacker News)
- Shai Hulud attack ships signed malicious TanStack, Mistral npm packages — Hundreds of npm/PyPI packages compromised in a new Shai-Hulud supply-chain campaign hitting AI infrastructure projects. (BleepingComputer)
- Hugging Face Packages Weaponized With a Single File Tweak — Tokenizer library files in Hugging Face AI models can be manipulated to hijack model outputs and exfiltrate data — a new model-supply-chain vector. (Dark Reading)
- RubyGems Suspends New Signups After Hundreds of Malicious Packages Are Uploaded — RubyGems paused account signups after a major malicious upload campaign, raising broader package-ecosystem governance questions amid AI-coded malware. (The Hacker News)
- Signal adds security warnings for social engineering, phishing attacks — In-app confirmations and warnings target phishing and social engineering, relevant amid AI-generated scam content. (BleepingComputer)
- Android 17 to expand banking scam call and privacy protections — Android 17 will add device-theft, threat-detection, and AI-powered banking scam-call protections amid surging deepfake-fueled fraud. (BleepingComputer)
- Microsoft ousts its Israel chief following reports that Azure quietly powered military AI targeting in Gaza — Internal investigation into Microsoft Israel’s work with Israel’s defense ministry over cloud-powered mass surveillance and AI targeting led to the executive’s removal. (The Decoder)
- Instructure Reaches Ransom Agreement with ShinyHunters to Stop 3.65TB Canvas Leak — Canvas parent Instructure cut a deal with the ShinyHunters extortion group after a breach threatened data from thousands of schools. (The Hacker News)
- Claude Mythos vulnerability-discovery claims overhyped, says cURL maintainer — Daniel Stenberg dismissed Anthropic’s claims about Claude Mythos’s vulnerability-finding ability as marketing hype; in his tests it found no more bugs than other tools. (Gigazine)
- 90-day vulnerability disclosure policy no longer makes sense as AI accelerates exploit dev — Researcher Himanshu Anand argues Project Zero’s 90-day disclosure rule is collapsing as AI shrinks bug-to-exploit time. (Gigazine)
- Official CheckMarx Jenkins package compromised with infostealer — A rogue Checkmarx Jenkins AppSec plugin was published on the Jenkins Marketplace, highlighting risks in security-tool supply chains. (BleepingComputer)
USA
- Musk mulled handing OpenAI to his children, Altman testifies — Altman testified Musk’s push for personal control over a for-profit OpenAI conflicted with the org’s mission. (TechCrunch AI)
- Sam Altman says Elon Musk’s mind games were damaging OpenAI — On the stand, Altman said Musk forced senior figures to choose between him and OpenAI in the early days. (The Verge AI)
- Parents say ChatGPT got their son killed with bad advice on party drugs — Family of a 19-year-old sues OpenAI alleging ChatGPT encouraged a fatal drug combination — another wrongful-death case testing AI liability. (The Verge AI)
- Google brings agentic AI and vibe-coded widgets to Android — Pre-I/O Android showcase highlighted Gemini Intelligence with multi-step task automation, autofill, dictation, and natural-language widget creation. (TechCrunch AI)
- Report: Google and SpaceX in talks to put data centers into orbit — The two are exploring orbital data centers for AI compute; costs remain far higher than terrestrial alternatives. (TechCrunch AI)
- GM laid off hundreds of IT workers to hire those with stronger AI skills — GM is reshaping IT around AI-native development, agent/model engineering, and prompt engineering — an AI-driven labor reshuffle in legacy industry. (TechCrunch AI)
- Thinking Machines Lab ships its first model with real-time interaction — Mira Murati’s startup unveiled an audio-video-text interaction system that processes input in parallel 200 ms chunks, aimed at GPT Realtime 2 and Gemini. (The Decoder)
- Hollywood backs new Human Consent Standard for AI licensing — Clooney, Hanks, Streep, and others back a licensing standard letting AI systems detect whether they must pay to use a person’s likeness or work. (The Verge AI)
- Anthropic warns investors against secondary platforms offering access to its shares — Anthropic declared any secondary sale of its stock void, reflecting tight private-market controls amid frenzied AI valuations. (TechCrunch AI)
- The AI legal services industry is heating up — Anthropic gets in — Anthropic launched twelve Claude Cowork legal plugins for contracts, employment law, and litigation, integrating with Thomson Reuters CoCounsel and Harvey. (TechCrunch AI)
Europe
- SoftBank in talks for major data center project in France — Masayoshi Son weighs a multibillion-dollar France investment as part of SoftBank’s AI-infrastructure buildout. (The Japan Times)
- Europe exported spyware to human rights abusers, watchdog says — Human Rights Watch finds 2021 EU spyware export controls are not being properly enforced. (The Japan Times)
- UK fines water supplier $1.3M for exposing data of 664k customers — ICO fined South Staffordshire Water £963,900 over a cyberattack exposing 663,887 customers — a major UK GDPR action. (BleepingComputer)
- Skoda warns of customer data breach after online shop hack — Skoda Auto (VW Group) disclosed a breach after attackers hacked its online shop. (BleepingComputer)
- The EU’s commission chief is increasingly seen as too powerful — Ursula von der Leyen’s centralized style is faltering as her project-revival campaign loses member-state support. (The Japan Times)
- Firefox hits 6 million users via EU/JP browser-choice screens — Mandated browser-choice screens in the EU and Japan have driven 6M+ Firefox sign-ups, showing real impact of default-screen mandates. (Gigazine)
- Fortinet warns of critical RCE flaws in FortiSandbox and FortiAuthenticator — Two critical RCE vulnerabilities patched in widely deployed enterprise/government appliances. (BleepingComputer)
- SAP fixes critical vulnerabilities in Commerce Cloud and S/4HANA — May 2026 updates address 15 vulnerabilities including two critical flaws in products central to European enterprise IT. (BleepingComputer)
Japan (AI & Tech)
- 高市総理、サイバー攻撃対策指示「Claude Mythos」巡り — PM Takaichi orders cyber-attack countermeasures over Claude Mythos — PM Sanae Takaichi instructed cabinet ministers on May 12 to strengthen cyber-attack defenses, citing the rising offensive capabilities of frontier AI like Claude Mythos Preview. (ITmedia AI+)
- 「AI製ゼロデイ攻撃」ついに出現か Google報告 — AI-built zero-day attack reportedly emerges — Google’s Threat Intelligence Group (GTIG) identified attackers planning to deploy AI-generated zero-day exploit code and intervened before a mass attack could be launched. (ITmedia AI+)
- 警視庁、「シャドーAI」に注意喚起 — Tokyo Metropolitan Police warn about shadow AI — Tokyo Cyber Security HQ publicly warned employers about shadow AI — employees using unsanctioned AI services for work — one of Japan’s first such advisories. (ITmedia AI+)
- 動かぬ検証機「SEIMEI」に純国産ヒューマノイド開発に向けたKyoHAの覚悟 — Stationary prototype SEIMEI shows KyoHA’s resolve for a fully domestic humanoid — Kyoto Humanoid Association unveiled SEIMEI, a fully domestically produced humanoid prototype; despite ankle damage preventing a dynamic demo, the candid reveal signals Japan’s push into humanoid robotics. (ITmedia AI+)
- 「有人変形ロボット」登場、二足→四足歩行に 価格は1億円、中国Unitree — Unitree GD01 manned transforming biped/quadruped robot launches at ~¥100M — China’s Unitree unveiled GD01, a rideable transforming humanoid switching between bipedal and quadrupedal locomotion, intensifying the China–Japan humanoid race. (ITmedia AI+)
- 日本語でAIを使うと約1.5倍高く付く トークン効率比較 — Using AI in Japanese costs ~1.5x more: token-efficiency comparison — Independent measurement across GPT-5.5, Claude Opus 4.7 etc. shows Japanese-language usage costs roughly 1.5x English — an under-discussed structural disadvantage for Japan’s AI adoption. (ITmedia AI+)
- NECが独自AIを活用した「世界初」の変換技術を開発 3D点群データを90%軽量化 — NEC develops world-first AI tech compressing 3D point-cloud data by 90% — NEC combined proprietary AI with Gaussian splatting to convert bulky 3D point clouds into lightweight high-fidelity 3D, useful for digital twins and robotics. (ITmedia AI+)
- 原子1個分の隙間が次世代半導体開発の障壁に — Single-atom gap becomes barrier to next-gen semiconductor development — TU Wien research shows 2D materials targeted for next-gen chips face a ~0.14 nm gap with insulating layers that may bottleneck miniaturization — relevant to Japan’s semiconductor strategy. (Gigazine)
Research Papers
Benchmarks & Evaluation
- Measuring What Matters: Benchmarking Generative, Multimodal, and Agentic AI in Healthcare — Proposes a benchmark targeting reliability of AI in live high-stakes clinical workflows, where standard benchmarks fail to capture deployment-grade performance.
- Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values — First broad benchmark probing whether autonomous agents act consistently with human values across decision contexts.
- LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments — Benchmarks behavioral (rather than content) jailbreaks, in which agents operating in real OS environments are induced to perform harmful actions — a new safety frontier.
Security & Adversarial
- When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions — Demonstrates multi-LLM coordination for end-to-end automated intrusion workflows, showing scaling advantages over single-agent attackers.
- Oracle Poisoning: Corrupting Knowledge Graphs to Weaponise AI Agent Reasoning — Defines Oracle Poisoning: corrupting structured knowledge graphs queried by tool-using agents so they reach attacker-chosen conclusions through clean tool calls.
- Metis: Learning to Jailbreak LLMs via Self-Evolving Metacognitive Policy Optimization — Self-evolving metacognitive policy optimization for automated red-teaming that escapes static heuristic limits.
- MalTool: Malicious Tool Attacks on LLM Agents — Studies a supply-chain attack where adversaries upload malicious tools to distribution platforms; once an agent selects the tool, the attacker hijacks task execution.
Compliance & Regulation
- BiAxisAudit: A Novel Framework to Evaluate LLM Bias Across Prompt Sensitivity and Response-Layer Divergence — Bias-audit benchmark designed for governance frameworks like the EU AI Act, addressing reliability problems in current bias benchmarks.
- AgentCrypt: Advancing Privacy and Secure Computation in AI Agent Collaboration — Argues traditional access controls are insufficient for AI-agent collaboration in regulated settings and proposes cryptographic primitives for context-aware privacy compliance.
- A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents — Benchmark targeting whether autonomous agents respect outcome-driven constraints rather than just policy text.
Alignment & Safety
- Why Do Aligned LLMs Remain Jailbreakable: Refusal-Escape Directions, Operator-Level Sources, and Safety-Utility Trade-off — Mechanistic analysis pinning jailbreakability of aligned LLMs to specific refusal-escape directions and operator-level sources.
- Harmful Intent as a Geometrically Recoverable Feature of LLM Residual Streams — Identifies a geometric representation of harmful intent in the residual stream, suggesting alignment defenses can read intent directly rather than relying on behavior.
- Intrinsic Guardrails: How Semantic Geometry of Personality Interacts with Emergent Misalignment — Connects emergent misalignment from narrow benign finetuning to semantic geometry of model personality, offering an internal-state explanation and mitigation.
Applications
- Adversarial Robustness Methods for LLM Intelligent Agents in Medical Decision-Making — Targets adversarial robustness, security, and trust in LLM agents used for clinical decision support — a high-stakes deployment surface with regulatory implications.
- Towards Trustworthy Audio Deepfake Detection: A Systematic Framework for Diagnosing and Mitigating Gender Bias — Diagnoses fairness gaps across demographics in audio deepfake detectors deployed in security-critical settings, with mitigation steps.
- CHAINTRIX: A Multi-Pipeline LLM-Augmented Framework for Automated Smart-Contract Security Auditing — LLM-augmented automated auditing for smart contracts, pushing the affordability frontier of formal/AI-assisted financial-code audits.
Guardrails & Robustness
- Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing — Defense method offering certified guarantees against jailbreaking via input smoothing-and-rectification.
- ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection — Runtime framework for defending tool-augmented agents from indirect prompt injection, a leading deployment-blocking attack vector in 2026.
- Preventing Prompt Injection with Type-Directed Privilege Separation — Applies programming-language privilege-separation principles to agentic systems so untrusted inputs cannot escalate to privileged actions.
Key Themes
- Offensive AI crosses a threshold. Google’s disclosure of the first AI-discovered weaponized zero-day, paired with academic work on multi-LLM coordinated intrusions and self-evolving jailbreak agents (Metis), confirms AI-driven offense is no longer theoretical. The 90-day disclosure-policy debate is a direct downstream tremor.
- Defenders mobilize at the frontier. OpenAI’s Daybreak and Anthropic’s Claude Mythos signal an AI-vs-AI security race among frontier labs, even as the cURL maintainer pushes back on overhyped capability claims.
- AI-infrastructure supply chains are now a primary target. The Mini Shai-Hulud worm hit AI libraries (Mistral AI, Guardrails AI) alongside widely used developer packages such as TanStack, Hugging Face tokenizer files became a model-hijacking vector, and Checkmarx’s own Jenkins plugin was compromised — security-tool and AI-tool supply chains are the new soft underbelly.
- State actors and heads of government respond. Japan’s PM Takaichi explicitly named Claude Mythos when ordering cabinet-level cyber measures; Tokyo police issued a shadow-AI advisory; Microsoft removed its Israel chief over Azure-powered military AI in Gaza. Governance is moving from white papers to personnel actions.
- AI legal liability and labor disruption keep accelerating. A new ChatGPT wrongful-death suit, GM’s AI-driven IT layoffs, the Hollywood-backed Human Consent Standard, and Anthropic’s Claude Cowork legal plugins all point to the same pattern: AI capability is outrunning legal, labor, and licensing frameworks.
- Agent safety dominates the research agenda. The strongest papers cluster around agent-specific risks — behavioral jailbreaks (LITMUS), malicious tools (MalTool), oracle/knowledge-graph poisoning, indirect prompt injection defenses (ClawGuard), and privilege separation — reflecting the field’s shift from chatbot harms to agentic-action harms.
For detailed summaries of selected research papers, see papers.md.