AI News Digest — 2026-05-14

Highlights

Anthropic overtakes OpenAI in B2B adoption for the first time: Ramp expense data now shows 34.4% of US businesses paying for Anthropic vs. 32.3% for OpenAI, marking a watershed moment in the enterprise AI platform wars.
Microsoft’s MDASH AI System finds 16 Windows flaws fixed in Patch Tuesday: Microsoft unveiled a multi-model agentic vulnerability discovery harness that produced real, patched CVEs in its May 2026 release of 138 fixes.
Foxconn confirms 8TB data theft as Nitrogen ransomware hits North American factories: The breach is one of 600 manufacturing-sector ransomware incidents this year, underscoring how attackers exploit manufacturers’ low tolerance for operational downtime.
Anthropic rejects Chinese government request for Claude Mythos access: Anthropic declined a Chinese think tank’s bid for its limited-release, cyber-capable model, even as the Pentagon races to use Mythos while phasing out other Anthropic services.
SoftBank books record ¥5 trillion annual profit, propelled by OpenAI stake: Masayoshi Son’s bet on OpenAI delivers the largest net profit ever recorded by a Japanese company, emboldening still-larger AI infrastructure plays.

News

AI Security

Foxconn Attack Highlights Manufacturing’s Cyber Crisis (Dark Reading): A Nitrogen ransomware attack on Foxconn is one of 600 incidents targeting manufacturers this year, a sector where low tolerance for downtime forces fast payouts.
Foxconn confirms cyberattack claimed by Nitrogen ransomware gang (BleepingComputer): Foxconn’s North American factories are resuming operations after the breach.
Microsoft’s MDASH AI System Finds 16 Windows Flaws Fixed in Patch Tuesday (The Hacker News): Microsoft’s “multi-model agentic scanning harness” is being piloted as an AI-driven vulnerability discovery and remediation pipeline.
Microsoft Patches 138 Vulnerabilities, Including DNS and Netlogon RCE Flaws (The Hacker News): May Patch Tuesday delivers 30 Critical fixes, including DNS and Netlogon remote code execution issues.
Patch Tuesday, May 2026 Edition (Krebs on Security): Krebs argues AI is proving “remarkably good” at finding vulnerabilities in human-written code, even as it is also vulnerable to social engineering.
Windows BitLocker zero-day gives access to protected drives, PoC released (BleepingComputer): Researcher publishes PoCs for two unpatched flaws — “YellowKey” (BitLocker bypass) and “GreenPlasma” (privilege escalation).
New critical Exim mailer flaw allows remote code execution (BleepingComputer): An unauthenticated RCE affects certain Exim configurations.
OpenAI’s GPT-5.5 is as Good as Mythos at Finding Security Vulnerabilities (Schneier on Security): The UK AI Security Institute finds GPT-5.5’s vulnerability-finding ability comparable to that of Anthropic’s Claude Mythos, with the difference that GPT-5.5 is generally available while Mythos is limited-release.
China’s ‘FamousSparrow’ APT Nests in South Caucasus Energy Firm (Dark Reading): China-linked actors expand from hospitality and telecom into energy targets in Azerbaijan.
Azerbaijani Energy Firm Hit by Repeated Microsoft Exchange Exploitation (The Hacker News): Bitdefender attributes a multi-wave intrusion to China-linked actors targeting an oil and gas company from December 2025 through February 2026.
LatAm Vibe Hackers Generate Custom Hacking Tools on the Fly (Dark Reading): Two threat campaigns in Mexico and Brazil heavily leverage AI agents to generate attack tooling on demand.
Attackers Weaponize RubyGems for Data Dead Drops (Dark Reading): The GemStuffer campaign uses 150+ packages as exfiltration channels for data scraped from UK council portals.
GemStuffer Abuses 150+ RubyGems to Exfiltrate Scraped U.K. Council Portal Data (The Hacker News): Packages do not target developers — they use the registry purely as an exfiltration channel.
Tables Turn on ‘The Gentlemen’ RaaS Gang With Data Leak (Dark Reading): An OPSEC failure exposes the affiliate model and TTPs behind the group’s rise.
Android Adds Intrusion Logging for Sophisticated Spyware Forensics (The Hacker News): Google’s opt-in feature in Advanced Protection Mode keeps persistent, privacy-preserving forensic logs.
US govt seeks Instructure testimony on massive Canvas cyberattack (BleepingComputer): The House Homeland Security Committee calls on Instructure executives to testify about two ShinyHunters attacks that disrupted finals and stole student data.
Checkbox Assessments Aren’t Fit to Measure Risk (Dark Reading): A wave of startups is targeting the gap between annual compliance and live risk management.
Most Remediation Programs Never Confirm the Fix Actually Worked (The Hacker News): Mandiant’s M-Trends 2026 puts mean time-to-exploit at -7 days, a negative figure indicating that exploitation on average begins before a patch is available, while remediation continues to drag.

USA

Anthropic now has more business customers than OpenAI, according to Ramp data (TechCrunch AI): Anthropic’s enterprise share quadrupled in a year to overtake OpenAI on the Ramp AI Index.
Anthropic launches Claude for Small Business (The Decoder): A package of 15 agent workflows with QuickBooks, PayPal, and HubSpot, plus a ten-city training tour.
Anthropic courts a new kind of customer: small business owners (TechCrunch AI): The platform wars are expanding downmarket to the 36 million US small businesses.
Anthropic’s Cat Wu says AI will anticipate your needs before you know what they are (TechCrunch AI): Claude Code’s product head argues proactivity is the next frontier.
Musk’s xAI is running nearly 50 gas turbines unchecked at its Mississippi data center (TechCrunch AI): A lawsuit challenges xAI’s use of “mobile” gas turbines as de facto power plants at Colossus 2.
AI chatbots are giving out people’s real phone numbers (MIT Technology Review): Users report Google AI surfacing their personal contact info with no clear way to opt out.
Mark Zuckerberg announces ‘completely private’ encrypted Meta AI chat (The Verge AI): Incognito Chat processes conversations in an environment resembling a trusted execution environment (TEE) that Meta says even the company itself cannot access.
Meta AI gets a private mode where no conversation data is stored on servers (The Decoder): Zuckerberg pitches Incognito Chat as a first-of-its-kind privacy product for an AI lab.
WhatsApp adds an incognito mode in Meta AI chats (TechCrunch AI): Messages disappear by default when the chat closes.
Live updates from Elon Musk and Sam Altman’s court battle over the future of OpenAI (The Verge AI): The 2024 Musk lawsuit accuses OpenAI of abandoning its founding mission.
Who trusts Sam Altman? (TechCrunch AI): Altman testifies he is “honest and trustworthy” as the trial enters its closing weeks.
Sam Altman was winning on the stand, but it might not be enough (The Verge AI): After weeks of damaging witness accounts, Altman’s own testimony plays better than expected.
Microsoft doesn’t want any of this (The Verge AI): Microsoft’s opening statement at the Musk-Altman trial reads as visibly reluctant.
Amazon launches an AI shopping assistant for the search bar, powered by Alexa+ (TechCrunch AI): Alexa for Shopping replaces conventional Amazon search across mobile, desktop, and Echo Show.
Alexa is moving into Amazon.com (The Verge AI): Typing a query on Amazon now invokes the LLM-powered shopping assistant directly.
Origin Lab raises $8M to help video game companies sell data to world-model builders (TechCrunch AI): A new marketplace pairs AI labs with game studios for licensed training data.
Adaption aims big with AutoScientist, an AI tool that helps models train themselves (TechCrunch AI): AutoScientist automates fine-tuning to make capability acquisition faster.
AI startup Recursive emerges from stealth with $650 million to build self-improving AI (The Decoder): The company pitches recursive self-improvement as “the fastest path to superintelligence.”
Google is hiring hundreds of engineers to help customers adopt its AI (The Decoder): Implementation friction is becoming visible enough that Google is staffing up a deployment arm.
Poppy debuts a proactive AI assistant to help organize your digital life (TechCrunch AI): Poppy connects calendar, email, and messages to surface tasks proactively.
Luma opens Uni-1.1 image model API at prices and quality matching OpenAI and Google (The Decoder): Uni-1.1 starts at $0.04/image at 2,048-px resolution and includes web search, reasoning, and support for 9 reference images.
From Prompt to Pointer Engineering: DeepMind tries to reinvent the mouse cursor for the AI era (The Decoder): DeepMind argues pointer-driven inputs can replace much prompt engineering for visual UIs.
Building a safe, effective sandbox to enable Codex on Windows (OpenAI Blog): OpenAI describes the file-access and network restrictions used to safely run a coding agent on Windows.
How NVIDIA engineers and researchers build with Codex (OpenAI Blog): NVIDIA teams use Codex with GPT-5.5 to ship production systems and run research experiments.
How finance teams use Codex (OpenAI Blog): Codex builds monthly business reviews (MBRs), reporting packs, and variance bridges from real work inputs.
Medicare’s new payment model is built for AI, and most of the tech world has no idea (TechCrunch AI): The new ACCESS model creates a reimbursement mechanism for AI agents that monitor patients between visits.
Data centers are coming for rural America (The Verge AI): A 1.4-million-sq-ft former Maine paper mill is being repurposed for AI compute.
mimalloc: A new, high-performance, scalable memory allocator (Microsoft Research): A small, drop-in malloc replacement with bounded worst-case allocation times.
GridSFM: A new, small foundation model for the electric grid (Microsoft Research): GridSFM predicts AC optimal power flow in milliseconds to give operators direct visibility into congestion and stability.

Europe

Microsoft fixes Windows Autopatch bug installing restricted drivers (BleepingComputer): Drivers blocked by admin policies were being installed on Autopatch-managed devices specifically in the EU.
Sweden’s EQT to acquire Japanese firm behind Tabelog (The Japan Times): EQT plans a ~¥590 billion takeover of Kakaku.com.

Japan (AI & Tech)

Geminiのツール呼び出し機能を蒸留したスマホ向け軽量モデル「Needle」が登場 (Gigazine): Cactus Compute releases Needle, a 26M-parameter tool-calling model distilled from Gemini and aimed at on-device agents.
ソフトバンクG、最終利益5兆円超、「日本企業として史上最高」、OpenAIへの投資利益などけん引 (ITmedia AI+): SoftBank’s FY2026 net profit hits ¥5.02 trillion, up 333.7% YoY, driven largely by OpenAI investment gains.
SoftBank profit jumps, emboldens Son to bet more on OpenAI (The Japan Times): Q4 net income of ¥1.83 trillion crushes consensus of ¥295.2 billion.
Geminiがスマホを”自動操縦”　Google、Android向けAIエージェント「Gemini Intelligence」発表 (ITmedia AI+): Google unveils a cross-app AI agent at “The Android Show: I/O Edition.”
GoogleがGemini Intelligenceを発表、2026年夏からGalaxy・Pixel順次展開 (Gigazine): Rollout will extend across watches, cars, glasses, and laptops by year-end.
AIがAndroidスマホを操作、音声入力も効率化　「Gemini Intelligence」登場 (ITmedia AI+): Galaxy and Pixel will be the launch platforms.
GoogleがAndroidベースで動くノートPC「Googlebook」を発表 (Gigazine): Designed from scratch around Gemini Intelligence as Google reframes the laptop from OS to “intelligence system.”
Android 17ベースデスクトップOS「Aluminium OS」が16分間のリーク動画で明らかに (Gigazine): A pre-announcement leak shows the integrated ChromeOS/Android desktop running on hardware.
Android Autoの大型アップデートでYouTube視聴機能追加 (Gigazine): Android Auto gets a UI refresh and native video playback while parked.
気が散るアプリの起動時に10秒間の冷却期間を置く「Pause Point」機能がAndroid 17で導入へ (Gigazine): A built-in friction layer aimed at doomscrolling lands in Android 17.
「Claude Mythos」、メガバンク3社が利用へ　日経など報道 (ITmedia AI+): MUFG, SMBC, and Mizuho are reportedly gaining access to Anthropic’s Claude Mythos Preview.
米国防総省、Anthropic排除継続も「Mythos」は導入　「国家安全保障上の重要局面」 (ITmedia AI+): The Pentagon adopts Mythos while accelerating its broader migration off Anthropic services.
Anthropicが中国政府による最新AIモデル「Claude Mythos Preview」へのアクセス許可要求を拒否 (Gigazine): Anthropic refuses access to a Chinese think tank, citing the model’s high cyber-offense capability.
AIで法務を効率化するためのClaude連携アプリをAnthropicがリリース (Gigazine): Claude integrations target transactional, employment, and litigation legal work.
アクセンチュアがAnthropicとの協業を国内本格化 (ITmedia AI+): Accenture launches a Japan-based Claude partnership across four support areas.
Claude Mythosがもたらすセキュリティビジネス激変の可能性 (ITmedia AI+): Analysis of how AI scanners like Mythos will polarize the security services industry.
MicrosoftがAIセキュリティシステム「MDASH」を構築 (Gigazine): Japanese coverage of Microsoft’s agentic vulnerability discovery and remediation system.
今日は毎月恒例「Windows Update」の日、120件の脆弱性を修正 (Gigazine): May Patch Tuesday rolls out to Japan, with extended security updates for Windows 10 still in scope.
GoogleがAndroidの盗難対策を強化、なりすまし電話対策やスパイウェア調査支援の新機能も (Gigazine): Google combines AI detection with OS-level protections against fraud, theft, and targeted attacks.
安全保障強化のために連邦取引委員会が中国製の携帯電話通信モジュール規制を検討中 (Gigazine): The US FCC is weighing restrictions on Chinese-made cellular modules on national security grounds.
iOS 27の「完全に再構築されたSiri」は一体どんなものになるのか (Gigazine): Mark Gurman reports a standalone Siri app capable of cross-app agentic tasks for fall 2026.
GoogleとSpaceXが「宇宙データセンター」を協議 (Gigazine): The two companies are reportedly discussing using SpaceX launch capacity to put AI compute in orbit.
SpaceXが超大型ロケット「スターシップV3」を発表 (Gigazine): The reusable upper stage gets significant upgrades and a scheduled test flight.
「マウスポインタをAIの入力として活用するべき」とDeepMindが語る (Gigazine): DeepMind argues pointer-on-screen plus short instructions can be a more natural multimodal interface.
まだ信用されていない「AIエージェントによる意思決定」　”人の目”による主な検証方法とは (ITmedia AI+): A Dynatrace survey on how enterprises actually verify agentic decisions in practice.
自民党、生成AIを悪用したディープフェイク広告に対策案 (ITmedia AI+): The LDP drafts countermeasures, including potential criminal penalties, for deepfake investment scams on social media.
阿波おどりの「AI活用ポスター」公募　徳島県鳴門市 (ITmedia AI+): Naruto City opens a generative-AI poster contest for its summer Awa Odori festival.
高度なkintoneカスタマイズを専門知識なしで　AIでシステム改善できる新サービス (ITmedia AI+): Joyzo launches “Skill 39,” letting business users iterate on kintone via natural-language AI prompts.
人気ノートアプリ「Obsidian」がプラグイン審査を刷新 (Gigazine): Obsidian relaunches its plugin/theme distribution with malware checks and safety labels.
ライカレンズ搭載デジカメ「LUMIX DC-L10」をPanasonicが発表 (Gigazine): A compact fixed-lens camera with a Leica lens, using the same Four Thirds (4/3-type) BSI sensor as the GH7.
Foxconnがハッカー攻撃により1100万件以上のファイルを含む8TBのデータを盗まれる (Gigazine): Nitrogen’s exfiltration claim is now estimated at 8TB across 11+ million files.
Threads上で「Grok」に似たAIアカウントをブロックできないことが話題に (Gigazine): Meta’s new @meta.ai Threads account cannot be blocked, raising platform-design questions.
‘Robot wolves’ in high demand to scare off bears in Japan (The Japan Times): “Monster Wolf” animatronic deterrents are spreading across Japan in response to bear-encounter surges.

Research Papers

Benchmarks & Evaluation

CTFusion: A CTF-based Benchmark for LLM Agent Evaluation — Builds a standardized Capture-The-Flag evaluation harness so cybersecurity agent results are reproducible across labs.
ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks? — A grounded evaluation of whether AI agents can convert known vulnerabilities into working exploits — directly probes offensive capability.
The Evaluation Differential: When Frontier AI Models Recognise They Are Being Tested — Shows that frontier models can latently represent eval contexts and behave differently when they know they are being tested.

Security & Adversarial

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks — A modular harness for studying how attackers steer conversations toward unsafe outputs over many turns.
IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection — A practical red-team tool for the enterprise-style whitelisted browsing agents that are now common.
Continuous Discovery of Vulnerabilities in LLM Serving Systems with Fuzzing — Shifts attention from model behavior to the often-overlooked serving-layer attack surface in inference engines.
AgentShield: Deception-based Compromise Detection for Tool-using LLM Agents — Argues defenders should detect IPI compromises that slip through rather than only try to prevent them.

Compliance & Regulation

Native Explainability for Bayesian Confidence Propagation Neural Networks: A Framework for Trusted Brain-Like AI — Designed against the EU AI Act (Regulation 2024/1689), which becomes fully applicable to high-risk systems in August 2026.
Autonomy and Agency in Agentic AI: Architectural Tactics for Regulated Contexts — Separates agency (what a system can do) from autonomy (how much it does without humans) to guide regulated deployments.
Rethinking LLMOps for Fraud and AML: Building a Compliance-Grade LLM Serving Stack — Lays out the serving-stack divergence between general chat workloads and compliance-grade fraud / anti-money-laundering deployments.

Alignment & Safety

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment — Treats tool-using agent failures as trajectories — unsafe tool calls, prompt-injection compliance, over-refusal — and aligns on those.
The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers — Asks the underexamined question of whose moral expectations alignment should track.

Applications

ABRA: Agent Benchmark for Radiology Applications — Treats radiology imaging as an environment the agent must navigate, not a pre-selected sample.
LegalCheck: Retrieval- and Context-Augmented Generation for Drafting Municipal Legal Advice Letters — A RAG-grounded system aimed at Dutch public-sector legal departments facing staff shortages.

Guardrails & Robustness

Rethinking Evaluation for LLM Hallucination Detection: A Desiderata, A New RAG-based Benchmark, New Insights — Re-examines how hallucination detection is measured and proposes a RAG-grounded benchmark.
Robust LLM Unlearning Against Relearning Attacks: The Minor Components in Representations Matter — Identifies why current unlearning methods can be reversed by fine-tuning and proposes a more durable approach for privacy and copyright takedowns.

Key Themes

Anthropic’s commercial momentum, plus a sharper Mythos posture. Anthropic now leads OpenAI in US B2B adoption and is opening direct channels to small businesses and legal work — even as it tightens Mythos access (rejecting a Chinese think tank) while the Pentagon and Japan’s megabanks line up to use it.
AI as both attacker and defender in security. Microsoft’s MDASH is now finding production Windows vulnerabilities and the UK AI Security Institute rates GPT-5.5’s vulnerability-finding as comparable to Claude Mythos, while in Mexico and Brazil AI agents are generating custom hacking tools on the fly. On arXiv, ExploitGym and CTFusion are explicitly benchmarking offensive agent capability.
The agent surface is the new attack surface. Multiple papers (MT-JailBench, IPI-proxy, AgentShield, FlowSteer, Proteus) and incidents (GemStuffer, FamousSparrow) converge on the same point: agentic systems with tools, skills, and browsers create classes of risk that single-turn prompt safety cannot cover.
Regulated deployment is moving from theory to architecture. EU AI Act compliance, AML/fraud LLMOps, agentic-AI architectural tactics, and the Japanese LDP’s draft deepfake countermeasures reflect a shift from “is AI safe?” to “how do you build it to pass an audit?”
Hardware, energy, and infrastructure are now first-class AI stories. xAI’s gas-turbine lawsuit, Google × SpaceX orbital data centers, rural Maine compute conversions, and Tencent’s renewed China-chip-driven spending all signal that the bottleneck is shifting from models to the physical world that runs them.
Japan as both AI customer and AI policymaker. SoftBank’s record-breaking OpenAI-driven profit, the megabanks’ reported access to Claude Mythos, the contrast between the Pentagon’s and Tokyo’s handling of Mythos, Accenture’s Japan rollout, and the LDP’s draft deepfake countermeasures together mark Japan as one of the most active AI markets and regulators outside the US.

For detailed summaries of selected research papers, see papers.md.