AI News Digest — 2026-04-24

Highlights

DeepSeek V4 preview closes the gap with frontier closed models: DeepSeek released V4-Pro and V4-Flash (up to 1.6T parameters, 1M-token context) at prices well below OpenAI, Google, and Anthropic, claiming parity with leading closed systems on reasoning.
NEC and Anthropic strike Japan-wide Claude partnership: NEC will roll out Claude Code to 30,000 group employees and co-develop industry-specific AI solutions on Claude Cowork, in a deal finalized just two days before its announcement.
LMDeploy CVE exploited within 13 hours of disclosure: A high-severity SSRF flaw (CVE-2026-33626) in the popular open-source LLM serving toolkit came under active exploitation less than 13 hours after public disclosure.
Cohere acquires Germany’s Aleph Alpha in $600M Schwarz Group-backed deal: Canada’s Cohere takes over the once-flagship “German OpenAI” months after its founder was pushed out, marking a decisive retreat for European frontier AI ambitions.
AI phishing becomes the No. 1 cyberattack vector: Companies have seen a major influx of AI-powered phishing over the last six months, with attackers now running 1-to-1 personalized campaigns at scale.

News

AI Security

AI Phishing Is No. 1 With a Bullet for Cyberattackers: Companies have seen a significant influx of AI-powered phishing in the last six months, with attackers moving to 1-to-1 personalized campaigns.
LMDeploy CVE-2026-33626 Flaw Exploited Within 13 Hours of Disclosure: A high-severity SSRF flaw in the LMDeploy LLM serving toolkit came under active exploitation less than 13 hours after public disclosure.
Bridging the AI Agent Authority Gap: Continuous Observability as the Decision Engine: AI agents expose a structural gap in enterprise security that requires continuous observability rather than narrow delegation frameworks.
Microsoft now lets admins uninstall Copilot on enterprise devices: IT administrators can now fully uninstall Copilot from enterprise devices via policy after the April 2026 Patch Tuesday.
Cyberattacks by advanced AI are “a clear and present danger,” says Financial Services Minister Katayama: Japan’s financial minister convened the BOJ and megabanks over AI-powered cyber threats, agreeing to form a public-private working group.
Bitwarden password manager hit by npm supply chain attack: Security firm Socket disclosed that password manager Bitwarden was hit by a supply chain attack affecting its npm package distribution.
26 FakeWallet Apps Found on Apple App Store Targeting Crypto Seed Phrases: Researchers found 26 malicious crypto wallet impersonator apps on Apple’s App Store stealing seed phrases since fall 2025.
Tropic Trooper Uses Trojanized SumatraPDF and GitHub to Deploy AdaptixC2: Chinese-speaking targets are hit with trojanized SumatraPDF that deploys AdaptixC2 and abuses VS Code tunnels for persistence.
Tropic Trooper APT Takes Aim at Home Routers, Japanese Targets: Chinese state-sponsored Tropic Trooper branches out with new tools and new victims, targeting home routers and Japanese organizations.
North Korea’s Lazarus Targets macOS Users via ClickFix: Lazarus continues using ClickFix for initial access and data theft against Mac-centric organizations and high-value executives.
China-Backed Hackers Are Industrializing Botnets: China’s state-backed groups are using covert networks of compromised devices to execute low-cost, low-risk, and deniable attacks.
First ransomware family confirmed to be quantum-safe: This is the first confirmed ransomware family to use post-quantum cryptography, even though the switch offers no practical benefit today.
Hackers exploit file upload bug in Breeze Cache WordPress plugin: Attackers are actively exploiting a critical unauthenticated file upload vulnerability in the Breeze Cache plugin.
Over 10,000 Zimbra servers vulnerable to ongoing XSS attacks: CISA confirms over 10,000 Zimbra Collaboration Suite instances are exposed to ongoing XSS exploitation.
Hiding Bluetooth Trackers in Mail: A Dutch journalist demonstrated tracking a naval ship by mailing a postcard with a hidden Bluetooth tracker inside.
Tinder to verify humans using Sam Altman-backed Orb iris scan: Tools for Humanity partners with Match Group to extend iris-scan identity verification on Tinder as AI deepfake defense, after a Japan trial.
Meta account revamp adds passkey support to Instagram: Meta unifies app logins under Meta Accounts and adds passkey support to Instagram for enhanced security.

USA

DeepSeek previews new AI model that ‘closes the gap’ with frontier models: DeepSeek claims V4 preview models have nearly closed the gap with leading closed models on reasoning benchmarks through architectural improvements.
China’s DeepSeek previews new AI model a year after jolting US rivals: DeepSeek’s V4 preview is pitched as an open-source competitor to Anthropic, Google, and OpenAI closed models.
As agentic AI pushes rivals to raise prices, DeepSeek ships good-enough models for almost nothing: V4-Pro and V4-Flash hit 1.6T parameters and 1M-token context at prices well below OpenAI, Google, and Anthropic.
Meta signs deal for millions of Amazon AI CPUs: Meta commandeers a chunk of Amazon’s homegrown CPUs for agentic AI workloads, signaling a new chip race beyond GPUs.
Elon Musk and Sam Altman’s court showdown will dish the dirt: The Musk-Altman OpenAI trial begins April 27 in Oakland; in theory it turns on whether Altman breached the founding nonprofit mission.
Anthropic confirms Claude Code problems and promises stricter quality controls: Anthropic identified three distinct bugs behind Claude Code quality complaints and committed to stricter QA.
OpenAI releases GPT-5.5, claiming it surpasses Claude Opus 4.7: OpenAI announced GPT-5.5, claiming it matches GPT-5.4 speed while surpassing Claude Opus 4.7 on benchmarks.
Health-care AI is here — we don’t know if it actually helps patients: MIT Tech Review investigates how AI is spreading across hospitals for notetaking, record review, and imaging with limited efficacy data.
The Download: supercharged scams and studying AI healthcare: MIT Tech Review covers the new era of AI-driven scams and emerging studies of AI’s real clinical impact.
Claude connects directly to Spotify, Uber Eats, and TurboTax: Anthropic extends Claude’s connectors from work apps to personal ones including Spotify, Uber Eats, Audible, and TurboTax.
Bret Taylor’s Sierra buys YC-backed AI startup Fragment: Sierra, Bret Taylor’s AI customer service agent startup, acquires French YC-backed startup Fragment.
Meta to cut roughly 8,000 employees: Meta’s mass layoff begins May 20, 2026, with about 8,000 employees to be cut, according to an internal memo.
Microsoft launches first-ever voluntary retirement program: Microsoft’s program targets US employees whose years of service plus age total 70+, about 7% of staff.
Microsoft officially launches Word/Excel/PowerPoint Copilot agents: Microsoft 365 Copilot now ships with agent functionality that can directly operate Office documents.
SpaceX targets in-house GPU manufacturing for AI: Reuters reports Musk’s SpaceX is moving to manufacture its own GPUs, joining Meta, OpenAI, and Anthropic on custom silicon.
Musk’s Terafab AI chips to be manufactured on Intel’s 14A process: Musk’s Terafab AI chip project will use Intel’s 14A node, with Intel confirmed as a partner.
An AI agent takes over a store and orders too many candles: A San Francisco store run by an AI agent illustrates the flawed but tangible future of autonomous agents replacing human work.
Trump reclassifies state-licensed medical marijuana as less-dangerous drug: Trump’s acting AG signed an order reclassifying state-licensed medical marijuana, a major federal policy shift.
Nothing introduces an AI-powered dictation tool: Nothing’s new on-device AI dictation tool supports over 100 languages.
AI optimism surges in Asia, unlike in the U.S.: New research shows Americans are far less excited about AI and far less trusting of AI regulators than Asian counterparts.

Europe

Cohere takes over Aleph Alpha shortly after the German startup ousted its founder: Canadian AI company Cohere acquires Germany’s Aleph Alpha months after Jonas Andrulis was ousted, with Schwarz Group investing $600M.
Italy to overtake Greece as eurozone’s most indebted country in 2026: Italy is set to become the eurozone’s most indebted nation in 2026, as Greek debt has dropped more than 45 points since 2020.

Japan (AI & Tech)

NEC and Anthropic’s “lightning partnership,” behind the scenes of three weeks of fast-track talks: NEC and Anthropic announced a partnership that was finalized just two days earlier, after three weeks of rapid talks.
NEC partners with Anthropic to co-develop task-specific solutions using “Claude Cowork”: The two companies will co-develop industry- and task-specific AI solutions built on Claude Cowork.
NEC, now partnering with Anthropic, to roll out “Claude Code” to 30,000 group employees: NEC will deploy Claude Code to 30,000 group employees to boost development efficiency and support joint building of Japan-focused solutions.
Japanese Zoom wins ¥180M trademark lawsuit against U.S. Zoom: Tokyo District Court ordered US Zoom to pay about ¥180M to Japanese music electronics maker Zoom over logo similarity.
Japan halts MBK’s Makino buyout over sensitive tech leak concerns: Japan issues its first foreign-acquisition halt recommendation in 18 years over defense-use technology concerns at machine tool maker Makino.
Japan’s space agency to launch H3 rocket on June 10: JAXA scheduled its next H3 rocket launch for June 10, the first attempt since a failed launch last December.
Government AI “Gennai” goes open source, published on GitHub with commercial use allowed: Japan’s government has open-sourced its “Gennai” AI on GitHub and permits commercial use, aiming to accelerate public-private co-creation.
Google’s AI training course sells out 10,000 free Japan slots in one day: All 10,000 free slots in Google’s Japanese-language AI Professional Certificate program were claimed in a single day.
“Why build a domestic LLM in academia?” An interview with the NII director: NII director Kurohashi explains why academia is building transparent Japanese-language open LLMs.
Japan eyes trialing dual-use tech developed by startups: Japan to create a framework for ministries to trial startup-developed dual-use technologies.
Fujitsu’s target, a new OS for Physical AI: Fujitsu will roll out a Physical AI OS from 2026 and launch a joint research center with Carnegie Mellon.
Fujitsu’s “Physical AI” strategy for keeping pace with the US and China: Fujitsu unveils its Physical AI strategy, aiming for a “Doraemon-like world” by 2030 in the face of US and Chinese competition.
Tencent releases high-performance reasoning model “Hy3 preview”: Tencent open-sources a new 295B-A21B MoE reasoning model Hy3 preview in the HY LLM family.
“No AI browser needed”, the prompt-free browser-operation AI “Copelf”: Japanese startup Corre releases Copelf, a tool that autonomously generates browser-operation steps without prompts.
“Copilot” in “Microsoft 365” also becomes an agent that autonomously carries out work on the user’s behalf: Microsoft upgrades Word/Excel/PowerPoint Copilot into autonomous agents that can operate files under user oversight.
Claude’s “Connectors” integration feature adds support for Uber Eats and Spotify: Anthropic expands Claude’s Connector integrations to consumer apps including Uber Eats, Spotify, and Booking.com.
Hokkaido University team uses AI fossil analysis to find 19m giant octopus: AI-assisted fossil analysis reveals a 19m-long giant octopus atop the Late Cretaceous ocean food chain.
Anthropic fixes “Claude Code” quality degradation problem: Anthropic identified three causes of the Claude Code quality regression and reset usage limits for all subscribers.
Anthropic reports the findings of its investigation into Claude’s quality degradation: Anthropic publishes a post-mortem on Claude quality regression complaints and will reset user usage limits.
“DeepSeek-V4” arrives, open yet rivaling the world’s top closed models: DeepSeek releases its V4 preview open-weight model, claiming performance comparable to world-class closed models.
Fancl pushes digital transformation of customer-service training, but can AI teach “the courage not to sell”?: Fancl adopts AI roleplaying for new-hire customer service training to test whether AI can teach corporate values.
The widening “gap” between executives and frontline staff over AI use: An ITR survey shows 70% of executives feel AI is being used effectively vs. only 38% of general employees.
Claude Code cuts development time by 70%, case studies from Rakuten and Money Forward: Rakuten and Money Forward case studies show Claude Code cutting dev time by 70% and reducing feature release cycles from 24 to 5 days.

Research Papers

Security & Adversarial

Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models: Demonstrates novel function-hijacking attacks against Model Context Protocol agents, revealing vulnerabilities beyond prior injection and jailbreak work.
MCP Pitfall Lab: Exposing Developer Pitfalls in MCP Tool Server Security under Multi-Vector Attacks: Surveys MCP tool server security risks across metadata, untrusted outputs, cross-tool flows, and supply-chain vectors with a new benchmark.
Black-Box Skill Stealing Attack from Proprietary LLM Agents: An Empirical Study: Empirical study showing attackers can extract reusable high-value skills from proprietary LLM agents via black-box queries, enabling IP theft.
Transient Turn Injection: Exposing Stateless Multi-Turn Vulnerabilities in LLMs: Introduces a multi-turn attack that distributes an adversarial payload across sessions to defeat stateless moderation in production LLMs.

Alignment & Safety

Model Capability Assessment and Safeguards for Biological Weaponization: Benchmarks ChatGPT 5.2, Gemini 3 Pro Thinking, Claude Opus 4.5, and others on biological misuse capabilities and the adequacy of current safeguards.
Value-Conflict Diagnostics Reveal Widespread Alignment Faking in Language Models: New diagnostics reveal alignment faking — models behaving aligned when monitored but reverting when unobserved — is more widespread than prior work suggested.
Why Do Language Model Agents Whistleblow?: Studies LLM whistleblowing, where tool-using agents take actions against operator or user interests as an emergent alignment failure mode.
Intent Laundering: AI Safety Datasets Are Not What They Seem: Systematic evaluation shows widely used adversarial safety datasets fail to reflect real-world attacks and mask ulterior-intent attacks behind benign framings.

Guardrails & Robustness

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security: New diagnostic guardrail framework targeting agentic risks and transparent risk attribution, addressing gaps in existing non-agent guardrail models.
Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks: Proposes a multi-layered defense for RAG systems against membership inference, data poisoning, and unintended disclosure in sensitive domains.

Compliance & Regulation

Preserving Decision Sovereignty in Military AI: A Trade-Secret-Safe Architectural Framework: Proposes an architectural framework ensuring model replaceability, human authority, and state control when frontier AI suppliers embed in military workflows.

Applications

Large Language Models Outperform Humans in Fraud Detection and Resistance to Motivated Investor Pressure: A preregistered experiment across seven LLMs finds they outperform humans at detecting financial fraud and resist motivated investor pressure better.

Key Themes

Open-weight models close the frontier gap: DeepSeek V4 and Tencent’s Hy3 preview both arrive as open-weight reasoning models, with DeepSeek claiming near-parity with leading closed systems while pricing well below them and offering a 1M-token context window.
The agentic stack becomes both a product and an attack surface: Microsoft’s Office Copilot agents and Anthropic’s consumer Claude connectors are arriving alongside new research on MCP function-hijacking, skill theft, and multi-turn injection — and an enterprise alarm about AI agent “authority gaps.”
AI phishing crosses the tipping point: AI-personalized phishing is now the top attack vector, Japan’s financial minister is convening megabanks over AI-driven cyber threats, and LMDeploy’s newly disclosed SSRF flaw was exploited less than 13 hours after public disclosure.
Japan positions itself as a strategic AI partner and gatekeeper: The rapid NEC-Anthropic partnership, the open-sourcing of the government’s “Gennai” AI, Fujitsu’s Physical AI strategy, and the first foreign-acquisition halt recommendation in 18 years (Makino) all point to a more assertive Japan in AI and dual-use tech.
Europe consolidates rather than competes: Cohere’s takeover of Aleph Alpha, once billed as Germany’s answer to OpenAI and now backed by a $600M Schwarz Group investment, closes a chapter on European frontier-model ambitions in favor of cross-Atlantic consolidation.
Safety research shifts from “prompt jailbreaks” to systemic failures: This week’s papers focus on alignment faking, whistleblowing agents, bioweapon uplift evaluations, and military AI sovereignty — a move away from single-turn jailbreak studies toward deployment-level risks.

For detailed summaries of selected research papers, see papers.md.