AI & Tech News Digest — 2026-05-06

Highlights

Five major AI labs grant US government pre-release model access: Google DeepMind, Microsoft, and xAI joined Anthropic and OpenAI in providing models with reduced safety guardrails to the Center for AI Standards and Innovation for classified national-security testing.
GPT-5.5 Instant becomes ChatGPT’s default with a claimed 52.5% fewer hallucinations: OpenAI says its new default model sharply reduces fabrication on high-risk topics like medicine and law, and adds new “memory sources” controls that let users see which stored context shaped each answer.
White House drafts executive order for government AI model review: After a year of deregulation, the administration is now considering pre-release government review of frontier models, reportedly triggered by Anthropic’s “Mythos” cybersecurity model.
DAEMON Tools installers compromised in month-long supply-chain attack: Attackers have distributed trojanized, signed installers from the official site since April 8, delivering a backdoor to thousands of users. Kaspersky researchers say the payloads were signed with legitimate developer certificates.
Pennsylvania sues Character.AI after chatbot posed as a licensed psychiatrist: State AG’s filing alleges a Character.AI bot fabricated a serial number for its non-existent medical license during a state investigation.

News

AI Security

Five major labs grant US pre-release access for security testing — Google, Microsoft, xAI join Anthropic and OpenAI in CAISI testing program for models with relaxed guardrails.
Google, Microsoft, xAI agree to US government model review — Commerce Department’s CAISI will run pre-deployment evaluations.
White House drafts AI model review executive order — Move follows Anthropic’s “Mythos” cybersecurity model.
Restricted access to defensive AI deepens global cyber gap — Anthropic’s Mythos and similar tools are limited to certain customers, leaving emerging-market institutions exposed.
Scan of 1 million exposed AI services finds weak security — Self-hosted LLM infrastructure rushed to production with major weaknesses.
Pennsylvania sues Character.AI over chatbot impersonating doctor — Bot allegedly fabricated a state medical license number.
Google raises Android exploit bounties to $1.5M, cuts payouts for bugs that AI makes easy to find — Reflects that AI lowers the cost of routine vulnerability discovery.
OAuth tokens from AI tools become a back-door attack surface — AI/automation integrations leave persistent OAuth tokens with no expiration that bypass MFA.
DarkSword iOS zero-day exploit chain — Government-grade malware targeting iOS observed since November 2025.
DAEMON Tools trojanized in supply-chain attack — Signed official installers backdoored since April 8.
DAEMON Tools attack details from Kaspersky — Payloads signed with legitimate developer certs.
Trellix source-code breach raises supply-chain concern — Could expose how the security product’s controls and detections are designed.
Quasar Linux (QLNX) targets developers — Rootkit/backdoor/credential-stealer combo.
Instructure hacker claims data on 8,800 schools — 280M records allegedly stolen.
Apache HTTP/2 flaw CVE-2026-23918 enables possible RCE — Double-free in HTTP/2 handling, CVSS 8.8.
Microsoft Edge stores passwords in process memory — PoC shows admin can extract them.
China-linked UAT-8302 hits governments in South America and SE Europe — Cisco Talos attribution.
Karakurt ransomware negotiator gets 8.5 years — Latvian extradited to US.
Microsoft phishing campaign hits 35,000 users in 26 countries — Code-of-conduct lures via legitimate email services.
ScarCruft pushes BirdCall via game-platform supply chain — North Korean APT37 expands to Android.
TheHackerNews on ScarCruft trojanizing a gaming platform — Targeting ethnic Koreans in China.
Vimeo breach exposes 119,000 records — ShinyHunters extortion.
FTC bans Kochava from selling location data — Settlement covers hundreds of millions of devices.
Taiwan student hacked high-speed rail TETRA, triggered emergency brakes — 23-year-old arrested.
CloudZ RAT abuses Microsoft Phone Link to steal SMS/OTPs — New “Pheno” plugin.
MetInfo CMS RCE CVE-2026-29014 exploited — Unauthenticated PHP code injection (CVSS 9.8).
Weaver E-cology RCE CVE-2026-22679 actively exploited — Targeting OA platform; exploitation since March.
Weaver E-cology bug also tracked by BleepingComputer — Discovery commands run on victim systems.
Cybercriminals fueling physical cargo theft — Transnational syndicates are rerouting goods using compromised supply-chain systems.
Amazon SES abused for evasive phishing — Bypasses reputation-based filters.
USB pen-test story retrospective — Looking back at Stasiukonis’ rigged-thumb-drive credit-union test.
EOL software creates blind spots in CVE feeds — Critical bugs in unsupported open source missed by SCA tools.

USA

GPT-5.5 Instant rolled out with fewer hallucinations — New default ChatGPT model with memory sources controls.
OpenAI claims 52.5% drop in hallucinated claims for new model — Internal evaluations on high-risk topics.
GPT-5.5 Instant System Card — OpenAI’s safety document for the release.
TechCrunch on GPT-5.5 Instant rollout — Reduced hallucinations in law, medicine, finance.
Apple to pay $250M for not delivering Apple Intelligence Siri — Class-action settlement covers iPhone 16 and 15 Pro buyers.
Apple plans third-party AI models in iOS 27 — Users would pick which model powers Apple Intelligence.
Verge on Apple’s iOS 27 third-party chatbot plan — Bloomberg’s Mark Gurman reports system-wide third-party chatbot support.
Microsoft kills Xbox Copilot — New Xbox CEO winds down Copilot mobile and console development.
Anthropic ships ten preconfigured AI agents for finance — Investment banking, asset management, insurance workflows.
OpenAI reportedly planning a phone with MediaTek/Qualcomm/Luxshare — Mass production targeted for early 2027.
OpenAI phone rumors via Verge — Ming-Chi Kuo says 30M units over first two years.
OpenAI launches new self-serve ChatGPT Ads Manager — Beta CPC bidding with ads-conversation separation.
OpenAI and PwC partner on CFO automation — AI agents for forecasting and controls.
Etsy launches a native ChatGPT app — Conversational shopping experience.
PayPal pitches AI-led turnaround for $1.5B in savings — Tied to job cuts and tech-stack modernization.
Five book publishers and an author sue Meta over Llama training — Macmillan, McGraw Hill, Elsevier, Hachette among plaintiffs.
ElevenLabs adds BlackRock, Foxx, Longoria as investors at $500M ARR — Voice AI gains enterprise traction.
Google Home Gemini upgraded to 3.1 — Multi-step requests handled.
Meta uses AI to flag minors via bone structure and body size — Visual analysis without facial recognition.
TechCrunch on Meta’s age-detection rollout — Currently in select countries.
Anthropic co-founder essay: recursive self-improvement plausible by 2028 — Jack Clark puts odds at 60%.
Inside the Musk v. Altman trial week one — MIT TR on courtroom dynamics.
Blueprint for AI strengthening democracy — MIT TR essay on info-shifts and governance.
CopilotKit raises $27M Series A for app-native agents — Led by Glilot, NFX, and SignalFire.
Cerebras heading for blockbuster IPO — Could value the company at $26.6B or more; deep OpenAI ties.
Amazon brings agentic fine-tuning to SageMaker — Llama, Qwen, DeepSeek, Nova supported.
Eli Lilly: AI savings are in pharma manufacturing, not the lab — Drug discovery still hype-laden.
Microsoft at NSDI 2026 on large-scale networked systems — Datacenter and networking research with AI focus.
Google funds $3.5M XPRIZE Future Vision film competition — With Range Media Partners.
Verge podcast on AI car design — How AI accelerates the five-year vehicle cycle.
Jensen Huang: AI is creating jobs, not destroying them — Counter-narrative to displacement fears.
AI’s hottest privates have a crypto shadow market — Token-style proxies for Anthropic, OpenAI, SpaceX exposure.

Europe

ASML CEO Fouquet: “no one is coming for us” on EUV monopoly — Milken-stage interview on lithography dominance.
SAP acquires Dremio (lakehouse) and Prior Labs (AI) — Enterprise AI-ready data platform push.
The EU needs new security partners — Commentary on broader security partnerships beyond NATO.

Japan (AI & Tech)

Anthropic and Blackstone establish enterprise AI services firm — New company will support Claude rollout to Japanese SMBs, complementing existing partner network.
Most popular AI agents in Japanese SMBs: Google Agentspace ranks 2nd — ITmedia survey on adopted/planned AI agents and IT spending.
ASUS Zenbook DUO UX8407 dual-display laptop reviewed for portability — Benchmarks Intel Core Ultra X9 388H AI processing under load.
Puget Systems Docker App Packs for one-command local-AI environments — Open-source toolkit auto-builds GPU-enabled containers for ComfyUI, Open WebUI, local LLM servers.

Research Papers

Benchmarks & Evaluation

Using LLMs for embodied planning introduces systematic safety risks — DESPITE benchmark of 12,279 tasks shows even near-perfect planners fail safety checks; 23 models evaluated.
Toward a Principled Framework for Agent Safety Measurement — Argues agent safety should be measured by search, not sampling: single safe/unsafe rates miss long-tail unsafe trajectories.

Security & Adversarial

SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking — Systematically reuses successful and failed attack experience to evolve jailbreaks against aligned models.
ContextualJailbreak: Evolutionary Red-Teaming via Conversational Priming — Automates the multi-turn priming attacks previously shown to succeed with hand-crafted scaffolds.
Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration — A single untrusted tool call (e.g. crafted email) plants a dormant payload in agent long-term memory.
Autonomous LLM Agent Worms: Cross-Platform Propagation — First demonstration of attacker content propagating through persistent agent state via scheduled autoloading.

Compliance & Regulation

GPU Fingerprinting for Location Verification — Hardware fingerprints replace extractable on-chip keys for AI-chip governance and unauthorized-deployment detection.

Alignment & Safety

RefusalGuard: Geometry-Preserving Fine-Tuning for Safety — Investigates how downstream fine-tuning degrades refusal behavior in activation space and proposes a fix.
Logit-Gap Steering: Forward-Pass Diagnostic for Alignment Robustness — Refusal-affirmation logit gap quantifies per-prompt safety margin; alignment widens it on 97.5–99.8% of toxic prompts.
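
As a rough illustration of the logit-gap idea in the last item, the sketch below computes a refusal-affirmation gap from first-token logits. It is one plausible reading, not the paper's method: the token sets, the log-sum-exp aggregation, the helper name `logit_gap`, and all logit values are assumptions made up for this example.

```python
import math

# Hypothetical token sets; the paper's actual token choice is not given in this digest.
REFUSAL_TOKENS = ("Sorry", "Unfortunately")
AFFIRMATION_TOKENS = ("Sure", "Here")


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty sequence."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def logit_gap(first_token_logits):
    """Aggregate refusal logit minus aggregate affirmation logit.

    Positive means the model leans toward refusing; negative means complying.
    """
    refusal = logsumexp([first_token_logits[t] for t in REFUSAL_TOKENS])
    affirmation = logsumexp([first_token_logits[t] for t in AFFIRMATION_TOKENS])
    return refusal - affirmation


# Made-up first-token logits for one toxic prompt, before and after alignment.
base = {"Sorry": 1.0, "Unfortunately": 0.5, "Sure": 2.0, "Here": 1.5}
aligned = {"Sorry": 4.0, "Unfortunately": 3.5, "Sure": 0.0, "Here": -0.5}

print(logit_gap(base) < 0)                   # base model leans toward complying
print(logit_gap(aligned) > logit_gap(base))  # alignment widened the gap
```

Under this reading, a larger positive gap is a wider per-prompt safety margin, so the share of toxic prompts where the aligned gap exceeds the base gap would correspond to the kind of percentage the summary reports.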

Applications

AgenticVM: Adaptive Software Vulnerability Management — Multi-agent LLM framework with BERT-based CVSS prediction for triage automation in SOCs.

Guardrails & Robustness

LocalAlign: Generalizable Prompt Injection Defense via Near-Target Adversarial Alignment — Trains models to maintain trust/data boundaries against prompt injection in tool-using agents.
Architectural Obsolescence of Unhardened Agentic-AI Runtimes — Shows the most-engineered public agent gateway catches none of four divergence patterns between actions and audit records.

Key Themes

Government oversight is hardening: Five major labs now share pre-release models with the US government for red-teaming, and a White House executive order on model review is being drafted.
Factuality and personalization advance simultaneously: GPT-5.5 Instant pairs claimed factuality gains with deeper personal-context features, raising new privacy and reliability questions.
Supply chains are the soft target: DAEMON Tools’ signed installers, Trellix source code, and ScarCruft’s gaming-platform pivot all show attackers moving up the trust chain.
Agent safety as research frontier: Multiple papers on agent worms, persistent-memory trojans, and embodied-planning safety converge on the same finding — alignment doesn’t survive contact with stateful, tool-using deployment.
AI regulation arrives via courts, not just policy: Pennsylvania v. Character.AI, Apple’s Siri settlement, and the publishers v. Meta suit are setting limits on AI products while legislatures lag.
Hardware governance enters the threat model: GPU fingerprinting research and bounty restructuring (Google’s $1.5M Android tier) reflect how AI is reshaping where defenders and attackers spend effort.

For detailed summaries of selected research papers, see papers.md.