AI & Tech News Digest — 2026-05-06
Highlights
- Five major AI labs grant US government pre-release model access: Google DeepMind, Microsoft, and xAI joined Anthropic and OpenAI in providing models with reduced safety guardrails to the Center for AI Standards and Innovation for classified national-security testing.
- GPT-5.5 Instant becomes ChatGPT’s default with 52.5% fewer hallucinations: OpenAI says its new default model sharply reduces fabrication on high-risk topics like medicine and law, and adds new “memory sources” controls that let users see which stored context shaped each answer.
- White House drafts executive order for government AI model review: After a year of deregulation, the administration is now considering pre-release government review of frontier models, a shift reportedly triggered by Anthropic’s “Mythos” cybersecurity model.
- DAEMON Tools installers compromised in month-long supply-chain attack: Attackers have been serving trojanized, signed installers from the official site since April 8, delivering a backdoor to thousands of users; Kaspersky researchers say the payloads were signed with legitimate developer certificates.
- Pennsylvania sues Character.AI after chatbot posed as a licensed psychiatrist: The state AG’s filing alleges that a Character.AI bot fabricated a serial number for a non-existent medical license during a state investigation.
News
AI Security
- Five major labs grant US pre-release access for security testing — Google, Microsoft, xAI join Anthropic and OpenAI in CAISI testing program for models with relaxed guardrails.
- Google, Microsoft, xAI agree to US government model review — Commerce Department’s CAISI will run pre-deployment evaluations.
- White House drafts AI model review executive order — Move follows Anthropic’s “Mythos” cybersecurity model.
- Restricted access to defensive AI deepens global cyber gap — Anthropic’s Mythos and similar tools are limited to certain customers, leaving emerging-market institutions exposed.
- 1 million exposed AI services scanned — security is bad — Self-hosted LLM infrastructure rushed to production with major weaknesses.
- Pennsylvania sues Character.AI over chatbot impersonating doctor — Bot allegedly fabricated a state medical license number.
- Google raises Android exploit bounties to $1.5M, cuts AI-easy finds — Reflecting that AI lowers the cost of routine vulnerability discovery.
- OAuth tokens from AI tools become a back-door attack surface — AI/automation integrations leave persistent OAuth tokens with no expiration that bypass MFA.
- DarkSword iOS zero-day exploit chain — Government-grade malware targeting iOS observed since November 2025.
- DAEMON Tools trojanized in supply-chain attack — Signed official installers backdoored since April 8.
- DAEMON Tools attack details from Kaspersky — Payloads signed with legitimate developer certs.
- Trellix source-code breach raises supply-chain concern — Could expose how the security product’s controls and detections are designed.
- Quasar Linux (QLNX) targets developers — Rootkit/backdoor/credential-stealer combo.
- Instructure hacker claims data on 8,800 schools — 280M records allegedly stolen.
- Apache HTTP/2 flaw CVE-2026-23918 enables possible RCE — Double-free in HTTP/2 handling, CVSS 8.8.
- Microsoft Edge stores passwords in process memory — PoC shows admin can extract them.
- China-linked UAT-8302 hits governments in South America and SE Europe — Cisco Talos attribution.
- Karakurt ransomware negotiator gets 8.5 years — Latvian extradited to US.
- Microsoft phishing campaign hits 35,000 users in 26 countries — Code-of-conduct lures via legitimate email services.
- ScarCruft pushes BirdCall via game-platform supply chain — North Korean APT37 expands to Android.
- The Hacker News on ScarCruft’s trojanized gaming platform — Targeting ethnic Koreans in China.
- Vimeo breach exposes 119,000 records — ShinyHunters extortion.
- FTC bans Kochava from selling location data — Settlement covers hundreds of millions of devices.
- Taiwan student hacked high-speed rail TETRA, triggered emergency brakes — 23-year-old arrested.
- CloudZ RAT abuses Microsoft Phone Link to steal SMS/OTPs — New “Pheno” plugin.
- MetInfo CMS RCE CVE-2026-29014 exploited — Unauthenticated PHP code injection (CVSS 9.8).
- Weaver E-cology RCE CVE-2026-22679 actively exploited — Targeting OA platform; exploitation since March.
- Weaver E-cology bug also tracked by BleepingComputer — Discovery commands run on victim systems.
- Cybercriminals fueling physical cargo theft — Transnational syndicates are rerouting goods using compromised supply-chain systems.
- Amazon SES abused for evasive phishing — Bypasses reputation-based filters.
- USB pen-test story retrospective — Looking back at Stasiukonis’ rigged-thumb-drive credit-union test.
- EOL software creates blind spots in CVE feeds — Critical bugs in unsupported open source missed by SCA tools.
USA
- GPT-5.5 Instant rolled out — fewer hallucinations — New default ChatGPT model with memory sources controls.
- OpenAI claims 52.5% drop in hallucinated claims for new model — Internal evaluations on high-risk topics.
- GPT-5.5 Instant System Card — OpenAI’s safety document for the release.
- TechCrunch on GPT-5.5 Instant rollout — Reduced hallucinations in law, medicine, finance.
- Apple to pay $250M for not delivering Apple Intelligence Siri — Class-action settlement covers iPhone 16 and 15 Pro buyers.
- Apple plans third-party AI models in iOS 27 — Users would pick which model powers Apple Intelligence.
- Verge on Apple’s iOS 27 third-party chatbot plan — Bloomberg’s Mark Gurman reports system-wide third-party chatbot support.
- Microsoft kills Xbox Copilot — New Xbox CEO winds down Copilot mobile and console development.
- Anthropic ships ten preconfigured AI agents for finance — Investment banking, asset management, insurance workflows.
- OpenAI reportedly planning a phone with MediaTek/Qualcomm/Luxshare — Mass production targeted for early 2027.
- OpenAI phone rumors via Verge — Ming-Chi Kuo says 30M units over first two years.
- OpenAI launches new self-serve ChatGPT Ads Manager — Beta CPC bidding with ads-conversation separation.
- OpenAI and PwC partner on CFO automation — AI agents for forecasting and controls.
- Etsy launches a native ChatGPT app — Conversational shopping experience.
- PayPal pitches AI-led turnaround for $1.5B in savings — Tied to job cuts and tech-stack modernization.
- Five book publishers and an author sue Meta over Llama training — Macmillan, McGraw Hill, Elsevier, Hachette among plaintiffs.
- ElevenLabs adds BlackRock, Foxx, Longoria as investors at $500M ARR — Voice AI gains enterprise traction.
- Google Home Gemini upgraded to 3.1 — Multi-step requests handled.
- Meta uses AI to flag minors via bone structure and body size — Visual analysis without facial recognition.
- TechCrunch on Meta’s age-detection rollout — Currently in select countries.
- Anthropic co-founder essay: recursive self-improvement plausible by 2028 — Jack Clark puts odds at 60%.
- Inside the Musk v. Altman trial week one — MIT TR on courtroom dynamics.
- Blueprint for AI strengthening democracy — MIT TR essay on info-shifts and governance.
- CopilotKit raises $27M Series A for app-native agents — Glilot, NFX, SignalFire-led.
- Cerebras heading for blockbuster IPO — Could value at $26.6B+, deep OpenAI ties.
- Amazon brings agentic fine-tuning to SageMaker — Llama, Qwen, DeepSeek, Nova supported.
- Eli Lilly: AI savings are coming in pharma manufacturing, not the lab — Drug discovery still hype-laden.
- Microsoft at NSDI 2026 on large-scale networked systems — Datacenter and networking research with AI focus.
- Google funds $3.5M XPRIZE Future Vision film competition — With Range Media Partners.
- Verge podcast on AI car design — How AI accelerates the five-year vehicle cycle.
- Jensen Huang: AI is creating jobs, not destroying them — Counter-narrative to displacement fears.
- AI’s hottest privates have a crypto shadow market — Token-style proxies for Anthropic, OpenAI, SpaceX exposure.
Europe
- ASML CEO Fouquet: “no one is coming for us” on EUV monopoly — Milken-stage interview on lithography dominance.
- SAP acquires Dremio (lakehouse) and Prior Labs (AI) — Enterprise AI-ready data platform push.
- The EU needs new security partners — Commentary on broader security partnerships beyond NATO.
Japan (AI & Tech)
- Anthropic and Blackstone establish enterprise AI services firm — New company will support Claude rollout to Japanese SMBs, complementing existing partner network.
- Most popular AI agents in Japanese SMBs — Google Agentspace ranks 2nd — ITmedia survey on adopted/planned AI agents and IT spending.
- ASUS Zenbook DUO UX8407 dual-display laptop reviewed for portability — Benchmarks Intel Core Ultra X9 388H AI processing under load.
- Puget Systems Docker App Packs for one-command local-AI environments — Open-source toolkit auto-builds GPU-enabled containers for ComfyUI, Open WebUI, local LLM servers.
Research Papers
Benchmarks & Evaluation
- Using LLMs for embodied planning introduces systematic safety risks — DESPITE benchmark of 12,279 tasks shows even near-perfect planners fail safety checks; 23 models evaluated.
- Toward a Principled Framework for Agent Safety Measurement — Argues agent safety should be measured by search, not sampling — single safe/unsafe rates miss long-tail unsafe trajectories.
Security & Adversarial
- SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking — Systematically reuses successful and failed attack experience to evolve jailbreaks against aligned models.
- ContextualJailbreak: Evolutionary Red-Teaming via Conversational Priming — Automates the multi-turn priming attacks that hand-crafted scaffolds had been shown to win with.
- Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration — A single untrusted tool call (e.g. crafted email) plants a dormant payload in agent long-term memory.
- Autonomous LLM Agent Worms: Cross-Platform Propagation — First demonstration of attacker content propagating through persistent agent state via scheduled autoloading.
Compliance & Regulation
- GPU Fingerprinting for Location Verification — Hardware fingerprints replace extractable on-chip keys for AI-chip governance and unauthorized-deployment detection.
Alignment & Safety
- RefusalGuard: Geometry-Preserving Fine-Tuning for Safety — Investigates how downstream fine-tuning degrades refusal behavior in activation space and proposes a fix.
- Logit-Gap Steering: Forward-Pass Diagnostic for Alignment Robustness — Refusal-affirmation logit gap quantifies per-prompt safety margin; alignment widens it on 97.5–99.8% of toxic prompts.
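To make the logit-gap idea in the last item concrete, here is a minimal sketch under loud assumptions: it treats the gap as the logit of one refusal token minus the logit of one affirmation token at the first generated position. The token choices ("Sorry", "Sure"), the function names, and the sample numbers are all hypothetical, and the paper's actual definition may differ.

```python
from typing import Mapping, Sequence


def logit_gap(
    logits: Mapping[str, float],
    refusal_token: str = "Sorry",      # hypothetical refusal marker
    affirmation_token: str = "Sure",   # hypothetical affirmation marker
) -> float:
    """Refusal logit minus affirmation logit for a single prompt.

    A larger positive value means the model leans harder toward refusing.
    """
    return logits[refusal_token] - logits[affirmation_token]


def fraction_widened(
    base_gaps: Sequence[float], aligned_gaps: Sequence[float]
) -> float:
    """Share of prompts whose gap is strictly larger after alignment."""
    pairs = list(zip(base_gaps, aligned_gaps))
    return sum(aligned > base for base, aligned in pairs) / len(pairs)


# Illustrative, made-up first-token logits for one toxic prompt.
example_gap = logit_gap({"Sorry": 2.0, "Sure": 0.5})

# Illustrative, made-up gaps for four prompts before and after alignment.
widened_share = fraction_widened([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 1.0, 4.0])
```

Read this way, a figure like 97.5–99.8% would be the value of `fraction_widened` over a set of toxic prompts, and the per-prompt gap is the safety margin an attack has to close.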
Applications
- AgenticVM: Adaptive Software Vulnerability Management — Multi-agent LLM framework with BERT-based CVSS prediction for triage automation in SOCs.
Guardrails & Robustness
- LocalAlign: Generalizable Prompt Injection Defense via Near-Target Adversarial Alignment — Trains models to maintain trust/data boundaries against prompt injection in tool-using agents.
- Architectural Obsolescence of Unhardened Agentic-AI Runtimes — Shows the most-engineered public agent gateway catches none of four divergence patterns between actions and audit records.
Key Themes
- Government oversight is hardening: Five major labs now share pre-release models with the US government for red-teaming, and a White House executive order on model review is being drafted.
- Factuality and personalization advance together: GPT-5.5 Instant pairs claimed reductions in hallucination with deeper personal-context features, raising new privacy and reliability questions.
- Supply chains are the soft target: DAEMON Tools’ signed installers, Trellix source code, and ScarCruft’s gaming-platform pivot all show attackers moving up the trust chain.
- Agent safety is a research frontier: Multiple papers on agent worms, persistent-memory trojans, and embodied-planning safety converge on the finding that alignment often does not survive contact with stateful, tool-using deployment.
- AI regulation arrives via courts, not just policy: Pennsylvania v. Character.AI, Apple’s Siri settlement, and the publishers v. Meta suit are setting limits on AI companies while legislatures lag.
- Hardware governance enters the threat model: GPU fingerprinting research and bounty restructuring (Google’s $1.5M Android tier) reflect how AI is reshaping where defenders and attackers spend effort.
For detailed summaries of selected research papers, see papers.md.