AI News Digest — March 8, 2026
Highlights
- OpenAI’s Robotics Chief Quits Over Pentagon Deal: Caitlin Kalinowski, head of hardware and robotics at OpenAI, resigned citing concerns about mass surveillance and lethal autonomy after the company’s military contract with the DoD was approved without sufficient internal deliberation.
- Pentagon-Anthropic Controversy Chills Startup-Defense Relations: The Anthropic-Pentagon standoff is prompting broader questions about whether AI startups can navigate federal contracts without compromising their safety missions.
- Luma AI’s Uni-1 Beats GPT Image 1.5 on Logic Benchmarks: Luma AI’s new unified image understanding-and-generation model outperforms OpenAI and Google on logic-based visual benchmarks using a single integrated architecture.
- AI Agent Benchmarks Ignore 92% of the Labor Market: A large-scale study finds that AI agent development is almost entirely focused on programming tasks, leaving the vast majority of real-world occupations unevaluated.
- Hallucinated References Are Passing Peer Review at Top AI Conferences: Fake citations generated by LLMs are slipping through review at major AI venues, and commercial models including GPT, Gemini, and Claude cannot reliably detect their own fabrications.
News
AI Security
- Hackers Abuse .arpa DNS and IPv6 to Evade Phishing Defenses (BleepingComputer): Threat actors are exploiting the special-use .arpa domain and IPv6 reverse DNS lookups in phishing campaigns designed to bypass domain reputation checks and email security gateways.
- EU Court Adviser: Banks Must Immediately Refund Phishing Victims (BleepingComputer): The Advocate General of the CJEU issued a formal opinion that banks should refund unauthorized phishing-related transactions even when account holders bear some fault — a significant shift in liability framing.
- Engineer Threatened Legally After Reporting Insurance Portal Vulnerability (Gigazine): A platform engineer who discovered and reported a serious personal data leak in an insurance company’s portal received a letter from the firm’s data protection officer suggesting his disclosure could constitute a criminal offense.
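The .arpa evasion above works because every IP address maps to a legitimate reverse-DNS name under the special-use .arpa zone, and IPv6 names under ip6.arpa are long nibble-reversed strings that naive domain-reputation filters may never have scored. A minimal sketch of both the mapping and a defensive check, using only Python's standard `ipaddress` module (the `looks_like_arpa_link` helper is a hypothetical illustration, not a reference to any product mentioned in the article):

```python
import ipaddress

# Every IPv6 address maps to a reverse-DNS name under the special-use
# ip6.arpa zone: each hex nibble of the expanded address, reversed and
# dot-separated. These names are legitimate DNS infrastructure, which is
# one reason domain-reputation checks may treat them as benign.
addr = ipaddress.ip_address("2001:db8::1")
print(addr.reverse_pointer)  # ends in ".ip6.arpa"

# Hypothetical defensive check: flag .arpa hostnames showing up where a
# user-facing link would normally appear in an email.
def looks_like_arpa_link(hostname: str) -> bool:
    labels = hostname.lower().rstrip(".").split(".")
    return (
        len(labels) >= 2
        and labels[-1] == "arpa"
        and labels[-2] in ("ip6", "in-addr")
    )

print(looks_like_arpa_link(addr.reverse_pointer))
```

The check is deliberately coarse: .arpa names are almost never legitimate link targets in end-user mail, so even a crude allow/deny rule closes this particular gap without a reputation database.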
USA
- Will the Pentagon-Anthropic Controversy Scare Startups Away from Defense Work? (TechCrunch): The high-profile clash between Anthropic’s stated safety mission and its Pentagon contract is now being discussed as a cautionary tale for AI startups weighing federal partnerships.
- OpenAI Robotics Chief Quits Over Military Deal (The Decoder): Caitlin Kalinowski stepped down after OpenAI finalized a DoD contract that, she says, received insufficient internal scrutiny around mass surveillance and lethal autonomy applications.
- A Roadmap for AI, If Anyone Will Listen (TechCrunch): The Pro-Human Declaration — a structured AI governance roadmap backed by researchers including Max Tegmark — was finalized just before the Pentagon-Anthropic controversy broke, making its timing unexpectedly resonant.
- Luma AI’s Uni-1 Tops Competitors on Logic-Based Image Benchmarks (The Decoder): Uni-1 combines image understanding and generation in a single architecture that reasons through prompts during creation, outperforming GPT Image 1.5 and Google’s Nano Banana 2 on logic-based evaluations.
- Meta: Unlabeled Video Is the Next Massive LLM Training Frontier (The Decoder): Meta FAIR researchers trained a multimodal model from scratch and found that video — not text — may be the key to overcoming data scarcity, challenging several long-held assumptions about multimodal AI development.
- AI Agent Benchmarks Obsess Over Coding, Ignoring 92% of the Labor Market (The Decoder): A comprehensive study reveals that current AI agent benchmarks are dangerously narrow, leaving most human job categories — healthcare, education, trades — unrepresented in evaluation frameworks.
- Hallucinated References Are Passing Peer Review at AI Conferences (The Decoder): Fake LLM-generated citations are appearing in accepted papers at leading venues; open-source tool CiteAudit claims to catch fabrications that GPT, Gemini, and Claude miss.
- Google Gives Sundar Pichai a $692M Pay Package (TechCrunch): The compensation is largely performance-based and includes new stock incentives tied to Waymo and Wing, signaling Google’s commitment to autonomous vehicles and drone delivery alongside AI.
- ICE Detention Firm Eyes AI Data Center “Man Camps” as Growth Market (TechCrunch): The operator of ICE detention facilities is pivoting toward modular worker housing for AI data center construction crews — a convergence of immigration infrastructure and compute buildout raising ethical questions.
- Joseph Weizenbaum on AI and Delusional Thinking (Simon Willison): A resonant 1976 observation from ELIZA’s creator resurfaces: “extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people.”
Europe
- Sony Faces £2.1B ($2.7B) Class-Action Suit from UK PlayStation Users (Japan Times): UK PlayStation users are pursuing one of the largest consumer tech class-action suits in British history, alleging Sony systematically overcharged for digital games and in-game content for nearly a decade.
Japan — AI & Tech
- Using Claude AI to Audit Chrome Extension Permissions (ITmedia AI+): An engineer used the Claude Cowork AI agent to analyze the permissions and runtime behavior of Chrome extensions installed on their PC — surfacing a new use case for AI-powered local security auditing.
- Free Chrome Built-In AI Translation Plugin for WordPress (Gigazine): The “Multilingual AI Translator” WordPress plugin leverages Chrome’s built-in AI APIs to provide zero-cost multilingual translation with bulk conversion and SEO optimization features.
- llmfit: Terminal Tool That Recommends AI Models Based on Your Hardware (Gigazine): A new terminal utility analyzes a system’s RAM, CPU, and GPU specs and recommends which local AI models can run comfortably — lowering the barrier to local inference for non-experts.
- Local Japanese Governments Bypassing Drone Permission Rules During Emergencies (Japan News): An increasing number of municipal governments are deploying drones without prior central government approval during bear sightings, wildfires, and other urgent situations — highlighting tension between rapid-response needs and existing tech regulation.
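The article doesn’t detail llmfit’s internals, but the core heuristic behind any such hardware-to-model matcher can be sketched in a few lines: estimate each model’s memory footprint from its parameter count and quantization bit-width, add headroom for runtime state such as the KV cache, and keep the models that fit in available RAM. Everything below — the candidate list, the 4-bit default, and the 20% overhead factor — is an illustrative assumption, not llmfit’s actual logic:

```python
# Hypothetical candidate models: (name, parameter count). Illustrative only.
CANDIDATES = [
    ("llama-3-8b", 8e9),
    ("mistral-7b", 7e9),
    ("llama-3-70b", 70e9),
]

def footprint_gb(params: float, bits: int = 4) -> float:
    """Rough memory footprint of a quantized model in GB."""
    weights = params * bits / 8 / 1e9  # quantized weight storage
    return weights * 1.2               # +20% assumed headroom (KV cache, runtime)

def models_that_fit(ram_gb: float, bits: int = 4) -> list[str]:
    """Return candidate models whose estimated footprint fits in ram_gb."""
    return [name for name, p in CANDIDATES if footprint_gb(p, bits) <= ram_gb]

print(models_that_fit(16))  # e.g. a 16 GB laptop
```

On a 16 GB machine this keeps the 7B- and 8B-class models (roughly 4–5 GB at 4-bit) and drops the 70B model (roughly 42 GB); a real tool would additionally probe GPU VRAM and CPU features rather than take RAM as a parameter.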
Key Themes
- AI and military ethics: The resignation of OpenAI’s robotics chief and the Anthropic-Pentagon controversy mark a turning point in how the industry grapples with dual-use AI contracts. The question of what internal governance processes must look like before signing defense deals is now live.
- Benchmark validity crisis: Two separate stories — one on labor market coverage gaps in agent benchmarks, another on hallucinated citations in peer-reviewed AI papers — expose a systemic credibility problem in how the field evaluates itself.
- AI compute infrastructure expansion: From Meta’s pivot to video training data to AI “man camps” and the physics of space-based data centers, the logistics of scaling AI are increasingly visible as a distinct challenge.
- Security and disclosure norms under pressure: Vulnerability reporters face legal threats, phishing techniques evolve to outpace gateway defenses, and the EU moves to shift liability onto financial institutions — the security landscape is in flux on multiple fronts.
- Japan’s regulatory and drone frontier: Japanese municipalities are quietly rewriting drone deployment norms through emergency use, while AI tools for local hardware and browser security gain traction among Japanese users.
For detailed summaries of selected research papers, see papers.md.