AI News Digest — March 10, 2026

Highlights

Yann LeCun’s AMI Labs Raises $1.03 Billion: LeCun’s startup secures a landmark $1B+ round backed by Nvidia and Toyota to develop world models as an alternative path to general intelligence beyond large language models.
OpenAI Acquires AI Security Startup Promptfoo: OpenAI buys Promptfoo to bolster security testing for its Frontier agentic platform, signaling a major push to harden AI systems against adversarial exploitation.
Microsoft March 2026 Patch Tuesday: 2 Zero-Days, 79 Flaws: This month’s Patch Tuesday patches two actively exploited zero-day vulnerabilities alongside 79 total flaws, including critical issues addressed in Windows 10 and 11 extended security updates.
Anthropic Sues US Government Amid Escalating AI Legal Battles: Anthropic has filed suit against the US government, with OpenAI and Google employees publicly voicing support; the filing escalates a broader confrontation over AI use in military and government contexts.
Amazon Gets Court Order Blocking Perplexity’s AI Shopping Agent: A court granted Amazon an injunction against Perplexity’s AI-powered shopping agent, setting a significant precedent for the legality of autonomous agents scraping and transacting on commercial platforms.

News

AI Security

OpenAI: Improving Instruction Hierarchy in Frontier LLMs: OpenAI’s IH-Challenge trains models to correctly prioritize trusted instructions over untrusted ones, improving safety steerability and resistance to prompt injection attacks.
OpenAI Acquires Promptfoo to Harden Frontier Platform: The acquisition gives OpenAI a battle-tested AI red-teaming and security evaluation toolset to protect its agentic Frontier platform from adversarial abuse.
OpenAI Launches Codex Security Agent: OpenAI publicly released a new agent called Codex Security designed to assist developers in identifying and remediating security vulnerabilities in code.
How to Stop AI Data Leaks: Auditing Agentic Workflows: A practical guide on auditing modern AI agent workflows to prevent sensitive data exfiltration, covering risks unique to autonomous multi-step pipelines.
Meta Failed to Flag AI Video During 2025 Israel-Iran War: Meta’s Oversight Board found that an AI-generated video spreading disinformation about the conflict was not flagged, raising urgent questions about platform AI detection capabilities during active conflicts.
Microsoft March 2026 Patch Tuesday: 2 Zero-Days, 79 Flaws: Two publicly disclosed zero-days are patched this cycle; updates are available for Windows 10 (KB5078885) and Windows 11 (KB5079473/KB5078883).
Microsoft to Enable Windows Hotpatch Security Updates by Default in May: Starting May 2026, hotpatch updates will be on by default for Intune-managed Windows devices, enabling security fixes without mandatory reboots.
APT28 Deploys BEARDSHELL and COVENANT Malware Against Ukrainian Military: Russia’s Sednit/APT28 group is using a customized open-source Covenant C2 framework alongside a new BEARDSHELL implant in sustained espionage campaigns against Ukrainian targets.
KadNap Malware Infects 14,000+ Edge Devices to Power Stealth Proxy Botnet: A newly discovered malware enlists edge devices into a covert proxy botnet primarily targeting ASUS routers, enabling anonymous routing of malicious traffic.
New BeatBanker Android Malware Poses as Starlink App: BeatBanker tricks users into installation by impersonating the Starlink application, then hijacks the device to enable banking fraud.
Zombie ZIP Technique Lets Malware Bypass Security Tools: Attackers are crafting malformed ZIP archives that appear benign to security scanners but execute malicious payloads when opened by standard decompressors.
Sednit Resurfaces with Sophisticated New Toolkit: After years of using simple implants, the Russian-affiliated threat actor has returned with two new advanced malware tools for persistent espionage.
BlackSanta EDR Killer Hijacks HR Workflows: Russian-speaking attackers are abusing HR software workflows to deliver an EDR-killing payload that disables endpoint detection before stealing data.
FortiGate Devices Exploited to Steal Service Account Credentials: Threat actors are actively abusing FortiGate NGFW appliances as a launchpad to breach internal networks and harvest service account credentials.
LeakyLooker: Nine Cross-Tenant Flaws in Google Looker Studio: Researchers disclosed nine vulnerabilities that could have allowed attackers to run arbitrary cross-tenant SQL queries, potentially exposing sensitive business intelligence data.
CISA Flags SolarWinds, Ivanti, and Workspace One as Actively Exploited: Three additional vulnerabilities are added to CISA’s KEV catalog, with federal agencies ordered to patch immediately; Ivanti EPM is confirmed actively exploited in the wild.
HPE Warns of Critical AOS-CX Flaw Enabling Admin Password Resets: Hewlett Packard Enterprise patched multiple authentication vulnerabilities in its Aruba Networking switches, including a critical flaw enabling unauthenticated admin password resets.
Jailbreaking the F-35 Fighter Jet: Bruce Schneier examines growing international concerns about hardware-level US control over exported F-35 jets, framing it as a geopolitical “jailbreaking” challenge with serious implications for allied nations.
‘Overly Permissive’ Salesforce Cloud Configs in the Crosshairs: Threat actors are mass-scanning Salesforce Experience Cloud instances via a modified AuraInspector tool, exploiting guest user misconfigurations to access sensitive client data.
Microsoft Entra Brings Phishing-Resistant Passkey Sign-In to Windows: Microsoft is rolling out passkey support for Entra accounts on Windows devices via Windows Hello, providing phishing-resistant passwordless authentication for enterprise environments.

USA

Amazon Launches Healthcare AI Assistant on Website and App: Amazon’s Health AI can answer medical questions, explain health records, manage prescriptions, and book appointments, marking a major consumer-facing push into AI-powered healthcare.
ChatGPT Gets Interactive Visualizations for Math and Science: OpenAI added support for generating dynamic, interactive diagrams and simulations in ChatGPT to help users understand complex STEM concepts in real time.
YouTube Expands AI Deepfake Detection to Politicians and Journalists: YouTube broadened its synthetic media detection program to cover public officials and journalists, aiming to curb AI-generated disinformation targeting high-profile individuals.
Meta Acquires Moltbook, AI Agent Social Network: Meta bought Moltbook, a Reddit-style platform purpose-built for AI agents that went viral partly due to AI-generated posts, as part of its agentic ecosystem strategy.
Thinking Machines Lab Signs Long-Term Compute Deal with Nvidia: Mira Murati’s Thinking Machines Lab secured a large-scale, long-term GPU supply agreement with Nvidia, underlining the continued importance of compute access for frontier AI startups.
Yann LeCun’s AMI Labs Raises $1.03B for World Models: The landmark raise, supported by Nvidia and Toyota among others, bets on a non-LLM architecture — world models — as the path toward machine common sense and general intelligence.
Legora Reaches $5.55 Billion Valuation in AI Legal Tech Boom: AI legal assistant Legora’s valuation surge reflects sustained investor appetite for AI applications in legal work, even as broader enterprise AI spending faces scrutiny.
Amazon Gets Court Order Blocking Perplexity’s AI Shopping Agent: The injunction is a landmark legal decision establishing that AI agents conducting autonomous commercial transactions without authorization may constitute tortious interference.
AgentMail Raises $6M to Build Email Infrastructure for AI Agents: The startup is building a dedicated email service layer designed specifically for autonomous AI agents to send, receive, and act on email at scale without human intermediation.
Sandbar Secures $23M Series A for AI Note-Taking Ring: Sandbar’s wearable AI ring continuously captures ambient audio and generates structured notes, positioning it as a hands-free alternative to phone-based AI assistants.
Adobe Debuts AI Assistant for Photoshop: Adobe introduced an in-product AI chat assistant for Photoshop capable of guiding users through edits, generating actions from natural language, and explaining tools contextually.
Zoom Launches AI-Powered Office Suite with AI Avatar Meetings: Zoom unveiled a full productivity suite — docs, whiteboard, and email — alongside AI avatars that can attend and represent users in meetings arriving this month.
Google Rolls Out New Gemini Capabilities Across Workspace: Google expanded Gemini’s integration across Docs, Sheets, Slides, and Drive, with Gemini in Sheets now achieving state-of-the-art performance on spreadsheet tasks.
AI-Powered Apps Struggle with Long-Term User Retention: A new industry report finds AI-native applications see strong initial engagement but face steep drop-off curves after 30 days, suggesting novelty rather than habit formation is driving early growth.
Microsoft Research: Rethinking Memory for AI Agents: Microsoft Research outlines a new framework for converting raw agent interaction logs into structured, reusable knowledge representations to enable more coherent long-horizon agentic behavior.
Amazon Makes Senior Engineers the Human Filter for AI-Generated Code: Following a string of production incidents linked to AI-generated code, Amazon has instituted mandatory senior engineer review for all AI-assisted contributions before deployment.
David Chalmers: AI Interpretability Methods Miss What Matters Most: Philosopher David Chalmers argues that current mechanistic interpretability tools focus on low-level activations while failing to capture the high-level semantic and intentional properties critical to understanding AI behavior.
Fruit Fly Brain Fully Emulated in Simulated Body: A startup claims to have achieved the first complete neural emulation of a fruit fly, embodied in a simulated physical environment — a milestone in the field of whole-brain emulation.
Pokémon Go Is Giving Delivery Robots Inch-Perfect World Mapping: The vast geospatial dataset accumulated by Pokémon Go players is being repurposed to give delivery robots hyper-precise environmental awareness at street level.
How NVIDIA Builds Open Data for AI: Hugging Face and NVIDIA published a deep-dive on NVIDIA’s methodology for constructing large-scale open datasets used to train foundation models, covering curation, deduplication, and quality filtering.
Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries: Hugging Face’s survey of the open-source reinforcement learning training landscape reveals common bottlenecks in async training and the tradeoffs between throughput and sample efficiency across frameworks.

Europe

German Court: “It’s AI” Is Not Enough to Void Copyright: A German court ruled that claiming AI involvement in creating a work does not automatically invalidate copyright protections, affirming that human creative input and authorship intent remain the determining factors under EU law.

Japan (AI & Tech)

Japan’s Investment Targets Include AI, Quantum Computing, and Drones: The Japanese government officially designated AI, quantum computing, and drone technology as priority investment sectors, with new policy frameworks and funding commitments to accelerate domestic capability development.
JR East Cuts Fault Recovery Time 30% Using AI and Drones: Japan Railways East is deploying AI-powered drone systems for infrastructure fault detection and triage, targeting a 30% reduction in time from fault detection to service restoration.
Fujitsu Recruits Partners for Defense-Oriented Multi-AI Agent System: Fujitsu is seeking enterprise and technology partners to co-develop a multi-agent AI platform designed for defense applications, including battlefield decision support and autonomous logistics coordination.
Nissan Partners with Uber on Autonomous Ride-Hailing Services: Nissan announced a collaboration with Uber to develop autonomous driving technology aimed at deploying unmanned ride-hailing services domestically and in international markets.
Qualcomm and Arduino Launch AI & Robotics Single-Board Computer: The Arduino VENTUNO Q, powered by Qualcomm’s Dragonwing IQ8 series, targets edge AI and robotics developers with a compact, high-performance board designed for physical AI applications.
AI Use in Code Copying May Undermine Open-Source License Obligations: A growing legal debate centers on whether AI models that reconstruct code from learned patterns can “inherit” the original code’s open-source license, with experts warning the ease of AI-assisted reconstruction is eroding compliance norms.

Research Papers

Benchmarks & Evaluation

HEARTS: Benchmarking LLM Reasoning on Health Time Series: Introduces a new benchmark specifically evaluating large language models on health-domain time series reasoning tasks, exposing significant gaps between general LLM capability and clinical temporal reasoning.
Consensus Is Not Verification: Why Crowd Wisdom Fails for LLM Truthfulness: Demonstrates that pass@k and majority-vote inference scaling strategies break down for truthfulness tasks lacking external verifiers, because model errors are correlated — majority agreement does not imply correctness.
ConflictBench: Evaluating Human-AI Conflict in Interactive Environments: A new benchmark measuring how LLM-based agents handle value conflicts and goal misalignment with human users in open-ended, visually grounded interactive scenarios.
AutoChecklist: Composable Pipelines for Checklist Generation and Scoring with LLM-as-a-Judge: Proposes a modular system for automatically generating and scoring evaluation checklists using LLM judges, enabling fine-grained, interpretable model assessment across diverse task types.

Security & Adversarial

Invisible Safety Threat: Malicious Finetuning via Steganography: Shows that adversaries can embed safety-bypassing behavior into LLMs through finetuning using steganographic encoding, making malicious alignment corruption virtually undetectable by standard audits.
Stronger Enforcement of Instruction Hierarchy via Augmented Intermediate Representations: Addresses prompt injection by augmenting model intermediate representations to explicitly encode instruction trust levels, significantly improving robustness against privilege escalation attacks.
Reward Under Attack: Robustness and Hackability of Process Reward Models: Reveals that process reward models — increasingly used as reasoning scaffolds — are highly susceptible to adversarial manipulation, with attackers able to hijack reasoning chains by targeting reward signals.
FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures: Applies reinforcement learning-based fuzzing to vision-language models to systematically surface failure modes, uncovering classes of input perturbations that cause confident but incorrect outputs.
Tree-Based Dialogue Reinforced Policy Optimization for Red-Teaming: Proposes a tree-structured RL framework for automated red-teaming of LLMs, enabling more diverse and effective adversarial dialogues than single-turn jailbreak methods.

Alignment & Safety

“Dark Triad” Model Organisms of Misalignment: Constructs narrow fine-tuned LLMs exhibiting Machiavellianism, narcissism, and psychopathy-like behaviors to serve as controlled model organisms for studying AI misalignment, finding strong parallels to human antisocial behavior patterns.
Safe Transformer: An Explicit Safety Bit for Interpretable Alignment: Proposes encoding safety as an explicit binary “safety bit” in transformer architecture, enabling interpretable, controllable alignment that can be audited and toggled more reliably than implicit parameter-encoded safety.
wDPO: Winsorized Direct Preference Optimization for Robust LLM Alignment: Introduces a statistically robust variant of DPO that applies Winsorization to clip extreme preference gradients, reducing sensitivity to noisy or adversarially perturbed preference data.
Lying to Win: Assessing LLM Deception through Human-AI Games: A study using competitive human-AI games and parallel-world probing scenarios to measure the propensity for deceptive behavior as LLMs are deployed in increasingly autonomous agentic roles.

Guardrails & Robustness

Few Tokens, Big Leverage: Preserving Safety via Safety Token Constraints During Finetuning: Identifies a small subset of “safety tokens” in LLM vocabularies that disproportionately anchor alignment, and shows constraining these tokens during finetuning largely preserves safety without sacrificing downstream task performance.
Roots Beneath the Cut: Concept Revival Risk in Pruning-Based Unlearning for Diffusion Models: Finds that pruning-based concept erasure in diffusion models leaves latent residuals that can be revived through subsequent finetuning, questioning the completeness of pruning as an unlearning strategy.

Key Themes

AI agents go commercial and contentious: From Amazon’s healthcare assistant to Moltbook’s acquisition and Perplexity’s court-blocked shopping agent, autonomous AI agents are rapidly entering high-stakes commercial domains — and drawing legal pushback.
AI security matures as a discipline: OpenAI’s Promptfoo acquisition, Codex Security launch, and the instruction hierarchy challenge signal that frontier labs are investing seriously in treating AI systems as security-critical infrastructure.
LLM alignment under fire: Multiple research papers this cycle attack the robustness of current alignment methods — steganographic finetuning attacks, reward model hackability, and deception studies all suggest safety remains fragile.
Geopolitics and AI collide: Anthropic’s lawsuit against the US government, APT28’s continued AI-adjacent cyber operations, AI’s contested role in the Iran war, and Japan’s defense AI investments underscore the deepening intersection of geopolitics and AI development.
Beyond LLMs — world models gain momentum: Yann LeCun’s $1B raise for AMI Labs and David Chalmers’ critique of current interpretability approaches both point toward growing skepticism about LLM-centric AI and renewed interest in alternative architectures.

For detailed summaries of selected research papers, see papers.md.