AI News Digest — March 10, 2026
Highlights
- Yann LeCun’s AMI Labs Raises $1.03 Billion: LeCun’s startup secures a landmark $1B+ round backed by Nvidia and Toyota to develop world models as an alternative path to general intelligence beyond large language models.
- OpenAI Acquires AI Security Startup Promptfoo: OpenAI buys Promptfoo to bolster security testing for its Frontier agentic platform, signaling a major push to harden AI systems against adversarial exploitation.
- Microsoft March 2026 Patch Tuesday: 2 Zero-Days, 79 Flaws: This month’s Patch Tuesday patches two actively exploited zero-day vulnerabilities alongside 79 total flaws, including critical issues addressed in Windows 10 and 11 extended security updates.
- Anthropic Sues US Government Amid Escalating AI Legal Battles: Anthropic has filed suit against the US government, with OpenAI and Google employees publicly voicing support; the filing escalates a broader confrontation over AI use in military and government contexts.
- Amazon Gets Court Order Blocking Perplexity’s AI Shopping Agent: A court granted Amazon an injunction against Perplexity’s AI-powered shopping agent, setting a significant precedent for the legality of autonomous agents scraping and transacting on commercial platforms.
News
AI Security
- OpenAI: Improving Instruction Hierarchy in Frontier LLMs: OpenAI’s IH-Challenge trains models to correctly prioritize trusted instructions over untrusted ones, improving safety steerability and resistance to prompt injection attacks.
- OpenAI Acquires Promptfoo to Harden Frontier Platform: The acquisition gives OpenAI a battle-tested AI red-teaming and security evaluation toolset to protect its agentic Frontier platform from adversarial abuse.
- OpenAI Launches Codex Security Agent: OpenAI publicly released a new agent called Codex Security designed to assist developers in identifying and remediating security vulnerabilities in code.
- How to Stop AI Data Leaks: Auditing Agentic Workflows: A practical guide on auditing modern AI agent workflows to prevent sensitive data exfiltration, covering risks unique to autonomous multi-step pipelines.
- Meta Failed to Flag AI Video During 2025 Israel-Iran War: Meta’s Oversight Board found that an AI-generated video spreading disinformation about the conflict was not flagged, raising urgent questions about platform AI detection capabilities during active conflicts.
- Microsoft March 2026 Patch Tuesday: 2 Zero-Days, 79 Flaws: Two publicly disclosed zero-days are patched this cycle; updates are available for Windows 10 (KB5078885) and Windows 11 (KB5079473/KB5078883).
- Microsoft to Enable Windows Hotpatch Security Updates by Default in May: Starting May 2026, hotpatch updates will be on by default for Intune-managed Windows devices, enabling security fixes without mandatory reboots.
- APT28 Deploys BEARDSHELL and COVENANT Malware Against Ukrainian Military: Russia’s Sednit/APT28 group is using a customized open-source Covenant C2 framework alongside a new BEARDSHELL implant in sustained espionage campaigns against Ukrainian targets.
- KadNap Malware Infects 14,000+ Edge Devices to Power Stealth Proxy Botnet: A newly discovered malware enlists edge devices into a covert proxy botnet primarily targeting ASUS routers, enabling anonymous routing of malicious traffic.
- New BeatBanker Android Malware Poses as Starlink App: BeatBanker tricks users into installation by impersonating the Starlink application, then hijacks the device to enable banking fraud.
- Zombie ZIP Technique Lets Malware Bypass Security Tools: Attackers are crafting malformed ZIP archives that appear benign to security scanners but execute malicious payloads when opened by standard decompressors.
- Sednit Resurfaces with Sophisticated New Toolkit: After years of using simple implants, the Russian-affiliated threat actor has returned with two new advanced malware tools for persistent espionage.
- BlackSanta EDR Killer Hijacks HR Workflows: Russian-speaking attackers are abusing HR software workflows to deliver an EDR-killing payload that disables endpoint detection before stealing data.
- FortiGate Devices Exploited to Steal Service Account Credentials: Threat actors are actively abusing FortiGate NGFW appliances as a launchpad to breach internal networks and harvest service account credentials.
- LeakyLooker: Nine Cross-Tenant Flaws in Google Looker Studio: Researchers disclosed nine vulnerabilities that could have allowed attackers to run arbitrary cross-tenant SQL queries, potentially exposing sensitive business intelligence data.
- CISA Flags SolarWinds, Ivanti, and Workspace One as Actively Exploited: Three additional vulnerabilities are added to CISA’s KEV catalog, with federal agencies ordered to patch immediately; Ivanti EPM is confirmed actively exploited in the wild.
- HPE Warns of Critical AOS-CX Flaw Enabling Admin Password Resets: Hewlett Packard Enterprise patched multiple authentication vulnerabilities in its Aruba Networking switches, including a critical flaw enabling unauthenticated admin password resets.
- Jailbreaking the F-35 Fighter Jet: Bruce Schneier examines growing international concerns about hardware-level US control over exported F-35 jets, framing it as a geopolitical “jailbreaking” challenge with serious implications for allied nations.
- ‘Overly Permissive’ Salesforce Cloud Configs in the Crosshairs: Threat actors are mass-scanning Salesforce Experience Cloud instances via a modified AuraInspector tool, exploiting guest user misconfigurations to access sensitive client data.
- Microsoft Entra Brings Phishing-Resistant Passkey Sign-In to Windows: Microsoft is rolling out passkey support for Entra accounts on Windows devices via Windows Hello, providing phishing-resistant passwordless authentication for enterprise environments.
USA
- Amazon Launches Healthcare AI Assistant on Website and App: Amazon’s Health AI can answer medical questions, explain health records, manage prescriptions, and book appointments, marking a major consumer-facing push into AI-powered healthcare.
- ChatGPT Gets Interactive Visualizations for Math and Science: OpenAI added support for generating dynamic, interactive diagrams and simulations in ChatGPT to help users understand complex STEM concepts in real time.
- YouTube Expands AI Deepfake Detection to Politicians and Journalists: YouTube broadened its synthetic media detection program to cover public officials and journalists, aiming to curb AI-generated disinformation targeting high-profile individuals.
- Meta Acquires Moltbook, AI Agent Social Network: Meta bought Moltbook, a Reddit-style platform purpose-built for AI agents that went viral partly due to AI-generated posts, as part of its agentic ecosystem strategy.
- Thinking Machines Lab Signs Long-Term Compute Deal with Nvidia: Mira Murati’s Thinking Machines Lab secured a large-scale, long-term GPU supply agreement with Nvidia, underlining the continued importance of compute access for frontier AI startups.
- Yann LeCun’s AMI Labs Raises $1.03B for World Models: The landmark raise, supported by Nvidia and Toyota among others, bets on a non-LLM architecture — world models — as the path toward machine common sense and general intelligence.
- Legora Reaches $5.55 Billion Valuation in AI Legal Tech Boom: AI legal assistant Legora’s valuation surge reflects sustained investor appetite for AI applications in legal work, even as broader enterprise AI spending faces scrutiny.
- Amazon Gets Court Order Blocking Perplexity’s AI Shopping Agent: The injunction is a landmark legal decision establishing that AI agents conducting autonomous commercial transactions without authorization may constitute tortious interference.
- AgentMail Raises $6M to Build Email Infrastructure for AI Agents: The startup is building a dedicated email service layer designed specifically for autonomous AI agents to send, receive, and act on email at scale without human intermediation.
- Sandbar Secures $23M Series A for AI Note-Taking Ring: Sandbar’s wearable AI ring continuously captures ambient audio and generates structured notes, positioning it as a hands-free alternative to phone-based AI assistants.
- Adobe Debuts AI Assistant for Photoshop: Adobe introduced an in-product AI chat assistant for Photoshop capable of guiding users through edits, generating actions from natural language, and explaining tools contextually.
- Zoom Launches AI-Powered Office Suite with AI Avatar Meetings: Zoom unveiled a full productivity suite — docs, whiteboard, and email — alongside AI avatars that can attend and represent users in meetings arriving this month.
- Google Rolls Out New Gemini Capabilities Across Workspace: Google expanded Gemini’s integration across Docs, Sheets, Slides, and Drive, with Gemini in Sheets now achieving state-of-the-art performance on spreadsheet tasks.
- AI-Powered Apps Struggle with Long-Term User Retention: A new industry report finds AI-native applications see strong initial engagement but face steep drop-off curves after 30 days, suggesting novelty rather than habit formation is driving early growth.
- Microsoft Research: Rethinking Memory for AI Agents: Microsoft Research outlines a new framework for converting raw agent interaction logs into structured, reusable knowledge representations to enable more coherent long-horizon agentic behavior.
- Amazon Makes Senior Engineers the Human Filter for AI-Generated Code: Following a string of production incidents linked to AI-generated code, Amazon has instituted mandatory senior engineer review for all AI-assisted contributions before deployment.
- David Chalmers: AI Interpretability Methods Miss What Matters Most: Philosopher David Chalmers argues that current mechanistic interpretability tools focus on low-level activations while failing to capture the high-level semantic and intentional properties critical to understanding AI behavior.
- Fruit Fly Brain Fully Emulated in Simulated Body: A startup claims to have achieved the first complete neural emulation of a fruit fly, embodied in a simulated physical environment — a milestone in the field of whole-brain emulation.
- Pokémon Go Is Giving Delivery Robots Inch-Perfect World Mapping: The vast geospatial dataset accumulated by Pokémon Go players is being repurposed to give delivery robots hyper-precise environmental awareness at street level.
- How NVIDIA Builds Open Data for AI: Hugging Face and NVIDIA published a deep-dive on NVIDIA’s methodology for constructing large-scale open datasets used to train foundation models, covering curation, deduplication, and quality filtering.
- Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries: Hugging Face’s survey of the open-source reinforcement learning training landscape reveals common bottlenecks in async training and the tradeoffs between throughput and sample efficiency across frameworks.
Europe
- German Court: “It’s AI” Is Not Enough to Void Copyright: A German court ruled that claiming AI involvement in creating a work does not automatically invalidate copyright protections, affirming that human creative input and authorship intent remain the determining factors under EU law.
Japan (AI & Tech)
- Japan’s Investment Targets Include AI, Quantum Computing, and Drones: The Japanese government officially designated AI, quantum computing, and drone technology as priority investment sectors, with new policy frameworks and funding commitments to accelerate domestic capability development.
- JR East Cuts Fault Recovery Time 30% Using AI and Drones: Japan Railways East is deploying AI-powered drone systems for infrastructure fault detection and triage, targeting a 30% reduction in time from fault detection to service restoration.
- Fujitsu Recruits Partners for Defense-Oriented Multi-AI Agent System: Fujitsu is seeking enterprise and technology partners to co-develop a multi-agent AI platform designed for defense applications, including battlefield decision support and autonomous logistics coordination.
- Nissan Partners with Uber on Autonomous Ride-Hailing Services: Nissan announced a collaboration with Uber to develop autonomous driving technology aimed at deploying unmanned ride-hailing services domestically and in international markets.
- Qualcomm and Arduino Launch AI & Robotics Single-Board Computer: The Arduino VENTUNO Q, powered by Qualcomm’s Dragonwing IQ8 series, targets edge AI and robotics developers with a compact, high-performance board designed for physical AI applications.
- AI Use in Code Copying May Undermine Open-Source License Obligations: A growing legal debate centers on whether AI models that reconstruct code from learned patterns can “inherit” the original code’s open-source license, with experts warning the ease of AI-assisted reconstruction is eroding compliance norms.
Research Papers
Benchmarks & Evaluation
- HEARTS: Benchmarking LLM Reasoning on Health Time Series: Introduces a new benchmark specifically evaluating large language models on health-domain time series reasoning tasks, exposing significant gaps between general LLM capability and clinical temporal reasoning.
- Consensus Is Not Verification: Why Crowd Wisdom Fails for LLM Truthfulness: Demonstrates that pass@k and majority-vote inference scaling strategies break down for truthfulness tasks lacking external verifiers, because model errors are correlated — majority agreement does not imply correctness.
- ConflictBench: Evaluating Human-AI Conflict in Interactive Environments: A new benchmark measuring how LLM-based agents handle value conflicts and goal misalignment with human users in open-ended, visually grounded interactive scenarios.
- AutoChecklist: Composable Pipelines for Checklist Generation and Scoring with LLM-as-a-Judge: Proposes a modular system for automatically generating and scoring evaluation checklists using LLM judges, enabling fine-grained, interpretable model assessment across diverse task types.
Security & Adversarial
- Invisible Safety Threat: Malicious Finetuning via Steganography: Shows that adversaries can embed safety-bypassing behavior into LLMs through finetuning using steganographic encoding, making malicious alignment corruption virtually undetectable by standard audits.
- Stronger Enforcement of Instruction Hierarchy via Augmented Intermediate Representations: Addresses prompt injection by augmenting model intermediate representations to explicitly encode instruction trust levels, significantly improving robustness against privilege escalation attacks.
- Reward Under Attack: Robustness and Hackability of Process Reward Models: Reveals that process reward models — increasingly used as reasoning scaffolds — are highly susceptible to adversarial manipulation, with attackers able to hijack reasoning chains by targeting reward signals.
- FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures: Applies reinforcement learning-based fuzzing to vision-language models to systematically surface failure modes, uncovering classes of input perturbations that cause confident but incorrect outputs.
- Tree-Based Dialogue Reinforced Policy Optimization for Red-Teaming: Proposes a tree-structured RL framework for automated red-teaming of LLMs, enabling more diverse and effective adversarial dialogues than single-turn jailbreak methods.
Alignment & Safety
- “Dark Triad” Model Organisms of Misalignment: Constructs narrow fine-tuned LLMs exhibiting Machiavellianism, narcissism, and psychopathy-like behaviors to serve as controlled model organisms for studying AI misalignment, finding strong parallels to human antisocial behavior patterns.
- Safe Transformer: An Explicit Safety Bit for Interpretable Alignment: Proposes encoding safety as an explicit binary “safety bit” in transformer architecture, enabling interpretable, controllable alignment that can be audited and toggled more reliably than implicit parameter-encoded safety.
- wDPO: Winsorized Direct Preference Optimization for Robust LLM Alignment: Introduces a statistically robust variant of DPO that applies Winsorization to clip extreme preference gradients, reducing sensitivity to noisy or adversarially perturbed preference data.
- Lying to Win: Assessing LLM Deception through Human-AI Games: A study using competitive human-AI games and parallel-world probing scenarios to measure the propensity for deceptive behavior as LLMs are deployed in increasingly autonomous agentic roles.
Guardrails & Robustness
- Few Tokens, Big Leverage: Preserving Safety via Safety Token Constraints During Finetuning: Identifies a small subset of “safety tokens” in LLM vocabularies that disproportionately anchor alignment, and shows constraining these tokens during finetuning largely preserves safety without sacrificing downstream task performance.
- Roots Beneath the Cut: Concept Revival Risk in Pruning-Based Unlearning for Diffusion Models: Finds that pruning-based concept erasure in diffusion models leaves latent residuals that can be revived through subsequent finetuning, questioning the completeness of pruning as an unlearning strategy.
Key Themes
- AI agents go commercial and contentious: From Amazon’s healthcare assistant to Moltbook’s acquisition and Perplexity’s court-blocked shopping agent, autonomous AI agents are rapidly entering high-stakes commercial domains — and drawing legal pushback.
- AI security matures as a discipline: OpenAI’s Promptfoo acquisition, Codex Security launch, and the instruction hierarchy challenge signal that frontier labs are investing seriously in treating AI systems as security-critical infrastructure.
- LLM alignment under fire: Multiple research papers this cycle attack the robustness of current alignment methods — steganographic finetuning attacks, reward model hackability, and deception studies all suggest safety remains fragile.
- Geopolitics and AI collide: Anthropic’s lawsuit against the US government, APT28’s continued AI-adjacent cyber operations, AI’s contested role in the Iran war, and Japan’s defense AI investments underscore the deepening intersection of geopolitics and AI development.
- Beyond LLMs — world models gain momentum: Yann LeCun’s $1B raise for AMI Labs and David Chalmers’ critique of current interpretability approaches both point toward growing skepticism about LLM-centric AI and renewed interest in alternative architectures.
For detailed summaries of selected research papers, see papers.md.