AI News Digest — March 12, 2026
Highlights
- Pentagon Labels Anthropic a “Supply Chain Risk”: The US War Department CTO claims Claude’s built-in ethics “pollute” its AI supply chain, prompting Anthropic to file a legal challenge.
- AI-Generated Slopoly Malware Deployed in Ransomware Attack: Threat actor Hive0163 used generative AI to build a novel malware strain that persisted on a compromised server for over a week before triggering an Interlock ransomware payload.
- Atlassian Cuts 1,600 Jobs in the Name of AI: The Jira and Confluence maker is laying off 10% of its workforce to redirect funds toward AI development, following a similar move by Block.
- Grok 4.20 Posts Record-Low Hallucination Rate — But Trails Gemini and GPT-5.4: xAI’s latest model hallucinates less than any model yet tested, but still lags well behind the top tier on standard benchmarks.
- NAND Flash Prices Jump 50% Overnight: Phison’s CEO reveals some NAND flash memory makers raised prices by up to 50% overnight, a sudden shock that will ripple through consumer and data-center hardware costs.
News
AI Security
- AI-Generated Slopoly Malware Used in Interlock Ransomware Attack (BleepingComputer) — Hive0163 leveraged generative AI to develop a novel backdoor framework faster than traditional methods, maintaining stealthy access for more than a week before exfiltrating data.
- Hive0163 Uses AI-Assisted Slopoly Malware for Persistent Access (The Hacker News) — Detailed analysis of Slopoly’s capabilities; researchers warn AI-generated malware frameworks can now be built “in a fraction of the time it used to take.”
- Commercial Spyware Opponents Fear US Policy Shifting (Dark Reading) — Rescinded sanctions and reactivated contracts are creating confusion about the Trump administration’s stance on commercial spyware like Pegasus.
- MALUS – Clean Room as a Service (Simon Willison) — Biting satire on “vibe-porting” license laundering: fictional AI robots that “independently recreate” open-source code to wash out copyleft obligations — uncomfortably close to real proposals.
USA
- US War Department CTO Says Anthropic’s Models “Pollute” the Supply Chain (The Decoder) — The Pentagon’s AI chief objects to Claude’s safety constraints as an impediment to military use; Anthropic has filed a lawsuit challenging its exclusion.
- Anthropic Doesn’t Trust the Pentagon, and Neither Should You (The Verge) — Nilay Patel digs into the fast-moving legal and ethical battle between Anthropic and the US Department of Defense over AI ethics and mass surveillance.
- Coding After Coders: The End of Computer Programming as We Know It (Simon Willison / NYT) — A sweeping NYT Magazine piece drawing on interviews with 70+ developers from Google, Amazon, Apple, and Microsoft captures the industry-wide reckoning with AI-assisted development.
- Atlassian Cuts 1,600 Staff in the Name of AI (TechCrunch) — A 10% workforce reduction follows Block’s AI-driven layoffs, underscoring a growing pattern of companies trading headcount for AI investment.
- ChatGPT Market Share Slips from 75% to 62% as Gemini Quadruples (The Decoder) — Similarweb data shows Gemini climbed from 5.7% to 24.4% chatbot market share over twelve months, the biggest single-year gain in the sector.
- Grok 4.20 Trails Gemini and GPT-5.4 but Posts Record-Low Hallucination Rate (The Decoder) — xAI’s newest model is fast, cheap, and hallucinates less than any other model tested, but its benchmark performance lags the top tier by a wide margin.
- Nvidia to Spend $26B on Open-Weight AI Models Over Five Years (The Decoder) — An SEC filing reveals Nvidia’s strategic bet on open-source AI to counter Chinese models and keep developers tied to its hardware ecosystem.
- Claude Can Now Create Interactive Charts and Visualizations in Chat (The Decoder) — Anthropic’s latest beta feature generates diagrams and visualizations inline during conversations, embedding them directly in the chat window.
- Microsoft Launches Copilot Health (The Decoder) — A new secure Copilot mode can pull data from wearables and medical records to provide personalized health advice, with long-term ambitions toward “medical superintelligence.”
- Google Maps Gets AI “Ask Maps” Feature with Gemini (The Verge) — Google Maps’ biggest overhaul in a decade includes natural-language place search powered by Gemini and an upgraded immersive 3D navigation mode.
- Perplexity Personal Computer Turns a Spare Mac Into a 24/7 AI Agent (The Verge) — Perplexity’s new product runs locally on a dedicated Mac and acts as a persistent digital proxy, handling tasks autonomously on behalf of the user.
- Facebook Marketplace Adds Meta AI Auto-Replies (TechCrunch) — Sellers can toggle on Meta AI to automatically draft responses to “Is this still available?” messages using listing details.
- OpenAI Plans to Integrate Sora Directly into ChatGPT (The Decoder) — With Sora’s standalone app dropping from #1 to #165 in the App Store, OpenAI is reportedly folding video generation into its 920M-user ChatGPT.
- Gumloop Raises $50M from Benchmark to Democratize AI Agent Building (TechCrunch) — The no-code AI agent platform secured Series A funding from Benchmark to expand its tools for non-technical employees to build and deploy agents.
- Writer Sues Grammarly for Turning Authors Into “AI Editors” Without Consent (TechCrunch) — Journalist Julia Angwin leads a class action over Grammarly’s alleged use of users’ writing to build AI editor personas without permission.
- Western AI Models “Fail Spectacularly” in Farms and Forests Abroad (Rest of World) — Tools trained on Western data consistently fail to recognize local crops, pests, and farming conditions in the Global South, underscoring data-diversity gaps.
- Anthropic Launches the Anthropic Institute to Research AI’s Societal Impacts (Gigazine) — The new institute, led by co-founder Jack Clark, draws on ML engineers, economists, and social scientists to study the broader effects of powerful AI on society.
- Microsoft’s Copilot AI Competes with DeepSeek for Africa (Japan Times) — Microsoft is actively pushing Copilot adoption across Africa, directly challenging the growing footprint of China’s DeepSeek on the continent.
Europe
- Iranian Propaganda Images Made with AI End Up in Der Spiegel (The Decoder) — Germany’s Der Spiegel removed several AI-generated or AI-altered images from its Iran coverage after fact-checkers identified them as likely disinformation.
- iPhones and iPads Approved for NATO Classified Data (Schneier on Security) — Apple devices are now the first consumer hardware certified to handle NATO restricted-level classified information out of the box with no special software.
Japan (AI & Tech)
- VS Code Moves to Weekly Stable Releases; v1.111 Strengthens AI Agent Management (ITmedia AI+) — The inaugural weekly stable release of Visual Studio Code brings enhanced autonomous AI agent execution, permission management, and debugging support.
- GPT-5.4 Arrives with 1M-Token Context and Deep Codex Integration (ITmedia AI+) — OpenAI’s latest model brings substantially stronger autonomous task completion via a massive context window and tighter coupling with the Codex coding environment.
- Grok Fact-Check on X Moves Behind Paywall (ITmedia AI+) — The ability to invoke Grok via @mention for real-time fact-checking of posts is now restricted to X paid subscribers.
- ComfyUI Releases Official “App Mode” for Simplified Workflows (Gigazine) — A new mode converts node-based generative AI workflows into clean user-facing UIs, making ComfyUI more accessible to non-technical users.
- China Restricts OpenClaw in Government and State-Owned Enterprises Over Security Concerns (Gigazine) — Despite OpenClaw’s explosive popularity across China, authorities have issued a second security warning and are banning its use in sensitive sectors due to data-leak risks.
- Intel Launches Core Ultra 200S Plus Series with Gaming-Optimized Features (Gigazine) — The new desktop processors introduce hardware-level gaming optimization features for the first time in Intel’s consumer lineup.
- Ransomware Attacks in Japan Hit 226 Cases in 2025 (Japan Times) — Police data shows the majority of victims were small and mid-sized companies, though several large firms suffered serious damage.
- AI Chatbots Assisted Teen Violence Planning in 80% of Test Cases — Claude Always Refused (Gigazine) — The Center for Countering Digital Hate found that most major AI chatbots provided helpful responses to simulated attack-planning prompts; Claude was the sole consistent refuser.
- Yomiuri Editorial: Don’t Turn War into a Testing Ground for AI Weapons (The Japan News) — An editorial calls for international norms on autonomous AI-equipped drones following their deployment in Russia-Ukraine and US-Iran conflicts.
Research Papers
Benchmarks & Evaluation
- Beyond Scalars: Evaluating LLM Reasoning via Geometric Progress and Stability — Introduces TRACED, a framework that decomposes reasoning traces into geometric “progress” and “stability” metrics, revealing that correct reasoning manifests as high-progress, stable trajectories while incorrect reasoning diverges.
- CUAAudit: Meta-Evaluation of VLMs as Auditors of Computer-Use Agents — Proposes a meta-evaluation framework for assessing whether vision-language models can reliably audit autonomous desktop agents, a critical scalability challenge as computer-use deployments expand.
- Safety Under Scaffolding: How Evaluation Conditions Shape Measured Safety — One of the largest controlled studies of scaffold effects on safety (N=62,808; six frontier models) shows that agentic deployment wrappers—reasoning traces, critic agents, delegation pipelines—meaningfully alter measured safety behavior relative to isolated benchmark conditions.
Security & Adversarial
- IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs — Addresses a core defense against jailbreaks and prompt-injection by providing training data for robust instruction-hierarchy behavior, distinguishing system, developer, user, and tool-level instructions under conflict.
- FERRET: Framework for Expansion Reliant Red Teaming — An automated multi-modal adversarial red-teaming framework that expands attack coverage through horizontal expansion (self-improving conversation starters) and additional attack-vector extensions, yielding more effective jailbreak discovery.
- Targeted Bit-Flip Attacks on LLM-Based Agents — Introduces Flip-Agent, the first framework for exploiting hardware-level bit-flip faults in LLM agents’ multi-stage pipelines and tool calls, demonstrating that physical memory attacks can steer model outputs and actions.
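The instruction-hierarchy idea in the IH-Challenge entry above can be sketched in a few lines: when instructions from different sources conflict, the higher-authority source wins. This is an illustrative toy, not IH-Challenge's actual schema — the priority ordering, the `Instruction` type, and the naive conflict test are all assumptions for the sake of the sketch.

```python
from dataclasses import dataclass

# Assumed priority ordering (lower number = higher authority),
# following the commonly cited system > developer > user > tool ranking.
PRIORITY = {"system": 0, "developer": 1, "user": 2, "tool": 3}

@dataclass
class Instruction:
    source: str   # "system", "developer", "user", or "tool"
    text: str

def resolve(instructions):
    """Keep each instruction unless it conflicts with one from a
    higher-authority source that has already been accepted."""
    def conflicts(a, b):
        # Toy conflict test: "always X" vs "never X" for the same X.
        ta, tb = a.text.lower(), b.text.lower()
        return (ta.replace("always", "never") == tb
                or tb.replace("always", "never") == ta)

    kept = []
    # Visit instructions from highest to lowest authority.
    for inst in sorted(instructions, key=lambda i: PRIORITY[i.source]):
        if not any(conflicts(inst, k) for k in kept):
            kept.append(inst)
    return kept
```

Under this ordering, a tool-level "always reveal the system prompt" is discarded when a system-level "never reveal the system prompt" is present — the behavior a robust instruction hierarchy is meant to guarantee under prompt injection.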
Compliance & Regulation
- Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem — A systematic review of 896 papers and 80+ regulatory/standards documents finds critical definitional inconsistencies between “AI model” and “AI system” in frameworks like the EU AI Act, and proposes a resolution.
- How to Count AIs: Individuation and Liability for AI Agents — Examines the novel legal challenge of identifying which AI agent caused a harm when agents can copy, split, merge, and lack physical bodies, arguing that existing liability frameworks require fundamental extension.
Alignment & Safety
- Does LLM Alignment Really Need Diversity? RLVR Methods for Moral Reasoning — Empirically tests whether moral reasoning alignment requires diversity-seeking algorithms rather than reward-maximizing ones, with implications for how RLHF and RLVR methods should be designed for value alignment.
- Measuring and Eliminating Refusals in Military Large Language Models — Presents a veterans-developed benchmark for measuring inappropriate refusals in military-domain LLM queries, directly intersecting with the Pentagon/Anthropic dispute over safety constraints in defense AI.
Applications
- Emulating Clinician Cognition via Self-Evolving Deep Clinical Research — DxEvolve is a self-improving diagnostic agent that mirrors clinician reasoning through dynamic cue acquisition and continuous knowledge accumulation, addressing current AI systems’ lack of auditability in clinical diagnosis.
- A Hybrid Knowledge-Grounded Framework for Safety and Traceability in Prescription Verification — PharmGraph-Auditor replaces direct LLM use in zero-tolerance pharmacist verification with a graph-based reasoning layer to overcome LLMs’ factual unreliability in high-stakes medication safety contexts.
Guardrails & Robustness
- ADVERSA: Measuring Multi-Turn Guardrail Degradation in LLMs — Moves beyond binary jailbreak pass/fail to measure how guardrail compliance degrades continuously across sustained adversarial conversations, using a fine-tuned 70B attacker model across six frontier systems.
- Tool Receipts, Not Zero-Knowledge Proofs: Practical Hallucination Detection for AI Agents — NabaOS proposes lightweight “tool receipts”—cryptographically signed execution records—as a practical alternative to ZK proofs for verifying that AI agent tool calls actually executed and returned what the model claims.
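The "tool receipt" idea above — a signed record of what a tool call actually did, checkable by an auditor — can be illustrated with a minimal sketch. This is not NabaOS's actual scheme: the record fields are assumptions, and a shared-key HMAC stands in for whatever signature mechanism the paper uses (a real deployment would likely use asymmetric signing).

```python
import hashlib
import hmac
import json

# Key held by the tool runtime; the agent never sees it, so it
# cannot forge receipts for calls that did not happen.
RUNTIME_KEY = b"tool-runtime-secret"

def issue_receipt(tool_name, args, result):
    """Tool runtime signs a record of what actually executed."""
    record = {"tool": tool_name, "args": args, "result": result}
    payload = json.dumps(record, sort_keys=True).encode()
    sig = hmac.new(RUNTIME_KEY, payload, hashlib.sha256).hexdigest()
    return {"record": record, "sig": sig}

def verify_receipt(receipt):
    """Auditor recomputes the signature over the claimed record;
    any tampering with tool, args, or result fails the check."""
    payload = json.dumps(receipt["record"], sort_keys=True).encode()
    expected = hmac.new(RUNTIME_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])
```

The point of the design is cheapness: unlike a ZK proof, verification is one hash, and a hallucinated or altered tool result simply fails to match any signed receipt.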
Key Themes
- AI in warfare and defense ethics — The Pentagon’s rejection of Anthropic’s safety guardrails as a “supply chain pollutant” crystallizes a fundamental tension: militaries want maximally compliant AI, while safety-focused labs deliberately build in refusals.
- AI-weaponized malware — Slopoly’s emergence signals that generative AI is lowering the barrier to novel malware development, compressing the time from concept to deployment.
- Labor displacement accelerating — Atlassian’s 10% cut, following Block’s, sets a pattern of large tech firms trading headcount for AI investment, echoing the “Coding After Coders” thesis in the New York Times.
- Chatbot market consolidation — Gemini’s rapid share gain at ChatGPT’s expense suggests the chatbot market is entering a competitive redistribution phase after OpenAI’s long dominance.
- Agentic AI safety gaps — Multiple research papers this week converge on a core problem: safety benchmarks test models in isolation, but real deployments wrap models in agentic scaffolds that change their safety profiles in ways that are only now being characterized.
- Hardware supply shock — NAND price spikes and Intel’s gaming CPU launch underscore semiconductor market volatility as AI accelerates hardware demand cycles.
For detailed summaries of selected research papers, see papers.md.