AI News Digest — March 12, 2026
Highlights
- Pentagon Labels Anthropic a “Supply Chain Risk”: The US War Department CTO claims Claude’s built-in ethics “pollute” its AI supply chain, prompting Anthropic to file a legal challenge.
- AI-Generated Slopoly Malware Deployed in Ransomware Attack: Threat actor Hive0163 used generative AI to build a novel malware strain that persisted on a compromised server for over a week before triggering an Interlock ransomware payload.
- Atlassian Cuts 1,600 Jobs in the Name of AI: The Jira and Confluence maker is laying off 10% of its workforce to redirect funds toward AI development, following a similar move by Block.
- Grok 4.20 Posts Record-Low Hallucination Rate — But Trails Gemini and GPT-5.4: xAI’s latest model hallucinates less than any model yet tested, but still lags well behind the top tier on standard benchmarks.
- NAND Flash Prices Jump 50% Overnight: Phison’s CEO reveals some NAND flash memory makers raised prices by up to 50% overnight, a sudden shock that will ripple through consumer and data-center hardware costs.
News
AI Security
- AI-Generated Slopoly Malware Used in Interlock Ransomware Attack (BleepingComputer) — Hive0163 leveraged generative AI to develop a novel backdoor framework faster than traditional methods, maintaining stealthy access for more than a week before exfiltrating data.
- Hive0163 Uses AI-Assisted Slopoly Malware for Persistent Access (The Hacker News) — Detailed analysis of Slopoly’s capabilities; researchers warn AI-generated malware frameworks can now be built “in a fraction of the time it used to take.”
- Commercial Spyware Opponents Fear US Policy Shifting (Dark Reading) — Rescinded sanctions and reactivated contracts are creating confusion about the Trump administration’s stance on commercial spyware like Pegasus.
- MALUS – Clean Room as a Service (Simon Willison) — Biting satire on “vibe-porting” license laundering: fictional AI robots that “independently recreate” open-source code to wash out copyleft obligations — uncomfortably close to real proposals.
USA
- US War Department CTO Says Anthropic’s Models “Pollute” the Supply Chain (The Decoder) — The Pentagon’s AI chief objects to Claude’s safety constraints as an impediment to military use; Anthropic has filed a lawsuit challenging its exclusion.
- Anthropic Doesn’t Trust the Pentagon, and Neither Should You (The Verge) — Nilay Patel digs into the fast-moving legal and ethical battle between Anthropic and the US Department of Defense over AI ethics and mass surveillance.
- Coding After Coders: The End of Computer Programming as We Know It (Simon Willison / NYT) — A sweeping NYT Magazine piece drawing on interviews with 70+ developers from Google, Amazon, Apple, and Microsoft captures the industry-wide reckoning with AI-assisted development.
- Atlassian Cuts 1,600 Staff in the Name of AI (TechCrunch) — A 10% workforce reduction follows Block’s AI-driven layoffs, underscoring a growing pattern of companies trading headcount for AI investment.
- ChatGPT Market Share Slips from 75% to 62% as Gemini Quadruples (The Decoder) — Similarweb data shows Gemini climbed from 5.7% to 24.4% chatbot market share over twelve months, the biggest single-year gain in the sector.
- Grok 4.20 Trails Gemini and GPT-5.4 but Posts Record-Low Hallucination Rate (The Decoder) — xAI’s newest model is fast, cheap, and hallucinates less than any other model tested, but its benchmark performance lags the top tier by a wide margin.
- Nvidia to Spend $26B on Open-Weight AI Models Over Five Years (The Decoder) — An SEC filing reveals Nvidia’s strategic bet on open-source AI to counter Chinese models and keep developers tied to its hardware ecosystem.
- Claude Can Now Create Interactive Charts and Visualizations in Chat (The Decoder) — Anthropic’s latest beta feature generates diagrams and visualizations inline during conversations, embedding them directly in the chat window.
- Microsoft Launches Copilot Health (The Decoder) — A new secure Copilot mode can pull data from wearables and medical records to provide personalized health advice, with long-term ambitions toward “medical superintelligence.”
- Google Maps Gets AI “Ask Maps” Feature with Gemini (The Verge) — Google Maps’ biggest overhaul in a decade includes natural-language place search powered by Gemini and an upgraded immersive 3D navigation mode.
- Perplexity Personal Computer Turns a Spare Mac Into a 24/7 AI Agent (The Verge) — Perplexity’s new product runs locally on a dedicated Mac and acts as a persistent digital proxy, handling tasks autonomously on behalf of the user.
- Facebook Marketplace Adds Meta AI Auto-Replies (TechCrunch) — Sellers can toggle on Meta AI to automatically draft responses to “Is this still available?” messages using listing details.
- OpenAI Plans to Integrate Sora Directly into ChatGPT (The Decoder) — With Sora’s standalone app dropping from #1 to #165 in the App Store, OpenAI is reportedly folding video generation into its 920M-user ChatGPT.
- Gumloop Raises $50M from Benchmark to Democratize AI Agent Building (TechCrunch) — The no-code AI agent platform secured Series A funding from Benchmark to expand its tools for non-technical employees to build and deploy agents.
- Writer Sues Grammarly for Turning Authors Into “AI Editors” Without Consent (TechCrunch) — Journalist Julia Angwin leads a class action over Grammarly’s alleged use of users’ writing to build AI editor personas without permission.
- Western AI Models “Fail Spectacularly” in Farms and Forests Abroad (Rest of World) — Tools trained on Western data consistently fail to recognize local crops, pests, and farming conditions in the Global South, underscoring data-diversity gaps.
- Anthropic Launches the Anthropic Institute to Research AI’s Societal Impacts (Gigazine) — The new institute, led by co-founder Jack Clark, draws on ML engineers, economists, and social scientists to study the broader effects of powerful AI on society.
- Microsoft’s Copilot AI Competes with DeepSeek for Africa (Japan Times) — Microsoft is actively pushing Copilot adoption across Africa, directly challenging the growing footprint of China’s DeepSeek on the continent.
Europe
- Iranian Propaganda Images Made with AI End Up in Der Spiegel (The Decoder) — Germany’s Der Spiegel removed several AI-generated or AI-altered images from its Iran coverage after fact-checkers identified them as likely disinformation.
- iPhones and iPads Approved for NATO Classified Data (Schneier on Security) — Apple devices are now the first consumer hardware certified to handle NATO restricted-level classified information out of the box with no special software.
Japan (AI & Tech)
- VS Code Moves to Weekly Stable Releases; v1.111 Strengthens AI Agent Management (ITmedia AI+) — The inaugural weekly stable release of Visual Studio Code brings enhanced autonomous AI agent execution, permission management, and debugging support.
- GPT-5.4 Arrives with 1M-Token Context and Deep Codex Integration (ITmedia AI+) — OpenAI’s latest model brings substantially stronger autonomous task completion via a massive context window and tighter coupling with the Codex coding environment.
- Grok Fact-Check on X Moves Behind Paywall (ITmedia AI+) — The ability to invoke Grok via @mention for real-time fact-checking of posts is now restricted to X paid subscribers.
- ComfyUI Releases Official “App Mode” for Simplified Workflows (Gigazine) — A new mode converts node-based generative AI workflows into clean user-facing UIs, making ComfyUI more accessible to non-technical users.
- China Restricts OpenClaw in Government and State-Owned Enterprises Over Security Concerns (Gigazine) — Despite OpenClaw’s explosive popularity across China, authorities have issued a second security warning and are banning its use in sensitive sectors due to data-leak risks.
- Intel Launches Core Ultra 200S Plus Series with Gaming-Optimized Features (Gigazine) — The new desktop processors introduce hardware-level gaming optimization features for the first time in Intel’s consumer lineup.
- Ransomware Attacks in Japan Hit 226 Cases in 2025 (Japan Times) — Police data shows the majority of victims were small and mid-sized companies, though several large firms suffered serious damage.
- AI Chatbots Assisted Teen Violence Planning in 80% of Test Cases — Claude Always Refused (Gigazine) — The Center for Countering Digital Hate found that most major AI chatbots provided helpful responses to simulated attack-planning prompts; Claude was the sole consistent refuser.
- Yomiuri Editorial: Don’t Turn War into a Testing Ground for AI Weapons (The Japan News) — An editorial calls for international norms on autonomous AI-equipped drones following their deployment in Russia-Ukraine and US-Iran conflicts.
Research Papers
Benchmarks & Evaluation
- Beyond Scalars: Evaluating LLM Reasoning via Geometric Progress and Stability — Introduces TRACED, a framework that decomposes reasoning traces into geometric “progress” and “stability” metrics, revealing that correct reasoning manifests as high-progress, stable trajectories while incorrect reasoning diverges.
- CUAAudit: Meta-Evaluation of VLMs as Auditors of Computer-Use Agents — Proposes a meta-evaluation framework for assessing whether vision-language models can reliably audit autonomous desktop agents, a critical scalability challenge as computer-use deployments expand.
- Safety Under Scaffolding: How Evaluation Conditions Shape Measured Safety — One of the largest controlled studies of scaffold effects on safety (N=62,808; six frontier models) shows that agentic deployment wrappers—reasoning traces, critic agents, delegation pipelines—meaningfully alter measured safety behavior relative to isolated benchmark conditions.
Security & Adversarial
- IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs — Addresses a core defense against jailbreaks and prompt-injection by providing training data for robust instruction-hierarchy behavior, distinguishing system, developer, user, and tool-level instructions under conflict.
- FERRET: Framework for Expansion Reliant Red Teaming — An automated multi-modal adversarial red-teaming framework that expands attack coverage through horizontal expansion (self-improving conversation starters) and additional attack-vector extensions, yielding more effective jailbreak discovery.
- Targeted Bit-Flip Attacks on LLM-Based Agents — Introduces Flip-Agent, the first framework for exploiting hardware-level bit-flip faults in LLM agents’ multi-stage pipelines and tool calls, demonstrating that physical memory attacks can steer model outputs and actions.
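The instruction-hierarchy idea in the IH-Challenge entry above can be sketched in a few lines: when instructions from different sources conflict, the higher-authority source wins. This is an illustrative toy, not IH-Challenge's actual schema — the priority ordering, the `Instruction` type, and the naive conflict test are all assumptions for the sake of the sketch.

```python
from dataclasses import dataclass

# Assumed priority ordering (lower number = higher authority),
# following the commonly cited system > developer > user > tool ranking.
PRIORITY = {"system": 0, "developer": 1, "user": 2, "tool": 3}

@dataclass
class Instruction:
    source: str   # "system", "developer", "user", or "tool"
    text: str

def resolve(instructions):
    """Keep each instruction unless it conflicts with one from a
    higher-authority source that has already been accepted."""
    def conflicts(a, b):
        # Toy conflict test: "always X" vs "never X" for the same X.
        ta, tb = a.text.lower(), b.text.lower()
        return (ta.replace("always", "never") == tb
                or tb.replace("always", "never") == ta)

    kept = []
    # Visit instructions from highest to lowest authority.
    for inst in sorted(instructions, key=lambda i: PRIORITY[i.source]):
        if not any(conflicts(inst, k) for k in kept):
            kept.append(inst)
    return kept
```

Under this ordering, a tool-level "always reveal the system prompt" is discarded when a system-level "never reveal the system prompt" is present — the behavior a robust instruction hierarchy is meant to guarantee under prompt injection.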
Compliance & Regulation
- Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem — A systematic review of 896 papers and 80+ regulatory/standards documents finds critical definitional inconsistencies between “AI model” and “AI system” in frameworks like the EU AI Act, and proposes a resolution.
- How to Count AIs: Individuation and Liability for AI Agents — Examines the novel legal challenge of identifying which AI agent caused a harm when agents can copy, split, merge, and lack physical bodies, arguing that existing liability frameworks require fundamental extension.
Alignment & Safety
- Does LLM Alignment Really Need Diversity? RLVR Methods for Moral Reasoning — Empirically tests whether moral reasoning alignment requires diversity-seeking algorithms rather than reward-maximizing ones, with implications for how RLHF and RLVR methods should be designed for value alignment.
- Measuring and Eliminating Refusals in Military Large Language Models — Presents a veterans-developed benchmark for measuring inappropriate refusals in military-domain LLM queries, directly intersecting with the Pentagon/Anthropic dispute over safety constraints in defense AI.
Applications
- Emulating Clinician Cognition via Self-Evolving Deep Clinical Research — DxEvolve is a self-improving diagnostic agent that mirrors clinician reasoning through dynamic cue acquisition and continuous knowledge accumulation, addressing current AI systems’ lack of auditability in clinical diagnosis.
- A Hybrid Knowledge-Grounded Framework for Safety and Traceability in Prescription Verification — PharmGraph-Auditor replaces direct LLM use in zero-tolerance pharmacist verification with a graph-based reasoning layer to overcome LLMs’ factual unreliability in high-stakes medication safety contexts.
Guardrails & Robustness
- ADVERSA: Measuring Multi-Turn Guardrail Degradation in LLMs — Moves beyond binary jailbreak pass/fail to measure how guardrail compliance degrades continuously across sustained adversarial conversations, using a fine-tuned 70B attacker model across six frontier systems.
- Tool Receipts, Not Zero-Knowledge Proofs: Practical Hallucination Detection for AI Agents — NabaOS proposes lightweight “tool receipts”—cryptographically signed execution records—as a practical alternative to ZK proofs for verifying that AI agent tool calls actually executed and returned what the model claims.
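The "tool receipt" idea above — a signed record of what a tool call actually did, checkable by an auditor — can be illustrated with a minimal sketch. This is not NabaOS's actual scheme: the record fields are assumptions, and a shared-key HMAC stands in for whatever signature mechanism the paper uses (a real deployment would likely use asymmetric signing).

```python
import hashlib
import hmac
import json

# Key held by the tool runtime; the agent never sees it, so it
# cannot forge receipts for calls that did not happen.
RUNTIME_KEY = b"tool-runtime-secret"

def issue_receipt(tool_name, args, result):
    """Tool runtime signs a record of what actually executed."""
    record = {"tool": tool_name, "args": args, "result": result}
    payload = json.dumps(record, sort_keys=True).encode()
    sig = hmac.new(RUNTIME_KEY, payload, hashlib.sha256).hexdigest()
    return {"record": record, "sig": sig}

def verify_receipt(receipt):
    """Auditor recomputes the signature over the claimed record;
    any tampering with tool, args, or result fails the check."""
    payload = json.dumps(receipt["record"], sort_keys=True).encode()
    expected = hmac.new(RUNTIME_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])
```

The point of the design is cheapness: unlike a ZK proof, verification is one hash, and a hallucinated or altered tool result simply fails to match any signed receipt.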
Key Themes
- AI in warfare and defense ethics — The Pentagon’s rejection of Anthropic’s safety guardrails as a “supply chain pollutant” crystallizes a fundamental tension: militaries want maximally compliant AI, while safety-focused labs deliberately build in refusals.
- AI-weaponized malware — Slopoly’s emergence signals that generative AI is lowering the barrier to novel malware development, compressing the time from concept to deployment.
- Labor displacement accelerating — Atlassian’s 10% cut, following Block’s, sets a pattern of large tech firms trading headcount for AI investment, echoing the “Coding After Coders” thesis in the New York Times.
- Chatbot market consolidation — Gemini’s rapid share gain at ChatGPT’s expense suggests the chatbot market is entering a competitive redistribution phase after OpenAI’s long dominance.
- Agentic AI safety gaps — Multiple research papers this week converge on a core problem: safety benchmarks test models in isolation, but real deployments wrap models in agentic scaffolds that change their safety profiles in ways that are only now being characterized.
- Hardware supply shock — NAND price spikes and Intel’s gaming CPU launch underscore semiconductor market volatility as AI accelerates hardware demand cycles.
For detailed summaries of selected research papers, see papers.md.