AI & Tech News Digest — March 3, 2026

Highlights

OpenAI and Google both shipped new models on the same day: GPT-5.3 Instant targets smoother everyday conversation with less “cringe” over-caution, while Gemini 3.1 Flash-Lite brings configurable thinking levels at $0.25 per million input tokens, though its output pricing is triple that of its predecessor.
Perplexity’s Comet browser was hijacked via a calendar invite, enabling credential theft from 1Password — a sharp demonstration of how browser-integrated AI agents expand the attack surface.
AI companies are spending $125M via tech-billionaire-backed super PACs to defeat congressional candidates pushing for AI regulation, raising concerns about the industry’s influence over AI governance.
Apple launched MacBook Pro with M5 Pro and M5 Max, delivering a claimed 4× AI performance boost over M4, with 1TB storage now standard — signaling hardware-accelerated inference moving into the mainstream.
Anthropic pitched Claude for a Pentagon drone swarm competition but lost to defense contractors partnered with SpaceX/xAI and OpenAI. Separately, the US State Department replaced Claude with GPT-4.1 in federal deployments.

News

AI Security

Calendar invite hijacks Perplexity Comet browser — Security researchers showed a calendar invite is sufficient to hijack Comet and steal 1Password credentials, highlighting critical attack vectors in agentic browser environments. (The Decoder)
LLMs can unmask pseudonymous users at scale — Research shows LLMs can de-anonymize users with surprising accuracy, undermining pseudonymity as a privacy strategy. (Ars Technica)
AI agents as “identity dark matter” — Model Context Protocol (MCP) integrations let LLM agents act as powerful, invisible, unmanaged enterprise identities, posing significant IAM risks. (The Hacker News)
CyberStrikeAI deployed in FortiGate attacks across 55 countries — Open-source AI-assisted attack platform used in coordinated global campaigns targeting FortiGate infrastructure. (The Hacker News)
Starkiller phishing suite bypasses MFA via AitM proxy — Jinkusu threat group deploys reverse-proxy platform to intercept and relay legitimate login sessions, defeating multi-factor authentication. (The Hacker News)
Meta sends private Ray-Ban glasses footage to Kenya with minimal safeguards — Data annotators in Nairobi manually reviewed footage including nude content and banking details, raising GDPR concerns in Europe. (The Decoder)
Smart glasses detection app alerts users to nearby recording — New app warns when AI glasses wearers are nearby, addressing growing ambient surveillance concerns. (Gigazine)

USA

OpenAI releases GPT-5.3 Instant — Designed for smoother, more natural everyday conversation with reduced over-cautious responses; system card also published. (OpenAI)
Google launches Gemini 3.1 Flash-Lite — Four configurable thinking levels, 2.5× faster output than Gemini 2.5 Flash, targeting bulk and SaaS agentic use cases — though output cost tripled vs predecessor. (Google AI Blog)
Claude Code adds voice mode — Anthropic expands its agentic coding tool with voice interaction capability. (TechCrunch)
AI companies spend $125M against pro-regulation congressional candidate — Tech-billionaire PAC targets NY candidate Alex Bores pushing AI oversight legislation. (TechCrunch)
Anthropic pitched Claude for Pentagon drone swarm competition — Lost bid to defense contractors partnered with SpaceX/xAI and OpenAI, underscoring the militarization of frontier AI. (The Decoder)
State Department replaces Claude with GPT-4.1 — Federal agencies shifting AI providers amid broader government AI vendor consolidation. (The Decoder)
OpenAI adds safeguards to Pentagon contract — Following public backlash over the contract’s scope, Sam Altman introduced clauses excluding mass surveillance and autonomous lethal weapons. (The Decoder)
US Supreme Court declines AI copyright appeal — The Court declined to hear an appeal arguing that AI-generated images can be copyrighted, leaving intact the position that AI cannot own intellectual property. (Gigazine)
Apple MacBook Pro gets M5 Pro and M5 Max — 4× AI performance boost over M4, 1TB standard storage, starting at ¥278,800 in Japan. (ITmedia)
AI users work longer hours, not shorter — Studies reveal a productivity paradox: frequent AI tool users report longer working hours, prompting questions about efficiency claims. (ITmedia AI+)
Google Gemini can now order groceries and book rides on Pixel — March Pixel Drop update enables Gemini to complete real-world actions end-to-end. (The Verge)
X to suspend creators from revenue sharing over unlabeled AI content of armed conflict — Creators who post AI-generated war imagery without labels face three-month revenue-share suspensions. (TechCrunch)

Europe

European capitals push back on Ukraine’s fast-track EU membership bid — Ukraine seeks 2027 accession as part of war settlement; major EU states resist the accelerated timeline. (The Japan Times)
Meta’s Ray-Ban data practices may trigger EU privacy regulators — Sending footage from European users to Kenyan contractors with minimal safeguards likely violates GDPR frameworks. (The Decoder)
Deepfake verification guide after Iran conflict — Experts explain how disinformation—including AI-manipulated footage and gaming clips—spread during the conflict, and how to identify fakes. (The Verge)

Japan

Japan and businesses scramble amid Middle East conflict — Vessels halted, flights cancelled, employees evacuated; the effective closure of the Strait of Hormuz raises crude prices and economic risks. (The Japan Times)
Flight prices soar on Asia–Europe routes — Dubai hub closure reduces capacity on key routes, inflating ticket prices significantly. (The Japan Times)
PM Takaichi to discuss Iran strikes with Trump at US summit — Japan’s PM cannot yet assess the legality of the US-Israeli strikes; an editorial calls for diplomatic efforts. (The Japan Times)
Bank of Japan faces harder policy path as oil prices rise — High crude prices from Hormuz disruption complicate monetary policy decisions. (The Japan News)
Tokyo High Court to rule on Unification Church dissolution — Ruling expected Wednesday on government request to dissolve the organization. (The Japan News)
Foreign driver’s license conversion pass rates plunge — Written test pass rate dropped to 42.8%, practical to 13.1%, after October 2025 rule tightening. (The Japan Times)
LDP begins talks to revise Japan’s three security documents — Discussions underway on National Security Strategy revision amid regional tensions. (The Japan News)
Figure skater Riku Miura returns home with Olympic gold — Japan’s first pairs gold medalist visits hometown following Milano Cortina 2026 victory. (The Japan News)

Research Papers

AI

MetaMind: General Cognitive World Models via Meta-Theory of Mind — Proposes a framework enabling AI systems to model other agents’ mental states generically, advancing social cognition in AI. (arXiv:2603.00808)
Monotropic AI: Domain-Specialized Language Models — Argues for deep specialization over generalization; introduces “monotropic” models optimized for narrow, high-stakes domains. (arXiv:2603.00350)

Agents

Multi-Sourced, Multi-Agent Evidence Retrieval for Fact-Checking — System using multiple specialized agents to retrieve and cross-validate evidence from heterogeneous sources for automated fact verification. (arXiv:2603.00267)
EmCoop: Benchmark for Embodied Cooperation Among LLM Agents — Framework and benchmark evaluating how LLM-based agents coordinate in embodied environments requiring joint action. (arXiv:2603.00349)
K²-Agent: Co-Evolving Know-What and Know-How for Mobile Device Control — Agent architecture for mobile OS control that jointly learns task decomposition (know-what) and execution skills (know-how). (arXiv:2603.00676)
HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents — Addresses long-horizon task failure by separating high-level planning from low-level action in a hierarchical architecture. (arXiv:2603.00977)

Reasoning

Draft-Thinking: Efficient Reasoning in Long Chain-of-Thought LLMs — Introduces a “draft” phase that prunes unnecessary reasoning steps before committing, improving token efficiency in CoT reasoning. (arXiv:2603.00578)
LOGIGEN: Logic-Driven Generation of Verifiable Agentic Tasks — Generates agent evaluation tasks with formal logical constraints, enabling automatic verifiability of task completion. (arXiv:2603.00540)

Safety

Tracking Capabilities for Safer Agents — Applies programming-language type-tracking techniques to constrain what capabilities agentic AI systems can access at runtime. (arXiv:2603.00991)
Fair in Mind, Fair in Action? Fairness in Unified Multimodal LLMs — Audits whether fairness reasoning in UMLLMs translates to fair outputs across vision-language tasks; finds significant gaps. (arXiv:2603.00590)

Benchmarks

TraderBench: Robustness of AI Agents in Adversarial Capital Markets — Evaluates how AI trading agents perform against adversarial market participants, probing brittleness under strategic pressure. (arXiv:2603.00285)
The Synthetic Web: Adversarially-Curated Mini-Internets — Constructs small, controlled web environments to diagnose epistemic failures in language agents navigating information sources. (arXiv:2603.00801)

Applied AI

MED-COPILOT: Medical Assistant Powered by GraphRAG — Clinical decision-support system integrating patient histories and case trajectories via graph-based retrieval-augmented generation. (arXiv:2603.00460)
SWE-Hub: Unified System for Scalable Software Engineering Tasks — Unified benchmark and agent framework for end-to-end software engineering, covering bug fixing, feature addition, and code review. (arXiv:2603.00575)

Key Themes

Model release cycles compress: Same-day launches from OpenAI and Google signal an intensifying release cadence, with both models aimed at cost-efficient, scalable deployment rather than raw capability.
AI in defense and geopolitics: Anthropic, OpenAI, and others are now visibly competing for Pentagon contracts — and facing backlash requiring public safeguard commitments. The intersection of AI and armed conflict (Iran, drone swarms, deepfakes) dominated the news cycle.
Agentic AI security gap: From the Comet browser hijacking to identity dark matter, the day’s news reinforced that agentic AI systems create novel attack surfaces that security tooling is not yet equipped to address.
Privacy eroding on multiple fronts: LLM de-anonymization, Ray-Ban footage routing, and smart glasses surveillance all highlight how AI is systematically undermining conventional privacy assumptions.
AI governance battles: $125M in PAC spending against pro-regulation candidates, the Supreme Court’s refusal to hear an AI copyright appeal, and federal AI vendor churn all point to AI policy entering an intensely contested phase.

For detailed summaries of selected research papers, see papers.md.