AI Daily

Tuesday, May 19, 2026

Google Unveils Gemini 3.5: A New Era of Agentic 'Frontier Intelligence'

At Google I/O 2026, Google introduced the Gemini 3.5 model series, headlined by the highly efficient Gemini 3.5 Flash. The release marks a strategic pivot from passive chat interfaces to active AI agents, emphasizing 'intelligence with action', in which models orchestrate tasks across Google's ecosystem. The update includes advanced multimodal capabilities and shifts Search toward natural-language 'AI Mode' queries, which have seen rapid adoption in the U.S. The Gemini 3.5 series is designed to handle complex, multi-step workflows directly within Workspace applications. Key additions include new voice capabilities for Gmail and Docs, a design tool called Google Pics, and upgraded AI Inbox features. By integrating these models deeply into Search and its productivity tools, Google aims to move beyond keyword-based retrieval toward an action-oriented assistant.

Hacker News · Google AI

Andrej Karpathy Joins Anthropic to Lead Frontier LLM Research

Andrej Karpathy, a founding member of OpenAI and former Director of AI at Tesla, has joined Anthropic to focus on R&D. Karpathy stated that he believes the next few years at the frontier of LLMs will be 'especially formative,' signaling his return to fundamental research after a period focused on educational content. His move is seen as a significant win for Anthropic as it continues to compete with OpenAI and Google for elite research talent.

Twitter/@karpathy

ICRL: Internalizing Self-Critique directly into Model Weights

Researchers have proposed Internalized Critique Reinforcement Learning (ICRL), a method that lets LLMs absorb the benefits of self-correction without requiring an external critic model in the inference loop. In current agentic frameworks, a 'critic' is often used to review and correct a model's output, which adds significant latency and cost. ICRL uses reinforcement learning to bake this corrective guidance directly into the model's weights, so the agent improves over time without permanent reliance on a frozen external critic.

arxiv/cs.AI

NIMO Controller Leverages Model Context Protocol for Scientific Labs

A new framework called NIMO Controller has been introduced to orchestrate 'self-driving laboratories' (SDLs) using the Model Context Protocol (MCP). Developing software for autonomous laboratories has historically been technically demanding due to a lack of standardized interfaces for AI agents. By utilizing MCP, NIMO provides a unified orchestration layer that allows LLM-based agents to interact directly with hardware and scientific equipment, significantly lowering the barrier for AI-driven scientific discovery.

arxiv/cs.AI

Securing Enterprise Agents with State-Constrained Dispatch and Proofs

New research into State-Constrained Dispatch (SDOF) and Verifiable Agentic Infrastructure addresses the growing security risks of autonomous agents in enterprise and sovereign cloud settings. As agents move toward higher autonomy, traditional 'identity-centric' authorization, which assumes that any caller with valid credentials is safe, becomes a liability. These frameworks propose treating multi-agent execution as a constrained state machine with proof-derived authorization, so that agents cannot dispatch semantically unsafe actions that deviate from established business processes.
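
The contrast between identity-centric and state-constrained authorization can be illustrated with a minimal Python sketch. It is not the papers' implementation: the invoice workflow, the `ConstrainedDispatcher` class, and every action name are hypothetical, and the proof-derived authorization step is not modeled. The only point shown is that valid credentials alone do not permit an action the current process state does not allow.

```python
from dataclasses import dataclass, field

# Hypothetical business process: (current state, action) -> next state.
# Any pair not listed here is treated as an unsafe deviation.
ALLOWED_TRANSITIONS = {
    ("draft", "submit_invoice"): "submitted",
    ("submitted", "approve_invoice"): "approved",
    ("approved", "issue_payment"): "paid",
}


class DispatchDenied(Exception):
    """Raised when an action is rejected by the dispatcher."""


@dataclass
class ConstrainedDispatcher:
    state: str = "draft"
    log: list = field(default_factory=list)

    def dispatch(self, agent_id: str, has_valid_credentials: bool, action: str) -> str:
        # Identity check: necessary, but not sufficient on its own.
        if not has_valid_credentials:
            raise DispatchDenied(f"{agent_id}: invalid credentials")
        # State check: the action must be a permitted transition from the current state.
        key = (self.state, action)
        if key not in ALLOWED_TRANSITIONS:
            raise DispatchDenied(f"{agent_id}: '{action}' not allowed in state '{self.state}'")
        self.state = ALLOWED_TRANSITIONS[key]
        self.log.append((agent_id, action, self.state))
        return self.state
```

In this sketch, an agent with valid credentials that calls `dispatch("agent-1", True, "issue_payment")` on a fresh dispatcher is rejected, because payment is not reachable from the `draft` state.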

arxiv/cs.AI · arxiv/cs.AI

Optimizing Coding Agents via SkillSmith and Multi-Rubric Context Pruning

Efficiency in LLM-powered coding agents is receiving a major boost from two new methodologies: SkillSmith and Multi-Rubric Latent Reasoning for context pruning. SkillSmith reduces redundant reasoning by compiling agent skills into boundary-guided runtime interfaces, while the new pruning method identifies and removes irrelevant code from the context window using multi-objective latent reasoning. These approaches address the 'modeling bottleneck' where agents spend excessive token budgets reading repository files that are unnecessary for the task at hand.
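
As a rough illustration of the pruning idea only, the Python sketch below scores repository files against a task under several rubrics and keeps the relevant ones that fit a token budget. It deliberately swaps in simple word-overlap heuristics for the paper's latent reasoning, and the rubric names, weights, and `prune_context` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RepoFile:
    path: str
    text: str


def rubric_scores(task: str, f: RepoFile) -> dict:
    """Score one file under two hypothetical rubrics, each in [0, 1]."""
    task_words = set(task.lower().split())
    file_words = set(f.text.lower().split())
    path_words = set(
        f.path.lower().replace("/", " ").replace(".", " ").replace("_", " ").split()
    )
    denom = max(len(task_words), 1)
    return {
        "content_overlap": len(task_words & file_words) / denom,
        "path_overlap": len(task_words & path_words) / denom,
    }


def prune_context(task: str, files: list, token_budget: int, weights=None) -> list:
    """Keep the highest-scoring files that fit the budget; drop zero-score files."""
    weights = weights or {"content_overlap": 0.5, "path_overlap": 0.5}
    scored = []
    for f in files:
        scores = rubric_scores(task, f)
        total = sum(weights[name] * scores[name] for name in weights)
        scored.append((total, f))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    kept, used = [], 0
    for total, f in scored:
        cost = len(f.text.split())  # crude token estimate: whitespace-separated words
        if total > 0 and used + cost <= token_budget:
            kept.append(f)
            used += cost
    return kept
```

A real system would replace the overlap heuristics with learned relevance signals; the budgeted selection loop is the part that carries over.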

arxiv/cs.AI · arxiv/cs.AI

Advancing Formal Reasoning: Agentic Evolution and Mathematical Proving

New developments are bridging the gap between neural intuition and symbolic rigor in hard reasoning tasks. The Solvita framework introduces 'agentic evolution' for competitive programming, allowing LLMs to learn from previous debugging experiences rather than discarding that data after each session. Meanwhile, new research in automated polynomial inequality proving uses LLM-generated conjectures to produce 'Sum-of-Squares' certificates that can be formally verified in the Lean theorem prover, offering a scalable way to prove inequalities that were previously computationally prohibitive.
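
For readers unfamiliar with Sum-of-Squares certificates, a toy example in Lean 4 may help. It assumes Mathlib is available and is far simpler than anything in the research described: the polynomial x^2 + y^2 - 2xy is shown to be nonnegative by exhibiting it as the single square (x - y)^2. The theorem names are illustrative.

```lean
import Mathlib

-- The certificate: the polynomial equals a sum of squares (here, one square).
theorem sos_identity (x y : ℝ) : x^2 + y^2 - 2*x*y = (x - y)^2 := by
  ring

-- Nonnegativity follows because every square of a real number is nonnegative.
theorem poly_nonneg (x y : ℝ) : 0 ≤ x^2 + y^2 - 2*x*y := by
  rw [sos_identity]
  exact sq_nonneg (x - y)
```

The division of labor is the appeal: finding the right squares is a search problem suited to a model's conjectures, while checking them reduces to algebra that a proof assistant can verify mechanically.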

arxiv/cs.AI · arxiv/cs.AI

OpenAI Implements New Content Provenance and Verification Tools

OpenAI has announced a major update to its content provenance efforts, integrating Content Credentials (C2PA) and Google’s SynthID watermarking technology. The initiative includes a new verification tool designed to help users identify AI-generated media. These measures are intended to build a safer and more transparent AI ecosystem by providing verifiable metadata and watermarking that persists across different platforms, helping to combat the spread of misinformation and deepfakes.

OpenAI

KPMG Deploys Claude AI to 276,000 Employees in Global Alliance

Professional services giant KPMG has announced a strategic alliance with Anthropic to integrate the Claude AI model across its entire global workforce of more than 276,000 people. This deployment is one of the largest corporate rollouts of LLM technology to date. KPMG intends to use Claude to augment its core business operations, audit workflows, and tax services, signaling a major move toward full-scale AI integration in the professional services sector.

Anthropic