Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents
IBM and Hugging Face introduce the VAKRA benchmark, revealing that current AI agents perform poorly on complex multi-step tasks, with key failure modes including tool-chain planning, parameter passing, and error recovery.
Hugging Face Blog · Apr 15, 2026
Previewing Interrupt 2026: Agents at Enterprise Scale
LangChain's annual conference focuses on the challenges of scaling AI agents from production validation to enterprise-wide deployment, revealing how major companies build platforms, evaluate performance, and structure teams.
LangChain Blog · Apr 10, 2026
March 2026: LangChain Newsletter
LangChain is pushing AI agents from experimental prototypes to manageable, collaborative, and securely deployable enterprise productivity tools through features like LangSmith Fleet, Skills, and Sandboxes.
LangChain Blog · Apr 2, 2026
Agents for financial services
Anthropic launches ten ready-to-run agent templates for financial services, covering tedious tasks from modeling and pitchbooks to compliance screening, marking a key step for AI agents moving from concept to large-scale industry adoption.
Anthropic News · May 5, 2026