Tag: Tool Calling (4 articles)

Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents

IBM and Hugging Face introduce the VAKRA benchmark, which shows that current AI agents perform poorly on complex multi-step tasks. Key failure modes include tool-chain planning, parameter passing, and error recovery.

Hugging Face Blog · Apr 15, 2026

LLM Powered Autonomous Agents

LLM-powered autonomous agents combine planning, memory, and tool use, showing their potential for handling complex tasks and pointing to a significant shift in how work gets done.

Lilian Weng · Jun 23, 2023
BitByAI — AI-powered, AI-evolved AI News