Tag: Enterprise Applications (5 articles)

Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents

IBM and Hugging Face introduce the VAKRA benchmark, which shows that current AI agents perform poorly on complex multi-step tasks; key failure modes include tool-chain planning, parameter passing, and error recovery.

Hugging Face Blog · Apr 15, 2026

Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic

The key to scaling enterprise AI isn't better prompts or larger models, but Agent Logic: using deterministic software engineering primitives to constrain and steer LLMs for reliable, cost-effective execution.

Hugging Face Blog

Agents for financial services

Anthropic launches ten ready-to-run agent templates for financial services, covering tedious tasks from modeling and pitchbooks to compliance screening, marking a key step for AI agents moving from concept to large-scale industry adoption.

Anthropic News · May 5, 2026

Previewing Interrupt 2026: Agents at Enterprise Scale

LangChain previews its Interrupt 2026 conference, where the focus shifts from 'Can agents work in production?' to 'How do we deploy them at enterprise scale?', tackling core challenges like evaluation, team structure, and infrastructure.

LangChain Blog

Anthropic's Claude Tag: When AI Becomes a 'Permanently Online Colleague,' How Will Work Patterns Be Reshaped?

Anthropic launched Claude Tag, deeply integrating AI into team collaboration spaces like Slack with capabilities for multi-user collaboration, long-term memory, and proactive asynchronous work, marking a paradigm shift from AI as a tool to a 'digital colleague'.

Anthropic News