research-llm-apis 2026-04-04
Simon Willison used AI to analyze raw HTTP APIs from Anthropic, OpenAI, Gemini, and Mistral to redesign LLM library's abstraction layer.
Key Points
- LLM Python library undergoing major redesign to support server-side tool execution
- Claude Code read through four major vendors' Python clients and generated curl commands
- All scripts and raw outputs are open-source, valuable for studying API differences
Analysis
Peeling Back the API Layers: A Head-to-Head Comparison of Four Major LLM HTTP Interfaces
Simon Willison is undertaking a major architectural overhaul of his LLM Python library. The impetus behind this upgrade is quite practical: over the past year, major LLM providers have been rolling out new features (such as server-side tool execution), and the existing abstraction layer simply can't keep up.
A Unique Research Methodology
Willison didn't just pore over API documentation – instead, he used Claude Code to directly analyze the Python client library source code from Anthropic, OpenAI, Gemini, and Mistral. He then had the AI generate curl commands to access the raw JSON interfaces.
The brilliance of this approach lies in the fact that documentation can become outdated, but code never lies.
The Output
All scripts and captured outputs have been compiled into a new repository called research-llm-apis, covering:
- Complete request/response pairs in both streaming and non-streaming modes.
- API behavior comparisons across various scenarios.
- Implementation differences in features like tool calling and structured output among different providers.
What This Means for Developers
If you're working on multi-model switching (i.e., calling different providers' models through a unified interface), this resource is invaluable. You'll discover:
- Significant differences in streaming protocol details between providers.
- Inconsistent parameter formats and response structures for tool calling.
- Non-uniform error handling and retry mechanisms.
This also explains why abstraction layers like llm and LangChain are becoming increasingly difficult to maintain – each provider is constantly adding new bells and whistles to their APIs, forcing the abstraction layer to play catch-up.
Willison's research provides a code-based, factual foundation for designing the next generation of LLM abstraction layers, moving away from guesswork based on potentially outdated documentation.
Analysis generated by BitByAI · Read original English article