Tag: Open-Source Models (13 articles)

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding

Simon Willison reviews the open-source Ornith-1.0 model, highlighting its efficient tool calling and code understanding for agentic tasks, signaling new advances in open agentic coding models.

Simon Willison · Jun 30, 2026

GLM-5.2: Built for Long-Horizon Tasks

Z.ai releases GLM-5.2, the first open-source model to achieve stable 1M-token context and rival top closed-source models on long-horizon coding benchmarks.

Hugging Face Blog · Jun 17, 2026

Holo3.1: Fast & Local Computer Use Agents

Holo3.1 makes critical breakthroughs in environment robustness, local deployment, and real-time speed, signaling that general-purpose computer use agents are moving from capability demos to production-ready engineering.

Hugging Face Blog · Jun 2, 2026

Introducing the Ettin Reranker Family

Hugging Face has released six Ettin reranker models of varying sizes, designed to significantly improve the accuracy of search and RAG systems at low cost through a 'retrieve-then-rerank' two-stage architecture.

Hugging Face Blog · May 19, 2026

Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

IBM releases two Apache 2.0 open-source multilingual embedding models; the compact 97-million-parameter version outperforms all models of similar size on various benchmarks, demonstrating the potential of 'small but mighty' models for specific tasks.

Hugging Face Blog · May 15, 2026

Granite 4.1 LLMs: How They’re Built

IBM's Granite 4.1 series demonstrates that a meticulously engineered data pipeline and multi-stage training can enable an 8B dense model to match or exceed the performance of a previous 32B MoE model, highlighting a paradigm shift where data quality trumps parameter count.

Hugging Face Blog · Apr 29, 2026

microsoft/VibeVoice

Microsoft releases VibeVoice, an MIT-licensed Whisper-style speech model with built-in speaker diarization, capable of locally transcribing up to one hour of audio on a Mac.

Simon Willison · Apr 28, 2026

DeepSeek V4 - almost on the frontier, a fraction of the price

DeepSeek's V4 series delivers near-frontier performance at a fraction of the cost (Pro at $1.74/M input, Flash at just $0.14/M), potentially reshaping the cost-effectiveness standard for open-weight models.

Simon Willison · Apr 24, 2026

DeepSeek-V4: a million-token context that agents can actually use

DeepSeek-V4 makes million-token context windows practically usable for long-running AI agents by dramatically cutting inference costs and memory usage through its novel hybrid attention architecture.

Hugging Face Blog · Apr 24, 2026

Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

Alibaba's Qwen releases Qwen3.6-27B, a dense 27B parameter model that outperforms the previous generation's 397B MoE flagship on coding benchmarks, signaling a turning point for efficient, local-first coding models.

Simon Willison · Apr 23, 2026

Building a Fast Multilingual OCR Model with Synthetic Data

NVIDIA trained the Nemotron OCR v2 model on 12 million synthetic images, achieving high accuracy (normalized edit distance, NED, as low as 0.035) and high speed (34.7 pages/second on a single A100 GPU) across six languages. The result suggests that synthetic data is a key solution to the multilingual data bottleneck in OCR.

Hugging Face Blog · Apr 18, 2026

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

On Simon Willison's well-known 'pelican riding a bicycle' benchmark, a smaller, locally run Alibaba Qwen3.6 model outperforms the much larger, cloud-based Claude Opus 4.7 at creative SVG generation, revealing the potential of open-source models for specific tasks.

Simon Willison · Apr 17, 2026

Open Models have crossed a threshold

LangChain's evaluations show that open-source models like GLM-5 and MiniMax M2.7 now match top closed-source models on core agent tasks, while offering up to 90% cost reduction and significantly lower latency.

LangChain Blog ·