Tag: Model Training (4 articles)

Granite 4.1 LLMs: How They’re Built

IBM's Granite 4.1 series shows that a carefully engineered data pipeline and multi-stage training can let an 8B dense model match or exceed the performance of a previous 32B MoE model, suggesting that data quality can matter more than parameter count.

Hugging Face Blog · Apr 29, 2026

Introducing talkie: a 13B vintage language model from 1930

A 13B model trained exclusively on pre-1931 text aims to explore reasoning, creativity, and 're-discovery' within a fixed knowledge boundary, prompting new discussion of data copyright and model purity.

Simon Willison · Apr 28, 2026