olmo-eval: An evaluation workbench for the model development loop
Allen AI releases olmo-eval, which moves evaluation from a final benchmarking step into the iterative development loop, with prompt-level analysis and flexible execution.
Hugging Face Blog · Jun 12, 2026