vLLM x Novita AI: PegaFlow for Production-Grade External KV Cache
vLLM and Novita AI collaborate on PegaFlow, which externalizes the KV cache into a standalone service with a three-level cache hierarchy, doubling startup speed and delivering significantly higher throughput.
vLLM Blog · May 18, 2026