Speculative Decoding (tag)

EAGLE 3.1: Advancing Speculative Decoding Through Collaboration Between the EAGLE Team, vLLM, and TorchSpec

EAGLE 3.1 addresses the degradation of speculative decoding performance on long-context inputs and across varied chat templates. It introduces FC normalization and a post-norm design, doubling acceptance length in long-context scenarios and making inference acceleration significantly more robust and practical.

vLLM Blog

Tag: Speculative Decoding (1 article)
