DiffusionGemma: The First Diffusion LLM (dLLM) Natively Supported in vLLM
vLLM now natively supports a discrete diffusion language model that replaces sequential token-by-token generation with parallel block denoising, trading extra compute for lower memory-bandwidth pressure to reduce latency.
vLLM Blog · Jun 10, 2026