    Repositories list

    • recipes

      Public
      Common recipes to run vLLM
      JavaScript
      Apache License 2.0
      2306951538 Updated Apr 21, 2026
    • tpu-inference

      Public
      TPU inference for vLLM, with unified JAX and PyTorch support.
      Python
      Apache License 2.0
      16829653181 Updated Apr 21, 2026
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      16k 78k 1.8k 2.5k Updated Apr 21, 2026
    • speculators

      Public
      A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
      Python
      Apache License 2.0
      713571932 Updated Apr 21, 2026
    • vllm-omni

      Public
      A framework for efficient model inference with omni-modality models
      Python
      Apache License 2.0
      8074.4k389320 Updated Apr 21, 2026
    • guidellm

      Public
      Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
      Python
      Apache License 2.0
      1451k6126 Updated Apr 21, 2026
    • ci-infra

      Public
      This repo hosts code for vLLM CI & Performance Benchmark infrastructure.
      HCL
      Apache License 2.0
      6735041 Updated Apr 21, 2026
    • vllm-xpu-kernels

      Public
      The vLLM XPU kernels for Intel GPU
      C++
      Apache License 2.0
      50341433 Updated Apr 21, 2026
    • vllm-ascend

      Public
      Community maintained hardware plugin for vLLM on Ascend
      Python
      Apache License 2.0
      1.1k 2k 1.3k 404 Updated Apr 21, 2026
    • vllm-gaudi

      Public
      Community maintained hardware plugin for vLLM on Intel Gaudi
      Python
      Apache License 2.0
      12838471 Updated Apr 21, 2026
    • compressed-tensors

      Public
      A safetensors extension to efficiently store sparse quantized tensors on disk
      Python
      Apache License 2.0
      832731214 Updated Apr 21, 2026
    • semantic-router

      Public
      System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
      Go
      Apache License 2.0
      6363.8k10170 Updated Apr 21, 2026
    • llm-compressor

      Public
      Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
      Python
      Apache License 2.0
      4873.1k6363 Updated Apr 21, 2026
    • vllm-metal

      Public
      Community maintained hardware plugin for vLLM on Apple Silicon
      Python
      Apache License 2.0
      9394830 Updated Apr 21, 2026
    • production-stack

      Public
      vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
      Python
      Apache License 2.0
      3912.3k9766 Updated Apr 21, 2026
    • vllm-daily

      Public
      vLLM Daily Summarization of Merged PRs
      45000 Updated Apr 20, 2026
    • FlashMLA

      Public
      C++
      MIT License
      1k1103 Updated Apr 20, 2026
    • flash-attention

      Public
      Fast and memory-efficient exact attention
      Python
      BSD 3-Clause "New" or "Revised" License
      2.6k121024 Updated Apr 20, 2026
    • vllm-project.github.io

      Public
      HTML
      853614 Updated Apr 19, 2026
    • router

      Public
      A high-performance and light-weight router for vLLM large scale deployment
      Rust
      Apache License 2.0
      711991318 Updated Apr 17, 2026
    • aibrix

      Public
      Cost-efficient and pluggable Infrastructure components for GenAI inference
      Go
      Apache License 2.0
      5614.7k28240 Updated Apr 17, 2026
    • dllm-plugin

      Public
      vLLM plugin for block-based diffusion language model (dLLM) support
      Python
      Apache License 2.0
      512110 Updated Apr 16, 2026
    • agentic-api

      Public
      Stateful API logic for agentic applications using vLLM
      Python
      Apache License 2.0
      92313 Updated Apr 16, 2026
    • bart-plugin

      Public
      vLLM Model plugin for the encoder-decoder BART model
      Python
      Apache License 2.0
      71116 Updated Apr 10, 2026
    • vllm-skills

      Public
      Agent skills for vLLM
      Shell
      Apache License 2.0
      186532 Updated Apr 3, 2026
    • vllm-neuron

      Public
      Community maintained hardware plugin for vLLM on AWS Neuron
      Python
      Apache License 2.0
      112931 Updated Mar 20, 2026
    • perf-dashboard

      Public
      Performance dashboard for vLLM
      Python
      2101 Updated Mar 10, 2026
    • media-kit

      Public
      vLLM Logo Assets
      5720 Updated Jan 15, 2026
    • vLLM-in-PyTorch-Conference-2025

      Public
      11200 Updated Dec 14, 2025
    • vllm-openvino

      Public
      Python
      Apache License 2.0
      124531 Updated Dec 4, 2025