GitHub - hao-ai-lab/FastVideo: A unified inference and post-training framework for accelerated video generation.

FastVideo is a unified post-training and real-time inference framework for accelerated video generation.

NEWS

2026/06/23: Release FastWan-QAD: 5s of video generated in 1.8s end-to-end. See the FastWan-QAD models and check out the Blog.
2026/03/17: Release demo: Into the Dreamverse: Vibe Directing in FastVideo, check out the Blog.
2026/03/13: Release demo: Create a 5s 1080p Video in 4.5s with FastVideo on a Single GPU, check out the Blog.
2025/11/19: Release CausalWan2.2 I2V A14B Preview models, Blog, and Inference Code!
2025/08/04: Release FastWan models and Sparse-Distillation.

More News

2025/06/14: Release finetuning and inference code for VSA.
2025/04/24: FastVideo V1 is released!
2025/02/18: Release the inference code for Sliding Tile Attention.

Key Features

FastVideo has the following features:

End-to-end post-training support for bidirectional and autoregressive models:
- Support full finetuning and LoRA finetuning for state-of-the-art open video DiTs
- Data preprocessing pipeline for video, image, and text data
- Distribution Matching Distillation (DMD2) stepwise distillation
- Sparse attention with Video Sparse Attention
- Sparse distillation to achieve >50x denoising speedup
- Scalable training with FSDP2, sequence parallelism, and selective activation checkpointing
- Causal distillation through Self-Forcing
- See this page for the full list of supported models and recipes.
State-of-the-art performance optimizations for inference
- Sequence Parallelism for distributed inference
- Multiple state-of-the-art attention backends
- User-friendly CLI and Python API
- See this page for the full list of supported optimizations.
Diverse hardware and OS support
- Support H100, A100, and 4090 GPUs
- Support Linux, Windows, and macOS
- See this page for the full list of supported models, hardware assumptions, and optimization compatibility.
Realtime video generation & editing
- Dreamverse: stream and "vibe direct" video in realtime (live demo), deployable on local GPU, a self-hosted B200 server, Docker, or serverless Modal

Getting Started

We recommend using uv to create a clean environment. If you previously used Conda, switching to uv generally gives faster and more stable installs.

# Create and activate a new uv environment
uv venv --python 3.12 --seed
source .venv/bin/activate

# Install FastVideo on NVIDIA CUDA 12
UV_TORCH_BACKEND=cu126 uv pip install fastvideo

Use UV_TORCH_BACKEND=cu130 on CUDA 13. Apple silicon users should follow the MPS installation guide.

Please see our docs for more detailed installation instructions.

On an NVIDIA DGX Spark (GB10 / ARM64 + CUDA 13), there is no prebuilt ARM wheel for the FastVideo CUDA kernel. Install from source in editable mode instead (UV_TORCH_BACKEND=cu130 uv pip install -e ., which compiles that kernel for you) rather than running UV_TORCH_BACKEND=cu130 uv pip install fastvideo. A compatible prebuilt ARM64 FlashAttention wheel is available separately. Follow the DGX Spark install guide.
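
The backend choices above can be summarized in a small helper. This is only a sketch: `install_command` is a hypothetical function, not part of FastVideo, and it maps just the two backends named in this README (`cu126` for CUDA 12, `cu130` for CUDA 13). Treating every `aarch64` + CUDA 13 machine as a DGX Spark is an assumption made for illustration.

```python
def install_command(cuda_major: int, machine: str = "x86_64") -> str:
    """Return the uv install command this README suggests for a CUDA major version.

    Hypothetical helper for illustration; it is not shipped with FastVideo.
    """
    # Only the two backends named in this README are mapped.
    backends = {12: "cu126", 13: "cu130"}
    if cuda_major not in backends:
        raise ValueError(f"this README lists no backend for CUDA {cuda_major}")
    backend = backends[cuda_major]

    # Assumption: aarch64 + CUDA 13 means a DGX Spark, which has no prebuilt
    # kernel wheel and therefore needs the editable from-source install.
    if machine == "aarch64" and cuda_major == 13:
        target = "-e ."
    else:
        target = "fastvideo"
    return f"UV_TORCH_BACKEND={backend} uv pip install {target}"


if __name__ == "__main__":
    print(install_command(12))
    print(install_command(13, machine="aarch64"))
```

Run the printed command inside the activated uv environment. For anything outside these two cases, the installation docs remain the reference.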

Install with an AI coding agent

FastVideo is a monorepo with rich agent guidance (see AGENTS.md). If you use Claude Code, Cursor, or another coding agent, paste the prompt below — it detects your platform and follows the matching guide:

Install FastVideo (https://github.com/hao-ai-lab/FastVideo) into a fresh uv virtual environment.

1. Detect the platform: run `uname -m`, `nvidia-smi`, and `nvcc --version`.
2. Read and follow the matching install guide exactly (in this repo, or at
   https://hao-ai-lab.github.io/FastVideo/getting_started/installation/):
     - NVIDIA GPU, x86_64                         -> docs/getting_started/installation/gpu.md
     - NVIDIA DGX Spark / GB10, aarch64, CUDA 13  -> docs/getting_started/installation/spark.md
     - Apple Silicon, macOS                       -> docs/getting_started/installation/mps.md
3. Use uv for every step. If a command fails, debug it and tell me what you changed.
4. Verify the result:
     python -c "import fastvideo, torch; print('cuda', torch.cuda.is_available())"
     fastvideo --help
5. Report which platform you detected and any deviations you had to make.
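
If you would rather check the platform yourself before handing off to an agent, step 2 of the prompt can be approximated in Python. This is a rough sketch, not FastVideo tooling: `pick_install_guide` is a hypothetical name, and using the presence of `nvidia-smi` on the PATH as a stand-in for "has an NVIDIA GPU" is an assumption. The guide paths are the ones listed in the prompt above.

```python
import platform
import shutil
from typing import Optional

# Guide paths copied from the prompt above.
GUIDES = {
    "gpu": "docs/getting_started/installation/gpu.md",
    "spark": "docs/getting_started/installation/spark.md",
    "mps": "docs/getting_started/installation/mps.md",
}


def pick_install_guide(system: str, machine: str, has_nvidia: bool) -> Optional[str]:
    """Map a detected platform to one of the three guides, or None if none matches."""
    if system == "Darwin" and machine == "arm64":
        return GUIDES["mps"]    # Apple Silicon, macOS
    if has_nvidia and machine == "x86_64":
        return GUIDES["gpu"]    # NVIDIA GPU, x86_64
    if has_nvidia and machine == "aarch64":
        return GUIDES["spark"]  # NVIDIA DGX Spark / GB10, aarch64
    return None


if __name__ == "__main__":
    # Assumption: nvidia-smi on the PATH implies an NVIDIA GPU is present.
    has_nvidia = shutil.which("nvidia-smi") is not None
    guide = pick_install_guide(platform.system(), platform.machine(), has_nvidia)
    print(guide or "No matching guide; see the installation docs.")
```

The sketch covers only the three platforms named in the prompt; any other combination returns `None` rather than guessing.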

Sparse Distillation

For our sparse distillation techniques, please see our distillation docs and check out our blog.

See below for recipes and datasets:

Model	Sparse Distillation	Dataset
FastWan2.1-T2V-1.3B	Recipe	FastVideo Synthetic Wan2.1 480P
FastWan2.2-TI2V-5B	Recipe	FastVideo Synthetic Wan2.2 720P

Dreamverse — Realtime Video Generation & Editing

Dreamverse is FastVideo's realtime video generation and editing platform — "vibe directing" a video as it streams. It lives in the monorepo under apps/dreamverse/ and ships its own backend (dreamverse-server) plus a web UI.

Try the live demo, read the blog, or run it yourself. Dreamverse deploys on a local GPU, a self-hosted B200 server over SSH, Docker, or serverless Modal — see the Dreamverse README.

Inference

Generating Your First Video

Here's a minimal example to generate a video using the default settings. Make sure VSA kernels are installed. Create a file called example.py with the following code:

import os
from fastvideo import VideoGenerator

def main():
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = "VIDEO_SPARSE_ATTN"

    # Create a video generator with a pre-trained model
    generator = VideoGenerator.from_pretrained(
        "FastVideo/FastWan2.1-T2V-1.3B-Diffusers",
        num_gpus=1,  # Adjust based on your hardware
    )

    # Define a prompt for your video
    prompt = "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest."

    # Generate the video
    video = generator.generate_video(
        prompt,
        output_path="my_videos/",  # Controls where videos are saved
        save_video=True
    )

if __name__ == '__main__':
    main()

Run the script with:

python example.py

For a more detailed guide, please see our inference quick start.
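
To generate several videos with one loaded model, the same `generate_video` call can be wrapped in a loop. The sketch below is illustrative only: `generate_batch` is a hypothetical helper, and the per-prompt subfolder layout is an assumption (it presumes the generator creates the output folder, as in the single-video example). It uses only the `generate_video` arguments shown in example.py. The `_DryRunGenerator` class is a stand-in so the sketch runs without a GPU.

```python
def generate_batch(generator, prompts, output_root="my_videos"):
    """Call generate_video once per prompt, giving each prompt its own output folder.

    `generator` is any object exposing generate_video(prompt, output_path=..., save_video=...),
    such as the VideoGenerator created in example.py.
    """
    output_dirs = []
    root = output_root.rstrip("/")
    for index, prompt in enumerate(prompts):
        # Assumed layout: my_videos/prompt_000/, my_videos/prompt_001/, ...
        out_dir = f"{root}/prompt_{index:03d}/"
        generator.generate_video(prompt, output_path=out_dir, save_video=True)
        output_dirs.append(out_dir)
    return output_dirs


class _DryRunGenerator:
    """Stand-in that only prints the calls; replace it with a real VideoGenerator."""

    def generate_video(self, prompt, output_path, save_video):
        print(f"would generate {prompt!r} into {output_path} (save_video={save_video})")


if __name__ == "__main__":
    prompts = [
        "A curious raccoon peers through a vibrant field of yellow sunflowers.",
        "A paper boat drifts down a rain-soaked street at dusk.",
    ]
    generate_batch(_DryRunGenerator(), prompts)
```

With FastVideo installed, pass the generator returned by `VideoGenerator.from_pretrained(...)` in place of the stand-in, so the model is loaded once and reused for every prompt.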

More Guides

Awesome work using FastVideo or our research projects

SGLang: SGLang's diffusion inference functionality is based on a fork of FastVideo taken on Sept. 24, 2025.
DanceGRPO: A unified framework to adapt Group Relative Policy Optimization (GRPO) to visual generation paradigms. Code based on FastVideo.
SRPO: A method to directly align the full diffusion trajectory with fine-grained human preference. Code based on FastVideo.
DCM: Dual-expert consistency model for efficient and high-quality video generation. Code based on FastVideo.
HY-WorldPlay: An action-conditioned world model trained using the FastVideo framework.
Hunyuan Video 1.5: A leading lightweight video generation model whose authors proposed SSTA, based on Sliding Tile Attention.
Kandinsky-5.0: A family of diffusion models for video & image generation, where their NABLA attention includes a Sliding Tile Attention branch.
LongCat Video: A foundational video generation model with 13.6B parameters that uses block-sparse attention similar to Video Sparse Attention.

🤝 Contributing

We welcome all contributions. Please check out our contributing guide, and see the development roadmap for details.

Acknowledgement

We learned from the design of, and reused code from, the following projects: Wan-Video, ThunderKittens, DMD2, diffusers, xDiT, vLLM, SGLang. We thank MBZUAI, Anyscale, and GMI Cloud for their support throughout this project.

Citation

If you find FastVideo useful, please consider citing our research work:

@article{zhang2025vsa,
  title={VSA: Faster video diffusion with trainable sparse attention},
  author={Zhang, Peiyuan and Chen, Yongqi and Huang, Haofeng and Lin, Will and Liu, Zhengzhong and Stoica, Ion and Xing, Eric and Zhang, Hao},
  journal={arXiv preprint arXiv:2505.13389},
  year={2025}
}

@article{zhang2025fast,
  title={Fast video generation with sliding tile attention},
  author={Zhang, Peiyuan and Chen, Yongqi and Su, Runlong and Ding, Hangliang and Stoica, Ion and Liu, Zhengzhong and Zhang, Hao},
  journal={arXiv preprint arXiv:2502.04507},
  year={2025}
}

