🏄 SURF: Signature-Retained Fast Video Generation

Kaixin Ding1, Xi Chen1, Sihui Ji1, Yuan Gao2, Liang Hou2, Xin Tao2, Pengfei Wan2, Hengshuang Zhao1

1The University of Hong Kong
2Kling Team, Kuaishou Technology

TL;DR: High-resolution video generation is slow: Wan 2.1, for example, takes over 50 minutes to generate a 720p video. Existing acceleration methods often compromise the priors of the pretrained model (layout, semantics, motion). We propose SURF, a two-stage framework: a pretrained model first generates a fast low-resolution preview, and a Refiner then upscales it while preserving those priors. Key techniques include noise reshifting, which reduces prior loss, and shifting windows with a careful training design. SURF is simple, efficient, and compatible with various base models, achieving a 12.5× speedup when generating 5-second, 16 fps, 720p videos with Wan 2.1 and an 8.7× speedup when generating 5-second, 24 fps, 720p videos with HunyuanVideo.
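To put the reported numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the 12.5× speedup is measured against the same "over 50 minutes" baseline quoted above, and it uses 50 minutes as a lower bound for that baseline, so the result is only indicative. The helper names are hypothetical, and the frame counts are nominal (duration × frame rate); the actual models may generate slightly different lengths.

```python
def accelerated_minutes(baseline_minutes: float, speedup: float) -> float:
    """Wall-clock time after applying a speedup factor to a baseline time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_minutes / speedup


def nominal_frame_count(seconds: float, fps: int) -> int:
    """Nominal number of frames in a clip: duration times frame rate."""
    return round(seconds * fps)


# Assumption: 50 minutes stands in for "over 50 minutes" (a lower bound).
wan_minutes = accelerated_minutes(50.0, 12.5)

# Nominal clip lengths for the two reported settings.
wan_frames = nominal_frame_count(5, 16)      # 5 s at 16 fps
hunyuan_frames = nominal_frame_count(5, 24)  # 5 s at 24 fps

print(f"Wan 2.1 with a 12.5x speedup: roughly {wan_minutes:.1f} minutes")
print(f"Nominal frames: Wan 2.1 = {wan_frames}, HunyuanVideo = {hunyuan_frames}")
```

Under these assumptions, a 720p Wan 2.1 generation drops from over 50 minutes to a little over 4 minutes. The page does not state a HunyuanVideo baseline time, so no corresponding estimate is made for the 8.7× figure.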