Abstract: Video summarization aims to generate a concise yet informative version of a lengthy video for efficient viewing. Generally, humans can discern important shots using audiovisual information ...
Phantom is a unified video generation framework for single and multi-subject references, built on existing text-to-video and image-to-video architectures. It achieves cross-modal alignment using ...
Abstract: Video inpainting modifies local regions in video while ensuring spatial and temporal coherence. However, existing methods, both traditional and recent diffusion-based ones, face key ...