
|
Adaptive 1D Video Diffusion Autoencoder
Yao Teng, Minxuan Lin, Xian Liu, Shuai Wang, Xiao Yang and Xihui Liu
Under review.
Diffusion-based video generation built on a one-dimensional video diffusion tokenizer.
[ Paper ]
|

|
SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation
Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li and Xihui Liu
Under review.
A faster successor to Speculative Jacobi Decoding (SJD).
[ Paper ]
|

|
Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
Yao Teng, Fuyun Wang, Xian Liu, Zhekai Chen, Han Shi, Yu Wang, Zhenguo Li, Weiyang Liu, Difan Zou, and Xihui Liu
NeurIPS 2025.
Unifying denoising and autoregressive decoding.
[ Paper ]
|

|
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li and Xihui Liu
ICLR 2025.
The first training-free acceleration method for autoregressive text-to-image models.
[ Paper ]
[ Code ]
|

|
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
Yao Teng, Yue Wu, Han Shi, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li and Xihui Liu
arXiv preprint 2024.
A Mamba-based diffusion model for image generation.
[ Paper ]
[ Code ]
|

|
Drag-A-Video: Non-rigid Video Editing with Point-based Interaction
Yao Teng, Enze Xie, Yue Wu, Haoyu Han, Zhenguo Li and Xihui Liu
arXiv preprint 2023.
Point-based "drag" interaction for video editing.
[ Paper ]
|

|
StageInteractor: Query-based Object Detector with Cross-stage Interaction
Yao Teng, Haisong Liu, Sheng Guo and Limin Wang
ICCV 2023.
New label assignment and feature interaction for query-based object detection.
[ Paper ]
[ Code ]
|

|
Logit Normalization for Long-tail Object Detection
Liang Zhao*, Yao Teng* and Limin Wang
International Journal of Computer Vision.
A test-time batch normalization to calibrate logits for long-tail object detection.
[ Paper ]
[ Code ]
|

|
Structured Sparse R-CNN for Direct Scene Graph Generation
Yao Teng and Limin Wang
CVPR 2022.
The first one-stage, end-to-end scene graph generation framework that needs no extra object detector at inference.
[ Paper ]
[ Code ]
[ Zhihu ]
|

|
Target Adaptive Context Aggregation for Video Scene Graph Generation
Yao Teng, Limin Wang, Zhifeng Li and Gangshan Wu
ICCV 2021.
The first framework unifying frame-level and video-level scene graph generation.
[ Paper ]
[ Code ]
|