Yao Teng


About Me [ CV ]

I am a third-year Ph.D. student at The University of Hong Kong, supervised by Xihui Liu.

I was an M.Sc. student in the MCG Group, Department of Computer Science and Technology, Nanjing University, supervised by Prof. Limin Wang, from 2020 to 2023.

Previously, I obtained my B.Sc. degree in Software Engineering from Xidian University in 2020.

My research focuses on image/video generation and object detection.

Research [ Google Scholar ]

Adaptive 1D Video Diffusion Autoencoder
Yao Teng, Minxuan Lin, Xian Liu, Shuai Wang, Xiao Yang and Xihui Liu
Under review.
Diffusion-based video generation built on a one-dimensional video diffusion tokenizer.
[ Paper ]

SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation
Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li and Xihui Liu
Under review.
A faster successor to Speculative Jacobi Decoding (SJD).
[ Paper ]

Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
Yao Teng, Fuyun Wang, Xian Liu, Zhekai Chen, Han Shi, Yu Wang, Zhenguo Li, Weiyang Liu, Difan Zou and Xihui Liu
NeurIPS 2025.
Unifying denoising and autoregressive decoding.
[ Paper ]

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li and Xihui Liu
ICLR 2025.
The first training-free acceleration method for T2I AR models.
[ Paper ] [ Code ]

DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
Yao Teng, Yue Wu, Han Shi, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li and Xihui Liu
arXiv preprint, 2024.
A Mamba-based diffusion model for image generation.
[ Paper ] [ Code ]

Drag-A-Video: Non-rigid Video Editing with Point-based Interaction
Yao Teng, Enze Xie, Yue Wu, Haoyu Han, Zhenguo Li and Xihui Liu
arXiv preprint, 2023.
"Drag"-style point-based editing for videos.
[ Paper ]

StageInteractor: Query-based Object Detector with Cross-stage Interaction
Yao Teng, Haisong Liu, Sheng Guo and Limin Wang
ICCV 2023.
New label assignment and feature interaction for query-based object detection.
[ Paper ] [ Code ]

Logit Normalization for Long-tail Object Detection
Liang Zhao*, Yao Teng* and Limin Wang
International Journal of Computer Vision.
A test-time batch normalization to calibrate logits for long-tail object detection.
[ Paper ] [ Code ]

Structured Sparse R-CNN for Direct Scene Graph Generation
Yao Teng and Limin Wang
CVPR 2022.
The first one-stage, end-to-end scene graph generation framework that needs no extra object detector at inference.
[ Paper ] [ Code ] [ Zhihu ]

Target Adaptive Context Aggregation for Video Scene Graph Generation
Yao Teng, Limin Wang, Zhifeng Li and Gangshan Wu
ICCV 2021.
The first framework unifying frame-level and video-level scene graph generation.
[ Paper ] [ Code ]

Selected Awards

  • National Scholarship for Postgraduates, Nanjing University, 2021
  • Pacemaker to Outstanding Graduate Student, Nanjing University, 2021
  • 1st Prize, Scholarship for Postgraduate Students, Nanjing University, 2021
  • 2nd Award, University Scholarship, Xidian University, 2016-2018
  • Silver Medal, The 2019 ICPC China Shaanxi Provincial Programming Contest, Xidian University, 2019