I am a second-year PhD student at The University of Hong Kong, supervised by Xihui Liu.
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
Drag-A-Video: Non-rigid Video Editing with Point-based Interaction
StageInteractor: Query-based Object Detector with Cross-stage Interaction
Logit Normalization for Long-tail Object Detection
Structured Sparse R-CNN for Direct Scene Graph Generation
Target Adaptive Context Aggregation for Video Scene Graph Generation