Categories

CVPR

UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection

arXiv

GPT4SGG: Synthesizing Scene Graphs from Holistic and Region-specific Narratives
Fast-ParC: Position Aware Global Kernel for ConvNets and ViTs
Towards a Unified View on Visual Parameter-Efficient Transfer Learning
Prompt-Matched Semantic Segmentation

MM

Toward Human Perception-Centric Video Thumbnail Generation

© 2024 CHEN Lab. All rights reserved.