STRUCT
Spatial and Temporal Restoration, Understanding and Compression Team
Tutorials
- New Era of Artificial Intelligence: Unleashing the Power of Large Models in Visual Applications @IEEE ISCAS-2024
- Intelligent Media Computing Forum @ WAIC-2023
- Intelligent Image Enhancement and Restoration - from Prior Driven Model to Advanced Deep Learning @IEEE ICME-2019
Talks
- Visual Perception and Understanding in Adverse Scenes 2024-05 by Jiaying Liu @ CCIG-2024
- Similarity Min-Max: Zero-Shot Day-Night Domain Adaptation 2023-10 by Rundong Luo @ ICCV-2023
- Dual Prompt Learning for Continual Rain Removal from Single Images 2023-08 by Minghao Liu @ IJCAI-2023
- Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition 2023-06 by Lilang Lin @ CVPR-2023
- Hierarchical Consistent Contrastive Learning for Skeleton-Based Action Recognition with Growing Augmentations 2023-02 by Jiahang Zhang @ AAAI-2023
- AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation 2022-10 by Yiyang Ma @ ACM MM-2022
- Controllable Artistic Text Style Transfer via Shape-Matching GAN 2019-10 by Shuai Yang @ ICCV-2019
- Unsupervised Person Image Generation with Semantic Parsing Transformation 2019-06 by Sijie Song @ CVPR-2019
Group Seminar
Current Semester
Date | Presenter | Conference | Title | Links |
24/09/14 | Xichen Lan | CVPR24 | Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models | |
24/09/03 | Wenshuo Gao | CVPR24 | Generative Image Dynamics | |
24/07/20 | Yifan Li | ECCV24 | ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation | |
24/07/08 | Shaofan Sun | SIGGRAPH23 | 3D Gaussian Splatting for Real-Time Radiance Field Rendering | |
24/06/02 | Zejia Fan | arXiv | LM4LV: A Frozen Large Language Model for Low-level Vision Tasks | |
24/05/26 | Guo Tang | arXiv | DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | |
24/05/19 | Xiang Gao | arXiv | MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | |
24/04/14 | Minghao Liu | arXiv | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | |
24/03/31 | Jiahang Zhang | CVPR24 | CG-HOI: Contact-Guided 3D Human-Object Interaction Generation | |
24/03/17 | Lilang Lin | arXiv | Learning by Reconstruction Produces Uninformative Features For Perception | |
24/03/10 | Yiyang Ma | ICCV23 | DiffIR: Efficient Diffusion Model for Image Restoration | |
24/03/03 | Haowei Kuang | NeurIPS23 | Towards Efficient Image Compression Without Autoregressive Models | |
24/02/25 | Wenjing Wang | ICLR24 | SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | |
24/01/28 | Lehong Wu | ICLR24 | Interpreting CLIP's Image Representation via Text-Based Decomposition | |
24/01/21 | Yifan Li | CVPR23 | Generative Diffusion Prior for Unified Image Restoration and Enhancement | |
24/01/14 | Shaofan Sun | NeurIPS23 | Siamese Masked Autoencoders | |
23/12/24 | Jiaxuan Xie | ICCV23 | Tracking Anything with Decoupled Video Segmentation | |
23/12/10 | Rundong Luo | arXiv | DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model | |
23/12/03 | Minghao Liu | NeurIPS23 | Rotating Features for Object Discovery | |
23/11/19 | Haowei Kuang | arXiv | Idempotent Generative Network | |
23/11/05 | Zhengbo Xu | CVPR23 | Null-text Inversion for Editing Real Images using Guided Diffusion Models | |
23/10/24 | Jiahang Zhang | ICCV23 | Rosetta Neurons: Mining the Common Units in a Model Zoo | |
23/10/15 | Zejia Fan | CVPR23 | All-in-one Image Restoration for Unknown Degradations Using Adaptive Discriminative Filters for Specific Degradations | |
23/10/08 | Yiyang Ma | ICLR21 | Denoising Diffusion Implicit Models | |
23/09/10 | Haofeng Huang | CVPR23 | EDICT: Exact Diffusion Inversion via Coupled Transformations | |
23/09/10 | Xiang Gao | arXiv | Inversion-based Style Transfer with Diffusion Models | |
23/07/31 | Minghao Liu | arXiv | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
23/07/15 | Minghao Liu | arXiv | Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language | |
23/06/04 | Jiahang Zhang | CVPR23 | Augmentation Matters: A Simple-yet-Effective Approach to Semi-supervised Semantic Segmentation | |
23/05/28 | Minghao Liu | CVPR23 | Progressive Transformation Learning for Leveraging Virtual Images in Training | |
23/05/14 | Lilang Lin | CVPR23 | VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking | |
23/05/07 | Haowei Kuang | CVPR23 | Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | |
23/04/16 | Yexiang Cheng | arXiv | Segment Anything | |
23/04/02 | Zhengbo Xu | arXiv | Your Diffusion Model is Secretly a Zero-Shot Classifier | |
23/03/26 | Yiyang Ma | arXiv | Adding Conditional Control to Text-to-Image Diffusion Models | |
23/03/19 | Rundong Luo | arXiv | Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames | |
23/03/05 | Wenjing Wang | arXiv | DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | |
23/02/26 | Yuzhang Hu | CVPR22 | Dataset Distillation by Matching Training Trajectories | |
23/02/19 | Zejia Fan | ICLR23 | Image as Set of Points | |
23/02/19 | Yueru Jia | ECCV22 | Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields | |
23/02/19 | Dezhao Wang | NeurIPS22 | Visual Prompting via Image Inpainting | |
23/02/12 | Lilang Lin | ECCV22 | On the Versatile Uses of Partial Distance Correlation in Deep Learning | |
23/02/12 | Minghao Liu | CVPR22 | Text2Mesh: Text-Driven Neural Stylization for Meshes | |
23/02/12 | Jiahang Zhang | ECCV22 | AutoMix: Unveiling the Power of Mixup for Stronger Classifiers | |
23/02/05 | Haofeng Huang | AAAI23 | GAN Prior based Null-Space Learning for Consistent Super-Resolution | |
23/02/05 | Haowei Kuang | AAAI23 | Robust Image Denoising of No-Flash Images Guided by Consistent Flash Images | |
23/02/05 | Shujia Li | CVPR22 | Diffusion Autoencoders: Toward a Meaningful and Decodable Representation | |
23/01/29 | Zhengbo Xu | AAAI23 | MicroAST: Towards Super-Fast Ultra-Resolution Arbitrary Style Transfer | |
23/01/29 | Yexiang Cheng | AAAI23 | Target-Free Text-guided Image Manipulation | |
23/01/29 | Yiyang Ma | NeurIPS22 | DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps | |
23/01/15 | Wenjing Wang | arXiv | Tune-A-Video: One-Shot Tuning of Image Diffusion Models For Text-to-Video Generation | |
23/01/15 | Rundong Luo | ECCV22 | SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image | |
23/01/15 | Yuzhang Hu | AAAI23 | Curriculum Temperature for Knowledge Distillation | |
22/10/30 | Wenjing Wang | NeurIPS22 | Poisson Flow Generative Models | |
22/10/16 | Yilun Xu | CVPR22 | HDR-NeRF: High Dynamic Range Neural Radiance Fields | |
22/10/16 | Yilun Xu | CVPR22 | NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images | |
22/10/09 | Lilang Lin | ECCV22 | Prompting Visual-Language Models for Efficient Video Understanding | |
22/09/18 | Yiyang Ma | ACM MM22 | AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation | |
21/12/19 | Yuzhang Hu | ICCV21 | SimROD: A Simple Adaptation Method for Robust Object Detection | |
21/12/12 | Haofeng Huang | NeurIPS21 | Pragmatic Image Compression for Human-in-the-Loop Decision-Making | |
21/12/05 | Dezhao Wang | arXiv | Masked Autoencoders Are Scalable Vision Learners | |
21/10/24 | Zejia Fan | ICLR20 | How Much Position Information Do Convolutional Neural Networks Encode? | |
21/10/15 | Wenjing Wang | ICCV21 | Improving Contrastive Learning by Visualizing Feature Transformation | |
21/09/24 | Lilang Lin | ICML21 | Understanding self-supervised learning dynamics without contrastive pairs | |
21/09/12 | Haofeng Huang | CVPR21 | Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation | |
21/08/18 | Shuhong Zheng | CVPR21 | NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections | |
21/07/26 | Zejia Fan | arXiv | Demystifying Local Vision Transformer: Sparse Connectivity, Weight Sharing, and Dynamic Weight | |
21/07/13 | Haofeng Huang | IJCV | Semantics-to-Signal Scalable Image Compression with Learned Revertible Representations | |
21/07/13 | Wenjing Wang | CVPR21 | GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields | |
21/05/30 | Haofeng Huang | CVPR21 | Image-to-image Translation via Hierarchical Style Disentanglement | |
21/05/30 | Xinhao Wang | CVPR21 | Style-Aware Normalized Loss for Improving Arbitrary Style Transfer | |
21/05/23 | Shixing Yu | arXiv | MLP-Mixer: An all-MLP Architecture for Vision | |
21/05/23 | Hao Liang | CVPR21 | RepVGG: Making VGG-style ConvNets Great Again | |
21/05/16 | Shuhong Zheng | ECCV20 | In-Domain GAN Inversion for Real Image Editing | |
21/04/25 | Lilang Lin | arXiv | Barlow Twins: Self-Supervised Learning via Redundancy Reduction | |
21/04/25 | Yueyu Hu | NeurIPS20 | Improving Inference for Neural Image Compression | |
21/04/18 | Yu Han | ICML20 | Generative Pretraining from Pixels | |
21/04/18 | Zejia Fan | CVPR21 | Scaling Local Self-Attention for Parameter Efficient Visual Backbones | |
21/03/28 | Yuzhang Hu | CVPR21 | Learning Continuous Image Representation with Local Implicit Image Function | |
21/03/28 | Dezhao Wang | CVPR21 | Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder | |
21/03/21 | Wenjing Wang | CVPR21 | Generative Hierarchical Features from Synthesizing Images | |
21/03/21 | Shixing Yu | ICLR21 | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | |
21/03/14 | Wenjing Wang | CVPR21 | Closed-Form Factorization of Latent Semantics in GANs | |
21/02/09 | Zejia Fan | arXiv | Fast Convergence of DETR with Spatially Modulated Co-Attention | |
21/02/09 | Qiyu Dai | ECCV20 | SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing Objects |