Multi-modal understanding plays a crucial role in enabling machines to perceive the physical world through multiple sensory cues, as humans do. Recently, large-scale pre-trained models (PTMs) have become a research hotspot in the field of artificial intelligence. Existing techniques that follow the self-supervised learning paradigm have achieved great success in uni-modal settings, such as computer vision (CV) and natural language processing (NLP). These advances inspire researchers to explore more and deeper pre-training techniques for the multi-modal understanding problem. In this workshop, we aim to bring together researchers from the field of multimedia to discuss recent research and future directions on pre-trained models with self-supervised learning for multimedia understanding.
In recent years, we have witnessed the great success of pre-trained models (PTMs) in natural language processing (NLP), such as GPT-3, BERT, RoBERTa, and DeBERTa. This success motivates researchers in the multimedia community to leverage the idea of PTMs to address multi-modal tasks. The scope of this workshop is pre-trained models with self-supervised learning for multimedia understanding. Potential topics include architecture design for multi-modal PTMs, pretext task design for self-supervised learning, multi-modal data modeling, efficiency enhancement for PTMs, and interpretability of PTMs.
Jifeng Dai, Ph.D., Senior Researcher
He has served as an area chair of CVPR 2021 and 2023 and ECCV 2020, a publicity chair of ICCV 2019, and a senior PC member of AAAI 2018 and 2022. He is a Young Scientist at the Beijing Academy of Artificial Intelligence (BAAI).
Si Liu, Professor, Doctoral Supervisor
She is currently an associate editor of IEEE TMM and IEEE TCSVT and has served multiple times as an area chair of ICCV, CVPR, ECCV, ACM MM, and other top conferences.
Jiajun Deng, Ph.D., Postdoctoral Research Associate
He is serving as a Guest Editor of the IEEE Transactions on Multimedia Special Issue on Pre-trained Models for Multi-modality Understanding in 2022.
Wengang Zhou, Ph.D., Professor
EEIS Department, University of Science and Technology of China
Jiaxin Shi, Ph.D., Senior Researcher
Huawei Cloud Computing Technologies Co., Ltd.