ACM Multimedia 2022 Workshop on

Multimedia Understanding with Pre-trained Models

December 13-16, 2022, Tokyo, Japan

WORKSHOP OVERVIEW

Multi-modal understanding plays a crucial role in enabling machines to perceive the physical world through multiple sensory cues, as humans do. Recently, large-scale pre-trained models (PTMs) have become a research hotspot in artificial intelligence. Existing techniques that follow the self-supervised learning paradigm have achieved great success in uni-modal settings such as computer vision (CV) and natural language processing (NLP). These advances have inspired researchers to explore more and deeper pre-training techniques for multi-modal understanding. In this workshop, we aim to bring together researchers from the multimedia field to discuss recent research and future directions on pre-trained models with self-supervised learning for multimedia understanding.

In recent years, we have witnessed the great success of pre-trained models (PTMs) in natural language processing (NLP), such as GPT-3, BERT, RoBERTa, and DeBERTa. This success motivates researchers in the multimedia community to apply the idea of PTMs to multi-modal tasks. The scope of this workshop is pre-trained models with self-supervised learning for multimedia understanding. Potential topics include architecture design for multi-modal PTMs, pretext task design for self-supervised learning, multi-modal data modeling, efficiency improvements for PTMs, and interpretability of PTMs.

Invited Speakers

Jifeng Dai

Ph.D, Senior Researcher

He has served as an area chair of CVPR 2021, CVPR 2023, and ECCV 2020, a publicity chair of ICCV 2019, and a senior PC member of AAAI 2018 and AAAI 2022. He is a Young Scientist at the Beijing Academy of Artificial Intelligence (BAAI).

Si Liu

Professor, Doctoral supervisor

She is currently an associate editor of IEEE TMM and IEEE TCSVT and has served many times as an area chair of ICCV, CVPR, ECCV, ACM MM, and other top conferences.

Jiajun Deng

Ph.D, Postdoctoral research associate

He is serving as a Guest Editor of IEEE Transactions on Multimedia for the 2022 Special Issue on Pre-trained Models for Multi-modality Understanding.

Zhengyuan Yang

Ph.D, Senior Researcher

He is a senior researcher at Microsoft. He is a member of IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) and of the AAAI 2023 Senior Program Committee (SPC).

Organizers

Wengang Zhou

Ph.D, Professor

EEIS Department, University of Science and Technology of China
Email: zhwg@ustc.edu.cn

Jiaxin Shi

Ph.D, Senior Researcher

Huawei Cloud Computing Technologies Co., Ltd.
Email: shijiaxin3@huawei.com

Lingxi Xie

Ph.D, Senior Researcher

Huawei Cloud Computing Technologies Co., Ltd.
Email: 198808xc@gmail.com