Dual Progressive Prototype Network for Generalized Zero-Shot Learning

Chaoqun Wang, Shaobo Min, Xuejin Chen*, Xiaoyan Sun, Houqiang Li

School of Data Science

National Engineering Laboratory for Brain-inspired Intelligence Technology and Application

University of Science and Technology of China

Tencent Data Platform

Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS-2021)


Generalized Zero-Shot Learning (GZSL) aims to recognize new categories with auxiliary semantic information, e.g., category attributes. In this paper, we address the critical domain shift problem, i.e., the confusion between seen and unseen categories, by progressively improving the cross-domain transferability and category discriminability of visual representations. Our approach, named Dual Progressive Prototype Network (DPPN), constructs two types of prototypes that record prototypical visual patterns for attributes and categories, respectively. With attribute prototypes, DPPN alternately searches attribute-related local regions and updates the corresponding attribute prototypes, progressively refining the attribute-region correspondence. This enables DPPN to produce visual representations that accurately localize attributes, which benefits semantic-visual alignment and representation transferability. Moreover, alongside progressive attribute localization, DPPN projects category prototypes into multiple spaces to progressively repel visual representations of different categories, which boosts category discriminability. Attribute and category prototypes are learned collaboratively in a unified framework, making the visual representations of DPPN both transferable and distinctive. Experiments on four benchmarks show that DPPN effectively alleviates the domain shift problem in GZSL.
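The alternation described above, localizing attribute-related regions and then updating the attribute prototypes, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the softmax attention, the 0.5/0.5 update rule, and the function name are all assumptions made for clarity.

```python
import numpy as np

def progressive_attribute_localization(regions, attr_protos, num_steps=2):
    """Hypothetical sketch of DPPN-style alternation between attribute
    localization and prototype updating.

    regions:     (R, D) local region features from a backbone feature map.
    attr_protos: (A, D) initial attribute prototypes.
    Returns updated prototypes and per-attribute visual features.
    """
    protos = attr_protos.copy()
    for _ in range(num_steps):
        # 1) Localize: attention of each attribute prototype over regions.
        logits = protos @ regions.T                      # (A, R)
        attn = np.exp(logits - logits.max(axis=1, keepdims=True))
        attn /= attn.sum(axis=1, keepdims=True)          # softmax over regions
        # 2) Aggregate attribute-related visual features per attribute.
        attr_feats = attn @ regions                      # (A, D)
        # 3) Update: move prototypes toward image-specific evidence
        #    (assumed simple averaging; the paper learns this update).
        protos = 0.5 * protos + 0.5 * attr_feats
    return protos, attr_feats
```

Each iteration makes the prototypes image-specific, so the attention in the next iteration focuses on more accurate attribute-related regions, matching the progressive refinement shown in Figure 2.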

Figure 1: The motivation of DPPN. (a) General GZSL methods directly align global image features with category attributes. (b) A typical part-based method, i.e., APN [1], learns prototypes shared by all images for attribute localization. (c) DPPN progressively adjusts prototypes according to different images and introduces category prototypes to enhance category discriminability.
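The category branch in (c) can be illustrated with a small sketch: category prototypes and a sample representation are projected into several spaces, and the sample is scored against every category in each space, so that different categories are repelled in all of them. The function name, the plain linear projections, and the dot-product scoring are simplifying assumptions, not DPPN's exact formulation.

```python
import numpy as np

def multi_space_category_scores(feat, cat_protos, projections):
    """Score one sample against category prototypes in multiple
    projected spaces (hypothetical simplification of DPPN's
    progressive category classification).

    feat:        (D,) visual representation of one image.
    cat_protos:  (C, D) category prototypes.
    projections: list of (D, D) projection matrices, one per space.
    Returns (len(projections), C) similarity scores.
    """
    scores = []
    for P in projections:
        f = feat @ P              # project the sample
        protos = cat_protos @ P   # project the prototypes
        scores.append(protos @ f) # dot-product similarity per category
    return np.stack(scores)
```

Training would apply a classification loss in every space, which pushes representations of different categories apart repeatedly rather than in a single embedding space.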


Figure 2: Visualization of attribute localization at different iterations. The localization gets more and more accurate as k increases from 0 to 2.
Figure 3: Effect of progressive updating with varying K on four datasets.
Table 1: Effect of PCC and PAL on the CUB and aPY datasets. GFLOPs is calculated with input size 448 × 448 on the CUB dataset.
Table 2: Results of GZSL on four classification benchmarks. Generative methods (GEN) utilize extra synthetic unseen domain data for training.

This work was supported by National Natural Science Foundation of China (NSFC) under Grants 61632006 and 62076230.

Main References:

[1] Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata. Attribute Prototype Network for Zero-Shot Learning. In NeurIPS, 2020.


Copyright © 2021 USTC-VGG, USTC