Su Kehua (苏科华)

Supervisor of Doctorate Candidates
Supervisor of Master's Candidates

E-Mail:

Date of Employment:2008-11-02

School/Department:School of Computer Science

Education Level:Postgraduate degree

Business Address:D203

Gender:Male

Contact Information:13517299596

Status:Employed

Discipline:Computer Applications Technology
Communications and Information Systems
Other specialties in Software Engineering
Cyberspace Security

Paper Publications

Textual Enhanced Adaptive Meta-Fusion for Few-shot Visual Recognition

DOI number:10.1109/TMM.2023.3295731

Affiliation of Author(s):Institute of Electrical and Electronics Engineers Inc.

Journal:IEEE Transactions on Multimedia

Abstract:Few-shot learning (FSL) is a challenging task that aims to train a classifier to recognize novel categories when only a few annotated examples are available for each category. Recently, many FSL approaches have been proposed based on the meta-learning paradigm, which attempts to learn transferable knowledge from similar tasks by designing a meta-learner. However, most of these approaches exploit only information from the visual modality and do not use information from additional modalities (e.g., textual descriptions). Since the labeled examples in FSL are limited, enriching the information available for each example is a plausible way to improve classification performance. This motivates us to propose a novel meta-learning method, termed textual enhanced adaptive meta-fusion FSL (TAMF-FSL), which leverages both the visual information from images and the semantic information from language supervision. Specifically, TAMF-FSL exploits the semantic information in textual descriptions to improve visual-based models. We first employ a text encoder to learn the semantic features of each visual category, and then design a modality alignment module and a meta-fusion module to align and fuse the visual and semantic features for the final prediction. Extensive experiments show that the proposed method outperforms many recent or competitive FSL counterparts on two popular datasets.

Co-author:Zhan Yibing,Luo Yong,Hu Han,Du Bo

First Author:Han Mengya

Indexed by:Journal paper

Correspondence Author:Su Kehua

Page Number:1-11

ISSN No.:1520-9210

Translation or Not:no

Date of Publication:2023-03-01

Included Journals:EI

Profile

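Su Kehua, male, is a professor and doctoral supervisor at the School of Computer Science, Wuhan University, and deputy director of the Wuhan University Science and Technology Achievement Transformation Center (Technology Transfer Center). His research focuses on optimal transport, a class of optimization problems concerned with the optimal transformation between probability measures, with wide applications in computer graphics, machine vision, artificial intelligence, and medical image processing. He mainly studies the geometric computation theory and efficient algorithms of optimal transport, and applies them to measure-preserving mesh parameterization, 3D scene optimization, intelligent burn assessment, and satellite-internet task optimization. He has led more than 20 projects funded by, among others, the National Natural Science Foundation of China, the Science and Technology Commission of the Central Military Commission, the Fifth Academy of China Aerospace (航天5院), and Huawei. He has published more than 50 papers and holds more than 10 granted invention patents. He is an executive committee member of the CCF technical committees on Computer-Aided Design and Computer Graphics (CAD/CG) and on Virtual Reality and Visualization (TCVRV).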