Su Kehua (苏科华)
Supervisor of Doctorate Candidates
Supervisor of Master's Candidates
E-Mail:
Date of Employment:2008-11-02
School/Department:School of Computer Science
Education Level:Postgraduate
Business Address:D203
Gender:Male
Contact Information:13517299596
Status:Employed
Discipline:Computer Applications Technology
Communications and Information Systems
Other specialties in Software Engineering
Cyberspace Security
Impact Factor:10.6
DOI number:10.1109/TIP.2023.3261743
Affiliation of Author(s):IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Journal:IEEE TRANSACTIONS ON IMAGE PROCESSING
Place of Publication:445 HOES LANE, PISCATAWAY, NJ 08855-4141
Key Words:Visualization;Task analysis;Semantics;Feature extraction;Training;Image segmentation;Image color analysis;Visual intention understanding;cross modality;hierarchical relation
Abstract:Visual intention understanding is the task of exploring the potential and underlying meaning expressed in images. Simply modeling the objects or backgrounds within the image content leads to unavoidable comprehension bias. To alleviate this problem, this paper proposes a Cross-modality Pyramid Alignment with Dynamic optimization (CPAD) to enhance the global understanding of visual intention with hierarchical modeling. The core idea is to exploit the hierarchical relationship between visual content and textual intention labels. For visual hierarchy, we formulate the visual intention understanding task as a hierarchical classification problem, capturing multiple granular features in different layers, which corresponds to hierarchical intention labels. For textual hierarchy, we directly extract the semantic representation from intention labels at different levels, which supplements the visual content modeling without extra manual annotations. Moreover, to further narrow the domain gap between different modalities, a cross-modality pyramid alignment module is designed to dynamically optimize the performance of visual intention understanding in a joint learning manner. Comprehensive experiments intuitively demonstrate the superiority of our proposed method, outperforming existing visual intention understanding methods.
Co-author:Shi Qinghongya, Du Bo
First Author:Ye Mang
Indexed by:Article
Correspondence Author:Su Kehua
Document Type:J
Volume:32
Page Number:2190-2201
ISSN No.:1057-7149
Translation or Not:no
Date of Publication:2023-05-05
Included Journals:SCI, EI
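Su Kehua, male, is a professor and doctoral supervisor at the School of Computer Science, Wuhan University, and deputy director of the Wuhan University Science and Technology Achievement Transformation Center (Technology Transfer Center). His research centers on optimal transport, a class of optimization problems concerned with optimal transformations between probability measures, with broad applications in computer graphics, machine vision, artificial intelligence, and medical image processing. He mainly studies the geometric computation theory and efficient algorithms of optimal transport, and applies them to measure-preserving mesh parameterization, 3D scene optimization, intelligent burn assessment, and satellite internet task optimization. He has led more than 20 projects funded by bodies including the National Natural Science Foundation of China, the Science and Technology Commission of the Central Military Commission, the China Academy of Space Technology (the Fifth Academy of CASC), and Huawei. He has published more than 50 papers and has been granted more than 10 invention patents. He is an executive committee member of the CCF technical committees on Computer-Aided Design and Computer Graphics (CAD/CG) and on Virtual Reality and Visualization (TCVRV).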