Multimodal AI + Imaging Collaboration
Xiangming Wang builds multimodal models for robust visual understanding and restoration.
Ph.D. candidate at HIT Shenzhen, focusing on vision-language modeling, multi-sensor fusion, and optimization-driven learning.
My work connects multimodal representation learning with practical deployment, from vision-language-guided restoration to multi-sensor RAW imaging pipelines. I focus on methods that are structurally faithful, degradation-aware, and efficient enough for real-world devices.