CVPR2022 paper reading - Balanced multimodal learning


The Japanese Study Group on Computer Vision is a meeting held every semester in which researchers from all over Japan present the most recent topics in computer vision. In my talk, I introduce a paper from the Conference on Computer Vision and Pattern Recognition (CVPR2022) that addresses learning jointly from data of different modalities. This topic is highly relevant to computer vision, because a model tends to learn from visual data at a different speed than from other modalities, such as text and audio.