ECCV2022 paper reading - Multi-modal Multi-task Masked Autoencoders


The Japanese Study Group on Computer Vision is a semiannual conference at which researchers from all over Japan present recent topics in computer vision. In my talk, I introduce a paper from the European Conference on Computer Vision (ECCV 2022) that addresses pretraining masked autoencoders on multiple tasks with data from multiple modalities. The topic matters in computer vision because training a single model on heterogeneous data and tasks remains challenging.