Learning Object-Centric Representations of Visually Complex Scenes

Dec 1, 2022 · 1 min read

Learning distributed vector representations has been a driving force behind the remarkable success of deep neural networks in perceptual tasks on ImageNet-like photos with solitary objects (cars, trunks, etc.) in the center. One emerging paradigm for capturing the compositional nature of complex scenes is to learn slot-based object-centric representations. Nevertheless, existing methods to date are still largely limited to simple synthetic settings with low visual complexity.