Title: Shape-Biased Representations for Object Category Recognition
Date: Friday, October 20th, 2023
Time: 1:00PM-3:00PM ET
In-person Location: Coda C1315 Grant Park
Zoom link: https://gatech.zoom.us/j/98841973871
Stefan Stojanov
PhD Student in Computer Science
School of Interactive Computing
College of Computing
Georgia Institute of Technology
Committee
Dr. James M. Rehg (advisor), College of Computing, Georgia Institute of Technology
Dr. James Hays, College of Computing, Georgia Institute of Technology
Dr. Judy Hoffman, College of Computing, Georgia Institute of Technology
Dr. Chen Yu, Department of Psychology, University of Texas at Austin
Dr. Subhransu Maji, College of Information and Computer Sciences, University of Massachusetts Amherst
Summary
While object recognition is a classical problem in computer vision that has seen remarkable progress through contemporary deep learning research, key challenges remain in developing systems that can learn object categories from continually arriving data, from only a few samples, and with limited supervision. In this dissertation, we aim to borrow two strategies observed in young children's learning, a shape bias and an environmental bias toward repetition, and apply them to continual, low-shot, and self-supervised learning of objects and object parts. In the continual learning domain, we demonstrate that repetition of learned concepts significantly ameliorates catastrophic forgetting. For low-shot learning, we develop two methods for learning shape-biased object representations with decreasing supervision requirements: the first learns a joint image and 3D shape metric space from point clouds, and the second learns object parts by self-supervision from multi-view pixel correspondences. We demonstrate that both methods of introducing a shape bias improve low-shot category recognition. Finally, we find that contrastive learning from multi-view images enables category-level part matching with performance competitive with baselines that have over 10 times more parameters, while training only on synthetic data. To support these investigations, we present two synthetic 3D object datasets, Toys200 and Toys4K, and develop a series of highly realistic synthetic data rendering systems that enable real-world generalization.