Title: Shape-Biased Representations for Object Category Recognition

Date: Friday, October 20th, 2023

Time: 1:00PM-3:00PM ET

In-person Location: Coda C1315 Grant Park

Zoom link: https://gatech.zoom.us/j/98841973871

 

Stefan Stojanov

PhD Student in Computer Science

School of Interactive Computing

College of Computing

Georgia Institute of Technology

 

Committee

Dr. James M. Rehg (advisor), College of Computing, Georgia Institute of Technology

Dr. James Hays, College of Computing, Georgia Institute of Technology

Dr. Judy Hoffman, College of Computing, Georgia Institute of Technology

Dr. Chen Yu, Department of Psychology, University of Texas at Austin

Dr. Subhransu Maji, College of Information and Computer Sciences, University of Massachusetts, Amherst

 

Summary

 

While object recognition is a classical problem in computer vision that has seen remarkable progress as a result of contemporary deep learning research, key challenges remain in developing systems that can learn object categories from continually arriving data, from a few samples, and with limited supervision. In this dissertation, we aim to borrow two biases observed in young children, the learning strategy of shape bias and the environmental bias of repetition, and apply them to continual, low-shot, and self-supervised learning of objects and object parts. In the continual learning domain, we demonstrate that repetition of learned concepts significantly ameliorates catastrophic forgetting. For low-shot learning, we develop two methods for learning shape-biased object representations with decreasing supervision requirements: the first learns a joint image and 3D shape metric space from point clouds, and the second uses self-supervised learning of object parts from multi-view pixel correspondences. We demonstrate that these methods of introducing a shape bias improve low-shot category recognition. Finally, we find that contrastive learning from multi-view images allows for category-level part matching with performance competitive with baselines that have over 10 times more parameters, while being trained only on synthetic data. To support our investigations, we present two synthetic 3D object datasets, Toys200 and Toys4K, and develop a series of highly realistic synthetic data rendering systems that enable real-world generalization.