Title: Self-Supervised Multimodal Representation Learning for Neuroimaging

Committee: 

Dr. Calhoun, Advisor

Dr. Plis, Co-Advisor      

Dr. Rozell, Chair

Dr. Dyer

Abstract: The objective of the proposed research is to develop a self-supervised representation learning framework for multi-modal brain imaging data. Predicting brain disorders with modern machine learning is an active area of research, and such approaches hold promise for improving healthcare in clinical settings and deepening our fundamental understanding of the brain from complex, high-dimensional data; in particular, they can advance understanding of the underlying causes of brain disorders. However, most existing studies rely on a single modality and on over-parameterized supervised models. A single modality offers only a narrow view of the brain, while supervised models are limited by the availability of costly labels that require expert knowledge. Further, coarse labels may fail to explain brain disorders because models trained on them cannot capture the full spectrum of brain disorder phenotypes. To address these challenges, we propose a closed-loop approach that combines a unified multi-modal self-supervised framework with a methodology for evaluating the quality of the learned representations. In preliminary work, we addressed the difficulty of training supervised models without expert labels through training with imperfect labels and pre-training by mutual information maximization. We then adapted mutual information maximization to the setting of multi-modal coordination in a unified framework. Finally, we developed methodologies to evaluate representation quality via downstream evaluation on multiple tasks, alignment between modalities, interpretability, out-of-distribution robustness, and fairness. Building on these preliminary results, we propose a fission model that disentangles modality-specific and joint information across modalities, which we anticipate will improve both performance and interpretability. More broadly, this proposal lays the groundwork for a multifaceted methodology for developing more sophisticated multi-modal representation learning algorithms.
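
To make the cross-modal mutual information maximization idea concrete, below is a minimal sketch of a symmetric InfoNCE-style contrastive objective between two modality encoders. This is an illustrative assumption about how such an objective can be set up, not the framework proposed here: the encoder names (mri_encoder, fmri_encoder), dimensions, temperature, and random toy inputs are all hypothetical placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical encoders for two imaging modalities (e.g., structural and
# functional MRI features); in practice these would be deep convolutional
# or recurrent networks rather than small MLPs.
mri_encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
fmri_encoder = nn.Sequential(nn.Linear(300, 256), nn.ReLU(), nn.Linear(256, 128))

def infonce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Symmetric InfoNCE loss, a lower bound on the mutual information between
    the two embedding sets. Embeddings of the same subject across modalities
    act as positives; all other pairs in the batch act as negatives."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                    # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)  # positive pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Toy usage with random tensors standing in for paired multi-modal data.
x_mri, x_fmri = torch.randn(32, 512), torch.randn(32, 300)
loss = infonce_loss(mri_encoder(x_mri), fmri_encoder(x_fmri))
loss.backward()  # gradients flow into both encoders, coordinating the two modalities
```

Minimizing this loss coordinates the two modality representations without any diagnostic labels, which is the role mutual information maximization plays in the preliminary work summarized above.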