
Emily Kaczmarek

Ph.D. Student

I am a Ph.D. student in the Probabilistic Vision Group at McGill University, supervised by Dr. Tal Arbel. My research interests include developing explainable computer vision algorithms to improve our understanding and treatment of disease.

You can find my complete CV here.

Research Interests

  • Deep Learning
  • Computer Vision
  • Precision Medicine
  • Explainable AI

Education

  • Master of Science (Computer Science)

    Queen's University, Kingston, Canada

  • Bachelor of Engineering (Biomedical Mechanical)

    Carleton University, Ottawa, Canada

Experience

  • Research Analyst

    Children's Hospital of Eastern Ontario, Canada


Publications

MetaCAM: Ensemble-Based Class Activation Map

Clear, trustworthy explanations of deep learning model predictions are essential in high-criticality fields such as medicine and biometric identification. Class Activation Maps (CAMs) are an increasingly popular category of visual explanation methods for Convolutional Neural Networks (CNNs). However, the performance of individual CAMs depends largely on experimental parameters such as the selected image, target class, and model. Here, we propose MetaCAM, an ensemble-based method for combining multiple existing CAM methods based on the consensus of the top-k% most highly activated pixels across component CAMs. We perform experiments to quantitatively determine the optimal combination of 11 CAMs for a given MetaCAM experiment. A new method, the Cumulative Residual Effect (CRE), is proposed to summarize large-scale ensemble-based experiments. We also present adaptive thresholding and demonstrate how it can be applied to individual CAMs to improve their performance, measured using the pixel-perturbation method Remove and Debias (ROAD). Lastly, we show that MetaCAM outperforms existing CAMs and refines the most salient regions of images used for model predictions. In one example, MetaCAM achieved a ROAD score of 0.393, compared to scores ranging from -0.101 to 0.172 for the 11 individual CAMs, demonstrating the importance of combining CAMs through ensembling and adaptive thresholding.
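As a minimal sketch of the consensus idea described above, the snippet below thresholds each component CAM at its top-k% most highly activated pixels and averages the resulting binary masks. The function name, the choice of k, and the normalization assumption are illustrative, not the paper's implementation.

```python
import numpy as np

def metacam_consensus(cams, k=0.2):
    """Minimal sketch of a top-k% consensus across CAMs.

    cams : list of HxW activation maps (assumed min-max normalized to [0, 1]).
    k    : fraction of most highly activated pixels kept per CAM (assumed value).
    Returns a consensus map where each pixel's value is the fraction of
    component CAMs that rank it within their own top-k% activations.
    """
    masks = []
    for cam in cams:
        thresh = np.quantile(cam, 1.0 - k)  # activation cutoff for the top-k%
        masks.append(cam >= thresh)         # binary mask of salient pixels
    return np.mean(masks, axis=0)           # pixel-wise agreement in [0, 1]

# Toy usage: three random "CAMs" over a 224x224 image.
rng = np.random.default_rng(0)
cams = [rng.random((224, 224)) for _ in range(3)]
consensus = metacam_consensus(cams, k=0.2)
print(consensus.shape, consensus.max())
```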

Deep Learning Prediction of Renal Anomalies for Prenatal Ultrasound Diagnosis

Deep learning algorithms have demonstrated remarkable potential in clinical diagnostics, particularly in medical imaging. In this study, we investigated the application of deep learning models to the early detection of fetal kidney anomalies. To provide an enhanced interpretation of those models' predictions, we proposed an adapted two-class representation and developed a multi-class model interpretation approach for problems with more than two labels and variable hierarchical grouping of labels. Additionally, we employed the explainable AI (XAI) visualization tools Grad-CAM and HiResCAM to gain insights into model predictions and identify reasons for misclassifications. The study dataset consisted of 969 unique ultrasound images: 646 control images and 323 cases of kidney anomalies, including 259 cases of unilateral urinary tract dilation and 64 cases of unilateral multicystic dysplastic kidney. The best-performing model achieved a cross-validated area under the ROC curve of 90.71% ± 0.54%, with an overall accuracy of 81.70% ± 0.88%, sensitivity of 81.20% ± 2.40%, and specificity of 82.06% ± 1.74% on a test dataset. Our findings emphasize the potential of deep learning models to predict kidney anomalies from limited prenatal ultrasound imagery. The proposed adaptations in model representation and interpretation offer a novel solution to multi-class prediction problems.
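The following toy sketch shows one way an adapted two-class representation could collapse a multi-class softmax into a control-vs-anomaly readout by summing grouped class probabilities. The label names follow the abstract, but the grouping mechanics shown here are an assumption, not the paper's method.

```python
import numpy as np

# Assumed label order for the three-class problem described in the abstract.
CLASSES = ["control", "urinary_tract_dilation", "multicystic_dysplastic_kidney"]
ANOMALY_GROUP = {"urinary_tract_dilation", "multicystic_dysplastic_kidney"}

def two_class_probability(softmax_probs):
    """Collapse multi-class probabilities into control vs. anomaly.

    softmax_probs : array of shape (n_samples, n_classes), rows summing to 1.
    Returns P(anomaly) per sample by summing the grouped anomaly classes.
    """
    anomaly_idx = [i for i, c in enumerate(CLASSES) if c in ANOMALY_GROUP]
    return softmax_probs[:, anomaly_idx].sum(axis=1)

# Toy usage: two predictions from a three-class classifier.
probs = np.array([[0.70, 0.20, 0.10],   # likely control
                  [0.15, 0.60, 0.25]])  # likely anomaly
print(two_class_probability(probs))     # -> [0.3  0.85]
```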

Topology Preserving Stratification of Tissue Neoplasticity using Deep Neural Maps and microRNA Signatures

Accurate cancer classification is essential for correct treatment selection and better prognostication. microRNAs (miRNAs) are small RNA molecules that negatively regulate gene expression, and their dysregulation is a common disease mechanism in many cancers. A clearer understanding of miRNA dysregulation in cancer can yield improved mechanistic knowledge and better treatments. We present a topology-preserving deep learning framework to study miRNA dysregulation in cancer. Our study comprises miRNA expression profiles from 3685 cancer and non-cancer tissue samples with hierarchical annotations on organ and neoplasticity status. Using unsupervised learning, a two-dimensional topological map is trained to cluster similar tissue samples. Labelled samples are then used to assess clustering accuracy in terms of tissue-of-origin and neoplasticity status. In addition, an approach using activation gradients is developed to determine the network's attention to the miRNAs that drive the clustering. Using this deep learning framework, we classify the neoplasticity status of held-out test samples with an accuracy of 91.07%, the tissue-of-origin with 86.36%, and combined neoplasticity status and tissue-of-origin with an accuracy of 84.28%. The topological maps demonstrate that miRNA expression can distinguish tissue types and neoplasticity status. Importantly, when our approach identifies samples that do not cluster well with their respective classes, activation gradients provide further insight into cancer subtypes or grades. In summary, we develop an unsupervised deep learning approach for cancer classification and interpretation; it provides an intuitive way to understand the molecular properties of cancer and has significant potential for classification and treatment selection.
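The two-dimensional topological map described above is trained in the spirit of a self-organizing map. The self-contained sketch below of a basic SOM training loop illustrates how such a map clusters expression profiles while preserving topology; the grid size, learning-rate schedule, and neighborhood function are illustrative assumptions, not the paper's Deep Neural Map configuration.

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Tiny self-organizing map: one illustration of a 2-D topological map.

    data : (n_samples, n_features) expression profiles (e.g., miRNA).
    Returns weights of shape (grid_h, grid_w, n_features).
    """
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    # Grid coordinates, precomputed for the neighborhood function.
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"),
                      axis=-1)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            # Decay learning rate and neighborhood radius over training.
            frac = step / n_steps
            lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 1e-3
            # Best-matching unit: grid cell whose weight is closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighborhood pulls nearby cells toward x.
            grid_d2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
            nbh = np.exp(-grid_d2 / (2 * sigma ** 2))
            weights += lr * nbh[..., None] * (x - weights)
            step += 1
    return weights

# Toy usage: 100 "samples" with 50 features each.
som = train_som(np.random.default_rng(1).random((100, 50)))
print(som.shape)  # (10, 10, 50)
```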

Multi-Omic Graph Transformers for Cancer Classification and Interpretation

Next-generation sequencing has enabled the rapid collection and quantification of ‘big’ biological data. In particular, multi-omics integration of different molecular data types, such as miRNA and mRNA, can provide important insights into disease classification and processes. Computational methods are needed that can correctly model and interpret these relationships while handling the difficulties of large-scale data. In this study, we develop a novel method of representing miRNA-mRNA interactions to classify cancer. Specifically, graphs are designed to account for the interactions and biological communication between miRNAs and mRNAs, using message-passing and attention mechanisms. Patient-matched miRNA and mRNA expression data are obtained from The Cancer Genome Atlas for 12 cancers, and targeting information is incorporated from TargetScan. A Graph Transformer Network (GTN) is selected because its self-attention mechanisms make its classifications highly interpretable. The GTN classifies the 12 cancers with an accuracy of 93.56% and is compared to a Graph Convolutional Network, Random Forest, Support Vector Machine, and Multilayer Perceptron. While the GTN does not outperform all of the other classifiers in terms of accuracy, it provides far more interpretable results. Multi-omics models generally outperform their single-omics counterparts. Extensive analysis of the attention weights identifies important targeting pathways and molecular biomarkers based on integrated miRNA and mRNA expression.
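To illustrate attention-based message passing over an interaction graph, here is a minimal single-head graph attention layer in PyTorch. It is a generic sketch, far simpler than the paper's Graph Transformer Network; the dimensions and the toy adjacency matrix are assumed for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head attention message passing over a fixed adjacency matrix.

    A generic illustration of attention-weighted neighbor aggregation; the
    paper's Graph Transformer Network is more elaborate than this sketch.
    """
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        # x: (n_nodes, in_dim) expression features; adj: (n_nodes, n_nodes) 0/1.
        h = self.proj(x)
        n = h.size(0)
        # Attention logits from every concatenated (source, target) node pair.
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        logits = self.attn(pairs).squeeze(-1)
        # Mask non-edges so attention flows only along known interactions.
        logits = logits.masked_fill(adj == 0, float("-inf"))
        weights = torch.softmax(logits, dim=-1)
        weights = torch.nan_to_num(weights)  # isolated nodes get zero messages
        return F.relu(weights @ h)

# Toy usage: 4 nodes (e.g., 2 miRNAs targeting 2 mRNAs), 8 input features.
x = torch.randn(4, 8)
adj = torch.tensor([[1, 0, 1, 1], [0, 1, 1, 0],
                    [1, 1, 1, 0], [1, 0, 0, 1]], dtype=torch.float)
layer = GraphAttentionLayer(8, 16)
print(layer(x, adj).shape)  # torch.Size([4, 16])
```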

