
Berardino Barile

Post-doc @ McGill
Researcher @ Mila

I am a postdoctoral fellow in the Probabilistic Vision Group (PVG) under the guidance of Prof. Tal Arbel. My research focuses on causal machine learning and generative models for personalized recommendation systems and individual treatment effect (ITE) estimation using both tabular and imaging data. I am also exploring the integration of causal inference and reinforcement learning.

You can find my complete CV here.

Research Interests

  • Statistical Learning and Deep Learning
  • Causal Machine Learning
  • Generative Models
  • Reinforcement Learning

Education

  • (Second) PhD in Engineering of Science

    KU Leuven, Leuven, Belgium

  • (First) PhD in Biomedical Engineering and Science

    Université Claude Bernard Lyon1, Lyon, France

  • Master of Science in Statistics

    "La Sapienza" University of Rome, Rome, Italy

  • Bachelor of Science in Statistics

    "La Sapienza" University of Rome, Rome, Italy

Experience

  • Senior Data Scientist

    Verti spa (Mapfre), Italy

  • Big Data Scientist

    Isiway srl, Italy

  • Research Scientist

    Invitalia spa, Italy

  • Data Analyst

    Johnson & Johnson MedTech, Italy


Data augmentation using generative adversarial neural networks on brain structural connectivity in multiple sclerosis

Background and objective: Machine learning frameworks have demonstrated their potential in dealing with complex data structures, achieving remarkable results in many areas, including brain imaging. However, a large collection of data is needed to train these models. This is particularly challenging in the biomedical domain since, due to acquisition accessibility, costs and pathology-related variability, available datasets are limited and usually imbalanced. To overcome this challenge, generative models can be used to generate new data.
Methods: In this study, a framework based on a generative adversarial network is proposed to create synthetic structural brain networks in Multiple Sclerosis (MS). The dataset consists of 29 relapsing-remitting and 19 secondary-progressive MS patients. T1 and diffusion tensor imaging (DTI) acquisitions were used to obtain the structural brain network for each subject. The quality of the newly generated brain networks is evaluated by (i) analysing their structural properties and (ii) studying their impact on classification performance.
Results: We demonstrate that advanced generative models can be applied directly to structural brain networks. We show quantitatively and qualitatively that the newly generated data do not differ significantly from the real data. In addition, augmenting the existing dataset with generated samples improves classification performance (F1 score 81%) with respect to the baseline approach (F1 score 66%).
Conclusions: Our approach defines a new tool for biomedical applications when connectome-based data augmentation is needed, providing a valid alternative to usual image-based data augmentation techniques.

Ensemble Learning for Multiple Sclerosis Disability Estimation Using Brain Structural Connectivity

Background: Multiple sclerosis (MS) is an autoimmune inflammatory disease of the central nervous system characterized by demyelination and neurodegeneration processes. It leads to different clinical courses and degrees of disability that need to be anticipated by the neurologist for personalized therapy. Recently, machine learning (ML) techniques have reached a high level of performance in brain disease diagnosis and/or prognosis, but the decision process of a trained ML system is typically nontransparent. Using brain structural connectivity data, a fully automatic ensemble learning model, augmented with an interpretable model, is proposed for the estimation of MS patients' disability, measured by the Expanded Disability Status Scale (EDSS).
Materials and Methods: An ensemble of four boosting-based models (GBM, XGBoost, CatBoost, and LightBoost) organized following a stacking generalization scheme was developed using diffusion tensor imaging (DTI)-based structural connectivity data. In addition, an interpretable model based on conditional logistic regression was developed to explain the best performances in terms of white matter (WM) links for three classes of EDSS (low, medium, and high).
Results: The ensemble model reached an excellent level of performance (root mean squared error of 0.92 ± 0.28) compared with single models and provided a better EDSS estimation using DTI-based structural connectivity data compared with conventional magnetic resonance imaging measures associated with patient data (age, gender, and disease duration). Used for interpretation of the estimation process, the counterfactual method showed the importance of certain brain networks, corresponding mainly to left hemisphere WM links, connecting the left superior temporal with the left posterior cingulate and the right precuneus gray matter regions, and the interhemispheric WM links constituting the corpus callosum. Also, estimation accuracy was better for the high disability class.
Conclusion: The combination of advanced ML models and sensitive techniques such as DTI-based structural connectivity proved useful for estimating MS patients' disability and for identifying the most important brain WM networks involved in disability.
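
As a rough illustration of how a figure such as "0.92 ± 0.28" can be produced, the sketch below computes the root mean squared error per cross-validation fold and summarizes it as mean ± standard deviation. Reading the ± as a standard deviation across folds is an assumption, and the EDSS values are made up for illustration; they are not data from the study.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical (true EDSS, predicted EDSS) pairs for three folds.
folds = [
    ([2.0, 4.0, 6.0], [3.0, 4.0, 5.0]),
    ([1.0, 3.5], [1.5, 2.5]),
    ([5.0, 6.5], [4.0, 7.5]),
]

fold_rmse = [rmse(truth, pred) for truth, pred in folds]
mean_rmse = statistics.mean(fold_rmse)
std_rmse = statistics.stdev(fold_rmse)  # sample standard deviation across folds
print(f"RMSE: {mean_rmse:.2f} ± {std_rmse:.2f}")
```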

Tensor Factorization of Brain Structural Graph for Unsupervised Classification in Multiple Sclerosis

Analysis of longitudinal changes in brain diseases is essential for a better characterization of pathological processes and evaluation of the prognosis. This is particularly important in Multiple Sclerosis (MS), which is the leading non-traumatic disabling disease in young adults, with unknown etiology and characterized by complex inflammatory and degenerative processes leading to different clinical courses. In this work, we propose a fully automated tensor-based algorithm for the classification of MS clinical forms based on the structural connectivity graph of the white matter (WM) network. Using non-negative tensor factorization (NTF), we first focused on the detection of pathological patterns of the brain WM network affected by significant longitudinal variations. Second, we performed unsupervised classification of different MS phenotypes based on these longitudinal patterns, and finally, we used the latent factors obtained by the factorization algorithm to identify the most affected brain regions.

Longitudinal Multiple Sclerosis Lesion Segmentation Using Pre-activation U-Net

Automated segmentation of new multiple sclerosis (MS) lesions in MRI data is crucial for monitoring and quantifying MS progression. Manual delineation of such lesions is laborious and time-consuming since experts need to deal with 3D images and numerous small lesions. We propose a 3D encoder-decoder architecture with pre-activation blocks to segment new MS lesions in longitudinal FLAIR images. We also applied intensive data augmentation and deep supervision to mitigate the limited data and the class imbalance problem. The proposed model, called Pre-U-Net, achieved a Dice score of 0.62 and a sensitivity of 0.58 on the public challenge MSSEG-2 dataset.
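
For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how a Dice score and a sensitivity can be computed from binary masks. The flattened masks are hypothetical, and returning 1.0 when both masks are empty is one common convention rather than something stated in the abstract.

```python
def dice_and_sensitivity(truth, pred):
    """truth, pred: flat sequences of 0/1 voxel labels of equal length."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    # Dice = 2|A ∩ B| / (|A| + |B|); assumed convention: empty vs empty scores 1.0.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    # Sensitivity = fraction of true lesion voxels that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, sensitivity

# Hypothetical flattened masks: 3 true lesion voxels, 3 predicted, 2 overlapping.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
dice, sensitivity = dice_and_sensitivity(truth, pred)
```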

New multiple sclerosis lesion segmentation and detection using pre-activation U-Net

Automated segmentation of new multiple sclerosis (MS) lesions in 3D MRI data is an essential prerequisite for monitoring and quantifying MS progression. Manual delineation of such lesions is time-consuming and expensive, especially because raters need to deal with 3D images and several modalities. In this paper, we propose Pre-U-Net, a 3D encoder-decoder architecture with pre-activation residual blocks, for the segmentation and detection of new MS lesions. Due to the limited training set and the class imbalance problem, we apply intensive data augmentation and use deep supervision to train our models effectively. Following the same U-shaped architecture but different blocks, Pre-U-Net outperforms U-Net and Res-U-Net on the MSSEG-2 dataset, achieving a Dice score of 40.3% on new lesion segmentation and an F1 score of 48.1% on new lesion detection.
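
The difference between a standard (post-activation) residual block and a pre-activation residual block can be written schematically as below. This is a generic sketch of the two orderings, not the exact Pre-U-Net block: Norm stands for whichever normalization layer is used, and the number of convolutions per block is an assumption.

```latex
\begin{aligned}
\text{post-activation:}\quad & x_{l+1} = \mathrm{ReLU}\big(x_l + \mathcal{F}(x_l)\big),
  & \mathcal{F} &= \mathrm{Norm} \circ \mathrm{Conv} \circ \mathrm{ReLU} \circ \mathrm{Norm} \circ \mathrm{Conv} \\
\text{pre-activation:}\quad & x_{l+1} = x_l + \mathcal{F}(x_l),
  & \mathcal{F} &= \mathrm{Conv} \circ \mathrm{ReLU} \circ \mathrm{Norm} \circ \mathrm{Conv} \circ \mathrm{ReLU} \circ \mathrm{Norm}
\end{aligned}
```

In the pre-activation form, normalization and activation come before each convolution, so the skip connection carries the input through unchanged.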

Classification of multiple sclerosis clinical profiles using machine learning and grey matter connectome

Purpose: The main goal of this study is to investigate the discrimination power of Grey Matter (GM) thickness connectome data between Multiple Sclerosis (MS) clinical profiles using statistical and Machine Learning (ML) methods.
Materials and Methods: A dataset composed of 90 MS patients acquired at the MS clinic of Lyon Neurological Hospital was used for the analysis. Four MS profiles were considered, corresponding to Clinically Isolated Syndrome (CIS), Relapsing-Remitting MS (RRMS), Secondary Progressive MS (SPMS), and Primary Progressive MS (PPMS). Each patient was classified in one of these profiles by our neurologist and underwent longitudinal MRI examinations including T1-weighted image acquisition at each examination, from which the GM tissue was segmented and the cortical GM thickness measured. Following the GM parcellation using two different atlases (FSAverage and Glasser 2016), the morphological connectome was built and six global metrics (Betweenness Centrality (BC), Assortativity (r), Transitivity (T), Efficiency (Eg), Modularity (Q) and Density (D)) were extracted. Based on their connectivity metrics, MS profiles were first statistically compared and second, classified using four different learning machines (Logistic Regression, Random Forest, Support Vector Machine and AdaBoost), combined in a higher-level ensemble model by majority voting. Finally, the impact of the GM spatial resolution on the MS clinical profiles classification was analyzed.
Results: Using binary comparisons between the four MS clinical profiles, statistical differences and classification performances higher than 0.7 were observed. Good performances were obtained when comparing the two early clinical forms, RRMS and PPMS (F1 score of 0.86), and the two neurodegenerative profiles, PPMS and SPMS (F1 score of 0.72). When comparing the two atlases, slightly better performances were obtained with the Glasser 2016 atlas, especially between RRMS and PPMS (F1 score of 0.83), compared to the FSAverage atlas (F1 score of 0.69). Also, the thresholding value for graph binarization was investigated, suggesting more informative graph properties in the percentile range between 0.6 and 0.8.
Conclusion: An automated pipeline was proposed for the classification of MS clinical profiles using six global graph metrics extracted from the GM morphological connectome of MS patients. This work demonstrated that GM morphological connectivity data could provide good classification performances by combining four simple ML models, without the cost of long and complex MR techniques, such as MR diffusion, and/or deep learning architectures.
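
Three steps of the pipeline described above (percentile-based graph binarization, a global graph metric, and majority voting) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's implementation: the connectivity matrix is hypothetical and assumed symmetric, the nearest-rank percentile rule is one of several possible conventions, and only density is shown out of the six metrics.

```python
from collections import Counter

def binarize(matrix, quantile):
    """Keep edges whose weight is at or above the given quantile of edge weights."""
    n = len(matrix)
    weights = sorted(matrix[i][j] for i in range(n) for j in range(i + 1, n))
    # Nearest-rank threshold (an assumed convention).
    idx = min(int(quantile * len(weights)), len(weights) - 1)
    threshold = weights[idx]
    return [
        [1 if i != j and matrix[i][j] >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]

def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)

def majority_vote(predictions):
    """predictions: one label per base classifier; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical 4-region connectivity matrix.
w = [
    [0.0, 0.9, 0.2, 0.4],
    [0.9, 0.0, 0.6, 0.1],
    [0.2, 0.6, 0.0, 0.8],
    [0.4, 0.1, 0.8, 0.0],
]
adj = binarize(w, 0.5)
graph_density = density(adj)
label = majority_vote(["RRMS", "PPMS", "RRMS", "RRMS"])
```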

A Kernel Based Blind Source Separation Approach for Classification of Multiple Sclerosis Clinical Profiles

In machine learning, kernel data analysis represents a new approach to the study of neurological diseases such as Multiple Sclerosis (MS). In this work, a kernelization technique was combined with a tensor factorization method based on Multilinear Singular Value Decomposition (MLSVD) for MS profile classification. Our simple, yet effective, approach generates a meaningful feature embedding of multi-view data, allowing good classification performance. The results presented in this work define an interesting approach, given that only the anatomical T1-weighted image was used, which represents the most important modality in clinical applications.

T1/T2 ratio: A quantitative sensitive marker of brain tissue integrity in multiple sclerosis

Background and purpose: The aim of this study is to determine whether cerebral white matter (WM) microstructural damage, defined by decreased fractional anisotropy (FA) and increased axial (AD) and radial (RD) diffusivities, could be detected as accurately by measuring the T1/T2 ratio, in relapsing-remitting multiple sclerosis (RRMS) patients compared to healthy control (HC) subjects.
Methods: Twenty-eight RRMS patients and 24 HC subjects were included in this study. Region-based analysis based on the ICBM-81 diffusion tensor imaging (DTI) atlas WM labels was performed to compare T1/T2 ratio to DTI values in normal-appearing WM (NAWM) regions of interest. Lesion segmentation was also performed and compared to the HC global WM.
Results: A significant 19.65% decrease of T1/T2 ratio values was observed in NAWM regions of RRMS patients compared to HC. A significant 6.30% decrease of FA, as well as significant 4.76% and 10.27% increases of AD and RD, respectively, were observed in RRMS compared to the HC group in various NAWM regions. Compared to the global WM HC mask, lesions have significantly decreased T1/T2 ratio and FA and increased AD and RD (p < .001).
Conclusions: Results showed significant differences between RRMS and HC in both DTI and T1/T2 ratio measurements. T1/T2 ratio even demonstrated extensive WM abnormalities when compared to DTI, thereby highlighting the ratio's sensitivity to subtle differences in cerebral WM structural integrity using only conventional MRI sequences.
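
The percentage differences reported above are relative changes in group means. A minimal sketch of the arithmetic follows, assuming a voxel-wise ratio of T1-weighted to T2-weighted intensities within a region; the intensity values are hypothetical, and preprocessing steps a real pipeline would need (co-registration, bias correction, intensity calibration) are omitted.

```python
from statistics import mean

def t1_t2_ratio(t1, t2):
    """Voxel-wise ratio of T1-weighted to T2-weighted intensities (T2 assumed non-zero)."""
    return [a / b for a, b in zip(t1, t2)]

def percent_change(patient_values, control_values):
    """Relative difference of group means in percent (negative means a decrease)."""
    reference = mean(control_values)
    return 100.0 * (mean(patient_values) - reference) / reference

# Hypothetical intensities for one region of interest.
control_ratio = t1_t2_ratio([100.0, 120.0], [50.0, 60.0])
patient_ratio = t1_t2_ratio([80.0, 96.0], [50.0, 60.0])
change = percent_change(patient_ratio, control_ratio)
```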

Does Initial Access to Bank Loans Predict Start-ups' Future Default Probability? Evidence from Italy

In Europe, several countries have established public loan guarantee funds through direct and indirect loan programs to facilitate access to bank credit for SMEs and start-ups. This paper investigates whether start-ups' level of access to bank loans during the early stage represents an imprinting factor that affects the likelihood of survival once the firm reaches maturity. We rely on a firm-level longitudinal data set of 49,111 Italian start-ups born from 2003 to 2005. Using a 2SLS regression analysis, we show that the initial level of start-up bank debt negatively influences the probability of default, controlling for firm characteristics and performance.
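
To illustrate the idea behind two-stage least squares (2SLS), the sketch below uses one instrument, one endogenous regressor, and no control variables. It is a toy example on synthetic numbers, not the paper's specification: the variable names and data are hypothetical, and the coefficient sign has no relation to the study's finding.

```python
from statistics import mean

def ols(x, y):
    """Simple least squares of y on x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z, one endogenous regressor x, and outcome y."""
    a0, a1 = ols(z, x)                   # stage 1: project x on the instrument
    x_hat = [a0 + a1 * zi for zi in z]   # fitted values, purged of the confounder
    return ols(x_hat, y)                 # stage 2: regress y on the fitted values

# Synthetic data: u confounds x and y; z shifts x but is unrelated to u.
z = [1, 1, -1, -1]
u = [1, -1, 1, -1]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]  # true effect of x on y is 2

naive_slope = ols(x, y)[1]                      # biased by the confounder
iv_slope = two_stage_least_squares(z, x, y)[1]  # recovers the true effect here
```

This manual two-step version only gives the point estimate; the standard errors reported by the second-stage regression would not be valid, which is one reason dedicated 2SLS routines are normally used.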


2021, made in pure Bootstrap 4, inspired by the Academic Template for Hugo