- B.Sc. in Statistics;
- M.Sc. in Data Science;
- Currently enrolled in the PhD program in Physics and Nanosciences at the Università di Genova, in collaboration with the RNA System Biology lab (Istituto Italiano di Tecnologia).
You can contact me at email@example.com
Bini_FirstYearReport.pdf (79.71 KB)
Prediction of ICU admission for COVID-19 patients: a machine learning approach based on complete blood count data
In this article we discuss the development of prognostic Machine Learning (ML) models for COVID-19 progression: specifically, we address the task of predicting intensive care unit (ICU) admission in the next 5 days. We developed three ML models on the basis of 4995 Complete Blood Count (CBC) tests. The models differ in terms of interpretability: two are fully interpretable and one is a black box. We report an AUC of 0.81 and 0.83 for the interpretable models (the decision tree and logistic regression, respectively), and an AUC of 0.88 for the black-box model (an ensemble). This shows that CBC data and ML methods can be used for cost-effective prediction of ICU admission of COVID-19 patients: in particular, as the CBC can be acquired rapidly through routine blood exams, our models could also be applied in resource-limited settings and to get fast indications at triage and daily rounds.
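To make the AUC figures easier to read: the AUC is the probability that a randomly chosen positive case (here, a patient admitted to ICU) receives a higher model score than a randomly chosen negative case. The sketch below computes it directly from that definition. It is only an illustration, not the code used in the article: the function name `roc_auc`, the labels, and the scores are all made up for the example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = admitted to ICU within 5 days, 0 = not admitted.
labels = [1, 1, 1, 0, 0, 0, 0]
# Hypothetical model scores (higher = higher predicted risk).
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# 11 of the 12 (positive, negative) pairs are ranked correctly.
print(round(roc_auc(labels, scores), 3))  # 0.917
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based computation would be preferable on a dataset of thousands of tests.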
Machine learning methods applied to genotyping data capture interactions between single nucleotide variants in late onset Alzheimer's disease
Genome-wide association studies (GWAS) in late onset Alzheimer's disease (LOAD) provide lists of individual genetic determinants. However, GWAS do not capture the synergistic effects among multiple genetic variants and lack good specificity. We applied tree-based machine learning algorithms (MLs) to discriminate LOAD (>700 individuals) and age-matched unaffected subjects in UK Biobank with single nucleotide variants (SNVs) from Alzheimer's disease (AD) studies, obtaining specific genomic profiles with the prioritized SNVs. MLs prioritized a set of SNVs located in genes PVRL2, TOMM40, APOE, and APOC1, also influencing gene expression and splicing. The genomic profiles in this region showed interaction patterns involving rs405509 and rs1160985, also present in the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. rs405509, located in the APOE promoter, interacts with rs429358, among others, seemingly neutralizing their predisposing effect. Our approach efficiently discriminates LOAD from controls, capturing genomic profiles defined by interactions among SNVs in a hot-spot region.