Projects

Last updated in January 2022.

Deep Learning

Learning to detect rib fractures in chest X-rays

In this research project, we developed Convolutional Neural Networks (CNNs) with ResNet-based architectures to detect rib fractures in chest X-rays. We used a patch-based transfer learning paradigm with image augmentations to generate the probability of fracture in each region of the whole X-ray. Despite limited training data, this training paradigm helped us achieve an ROC-AUC of 0.75 along with model explainability.

This work will be presented at the Society for Pediatric Radiology Annual Meeting in April 2022 and is due for publication in a peer-reviewed journal. All code will be made available after publication.
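
The patch-based scoring idea can be illustrated with a minimal sketch. This is not the project's code (which is not yet released): the function names, the patch size, the stride, and the `mean_intensity` scorer are all hypothetical stand-ins, and a trained CNN would take the place of the scorer. The sketch slides a window over a grayscale image, scores each patch, and averages overlapping scores into a per-pixel map.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def patch_origins(height: int, width: int, patch: int, stride: int) -> List[Tuple[int, int]]:
    """Return the top-left corner of every full patch that fits in the image."""
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    return [(r, c) for r in rows for c in cols]


def patch_probability_map(
    image: Image,
    score_patch: Callable[[Image], float],
    patch: int,
    stride: int,
) -> Image:
    """Score each patch and average overlapping scores into a per-pixel map."""
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r, c in patch_origins(height, width, patch, stride):
        crop = [row[c:c + patch] for row in image[r:r + patch]]
        p = score_patch(crop)
        for i in range(r, r + patch):
            for j in range(c, c + patch):
                total[i][j] += p
                count[i][j] += 1
    # Pixels not covered by any patch keep a score of 0.0.
    return [
        [total[i][j] / count[i][j] if count[i][j] else 0.0 for j in range(width)]
        for i in range(height)
    ]


def mean_intensity(crop: Image) -> float:
    """Hypothetical stand-in for a trained CNN: the mean pixel value of the patch."""
    values = [v for row in crop for v in row]
    return sum(values) / len(values)


# Toy 4x4 "X-ray" with a bright 2x2 region in the bottom-right corner.
toy_image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
heatmap = patch_probability_map(toy_image, mean_intensity, patch=2, stride=2)
```

With non-overlapping patches, as here, each pixel simply inherits the score of its patch. A stride smaller than the patch size gives a smoother map, which is one simple way to localize a finding for explainability.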

Classical Machine Learning

Personalized prediction of asthma persistence

In this NIH-funded research project, we developed a machine learning solution to predict persistent asthma in school-aged children using their EHR data. The analysis pipeline consisted of feature selection, class balancing, Bayesian hyperparameter tuning, and model evaluation. We achieved an ROC-AUC of 0.86 (95% precision, 82% recall at 70% specificity) for our XGBoost model, demonstrating the quantitative success of machine learning on a novel task. Further, we addressed clinical explainability needs through a qualitative audit of our model using permutation analysis. We found diagnosis age, race, prior diagnoses of allergic rhinitis and eczema, health service utilization, and prescription of Montelukast (an asthma controller medication) prior to age 5 years to be important predictors of asthma persistence.

This work was published in PLoS One.
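
For readers unfamiliar with permutation analysis, here is a minimal sketch of the general technique, not the published pipeline. It assumes a generic `predict` function and uses accuracy as the metric for simplicity; the toy model and data are hypothetical. Each feature column is shuffled in turn, and the average drop in the metric is taken as that feature's importance.

```python
import random
from typing import Callable, List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[Sequence[float]], int],
    X: List[List[float]],
    y: Sequence[int],
    n_repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [list(row[:j]) + [v] + list(row[j + 1:]) for row, v in zip(X, column)]
            drops.append(baseline - accuracy(y, [predict(row) for row in shuffled]))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row: Sequence[float]) -> int:
    """Hypothetical model that only looks at feature 0 and ignores feature 1."""
    return int(row[0] > 0.5)


X_toy = [[0.1, 5.0], [0.9, 1.0], [0.2, 3.0], [0.8, 4.0],
         [0.3, 2.0], [0.7, 6.0], [0.4, 8.0], [0.6, 7.0]]
y_toy = [toy_model(row) for row in X_toy]
toy_importances = permutation_importance(toy_model, X_toy, y_toy)
```

In this toy setup the ignored feature gets an importance of exactly zero, while the feature the model relies on shows a positive drop. In practice the same loop would be run on held-out data with the metric of interest, such as ROC-AUC.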

Asthma Biomarker Detection

Asthma is a common allergic disorder characterized by airway inflammation and obstruction that affects at least 300 million people and accounts for up to 1 in 250 deaths worldwide. However, currently available diagnostic tests for asthma are either invasive or require patient compliance, making them difficult to use in children. In this NIH-funded research project, we developed a machine learning-based diagnostic tool to distinguish between asthmatic and healthy subjects using their breath samples. The breath samples were processed using a GC/MS-qTOF system to create a high-dimensional dataset, which was then used to detect asthma biomarkers using filter-based feature selection and machine learning algorithms such as XGBoost and Kernel SVM. We achieved an ROC-AUC of 0.91 (93% TNR at 70% TPR), outperforming previously developed linear and rule-based systems. We also identified clinically relevant biomarkers such as toluene, pentanoic acid, 3-carene, and cyclohexylmethane.

This work was presented at the International Conference on Health Informatics 2022.
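
The metrics quoted above (ROC-AUC, and TNR at a fixed TPR) can be computed from labels and scores as in the sketch below. This is an illustrative pure-Python version with made-up scores, not the evaluation code or data from the project; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_scores(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Separate scores of positive (label 1) and negative (label 0) samples."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    return pos, neg


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos, neg = split_scores(y_true, scores)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tnr_at_tpr(y_true: Sequence[int], scores: Sequence[float], target_tpr: float) -> float:
    """Best true negative rate among thresholds whose true positive rate meets the target."""
    pos, neg = split_scores(y_true, scores)
    best = 0.0
    for threshold in sorted(set(scores)):
        tpr = sum(s >= threshold for s in pos) / len(pos)  # samples at or above threshold are called positive
        tnr = sum(s < threshold for s in neg) / len(neg)
        if tpr >= target_tpr:
            best = max(best, tnr)
    return best


# Made-up labels and scores, for illustration only.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
model_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

auc = roc_auc(labels, model_scores)                       # 15 of 16 pairs ranked correctly
tnr_75 = tnr_at_tpr(labels, model_scores, target_tpr=0.75)
```

Reporting TNR at a fixed TPR pins the model to one clinically meaningful operating point, which complements the threshold-free ROC-AUC.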

Course Projects

Characterizing Air Quality in Philadelphia

In this exploratory data analysis project, I studied environmental conditions (air quality, humidity, noise level, and temperature) in the city of Philadelphia. For the analysis, I scraped most of the data from the publicly available AirCasting database, collected some of my own using AirBeam sensors, and leveraged Python’s superior data munging and R’s amazing visualization capabilities.

Will Kobe make this shot?

In this fun application of Machine Learning, we used a publicly available dataset containing all 30,697 shots that Kobe Bryant attempted over the course of his NBA career to answer two questions:

  1. Can we predict the outcome of Kobe’s shot with reasonable accuracy?
  2. What are the important parameters in determining the outcome of Kobe’s shot?

Predicting Readmission Probability for Diabetes Inpatients

In this predictive analytics project, we used a dataset made publicly available by the Center for Clinical and Translational Research at Virginia Commonwealth University to help better manage diabetes patients admitted to the hospital. Our goals were twofold:

  1. Identify the factors determining whether or not the patient will be readmitted within 30 days.
  2. Develop a classification model to predict if a patient will be readmitted within 30 days.

Multi-class Classification of Tweets

In a “classic course project” fashion, we explored the basics of Natural Language Processing to classify the sentiment of tweets.