Highlights
- Developed a full workflow from data cleaning to model evaluation
- Compared Logistic Regression and XGBoost for binary classification
- Improved detection of high-risk patients with XGBoost relative to Logistic Regression
- Identified prior inpatient visits and disease severity as major drivers of readmission risk
Tools and methods
- Python
- SQL
- Scikit-learn
- XGBoost
- Time-series modeling
- Data cleaning
- Visualization
- Healthcare data analysis
Project link
Project Repo
Prior inpatient visits and severity-related clinical variables stood out as the strongest contributors to predicted readmission risk.
The curve reflects meaningful lift over baseline while supporting a higher-recall screening workflow for high-risk patients.
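
As a rough illustration of what "lift over baseline" and a "higher-recall screening workflow" can mean in practice, the sketch below picks the strictest score threshold that still reaches a target recall, then reports precision, recall, and lift (precision among flagged patients divided by the overall readmission rate). The function names, the 0.75 recall target, and the ten toy labels and scores are assumptions made up for this example; they are not the project's data, results, or actual code.

```python
def threshold_for_recall(y_true, scores, target_recall):
    """Return the highest score threshold whose recall meets target_recall."""
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    positives = sum(y_true)
    if positives == 0:
        raise ValueError("need at least one positive label")
    # Walk candidate thresholds from strictest to loosest.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        if tp / positives >= target_recall:
            return t
    return min(scores)  # unreachable for valid input; kept as a safe fallback


def screening_metrics(y_true, scores, threshold):
    """Precision, recall, and lift over the base rate at a given threshold."""
    flagged = [y for y, s in zip(y_true, scores) if s >= threshold]
    positives = sum(y_true)
    tp = sum(flagged)
    precision = tp / len(flagged) if flagged else 0.0
    base_rate = positives / len(y_true)
    return {
        "flagged": len(flagged),
        "precision": precision,
        "recall": tp / positives,
        "lift": precision / base_rate,
    }


# Toy data, invented for illustration: 1 = readmitted, 0 = not readmitted.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
# Hypothetical model risk scores for the same ten patients.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

threshold = threshold_for_recall(y_true, scores, target_recall=0.75)
print(threshold, screening_metrics(y_true, scores, threshold))
```

Lowering the threshold flags more patients, which raises recall at the cost of precision; a lift above 1 means the flagged group is readmitted more often than the population as a whole.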