Predictive Modeling in Mortgage Lending
An analysis of risk, equity, and decision-making in U.S. housing finance

This project examines how mortgage approval decisions have evolved in the United States by applying predictive analytics to publicly available Home Mortgage Disclosure Act data. Using loan-level records from Georgia, I developed classification models to assess whether a mortgage application would be approved or denied, with particular attention to how lending behavior shifted between pre-pandemic and post-pandemic conditions.

The analysis compared 2018 data, representing a stable pre-COVID baseline, with 2024 data reflecting a significantly altered lending environment shaped by inflation, interest rate volatility, and post-pandemic risk recalibration. Each dataset was prepared and modeled using RapidMiner, with careful attention to data integrity, ethical considerations, and interpretability. Key variables included loan amount, debt-to-income ratio, loan purpose, occupancy type, applicant income, and census tract–level demographic indicators.

Two primary models were evaluated: logistic regression and decision trees. While logistic regression performed exceptionally well on 2018 data, its accuracy declined in 2024, suggesting that contemporary lending decisions are increasingly influenced by nonlinear and interacting factors. Decision tree models demonstrated greater stability across both periods, highlighting the growing importance of regional, demographic, and structural variables in post-pandemic mortgage approvals.
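To illustrate why interacting factors can hurt a linear model while a tree copes, here is a toy Python sketch. It is not the RapidMiner workflow used in the project, and the four data points are a purely synthetic XOR-style pattern rather than HMDA records. The feature names (high_dti, high_loan) and the labeling rule are hypothetical, chosen only to show an interaction that no single linear boundary can capture.

```python
from itertools import product

# Synthetic, hypothetical pattern (NOT HMDA data): each point is
# ((high_dti, high_loan), approved). The label depends on the interaction
# of the two flags, not on either flag alone (an XOR pattern).
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def accuracy(predict):
    """Share of the four points that a prediction function labels correctly."""
    return sum(predict(x) == y for x, y in points) / len(points)


def best_linear_accuracy():
    """Best accuracy of any linear threshold rule over a coarse grid of weights."""
    grid = [i / 2 for i in range(-8, 9)]  # -4.0 to 4.0 in steps of 0.5
    best = 0.0
    for w1, w2, b in product(grid, repeat=3):
        acc = accuracy(lambda x: int(w1 * x[0] + w2 * x[1] + b > 0))
        best = max(best, acc)
    return best


def tree_predict(x):
    """A depth-2 decision tree: split on high_dti first, then on high_loan."""
    if x[0] == 0:
        return x[1]
    return 1 - x[1]


linear_acc = best_linear_accuracy()
tree_acc = accuracy(tree_predict)
print(f"best linear rule: {linear_acc:.2f}, two-level tree: {tree_acc:.2f}")
```

On this pattern no linear threshold rule classifies all four points correctly, while two nested splits do. Real lending data is far noisier, so this is only a picture of the mechanism, not evidence about the 2018 or 2024 results.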

Beyond model accuracy, the project emphasized the business and ethical implications of predictive decision-making. False positives, meaning applications the model predicted would be approved but that were actually denied, were treated as a critical risk, given their potential to mislead applicants and create operational inefficiencies. By analyzing false positive rates across demographic and geographic segments, the project explored how data-driven tools can support fair lending practices without reinforcing existing disparities.
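A segment-level false positive comparison of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the project's RapidMiner process. It assumes "approved" is the positive class (1) and "denied" is the negative class (0), and the segment labels and sample records are made up for the example.

```python
from collections import defaultdict


def false_positive_rate_by_segment(records):
    """Compute FPR = false positives / actual negatives for each segment.

    records: iterable of (segment, actual, predicted) tuples,
    where 1 = approved and 0 = denied (an assumed encoding).
    Segments with no actual denials are left out of the result.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for segment, actual, predicted in records:
        if actual == 0:
            negatives[segment] += 1
            if predicted == 1:
                false_positives[segment] += 1
    return {seg: false_positives[seg] / n for seg, n in negatives.items()}


# Hypothetical records; "tract_A" and "tract_B" are placeholder segment names.
sample = [
    ("tract_A", 0, 1), ("tract_A", 0, 0), ("tract_A", 1, 1), ("tract_A", 0, 0),
    ("tract_B", 0, 1), ("tract_B", 0, 1), ("tract_B", 1, 1), ("tract_B", 0, 0),
]

rates = false_positive_rate_by_segment(sample)
for segment, rate in sorted(rates.items()):
    print(f"{segment}: FPR = {rate:.2f}")
```

A large gap between segments, as in this made-up sample, would be a prompt to investigate further rather than a conclusion in itself, since small segments can produce unstable rates.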

The findings suggest that mortgage lending has become more complex, context-dependent, and sensitive to broader economic pressures since the pandemic. This work demonstrates how predictive analytics can support responsible underwriting, improve transparency, and adapt lending strategies to evolving borrower realities, while reinforcing the importance of interpretability and equity in financial decision systems.


