Tech Stack: Python (Pandas), Data Engineering, Financial Modelling, Tableau

Dataset: 2.26 Million Loan Records (Lending Club)

🎯 Executive Summary

This project evaluates the risk-return profile of a $30B+ loan portfolio. I built a data pipeline that processes 2.4 GB of raw data and compares each credit grade against its realized financial performance. Key Discovery: the high-risk segments (Grades C-G) show a critical pricing failure. Their default rate significantly outpaces the interest premium charged, producing an expected loss of up to 12.5 cents per dollar lent in the riskiest tier.

Repository: https://github.com/ZIXUANZHAO1998/Credit-Risk-Profitability-Study

🛠 Technical Workflow


1. Data Engineering & "Slimming"
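
The slimming step can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the column list `KEEP_COLUMNS`, the helper name `slim_loans`, and the tiny inline CSV are all assumptions made for the example. The idea is to load only the columns the analysis needs, read the file in chunks, and store low-cardinality text columns as categoricals.

```python
import io
import pandas as pd

# Hypothetical column subset; the original pipeline may keep different fields.
KEEP_COLUMNS = ["loan_amnt", "int_rate", "grade", "loan_status", "total_pymnt"]

def slim_loans(source, columns=KEEP_COLUMNS, chunksize=100_000):
    """Read a wide CSV in chunks, loading only the listed columns."""
    chunks = list(pd.read_csv(source, usecols=columns, chunksize=chunksize))
    slim = pd.concat(chunks, ignore_index=True)
    # Low-cardinality text columns take far less memory as categoricals.
    for col in ("grade", "loan_status"):
        if col in slim.columns:
            slim[col] = slim[col].astype("category")
    return slim

# Synthetic stand-in for the raw file (not real Lending Club records).
sample_csv = io.StringIO(
    "id,loan_amnt,int_rate,grade,loan_status,total_pymnt,url,desc\n"
    "1,10000,7.5,A,Fully Paid,10800.0,http://x,foo\n"
    "2,20000,22.0,F,Charged Off,6000.0,http://y,bar\n"
    "3,15000,13.0,C,Fully Paid,17000.0,http://z,baz\n"
)
slim = slim_loans(sample_csv, chunksize=2)
print(slim.dtypes)
```

On the real file, `source` would be the path to the raw CSV. Dropping unused columns at read time is usually where most of the memory saving comes from.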

2. Financial Logic Cleaning
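
One plausible form of this cleaning step is shown below. It is a sketch under stated assumptions rather than the project's exact rules: it assumes that only loans with a final outcome should be scored, that "Charged Off" and "Default" both count as defaults, and that `int_rate` may arrive either as a number or as text such as "13.56%". The names `clean_loans`, `DEFAULT_STATUSES`, and `CLOSED_STATUSES` are hypothetical.

```python
import pandas as pd

# Assumed status labels; adjust to whatever the raw data actually contains.
DEFAULT_STATUSES = {"Charged Off", "Default"}
CLOSED_STATUSES = DEFAULT_STATUSES | {"Fully Paid"}

def clean_loans(df):
    """Keep loans with a known outcome and add a boolean default flag."""
    # Loans still in progress (e.g. "Current") have no realized outcome yet.
    out = df[df["loan_status"].isin(CLOSED_STATUSES)].copy()
    # Normalize the interest rate to a float in percent; bad values become NaN.
    rate_text = out["int_rate"].astype(str).str.strip().str.rstrip("%")
    out["int_rate"] = pd.to_numeric(rate_text, errors="coerce")
    out["is_default"] = out["loan_status"].isin(DEFAULT_STATUSES)
    return out.dropna(subset=["int_rate"]).reset_index(drop=True)

# Synthetic rows for illustration only.
raw = pd.DataFrame({
    "loan_status": ["Fully Paid", "Current", "Charged Off", "Fully Paid"],
    "int_rate": [" 7.5%", "12.0%", 22.0, "n/a"],
})
cleaned = clean_loans(raw)
print(cleaned)
```

Excluding open loans avoids counting them as non-defaults, which would understate the default rate for recent vintages.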

📊 Key Insights & Visualization

1. Risk-Return Divergence Analysis


Figure 1: Interest Rate vs. Default Rate: The Widening Risk Gap
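
A comparison like the one in Figure 1 can be tabulated per grade with a short aggregation. The sketch below assumes a cleaned frame with `grade`, `int_rate` (in percent), and a boolean `is_default` column; the helper name `grade_summary` and the `risk_gap` measure are illustrative choices, and the input rows are synthetic, not results from the Lending Club data. The gap is only a crude indicator: it sets a lifetime default rate against an annual interest rate and ignores recoveries, so it is not the expected-loss figure quoted in the summary.

```python
import pandas as pd

def grade_summary(df):
    """Average interest rate and default rate (both in %) per credit grade."""
    summary = df.groupby("grade").agg(
        loans=("is_default", "count"),
        avg_int_rate=("int_rate", "mean"),
        default_rate=("is_default", "mean"),
    )
    summary["default_rate"] *= 100  # fraction of loans -> percent
    # Positive gap: defaults outpace the interest charged for that grade.
    summary["risk_gap"] = summary["default_rate"] - summary["avg_int_rate"]
    return summary

# Synthetic loans for illustration only.
loans = pd.DataFrame({
    "grade": ["A", "A", "A", "A", "F", "F", "F", "F"],
    "int_rate": [7.0, 7.0, 8.0, 8.0, 24.0, 24.0, 26.0, 26.0],
    "is_default": [False, False, False, False, False, False, True, True],
})
summary = grade_summary(loans)
print(summary)
```

Plotting `avg_int_rate` and `default_rate` against grade from a table like this gives the two lines whose divergence the figure highlights.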

Why does the portfolio lose money in high-risk segments?