PD Model for Retail Customers (Yelo Bank)
Built a Probability of Default (PD) model for retail customers within a one-month deadline mandated by the Central Bank of Azerbaijan.
Data Architecture & Preparation: Defined project scope, Critical Data Elements (CDEs), and feature list. Collected and validated data from Oracle SQL databases, verified against APIs, and cross-checked with business teams for accuracy.
Collaborated with Data Architects and Engineers to design and populate four datamarts:
- Target Datamart — Defines target variable and label structure.
- DPD Datamart — Includes Days-Past-Due (DPD) data from both Central Bank and internal systems.
- Contract-Based Datamart — Aggregates contract-level metrics (sum, min, max, avg) across 252 feature variations (4 aggregation types × 7 time windows × 3 portfolio groups × 3 data sources).
- Application Datamart — Contains operational data collected during customer loan applications.
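The 252 variations in the Contract-Based Datamart are a Cartesian product of four dimensions, which can be sketched as below. Only the four aggregation types are named in the description above; the window lengths, portfolio labels, source labels, and the `amount` metric are hypothetical placeholders chosen only to match the stated counts.

```python
from itertools import product

# Aggregation types are named in the description above.
AGGREGATIONS = ["sum", "min", "max", "avg"]
# Everything below is a hypothetical placeholder: only the counts (7, 3, 3) are known.
WINDOWS_MONTHS = [1, 3, 6, 9, 12, 18, 24]
PORTFOLIOS = ["portfolio_a", "portfolio_b", "portfolio_c"]
SOURCES = ["source_a", "source_b", "source_c"]


def feature_names(metric="amount"):
    """Build one feature name per (aggregation, window, portfolio, source) combination."""
    return [
        f"{metric}_{agg}_{window}m_{portfolio}_{source}"
        for agg, window, portfolio, source in product(
            AGGREGATIONS, WINDOWS_MONTHS, PORTFOLIOS, SOURCES
        )
    ]


names = feature_names()
print(len(names))  # 4 * 7 * 3 * 3 = 252
```

Generating the names from one grid keeps the datamart columns and the model's feature list in sync when a window or source is added.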
Modeling & Evaluation: Applied Weight of Evidence (WOE) binning and Information Value (IV) ranking for feature selection, then fit a Logistic Regression with hyperparameter tuning. Evaluated multiple target definitions (15, 30, 60, and 90 DPD) and loan age constraints (≥6 months, ≥12 months).
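As an illustration of the WOE and IV step, the sketch below computes both for a single pre-binned feature. It assumes the common convention WOE = ln(share of goods / share of bads) and adds a small smoothing constant to avoid division by zero in empty cells; the function name, the smoothing value, and the toy data are assumptions, not the project's actual code or data.

```python
import math
from collections import defaultdict


def woe_iv(bins, targets, smoothing=0.5):
    """Return ({bin: WOE}, IV) for one binned feature; target 1 means default."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [goods, bads]
    for b, y in zip(bins, targets):
        counts[b][1 if y else 0] += 1

    total_good = sum(c[0] for c in counts.values())
    total_bad = sum(c[1] for c in counts.values())
    n_bins = len(counts)

    woe, iv = {}, 0.0
    for b, (good, bad) in counts.items():
        # Smoothed share of all goods / all bads that fall into this bin.
        p_good = (good + smoothing) / (total_good + smoothing * n_bins)
        p_bad = (bad + smoothing) / (total_bad + smoothing * n_bins)
        woe[b] = math.log(p_good / p_bad)
        iv += (p_good - p_bad) * woe[b]
    return woe, iv


# Toy data: the "high" bin holds most of the defaults.
bins = ["low"] * 4 + ["high"] * 4
targets = [0, 0, 0, 1, 0, 1, 1, 1]
woe, iv = woe_iv(bins, targets)
print(woe, iv)
```

Under this convention a negative WOE marks a bin that is riskier than average, and features are typically ranked by IV before being passed to the logistic regression.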
Performance Metrics:
- AUC: 0.78 (train), 0.74 (test), 0.71 (OOT)
- Precision/Recall: stable across datasets
- Approval Rate: optimized via business validation
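For reference, the two quantities behind these metrics can be sketched in a few lines. AUC is computed here as the probability that a randomly chosen defaulter receives a higher PD score than a randomly chosen non-defaulter, and the approval rate as the share of applicants scoring below a cut-off. The function names, the cut-off, and the toy scores are illustrative assumptions, not the project's data or implementation.

```python
def auc(scores, labels):
    """Pairwise AUC: P(score of a defaulter > score of a non-defaulter), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one default and one non-default")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def approval_rate(scores, cutoff):
    """Share of applicants whose predicted PD is below the cut-off."""
    return sum(s < cutoff for s in scores) / len(scores)


# Toy example with made-up PD scores (1 = defaulted).
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
print(auc(scores, labels))         # 0.75
print(approval_rate(scores, 0.5))  # 0.75
```

Running the same `auc` function on train, test, and out-of-time (OOT) samples gives the kind of comparison reported above; the pairwise loop is quadratic, so a rank-based implementation would be preferable on full portfolios.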
Deployment & Validation: Validated by the Internal Validation Team and ZYPLE, then deployed to Kubernetes for real-time scoring and monitoring.
Delivered the full pipeline — from data sourcing to deployment — in one month, setting a new internal benchmark for compliance-driven model delivery.