Data Cleaning & Preparation:
Removed null values, treated outliers, and ensured consistent formatting across historical, current, and future datasets.
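The cleaning steps above could look roughly like the following pandas sketch. The function name clean_customers, the snake_case column convention, and the 1.5 * IQR clipping rule are illustrative assumptions; the write-up does not say which outlier treatment was actually used.

```python
import pandas as pd


def clean_customers(df, numeric_cols):
    """Return a cleaned copy of df (hypothetical helper, not the project's code).

    numeric_cols must use the normalized (snake_case) column names.
    """
    out = df.copy()
    # Consistent formatting: strip, lower-case, and snake_case the column names.
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Remove rows that contain null values.
    out = out.dropna().reset_index(drop=True)
    # Treat outliers by clipping to the 1.5 * IQR fences (assumed rule).
    for col in numeric_cols:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    return out
```

Applying the same function to the historical, current, and future datasets is one way to keep their formatting consistent.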
Feature Engineering:
Grouped rare categories and selected meaningful variables, using Chi-square tests for categorical features and correlation analysis for numerical ones to avoid redundancy.
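A minimal sketch of these three steps, assuming pandas, NumPy, and SciPy. The helper names and the thresholds (5% minimum share, 0.05 significance level, 0.9 correlation cutoff) are hypothetical choices for illustration, not values taken from the project.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def group_rare(series, min_share=0.05, other="Other"):
    """Replace categories whose relative frequency is below min_share."""
    share = series.value_counts(normalize=True)
    rare = share[share < min_share].index
    return series.where(~series.isin(rare), other)


def chi2_selected(df, cat_cols, target, alpha=0.05):
    """Keep categorical columns whose Chi-square test against target has p < alpha."""
    keep = []
    for col in cat_cols:
        table = pd.crosstab(df[col], df[target])
        _, p_value, _, _ = chi2_contingency(table)
        if p_value < alpha:
            keep.append(col)
    return keep


def drop_correlated(df, num_cols, threshold=0.9):
    """Drop the later column of any numeric pair with |correlation| > threshold."""
    corr = df[num_cols].corr().abs()
    # Keep only the strict upper triangle so each pair is checked once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = {c for c in upper.columns if (upper[c] > threshold).any()}
    return [c for c in num_cols if c not in to_drop]
```

Dropping the later column of each highly correlated pair is only one convention; keeping the variable with the stronger relationship to churn would be an equally reasonable rule.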
Model Building:
Trained a Random Forest Classifier on historical customer data (calibration dataset) to predict the likelihood of churn.
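As an illustration, training could be sketched with scikit-learn as below. The synthetic data, the function name train_churn_model, and the hyperparameters are assumptions; the project's real features and settings are not given in this summary.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def train_churn_model(X_calibration, y_calibration, random_state=0):
    """Fit a Random Forest on the calibration data (assumed hyperparameters)."""
    model = RandomForestClassifier(n_estimators=200, random_state=random_state)
    model.fit(X_calibration, y_calibration)
    return model


# Synthetic stand-in for the historical (calibration) dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)

model = train_churn_model(X, y)
# Column 1 of predict_proba corresponds to class 1 (churn) when labels are 0/1.
churn_probability = model.predict_proba(X)[:, 1]
```

The predicted probability, rather than the hard class label, is what the ranking-based metrics and the customer scoring rely on.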
Model Evaluation:
Evaluated the model using:
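Top Decile Lift (TDL): Measures how concentrated actual churners are among the 10% of customers with the highest predicted churn probability, expressed as the churn rate in that top decile divided by the overall churn rate.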
Gini Coefficient: Assesses how much better the model ranks churners than random guessing, based on the cumulative lift curve over all customers rather than only the top decile.
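Both metrics can be computed from the true labels and the predicted scores alone. The sketch below uses plain Python and assumes labels are 0/1 with 1 meaning churn. The Gini here is taken as twice the area between the cumulative gains curve and the diagonal; other definitions exist (for example 2 * AUC - 1), and this summary does not say which one the project used.

```python
def top_decile_lift(y_true, scores):
    """Churn rate in the top 10% of scores divided by the overall churn rate."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, len(ranked) // 10)
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


def gini_coefficient(y_true, scores):
    """Twice the mean gap between the cumulative gains curve and the diagonal."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_churners = sum(y_true)
    captured = 0
    area = 0.0
    for i, (_, label) in enumerate(ranked, start=1):
        captured += label
        # Share of churners captured so far minus share of customers contacted.
        area += captured / total_churners - i / n
    return 2.0 * area / n
```

A TDL of 1 means the top decile contains no more churners than a random 10% sample would. Under this definition, a Gini near 0 indicates a ranking no better than random, and a perfect ranking reaches one minus the overall churn rate.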
Application on New Data:
Applied the model on current and future datasets to score customers and identify high-risk profiles.
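Scoring new customers might be sketched as follows, assuming scikit-learn and pandas. The column names, the synthetic data, and the rule that flags the top 10% of scores as high risk are hypothetical; the project may have used a different cutoff.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def score_customers(model, customers, feature_cols, top_share=0.10):
    """Add churn probabilities and flag the highest-scoring share as high risk."""
    scored = customers.copy()
    # Column 1 of predict_proba is the churn class when labels are 0/1.
    scored["churn_probability"] = model.predict_proba(scored[feature_cols])[:, 1]
    scored = scored.sort_values("churn_probability", ascending=False)
    scored = scored.reset_index(drop=True)
    n_high = max(1, int(len(scored) * top_share))
    scored["high_risk"] = scored.index < n_high
    return scored


# Synthetic stand-ins for the historical and current datasets.
features = ["f1", "f2", "f3"]
rng = np.random.default_rng(0)
history = pd.DataFrame(rng.normal(size=(200, 3)), columns=features)
history["churn"] = (history["f1"] > 0).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(history[features], history["churn"])

current = pd.DataFrame(rng.normal(size=(50, 3)), columns=features)
scored = score_customers(model, current, features)
```

The resulting table is sorted from highest to lowest risk, so a retention campaign with a fixed budget can work down the list from the top.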
This project combined data preprocessing, statistical testing, machine learning modeling, and business analytics, offering a complete view of how AI can solve a real business problem. The final output helps companies optimize marketing budgets, increase customer retention, and maximize lifetime value by acting on model insights.