https://youtu.be/oaG8SDvC8Cg

The end-to-end process involved:

  1. Data Cleaning & Preparation:

    Removed null values, treated outliers, and ensured consistent formatting across historical, current, and future datasets.

  2. Feature Engineering:

    Grouped rare categories, then selected meaningful variables, using Chi-square tests for categorical features and correlation analysis for numerical ones, to avoid redundancy.

  3. Model Building:

    Trained a Random Forest Classifier on historical customer data (calibration dataset) to predict the likelihood of churn.

  4. Model Evaluation:

    Evaluated the model using:

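    Top Decile Lift (TDL): Compares the churn rate among the 10% of customers with the highest predicted risk to the overall churn rate. A TDL of 2 means churners are twice as concentrated in the top decile as in the customer base as a whole.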

    Gini Coefficient: Summarizes how much better the model ranks churners than random guessing across the whole customer base, based on the area between the cumulative lift curve and the random-ranking baseline.

  5. Application on New Data:

    Applied the model to the current and future datasets to score customers and identify high-risk profiles.
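
The two evaluation metrics from step 4 can be illustrated with a minimal sketch in plain Python. This is not the project's actual code: the function names, the toy labels and scores, and the exact Gini convention (twice the area between the cumulative gains curve and the random diagonal) are assumptions made for illustration.

```python
def top_decile_lift(y_true, scores):
    """Churn rate in the top 10% of scored customers divided by the overall churn rate."""
    # Rank customers from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, len(ranked) // 10)  # size of the top decile
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


def gini_coefficient(y_true, scores):
    """Twice the area between the cumulative gains curve and the random diagonal.

    Assumed convention: with this definition a perfect ranking scores
    (1 - churn rate), not 1. Tied scores keep their input order.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_churners = sum(y_true)
    area = 0.0
    captured_prev = 0.0
    churners_seen = 0
    for _, label in ranked:
        churners_seen += label
        captured = churners_seen / total_churners  # share of churners found so far
        area += (captured_prev + captured) / 2 / n  # trapezoid of width 1/n
        captured_prev = captured
    return 2 * (area - 0.5)


# Toy data (hypothetical): 20 customers, 4 churners, scores already descending.
scores = list(range(20, 0, -1))
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1] + [0] * 10

tdl = top_decile_lift(labels, scores)
gini = gini_coefficient(labels, scores)
print(f"TDL = {tdl:.2f}, Gini = {gini:.3f}")
```

TDL looks only at the highest-risk 10% of customers, which is often the group a retention campaign can afford to contact, while the Gini coefficient reflects ranking quality across the entire customer list.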

This project combined data preprocessing, statistical testing, machine learning modeling, and business analytics, showing end to end how a predictive model can address a real business problem. The final output, a churn-risk score for each customer, helps companies focus marketing budgets on high-risk customers, increase retention, and maximize customer lifetime value.