🌐 Problem Statement

Scam messages clutter our inboxes and can put recipients at risk. Identifying and filtering these messages automatically has become an important task.

Our goal in this capstone project is to build an intelligent scam detection system, using a Generative AI pipeline, that accurately classifies whether a message is a scam or not.


🔬 Dataset Overview

We used a dataset of SMS/email messages, each labeled "scam" or "not scam". The messages were preprocessed and vectorized before being fed into machine learning models.
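
The write-up does not say which preprocessing or vectorization method was used. As a minimal sketch, assuming lowercase word tokenization and a bag-of-words count representation, the step could look like the following. The sample messages and the helper names `tokenize`, `build_vocabulary`, and `vectorize` are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(message):
    """Lowercase a message and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", message.lower())


def build_vocabulary(messages):
    """Map each distinct token to a column index, in sorted token order."""
    tokens = sorted({tok for msg in messages for tok in tokenize(msg)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(message, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(message))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical sample messages, not taken from the project dataset.
messages = ["WIN a FREE prize now", "Are we still meeting now"]
vocabulary = build_vocabulary(messages)
vectors = [vectorize(m, vocabulary) for m in messages]
```

Each message becomes a fixed-length list of token counts that a classifier can consume. A weighted scheme such as TF-IDF would be a drop-in alternative if raw counts give too much weight to common words.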


📈 Message Class Distribution

Looking at the balance of "scam" vs. "not scam" messages shows how skewed our dataset is.

🟢 Figure (spam_distribution.png): bar chart showing the distribution of "scam" and "not scam" messages
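
To put numbers on the skew alongside the chart, the per-class counts and shares could be computed roughly as follows. This is a sketch under assumptions: the label list is made up, and `class_distribution` is an illustrative helper name rather than the project's code.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: (count, share of total)} for a list of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}  # no labels, so there is nothing to summarize
    return {label: (n, n / total) for label, n in counts.items()}


# Hypothetical labels, not the real dataset.
labels = ["not scam"] * 8 + ["scam"] * 2
distribution = class_distribution(labels)
for label, (n, share) in distribution.items():
    print(f"{label}: {n} messages ({share:.0%})")
```

A large gap between the two shares suggests that accuracy alone may be misleading, and that per-class metrics such as precision and recall are worth reporting.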


🧑‍💻 Tools & Technologies