In today's digital world, scam messages clutter our inboxes and pose potential threats. Identifying and filtering these messages automatically has become a crucial task.
Our goal in this capstone project is to build an intelligent scam detection system, using a Generative AI pipeline, that accurately classifies each message as scam or not scam.
We used a labeled dataset containing SMS/email messages, each labeled as "scam" or "not scam". The dataset was preprocessed and vectorized to feed into machine learning models.
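To make the preprocessing and vectorization step concrete, here is a minimal sketch using only the Python standard library. It assumes a simple bag-of-words representation, which may differ from the vectorizer actually used in this project. The sample messages and the helper names `preprocess` and `vectorize` are hypothetical and serve only as illustration.

```python
import re
from collections import Counter

# Hypothetical toy messages; the project's real dataset is not shown here.
messages = [
    "WIN a FREE prize now!!!",
    "Are we still meeting for lunch?",
    "Free entry: claim your prize",
]

def preprocess(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

tokenized = [preprocess(m) for m in messages]

# Build a sorted vocabulary so each token maps to a fixed column position.
vocabulary = sorted({tok for toks in tokenized for tok in toks})

def vectorize(tokens):
    """Return a bag-of-words count vector over the vocabulary (assumed scheme)."""
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocabulary]

vectors = [vectorize(toks) for toks in tokenized]
```

Each message becomes a fixed-length list of token counts, which is the kind of numeric input a classifier can consume. A weighted scheme such as TF-IDF would follow the same shape but rescale the counts.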
Label encoding: 0 (not scam), 1 (scam).
Understanding the balance of scam vs. not scam messages helps us gauge how skewed our dataset is.
🟢 Bar chart showing distribution of "scam" and "not scam" messages
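The class balance behind a chart like this can be computed in a few lines. The sketch below assumes the 0/1 label encoding and uses a made-up list of labels, not the project's data. The function name `class_distribution` is hypothetical.

```python
from collections import Counter

# Hypothetical labels using the encoding 0 = not scam, 1 = scam.
labels = [0, 0, 0, 0, 1, 0, 1, 0]

LABEL_NAMES = {0: "not scam", 1: "scam"}

def class_distribution(labels):
    """Return {class name: (count, share of dataset)} for each label."""
    counts = Counter(labels)
    total = len(labels)
    return {
        LABEL_NAMES[k]: (counts[k], counts[k] / total)
        for k in sorted(LABEL_NAMES)
    }

distribution = class_distribution(labels)

# Print a crude text bar chart, one '#' per message.
for name, (count, share) in distribution.items():
    print(f"{name:>8} | {'#' * count} {count} ({share:.0%})")
```

A heavily lopsided share for one class is a signal that plain accuracy may be misleading and that metrics such as precision and recall deserve attention.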