📌 Detecting Scam Messages with GenAI: A Capstone Project

🌐 Problem Statement

Scam messages clutter our inboxes and put recipients at risk of fraud. Identifying and filtering them automatically has become a crucial task.

Our goal in this capstone project is to build an intelligent scam detection system using a Generative AI pipeline that can accurately classify whether a message is a scam or not.

🔬 Dataset Overview

We used a labeled dataset containing SMS/email messages, each labeled as "scam" or "not scam". The dataset was preprocessed and vectorized to feed into machine learning models.

Total records: ~6,000 messages
Target labels: 0 (not scam), 1 (scam)
Data type: Text (Natural Language)
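
To make the preprocessing and vectorization step concrete, here is a minimal sketch. It is not the project's exact pipeline: the four toy messages, the column names `text`, `label`, and `clean_text`, and the choice of TF-IDF as the vectorizer are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy data standing in for the real ~6,000-message dataset.
df = pd.DataFrame({
    "text": [
        "Congratulations! You won a free prize, click here to claim",
        "Are we still meeting for lunch tomorrow?",
        "URGENT: verify your bank account now to avoid suspension",
        "Can you send me the report by Friday?",
    ],
    "label": [1, 0, 1, 0],  # 0 = not scam, 1 = scam
})

# Minimal preprocessing (assumed): lowercase and trim surrounding whitespace.
df["clean_text"] = df["text"].str.lower().str.strip()

# TF-IDF is an assumed choice; the write-up only says the text was "vectorized".
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(df["clean_text"])  # sparse matrix: messages x vocabulary terms
y = df["label"].to_numpy()

print(X.shape)
```

Each row of `X` is one message and each column is one vocabulary term, so `X` and `y` can be passed directly to a scikit-learn classifier.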

📈 Message Class Distribution

Understanding the balance of scam vs. not-scam messages helps us gauge how skewed our dataset is.

🟢 Bar chart showing distribution of "scam" and "not scam" messages
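
The numbers behind a chart like this can be computed with pandas. The sketch below uses a made-up label column (six "not scam", two "scam"), not the project's real counts, and the `imbalance_ratio` helper value is our own illustrative name.

```python
import pandas as pd

# Hypothetical labels for illustration only: 0 = not scam, 1 = scam.
labels = pd.Series([0, 0, 0, 0, 0, 0, 1, 1], name="label")

# Absolute counts and relative shares per class, ordered by label value.
counts = labels.value_counts().sort_index()
shares = labels.value_counts(normalize=True).sort_index()

summary = pd.DataFrame({"count": counts, "share": shares}).rename(
    index={0: "not scam", 1: "scam"}
)
print(summary)

# Ratio of the majority class to the minority class; 1.0 means perfectly balanced.
imbalance_ratio = counts.max() / counts.min()
print(imbalance_ratio)
```

A ratio well above 1 suggests that plain accuracy will be misleading, and that metrics such as precision and recall on the scam class deserve attention.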

🧑‍💻 Tools & Technologies

Python
Scikit-learn
Pandas, NumPy