🎯 Objective
This project explores the relationship between lifestyle factors, health conditions, and the prevalence of heart disease using data from the CDC's Behavioral Risk Factor Surveillance System (BRFSS). The goal was to identify key risk factors and visualize trends across age, race, and geographic region.
📦 Dataset
- Source: CDC BRFSS 2022 Survey Data
- Size: Over 445,000 individual responses
- Features: Age, gender, race, diabetes, smoking, exercise, alcohol use, blood pressure, mental health, and more
🛠️ Tools Used
- R: Logistic regression modeling
- Python: Exploratory data analysis and visualizations
- SQL (AWS RDS): Querying structured survey data
- AWS S3: Dataset storage and access
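As a rough illustration of the exploratory step, the sketch below computes unweighted heart disease prevalence per group in plain Python. The records are invented toy data, and the field names `state` and `heart_disease` are assumptions, not actual BRFSS variable names. A real analysis would likely also apply the BRFSS survey weights, which this sketch ignores.

```python
from collections import defaultdict

# Toy records, invented for illustration only; these are NOT BRFSS values.
# The field names "state" and "heart_disease" are hypothetical.
records = [
    {"state": "A", "heart_disease": True},
    {"state": "A", "heart_disease": False},
    {"state": "A", "heart_disease": False},
    {"state": "A", "heart_disease": False},
    {"state": "B", "heart_disease": True},
    {"state": "B", "heart_disease": True},
    {"state": "B", "heart_disease": False},
    {"state": "B", "heart_disease": False},
]


def prevalence_by_group(rows, group_key, outcome_key):
    """Return {group: share of rows in that group where the outcome is true}."""
    totals = defaultdict(int)
    cases = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            cases[group] += 1
    return {group: cases[group] / totals[group] for group in totals}


prevalence = prevalence_by_group(records, "state", "heart_disease")

# Rank groups from highest to lowest unweighted prevalence.
ranked = sorted(prevalence.items(), key=lambda item: item[1], reverse=True)
for group, share in ranked:
    print(f"{group}: {share:.1%}")
```

The same function can be reused with a different `group_key` (for example an age band or race category field) to produce the other breakdowns.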
📊 Key Findings
- Diabetes, COPD, kidney disease, and smoking were strongly associated with heart disease
- The highest heart disease prevalence was observed in men aged 65+ and among white non-Hispanic individuals
- States with the highest prevalence: Mississippi, West Virginia, Arkansas
- States with the lowest prevalence included Hawaii, Colorado, and Utah
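To make the risk factor findings above more concrete, here is a minimal sketch of how the strength of a single binary risk factor could be quantified. It computes an unadjusted odds ratio with a 95% Wald confidence interval from a 2x2 table. This is a simpler stand-in for the project's logistic regression in R, not the model itself: for one binary predictor with no covariates, the exponentiated logistic regression coefficient equals this odds ratio. The counts are invented for illustration, and the function name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Unadjusted odds ratio from a 2x2 table, with a 95% Wald interval.

    Returns (odds_ratio, lower_bound, upper_bound).
    All four cell counts must be positive.
    """
    ratio = (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)
    # Standard error of log(odds ratio) under the usual Wald approximation.
    std_err = math.sqrt(
        1 / exposed_cases
        + 1 / exposed_noncases
        + 1 / unexposed_cases
        + 1 / unexposed_noncases
    )
    log_ratio = math.log(ratio)
    lower = math.exp(log_ratio - 1.96 * std_err)
    upper = math.exp(log_ratio + 1.96 * std_err)
    return ratio, lower, upper


# Invented toy counts, NOT BRFSS results:
# 30 of 100 people with the risk factor report heart disease,
# versus 10 of 100 people without it.
ratio, lower, upper = odds_ratio(30, 70, 10, 90)
print(f"odds ratio = {ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that lies entirely above 1 suggests the risk factor is associated with higher odds of heart disease. Unlike the multivariable model, this estimate does not adjust for age, sex, or other covariates.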