Diabetes Risk Factor Analysis


Executive Summary

This project analyzes health and lifestyle factors associated with diabetes using a dataset of 200 adult patients. Excel was used for data cleaning, transformation, and descriptive statistics, while Power BI was used to build an interactive dashboard for visualization. The analysis identified BMI, cholesterol level, age, and physical activity as the factors most closely associated with diabetes status. The dashboard shows higher diabetes prevalence among older adults, individuals with high BMI, and those with low physical activity. This project demonstrates end‑to‑end data analytics skills, including cleaning, modeling, exploratory analysis, and business intelligence reporting.

Problem Statement

The goal of this project is to identify the demographic, lifestyle, and clinical factors most associated with diabetes. Understanding these relationships can support healthcare professionals in developing targeted prevention strategies and improving early detection efforts.

Data Source & Description

The dataset consists of 200 patient records with variables such as age, gender, BMI, smoking status, alcohol intake, physical activity level, blood pressure, cholesterol, and diabetes status. Additional derived columns (BMI category, age group, hypertension flag, etc.) were created during preprocessing to enable deeper insights.
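
The derived columns were built in Excel, but the same logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's actual formulas: the field names (bmi, age, systolic_bp, diastolic_bp) are hypothetical, the BMI bands follow the standard WHO adult cut-offs with a single "Obese" band, the age bins other than 50-59 and 60+ are guesses, and the hypertension threshold of 140/90 mmHg is one common clinical convention that the report does not state.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value to a WHO adult band (single 'Obese' band assumed)."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:
        return "Normal"
    if bmi < 30:
        return "Overweight"
    return "Obese"


def age_group(age: int) -> str:
    """Bin age into decades. Only 50-59 and 60+ appear in the report;
    the younger bins are assumptions made for this sketch."""
    if age >= 60:
        return "60+"
    if age >= 50:
        return "50-59"
    if age >= 40:
        return "40-49"
    if age >= 30:
        return "30-39"
    return "18-29"


def hypertension_flag(systolic: float, diastolic: float) -> bool:
    """Flag hypertension at >= 140/90 mmHg (assumed threshold)."""
    return systolic >= 140 or diastolic >= 90


def add_derived_columns(record: dict) -> dict:
    """Return a copy of the record with the three derived fields added.
    Input keys are hypothetical names, not the dataset's real headers."""
    return {
        **record,
        "bmi_category": bmi_category(record["bmi"]),
        "age_group": age_group(record["age"]),
        "hypertension": hypertension_flag(
            record["systolic_bp"], record["diastolic_bp"]
        ),
    }
```

For example, calling add_derived_columns on a record with age 62, BMI 31.4, and blood pressure 128/92 would label it "Obese", "60+", and hypertensive. In Excel the same bands would typically be expressed as nested IF or IFS formulas.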

Tools Used

Microsoft Excel: Data cleaning, transformation, pivot tables, descriptive statistics

Power BI: Data modeling, interactive dashboards, DAX measures

Key Findings

Diabetes prevalence in the dataset: X%

Prevalence was highest in the 50–59 and 60+ age groups

Obese BMI categories showed the highest diabetes rates

Low physical activity was strongly associated with diabetes

Cholesterol levels were higher among diabetic patients

Hypertensive individuals were more likely to have diabetes
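
The comparisons above all rest on one calculation: the share of patients with diabetes, overall and within each group. A minimal Python sketch of that calculation follows. The records are toy values invented for illustration and are not drawn from the project's dataset, and the field names (age_group, diabetes) are hypothetical. In the project itself this would correspond to a pivot table in Excel or a DAX measure in Power BI.

```python
from collections import defaultdict


def prevalence(records, flag="diabetes"):
    """Percentage of records where the flag is true (0.0 for empty input)."""
    records = list(records)
    if not records:
        return 0.0
    positives = sum(1 for r in records if r[flag])
    return 100.0 * positives / len(records)


def prevalence_by(records, key, flag="diabetes"):
    """Prevalence within each distinct value of `key`, as a dict."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return {group: prevalence(rows, flag) for group, rows in groups.items()}


# Toy records for illustration only; NOT the project's data.
sample = [
    {"age_group": "60+", "diabetes": True},
    {"age_group": "60+", "diabetes": True},
    {"age_group": "60+", "diabetes": False},
    {"age_group": "30-39", "diabetes": False},
    {"age_group": "30-39", "diabetes": True},
    {"age_group": "30-39", "diabetes": False},
    {"age_group": "30-39", "diabetes": False},
]

if __name__ == "__main__":
    print(f"Overall: {prevalence(sample):.1f}%")
    for group, pct in sorted(prevalence_by(sample, "age_group").items()):
        print(f"{group}: {pct:.1f}%")
```

With only 200 patients, some groups will be small, so reporting the group size alongside each percentage helps readers judge how much weight a difference can bear.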

Recommendations

Encourage physical activity programs, especially for sedentary adults