World Life Expectancy Analysis (MySQL)
📌 Project Overview
This project focuses on cleaning and analyzing global life expectancy data using MySQL. The dataset contains health, economic, and demographic indicators across countries and years, with several real-world data quality issues such as missing values, inconsistencies, and duplicates.
The goals of this project are to:
Prepare a reliable, analysis-ready dataset through structured SQL data cleaning.
Extract meaningful insights related to life expectancy, economic factors, and health indicators.
All logic is clearly documented using in-line SQL comments, making the project easy to follow and reproducible.
🧱 Dataset Description
The dataset includes country-level data across multiple years with the following key attributes:
Demographics: Country, Year, Status (Developed / Developing)
Health Indicators: Life Expectancy, Adult Mortality, Infant Deaths, HIV/AIDS, BMI
Immunization: Polio, Diphtheria
Economic Factors: GDP, Percentage Expenditure
Education & Nutrition: Schooling, Thinness (1–19 & 5–9 years)
Example records include countries such as Afghanistan, Albania, Algeria, Australia, Argentina, and many more, spanning multiple years.
🧹 Part 1: Data Cleaning (SQL)
The cleaning phase focuses on making the dataset accurate, consistent, and trustworthy for analysis.
Key Cleaning Steps:
Checking for duplicates with the COUNT and CONCAT functions and, if any are found, deleting them from the table.
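The duplicate check can be sketched roughly as follows. This is a minimal illustration, not the project's exact script: the table name `world_life_expectancy` and the unique `Row_ID` column are assumptions made for the example, and the delete step assumes MySQL 8.0 or later, since it relies on the `ROW_NUMBER()` window function.

```sql
-- Assumed names (hypothetical, not confirmed by the project):
--   table  world_life_expectancy
--   column Row_ID, a unique identifier for each row

-- Step 1: find Country + Year combinations that appear more than once.
SELECT CONCAT(Country, Year) AS country_year,
       COUNT(*)              AS occurrences
FROM world_life_expectancy
GROUP BY CONCAT(Country, Year)
HAVING COUNT(*) > 1;

-- Step 2: delete every copy after the first in each Country + Year group.
-- The inner query is wrapped in a derived table so MySQL allows the
-- DELETE to read from the same table it modifies.
DELETE FROM world_life_expectancy
WHERE Row_ID IN (
    SELECT Row_ID
    FROM (
        SELECT Row_ID,
               ROW_NUMBER() OVER (
                   PARTITION BY CONCAT(Country, Year)
                   ORDER BY Row_ID
               ) AS row_num
        FROM world_life_expectancy
    ) AS ranked
    WHERE row_num > 1
);
```

Running Step 1 again after the delete should return no rows if the duplicates were removed. Keeping the row with the lowest `Row_ID` is an arbitrary choice for this sketch; any rule that keeps exactly one row per country and year would work.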