For this project, I explored the Spotify Tracks dataset, which contains thousands of songs with detailed audio features and metadata. My goal was simple: understand what makes a track popular, uncover patterns in musical attributes, and see how well machine learning models can predict popularity.
I began with data cleaning and exploration, then dug into trends across genres, release years, and features like danceability, energy, and tempo. From there, I trained a few models to predict popularity scores, keeping the focus on interpretability and insights that could actually help artists, producers, and marketers.
The big takeaway is that popularity isn’t just about how a song sounds. Audio features do play a role, but timing, artist reach, and cultural trends are just as important. Models can highlight useful patterns, but they’ll never fully explain why one track becomes a global hit while another fades into the background.
This case study is an exercise in music analytics: it uses the Spotify Tracks dataset to explore what drives song popularity. The dataset combines metadata (artist, album, release year) with audio features such as danceability, energy, valence, tempo, and loudness, plus a popularity score for each track.
I worked with Python (pandas, scikit-learn, seaborn, matplotlib) in a Jupyter Notebook to clean the data, run exploratory analysis, and build predictive models. The deliverables include a cleaned dataset, clear visuals, baseline models, and practical insights that connect the technical results back to real-world music trends.
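To make that workflow concrete, here is a minimal sketch of the cleaning and baseline-modeling steps. It is an illustration under stated assumptions, not the notebook's actual code: the column names in `AUDIO_FEATURES`, the helper names `clean_tracks` and `fit_baseline`, and the choice of plain linear regression are all hypothetical, and the data is synthetic so the example runs without the real CSV.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Assumed column names; adjust them to match the actual dataset.
AUDIO_FEATURES = ["danceability", "energy", "valence", "tempo", "loudness"]
TARGET = "popularity"


def clean_tracks(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows and rows missing a feature or the target."""
    cols = AUDIO_FEATURES + [TARGET]
    return df.drop_duplicates().dropna(subset=cols).reset_index(drop=True)


def fit_baseline(df: pd.DataFrame, seed: int = 0):
    """Fit a linear regression on audio features; return (model, test R^2)."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[AUDIO_FEATURES], df[TARGET], test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    return model, r2_score(y_test, model.predict(X_test))


# Synthetic stand-in data (NOT real Spotify values); the relationship
# between features and popularity below is arbitrary.
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "danceability": rng.uniform(0, 1, n),
    "energy": rng.uniform(0, 1, n),
    "valence": rng.uniform(0, 1, n),
    "tempo": rng.uniform(60, 200, n),
    "loudness": rng.uniform(-30, 0, n),
})
demo["popularity"] = (
    40 * demo["danceability"] + 20 * demo["energy"] + rng.normal(0, 10, n)
).clip(0, 100)
demo.loc[0, "energy"] = np.nan  # simulate one missing value

clean = clean_tracks(demo)
model, r2 = fit_baseline(clean)
print(f"rows kept: {len(clean)}, test R^2: {r2:.2f}")
```

With the real dataset, `demo` would be replaced by the loaded CSV. Keeping the cleaning step in its own function makes it easy to reuse the same cleaned frame for both the exploratory plots and the models.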
Main question
What really drives a track’s popularity on Spotify, and can we use audio features alone to predict it?
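One way to frame the second half of that question is to check whether an audio-only model beats a predictor that always guesses the mean popularity. The sketch below is one possible setup, not the method used in this project: the helper `compare_to_mean_baseline`, the random forest, and the synthetic data (with an arbitrary built-in signal) are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Assumed column names; adjust them to match the actual dataset.
AUDIO_FEATURES = ["danceability", "energy", "valence", "tempo", "loudness"]


def compare_to_mean_baseline(X: pd.DataFrame, y: pd.Series, cv: int = 5) -> dict:
    """Mean cross-validated R^2 for a mean predictor and an audio-only model."""
    candidates = {
        "mean_baseline": DummyRegressor(strategy="mean"),
        "audio_only": RandomForestRegressor(n_estimators=100, random_state=0),
    }
    return {
        name: cross_val_score(est, X, y, cv=cv, scoring="r2").mean()
        for name, est in candidates.items()
    }


# Synthetic stand-in data (NOT real Spotify values); the signal is arbitrary.
rng = np.random.default_rng(1)
n = 300
X = pd.DataFrame({
    "danceability": rng.uniform(0, 1, n),
    "energy": rng.uniform(0, 1, n),
    "valence": rng.uniform(0, 1, n),
    "tempo": rng.uniform(60, 200, n),
    "loudness": rng.uniform(-30, 0, n),
})
y = pd.Series(60 * X["danceability"] + rng.normal(0, 5, n), name="popularity")

scores = compare_to_mean_baseline(X[AUDIO_FEATURES], y)
for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

The mean predictor's cross-validated R² can never exceed zero, so it gives a floor: the further the audio-only score rises above it, the more of the variation in popularity the audio features account for on their own.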
Why it matters
For artists and producers, understanding these drivers can shape creative decisions and marketing strategies. For streaming platforms, predictive models can help improve recommendations, build smarter playlists, and keep listeners engaged.