Hi 👋 I'm Stephanie Smith

I am a data scientist based in the Bay Area, CA. I turn complex datasets into insights that make technology more intuitive, programs more effective, and experiences more meaningful. Whether it be for the consumer, user, or participant, I am passionate about using data to elevate the human experience.

Get to know me:

👨‍💻 GitHub

👤 LinkedIn

🎨 Tableau Portfolio

✉️ stjsmith8@gmail.com


Latest Project →

Career Highlights →

About Me →

Data Science Projects


Predicting Customer Churn

In this project, I explore customer churn prediction. I compare evaluation metrics across multiple machine learning models and find that an XGBoost model delivers the strongest predictive performance. The analysis informs retention strategy by showing that customer spending attributes, such as total transaction amount and revolving balance, are the features most strongly associated with churn.
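As a rough illustration of what comparing evaluation metrics across models can look like, here is a minimal Python sketch. The labels, predictions, and model names (`model_a`, `model_b`) are invented for the example and are not the project's data or results; the project itself may have used different metrics and tooling.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for binary labels where 1 means 'churned'."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical held-out labels and predictions from two hypothetical models.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
predictions = {
    "model_a": [1, 0, 0, 0, 0, 1, 0, 0],
    "model_b": [1, 0, 1, 1, 0, 1, 0, 0],
}

# Score every model on the same held-out labels, then rank by F1.
scores = {name: classification_metrics(y_true, preds)
          for name, preds in predictions.items()}
best = max(scores, key=lambda name: scores[name]["f1"])

for name, metrics in scores.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
print("Highest F1:", best)
```

For churn, recall is often weighted heavily because a missed churner is usually costlier than a false alarm, which is one reason to look at several metrics rather than accuracy alone.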

Spotify User Segmentation Analysis

This analysis of Spotify user data uses K-Modes clustering to reveal three distinct segments: free mainstream music listeners, playlist-oriented users, and premium-leaning podcast enthusiasts. The segments differ in demographics, device usage, subscription preferences, and content choices, providing actionable insights for targeted marketing and product strategy.
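For readers unfamiliar with K-Modes, the sketch below shows the core idea in plain Python: records are assigned to the closest "mode" by counting mismatched categories, and each mode is then updated to the most frequent value per attribute. The records, attribute choices, and starting modes are made up for illustration, and the function `k_modes` is a hypothetical helper rather than the code used in the project.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value in each column of a group of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, initial_modes, max_iter=20):
    """Alternate between assignment and mode updates until modes stop changing."""
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        # Assign each record to the nearest mode (ties go to the lowest index).
        labels = [min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
                  for r in records]
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            # Keep the old mode if a cluster ends up empty.
            new_modes.append(column_modes(members) if members else old)
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


# Hypothetical records: (plan, preferred content, main device).
records = [
    ("free", "music", "phone"),
    ("free", "music", "phone"),
    ("free", "music", "desktop"),
    ("premium", "podcast", "phone"),
    ("premium", "podcast", "speaker"),
    ("premium", "podcast", "phone"),
]

labels, modes = k_modes(records, initial_modes=[records[0], records[3]])
print(labels)
print(modes)
```

Unlike K-Means, this never averages values, which is why it suits survey-style data where every attribute is a category.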

Wine Sales Forecasting (Time Series Analysis & Shiny App)

This project applies classical time-series forecasting methods (ETS, ARIMA, and TSLM) to Australian wine sales data to compare model performance across wine varietals. An interactive Shiny application enables users to explore trends, seasonality, and forecasts in real time. The live application is available here: Australian Wine Sales Forecasting App

Academic and Professional Highlights


🧠 📈 📚 Graduate Research in Economics and Data Science

I earned my Master of Science in International & Development Economics from the University of San Francisco, where I built a strong foundation in statistics, survey design, and causal inference methodologies. During this time, I served as a Teaching Assistant for graduate-level econometrics, providing support with instruction and student guidance.

My graduate thesis applied machine learning techniques to identify mental health disorders in child refugee populations using therapy-induced drawings. I manually coded over 2,000 drawings to create training data with indicators of mental health conditions such as PTSD and depression, which were then combined with formal mental health diagnoses. I applied Lasso regression to explore the relationship between psychological trauma and the digitally coded drawing indicators. The features retained by the Lasso were consistent with established clinical correlations between drawing characteristics and psychological distress.
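To show what it means for the Lasso to "retain" features, here is a small from-scratch sketch of Lasso fitted by coordinate descent. It is an illustration only: the data are invented, the function names are hypothetical, and the thesis analysis would have relied on a standard statistical package rather than this code.

```python
def soft_threshold(value, threshold):
    """Shrink a value toward zero; anything within the threshold becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Hypothetical coded indicators: only the first column truly drives the outcome.
X = [
    [1, 1, 1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]
y = [2.1, 1.9, -2.0, -2.0]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
retained = [j for j, w_j in enumerate(weights) if w_j != 0.0]
print(weights)
print("Retained feature indices:", retained)
```

The L1 penalty sets weakly related coefficients to exactly zero, so the surviving features form a short list that can be checked against prior clinical findings.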

The results demonstrate that children's drawings can serve as a rapid, cost-effective, and non-invasive diagnostic tool for assessing mental health in humanitarian crisis settings, offering valuable insights into both exposure to violence and refugee integration outcomes. My work contributed to research that was published in the Journal of Development Economics. You can read the full paper here: Identifying psychological trauma among Syrian refugee children for early intervention: Analyzing digitized drawings using machine learning