Type: GTM Operations / RevOps Tooling

Stack: Bash, Python 3

Repo: github.com/Francessmarty/HubSpot-Contact-Audit

Status: Complete


The Problem

Most HubSpot portals collect data faster than anyone cleans it. Contacts pile up with missing job titles, no lifecycle stage, and phone numbers that were never captured. Duplicates appear every time a list is imported twice. Inactive contacts sit untouched for months, inflating contact counts and skewing reporting. Most teams only find out when a sales rep complains or a campaign bounces, and by then the damage is already done. I could not find a lightweight tool that would take a raw HubSpot export, scan it automatically, and report exactly what was wrong in plain language.


What I Built

A Bash script that audits any HubSpot contacts CSV export and writes a structured report in four parts:

  1. Duplicate contacts flagged by email address
  2. Contacts missing required properties (job title, phone, lifecycle stage)
  3. Contacts inactive for 90 or more days
  4. A summary CSV with counts across all categories

The script runs from the terminal in a single command and produces a timestamped report folder in under 30 seconds.
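
The three checks listed above can be sketched in a few lines of Python 3 using only the standard library. This is an illustration of the logic, not the Bash script itself. The column names, the ISO date format, and the choice to treat a blank activity date as inactive are all assumptions made for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timedelta

# Assumed column names; a real HubSpot export may label these differently.
REQUIRED = ["Job Title", "Phone Number", "Lifecycle Stage"]
INACTIVE_DAYS = 90


def audit(rows, today):
    """Return (duplicates, missing, inactive) as lists of contact rows."""
    # Count emails case-insensitively, ignoring blank values.
    emails = Counter(r["Email"].strip().lower() for r in rows if r["Email"].strip())
    duplicates = [r for r in rows if emails[r["Email"].strip().lower()] > 1]

    # A contact is flagged if any required property is empty.
    missing = [r for r in rows if any(not r[c].strip() for c in REQUIRED)]

    # Inactive means the last activity is 90 or more days before `today`.
    cutoff = today - timedelta(days=INACTIVE_DAYS)
    inactive = []
    for r in rows:
        raw = r["Last Activity Date"].strip()
        # Assumption: dates are YYYY-MM-DD, and a blank date counts as inactive.
        if not raw or datetime.strptime(raw, "%Y-%m-%d") <= cutoff:
            inactive.append(r)
    return duplicates, missing, inactive


# Made-up sample data standing in for an export file.
SAMPLE = """Email,Job Title,Phone Number,Lifecycle Stage,Last Activity Date
a@example.com,CEO,555-0100,lead,2024-05-01
A@example.com,CTO,555-0101,customer,2024-01-01
b@example.com,,555-0102,lead,2024-04-20
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
duplicates, missing, inactive = audit(rows, today=datetime(2024, 6, 1))
print(len(duplicates), len(missing), len(inactive))
```

Passing `today` in as an argument, rather than reading the clock inside the function, keeps the 90-day check repeatable against a fixed export.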

How It Works

Export contacts from HubSpot as CSV. Run the script against the file:

bash hubspot_audit.sh hubspot_audit_ready.csv

The script auto-detects HubSpot column names, handles variations between free and paid plan exports, and writes four clean CSV files to a timestamped output folder. No setup. No API token required. No external libraries.
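
One way to picture the column auto-detection is a small alias table that maps each property to the header spellings it might appear under. The Python 3 sketch below shows the idea. The alias table and the `detect_columns` helper are hypothetical names for this example, and the header variants are illustrative guesses rather than a verified list of what HubSpot exports contain.

```python
# Hypothetical alias table: canonical field -> header spellings to accept.
# The variants are illustrative guesses, not a verified list of HubSpot headers.
ALIASES = {
    "email": ("email", "email address"),
    "job_title": ("job title", "jobtitle"),
    "phone": ("phone number", "phone"),
    "lifecycle_stage": ("lifecycle stage", "lifecyclestage"),
}


def detect_columns(header):
    """Map each canonical field to the matching column name in this export."""
    # Compare case-insensitively, but remember the original header text.
    normalized = {name.strip().lower(): name for name in header}
    mapping = {}
    for field, variants in ALIASES.items():
        for variant in variants:
            if variant in normalized:
                mapping[field] = normalized[variant]
                break
    return mapping


# Made-up header row standing in for the first line of an export.
header = ["First Name", "Email Address", "Jobtitle", "Phone Number"]
columns = detect_columns(header)
missing = sorted(set(ALIASES) - set(columns))
print(columns)
print("not found:", missing)
```

Fields that no alias matches are simply absent from the mapping, so the caller can decide whether to skip that check or stop with an error.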