7 Hacks to Cut Your Claude Code Usage by 80%

<aside> 📌 Most people burn through their Claude Code tokens in the first 20 minutes. Here's how I make mine last 5x longer — and get better answers while I do it.

</aside>

Hack 01 — The Caveman Method

Claude loves to explain itself. Every answer comes with filler, pleasantries, and "as I mentioned earlier..." recaps. That filler costs you tokens every single message.

The fix: make Claude talk like a caveman. Short sentences. Direct answers. No filler.

There's literally a free tool for this called Caveman — a Claude Code skill built by Julius Brussee that strips out the fluff automatically. Same technical accuracy, ~75% fewer output tokens.

Before (69 tokens): "The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle..."

After (19 tokens): "New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo."
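
To put a number on that before/after pair, here is a quick sketch of the arithmetic in Python. The token counts (69 and 19) come straight from the example above; the helper name `percent_saved` is only illustrative.

```python
def percent_saved(before_tokens: int, after_tokens: int) -> float:
    """Return the percentage of tokens saved going from before to after."""
    if before_tokens <= 0:
        raise ValueError("before_tokens must be positive")
    return (before_tokens - after_tokens) / before_tokens * 100


# Token counts taken from the before/after example above.
saving = percent_saved(69, 19)
print(f"{saving:.0f}% fewer output tokens")  # prints "72% fewer output tokens"
```

That lands close to the ~75% figure quoted for the tool.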

Install it in Claude Code with one command:

claude plugin marketplace add JuliusBrussee/caveman && claude plugin install caveman@caveman

You can pick your level of grunt: Lite, Full, Ultra, or 文言文 (classical Chinese mode, for fun). Repo: https://github.com/JuliusBrussee/caveman

Or if you don't want the tool, just add this line to your CLAUDE.md:

"No filler words. Short sentences. Direct answer only. Show result, no explanation."

Either way, that saves around 40% of the tokens per response.
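
If you go the CLAUDE.md route, a small script can add the rule without duplicating it on repeat runs. This is a minimal sketch, assuming CLAUDE.md sits in your project root; the helper name `add_rule` is hypothetical and not part of Claude Code.

```python
from pathlib import Path

# The rule text quoted above.
RULE = "No filler words. Short sentences. Direct answer only. Show result, no explanation."


def add_rule(path: Path, rule: str = RULE) -> bool:
    """Append rule to the file at path unless it is already there.

    Returns True if the file was changed, False otherwise.
    """
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if rule in existing:
        return False
    # Make sure the rule starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    path.write_text(existing + separator + rule + "\n", encoding="utf-8")
    return True


# Example (assumes you run it from the project root):
#   add_rule(Path("CLAUDE.md"))
```

Running it twice leaves a single copy of the rule in the file.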

Hack 02 — Pick the Right Model

Not every question needs the biggest brain. Running Opus to rename a variable is like driving a tank to the grocery store.

Opus — hard problems, complex reasoning, architecture decisions
Sonnet — everyday building, writing code, fixing bugs (your default)
Haiku — quick stuff, file searches, simple edits

Switch with /model sonnet or /model haiku. The token count per message stays about the same, but smaller models draw far less from your usage limit, so routine tasks cost you roughly 5x less.
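
The rule of thumb above can be written down as a tiny lookup table. This is only a sketch: the task labels are made-up categories for illustration, and the model names are the aliases used with /model in the text.

```python
# Hypothetical task labels mapped to the model tiers described above.
MODEL_FOR_TASK = {
    "architecture": "opus",
    "complex_reasoning": "opus",
    "write_code": "sonnet",
    "bug_fix": "sonnet",
    "file_search": "haiku",
    "simple_edit": "haiku",
}

# Sonnet is the everyday default, so unknown tasks fall back to it.
DEFAULT_MODEL = "sonnet"


def pick_model(task_kind: str) -> str:
    """Return the model alias suggested for a given kind of task."""
    return MODEL_FOR_TASK.get(task_kind, DEFAULT_MODEL)


print(f"/model {pick_model('simple_edit')}")  # prints "/model haiku"
```

Anything not in the table falls back to Sonnet, which matches the "your default" advice.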

Hack 03 — Never Paste Raw Files