Goal: Structure node packages so AI agents read less and understand more.
Specifically: measure how TypeScript monorepo structure affects context window consumption, and build a tool that quantifies the waste and fixes it.
Repository: markkovari/context-pnpm
When I work on different parts of a codebase with AI assistants, the context window fills up fast. Every file the assistant reads to understand a dependency is loaded in full, including implementation details it will never touch. For a busy utility module, that's thousands of wasted tokens per session, across every file that imports it. I kept hitting conversation compaction earlier than expected, and it was slowing me down.
My theory was that the shape of your modules (how many packages you have, how big they are, how deeply they're nested) directly influences how many tokens get burned just loading context. But I didn't have numbers. I didn't know the threshold where splitting a module actually pays off versus adding maintenance overhead for no gain.
So I built a tool to find out.
I wanted to answer a simple question: given a TypeScript codebase, which files are costing you the most tokens per AI session, and is it worth restructuring them?
The core insight is that file size alone doesn't predict waste. What matters is how much of a file is implementation versus exported API, multiplied by how many files import it. A 10,000-token type declaration file with 98% exports barely registers. A 700-token utility module with a large implementation body, imported by 18 files, costs more than almost anything else.
I landed on this scoring formula:
score = (total_tokens − surface_tokens) × importer_count
| Term | Definition |
|---|---|
| `total_tokens` | Full file token count (tiktoken `cl100k_base`) |
| `surface_tokens` | Token count of only the exported declarations |
| `importer_count` | Number of files that import this one |
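The formula is easy to sketch. Below is an illustrative TypeScript version using the examples from earlier (the 10,000-token declaration file versus the 700-token utility); the `FileStats` shape and function names are hypothetical, not the tool's actual API:

```typescript
// Hypothetical per-file stats, mirroring the three terms of the formula.
interface FileStats {
  path: string;
  totalTokens: number;   // full file token count
  surfaceTokens: number; // tokens in exported declarations only
  importerCount: number; // number of files that import this one
}

// score = (total_tokens - surface_tokens) * importer_count
function score(f: FileStats): number {
  return (f.totalTokens - f.surfaceTokens) * f.importerCount;
}

// Rank the most expensive files first.
function rank(files: FileStats[]): FileStats[] {
  return [...files].sort((a, b) => score(b) - score(a));
}

const declFile: FileStats = { path: "types.d.ts", totalTokens: 10_000, surfaceTokens: 9_800, importerCount: 30 };
const utilFile: FileStats = { path: "util.ts", totalTokens: 700, surfaceTokens: 100, importerCount: 18 };
// score(declFile) = 200 * 30 = 6000; score(utilFile) = 600 * 18 = 10800
```

The small utility module outscores the much larger declaration file, which is exactly the point: implementation weight times fan-in, not raw size, predicts waste.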
💡 If the score is above 60 (the token overhead of `package.json` + `index.ts` boilerplate), extraction into a separate workspace package is worth it. Below that, leave it alone.
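That threshold check is a one-liner. The 60-token figure is the estimate stated above; the helper name is illustrative:

```typescript
// Approximate token cost of the package.json + index.ts boilerplate that a
// new workspace package adds (the author's estimate, not a measured constant).
const BOILERPLATE_OVERHEAD = 60;

// True when extracting a file into its own workspace package pays for itself.
function worthExtracting(score: number): boolean {
  return score > BOILERPLATE_OVERHEAD;
}
```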
External packages
| Package | Purpose |
|---|---|
| tiktoken (OpenAI) | Accurate token counting with cl100k_base encoding |
| typescript-estree (typescript-eslint) | ESTree-compatible AST parser to distinguish exported surface from implementation body |
Internal packages
| Package | Role |
|---|---|
| `analyzer` | Reads folders via glob pattern; returns total tokens, surface tokens, and importer counts |
| `estimator` | Projects token savings per AI session from analyzer output |
| `cli` | User-facing tool: analyze, estimate, scaffold, verify, rebalance. Dry-run by default; nothing is written without `--apply` |
| `scaffolder` | Rewires imports/exports, registers new pnpm workspace packages, generates minimal `index.ts` re-export surfaces |
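A typical session might look like the following. The command names come from the table above, but the exact invocation and flags are hypothetical; check the repository's README for the real interface:

```shell
# Rank files by (total_tokens - surface_tokens) * importer_count
pnpm cli analyze "packages/**/*.ts"

# Project token savings per AI session from the analysis
pnpm cli estimate

# Dry run: show which files would be extracted and how (default behavior)
pnpm cli scaffold

# Actually rewrite imports and register the new workspace packages
pnpm cli scaffold --apply
```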