I ranked the 15 companies on three investor-facing dimensions: market trajectory, wedge clarity/defensibility, and vision ceiling. A secondary feasibility filter accounted for which companies I could credibly evaluate or build for in one day, given my software engineering and applied AI background.
Track choice: Track 2 (Agentic GTM). A built GTM agent demo is a more interactive artifact than a written product test, so Track 2 lets me ship a working agent and a live dashboard rather than a long evaluation writeup.
Top 5, scored
| Rank | Company | Market | Wedge | Vision Ceiling | Notes |
|---|---|---|---|---|---|
| 1 | Entire | H | H | H | Less public coverage, target user matches my domain |
| 2 | Mem0 | H | H | H | Crowded submission space, well-covered publicly |
| 3 | BAML | M-H | H | M-H | Niche developer audience, sharper as a Track 1 test |
| 4 | Quill | M | M | M | Crowded category, GTM problem is the whole game |
| 5 | Rasa | M | M | M | Mature, less acute GTM problem (organic distribution) |
H is High, M-H is Medium-High, M is Medium. I used letter ratings rather than numbers to avoid false precision on qualitative dimensions.
Companies I evaluated
| Rank | Company | Read |
|---|---|---|
| 1 | Entire | Context preservation infra for AI coding agents. $60M seed, the largest dev-tools seed on record. Vision ceiling is GitHub-scale infrastructure for the agentic era. Wedge is racing to default before GitHub, Cursor, or Anthropic absorb the category. Two months old, less coverage to compete with. |
| 2 | Mem0 | Memory layer for LLM apps. Real OSS traction, in Basis Set portfolio. Foundational primitive every agent product eventually needs. Risk: memory commoditizes into frameworks. |
| 3 | BAML | Schema-first prompt language. Treats prompts as a typed interface. Vision ceiling is "the JSX of AI apps". Risk: LLM SDKs absorb the tooling. |
| 4 | Quill | AI meeting notes for B2B sales. Clear ICP, real GTM problem. Crowded category (Fathom, Granola, Gong), so execution and integrations are the moat. Vision ceiling bounded by category. |
| 5 | Rasa | Open-source conversational AI framework, mature. Defensibility is community plus enterprise relationships. Repositioning into an agent framework for regulated enterprises is the open question. |
| 6 | Tigris | Globally distributed object storage built for AI workloads. Real infra credibility. Risk: AWS S3 and Cloudflare R2 dominate, sales motion is long. |
| 7 | Primitive | Hardware engineering platform for embedded teams. Underserved category. Vision ceiling high if they become default for modern hardware development. Smaller TAM than pure software. |
| 8 | Atuin | Encrypted shell history sync. Excellent product, real community. Path to venture-scale outcome is unclear. Terminal tooling rarely supports $1B+ companies. |
| 9 | Parasail | AI inference platform. Crowded space (Together, Modal, Replicate, Fireworks). Differentiation requires deep infra benchmarking. |
| 10 | Hallway | Couldn't find enough public material to form a confident view on vision, wedge, or moat. Would risk being shallow. |
Companies I set aside
Five companies on the list (Vizcom, Solvely, Flint, Beeble, and Reve) operate in domains where credible diligence requires sector experience I don't have. Industrial design tooling, K-12 education sales motion, VFX and film production workflows, and creative-professional image generation each demand specific working knowledge of the buyer, the existing competitive set, and the channel economics. I could produce a surface-level read on each, but I'd rather acknowledge the gap than rank them on superficial signals. In a longer diligence window, the right move on these would be to consult someone with domain experience before forming a view.
Why Entire
Entire is two months out from public launch, has raised the largest dev-tools seed on record ($60M), and is led by Thomas Dohmke (ex-CEO of GitHub). The product solves a problem every Claude Code, Cursor, and Codex user hits today: state lost between agent sessions, mode drift across turns, and persistent memory rules that don't actually persist. The best-case outcome is that Entire becomes the substrate layer beneath every AI coding session, plausibly a $10B+ infrastructure company given the trajectory of agentic coding tool adoption, the gross-margin profile of developer infrastructure plays, and the revenue scale comparable companies have already reached (Cursor at $2B ARR, Vercel at $340M ARR).
The risk is real and time-pressured: the incumbents (Cursor, Anthropic, GitHub) could absorb context preservation natively before Entire reaches escape velocity. Speed-to-default matters more than feature parity at this stage.
The wedge I built against in the writeup is moment-of-pain engagement: reaching developers and maintainers at the point where they articulate session-context failures in public. The clearest example is a recent issue in anthropics/claude-code, where Claude lost track of conversation mode across turns, repeated implementation mistakes after correction, and violated its own saved memory rule. That's exactly the failure Entire's product addresses. I also used Entire myself during the build and found an onboarding edge case worth filing upstream (covered in the writeup).