Leni hits #1 on DRACO — the deep research benchmark by Perplexity & Harvard [hold until press release is live]


ai-edited-image.png

Leni just hit #1 on the DRACO deep research benchmark (71.6%), beating every purpose-built deep research system on the leaderboard.

Perplexity Deep Research: 70.5%
Gemini Deep Research: 59.0%
OpenAI Deep Research (o3): 52.1%
OpenAI Deep Research (o4-mini): 41.9%

social_card2_leaderboard.png

Leni is not a deep research product. It's an AI Business Analyst. It handles spreadsheets, documents, presentations, and real estate analytics.

So how does a general-purpose business analyst outperform dedicated research agents from Perplexity, Google, and OpenAI?

The answer is in where the points come from.

On factual accuracy (did you find the right information?), Leni and Perplexity are nearly identical: 66.7% vs. 66.5%. At the top of the leaderboard, the retrieval race has converged. The leading systems find comparable information.

The gap is in delivery:

→ Presentation Quality: Leni 94.0% vs. Perplexity 82.5% (+11.5pp)
→ Citation Quality: Leni 86.6% vs. Perplexity 78.2% (+8.4pp)

Leni doesn't win by knowing more. It wins by delivering better.

social_card3_delivery_gap.png

Two tools in Leni's production harness make this possible:

  1. A web search agent that retrieves iteratively — it doesn't need to get everything right on the first pass
  2. A research validator that checks the complete, presented output — not a raw draft — and returns specific feedback: "need more evidence on X," "citation Y is unverifiable," "restructure section Z"

The validator can send the search agent back for more rounds. The user only sees the final version after it passes the quality gate.
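
For readers who think in code, the loop described above can be sketched roughly as follows. This is a minimal illustration, not Leni's actual implementation: the names `research_loop`, `Review`, `search`, `present`, `validate`, and `max_rounds` are all hypothetical, and the toy functions at the bottom exist only so the sketch runs end to end.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    """Validator verdict: pass/fail plus specific feedback items (hypothetical shape)."""
    passed: bool
    feedback: List[str] = field(default_factory=list)


def research_loop(
    question: str,
    search: Callable[[str, List[str]], List[str]],
    present: Callable[[str, List[str]], str],
    validate: Callable[[str], Review],
    max_rounds: int = 3,  # assumed cap; the post does not state one
) -> str:
    """Search, present, validate; repeat until the presented report passes the gate."""
    evidence: List[str] = []
    feedback: List[str] = []
    for _ in range(max_rounds):
        # The search agent retrieves iteratively, guided by the validator's feedback.
        evidence += search(question, feedback)
        # The validator checks the complete, presented output, not a raw draft.
        report = present(question, evidence)
        review = validate(report)
        if review.passed:
            return report  # only a report that passed the gate is returned
        feedback = review.feedback
    raise RuntimeError("report did not pass the quality gate")


# Toy stand-ins so the sketch is runnable; real tools would call a search
# backend and a model-based validator.
def toy_search(question: str, feedback: List[str]) -> List[str]:
    if not feedback:
        return ["source A"]
    return [f"source for: {item}" for item in feedback]


def toy_present(question: str, evidence: List[str]) -> str:
    return question + "\n" + "\n".join(f"- {e}" for e in evidence)


def toy_validate(report: str) -> Review:
    # Toy gate: require at least two cited sources.
    if report.count("\n- ") >= 2:
        return Review(passed=True)
    return Review(passed=False, feedback=["need more evidence on X"])


final = research_loop("What drove Q3 rents?", toy_search, toy_present, toy_validate)
print(final)
```

In this toy run the first pass fails the gate with one source, the validator's feedback triggers a second search round, and the second report passes. Raising an error when no round passes is one way to honor "the user only sees the final version after it passes"; a real system might handle that case differently.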

This is the same production harness Leni uses for every task. No benchmark tuning. No research-specific configuration. Just a business analyst with good tools.

social_card4_domains.png