
Leni just hit #1 on the DRACO deep research benchmark (71.6%), beating every purpose-built deep research system on the leaderboard.
→ Perplexity Deep Research: 70.5%
→ Gemini Deep Research: 59.0%
→ OpenAI Deep Research (o3): 52.1%
→ OpenAI Deep Research (o4-mini): 41.9%

Leni is not a deep research product. It's an AI Business Analyst. It handles spreadsheets, documents, presentations, and real estate analytics.
So how does a general-purpose business analyst outperform dedicated research agents from Perplexity, Google, and OpenAI?
The answer is in where the points come from.
On factual accuracy (did you find the right information?), Leni and Perplexity are nearly identical (66.7% vs. 66.5%). At the top of the leaderboard, the retrieval race has converged: the leading systems find comparable information.
The gap is in delivery:
→ Presentation Quality: Leni 94.0% vs. Perplexity 82.5% (+11.5pp)
→ Citation Quality: Leni 86.6% vs. Perplexity 78.2% (+8.4pp)
Leni doesn't win by knowing more. It wins by delivering better.

Two tools in Leni's production harness make this possible:
The validator can send the search agent back for more rounds. The user only sees the final version after it passes the quality gate.
This is the same production harness Leni uses for every task. No benchmark tuning. No research-specific configuration. Just a business analyst with good tools.
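
For readers who want to picture the quality gate, here is a minimal Python sketch of a validate-and-retry loop like the one described above. It is an illustration, not Leni's implementation: `run_with_quality_gate`, `Review`, the `search` and `validate` callables, and the `max_rounds` cap are all hypothetical names and assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    """Validator verdict (hypothetical shape, not Leni's actual schema)."""
    passed: bool
    feedback: str = ""


def run_with_quality_gate(
    search: Callable[[str, str], str],
    validate: Callable[[str], Review],
    task: str,
    max_rounds: int = 3,  # assumed cap; the post does not state a limit
) -> Optional[str]:
    """Return the first draft that passes validation, or None if none does."""
    feedback = ""
    for _ in range(max_rounds):
        # The search agent gets the task plus any feedback from the last round.
        draft = search(task, feedback)
        review = validate(draft)
        if review.passed:
            # Only a draft that clears the gate is ever returned to the caller.
            return draft
        # Otherwise the validator's feedback drives another search round.
        feedback = review.feedback
    return None
```

In this sketch, failed drafts never leave the loop, which mirrors the idea that the user only sees the version that passed the gate. What happens when no draft passes is a design choice the post does not cover; returning `None` is just one option.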
