Automated research → polished report, end to end. Drop in a URL or file. Get back a fully rendered statistical report with narrative, tables, and charts.
| Field | Value |
|---|---|
| Status | Active Development |
| Owner | Justin Verlin |
| Last updated | April 9, 2026 |
| Repo | - |
| Stack | Python 3.10 · Quarto · litellm · OpenAI |
| Outputs | HTML · PDF · DOCX |
The pipeline has six stages. Each stage is independent and can be swapped out.
Source (URL or file) → Parse → Chunk → LLM Extract (map-reduce) → LLM Report Writer → Quarto .qmd → HTML/PDF/DOCX
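One way to read "independent and can be swapped out" is that each stage is a plain callable that takes the previous stage's output. The sketch below is illustrative only: `run_pipeline` and the three placeholder stages are hypothetical names, not the project's actual code.

```python
from typing import Any, Callable

# A stage is any callable that takes the previous stage's output.
Stage = Callable[[Any], Any]


def run_pipeline(source: Any, stages: list[Stage]) -> Any:
    """Pass the source through each stage in order and return the final result."""
    result = source
    for stage in stages:
        result = stage(result)
    return result


# Placeholder stages (hypothetical), standing in for the real Parse/Chunk/Extract.
def parse(raw: str) -> str:
    return raw.strip()


def chunk(text: str) -> list[str]:
    return text.split("\n\n")


def extract(chunks: list[str]) -> dict:
    return {"n_chunks": len(chunks), "chunks": chunks}


stages = [parse, chunk, extract]
result = run_pipeline("  First paragraph.\n\nSecond paragraph.  ", stages)
print(result["n_chunks"])
```

Swapping a stage then means replacing one entry in the `stages` list, provided the replacement accepts and returns the same shapes as the stage it replaces.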
| Stage | What happens |
|---|---|
| Input | Detects URL vs. local file, downloads if needed, identifies MIME type |
| Parse | Routes to the right parser: PDF, DOCX, XLSX, CSV, web, or plain text |
| Chunk | Splits text into overlapping context-window-sized pieces; tables kept whole |
| Extract | Each chunk hits the LLM for structured data (entities, stats, claims); results are merged |
| Generate | Second LLM call turns the JSON extraction into a full .qmd with prose + charts |
| Render | Quarto compiles .qmd → HTML / PDF / DOCX |
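The Chunk stage can be sketched as a sliding window with overlap, with tables passed through unsplit. This is a minimal sketch under stated assumptions: it measures size in characters rather than tokens, and the function names, the `(kind, content)` block format, and the default sizes are all hypothetical rather than taken from the project.

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text.
        if start + size >= len(text):
            break
    return chunks


def chunk_blocks(
    blocks: list[tuple[str, str]], size: int = 1000, overlap: int = 200
) -> list[str]:
    """Chunk (kind, content) blocks; blocks tagged "table" are never split."""
    chunks = []
    for kind, content in blocks:
        if kind == "table":
            chunks.append(content)  # keep the whole table in one chunk
        else:
            chunks.extend(chunk_text(content, size, overlap))
    return chunks


pieces = chunk_text("abcdefghij", size=4, overlap=2)
print(pieces)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

A token-based splitter would follow the same pattern with a tokenizer in place of string slicing. Note that a table longer than `size` still comes out as a single oversized chunk here, so a real implementation would need a policy for that case.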
| Format | Parser | Notes |
|---|---|---|
| Web page | trafilatura | Any public URL |
| PDF | PyMuPDF + pdfplumber | Text and tables |
| Word | python-docx | .docx |
| Excel | pandas + openpyxl | .xlsx |
| CSV | pandas | |
| Plain text | built-in | |
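The routing in the table above could look roughly like the lookup below. It is a sketch, not the project's implementation: it routes by file extension instead of a detected MIME type, returns the parser name from the table rather than calling the library, and `PARSER_BY_SUFFIX`, `is_url`, and `pick_parser` are hypothetical names.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension -> parser name, mirroring the table (assumed mapping).
PARSER_BY_SUFFIX = {
    ".pdf": "PyMuPDF + pdfplumber",
    ".docx": "python-docx",
    ".xlsx": "pandas + openpyxl",
    ".csv": "pandas",
    ".txt": "built-in",
}


def is_url(source: str) -> bool:
    """Treat only http(s) sources with a host as URLs."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def pick_parser(source: str) -> str:
    """Return the parser name for a URL or local file path."""
    path = urlparse(source).path if is_url(source) else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in PARSER_BY_SUFFIX:
        return PARSER_BY_SUFFIX[suffix]
    if is_url(source):
        # Any other public URL is handled as a web page.
        return "trafilatura"
    raise ValueError(f"no parser registered for {source!r}")


print(pick_parser("market_data.xlsx"))
print(pick_parser("https://example.com/article"))
```

A URL that points directly at a document, such as one ending in `.pdf`, is routed to the matching document parser, and only other URLs fall back to the web-page parser.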
Add a row each time a new report is generated. Link the .qmd source and the rendered output.
| Report | Source | Format | Generated | QMD | Output | Status |
|---|---|---|---|---|---|---|
| Q1 Market Analysis | market_data.xlsx | HTML | Apr 7, 2026 | Link | Link | Done |
| Industry Trends | bloomberg.com | | Apr 5, 2026 | Link | Link | Done |
| Competitive Landscape | comp_data.pdf | DOCX | Mar 31, 2026 | Link | Link | Done |
| (next report) | | | | | | Pending |
Tip: Turn this table into a Notion database for filtering by format, date, or status.