I will walk through our text-to-SQL exploration and how it evolved as we worked through different challenges. With every iteration, our goal was to improve both how we measure accuracy and the accuracy itself.
Content + Flow
First Iteration: Functional text-to-SQL flow
- Deterministic agentic workflow with SQL tool use.
- Pass the question and the formatted schema in the prompt to generate exploratory queries.
- Pass the results of the exploratory queries, along with the question, to generate the final executable query.
- Will talk about the problems we hit and how we solved them (e.g., response parsing issues, invalid generated queries).
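The two-step flow above can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: it assumes a SQLite database, and `llm` is a hypothetical callable that takes a prompt string and returns the model's text. The prompt wording and the helper names (`format_schema`, `run_query`, `text_to_sql`) are also assumptions made for the example.

```python
import sqlite3
from typing import Callable, List, Tuple, Union


def format_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements as one prompt-ready string."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def run_query(conn: sqlite3.Connection, sql: str) -> Union[List[Tuple], str]:
    """Run one query; return up to 20 rows, or an error string instead of raising."""
    try:
        return conn.execute(sql).fetchmany(20)
    except sqlite3.Error as exc:
        return f"ERROR: {exc}"


def text_to_sql(
    conn: sqlite3.Connection, question: str, llm: Callable[[str], str]
) -> str:
    """Fixed two-step flow: exploratory queries first, then the final query."""
    schema = format_schema(conn)

    # Step 1: ask the model for exploratory queries (assumed format: one per line).
    explore_prompt = (
        f"Schema:\n{schema}\n\nQuestion: {question}\n\n"
        "Write exploratory SQL queries, one per line."
    )
    exploratory = [q.strip() for q in llm(explore_prompt).splitlines() if q.strip()]

    # Step 2: run them and feed the results (or errors) back with the question.
    observations = "\n".join(f"{q}\n-> {run_query(conn, q)}" for q in exploratory)
    final_prompt = (
        f"Schema:\n{schema}\n\nQuestion: {question}\n\n"
        f"Exploratory results:\n{observations}\n\n"
        "Write the final SQL query only."
    )
    return llm(final_prompt).strip()
```

Because the sequence of steps is fixed in code rather than chosen by the model, the flow is deterministic in structure even though the model output is not. Returning errors as strings lets a failed exploratory query become an observation for the second prompt instead of aborting the run.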
Second Iteration: Evaluating quality against a benchmark and improving accuracy
- Used the Bird-Bench text-to-SQL benchmark dataset to evaluate accuracy.
- Focused on improving accuracy against the benchmark.
- Tried techniques from papers on the Bird-Bench leaderboard.
- Techniques included M-schema, self-correction steps, and harder exploratory queries. Will discuss how much impact each approach had.
- Evaluated a reactive agentic workflow; will discuss its pros and cons.
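One common way to score accuracy on a text-to-SQL benchmark is execution accuracy: run the predicted and reference queries and check whether they return the same rows. The sketch below shows that idea under simplifying assumptions. It uses a single SQLite connection for all examples, compares results as unordered sets, and the function names are hypothetical; it is not the benchmark's official evaluation script.

```python
import sqlite3
from typing import Iterable, Optional, Set, Tuple


def execute_as_set(conn: sqlite3.Connection, sql: str) -> Optional[Set[Tuple]]:
    """Return the query's rows as a set, or None if the query fails to run."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(
    conn: sqlite3.Connection, pairs: Iterable[Tuple[str, str]]
) -> float:
    """Fraction of (predicted_sql, gold_sql) pairs whose result sets match."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = execute_as_set(conn, gold_sql)
        predicted_rows = execute_as_set(conn, predicted_sql)
        # An invalid predicted query (None) always counts as incorrect.
        if predicted_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return correct / len(pairs)
```

Comparing result sets rather than SQL strings means two differently written queries that return the same rows both count as correct. The trade-off is that set comparison ignores row order and duplicate rows, so a question that depends on ordering would need a stricter check.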
Third Iteration: Evaluating against a different database and user questions
- Used a publicly available IPL dataset to support user questions about the IPL.