The case study demonstrates how to build production-ready conversational analytics applications by integrating LangGraph's multi-agent framework with Waii's advanced text-to-SQL capabilities. The solution tackles complex database operations through sophisticated join handling, knowledge graph construction, and agentic flows, enabling natural language interactions with complex data structures while maintaining high accuracy and scalability.
This case study presents Waii’s approach to building conversational analytics applications by combining their text-to-SQL capabilities with LangGraph, a framework for building stateful multi-agent applications. The article, published in October 2024, serves both as a technical deep-dive into handling complex SQL joins and as a demonstration of how to integrate Waii’s API into production LLM applications. It’s worth noting that this content originates from Waii’s own blog, so the claims about accuracy and capabilities should be viewed with appropriate skepticism as they represent vendor marketing material rather than independent validation.
The core problem addressed is the difficulty of translating natural language queries into accurate SQL, particularly when dealing with complex database schemas that require sophisticated join operations. Many text-to-SQL solutions struggle with multi-table joins, and this case study claims to demonstrate Waii’s differentiated approach to solving this challenge.
The solution architecture combines two main components: Waii’s text-to-SQL engine and LangGraph’s orchestration framework. The LangGraph application is structured as a state machine with multiple nodes handling different aspects of the conversational analytics workflow.
The workflow consists of several key nodes:
Question Classifier: Uses an LLM (OpenAI’s ChatGPT) to determine whether a user’s question relates to database information, visualization needs, or general queries. This node retrieves database catalog information from Waii and uses it as context for classification.
SQL Generator: Calls Waii’s Query.generate API to translate natural language into SQL. This is where the complex join handling occurs.
SQL Executor: Executes the generated SQL against the database through Waii’s Query.run API, which also injects security constraints before execution.
Result Classifier: Another LLM-based classifier that determines whether results should be presented as data or visualizations.
Chart Generator: Uses Waii’s Chart.generate_chart API to create visualizations from query results.
Insight Generator: A fallback path using OpenAI directly for general questions not requiring database access.
Result Synthesizer: Combines all outputs into a coherent response for the user.
The state management is handled through a Pydantic BaseModel that tracks the database description, query, generated SQL, data results, chart specifications, insights, responses, errors, and path decisions across the workflow.
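The state-machine shape described above can be sketched without the LangGraph or Pydantic dependencies. The field names, node names, and routing logic below are illustrative stand-ins for the article's design, not Waii's actual schema:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Dependency-free sketch of the workflow state the article describes.
# The real implementation uses a Pydantic BaseModel and LangGraph's
# StateGraph; these fields mirror what the article says is tracked.
@dataclass
class WorkflowState:
    database_description: str = ""
    query: str = ""                  # the user's natural-language question
    sql: str = ""                    # SQL produced by the generator node
    data: Optional[list] = None      # rows returned by the executor node
    chart: str = ""                  # chart spec from the chart generator
    insight: str = ""                # fallback answer for general questions
    response: str = ""               # final synthesized answer
    error: Optional[str] = None
    path_decision: str = ""          # e.g. "database" or "general"

# Minimal runner mirroring LangGraph's conditional edges: each node
# mutates the state and returns the next node's name (None to stop).
def run_graph(nodes: Dict[str, Callable], entry: str,
              state: WorkflowState) -> WorkflowState:
    current: Optional[str] = entry
    while current is not None:
        current = nodes[current](state)
    return state

def classify(state: WorkflowState):
    # Stand-in for the LLM-based question classifier.
    state.path_decision = "database" if "revenue" in state.query else "general"
    return "generate_sql" if state.path_decision == "database" else "insight"

def generate_sql(state: WorkflowState):
    state.sql = "SELECT ..."         # would call Waii's Query.generate here
    return "synthesize"

def insight(state: WorkflowState):
    state.insight = "general answer"  # would call OpenAI directly here
    return "synthesize"

def synthesize(state: WorkflowState):
    state.response = state.sql or state.insight
    return None

nodes = {"classify": classify, "generate_sql": generate_sql,
         "insight": insight, "synthesize": synthesize}
final = run_graph(nodes, "classify",
                  WorkflowState(query="total revenue by region"))
```

The same shape extends naturally to the remaining nodes (executor, result classifier, chart generator) by adding entries to the `nodes` table.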
The article’s primary technical contribution is explaining Waii’s approach to handling complex database joins. The example query provided is notably sophisticated, spanning 14 tables with various join types and semantics. The example creates a director performance dashboard combining data from movies, TV series, genres, keywords, awards, and actor collaborations.
Among the join capabilities demonstrated, the generated SQL uses Common Table Expressions (CTEs) extensively, with named subqueries like director_movie_count, director_tv_count, combined_counts, ranked_directors, and various aggregation CTEs. This suggests the system generates production-quality SQL rather than simple single-table queries.
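A miniature, runnable approximation of that CTE structure is shown below. The CTE names mirror those reported from Waii's generated SQL (director_movie_count and so on), but the tables and data are invented for illustration:

```python
import sqlite3

# Toy schema loosely echoing the article's director-dashboard example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (director TEXT);
CREATE TABLE tv_series (director TEXT);
INSERT INTO movies VALUES ('Nolan'), ('Nolan'), ('Villeneuve');
INSERT INTO tv_series VALUES ('Nolan'), ('Villeneuve');
""")

# Chained CTEs: per-source counts, an outer-joined combination,
# then a window-function ranking -- the same layered shape the
# article attributes to Waii's generated SQL.
sql = """
WITH director_movie_count AS (
    SELECT director, COUNT(*) AS movie_count FROM movies GROUP BY director
),
director_tv_count AS (
    SELECT director, COUNT(*) AS tv_count FROM tv_series GROUP BY director
),
combined_counts AS (
    SELECT m.director,
           m.movie_count,
           COALESCE(t.tv_count, 0) AS tv_count
    FROM director_movie_count m
    LEFT JOIN director_tv_count t ON m.director = t.director
),
ranked_directors AS (
    SELECT director,
           movie_count + tv_count AS total,
           RANK() OVER (ORDER BY movie_count + tv_count DESC) AS rnk
    FROM combined_counts
)
SELECT director, total, rnk FROM ranked_directors ORDER BY rnk;
"""
rows = conn.execute(sql).fetchall()
```

Each CTE builds on the previous one, which is what makes the generated queries readable despite spanning many tables.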
A key differentiator claimed by Waii is their automatic construction of a knowledge graph representing database relationships, built from multiple sources including the schema itself, previously observed queries, and user feedback.
The knowledge graph is described as continuously updated and refined with schema changes, new queries, and user feedback. This represents a form of continuous learning that could improve text-to-SQL accuracy over time, though no quantitative evidence is provided in the article.
Waii employs a sequence of specialized “agentic flows” for query construction:
Table Selection: Analyzing the user’s request to identify relevant tables, using semantic understanding and knowledge of common join relationships to find tables that might not be directly mentioned in the user’s input.
Join Graph Analysis: Proposing and evaluating potential join paths between selected tables, scoring alignment with previously seen joins and semantic understanding of relationships.
Join Condition Evaluation/Refinement: A separate check ensuring outer joins and join conditions are correctly applied, including proper handling of “on” vs “where” clause conditions.
Query Construction: Building the SQL query based on the chosen join graph and conditions.
Compilation and Optimization: Ensuring syntactic correctness, performance optimization, and enforcement of operational constraints (e.g., max output rows, max input partitions).
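The on-vs-where distinction checked by the join-refinement step is easy to get wrong and worth a concrete look. With a LEFT JOIN, a filter on the right-hand table placed in the WHERE clause silently discards unmatched left rows, while the same filter in the ON clause preserves them. The tables and data below are invented for this sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directors (name TEXT);
CREATE TABLE awards (director TEXT, year INT);
INSERT INTO directors VALUES ('Nolan'), ('Gerwig');
INSERT INTO awards VALUES ('Nolan', 2024);
""")

# Filter in ON: the unmatched director survives with a NULL award,
# so the query remains a true outer join.
on_rows = conn.execute("""
    SELECT d.name, a.year FROM directors d
    LEFT JOIN awards a ON d.name = a.director AND a.year = 2024
""").fetchall()

# Filter in WHERE: NULLs from unmatched rows fail the predicate,
# so the outer join silently degrades to an inner join.
where_rows = conn.execute("""
    SELECT d.name, a.year FROM directors d
    LEFT JOIN awards a ON d.name = a.director
    WHERE a.year = 2024
""").fetchall()
```

A text-to-SQL system that places such predicates naively will drop rows without any error, which is why a dedicated evaluation/refinement pass over join conditions is a sensible design.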
The implementation includes several production-relevant features. Security constraints are enforced at the execution layer, with constraints injected into queries before execution to limit row- and column-level access under user-level policies. The workflow includes error handling with a loop that can rewrite input and regenerate required objects on exceptions. State is maintained across interactions to allow follow-up questions and iterative analysis.
The code sample provided uses environment variables for configuration (WAII_URL, WAII_API_KEY, DB_CONNECTION), following standard practices for secrets management. The application maintains a continuous loop for interactive use, with exception handling that restarts the workflow on errors.
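The configuration and retry loop described above can be sketched as follows. The environment-variable names are the ones the article lists; the `generate` and `run` callables are stand-ins for the Waii SDK calls (the real code would initialize the SDK and invoke Query.generate / Query.run), and the rephrase-on-failure strategy is a simplified stand-in for the article's rewrite-and-regenerate loop:

```python
import os

# Env-var names from the article; defaults here are placeholders.
WAII_URL = os.environ.get("WAII_URL", "")
WAII_API_KEY = os.environ.get("WAII_API_KEY", "")
DB_CONNECTION = os.environ.get("DB_CONNECTION", "")

def answer(question: str, generate, run, max_retries: int = 2) -> str:
    """Generate SQL, execute it, and retry on failure by rewriting
    the input -- a minimal version of the article's error loop."""
    attempt_question = question
    for _ in range(max_retries + 1):
        try:
            sql = generate(attempt_question)  # e.g. Waii's Query.generate
            return run(sql)                   # e.g. Waii's Query.run
        except Exception as exc:
            # Feed the failure back into the next generation attempt.
            attempt_question = f"{question} (previous attempt failed: {exc})"
    return "Sorry, I could not answer that question."
```

Injecting `generate` and `run` as callables keeps the retry logic testable without a live Waii endpoint.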
While the technical architecture presented is sound and the example query is genuinely impressive in its complexity, several caveats apply. The article provides no quantitative benchmarks comparing Waii’s accuracy to other text-to-SQL solutions. Claims about “high-accuracy joins” and “scalable table selection for large databases” are not substantiated with metrics. The knowledge graph construction process, while interesting conceptually, lacks detail on how predictions are made or how feedback is incorporated.
The example focuses on a single successful case rather than discussing failure modes, edge cases, or the types of queries where the system might struggle. The claim that users without specialized technical skills can perform complex data analysis should be viewed cautiously, as even correct SQL results require domain knowledge to interpret meaningfully.
The integration with LangGraph is relatively straightforward, primarily using Waii as an API service rather than demonstrating deep integration with LangGraph’s more advanced features like checkpointing, human-in-the-loop workflows, or parallel execution.
The article identifies several industry applications including business intelligence for executives, healthcare research database exploration, financial market analysis, e-commerce customer behavior analysis, and educational administration insights. These represent reasonable applications of conversational analytics, though actual implementations would require significant additional work beyond what’s shown in the code sample.
This case study demonstrates a practical approach to building production conversational analytics applications using LLMs and specialized text-to-SQL services. The combination of LangGraph for workflow orchestration and Waii for SQL generation represents a sensible architectural pattern. However, teams considering similar implementations should conduct their own evaluations of text-to-SQL accuracy, particularly for their specific database schemas and query patterns, rather than relying solely on vendor claims.
Snorkel developed a specialized benchmark dataset for evaluating AI agents in insurance underwriting, leveraging their expert network of Chartered Property and Casualty Underwriters (CPCUs). The benchmark simulates an AI copilot that assists junior underwriters by reasoning over proprietary knowledge, using multiple tools including databases and underwriting guidelines, and engaging in multi-turn conversations. The evaluation revealed significant performance variations across frontier models (single digits to ~80% accuracy), with notable error modes including tool use failures (36% of conversations) and hallucinations from pretrained domain knowledge, particularly from OpenAI models which hallucinated non-existent insurance products 15-45% of the time.
Cursor, a developer tool company, shares their journey of building what they call a "software factory" where AI agents handle increasingly autonomous software development tasks. The presentation outlines how they progressed through levels of autonomy from basic autocomplete to spawning hundreds of agents working asynchronously across their codebase. Their solution involves establishing guardrails through rules that emerge dynamically, creating verifiable systems with automated testing, and building skills and integrations that enable agents to work independently. Results include engineers managing fleets of agents rather than writing code directly, with some features being developed entirely by agents from feature flagging through testing to deployment, though significant work remains in observability, orchestration, and preventing agents from going off-track.
Notion, a knowledge work platform serving enterprise customers, spent multiple years (2022-2026) iterating through four to five complete rebuilds of their agent infrastructure before shipping Custom Agents to production. The core problem was enabling users to automate complex workflows across their workspaces while maintaining enterprise-grade reliability, security, and cost efficiency. Their solution involved building a sophisticated agent harness with progressive tool disclosure, SQL-like database abstractions, markdown-based interfaces optimized for LLM consumption, and a comprehensive evaluation framework. The result was a production system handling over 100 tools, serving majority-agent traffic for search, and enabling workflows like automated bug triaging, email processing, and meeting notes capture that fundamentally changed how their company and customers operate.