Company
14.ai
Title
Building Reliable AI Agent Systems with Effect TypeScript Framework
Industry
Tech
Year
2025
Summary (short)
14.ai, an AI-native customer support platform, uses Effect, a TypeScript framework, to manage the complexity of building reliable LLM-powered agent systems that interact directly with end users. The company built a comprehensive architecture using Effect across their entire stack to handle unreliable APIs, non-deterministic model outputs, and complex workflows through strong type guarantees, dependency injection, retry mechanisms, and structured error handling. Their approach enables reliable agent orchestration with fallback strategies between LLM providers, real-time streaming capabilities, and comprehensive testing through dependency injection, resulting in more predictable and resilient AI systems.
## Overview

14.ai is a company building an AI-native customer support platform where LLM-powered systems interact directly with end users. This case study, presented by Michael, the co-founder and CTO, describes how they use the Effect TypeScript library to manage the complexity and reliability challenges inherent in running LLMs in production. The core problem they address is the difficulty of building dependable systems when dealing with unreliable external APIs, non-deterministic model outputs, complex inter-system dependencies, and long-running workflows.

The presentation offers valuable insights into the practical engineering decisions required when deploying agentic AI systems at scale. While the presentation is partly promotional for the Effect library, it provides genuine technical depth about the architectural patterns and operational considerations for production LLM systems.

## Technical Architecture

The 14.ai platform uses Effect across their entire technology stack, demonstrating a commitment to consistency and type safety from frontend to backend. Their architecture consists of several key components working together.

The frontend is built with React and powers dashboards, agent interfaces, knowledge management tools, insights, analytics, and SDKs. For internal communication, they use an RPC server built on Effect RPC combined with a modified version of TanStack Query on the frontend. Their public API server uses Effect HTTP, with OpenAPI documentation autogenerated from annotated schemas, reducing documentation drift and maintenance overhead.

Their data processing engine synchronizes data from CRMs, documentation systems, and databases, processing it for real-time analytics and reporting. This is a common pattern in customer support AI, where context from multiple sources needs to be unified and made available to the AI agents.

For database storage, they use PostgreSQL for both traditional data and vector storage, with Effect's SQL module handling queries. This is an interesting architectural choice, as it consolidates storage into a single database technology rather than using specialized vector databases, which can simplify operational complexity at the potential cost of specialized performance optimizations.

Everything across the stack is modeled using Effect schemas, which provide runtime validation, encoding, decoding, and type-safe input/output handling. A notable benefit they highlight is the automatic generation of documentation from these schemas.

## Agent Architecture and Workflow DSL

The agent architecture at 14.ai follows a planner-executor pattern that is common in modern agentic systems. Agents take user input, formulate a plan, select appropriate actions or workflows, execute them, and repeat until task completion. The system distinguishes between three levels of abstraction:

**Actions** are small, focused units of execution similar to tool calls in other LLM frameworks. Examples include fetching payment information or searching through logs. These are the atomic building blocks of the system.

**Workflows** are deterministic multi-step processes that orchestrate multiple actions. The example given is a subscription cancellation workflow that might involve collecting a cancellation reason, offering retention options if applicable, checking eligibility, and finally performing the cancellation. The deterministic nature of workflows provides predictability for business-critical processes.

**Sub-agents** group related actions and workflows into larger domain-specific modules. Examples include a billing agent or a log retrieval agent. This modular approach allows for separation of concerns and potentially parallel development of different capability domains.

To manage this complexity, 14.ai built a custom domain-specific language (DSL) for workflows using Effect's functional pipe-based system. This DSL enables expressing branching logic, sequencing, retries, state transitions, and memory in a composable manner. Building a custom DSL on top of existing infrastructure like Effect is a sophisticated approach that suggests a mature engineering organization, though it also introduces potential maintenance burden and onboarding complexity for new engineers.
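The talk does not share 14.ai's actual DSL, but a minimal sketch of the underlying idea, a deterministic workflow that sequences hypothetical actions using Effect's generator syntax, might look like the following. All action names and logic here are illustrative assumptions, not 14.ai's code:

```typescript
import { Effect } from "effect"

// Hypothetical action stubs. In the real system these would call billing
// systems, CRMs, etc.; every name and behavior here is an illustrative
// assumption.
const collectCancellationReason = (customerId: string) =>
  Effect.succeed(`reason provided by ${customerId}`)

const checkRetentionEligibility = (_customerId: string) =>
  Effect.succeed(true) // stub: everyone is eligible for a retention offer

const offerRetentionDiscount = (customerId: string) =>
  Effect.log(`offered retention discount to ${customerId}`)

const cancelSubscription = (customerId: string) =>
  Effect.succeed({ customerId, status: "cancelled" as const })

// A deterministic workflow: sequence the actions, branch on intermediate
// results, and return a typed result the agent can report back.
const cancellationWorkflow = (customerId: string) =>
  Effect.gen(function* () {
    const reason = yield* collectCancellationReason(customerId)
    const eligible = yield* checkRetentionEligibility(customerId)
    if (eligible) {
      yield* offerRetentionDiscount(customerId)
    }
    const result = yield* cancelSubscription(customerId)
    return { ...result, reason }
  })

// Running the workflow, e.g. from a script or a test.
Effect.runPromise(cancellationWorkflow("cus_123")).then(console.log)
```

Because each action is just an Effect value, the same composition style extends naturally to retries, branching, and state, which is presumably what their workflow DSL layers on top of.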
## LLM Reliability and Fallback Strategies

Given that 14.ai's systems are described as "mission critical," reliability is paramount. One of the key strategies they employ is multi-provider fallback for LLM calls. When one LLM provider fails, the system automatically falls back to another provider with similar performance characteristics. The example given is GPT-4 Mini falling back to Gemini Flash 2.0 for tool calling.

This fallback mechanism is implemented using retry policies that track state to avoid retrying providers that have already failed. This stateful retry approach is more sophisticated than simple exponential backoff and demonstrates thoughtful handling of the multi-provider landscape in production LLM systems.

For streaming responses, which are common in customer-facing AI applications for improved perceived latency, they implement token stream duplication. One stream goes directly to the end user for real-time display, while a parallel stream is captured for storage and analytics purposes. Effect's streaming primitives reportedly make this pattern straightforward to implement.
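To make the fallback strategy concrete, here is a rough sketch of bounded retries followed by a provider fallback using Effect's retry and fallback combinators. The provider calls are stubs, and the policy details (backoff, retry counts) are assumptions rather than 14.ai's actual configuration:

```typescript
import { Effect, Schedule } from "effect"

// Stubbed provider calls. In practice these would wrap the providers' SDKs;
// here the primary always fails so the fallback path is exercised.
const callPrimaryModel = (_prompt: string): Effect.Effect<string, Error> =>
  Effect.fail(new Error("primary provider unavailable"))

const callFallbackModel = (prompt: string): Effect.Effect<string, Error> =>
  Effect.succeed(`fallback answer to: ${prompt}`)

// Retry the primary with exponential backoff, at most two retries, then
// fall back to the secondary provider if it still fails.
const retryPolicy = Schedule.exponential("200 millis").pipe(
  Schedule.intersect(Schedule.recurs(2))
)

const completion = (prompt: string) =>
  callPrimaryModel(prompt).pipe(
    Effect.retry(retryPolicy),
    Effect.orElse(() => callFallbackModel(prompt))
  )

Effect.runPromise(completion("How do I cancel my plan?")).then(console.log)
```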
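And for the token stream duplication pattern, the sketch below is a simplified approximation that uses `Stream.tap` as the side channel for capture while the main consumer emits tokens to the user; the real system likely uses richer streaming combinators, so treat this purely as an illustration:

```typescript
import { Effect, Stream } from "effect"

// Stubbed token stream standing in for a streaming LLM response.
const tokenStream = Stream.fromIterable(["Hi", ",", " how", " can", " I", " help?"])

const program = Effect.gen(function* () {
  const captured: string[] = []

  yield* tokenStream.pipe(
    // Side channel: capture each token for storage/analytics.
    Stream.tap((token) => Effect.sync(() => captured.push(token))),
    // Primary channel: emit each token to the end user as it arrives.
    Stream.runForEach((token) => Effect.sync(() => process.stdout.write(token)))
  )

  // The full transcript is now available for persistence.
  return captured.join("")
})

Effect.runPromise(program).then((text) => console.log("\nstored:", text))
```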
## Testing and Dependency Injection

Testing LLM-based systems presents unique challenges due to their non-deterministic nature and reliance on external services. 14.ai addresses this through heavy use of dependency injection to mock LLM providers and simulate failure scenarios.

Their approach involves services being provided at the entry point of systems, with dependencies present at the type level. This means the compiler guarantees at compile time that all required services are provided, catching configuration errors before runtime. Services are designed to be modular and composable, making it easy to override behavior or swap implementations for testing without affecting the internal logic of the system.

This DI approach enables testing of failure scenarios, alternative model behaviors, and edge cases without making actual API calls to LLM providers, which is both cost-effective and enables deterministic test execution.

## Observability

The presentation mentions that Effect provides "very easy observability via OpenTelemetry." While not elaborated upon in detail, integration with OpenTelemetry suggests they can capture distributed traces, metrics, and logs in a standardized format that integrates with common observability platforms. For production LLM systems, observability is critical for debugging issues, monitoring performance, and understanding system behavior.

## Developer Experience and Onboarding

The presentation highlights several aspects of developer experience with their Effect-based architecture. The schema-centric approach means input, output, and error types are defined upfront with built-in encoding and decoding. This provides strong type safety guarantees and automatic documentation.

An interesting point raised is that the framework helps engineers new to TypeScript become productive quickly by preventing common mistakes through the type system's guardrails. This suggests that the investment in type safety pays dividends in reduced debugging time and fewer production issues.

## Lessons Learned and Honest Assessment

The presentation includes a candid section on lessons learned that adds credibility to the overall case study:

**Happy path bias**: While Effect makes writing code for the happy path clean and explicit, this can create a false sense of safety. It's easy to accidentally catch errors upstream and silently lose important failures if not careful. This is an honest acknowledgment that sophisticated tooling doesn't eliminate the need for careful engineering.

**Dependency injection complexity at scale**: While DI is great in principle, tracing where services are provided across multiple layers or subsystems can become difficult to follow. This is a common challenge with DI-heavy architectures and worth considering for teams evaluating similar approaches.

**Learning curve**: Effect is described as a big ecosystem with many concepts and tools that can be overwhelming at first. The presenter notes that once past the initial learning curve, things "start to click" and benefits compound. This suggests that teams should budget for ramp-up time when adopting Effect.

**It's not magic**: The presenter explicitly states that Effect "helps us build systems that are predictable and resilient, but it's not magic. You still have to think." This tempers expectations and acknowledges that the library is a tool, not a solution unto itself.

## Recommendations for Adoption

The presentation concludes with practical advice for teams considering similar approaches. They recommend incremental adoption, starting with a single service or endpoint rather than going "all in on day one." Effect is positioned as especially useful for LLM and AI-based systems where reliability and coping with non-determinism are primary concerns.

The presenter also notes that while Effect brings functional programming rigor to TypeScript, you don't need to be a "functional programming purist" to derive value. This pragmatic positioning may make the approach more accessible to teams without deep FP backgrounds.

## Assessment

This case study provides valuable insights into building production LLM systems with a focus on reliability, type safety, and testability. The architectural patterns described (multi-provider fallbacks, stream duplication, workflow DSLs, and comprehensive dependency injection) represent mature approaches to the challenges of agentic AI systems. The honest discussion of limitations and learning curves adds credibility, though the presentation is ultimately promotional for the Effect library. Teams considering similar architectures should carefully evaluate whether the upfront investment in learning Effect and building custom infrastructure like a workflow DSL is justified by their scale and reliability requirements.
