Hatchet is a developer platform and orchestration engine for AI agents, durable workflows, background tasks, and parallel workloads, with SDKs across Python, TypeScript, Go, and Ruby. It runs on Hatchet Cloud or self-hosted, with the full operational surface that comes with a real orchestration product: priority queues, concurrency strategies, rate limits, alerts, and OTEL export. If your durability problem is shaped like a polyglot task queue that also handles agents, Hatchet is a credible answer.
Kitaru is narrower by design. We built it for Python agents because that’s the workload most teams we talk to are trying to ship right now. Today, it’s increasingly obvious that every team running agents in production ends up writing the same glue layer on top of a general-purpose orchestrator. A durable llm() call with token lineage. Artifact graphs linked to executions. Replay with input overrides. Tag-routed deployment snapshots. Kitaru ships those in the box, underneath whatever harness you already picked. Auth, observability, and governance stay where they are.
Use Kitaru if you are
- Running Python agents and want `kitaru.llm()`, `kitaru.wait()`, and artifact lineage as primitives, not glue code your platform team maintains forever
- Keeping the agent harness, auth, entitlements, and observability stack you already picked, and adding only the runtime layer underneath them
- Deploying across Kubernetes, AWS, GCP, or Azure (Vertex AI, SageMaker, AzureML) and want an opinionated stack abstraction where one config switches every flow's backend
- Replaying a failed run from a specific checkpoint with input overrides, without paying for the LLM calls above it again
Use Hatchet if you are
- Running a polyglot estate (Python, TypeScript, Go, Ruby) that needs one orchestration platform across all of it
- Building background tasks, durable workflows, parallel workloads, or queue replacement, not specifically Python agents
- Leaning on the flow-control surface (priority queues, concurrency strategies, static and dynamic rate limits) as a first-class feature
- Fine adopting Hatchet Cloud's hosted control plane with published Developer/Team/Scale/Enterprise tiers, or self-hosting the engine plus Postgres plus dashboard
Hatchet is the orchestration platform you adopt. Kitaru is the runtime layer you embed underneath the harness you already picked.
Runtime primitive vs orchestration platform
Defaults tell you what a tool is actually for. Hatchet’s defaults are tasks, workers, queues, durable workflows, durable tasks, events, and schedules. That’s a packaged orchestration platform across your stack. Kitaru’s defaults are @flow and @checkpoint on ordinary Python, with kitaru.llm(), kitaru.wait(), and kitaru.save() / kitaru.load() shipped underneath whatever harness your team already picked.
- Layered model: Kitaru owns the runtime layer and stays out of harness, auth, and platform. Hatchet packages an orchestration platform that includes the runtime, plus queues, workers, schedules, alerts, and dashboard. If you’ve already picked your harness, auth, and observability stack, only Kitaru fits underneath them without overlap.
- Adoption shape: `pip install kitaru`, add two decorators to ordinary Python, the flow runs in your process. Hatchet workers register with the engine, get triggered by tasks or events, and operate alongside an API server, Postgres, optional RabbitMQ, and a dashboard. Both work; the question is whether the runtime sits inside your app or whether your app gets re-shaped around the runtime.
- Operational footprint: A Kitaru server is one process plus your S3, GCS, or Azure Blob bucket and a metadata DB. A Hatchet self-host is API server, engine, Postgres, optional RabbitMQ, dashboard, and workers; the cloud control plane is hosted in US AWS. Both are reasonable. Kitaru is closer to a library, Hatchet is closer to a platform you operate.
LLM calls as first-class lineage
The LLM call is the unit of cost, latency, and failure in an agent. Hatchet has strong operational observability for tasks and workflows. There’s a dashboard, alerts, metrics, and OpenTelemetry export to Datadog or Grafana. What it doesn’t ship is an llm() primitive that resolves a model alias, injects the provider key from your configured secret backend, and logs prompt, response, latency, tokens, and resolved model against the enclosing checkpoint by default. That part is glue you write on top. Or simply use Kitaru for it.
- API surface: `kitaru.llm(prompt, model="fast")` is the primitive. Resolved aliases (so the same code maps to whichever provider is configured in the stack), automatic key injection, response captured. In Hatchet you write your own `call_llm()` inside a task and decide what to log.
- Per-call lineage: Prompt, response, token counts, latency, and resolved model land on the run record automatically and link to the enclosing checkpoint. Hatchet has OTEL spans across tasks; it doesn’t position a per-call LLM record linked to a checkpoint as a first-class concept in the docs we reviewed.
- Replay reads, not re-bills: On replay, `kitaru.llm()` reads the captured response from the checkpoint instead of hitting the provider again, unless the input changed. Hatchet’s durable replay applies to step return values broadly; whether the LLM call re-hits the provider depends on how you shaped the step body.
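To make "glue you write on top" concrete, here is a minimal sketch of what such a wrapper tends to involve: alias resolution, a per-call record tied to a checkpoint name, and a replay path that reads the captured response instead of calling the provider again. Every name in it (`MODEL_ALIASES`, `LLMCallRecord`, `RunLog`, `fake_provider`, and this `call_llm` signature) is hypothetical and chosen for illustration. It is not Kitaru's implementation or part of Hatchet's SDK, and it leaves out key injection and persistence.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical alias table: logical alias -> concrete model name.
MODEL_ALIASES = {"fast": "provider-small-v1"}


@dataclass
class LLMCallRecord:
    """One captured LLM call, linked to the checkpoint that made it."""
    checkpoint: str
    prompt: str
    response: str
    resolved_model: str
    latency_s: float
    total_tokens: int


@dataclass
class RunLog:
    """In-memory stand-in for a durable run record."""
    records: list[LLMCallRecord] = field(default_factory=list)


def fake_provider(model: str, prompt: str) -> tuple[str, int]:
    """Stand-in for a real provider client; returns (text, token_count)."""
    text = f"[{model}] reply to: {prompt}"
    return text, len(prompt.split()) + len(text.split())


def call_llm(
    prompt: str,
    model: str,
    checkpoint: str,
    log: RunLog,
    provider: Callable[[str, str], tuple[str, int]] = fake_provider,
) -> str:
    # Resolve the alias; unknown names pass through unchanged.
    resolved = MODEL_ALIASES.get(model, model)

    # Replay path: same checkpoint, prompt, and model means reuse the capture.
    for rec in log.records:
        if (rec.checkpoint, rec.prompt, rec.resolved_model) == (checkpoint, prompt, resolved):
            return rec.response

    # First execution: call the provider and record the call.
    start = time.perf_counter()
    text, tokens = provider(resolved, prompt)
    latency = time.perf_counter() - start
    log.records.append(
        LLMCallRecord(checkpoint, prompt, text, resolved, latency, tokens)
    )
    return text


log = RunLog()
first = call_llm("Research: durable agents", model="fast", checkpoint="research", log=log)
again = call_llm("Research: durable agents", model="fast", checkpoint="research", log=log)
# The second call is served from the log, so only one record exists.
assert first == again and len(log.records) == 1
```

In a real system the log would live in durable storage and the cache key would need to cover anything else that changes the response (temperature, system prompt, tool definitions). That bookkeeping is the part that tends to grow over time.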
Replay with checkpoint-output overrides
Both products replay from durable state. The harder question is whether you can change a checkpoint’s output and have downstream consumers re-execute against the new value. Kitaru exposes this as a first-class primitive: pin an override against checkpoint.research and the dependents replay against the override, everything else stays cached. Hatchet describes replay from the event log and retry, cancel, or replay from the dashboard, but not an equivalent input-override model where you swap a single step’s output and re-execute downstream against the change.
- Mechanism: Kitaru exposes checkpoint selectors and override keys (`checkpoint.research`, `checkpoint.draft`, …) so you can replay a flow with a swapped value at any checkpoint and re-execute downstream consumers. Hatchet’s replay is operationally-scoped (replay this run, retry this step) rather than parameterized over a specific checkpoint’s output.
- Use case: The agent wrote a bad brief at step 1, but step 7 took 4 minutes and 10k tokens to get there. With Kitaru you swap the brief and the flow re-executes against the new value while upstream artifacts stay cached where the inputs match. With Hatchet you’d reset to the start of the run or roll your own override layer inside your step bodies.
- What gets cached: Kitaru caches typed artifacts per `@checkpoint`, indexed by the checkpoint’s name. Hatchet caches step return values in the durable event log, keyed by the step’s invocation in the run. The difference shows up most when you want to inject a hypothetical at a specific step and rerun downstream.
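The override idea can be illustrated with a toy model. The sketch below is an assumption-laden illustration of name-indexed caching with a pinned override, not Kitaru's or Hatchet's internals. `ToyRun`, its `checkpoint` method, and the three-step `flow` are all hypothetical names invented for this example.

```python
from typing import Any, Callable, Optional


class ToyRun:
    """Caches checkpoint outputs by name; purely illustrative."""

    def __init__(
        self,
        cache: Optional[dict] = None,
        overrides: Optional[dict] = None,
    ) -> None:
        self.cache: dict = dict(cache or {})          # name -> (inputs, value)
        self.overrides: dict = dict(overrides or {})  # name -> pinned value
        self.executed: list[str] = []                 # checkpoints that really ran

    def checkpoint(self, name: str, fn: Callable[..., Any], *inputs: Any) -> Any:
        # A pinned override wins: the step body is skipped entirely.
        if name in self.overrides:
            value = self.overrides[name]
            self.cache[name] = (inputs, value)
            return value
        # Cache hit only when the recorded inputs match the current ones.
        cached = self.cache.get(name)
        if cached is not None and cached[0] == inputs:
            return cached[1]
        # Otherwise execute the step and record its output.
        value = fn(*inputs)
        self.executed.append(name)
        self.cache[name] = (inputs, value)
        return value


def flow(run: ToyRun, topic: str) -> str:
    outline = run.checkpoint("plan", lambda t: f"outline for {t}", topic)
    brief = run.checkpoint("research", lambda o: f"brief from {o}", outline)
    return run.checkpoint("draft", lambda b: f"draft using {b}", brief)


# First run executes every checkpoint.
first = ToyRun()
flow(first, "durable agents")
assert first.executed == ["plan", "research", "draft"]

# Replay with the research output swapped: only the downstream step re-runs.
replay = ToyRun(cache=first.cache, overrides={"research": "a corrected brief"})
result = flow(replay, "durable agents")
assert replay.executed == ["draft"]
assert result == "draft using a corrected brief"
```

Here "plan" stays cached because its input did not change, "research" is replaced by the pinned value, and "draft" re-executes because its input is now different. A cache keyed by invocation order rather than by name would need extra bookkeeping to express the same swap.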
What makes Kitaru unique
| Feature | Kitaru | Hatchet |
|---|---|---|
| Durable execution with checkpoint replay | Yes | Yes |
| Human/agent-in-the-loop waiting with compute released | Yes | Yes |
| Permissively-licensed self-hosting (Kitaru: Apache 2.0; Hatchet: MIT) | Yes | Yes |
| Python-agent-shaped primitives (`kitaru.llm()`, `kitaru.wait()`, `kitaru.save()`/`kitaru.load()`) | Yes | Not supported |
| Built-in LLM primitive with alias-resolved secrets and per-call token/latency logging | Yes | Not supported |
| Typed, versioned artifact lineage per checkpoint with cross-run diff | Yes | Not supported |
| Replay with checkpoint-output overrides (swap a step's output, re-execute downstream) | Yes | Not supported |
| Versioned, tag-routed deployment snapshots (default/canary/stable) | Yes | Not supported |
| Polyglot SDKs (Python, TypeScript, Go, Ruby) | Not supported | Yes |
| Documented concurrency strategies and static/dynamic rate limits | Not supported | Yes |
| Managed cloud with published tiers (Developer/Team/Scale/Enterprise) | Not supported | Yes |
How the two surfaces map
| Concept | Hatchet | Kitaru |
|---|---|---|
| Workflow boundary | hatchet.workflow(name="...") registered with the engine, run by workers | @flow on ordinary Python, called as flow.run(...) |
| Durable step | @review_flow.task() with optional parents=[...] for DAG dependencies | @checkpoint persists a typed, versioned artifact in your own bucket |
| Pause and resume | @hatchet.durable_task plus await ctx.aio_wait_for_event("...") to pause on an external event | kitaru.wait(name="...", schema=...) releases compute, resumes from any input source |
| LLM call | Your own call_llm() inside a task; provider key, prompt and token logging are glue you write | kitaru.llm() resolves the model alias, injects the provider key, logs prompt, tokens, and latency per call |
| Cross-run state | Bring your own store (Postgres, Redis, KV) | kitaru.save(name, value) / kitaru.load(exec_id, name): typed artifacts in your own bucket, queryable across runs |
| Invocation | review_flow.run(ReviewInput(topic="...")) or hatchet.event.push("...", {...}) to push an event | flow.run(...) for source/local execution; saved deployments via kitaru invoke FLOW ..., KitaruClient().deployments.invoke(flow="...", inputs={...}), or flow.invoke(...) |
| Deployment and versioning | Versioned through worker code; switch versions by deploying a new worker image | Immutable flow.deploy() snapshots, tag-routed (default, canary, stable) |
| Self-hosting | API server, engine, Postgres, optional RabbitMQ, dashboard, and workers | Single-service Kitaru server plus your S3, GCS, or Azure Blob and metadata DB |
Code comparison
```python
import kitaru
from kitaru import checkpoint, flow


@checkpoint
def research(topic: str) -> str:
    return kitaru.llm(
        prompt=f"Research: {topic}. Return a brief.",
        model="fast",
    )


@checkpoint
def draft(brief: str) -> str:
    return kitaru.llm(
        prompt=f"Write a draft from this brief:\n{brief}",
        model="fast",
    )


@flow
def review_flow(topic: str) -> str:
    brief = research(topic)
    text = draft(brief)
    approved = kitaru.wait(
        name="approve_draft",
        question="Approve draft?",
        schema=bool,
    )
    return text if approved else "Rejected"


review_flow.run(topic="Durable agents").wait()
```

```python
from hatchet_sdk import Hatchet, Context, DurableContext
from pydantic import BaseModel

hatchet = Hatchet()


class ReviewInput(BaseModel):
    topic: str


class TextOutput(BaseModel):
    text: str


async def call_llm(prompt: str) -> str:
    ...


review_flow = hatchet.workflow(name="ReviewFlow")


@review_flow.task()
async def research(input: ReviewInput, ctx: Context) -> TextOutput:
    return TextOutput(text=await call_llm(f"Research: {input.topic}"))


@review_flow.task(parents=[research])
async def draft(input: ReviewInput, ctx: Context) -> TextOutput:
    brief = ctx.task_output(research).text
    return TextOutput(text=await call_llm(f"Draft: {brief}"))


@review_flow.durable_task(parents=[draft])
async def approve(input: ReviewInput, ctx: DurableContext) -> TextOutput:
    text = ctx.task_output(draft).text
    decision = await ctx.aio_wait_for_event("review:approve")
    return TextOutput(text=text if decision.data.get("ok") else "Rejected")


def main() -> None:
    worker = hatchet.worker("review-worker", workflows=[review_flow])
    worker.start()


# Trigger: review_flow.run(ReviewInput(topic="Durable agents"))
# Approve: hatchet.event.push("review:approve", {"ok": True})
# Workers register with the Hatchet engine; events route through it.
```

A runtime, not a platform
If you want the full operational platform around durable workflows (hosted cloud, queues, rate limits, multi-language SDKs, alerts and dashboards), Hatchet is a strong pick, and we’d tell any team that.
For Python agent work specifically (a durable llm() call, artifact lineage linked to executions, replay with input overrides, versioned tag-routed deploys), the glue you’d write on top of a general-purpose orchestration platform is what Kitaru ships for you.
We’ve spent five years building the MLOps-ready version of this problem space at ZenML. JetBrains runs their AI globally on it; Adeo runs across all their brands and geographies on it. Kitaru is that team two years into the agent version. Bet on us for agent infrastructure and you’re betting on the group that’s been doing this the whole time.
pip install kitaru