From Idea to Deployed AI Agent: My Full Stack in Allen, TX
18 min read | By Carlos Aragon

Here is every tool in my AI agent stack — OpenClaw, OpenMOSS, n8n, Claude API, Retell, Supabase, and Vercel — how they connect, and exactly how I go from a new client problem to a running agent in production.

From Idea to Deployed: What I Built and Why

I build AI agents for a living out of Allen, TX. Not demos. Not proof-of-concepts that live in a Notion doc. Production agents that answer phones, qualify leads, trigger deployments, pull attribution data, and write client briefs — all without me in the loop.

When I started two years ago, my stack was a mess of duct tape: one n8n instance, the OpenAI API, a random Postgres database, and a lot of manual handoffs. Every time a client wanted something new, I rebuilt from scratch. I had no reusable infrastructure, no shared memory layer, and no way to coordinate agents with each other.

Today my stack is intentional. Every tool has a job. Every layer talks to the next. When a new client comes in and says "I need an AI to qualify my roofing leads by phone," I can have a Retell voice agent live, connected to Supabase, feeding into an n8n workflow, coordinated by OpenMOSS, and deployed on Vercel — in under a day.

This post is the complete picture. I am going to walk you through every tool, show you the actual code I use, and explain how it all connects. If you are building AI agents for clients or for your own business, this is the stack I would build again from scratch.

The Stack at a Glance

Before I go deep on each tool, here is how they fit together at a high level:

OpenClaw: AI nerve center (agent runtime, heartbeats, tool registry)
OpenMOSS: Multi-agent orchestration (task queue, sub-tasks, coordination)
n8n: Workflow automation (triggers, integrations, glue logic)
Claude API: The brain (reasoning, writing, analysis, tool use)
Retell AI: Voice agents (inbound/outbound calls, lead qualification)
Supabase: Data layer (leads, agent memory, automation results)
Vercel: Deployment (Next.js frontend, edge functions, env management)

The data flow is roughly: a client interaction (call, form, webhook) hits Retell or n8n → gets stored in Supabase → triggers an OpenMOSS task → OpenClaw assigns an agent → Claude reasons and acts → results flow back to Supabase and get surfaced in the Next.js dashboard on Vercel. Each layer is independently replaceable. Claude can swap to GPT-4o. n8n can swap to Make. But this combination is what I have found works best at production scale.

OpenClaw: The AI Nerve Center

OpenClaw is the runtime backbone of my agent stack. Think of it as the operating system for agents — it tracks which agents are alive, what tools they have registered, and what state they are in. Before I had OpenClaw, I would start an agent session and have no idea if it was actually running until I pinged it manually.

The feature I use most is the heartbeat system. Each agent registers itself with OpenClaw on startup and sends a heartbeat every 30 seconds. If an agent misses three consecutive heartbeats, OpenClaw marks it as degraded and fires a webhook I have routed to a Slack alert. Here is the registration payload an agent sends on startup:

agent-register.ts
// Agent startup registration with OpenClaw
async function registerWithOpenClaw(agentId: string, tools: string[]) {
  const payload = {
    agentId,
    name: "BellaBot",
    version: "2.4.1",
    tools,
    heartbeatInterval: 30_000,
    metadata: {
      owner: "carlos@vixi.agency",
      environment: process.env.NODE_ENV,
      region: "us-central",
    },
  };

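  const response = await fetch(`${process.env.OPENCLAW_URL}/agents/register`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-openclaw-key": process.env.OPENCLAW_API_KEY!,
    },
    body: JSON.stringify(payload),
  });

  // Fail loudly so an unregistered agent never runs unmonitored
  if (!response.ok) {
    throw new Error(`OpenClaw registration failed: ${response.status}`);
  }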
}

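// Heartbeat loop: fires every 30s to signal the agent is alive.
// agentId and the queue depth are passed in instead of read from outer scope.
function startHeartbeat(agentId: string, getQueueDepth: () => number) {
  return setInterval(async () => {
    try {
      await fetch(`${process.env.OPENCLAW_URL}/agents/${agentId}/heartbeat`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-openclaw-key": process.env.OPENCLAW_API_KEY!,
        },
        body: JSON.stringify({ status: "healthy", queueDepth: getQueueDepth() }),
      });
    } catch (err) {
      // A failed beat must not crash the agent; the missed beat is the signal
      console.error("Heartbeat failed:", err);
    }
  }, 30_000);
}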
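
To make the three-missed-heartbeats rule concrete, here is a minimal sketch of the check from the monitoring side. It is an illustration only, not OpenClaw's actual implementation: `AgentHealth`, `missedHeartbeats`, `classifyAgent`, and the `maxMissed` parameter are hypothetical names, and the sketch assumes the monitor stores the timestamp of each agent's last beat.

```typescript
// Hypothetical monitor-side check (not OpenClaw's real code).
type AgentHealth = "healthy" | "degraded";

// Count the full heartbeat intervals that have elapsed since the last beat.
function missedHeartbeats(lastSeenMs: number, nowMs: number, intervalMs: number): number {
  return Math.max(0, Math.floor((nowMs - lastSeenMs) / intervalMs));
}

// An agent is degraded once `maxMissed` full intervals pass with no beat.
function classifyAgent(
  lastSeenMs: number,
  nowMs: number,
  intervalMs = 30_000,
  maxMissed = 3
): AgentHealth {
  return missedHeartbeats(lastSeenMs, nowMs, intervalMs) >= maxMissed
    ? "degraded"
    : "healthy";
}
```

With the 30-second interval used above, an agent that goes silent is flagged 90 seconds after its last beat, which is when a webhook like my Slack alert would fire.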

OpenClaw also maintains a tool registry. When I add a new capability to an agent — say, a new MCP tool that reads Google Sheets — I register it with OpenClaw so other agents and orchestrators know it is available. OpenMOSS queries this registry when it needs to route a task to an agent that has the right tools.
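
The routing idea can be sketched as a simple capability match. This is a simplified illustration under my own assumptions, not the real OpenClaw or OpenMOSS API: `RegisteredAgent` and `findCapableAgents` are hypothetical names, and the registry is modeled as a plain in-memory array.

```typescript
// Hypothetical registry entry; the real registry schema may differ.
interface RegisteredAgent {
  agentId: string;
  tools: string[];
  status: "healthy" | "degraded";
}

// Return the healthy agents whose registered tools cover every required tool.
function findCapableAgents(
  registry: RegisteredAgent[],
  requiredTools: string[]
): RegisteredAgent[] {
  return registry.filter(
    (agent) =>
      agent.status === "healthy" &&
      requiredTools.every((tool) => agent.tools.includes(tool))
  );
}
```

Filtering on health as well as tools keeps tasks away from an agent that has the right capabilities but has stopped sending heartbeats.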

The other thing OpenClaw handles is graceful shutdown. When I redeploy an agent, OpenClaw drains its queue before killing the process. No tasks dropped, no half-executed workflows. That alone has saved me from client complaints more than once.

OpenMOSS: Multi-Agent Task Orchestration

If OpenClaw is the OS, OpenMOSS is the scheduler. It runs a persistent task queue where work items can be assigned to specific agents, broken into sub-tasks, and tracked through completion. I wrote about OpenMOSS in depth in a previous post, but here is how it fits into the full stack.

When a new lead comes in via Retell, my n8n workflow creates an OpenMOSS task with type lead.qualify. OpenMOSS looks at its executor registry — queried from OpenClaw — finds the agent with the crm.write and lead.score tools, and dispatches the task. The executor agent runs, calls Claude to score the lead, writes the result to Supabase, and marks the task complete. If scoring fails, OpenMOSS retries with exponential backoff. If it fails three times, it escalates to a reviewer agent.

create-task.ts
// Create an OpenMOSS task from an n8n webhook handler
export async function createLeadQualificationTask(lead: Lead) {
  const task = {
    type: "lead.qualify",
    priority: lead.estimatedValue > 10_000 ? "high" : "normal",
    payload: {
      leadId: lead.id,
      name: lead.name,
      phone: lead.phone,
      source: lead.source,
      transcriptUrl: lead.transcriptUrl,
    },
    requiredTools: ["crm.write", "lead.score", "supabase.insert"],
    retryPolicy: {
      maxAttempts: 3,
      backoff: "exponential",
      initialDelay: 2000,
    },
    reviewerAgent: "orus",
    timeoutMs: 120_000,
  };

  const response = await fetch(`${process.env.OPENMOSS_URL}/tasks`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-openmoss-key": process.env.OPENMOSS_API_KEY!,
    },
    body: JSON.stringify(task),
  });

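  // Surface HTTP failures instead of destructuring an error body
  if (!response.ok) {
    throw new Error(`OpenMOSS task creation failed: ${response.status}`);
  }

  const { taskId } = (await response.json()) as { taskId: string };
  return taskId;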
}
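
The retry policy in that payload (three attempts, exponential backoff, 2000 ms initial delay, then escalation to a reviewer) can be sketched as follows. This is my own minimal illustration of the behavior, not OpenMOSS internals: `delayAfterAttempt`, `runWithRetry`, and the injected `sleep` function are hypothetical, and I am assuming the delay doubles after each failed attempt.

```typescript
interface RetryPolicy {
  maxAttempts: number;
  backoff: "exponential" | "fixed";
  initialDelay: number; // milliseconds
}

// Delay to wait after a failed attempt (1-based) before the next attempt.
function delayAfterAttempt(policy: RetryPolicy, attempt: number): number {
  return policy.backoff === "exponential"
    ? policy.initialDelay * 2 ** (attempt - 1)
    : policy.initialDelay;
}

// Run the executor up to maxAttempts times, then hand off to the reviewer.
async function runWithRetry<T>(
  execute: () => Promise<T>,
  escalate: (lastError: unknown) => Promise<T>,
  policy: RetryPolicy,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= policy.maxAttempts; attempt++) {
    try {
      return await execute();
    } catch (err) {
      lastError = err;
      // Skip the wait after the final attempt and go straight to escalation.
      if (attempt < policy.maxAttempts) {
        await sleep(delayAfterAttempt(policy, attempt));
      }
    }
  }
  return escalate(lastError);
}
```

Under these assumptions a task that keeps failing waits 2 seconds, then 4 seconds, and reaches the reviewer after the third failure. Injecting `sleep` keeps the function testable without real timers.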

The executor/reviewer pattern is the most powerful part. The executor agent is optimized for speed — it runs fast, calls tools directly, and does not second-guess itself. The reviewer agent (usually Orus) is slower but smarter — it checks the executor's output for quality, flags edge cases, and can override decisions before they hit production. I only pay the reviewer's token cost when something actually needs review.

n8n + Claude: Workflows That Think

n8n is my glue. It connects webhooks to Supabase, Supabase changes to OpenMOSS, and external APIs to everything else. But where n8n really earns its place in my stack is when I combine it with the Claude API — that is when workflows stop just moving data and start making decisions.

In n8n, calling Claude is an HTTP Request node pointed at https://api.anthropic.com/v1/messages. I set the method to POST, add my x-api-key and anthropic-version headers, and send a messages array. Here is the JSON body I use in the node:

n8n-claude-node.json
// n8n HTTP Request node — body (JSON expression mode)
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "system": "You are a lead qualification specialist for a roofing company in Allen, TX. Score leads 1-10 based on urgency, budget signals, and job type. Return JSON only.",
  "messages": [
    {
      "role": "user",
      "content": "Lead transcript:\n{{ $json.transcript }}\n\nLead data:\nName: {{ $json.name }}\nPhone: {{ $json.phone }}\nSource: {{ $json.source }}\n\nReturn: { score: number, tier: 'hot'|'warm'|'cold', reasoning: string, nextAction: string }"
    }
  ]
}

// n8n HTTP Request node — headers
{
  "x-api-key": "={{ $env.ANTHROPIC_API_KEY }}",
  "anthropic-version": "2023-06-01",
  "content-type": "application/json"
}

The response comes back as { content: [{ type: 'text', text: '{...}' }] }. I add a Set node after to parse {{ JSON.parse($json.content[0].text) }} and extract the score. Then a Switch node routes hot leads to an immediate SMS trigger and cold leads to a 3-day drip sequence.
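
A bare `JSON.parse` works until the model wraps its answer in a Markdown fence or returns a field out of range. Here is a minimal sketch of a more defensive parser for the response shape described above. `LeadScore` and `parseLeadScore` are hypothetical names of mine, and the validation rules simply mirror the prompt (score 1 to 10, tier hot/warm/cold). In n8n this logic would live in a Code node as plain JavaScript; I show it in TypeScript to match the other examples in this post.

```typescript
interface LeadScore {
  score: number;
  tier: "hot" | "warm" | "cold";
  reasoning: string;
  nextAction: string;
}

// Extract and validate the lead score from a Messages API style response body.
function parseLeadScore(response: {
  content: Array<{ type: string; text?: string }>;
}): LeadScore {
  const block = response.content.find(
    (b) => b.type === "text" && typeof b.text === "string"
  );
  if (!block?.text) throw new Error("No text block in response");

  // Strip a surrounding Markdown code fence if the model added one anyway.
  const fence = "`".repeat(3);
  const raw = block.text
    .trim()
    .replace(new RegExp("^" + fence + "(?:json)?\\s*", "i"), "")
    .replace(new RegExp("\\s*" + fence + "$"), "");

  const parsed = JSON.parse(raw) as Partial<LeadScore>;
  const tiers = ["hot", "warm", "cold"];
  if (
    typeof parsed.score !== "number" ||
    parsed.score < 1 ||
    parsed.score > 10 ||
    typeof parsed.tier !== "string" ||
    !tiers.includes(parsed.tier) ||
    typeof parsed.reasoning !== "string" ||
    typeof parsed.nextAction !== "string"
  ) {
    throw new Error("Lead score JSON failed validation");
  }
  return parsed as LeadScore;
}
```

Because the function throws on anything unexpected, the error can be routed to a fallback branch that re-runs the call with a stricter prompt instead of passing bad data to the Switch node.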

One pattern I use constantly is the "think before you act" chain. The first Claude call reasons through the situation and outputs a plan. The second call executes the plan. In my workflows, splitting reasoning and execution cuts errors noticeably, because each call has a single job and the second call works from a fixed plan instead of planning and acting in the same response.
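
The two-call chain can be sketched like this. It is a minimal illustration, not my exact production prompts: `CallModel` and `planThenAct` are hypothetical names, and the model call is injected as a function so the example does not depend on any particular SDK or HTTP client.

```typescript
// Any function that sends a system prompt and a user message to a model
// and resolves with the text of the reply.
type CallModel = (system: string, user: string) => Promise<string>;

async function planThenAct(
  callModel: CallModel,
  situation: string
): Promise<{ plan: string; result: string }> {
  // Call 1: reasoning only. The model is told not to produce the final output.
  const plan = await callModel(
    "Think through the situation and output a short numbered plan. Do not execute it.",
    situation
  );

  // Call 2: execution only. The plan from call 1 is passed in as fixed input.
  const result = await callModel(
    "Execute the plan exactly as written. Return JSON only.",
    `Situation:\n${situation}\n\nPlan:\n${plan}`
  );

  return { plan, result };
}
```

Returning the plan alongside the result also gives you something to log, which helps when a decision needs explaining later.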

Retell AI: Voice Agents for Client Calls

Retell is one of the best investments I have made for client-facing work. It gives you a real-time voice AI that can make and receive calls, handle interruptions naturally, and send structured data back via webhook when the call ends. For roofing, real estate, and home services clients, it replaces a receptionist for first-touch qualification.

Here is how I set up a Retell agent for a roofing client in Allen. The agent is configured with a system prompt focused on storm damage qualification. When someone calls, Retell handles the real-time conversation. When the call ends, it fires a webhook to my n8n instance with the full transcript, call duration, and any dynamic variables the agent collected.

retell-agent-config.ts
// Retell agent creation via API
const agentConfig = {
  agent_name: "RoofBot — Allen TX",
  voice_id: "eleven_turbo_v2",
  language: "en-US",
  response_engine: {
    type: "retell-llm",
    llm_id: process.env.RETELL_LLM_ID, // custom LLM with our system prompt
  },
  webhook_url: "https://n8n.vixi.agency/webhook/retell-call-ended",
  end_call_after_silence_ms: 8000,
  max_call_duration_ms: 600_000,
  begin_message:
    "Hi, thanks for calling Texas Roofing Pros. I'm the AI assistant — I can get your inspection scheduled right now. Did you have storm damage you're calling about, or something else?",
};

// Dynamic variables the agent collects during the call
const dynamicVars = {
  caller_name: { type: "string", description: "Caller's full name" },
  address: { type: "string", description: "Property address" },
  damage_type: {
    type: "enum",
    values: ["hail", "wind", "fallen_tree", "leak", "other"],
    description: "Type of damage reported",
  },
  insurance_confirmed: {
    type: "boolean",
    description: "Whether caller has homeowner's insurance",
  },
  urgency: {
    type: "enum",
    values: ["emergency", "this_week", "flexible"],
    description: "How urgent the repair is",
  },
};

When the webhook fires into n8n, I have these collected variables available immediately. I pass the transcript to Claude for scoring (via the workflow above), write the lead to Supabase, and trigger an OpenMOSS task to notify the sales team via SMS with the lead summary. The whole pipeline, from the end of the call to the sales rep's SMS, runs in roughly 6 to 9 seconds.

The same pattern works for real estate. I have an agent that qualifies motivated seller leads: it asks about timeline, situation, and property condition, then routes hot leads directly to a CRM and books a callback. Clients love it because they stop missing calls at 7pm on a Sunday.

Supabase: The Data Layer

Every agent in my stack reads and writes to Supabase. It is the shared memory layer — leads go in, automation results come out, agent context persists across sessions. Supabase gives me Postgres with a proper schema, real-time subscriptions, and a JavaScript client that works in edge functions without extra configuration.

I have three main tables for agent work: leads, agent_memory, and automation_results. Here is how I write a lead after a Retell call:

supabase-lead.ts
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

interface Lead {
  id?: string;
  client_id: string;
  source: "retell" | "form" | "manual";
  name: string;
  phone: string;
  address?: string;
  transcript?: string;
  score: number;
  tier: "hot" | "warm" | "cold";
  damage_type?: string;
  insurance_confirmed?: boolean;
  call_duration_seconds?: number;
  claude_reasoning?: string;
  next_action?: string;
}

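// Payload shapes as used in this post, trimmed to the fields read below
interface RetellWebhookPayload {
  from_number: string;
  transcript: string;
  duration_ms: number;
  dynamic_variables: {
    caller_name: string;
    address?: string;
    damage_type?: string;
    insurance_confirmed?: boolean;
  };
}

interface ClaudeLeadScore {
  score: number;
  tier: "hot" | "warm" | "cold";
  reasoning: string;
  nextAction: string;
}

// Plain insert (not an upsert): every call creates a new lead row
async function insertLeadFromRetell(
  clientId: string,
  retellPayload: RetellWebhookPayload,
  claudeScore: ClaudeLeadScore
): Promise<Lead> {
  const { data, error } = await supabase
    .from("leads")
    .insert({
      client_id: clientId, // required by the Lead interface above
      source: "retell",
      name: retellPayload.dynamic_variables.caller_name,
      phone: retellPayload.from_number,
      address: retellPayload.dynamic_variables.address,
      transcript: retellPayload.transcript,
      score: claudeScore.score,
      tier: claudeScore.tier,
      damage_type: retellPayload.dynamic_variables.damage_type,
      insurance_confirmed: retellPayload.dynamic_variables.insurance_confirmed,
      call_duration_seconds: Math.floor(retellPayload.duration_ms / 1000),
      claude_reasoning: claudeScore.reasoning,
      next_action: claudeScore.nextAction,
    })
    .select()
    .single();

  if (error) throw new Error(`Supabase insert failed: ${error.message}`);
  return data as Lead;
}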

The agent_memory table is how I give agents persistent context across sessions. Each row is a key-value pair scoped to an agent ID and optionally a client ID. When Orus starts a new session for a client, it loads the last 20 memory rows for that client and injects them into the system prompt as a summary. Agents do not forget what happened last week.
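
Here is a minimal sketch of the step that turns memory rows into prompt context. The column names (`key`, `value`, `created_at`) and the `buildMemoryContext` helper are assumptions for illustration; the real table may be shaped differently. The function is pure so it can be tested without a database.

```typescript
// Assumed row shape for the agent_memory table.
interface MemoryRow {
  agent_id: string;
  client_id: string | null;
  key: string;
  value: string;
  created_at: string; // ISO 8601 timestamp
}

// Pick the most recent `limit` rows for a client and render them as a
// short block of text that can be appended to a system prompt.
function buildMemoryContext(rows: MemoryRow[], clientId: string, limit = 20): string {
  const recent = rows
    .filter((row) => row.client_id === clientId)
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at))
    .slice(0, limit)
    .reverse(); // oldest first, so the prompt reads chronologically

  if (recent.length === 0) return "";
  return [
    "Known context from previous sessions:",
    ...recent.map((row) => `- ${row.key}: ${row.value}`),
  ].join("\n");
}
```

In practice the filtering and limit belong in the query itself, using the supabase-js `.eq()`, `.order("created_at", { ascending: false })`, and `.limit(20)` methods, so only 20 rows ever leave the database.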

Real-time subscriptions are underused by most developers. I subscribe to the leads table for INSERT events in my dashboard. When a new hot lead lands, the dashboard updates without a page refresh and plays an alert sound. The same subscription triggers an n8n webhook via a Supabase Edge Function, which sends the SMS to the sales rep.

Vercel: Shipping It

I deploy everything on Vercel. This site, client dashboards, agent API routes — all of it. Vercel's edge network means my API routes respond from the closest region to the client, which matters for real-time voice agent handoffs where every millisecond counts.

The setup I have standardized on is a Next.js 15 monorepo with one app directory per client (or one per project). Each app has its own vercel.json that sets environment variable scoping. I keep secrets in Vercel's environment variable dashboard — never in the repo — and pull them in with process.env. Here is a typical API route that my agents call to write results:

app/api/agent-result/route.ts
import { NextRequest, NextResponse } from "next/server";
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

export async function POST(req: NextRequest) {
  // Verify the request is from our trusted agent infrastructure
  const authHeader = req.headers.get("x-agent-key");
  if (authHeader !== process.env.AGENT_WEBHOOK_SECRET) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

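  // Reject malformed JSON or missing identifiers before touching the database
  let body: { agentId?: string; taskId?: string; result?: unknown; metadata?: unknown };
  try {
    body = await req.json();
  } catch {
    return NextResponse.json({ error: "Invalid JSON" }, { status: 400 });
  }

  const { agentId, taskId, result, metadata } = body;
  if (!agentId || !taskId) {
    return NextResponse.json({ error: "agentId and taskId are required" }, { status: 400 });
  }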

  const { error } = await supabase.from("automation_results").insert({
    agent_id: agentId,
    task_id: taskId,
    result,
    metadata,
    completed_at: new Date().toISOString(),
  });

  if (error) {
    console.error("Failed to write agent result:", error);
    return NextResponse.json({ error: "Write failed" }, { status: 500 });
  }

  return NextResponse.json({ ok: true });
}

// Run this route on Vercel's edge runtime for lowest latency
export const runtime = "edge";

Vercel's preview deployments are a workflow feature I cannot live without. Every PR gets its own preview URL with its own environment. I can test a new agent integration against real (sandboxed) Supabase data before it ever touches production. When the PR merges into the production branch, Vercel builds and deploys it to production automatically.

How It All Connects: Architecture Overview

Here is the full data flow for a roofing lead coming in through a Retell call, end to end:

architecture-flow.txt
┌─────────────────────────────────────────────────────────────────┐
│                     FULL STACK AI AGENT FLOW                    │
│                        Allen, TX — 2026                         │
└─────────────────────────────────────────────────────────────────┘

  📞 Inbound Call
       │
       ▼
  ┌─────────┐     Real-time voice conversation
  │ RETELL  │──► Collects: name, address, damage_type, insurance
  └─────────┘
       │ call_ended webhook
       ▼
  ┌─────────┐     Webhook trigger → HTTP node → Claude API call
  │   n8n   │──► Claude scores lead 1-10, returns JSON tier
  └─────────┘
       │ lead data + score
       ▼
  ┌──────────┐    INSERT into leads table
  │ SUPABASE │──► Real-time subscription fires → Vercel dashboard updates
  └──────────┘    Edge Function fires → n8n secondary webhook
       │ new lead event
       ▼
  ┌──────────┐    createLeadQualificationTask()
  │ OPENMOSS │──► Assigns to best-fit executor agent
  └──────────┘    Tracks: status, retries, reviewer
       │ task dispatch
       ▼
  ┌──────────┐    Queries OpenClaw tool registry
  │ OPENCLAW │──► Routes to executor with crm.write + lead.score tools
  └──────────┘    Monitors heartbeat, drains queue on redeploy
       │ agent execution
       ▼
  ┌────────────────────┐
  │  CLAUDE API        │   Executor agent calls Claude with lead context
  │  (claude-sonnet)   │──► Writes enriched result back to Supabase
  └────────────────────┘   Creates follow-up task if needed
       │ result written
       ▼
  ┌────────────┐    /api/agent-result route (edge function)
  │   VERCEL   │──► Updates dashboard, triggers SMS via Twilio node in n8n
  └────────────┘    Sales rep gets: name, score, summary, call recording URL

  Total time: call ends → sales rep SMS ≈ 6–9 seconds

The key insight in this architecture is that no single tool is doing more than one job. Retell handles voice. n8n handles integration glue. Claude handles reasoning. Supabase handles persistence. OpenMOSS handles coordination. OpenClaw handles runtime health. Vercel handles deployment and edge routing. When something breaks — and it always does eventually — you know exactly which layer to debug.

I also keep the critical path shallow. The path from "call ends" to "lead in Supabase" is just: Retell webhook → n8n → Claude API → Supabase. Three hops. Everything else — OpenMOSS enrichment, CRM sync, follow-up scheduling — happens asynchronously after the lead is safely written. If OpenMOSS is down, the lead still gets captured.

Lessons Learned + What Is Next

After two years of running agents in production for paying clients, here are the lessons that actually changed how I build:

Write the lead before you reason about it

Your pipeline will fail. The critical path should be: capture data first, enrich later. I used to score leads before writing them to Supabase. When Claude timed out, I lost leads. Now the INSERT is always step one, and everything else is async.

Structured output from Claude, always

If you are parsing Claude's output in n8n or in code, force JSON. Use a system prompt that says 'Return JSON only, no explanation.' Add a fallback node that catches parse errors and re-runs with a stricter prompt. Free-form output causes downstream parse failures that are hard to debug at 2am.

Heartbeats save you on Fridays

OpenClaw's heartbeat system has caught three silent agent failures before clients noticed. An agent crashing without alerting you is one of the worst feelings in production. If your agent infrastructure does not have a heartbeat/health check layer, build one or use OpenClaw.

Separation of concerns is worth the extra code

It is tempting to put everything in one big n8n workflow or one giant agent prompt. Resist it. When the Retell webhook handler also tries to score leads and notify the CRM and update the dashboard, you get a 200-node workflow that no one can debug. Separate responsibilities: one workflow per concern, one agent per domain.

Claude's context window is not free

I used to dump everything into the system prompt. Full CRM data, full lead history, full memory dump. Token costs doubled. Now I use Supabase queries to retrieve the last N relevant rows and inject only what Claude needs for the current task. Precision context beats comprehensive context.
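
The first lesson above, write the lead before you reason about it, can be sketched as a capture-then-enrich function. This is a minimal illustration under stated assumptions: `LeadStore`, `captureThenEnrich`, and the two store methods are hypothetical names, the store is injected so the example runs without a database, and it assumes the score columns on the leads table are nullable so a row can exist before it is scored.

```typescript
interface RawLead {
  name: string;
  phone: string;
  transcript: string;
}

interface Score {
  score: number;
  tier: "hot" | "warm" | "cold";
}

// Hypothetical persistence interface; in my stack this would wrap Supabase.
interface LeadStore {
  insertRaw(lead: RawLead): Promise<string>; // resolves with the new lead id
  attachScore(leadId: string, score: Score): Promise<void>;
}

async function captureThenEnrich(
  store: LeadStore,
  scoreLead: (lead: RawLead) => Promise<Score>,
  lead: RawLead
): Promise<{ leadId: string; scored: boolean }> {
  // Step 1: persist the raw lead. If this throws, the caller should retry or alert.
  const leadId = await store.insertRaw(lead);

  // Step 2: enrichment is best effort. A model timeout must not lose the lead.
  try {
    await store.attachScore(leadId, await scoreLead(lead));
    return { leadId, scored: true };
  } catch {
    return { leadId, scored: false }; // row stays unscored for a later retry
  }
}
```

The `scored: false` result is the hook for a retry job: anything unscored can be picked up later without the caller ever having been at risk of dropping the lead.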

As for what is next: I am working on adding MCP (Model Context Protocol) servers for each client's specific data sources — so any agent in the stack can pull client CRM data, calendar availability, or past job history via a standardized tool interface without hardcoded integrations. I am also experimenting with running smaller, specialized Claude Haiku agents as sub-tasks for high-volume operations where Sonnet's reasoning depth is overkill.

The stack I have described here is not final. It is the best version I have right now, in March 2026, running real workflows for real clients out of Allen, TX. In six months, something will have changed — a better orchestration layer, a faster voice AI, a smarter way to handle agent memory. That is what makes this work interesting.

If you are building something similar or want to talk through your stack, reach out at vixi.agency or find me on GitHub at CachoMX. I am always down to compare notes with other builders who are shipping, not just theorizing.

Want to Build This Stack?

I consult with businesses in the DFW area and remotely to design and deploy full AI agent infrastructure. If your team is ready to stop doing things manually, let's talk.
