Supabase + n8n Lead Database Tutorial
16 min read · By Carlos Aragon

Supabase + n8n: Build a Serverless Lead Database in 1 Hour

One webhook, any lead source, zero duplicates. Connect Retell AI call completions, Typeform submissions, and Meta Ads leads into a single Postgres table — with Row Level Security, proper deduplication, and a schema that actually holds up in production. Cost: ~$6/month.

The Problem: Leads Living in Five Different Places

Before I built this, a VIXI client's lead data lived in five places simultaneously: Retell AI call logs, a Typeform responses spreadsheet, Meta Ads Lead Center, their CRM (half-synced), and a shared Google Sheet their sales rep updated manually. "How many qualified roofing leads came in this week?" was a 20-minute manual job that routinely produced the wrong answer.

The $4k problem: a qualified lead called in via Retell on a Tuesday, then submitted the same contact form on Thursday. Two rows in two systems. Two salespeople followed up within hours of each other. The prospect picked a competitor. Not because we weren't good — because we looked disorganized.

The fix I needed was simple to describe and annoying to actually build without the right tools: one table, every lead, all sources normalized to the same schema, duplicates merged on insert. Supabase and n8n make this achievable in an afternoon. Here's the exact setup.

What we're building:

  • Supabase Postgres table — single source of truth for all leads
  • Row Level Security — nobody reads your data without a service key
  • n8n webhook workflow — normalizes any source payload to your schema
  • Deduplication — upsert on email OR phone, not blind insert
  • Source tagging — every lead knows where it came from
  • Telegram alert — ping when a high-quality lead arrives

Architecture: Why This Stack

The architecture is intentionally simple. Three layers: sources, processor, storage.

Sources:
  Retell AI   ──┐
  Typeform    ──┤
  Meta Ads    ──┼──►  n8n Webhook   ──►  Supabase Postgres
  Web Form    ──┤    (normalize,
  Any webhook ──┘     dedup, upsert)

Why Supabase: Postgres means real SQL — JOINs, window functions, indexed queries. The free tier gives you 500MB and REST + realtime APIs out of the box. Row Level Security means your table is locked by default; you control exactly who can read or write. And crucially: you own the data. No vendor lock-in to a CRM that changes its export format every six months.

Why n8n: Every lead source sends a different payload shape. Retell sends call metadata. Typeform sends form field IDs. Meta Ads sends a lead gen object. n8n's Function node lets you normalize all of them to one schema before the data ever touches your database. It also handles the deduplication check, the conditional upsert, and optional downstream steps like scoring or alerting.

Why not a direct integration: Every "native" CRM integration I've seen either doesn't support all sources, doesn't deduplicate, or charges $100+/month for the privilege. This setup costs the price of a Postgres hobby instance — or nothing on Supabase free tier.

Supabase Setup: Schema, RLS, and the Service Key

Create a free Supabase project at supabase.com. Once provisioned, open the SQL editor and run the following. This is the real schema I use in production — not a toy example.

Create the Leads Table

-- Enable UUID generation
create extension if not exists "uuid-ossp";

create table if not exists leads (
  id              uuid primary key default uuid_generate_v4(),

  -- Contact identity
  first_name      text,
  last_name       text,
  email           text,
  phone           text,

  -- Source metadata
  source          text not null,        -- 'retell' | 'typeform' | 'meta_ads' | 'web_form'
  source_id       text,                 -- original ID from the source system
  campaign        text,                 -- ad campaign name if applicable

  -- Lead quality
  status          text default 'new',   -- 'new' | 'contacted' | 'qualified' | 'lost' | 'won'
  score           int,                  -- 0-10 AI score (set by n8n + Claude)
  qualified       boolean default false,

  -- Voice call specifics (Retell)
  call_duration_s int,                  -- call length in seconds
  call_transcript text,                 -- truncated transcript
  call_recording  text,                 -- recording URL

  -- Deduplication tracking
  touch_count     int default 1,        -- incremented on each upsert
  first_seen_at   timestamptz default now(),
  last_seen_at    timestamptz default now(),

  -- Metadata
  raw_payload     jsonb,                -- full original webhook payload
  notes           text,
  created_at      timestamptz default now(),
  updated_at      timestamptz default now()
);

-- Indexes for fast dedup lookups
create index if not exists leads_email_idx on leads (email) where email is not null;
create index if not exists leads_phone_idx on leads (phone) where phone is not null;
create index if not exists leads_source_idx on leads (source);
create index if not exists leads_status_idx on leads (status);
create index if not exists leads_created_at_idx on leads (created_at desc);

-- Auto-update updated_at
create or replace function update_updated_at_column()
returns trigger as $$
begin
  new.updated_at = now();
  return new;
end;
$$ language plpgsql;

create trigger update_leads_updated_at
  before update on leads
  for each row execute function update_updated_at_column();

Row Level Security Policies

Enable RLS and lock the table down. Your n8n workflow will use the service_role key, which bypasses RLS; everything else is blocked by default.

-- Enable RLS (locks table to all roles by default)
alter table leads enable row level security;

-- service_role bypasses RLS automatically in Supabase
-- Use it exclusively in n8n server-side requests

-- Optional: allow authenticated dashboard users to read (not write)
create policy "dashboard_read_leads"
  on leads for select
  using (auth.role() = 'authenticated');

-- For the n8n service account, use service_role key only
-- Never use anon key for server-side operations

Get Your Service Key

In Supabase: Settings → API → service_role key (the long one, not the anon key). Copy this — you'll add it to n8n as a credential. Never put this in client-side code, public env vars, or git. It bypasses RLS and has full database access.

Security note

The service_role key is a master key. Store it only in n8n's credential vault (encrypted at rest). Never log it, never commit it. Rotate it immediately if exposed.

n8n Workflow: Webhook to Supabase

The workflow has four nodes: Webhook → Normalize → Dedup Check → Upsert. Here's each one.

Node 1: Webhook Trigger

Create a new workflow in n8n. Add a Webhook node. Set method to POST, path to something like /leads/ingest. Copy the production URL — you'll give this to every source system. Authentication: use Header Auth with a static token you set and validate in the next node.
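One way to do that token check, sketched as a plain helper. The header name and token value here are placeholders; match whatever you configure on the Webhook node and in each source system.

```javascript
// Sketch: reject webhook calls that don't carry the shared token.
// n8n lowercases incoming header names, so look up the lowercase key.
function validateToken(headers, expectedToken) {
  const provided = headers['x-webhook-token'];
  return typeof provided === 'string' && provided === expectedToken;
}

// In the Function node body:
// const headers = $input.first().json.headers || {};
// if (!validateToken(headers, 'YOUR_STATIC_TOKEN')) {
//   throw new Error('Unauthorized webhook call');
// }
```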

Node 2: Normalize Payload

This is the heart of the workflow. A Function node that takes any source payload and outputs a consistent lead object:

// n8n Function node: Normalize lead from any source
const body = $input.first().json.body || $input.first().json;
const source = body.source || body._source || 'unknown';

let lead = {
  source,
  raw_payload: body,
  status: 'new',
  first_seen_at: new Date().toISOString(),
  last_seen_at: new Date().toISOString(),
};

if (source === 'retell') {
  const call = body.call || body;
  lead.phone = call.from_number?.replace(/\D/g, '').replace(/^1/, '') || null;
  lead.first_name = call.extracted_data?.first_name || null;
  lead.last_name  = call.extracted_data?.last_name  || null;
  lead.email      = call.extracted_data?.email       || null;
  lead.call_duration_s = call.call_duration_s || null;
  lead.call_transcript = (call.transcript || '').slice(0, 2000);
  lead.call_recording  = call.recording_url  || null;
  lead.qualified  = call.extracted_data?.qualified === true;
  lead.campaign   = call.metadata?.campaign  || null;

} else if (source === 'typeform') {
  const answers = body.form_response?.answers || [];
  const getField = (ref) => answers.find(a => a.field?.ref === ref)?.text || null;
  lead.first_name = getField('first_name');
  lead.last_name  = getField('last_name');
  lead.email = getField('email') || answers.find(a => a.type === 'email')?.email || null;
  lead.phone = getField('phone') || answers.find(a => a.type === 'phone_number')?.phone_number || null;
  lead.source_id  = body.form_response?.token || null;
  lead.campaign   = body.form_response?.hidden?.campaign || null;

} else if (source === 'meta_ads') {
  const data = body.entry?.[0]?.changes?.[0]?.value || body;
  const fields = {};
  (data.field_data || []).forEach(f => { fields[f.name] = f.values?.[0] || null; });
  lead.first_name = fields.first_name || fields.full_name?.split(' ')[0] || null;
  lead.last_name  = fields.last_name  || fields.full_name?.split(' ')[1] || null;
  lead.email      = fields.email       || null;
  lead.phone      = fields.phone_number?.replace(/\D/g, '') || null;
  lead.source_id  = data.leadgen_id    || null;
  lead.campaign   = data.campaign_name || null;

} else if (source === 'web_form') {
  lead.first_name = body.first_name || body.name?.split(' ')[0] || null;
  lead.last_name  = body.last_name  || body.name?.split(' ')[1] || null;
  lead.email      = body.email       || null;
  lead.phone      = body.phone?.replace(/\D/g, '') || null;
  lead.campaign   = body.campaign    || body.utm_campaign || null;
  lead.source_id  = body.form_id     || null;
}

// Normalize phone: strip to 10 digits (US)
if (lead.phone) {
  lead.phone = lead.phone.replace(/^1/, '').slice(-10);
}

return [{ json: lead }];

Node 3: Dedup Check

HTTP Request node to Supabase REST API before inserting. Uses PostgREST's or filter to check both email and phone in one query:

// HTTP Request node config for dedup check
Method: GET
URL: https://YOUR_PROJECT.supabase.co/rest/v1/leads

// Query parameters:
select: id,touch_count
or: (email.eq.{{ $json.email }},phone.eq.{{ $json.phone }})
limit: 1

// Headers:
apikey: YOUR_SERVICE_ROLE_KEY
Authorization: Bearer YOUR_SERVICE_ROLE_KEY
Content-Type: application/json

After this node, add an IF node: if the response array has length > 0, route to UPDATE (existing lead). Otherwise route to INSERT (new lead).
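One gotcha with this filter: if the normalized lead has a null email or phone, the interpolated expression degenerates into something like `email.eq.`, which matches nothing useful. A minimal sketch of building the `or` parameter only from fields that are actually present (illustrative helper, not an n8n node as-is):

```javascript
// Sketch: build the PostgREST `or` filter from non-null fields only.
function buildOrFilter(lead) {
  const clauses = [];
  if (lead.email) clauses.push(`email.eq.${encodeURIComponent(lead.email)}`);
  if (lead.phone) clauses.push(`phone.eq.${encodeURIComponent(lead.phone)}`);
  // With neither field present there is nothing to dedup on;
  // the caller should route straight to INSERT in that case.
  return clauses.length ? `(${clauses.join(',')})` : null;
}
```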

Node 4a: Insert New Lead

// HTTP Request node: INSERT
Method: POST
URL: https://YOUR_PROJECT.supabase.co/rest/v1/leads

// Headers:
apikey: YOUR_SERVICE_ROLE_KEY
Authorization: Bearer YOUR_SERVICE_ROLE_KEY
Content-Type: application/json
Prefer: return=representation

// Body (JSON):
{
  "first_name":      "{{ $node['Normalize'].json.first_name }}",
  "last_name":       "{{ $node['Normalize'].json.last_name }}",
  "email":           "{{ $node['Normalize'].json.email }}",
  "phone":           "{{ $node['Normalize'].json.phone }}",
  "source":          "{{ $node['Normalize'].json.source }}",
  "source_id":       "{{ $node['Normalize'].json.source_id }}",
  "campaign":        "{{ $node['Normalize'].json.campaign }}",
  "qualified":       {{ $node['Normalize'].json.qualified ?? false }},
  "call_duration_s": {{ $node['Normalize'].json.call_duration_s ?? 'null' }},
  "call_transcript": "{{ $node['Normalize'].json.call_transcript }}",
  "raw_payload":     {{ JSON.stringify($node['Normalize'].json.raw_payload) }},
  "status":          "new"
}

Node 4b: Update Existing Lead

// HTTP Request node: UPDATE (duplicate lead — increment touch_count)
Method: PATCH
URL: https://YOUR_PROJECT.supabase.co/rest/v1/leads?id=eq.{{ $node['Dedup Check'].json[0].id }}

// Headers: same as INSERT

// Body:
{
  "last_seen_at": "{{ new Date().toISOString() }}",
  "touch_count":  {{ ($node['Dedup Check'].json[0].touch_count || 1) + 1 }},
  "source":       "{{ $node['Normalize'].json.source }}",
  "campaign":     "{{ $node['Normalize'].json.campaign }}"
}

Connecting Retell AI

Retell AI sends a webhook after every call completes. This is where the most valuable lead data lives — not a form submission, but a real conversation with qualification data extracted from the transcript.

Retell Webhook Setup

In your Retell dashboard: Agent Settings → Webhook URL → paste your n8n webhook URL. Set events to call_ended and call_analyzed. The analyzed event fires after Retell's AI analysis runs — this is the one you want because it includes extracted data from the conversation.

// Retell call_analyzed webhook payload (abbreviated)
{
  "event": "call_analyzed",
  "call": {
    "call_id": "call_abc123",
    "from_number": "+14695551234",
    "to_number": "+14695559999",
    "call_duration_s": 187,
    "recording_url": "https://...",
    "transcript": "Agent: Hi, this is...\nUser: Yeah I need a roof...",
    "extracted_data": {
      // Fields from your Retell agent's extraction config
      "first_name": "John",
      "last_name": "Martinez",
      "email": "john@example.com",
      "service_needed": "roof replacement",
      "qualified": true,
      "urgency": "high"
    },
    "metadata": {
      "campaign": "storm-damage-2026"
    }
  },
  "source": "retell"
}

Set up your Retell agent's extraction variables to pull the fields you care about from each call transcript. The qualified boolean is the most valuable — instruct the agent to set it based on whether the caller meets your lead criteria (homeowner, specific service needed, zip code match, etc.).

Deduplication: The Part Everyone Gets Wrong

Most tutorials show dedup as "check if email already exists." That breaks the moment someone submits a form with their work email and then calls in from their cell with a different email address. You need composite deduplication.

Dedup hierarchy (in order of confidence):

  1. Match on email AND phone → definitely the same person → update
  2. Match on email only → probably same person → update + flag for review
  3. Match on phone only → probably same person → update + flag for review
  4. No match → new lead → insert

The PostgREST or filter handles all cases simultaneously. When there's a match, you update and increment touch_count — so you always know how many times this person has engaged across sources.
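The hierarchy above can be sketched as a small classifier. The helper below is illustrative only (it assumes the dedup query also selects email and phone, not just id and touch_count):

```javascript
// Sketch of the dedup hierarchy: `incoming` is the normalized lead,
// `existing` is the matched row from the dedup check, or null.
function classifyMatch(incoming, existing) {
  if (!existing) return { action: 'insert', review: false };
  const emailMatch = !!incoming.email && incoming.email === existing.email;
  const phoneMatch = !!incoming.phone && incoming.phone === existing.phone;
  // Both match: definitely the same person. One matches: probably the
  // same person, so update but flag the row for human review.
  if (emailMatch && phoneMatch) return { action: 'update', review: false };
  if (emailMatch || phoneMatch) return { action: 'update', review: true };
  return { action: 'insert', review: false };
}
```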

Native Upsert Option

Prefer fewer round trips? Use Postgres's native upsert via partial unique indexes:

-- Add unique constraints for upsert (partial — NULLs don't conflict)
create unique index leads_email_unique
  on leads (email) where email is not null;

create unique index leads_phone_unique
  on leads (phone) where phone is not null;

-- In your Supabase REST POST request, add:
-- Header:  Prefer: resolution=merge-duplicates
-- Param:   on_conflict=email
--
-- This performs: INSERT ... ON CONFLICT (email) DO UPDATE
-- Supabase handles it automatically

The tradeoff: native upsert is faster (one round trip vs two), but the separate check-then-insert pattern gives more control — you can add custom merge logic, route branches differently, or log when a returning lead triggers an update.
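If you go the native route, the REST call can be sketched as a plain request builder. Project URL and service key are placeholders, and `on_conflict=email` assumes the partial unique index above:

```javascript
// Sketch: build the native-upsert request against Supabase's REST API.
function buildUpsertRequest(projectUrl, serviceKey, lead) {
  return {
    url: `${projectUrl}/rest/v1/leads?on_conflict=email`,
    options: {
      method: 'POST',
      headers: {
        apikey: serviceKey,
        Authorization: `Bearer ${serviceKey}`,
        'Content-Type': 'application/json',
        // merge-duplicates turns the POST into INSERT ... ON CONFLICT DO UPDATE
        Prefer: 'resolution=merge-duplicates,return=representation',
      },
      body: JSON.stringify(lead),
    },
  };
}

// Usage (Node 18+ for global fetch):
// const { url, options } = buildUpsertRequest(PROJECT_URL, SERVICE_KEY, lead);
// const rows = await (await fetch(url, options)).json();
```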

Alerts, AI Scoring, and Agent Integration

The database is the foundation. Here's what you can build on top once every lead is in one place.

Telegram Alert for High-Quality Leads

// n8n Function node: Build Telegram alert for qualified leads
const lead = $node['Normalize'].json;
const isNew = $node['Dedup Check'].json.length === 0;

if (!lead.qualified && (lead.score || 0) < 7) {
  return []; // Skip low-quality leads
}

const name = [lead.first_name, lead.last_name].filter(Boolean).join(' ') || 'Unknown';
const contact = [lead.email, lead.phone].filter(Boolean).join(' / ') || 'No contact';
const tag = isNew ? '🆕 New Lead' : '🔁 Return Lead';

const text = [
  tag,
  '👤 ' + name,
  '📞 ' + contact,
  '📍 Source: ' + lead.source,
  '🎯 Campaign: ' + (lead.campaign || 'none'),
  '✅ Qualified: ' + (lead.qualified ? 'Yes' : 'No'),
  '⏱ Call: ' + (lead.call_duration_s ? lead.call_duration_s + 's' : 'N/A'),
].join('\n');

return [{ json: { text } }];

// Follow with Telegram node:
// Chat ID: your Telegram chat ID
// Text: {{ $json.text }}

AI Scoring with Claude

Add a Claude API call to score each lead before insertion. Use Haiku for speed and cost — scoring doesn't need Sonnet:

// HTTP Request node: Claude Haiku lead scoring
Method: POST
URL: https://api.anthropic.com/v1/messages

// Headers:
x-api-key: YOUR_ANTHROPIC_API_KEY
anthropic-version: 2023-06-01
Content-Type: application/json

// Body:
{
  "model": "claude-haiku-4-5-20251001",
  "max_tokens": 100,
  "messages": [{
    "role": "user",
    "content": "Score this lead 0-10 for a roofing company. Return only JSON: {\"score\": N, \"reason\": \"one sentence\"}\n\nName={{ $json.first_name }} {{ $json.last_name }}, Source={{ $json.source }}, Qualified={{ $json.qualified }}, Call={{ $json.call_duration_s }}s, Campaign={{ $json.campaign }}"
  }]
}

// Parse: JSON.parse($json.content[0].text).score
// Add to lead object before Supabase insert
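Haiku usually returns clean JSON at this token budget, but a defensive parser is cheap insurance. A sketch that falls back to a regex grab when the model wraps the JSON in prose:

```javascript
// Sketch: parse the scoring response defensively. If strict JSON.parse
// fails (e.g. the model prefixed the JSON with text), fall back to a
// regex match on the "score" field.
function parseScore(responseText) {
  try {
    const parsed = JSON.parse(responseText);
    if (typeof parsed.score === 'number') return parsed.score;
  } catch (_) { /* fall through to regex */ }
  const m = responseText.match(/"score"\s*:\s*(\d+)/);
  return m ? Number(m[1]) : null;
}
```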

Query Examples

With everything in one table, your SQL becomes genuinely useful:

-- This week's qualified leads by source
select source, count(*) as count, round(avg(score), 1) as avg_score
from leads
where qualified = true
  and created_at > date_trunc('week', now())
group by source order by count desc;

-- High-value leads not yet contacted
select first_name, last_name, email, phone, source, score, campaign
from leads
where score >= 7 and status = 'new'
  and created_at > now() - interval '7 days'
order by score desc, created_at desc;

-- Return lead audit (touched more than once)
select email, phone, touch_count,
       array_agg(distinct source) as sources,
       max(last_seen_at) as last_seen
from leads
where touch_count > 1
group by email, phone, touch_count
order by touch_count desc;

-- Monthly lead volume by campaign
select date_trunc('month', created_at) as month,
       campaign, source, count(*) as leads
from leads where created_at > now() - interval '6 months'
group by 1, 2, 3 order by 1 desc, 4 desc;

Cost Breakdown and Production Notes

I run this for two VIXI clients processing roughly 400 leads/month combined. Here's the real cost:

Component                        Plan                                      Monthly Cost
Supabase                         Free tier (500MB — well under)            $0
n8n                              Self-hosted on Mac mini                   $0
Claude Haiku (scoring)           400 leads × ~300 tokens = 120K tokens     ~$0.03
Telegram Bot                     Free                                      $0
n8n Cloud (if not self-hosted)   Starter plan                              ~$6/mo

Checklist before going live:

  • Test webhook with each source in n8n's test mode
  • Verify RLS: call Supabase with anon key — should return 0 rows
  • Dedup test: submit same email twice — confirm touch_count increments
  • Dedup test: same phone, different email — confirm still deduplicated
  • Confirm service_role key is only in n8n credentials
  • Enable Supabase database backups: Settings → Database → Backups

One Table Changes How You Think About Leads

The operational shift that surprised me most wasn't the deduplication or the alerts — it was having SQL access to lead data. When you can write SELECT * FROM leads WHERE score >= 8 AND status = 'new' and get a clean answer in 50 milliseconds, the entire way you run follow-up changes. You stop asking "how many leads did we get?" and start asking "which leads should we call first today?"

This pipeline is also the foundation for everything else in the AI stack. Once leads are in Supabase, n8n workflows can query them to trigger other automations: follow-up sequences, CRM sync, BellaBot reconfiguration when a new client segment shows up. The database is the connective tissue between your AI agents and your business operations.

Start with one source — whichever has the most volume. Get it working end-to-end before adding the second. The normalization function is built to extend: one new else if branch per source, and everything downstream (dedup, scoring, alerting) works automatically.
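As an illustration of that extension pattern, here is what a hypothetical Calendly branch might look like, written as a standalone function. Every field name below is assumed; check the real payload from your source before wiring it in.

```javascript
// Hypothetical new-source branch, extracted as a function for clarity.
// In the workflow it would be one more `else if` in the Normalize node.
function normalizeNewSource(body) {
  const lead = { source: 'calendly', raw_payload: body, status: 'new' };
  const invitee = body.payload?.invitee || {};
  lead.first_name = invitee.first_name || invitee.name?.split(' ')[0] || null;
  lead.last_name  = invitee.last_name  || invitee.name?.split(' ')[1] || null;
  lead.email      = invitee.email || null;
  // Strip non-digits, same convention as the other branches
  lead.phone      = (invitee.phone || '').replace(/\D/g, '') || null;
  lead.source_id  = invitee.uuid || null;
  return lead;
}
```

Because the output matches the shared schema, the downstream dedup, scoring, and alert nodes need no changes.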

Building this for a client?

I build Supabase + n8n lead pipelines as part of VIXI's AI automation stack. If you want this wired up for your agency — or want to talk through a more complex dedup strategy for your specific sources — reach out.

Get in touch →