
    Webhook idempotency: the bug most teams ship

    Why webhook handlers double-charge, double-grant, and double-cancel, and the one-table database pattern that fixes all of it.

    YAEL Engineering·18 Mar 2026·8 min read·1,652 words
    On this page
    • What actually goes wrong
    • The fix, in full
    • Why the transaction matters
    • Concurrent retries
    • What "applying the effect" looks like
    • The 200 response trick
    • What about Stripe's own idempotency keys?
    • Testing idempotency
    • Cleaning up processed_events
    • Other webhook sources, same pattern
    • FAQ
    • What if my handler is fast — do I still need this?
    • Can I use Redis instead of Postgres?
    • What about Stripe's webhook signing — isn't that enough?
    • Does Stripe retry forever?
    • What's the right table column for the event id?
    • How do I handle out-of-order events?
    • What if my handler depends on a slow downstream API?
    • Should I version the event payload?

    The webhook handler that processes a Stripe event twice is the most common preventable bug we see in SaaS code reviews. Stripe will retry. So will Paddle, Telegram, Slack, GitHub, Shopify, and every reasonable webhook source. If your handler is not idempotent, retries become double-charges, double-grants, duplicate emails, and customer support tickets you cannot explain. The fix is one database table and one unique constraint. It takes ten minutes to add and it survives every kind of failure your network can produce.

    This is the pattern we ship into every webhook handler we write — for CloudChat's Stripe billing, for partner-integration webhooks, for the Telegram bot Stripe integration we wrote about separately.

    What actually goes wrong

    Stripe (and every other webhook source) sends events with at-least-once delivery. Reasons retries happen:

    • Your endpoint returned non-2xx
    • Your endpoint took >10 seconds to respond
    • The network blipped between Stripe and you
    • Stripe's delivery infrastructure had a hiccup
    • You restarted your server mid-handler

    In every case Stripe assumes the event was not received and resends. If your handler ran to completion the first time but the response didn't make it back, Stripe will retry — and your handler will run again, on a payload it has already processed.

    A non-idempotent customer.subscription.created handler will provision the plan twice. A non-idempotent invoice.payment_succeeded will credit usage twice. A non-idempotent customer.subscription.deleted will downgrade an already-downgraded account (less bad, but still messy in audit logs).
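
    To make the failure concrete, here is a minimal in-memory sketch of a redelivered event hitting a naive handler and an idempotent one. The Account class, its balance field, and deliverTwice are hypothetical names used only for illustration; the Set stands in for the database table introduced in the next section.

```typescript
// Hypothetical in-memory model: `balance` stands in for a usage-credit column.
type WebhookEvent = { id: string; type: string; amount: number };

class Account {
  balance = 0;
  private seen = new Set<string>();

  // Naive handler: every delivery credits the account.
  creditNaive(event: WebhookEvent): void {
    this.balance += event.amount;
  }

  // Idempotent handler: the event id is claimed first; repeats are skipped.
  creditOnce(event: WebhookEvent): "processed" | "duplicate" {
    if (this.seen.has(event.id)) return "duplicate";
    this.seen.add(event.id);
    this.balance += event.amount;
    return "processed";
  }
}

// Simulates a sender that redelivers because the first response was lost.
function deliverTwice<T>(handler: (e: WebhookEvent) => T, event: WebhookEvent): T[] {
  return [handler(event), handler(event)];
}
```

    With one invoice.payment_succeeded event worth 100, the naive account ends at 200 and the idempotent one at 100.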

    The fix, in full

    One table:

    sql
    create table processed_events (
      id         text primary key,
      source     text not null,
      type       text not null,
      payload    jsonb not null,
      processed_at timestamptz not null default now()
    );

    One pattern:

    ts
    // src/app/api/stripe/webhook/route.ts
    import { headers } from "next/headers";
    import Stripe from "stripe";
    import { db } from "@/lib/db";
    
    const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
    
    export async function POST(req: Request) {
      const sig = (await headers()).get("stripe-signature");
      // Signature verification needs the raw, unparsed request body.
      const body = await req.text();
      if (!sig) {
        return new Response("missing signature", { status: 400 });
      }
    
      let event: Stripe.Event;
      try {
        event = stripe.webhooks.constructEvent(
          body,
          sig,
          process.env.STRIPE_WEBHOOK_SECRET!,
        );
      } catch {
        return new Response("invalid signature", { status: 400 });
      }
    
      const result = await db.transaction(async (tx) => {
        // INSERT ... ON CONFLICT DO NOTHING.
        // If the row already exists, this is a retry — skip the work.
        const inserted = await tx.execute(
          `insert into processed_events (id, source, type, payload)
           values ($1, 'stripe', $2, $3)
           on conflict (id) do nothing
           returning id`,
          [event.id, event.type, event],
        );
        if (inserted.length === 0) return "duplicate";
    
        await applyEffect(tx, event);
        return "processed";
      });
    
      return new Response(JSON.stringify({ result }), { status: 200 });
    }

    That's it. The primary key on id, on conflict do nothing, and the transaction together guarantee that applyEffect's database changes commit exactly once per unique event id, even under concurrent retries and across process restarts. If the database connection dies mid-handler, the transaction rolls back and the next retry starts clean.

    Why the transaction matters

    Two things must happen atomically: (1) record that you processed the event, (2) apply the effect. If they're separate, you have a window where one happens and the other doesn't.

    The transaction collapses that window. If applyEffect throws, the insert is rolled back, the event is not marked processed, and the next retry re-runs the whole thing. If applyEffect succeeds, both the insert and the effect commit together.
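
    The commit-or-rollback behaviour can be sketched without a real database. TinyDb below is a hypothetical in-memory stand-in that copies its state, runs the work against the copy, and keeps the copy only if the work does not throw. It illustrates the logic only and says nothing about how Postgres implements transactions.

```typescript
// Hypothetical in-memory stand-in for a database running one transaction at a time.
type State = { processed: Set<string>; subscriptions: Map<string, string> };

class TinyDb {
  private state: State = { processed: new Set(), subscriptions: new Map() };

  // Runs `work` against a copy of the state. The copy replaces the real state
  // only if `work` returns normally (commit); if it throws, the copy is
  // discarded and the real state is untouched (rollback).
  transaction<T>(work: (tx: State) => T): T {
    const draft: State = {
      processed: new Set(this.state.processed),
      subscriptions: new Map(this.state.subscriptions),
    };
    const result = work(draft);
    this.state = draft;
    return result;
  }

  snapshot(): State {
    return this.state;
  }
}

// Marks the event processed and applies the effect in the same transaction.
function handle(db: TinyDb, eventId: string, applyEffect: (tx: State) => void) {
  return db.transaction((tx) => {
    if (tx.processed.has(eventId)) return "duplicate" as const;
    tx.processed.add(eventId);
    applyEffect(tx);
    return "processed" as const;
  });
}
```

    If applyEffect throws on the first delivery, the event id is not recorded, so a retry of the same event is processed normally and a third delivery reports a duplicate.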

    Beware non-DB side effects

    The transaction only protects database side effects. If applyEffect sends an email or calls an external API, those happen even on rollback. Make the external call idempotent on its own (use a request key) or push it onto a job queue that runs after the transaction commits.
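
    One common way to get "runs after the transaction commits" is an outbox: the handler records the intended side effect as a row inside the transaction, and a separate dispatcher sends it later. The sketch below models that in memory. OutboxDb, OutboxRow, and the key format are hypothetical, and a real version would store the rows in a table next to processed_events.

```typescript
// Hypothetical in-memory outbox: side effects are recorded inside the
// transaction and only dispatched after it commits.
type OutboxRow = { key: string; kind: "email"; to: string };

class OutboxDb {
  private committed: OutboxRow[] = [];
  private sent = new Set<string>();

  // Rows enqueued by `work` become visible only if `work` does not throw.
  transaction(work: (enqueue: (row: OutboxRow) => void) => void): void {
    const pending: OutboxRow[] = [];
    work((row) => pending.push(row)); // a throw here skips the commit below
    this.committed.push(...pending);
  }

  // Sends committed rows. Rows are deduplicated on `key`, and the queue is
  // cleared only after every row was handled, so a failed run can be retried.
  dispatch(send: (row: OutboxRow) => void): void {
    for (const row of this.committed) {
      if (this.sent.has(row.key)) continue; // already delivered earlier
      send(row);
      this.sent.add(row.key);
    }
    this.committed = [];
  }
}
```

    A rolled-back transaction leaves nothing to dispatch, and a row enqueued twice under the same key is sent once.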

    Concurrent retries

    Stripe will sometimes send the same event in parallel — especially after their delivery infrastructure recovers from a hiccup. Two requests, same event id, both insert at the same time.

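    The on conflict do nothing clause handles this correctly. The second insert blocks until the first transaction finishes. If the first commits, one insert has won (it returned the id) and the other loses (it returns empty), so the losing handler skips applyEffect. If the first rolls back, the second insert succeeds and does the work instead. Either way, no double-processing.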

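    You will sometimes see a select ... for update check placed in front of the insert as a belt-and-suspenders measure:

    ts
    const existing = await tx.execute(
      `select id from processed_events where id = $1 for update`,
      [event.id],
    );
    if (existing.length > 0) return "duplicate";
    // ... still insert with on conflict do nothing, then apply

    It is not a substitute. for update locks only rows that already exist, so two concurrent first deliveries both see an empty result and both proceed. The unique constraint on the insert is what actually serializes them, which makes the on conflict approach alone simpler and just as safe.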
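
    The difference between a separate existence check and an atomic claim can be sketched in memory. EventLog below is a hypothetical model: the await stands in for a round trip to the database, and claim plays the role of insert ... on conflict do nothing returning id. It demonstrates the race only and is not a model of Postgres locking.

```typescript
// Yields to other pending work, like waiting on a database round trip.
const tick = (): Promise<void> => Promise.resolve();

class EventLog {
  private ids = new Set<string>();
  effects = 0;

  // Racy: the existence check and the insert are separated by an await,
  // so two concurrent deliveries can both pass the check.
  async handleCheckThenInsert(id: string): Promise<void> {
    if (this.ids.has(id)) return;
    await tick();
    this.ids.add(id);
    this.effects += 1;
  }

  // Atomic claim: check and insert happen in one uninterrupted step.
  private claim(id: string): boolean {
    if (this.ids.has(id)) return false;
    this.ids.add(id);
    return true;
  }

  async handleAtomicClaim(id: string): Promise<void> {
    if (!this.claim(id)) return;
    await tick();
    this.effects += 1;
  }
}
```

    Running two handleCheckThenInsert calls for the same id through Promise.all applies the effect twice, while any number of concurrent handleAtomicClaim calls apply it once.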

    What "applying the effect" looks like

    Each event type has a handler. The handlers can be naive about idempotency because the outer wrapper guarantees their effects commit at most once per event id.

    ts
    async function applyEffect(tx: DbTx, event: Stripe.Event) {
      switch (event.type) {
        case "customer.subscription.created":
        case "customer.subscription.updated":
          return upsertSubscription(tx, event.data.object as Stripe.Subscription);
        case "customer.subscription.deleted":
          return cancelSubscription(tx, event.data.object as Stripe.Subscription);
        case "invoice.payment_succeeded":
          return creditUsage(tx, event.data.object as Stripe.Invoice);
        case "invoice.payment_failed":
          return flagFailure(tx, event.data.object as Stripe.Invoice);
        default:
          return; // unknown event type, ack and move on
      }
    }

    Note: upsertSubscription being an upsert is fine. It is not there to absorb retries, since the outer wrapper already ensures its effect is not committed twice for the same event.

    The 200 response trick

    Stripe (and most webhook sources) treat any non-2xx response as "redeliver this." That includes 500s. So an unhandled exception in your handler triggers a retry — which is what you want, as long as your handler is idempotent.

    Two failure cases are worth walking through. First, a 500 (or a lost response) after the database commit: Stripe redelivers, your idempotency check fires, you return 200, and Stripe stops retrying. Good. Second, a 500 before the commit: the transaction rolls back, the event isn't marked processed, and the retry runs the handler again. Also good.

    What you don't want: returning 200 too eagerly. We've seen code like this in the wild:

    ts
    // BAD — never do this
    export async function POST(req: Request) {
      const event = parse(await req.text());
      setImmediate(() => applyEffect(event)); // fire and forget
      return new Response("ok"); // 200, but the work hasn't happened
    }

    This is broken. If applyEffect fails, Stripe doesn't know to retry because you already returned 200. We have debugged this exact bug in production during customer code reviews. Don't do this.
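
    The safe shape is to await the work and let the status code reflect the outcome. A minimal sketch follows, assuming a hypothetical processEvent function that wraps the transactional handler shown earlier. Reply is a plain object rather than a framework response so that the example stays self-contained.

```typescript
type Outcome = "processed" | "duplicate";
type Reply = { status: number; body: string };

// Awaits the work before answering, so the status code reflects what
// actually happened. `processEvent` is a hypothetical stand-in for the
// transactional handler.
async function respond(processEvent: () => Promise<Outcome>): Promise<Reply> {
  try {
    const outcome = await processEvent();
    // Both outcomes are a 2xx: a duplicate means the work is already committed.
    return { status: 200, body: outcome };
  } catch {
    // The transaction rolled back, so ask the sender to redeliver.
    return { status: 500, body: "retry" };
  }
}
```

    A duplicate answers 200 on purpose: the sender should stop retrying once the effect is committed, whichever delivery committed it.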

    What about Stripe's own idempotency keys?

    Stripe gives you an Idempotency-Key header you can pass when calling Stripe APIs. It guarantees Stripe won't double-process your request. That's a different problem. Webhook idempotency is about your handler not double-processing Stripe's events.

    You need both: idempotency keys on outgoing API calls (so creating a customer doesn't accidentally create two), and processed-event deduplication on incoming webhooks. They're symmetric and they solve different bugs.
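
    When a webhook handler itself calls an API, one option is to derive the outgoing key from the incoming event id, so a re-run of the handler reuses the same key. The sketch below assumes a hypothetical CustomerApi interface and a fake implementation that dedupes on the key. With stripe-node, the key would be passed as the idempotencyKey request option instead.

```typescript
// Hypothetical minimal client interface, for illustration only.
interface CustomerApi {
  create(params: { email: string }, options: { idempotencyKey: string }): string;
}

// Derives a stable key from the incoming event id and an action name, so a
// re-run of the same handler sends the same key.
function outgoingKey(eventId: string, action: string): string {
  return `${eventId}:${action}`;
}

// Fake API that behaves like an idempotent endpoint: same key, same result.
class FakeCustomerApi implements CustomerApi {
  private byKey = new Map<string, string>();
  private emails = new Map<string, string>();
  created = 0;

  create(params: { email: string }, options: { idempotencyKey: string }): string {
    const existing = this.byKey.get(options.idempotencyKey);
    if (existing !== undefined) return existing; // replay: nothing new is created
    this.created += 1;
    const id = `cus_fake_${this.created}`;
    this.byKey.set(options.idempotencyKey, id);
    this.emails.set(id, params.email);
    return id;
  }
}
```

    Calling create twice with outgoingKey("evt_1", "create-customer") returns the same customer id and creates one customer.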

    Testing idempotency

    Two tests in CI:

    ts
    test("processing the same event twice has the same effect as once", async () => {
      const event = makeStripeEvent({ id: "evt_test_1", type: "customer.subscription.created" });
      await handleEvent(event);
      await handleEvent(event); // second call should be a no-op
      const subs = await db.subscriptions.findMany();
      expect(subs.length).toBe(1);
    });
    
    test("concurrent processing of the same event still produces one effect", async () => {
      const event = makeStripeEvent({ id: "evt_test_2", type: "customer.subscription.created" });
      await Promise.all([handleEvent(event), handleEvent(event), handleEvent(event)]);
      const subs = await db.subscriptions.findMany();
      expect(subs.length).toBe(1);
    });

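    The first test catches a missing dedup check. The second catches check-then-insert races, where a select followed by a separate insert lets concurrent deliveries through. Neither catches a missing transaction. For that, add a third test that makes applyEffect throw on the first delivery and asserts that a retry of the same event still applies the effect.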

    Cleaning up processed_events

    The table grows forever if you don't prune. Stripe's automatic retries stop after about 3 days, so older rows no longer protect against redelivery. We keep 30 days of events for debugging and then trim:

    sql
    delete from processed_events
    where processed_at < now() - interval '30 days';

    Run nightly. Add a processed_at btree index if you have millions of events.

    Other webhook sources, same pattern

    This isn't Stripe-specific. The exact same pattern works for:

    • Telegram bot updates (use update_id)
    • WhatsApp Business API (use the message id)
    • GitHub webhooks (use the X-GitHub-Delivery header)
    • Slack events API (use event_id)
    • Shopify webhooks (use X-Shopify-Webhook-Id)
    • Discord interactions (use the interaction id)

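    Every one of them sends a unique id per event. Every one of them will retry. Every one of them needs this same pattern. Ids are only unique within a source, so if several sources share one processed_events table, prefix the id with the source or make the primary key (source, id).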
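
    As a sketch of how the dedup key might be extracted per source, the function below maps a delivery to a namespaced key using the ids listed above. The Delivery shape, the lower-cased header convention, and the source:id key format are assumptions made for this example, not anything the providers prescribe.

```typescript
type Source = "stripe" | "github" | "shopify" | "telegram" | "slack";

type Delivery = {
  headers: Record<string, string>; // header names lower-cased by the caller
  body: Record<string, unknown>;   // parsed JSON payload
};

// Returns a namespaced dedup key, or null if the expected id is missing.
function dedupKey(source: Source, d: Delivery): string | null {
  let id: unknown = undefined;
  switch (source) {
    case "stripe":
      id = d.body["id"];
      break;
    case "github":
      id = d.headers["x-github-delivery"];
      break;
    case "shopify":
      id = d.headers["x-shopify-webhook-id"];
      break;
    case "telegram":
      id = d.body["update_id"]; // numeric in Telegram updates
      break;
    case "slack":
      id = d.body["event_id"];
      break;
  }
  if (typeof id !== "string" && typeof id !== "number") return null;
  return `${source}:${id}`;
}
```

    Returning null rather than inventing a key lets the caller reject the delivery instead of silently skipping deduplication.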


    FAQ

    What if my handler is fast — do I still need this?

    Yes. "Fast" doesn't prevent retries. Network failures, client-side timeouts, infrastructure restarts — they all cause retries. The idempotency layer is the only correct defense.

    Can I use Redis instead of Postgres?

    You can, but you'd need to be very careful about the atomicity guarantee. SETNX in Redis works for the lock, but you've now got two systems to coordinate (the lock in Redis, the effect in Postgres). One transaction in Postgres is simpler and safer.

    What about Stripe's webhook signing — isn't that enough?

    Signing verifies the event came from Stripe. It doesn't deduplicate retries. Those are different layers — you need both.

    Does Stripe retry forever?

    No. Stripe retries for ~3 days on a backoff schedule. If your endpoint is down for 4 days, you'll miss events. Set up a monitor that alerts when webhook delivery fails and a backfill job that fetches recent events via the API on recovery.

    What's the right table column for the event id?

    Use text, not a short varchar. Stripe does not promise a fixed id length and advises allowing for ids of up to 255 characters. Make the column the primary key for the dedup constraint. Add processed_at for the cleanup job.

    How do I handle out-of-order events?

    Stripe (and most sources) don't guarantee order. Your handler must tolerate customer.subscription.deleted arriving before customer.subscription.updated. Pattern: always read the current state from Stripe's API at handler time rather than trusting the payload. The payload tells you what changed; the API tells you the truth.
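
    A minimal sketch of that pattern, assuming a hypothetical injected fetchCurrent function that retrieves the subscription from the provider's API by id. The Map stands in for the local subscriptions table.

```typescript
type SubscriptionState = { id: string; status: string };

// `fetchCurrent` is a hypothetical injected function; in a real handler it
// would call the provider's API to retrieve the subscription by id.
async function syncSubscription(
  store: Map<string, SubscriptionState>,
  subscriptionId: string,
  fetchCurrent: (id: string) => Promise<SubscriptionState>,
): Promise<void> {
  // The event only says which subscription changed. The state that gets
  // written comes from the authoritative source, so arrival order of the
  // events stops mattering.
  const current = await fetchCurrent(subscriptionId);
  store.set(current.id, current);
}
```

    If the subscription is already canceled upstream, a late updated event and an early deleted event both write the canceled state, in whichever order they arrive.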

    What if my handler depends on a slow downstream API?

    Don't do the slow work in the handler. In one transaction, insert into processed_events and enqueue a job, then return 200. The job does the slow work, with its own retries. This keeps your handler response time under 2 seconds.

    Should I version the event payload?

    If you serialize it into your warehouse for analytics, yes — schema changes over time. If you only use it for dedup, no — the unique id is all you need from a re-processing standpoint.

