Idempotency keys: making job publishing safe to retry

A network blip between you and your queue can leave you unsure whether a job got enqueued. An idempotency key turns that ambiguous retry into a no-op — here's how publish-boundary dedup works and why your worker still needs to be idempotent.

TL;DR

When a publish request times out, you don't know whether the job was enqueued — retrying risks a duplicate, not retrying risks a lost job. An idempotency key resolves the ambiguity: the first publish with a given key creates the job, and every later publish with the same key returns the existing job id instead of creating a new one. That dedupes the publish boundary. It does not dedupe delivery — the queue is still at-least-once, so your worker must also be idempotent. The combination gives you exactly-once effects without pretending exactly-once delivery exists.

Publishing a job looks atomic from your code: you POST a request, you get back an id. But the network between you and the queue is not atomic, and the failure mode that bites everyone is the same: a request you can't tell succeeded or failed. This post is about that failure mode, the publish-boundary idempotency key that fixes it, and the part people forget: the key alone is not enough.

The ambiguous timeout

Here's the scenario. Your service receives an order, and it needs to enqueue a 'charge the customer' job. It POSTs to the queue and waits. Suppose the job is durably stored but the response never makes it back: the connection drops, a load balancer recycles, your process gets OOM-killed mid-flight, or the request just exceeds your client timeout. From where your code sits, you are now holding a request that might have succeeded and might not have.

Both obvious responses are wrong:

  • Retry blindly. If the first publish actually landed, you now have two 'charge the customer' jobs. The customer gets billed twice.
  • Give up. If the first publish didn't land, the job is gone. The customer never gets charged, and nobody notices until support does.

This isn't a rare edge case. Restarts, deploys, autoscaling, and transient network errors all produce it, and the more jobs you publish, the more often you hit it. Any publish path that you can't safely retry will either double-bill or drop work under load. The fix is to make retrying safe, and that's exactly what an idempotency key buys you.

How a publish-boundary key dedupes

An idempotency key is a string you send with the publish. The queue treats it as the identity of the publish operation, not of the job's contents. The protocol is simple:

  1. First publish with key K: the queue stores the job, records that K maps to that job id, and returns the id.
  2. Any subsequent publish with key K: the queue does not create a new job. It looks up K, finds the existing job, and returns that same job id.

From your caller's perspective, retrying after an ambiguous timeout is now free. If the first attempt landed, the retry returns the id of the job that already exists. If the first attempt didn't land, the retry creates it. Either way you end up with exactly one job and a job id in hand. You can wrap your publish in a retry loop and stop reasoning about whether the previous attempt got through.
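
To make the two-step protocol concrete, here is a minimal in-memory sketch. It is an illustrative model only, not SimpleQ's implementation: `InMemoryQueue`, `keyToJobId`, and the `job_N` id format are hypothetical names chosen for the example.

```typescript
// Hypothetical in-memory model of a publish endpoint with idempotency keys.
// Illustrative only; a real queue would persist both maps durably.
type Job = { id: string; payload: unknown };

class InMemoryQueue {
  private jobs = new Map<string, Job>(); // job id -> job
  private keyToJobId = new Map<string, string>(); // idempotency key -> job id
  private nextId = 1;

  publish(payload: unknown, idempotencyKey: string): string {
    // Subsequent publish with a known key: return the existing job id.
    const existing = this.keyToJobId.get(idempotencyKey);
    if (existing !== undefined) return existing;

    // First publish with this key: store the job and record the mapping.
    const id = `job_${this.nextId++}`;
    this.jobs.set(id, { id, payload });
    this.keyToJobId.set(idempotencyKey, id);
    return id;
  }

  get size(): number {
    return this.jobs.size;
  }
}

const queue = new InMemoryQueue();
const firstId = queue.publish({ orderId: "9482", amountCents: 4999 }, "order:9482:charge");
const retryId = queue.publish({ orderId: "9482", amountCents: 4999 }, "order:9482:charge");
console.log(firstId === retryId, queue.size); // true 1
```

In a real queue the lookup and the insert have to happen atomically (for example behind a unique index on the key), otherwise two concurrent publishes with the same key could both take the "first publish" branch.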

publish-charge.ts
ts
import { SimpleQ } from "@simpleq/sdk";

const sq = new SimpleQ({ apiKey: process.env.SIMPLEQ_API_KEY! });

async function enqueueCharge(orderId: string, amountCents: number) {
  // Deterministic, business-meaningful key: the SAME logical operation
  // always produces the SAME key, so a retry is a no-op.
  const idempotencyKey = `order:${orderId}:charge`;

  // Safe to retry: a network blip or restart between attempts
  // returns the existing job id instead of enqueuing a duplicate.
  const job = await sq.publish("payments", {
    payload: { orderId, amountCents },
    idempotencyKey,
  });

  return job.id; // same id on every retry with this key
}

The same thing over raw HTTP — the API is HTTP-first, so any language works. Note the key travels in the request body alongside the payload:

publish-charge.sh
bash
curl -X POST https://api.simpleq.io/v1/queues/payments/jobs \
  -H "Authorization: Bearer sq_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "payload": { "orderId": "9482", "amountCents": 4999 },
    "idempotencyKey": "order:9482:charge"
  }'

Generate the key before you try, not inside the retry

Compute the idempotency key once, upstream of your retry loop, and reuse it for every attempt. If you generate a fresh UUID inside the loop, each attempt carries a different key and the queue treats them as distinct publishes — you've reintroduced the duplicate you were trying to prevent. The key must be stable across the entire retry sequence for a single logical operation.
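
A retry wrapper that respects this rule might look like the sketch below. `publishWithRetry` and `PublishFn` are hypothetical names, and `flakyPublish` is a fake that stores the job but loses the first response to simulate the ambiguous timeout; none of this is part of the SimpleQ SDK.

```typescript
// Hypothetical retry helper. publishFn stands in for any publish call
// (an SDK method or an HTTP request); it is an assumption, not a real API.
type PublishFn = (idempotencyKey: string) => Promise<string>;

async function publishWithRetry(
  publishFn: PublishFn,
  idempotencyKey: string, // computed once by the caller, never regenerated here
  maxAttempts = 5,
  baseDelayMs = 100,
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Every attempt carries the SAME key.
      return await publishFn(idempotencyKey);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Exponential backoff between attempts.
        const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Fake queue: stores the job, but the first response is "lost".
const storedJobs = new Map<string, string>(); // idempotency key -> job id
let publishCalls = 0;
const flakyPublish: PublishFn = async (key) => {
  publishCalls++;
  if (!storedJobs.has(key)) storedJobs.set(key, `job_${storedJobs.size + 1}`);
  if (publishCalls === 1) throw new Error("timeout: response lost");
  return storedJobs.get(key)!;
};

const chargeKey = "order:9482:charge"; // built once, outside the loop
const jobIdPromise = publishWithRetry(flakyPublish, chargeKey, 3, 1);
jobIdPromise.then((id) => console.log(id, storedJobs.size)); // job_1 1
```

The key is a parameter rather than something the helper generates, which makes it structurally impossible to mint a fresh key per attempt.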

Publish-side dedup is not delivery-side dedup

This is the distinction that trips people up, so it's worth being precise. There are two separate boundaries where a duplicate can appear, and an idempotency key only guards one of them.

| Boundary | Duplicate source | Guarded by |
| --- | --- | --- |
| Publish | Retried POST after an ambiguous timeout / restart | Idempotency key (one job created per key) |
| Delivery | Ack timeout, nack-driven retry, redelivery | An idempotent worker (your code) |

The idempotency key collapses many publishes into one job. But that one job is still delivered at-least-once. SimpleQ is push-based: it POSTs the job to your own webhook and waits for one of three signals — ack for success, nack for failure (with a retryable flag), or defer for backpressure (with a retryAfter). If your worker is slow to ack and the ack timeout elapses, the same job is redelivered. If your worker nacks as retryable, the job comes back per your backoff policy. None of those redeliveries are duplicate publishes — they're the queue doing its job — and the idempotency key has nothing to say about them.

An idempotency key does not make delivery exactly-once

The key keeps duplicate jobs out of the queue. It does nothing about the same job arriving at your worker twice. If you wire up idempotent publishing and then assume each job runs exactly once, you've solved half the problem and shipped the other half to production. The worker must be idempotent too.
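
The gap is easy to see in a toy simulation. The sketch below models a single job (so publish-side dedup has already done its work) delivered to a worker whose first ack is lost. `deliverUntilAcked` and `naiveWorker` are hypothetical names, and the loop is a simplification of at-least-once delivery, not SimpleQ's internals.

```typescript
// Hypothetical simulation of at-least-once delivery for ONE job.
type DeliverySignal = "ack" | "timeout";
type Handler = (jobId: string) => DeliverySignal;

// Deliver the job until the handler acks, up to maxDeliveries.
function deliverUntilAcked(jobId: string, handler: Handler, maxDeliveries: number): number {
  let count = 0;
  while (count < maxDeliveries) {
    count++;
    if (handler(jobId) === "ack") break;
    // "timeout": the queue never heard back, so it redelivers.
  }
  return count;
}

let chargesMade = 0;
let deliveriesSeen = 0;

// Naive worker: performs the effect on every delivery; its first ack is lost.
const naiveWorker: Handler = () => {
  deliveriesSeen++;
  chargesMade++; // the side effect is not guarded
  return deliveriesSeen === 1 ? "timeout" : "ack";
};

const totalDeliveries = deliverUntilAcked("job_1", naiveWorker, 5);
console.log(totalDeliveries, chargesMade); // 2 2
```

One job, two deliveries, two charges: the idempotency key never had a chance to intervene, because no second publish ever happened.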

At-least-once delivery, exactly-once effects

It's tempting to ask for exactly-once delivery. It doesn't exist over an unreliable network — there is always a moment where the queue has handed you the job but doesn't yet know whether you processed it, and if the connection dies in that window it must either redeliver (risking a duplicate) or not (risking a loss). Every honest queue chooses at-least-once and pushes the final dedup to you.

What you actually want is exactly-once effects: no matter how many times a job is published or delivered, the customer is charged once, the email is sent once, the row is inserted once. You get there with two layers:

  • Publish layer — idempotency key. Dedupes duplicate enqueues so the queue holds one job per logical operation.
  • Effect layer — idempotent worker. Dedupes duplicate executions so the side effect lands once, no matter how many times the job is delivered.

Make the worker idempotent where the effect actually lands — in your database, not in the queue. The two standard patterns:

  1. Unique constraint. Insert a row keyed by the operation (e.g. a charges table with a unique index on order_id). A duplicate delivery hits the constraint, you catch it, and you ack the job as already-done.
  2. Processed-id ledger. Before doing the work, record the job id (or the business key) in a processed_jobs table inside the same transaction as the effect. On redelivery you see it's processed and ack immediately.

worker.ts
ts
// Your webhook. SimpleQ POSTs the job here and waits for ack/nack/defer.
// app, db, chargeCustomer, isUniqueViolation, ackJob, and nackJob are
// application-level helpers assumed to be defined elsewhere.
app.post("/jobs/payments", async (req, res) => {
  const { orderId, amountCents } = req.body.payload;

  try {
    // Unique constraint on order_id makes the effect idempotent:
    // a redelivery throws instead of charging twice.
    await db.charges.insert({ orderId, amountCents });
    // Caveat: if chargeCustomer fails after the insert, the row remains and a
    // redelivery is acked as done. Roll the insert back on failure, or track
    // a status column, so a failed charge is retried rather than skipped.
    await chargeCustomer(orderId, amountCents);
  } catch (err) {
    if (isUniqueViolation(err)) {
      // Already charged on a prior delivery; treat as success.
      return ackJob(req.body.id);
    }
    // Transient failure: report it and let SimpleQ retry with backoff.
    return nackJob(req.body.id, { retryable: true });
  }

  return ackJob(req.body.id);
});

Whether the work is charging a card, calling OpenAI's gpt-4o-mini, or calling Anthropic's claude-sonnet-4-6, the shape is identical: the queue may deliver more than once, so the effect must be guarded where it lands. The idempotency key handles the front door; your unique constraint handles the back door.
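
The worker above uses the unique-constraint pattern. For comparison, here is a rough sketch of the processed-id ledger pattern. It uses an in-memory `Set` and `Map` as hypothetical stand-ins for a `processed_jobs` table and a `charges` table; in a real system both writes would sit inside one database transaction so they commit or roll back together.

```typescript
// Hypothetical in-memory stand-ins for database tables.
const processedJobs = new Set<string>(); // processed_jobs ledger
const chargesByOrder = new Map<string, number>(); // orderId -> total charged (cents)

type ChargeJob = { id: string; payload: { orderId: string; amountCents: number } };

function handleChargeJob(job: ChargeJob): "ack" {
  // Redelivery: the ledger already has this job id, so skip the effect.
  if (processedJobs.has(job.id)) return "ack";

  // "Transaction": apply the effect and record the ledger entry together.
  const { orderId, amountCents } = job.payload;
  chargesByOrder.set(orderId, (chargesByOrder.get(orderId) ?? 0) + amountCents);
  processedJobs.add(job.id);
  return "ack";
}

const chargeJob: ChargeJob = { id: "job_1", payload: { orderId: "9482", amountCents: 4999 } };
handleChargeJob(chargeJob); // first delivery
handleChargeJob(chargeJob); // redelivery after a lost ack
console.log(chargesByOrder.get("9482")); // 4999
```

The ledger is keyed by job id here; keying it by the business key instead (the order id) would also protect against duplicates that slipped past the publish boundary.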

Choosing good keys

A good idempotency key has two properties: it's deterministic (the same logical operation always produces the same key) and it's collision-free (two genuinely different operations never share a key). Get either wrong and the key works against you.

| Key | Verdict | Why |
| --- | --- | --- |
| order:9482:charge | Good | Deterministic and business-meaningful; one charge per order |
| invoice:2026-06:user-771 | Good | Scoped to a logical billing operation |
| webhook:evt_abc123 | Good | Derived from the upstream event id you're reacting to |
| crypto.randomUUID() | Bad | Changes on every retry, which defeats dedup entirely |
| Date.now().toString() | Bad | Non-deterministic; two retries get two keys |
| user-771 | Bad | Too coarse: every job for that user collides |

The instinct to reach for a random UUID is the most common mistake, because UUIDs feel 'unique' — but uniqueness per call is the opposite of what you want here. You want uniqueness per logical operation, stable across retries of that operation. Derive the key from the thing the job is about: the order, the invoice, the upstream event, the user-plus-action. If you're reacting to a webhook from another system, that system's event id is often the perfect key.

When in doubt, name it after the effect

If you can write down the sentence 'this should happen at most once per ___', the blank is your key scope. 'At most once per order' → key on the order. 'At most once per invoice per month' → key on invoice plus month. Naming the key after the effect you're protecting keeps it both deterministic and correctly scoped.
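
One way to keep keys deterministic and correctly scoped is to build them through small named helpers rather than ad hoc string concatenation. The helpers below are hypothetical; the key formats simply mirror the "Good" rows in the table above.

```typescript
// Hypothetical key builders: one function per "at most once per ___" scope.
function chargeKey(orderId: string): string {
  // At most once per order.
  return `order:${orderId}:charge`;
}

function invoiceKey(userId: string, billingMonth: string): string {
  // At most once per user per month; billingMonth is expected as "YYYY-MM".
  return `invoice:${billingMonth}:user-${userId}`;
}

function webhookKey(upstreamEventId: string): string {
  // At most once per upstream event.
  return `webhook:${upstreamEventId}`;
}

console.log(chargeKey("9482") === chargeKey("9482")); // true  (stable across retries)
console.log(chargeKey("9482") === chargeKey("9483")); // false (distinct operations)
console.log(invoiceKey("771", "2026-06")); // invoice:2026-06:user-771
console.log(webhookKey("evt_abc123")); // webhook:evt_abc123
```

Because each helper takes only business identifiers as input, there is no place for a timestamp or random value to sneak in.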

Where it fits with retries and DLQs

Idempotent publishing is one piece of a reliable pipeline; it pairs with the queue's own delivery guarantees rather than replacing them. The full picture:

  • The idempotency key makes the publish safe to retry — you never enqueue the same logical job twice.
  • Retries (configurable exponential or fixed backoff, up to maxAttempts of 20) handle transient failures on the worker side after delivery. See why job retries matter.
  • Backpressure keeps a job alive through a rate limit: a downstream 429/503/529 with a Retry-After is deferred and redelivered, with no attempt burned.
  • The dead-letter queue catches jobs that exhaust their attempts, with single and bulk replay so nothing is silently lost. See dead-letter queues explained.

Notice the division of labor. The idempotency key is the only one of these that lives at the publish boundary — everything else operates after the job is durably stored. That's why it's the first thing to get right: if your publish path can double-enqueue, no amount of careful retry-and-DLQ handling downstream will save you, because you'll be faithfully processing two copies of the same work.
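
The post-delivery pieces can be pictured as one small decision function. The sketch below is a simplified model under stated assumptions, not SimpleQ's actual logic: the type and field names (`WorkerSignal`, `retryAfterMs`, `attemptsUsed`, `baseDelayMs`) are hypothetical, and sending a non-retryable nack straight to the dead-letter queue is an assumption made for the example.

```typescript
// Hypothetical model of what happens after a delivery, based on the worker's signal.
type WorkerSignal =
  | { kind: "ack" }
  | { kind: "nack"; retryable: boolean }
  | { kind: "defer"; retryAfterMs: number };

type Policy = { maxAttempts: number; baseDelayMs: number; backoff: "exponential" | "fixed" };

type NextStep =
  | { action: "done" }
  | { action: "redeliver"; delayMs: number; attemptsUsed: number }
  | { action: "dead-letter" };

function nextStep(signal: WorkerSignal, attemptsUsed: number, policy: Policy): NextStep {
  if (signal.kind === "ack") return { action: "done" };

  if (signal.kind === "defer") {
    // Backpressure: wait as asked and do NOT burn an attempt.
    return { action: "redeliver", delayMs: signal.retryAfterMs, attemptsUsed };
  }

  // nack: this delivery counts as a failed attempt.
  const used = attemptsUsed + 1;
  if (!signal.retryable || used >= policy.maxAttempts) {
    return { action: "dead-letter" }; // assumption: non-retryable goes straight to the DLQ
  }
  const delayMs =
    policy.backoff === "fixed" ? policy.baseDelayMs : policy.baseDelayMs * Math.pow(2, used - 1);
  return { action: "redeliver", delayMs, attemptsUsed: used };
}

const policy: Policy = { maxAttempts: 20, baseDelayMs: 1000, backoff: "exponential" };
const afterThirdFailure = nextStep({ kind: "nack", retryable: true }, 2, policy);
console.log(afterThirdFailure); // redeliver with delayMs 4000 and attemptsUsed 3
```

Every branch here runs after the job is already stored, which is why none of it can compensate for a publish path that enqueued the job twice.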

A short checklist

Before you call a publish path production-ready:

  1. Every publish carries an idempotencyKey derived from the business operation, computed once above your retry loop.
  2. The publish is wrapped in a retry loop that's safe to run on an ambiguous timeout.
  3. The worker guards its side effect with a unique constraint or a processed-id ledger; it never assumes exactly-once delivery.
  4. Keys are scoped to the effect ('at most once per ___'), not to the call and not to the whole user.
  5. Failures that exhaust attempts land in the DLQ where you can inspect and replay them.

Get those five right and the ambiguous timeout stops being scary. Retrying a publish becomes a no-op when it should be, and a recovery when it should be, and you never have to reason about which one happened.

SimpleQ implements publish-boundary idempotency keys, configurable retries with backoff, backpressure via defer, and a replayable dead-letter queue — you POST a job over HTTP and it durably delivers to your own worker. See the use cases for end-to-end examples of safe, idempotent job pipelines.

Frequently asked questions

What is an idempotency key?

An idempotency key is a string you attach to a publish request so that retrying the same request doesn't create a second job. The first POST with a given key creates the job and records the key; any later POST with the same key returns the id of the job that already exists instead of enqueuing a new one. It makes the publish operation safe to retry after a timeout or a crash.