What is the difference between ack, nack, and defer?

They are the three terminal signals in SimpleQ's ack protocol. POST /ack means the job succeeded and is done. POST /nack means it failed; include a retryable flag so the queue knows whether to retry with backoff or send it straight to the dead-letter queue. POST /defer means you hit backpressure and want the job redelivered later without burning an attempt — you pass a retryAfter in seconds.

What is ackTimeout and what happens when it expires?

ackTimeout is the deadline for reporting an outcome after you've returned 200. If you never call /ack, /nack, or /defer before it elapses, the worker is assumed dead and ackTimeoutAction decides what happens: redeliver the job (the default, treated like a retry) or send it to the dead-letter queue. The anthropic template sets a 600-second ackTimeout and the openai template sets 300 seconds.

When should I use standard mode instead of ack mode?

Use standard mode when the work reliably completes inside the ~15-second webhook timeout: fast database writes, cache warms, small synchronous API calls, fan-out enqueues. It's simpler — you return 200 on success or a non-2xx on failure and there's no second call to make. Reach for ack mode only when the work can genuinely exceed the timeout, like long LLM generations or video rendering.

Why do I need idempotency with ack mode?

Because work can be redelivered. If your endpoint returns 200, starts a 90-second generation, then crashes before calling /ack, the ackTimeout fires and the job is redelivered to a fresh worker. Without idempotency you'd run the expensive work twice. Make the handler safe to re-run — check whether the result already exists keyed by job id before starting, and use idempotencyKey at publish time to dedupe duplicate publishes.

Running jobs longer than a webhook timeout: ack mode

TL;DR

A synchronous webhook has a hard ~15-second ceiling, but LLM generations, video work, and slow third-party calls run longer. Ack mode fixes this: return 200 to confirm receipt, then report the real outcome out of band via POST /ack (success), /nack (failure, with a retryable flag), or /defer (backpressure, with retryAfter). ackTimeout sets the reporting deadline and ackTimeoutAction decides what happens if you miss it. Because work can be redelivered, make handlers idempotent. Use standard mode when work fits inside the timeout; use ack mode when it can't.

Push-based queues deliver work by making an HTTP request to your endpoint. That's clean and simple — until the work takes longer than the request is allowed to live. SimpleQ's standard delivery mode enforces a hard 15-second webhook timeout. A Claude or OpenAI generation, a video transcode, or a slow partner API can blow through that before it's anywhere near done. This post is about the mechanism that solves it: ack mode.

The 15-second wall

In standard delivery mode, the contract is synchronous and simple. SimpleQ POSTs the job to your webhook, your handler does the work, and you return a status code before the 15-second timeout expires. A 2xx means success. A non-2xx (or a timeout) means failure, and the job retries according to your queue's backoff policy.

That works beautifully for fast work. It falls apart for slow work. Consider what actually runs longer than 15 seconds:

LLM generations. A long completion from claude-sonnet-4-6 or gpt-4o-mini with a large output can take 30-120 seconds, and reasoning-heavy prompts go further.
Media work. Video transcoding, rendering, and thumbnail extraction routinely run for minutes.
Slow third-party calls. Document processing, payment settlement, KYC checks, and bulk imports against partner APIs that are themselves slow.
Chained external steps. A single job that calls two or three upstreams in sequence, each with its own latency.

The wrong fixes are tempting. You could hold the HTTP connection open and hope nothing times out — but proxies, load balancers, and the platform's own ceiling will cut you off. You could fire-and-forget from inside the handler and return 200 immediately — but then a crash mid-work silently loses the job, because the queue already saw your 200 and considers it delivered. Ack mode is the fix that keeps durability.

What ack mode actually does

Ack mode decouples receipt from outcome. Your endpoint does two things at two different times:

1. Acknowledge receipt fast. When SimpleQ POSTs the job, your handler returns 200 quickly; it's just saying "I have this, I'm on it." Kick the actual work onto a background task and return.
2. Report the outcome later. When the work finishes, seconds or minutes later, you make a second HTTP call back to SimpleQ telling it what happened.

That second call is one of three signals against the job id. This is the ack protocol, and it's the same vocabulary whether the work took 200 milliseconds or 9 minutes:

Signal	Endpoint	Meaning	Body
ack	POST /v1/jobs/:id/ack	Job succeeded, it's done	(none required)
nack	POST /v1/jobs/:id/nack	Job failed	{ retryable: true | false }
defer	POST /v1/jobs/:id/defer	Backpressure — redeliver later, no attempt burned	{ retryAfter: <seconds> }

The retryable flag on /nack is load-bearing. retryable: true sends the job back through your queue's backoff policy for another attempt. retryable: false is a permanent failure — the job goes straight to the dead-letter queue without wasting more attempts on something that won't succeed (a malformed payload, a 400 from upstream, a content-policy rejection). For more on getting retry classification right, see why job retries matter.
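
The three signals in the table above can be sketched as one small request builder. This is a minimal illustration, not the SDK's API: the `Signal` type, `SignalRequest` shape, and `buildSignalRequest` name are hypothetical, while the paths and body fields come from the table and the base URL from the examples in this post.

```typescript
// One variant per signal; the body fields mirror the table above.
type Signal =
  | { kind: "ack" }
  | { kind: "nack"; retryable: boolean }
  | { kind: "defer"; retryAfter: number }; // seconds

interface SignalRequest {
  url: string;
  method: "POST";
  body?: string;
}

// Base URL as used in this post's examples.
const SIMPLEQ_BASE = "https://api.simpleq.io";

// Hypothetical helper: turns a job id and a signal into the request to send.
function buildSignalRequest(jobId: string, signal: Signal): SignalRequest {
  const url = `${SIMPLEQ_BASE}/v1/jobs/${encodeURIComponent(jobId)}/${signal.kind}`;
  switch (signal.kind) {
    case "ack":
      return { url, method: "POST" }; // no body required
    case "nack":
      return { url, method: "POST", body: JSON.stringify({ retryable: signal.retryable }) };
    case "defer":
      return { url, method: "POST", body: JSON.stringify({ retryAfter: signal.retryAfter }) };
  }
}
```

Pass the result to `fetch(url, { method, headers, body })` with your Authorization header. Modelling the signals as a union means a nack cannot be built without deciding on `retryable`.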

Defer is not failure

If your downstream returns a 429, 503, or 529 with a Retry-After, don't nack it; defer it. POST /defer with retryAfter set to the upstream's wait, and SimpleQ redelivers the job later without burning an attempt. Because a deferral doesn't use up an attempt, a job can ride out a sustained rate limit and still complete within its maxAttempts budget. This is covered in depth in handling 429, 503, and 529 backpressure.
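
Turning the upstream's Retry-After into a retryAfter value takes a little care, because the header may carry either a number of seconds or an HTTP-date. The sketch below handles both. The function name and the 30-second fallback are assumptions, the fallback chosen to match the handler later in this post.

```typescript
// Convert an upstream Retry-After header into whole seconds for /defer.
// The header is either delay-seconds ("120") or an HTTP-date.
function retryAfterSeconds(
  header: string | null,
  nowMs: number = Date.now(),
  fallbackSeconds = 30, // assumption: default wait when the header is missing or unreadable
): number {
  if (header === null || header.trim() === "") return fallbackSeconds;
  const value = header.trim();

  // Plain non-negative integer: already a delay in seconds.
  if (/^\d+$/.test(value)) return Number(value);

  // Otherwise try to read it as a date and measure the gap from "now".
  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return fallbackSeconds;

  // A date in the past means "retry now"; never return a negative delay.
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}
```

Taking `nowMs` as a parameter keeps the function deterministic, which makes the date branch easy to unit test.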

ackTimeout and ackTimeoutAction

Once you've returned 200, SimpleQ can no longer tell whether your worker is busy or dead. So ack mode adds a deadline: ackTimeout. It's the maximum time the queue will wait for one of the three signals after delivery. If you call /ack, /nack, or /defer before it elapses, everything proceeds normally. If you don't, the worker is presumed dead and ackTimeoutAction takes over.

Setting	What it controls	Typical value
ackTimeout	How long to wait for an outcome after delivery	300-600s, sized to your worst-case run
ackTimeoutAction = retry	On timeout, redeliver the job (counts like a retry)	Default — survives worker crashes
ackTimeoutAction = dead	On timeout, send straight to the DLQ	When a stuck job should never auto-retry

Size ackTimeout to your realistic worst case, not your average. If a generation usually takes 40 seconds but a long one can hit four minutes, a 300-second ackTimeout gives you headroom; setting it to 60 seconds would redeliver healthy long-running jobs and double your work. The two built-in templates encode sensible defaults: the anthropic template uses a 600-second ackTimeout (Claude generations can be long), and the openai template uses 300 seconds.
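
The sizing rule can be sketched as a small calculation over observed run durations. Both function names and the 25% headroom factor are assumptions for illustration; SimpleQ does not prescribe a formula.

```typescript
// Suggest an ackTimeout (seconds) from observed run durations (seconds).
// Assumption: 25% headroom over the slowest observed run; tune for your workload.
function suggestAckTimeout(durationsSec: number[], headroom = 1.25): number {
  if (durationsSec.length === 0) {
    throw new Error("need at least one observed duration");
  }
  const worstCase = Math.max(...durationsSec); // size to the worst case, not the average
  return Math.ceil(worstCase * headroom);
}

// True when a healthy job of this duration would still be running at the
// deadline, and so would be treated as a dead worker.
function wouldMissDeadline(durationSec: number, ackTimeoutSec: number): boolean {
  return durationSec > ackTimeoutSec;
}
```

With typical runs of 40 seconds and a slowest run of 240 seconds, `suggestAckTimeout([40, 55, 240])` gives 300, while `wouldMissDeadline(240, 60)` shows why a 60-second ackTimeout would redeliver that healthy job.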

Templates are starting points

Creating a queue from the anthropic or openai template wires up ack mode with a sensible ackTimeout, retry backoff, and rate-limit defaults for that provider. Override any field at queue-creation time — the template just saves you from setting six knobs by hand.

Idempotency for redelivered work

Ack mode introduces a possibility standard mode never had: the same job can run more than once. If your endpoint returns 200, starts a 90-second generation, and then the process crashes before calling /ack, the ackTimeout eventually fires and (with the default redeliver action) the job is delivered again to a fresh worker. That's exactly the durability you want — but it means the expensive work could run twice unless your handler is idempotent.

Two layers protect you, and you want both:

Publish-boundary idempotency. Pass an idempotencyKey when you publish the job. SimpleQ dedupes publishes that share a key, so a double-publish (your own retry of the enqueue call) doesn't create two jobs.
Handler-side idempotency. Make the work itself safe to re-run. Before starting, check whether a result already exists keyed by the job id; if it does, skip the work and just /ack. The job id is stable across redeliveries, so it's a natural idempotency key for the result.

Here's the shape of an ack-mode handler that's safe under redelivery. It acknowledges receipt immediately, runs the work in the background, and reports the real outcome — defer on backpressure, nack with retryable on failure, ack on success:

app/api/worker/route.ts

import { verifySignature } from "@simpleq/sdk";

const SIMPLEQ = "https://api.simpleq.io";
const headers = {
  Authorization: `Bearer ${process.env.SIMPLEQ_KEY}`,
  "Content-Type": "application/json",
};

// callUpstream, resultExists, saveResult, and isTransient are your own helpers.

export async function POST(req: Request) {
  const raw = await req.text();
  // Verify HMAC-SHA256 over the raw body before trusting anything.
  verifySignature(raw, req.headers.get("x-simpleq-signature"), process.env.SIMPLEQ_SECRET!);

  const { id, payload } = JSON.parse(raw);

  // Ack mode: return 200 immediately, do the slow work out of band.
  // Catch here so a failure to report the outcome is logged rather than
  // surfacing as an unhandled rejection.
  process.nextTick(() => {
    runJob(id, payload).catch((err) => console.error(`job ${id}: could not report outcome`, err));
  });
  return new Response("ok", { status: 200 });
}

async function runJob(id: string, payload: any) {
  // Handler-side idempotency: never redo finished work after a redelivery.
  if (await resultExists(id)) {
    await fetch(`${SIMPLEQ}/v1/jobs/${id}/ack`, { method: "POST", headers });
    return;
  }

  try {
    const res = await callUpstream(payload); // may run for minutes

    if (res.status === 429 || res.status === 503 || res.status === 529) {
      // Backpressure: defer, don't fail. No attempt is burned.
      // Retry-After may be missing or an HTTP-date; fall back to 30s in that case.
      const parsed = Number(res.headers.get("retry-after"));
      const retryAfter = Number.isFinite(parsed) && parsed > 0 ? parsed : 30;
      await fetch(`${SIMPLEQ}/v1/jobs/${id}/defer`, {
        method: "POST", headers, body: JSON.stringify({ retryAfter }),
      });
      return;
    }

    // Any other non-2xx is a failure to classify, not a result to save.
    if (!res.ok) {
      throw Object.assign(new Error(`upstream returned ${res.status}`), { status: res.status });
    }

    await saveResult(id, await res.json());
    await fetch(`${SIMPLEQ}/v1/jobs/${id}/ack`, { method: "POST", headers });
  } catch (err) {
    const retryable = isTransient(err); // 5xx / network → true; 400 / policy → false
    await fetch(`${SIMPLEQ}/v1/jobs/${id}/nack`, {
      method: "POST", headers, body: JSON.stringify({ retryable }),
    });
  }
}

The resultExists check is what makes a redelivered job cheap instead of a duplicate generation. Persist the result under the job id the moment the work completes, and the second delivery short-circuits to a clean /ack.
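
The handler above leaves resultExists, saveResult, and isTransient undefined. A minimal sketch of what they might look like follows. These are hypothetical stand-ins, not part of SimpleQ: the in-memory Map only works within a single process, so a real deployment would back it with durable shared storage, and the error classification assumes failures carry a numeric `status` field.

```typescript
// In-memory stand-in for a result store. Hypothetical: use a database row or
// object-store key in production, since a redelivery may land on another process.
const results = new Map<string, unknown>();

async function resultExists(jobId: string): Promise<boolean> {
  return results.has(jobId);
}

async function saveResult(jobId: string, result: unknown): Promise<void> {
  // Keyed by job id, which is stable across redeliveries.
  results.set(jobId, result);
}

// Classify a failure for /nack. Assumes upstream errors carry a numeric
// `status`; anything without one (for example a network error) counts as transient.
function isTransient(err: unknown): boolean {
  const status = (err as { status?: unknown } | null | undefined)?.status;
  if (typeof status === "number") return status >= 500; // 5xx retries, 4xx does not
  return true;
}
```

Treating unknown errors as transient is a deliberate lean toward retrying; the queue's maxAttempts still bounds how many times that can happen.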

Standard mode vs ack mode: choosing

Ack mode is more capable, but it also has more moving parts: a second HTTP call, an ackTimeout to size, and idempotency to get right. Don't reach for it when standard mode would do. The deciding question is simple: can the work reliably finish inside the 15-second webhook timeout?

Workload	Mode	Why
Fast DB write, cache warm, fan-out enqueue	Standard	Completes well under 15s; the synchronous 200/non-2xx contract is enough
Small synchronous API call	Standard	Predictably fast; no need for a second call
LLM generation (Claude, OpenAI)	Ack	Often exceeds 15s; report outcome when the completion lands
Video transcode / render	Ack	Runs for minutes; must report out of band
Slow partner API / document pipeline	Ack	Upstream latency is unpredictable and can exceed the timeout

A practical rule: start in standard mode. If you see jobs failing on timeout — not on logic errors, but on the 15-second ceiling — that's the signal to move that queue to ack mode. Both modes share the same retry, backoff, rate-limit, and dead-letter machinery underneath; only the delivery contract changes.

Setting up an ack-mode queue

Create the queue from the anthropic template (which sets ack mode plus a 600-second ackTimeout for you) or specify the ack-mode fields explicitly. Then publish jobs to it exactly as you would any other queue:

create-and-publish.sh

# Create an ack-mode queue from the anthropic template (600s ackTimeout).
curl -X POST https://api.simpleq.io/v1/queues \
  -H "Authorization: Bearer sq_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "long-generations",
    "template": "anthropic",
    "webhookUrl": "https://your-app.com/api/worker",
    "ackTimeoutAction": "retry"
  }'

# Publish a job. idempotencyKey dedupes the publish; the work runs out of band.
curl -X POST https://api.simpleq.io/v1/queues/long-generations/jobs \
  -H "Authorization: Bearer sq_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "model": "claude-sonnet-4-6",
      "messages": [{ "role": "user", "content": "Write a detailed report on..." }]
    },
    "idempotencyKey": "report_user-123_abc"
  }'

The official TypeScript SDK — @simpleq/sdk on npm — wraps all of this, including signature verification and the ack/nack/defer calls. The API is HTTP-first underneath, so any language that can make a request and POST back an outcome works the same way. You can always inspect a job's state with GET /v1/jobs/:id while you're wiring things up.
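
For readers calling the HTTP API directly from TypeScript rather than through the SDK, the publish call from the curl example can be sketched as a plain request builder. The `buildPublishRequest` name and `PublishRequest` shape are hypothetical; the URL, headers, and body fields are taken from the curl example above.

```typescript
interface PublishRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper mirroring the curl publish call in this post.
function buildPublishRequest(
  queue: string,
  payload: unknown,
  idempotencyKey: string,
  apiKey: string,
): PublishRequest {
  return {
    url: `https://api.simpleq.io/v1/queues/${encodeURIComponent(queue)}/jobs`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // Reuse the same key when retrying the enqueue so the publish is deduped.
      body: JSON.stringify({ payload, idempotencyKey }),
    },
  };
}
```

Send it with `const { url, init } = buildPublishRequest(...); await fetch(url, init);`. Derive the key from stable inputs, such as the user and request ids, rather than generating a random value per attempt; otherwise a retried enqueue would not share a key with the first one.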

Summary

The 15-second webhook timeout is a feature, not a bug — it keeps fast work honest. Ack mode is the escape hatch for the work that genuinely can't fit: confirm receipt with a fast 200, run the slow work out of band, then report the real outcome with /ack, /nack (with the retryable flag), or /defer (with retryAfter). Size ackTimeout to your worst case, pick ackTimeoutAction deliberately, and make handlers idempotent so a redelivered job is cheap rather than duplicated.

If you'd rather not build the receipt-then-report plumbing, the dead-worker detection, and the redelivery loop yourself, SimpleQ gives you ack mode out of the box — durable acceptance, three-signal acks, ackTimeout-based crash recovery, idempotent publishes, and a dead-letter queue with replay. See the ack-mode processing use case for an end-to-end example.

Frequently asked questions

How do I run a job that takes longer than the webhook timeout?

Use ack mode. Your endpoint returns 200 fast to confirm it received the job, then keeps working in the background. When the work finishes, which can be minutes later, you POST the real outcome to /v1/jobs/:id/ack, /nack, or /defer. The delivery no longer has to complete inside the synchronous request, so the ~15-second HTTP ceiling stops mattering.
