Quick Answer
Webhooks fail when your endpoint is down, slow, or returns a non-200 response. The sending platform then retries, so your handler must be idempotent (safe to run twice), respond 200 immediately, and process asynchronously. Every production webhook handler needs signature verification, idempotency checks, async queue processing, and a dead letter queue for events that exhaust their retries.
Why webhook retries happen
When NexusProMail delivers a webhook event, it expects a 200 response within a short timeout. Anything else (a 500, a 4xx, or a timeout) is treated as a failure and retried with exponential backoff. Common causes are deployment downtime, slow database queries in the handler path, unhandled exceptions that return 500, and timeouts caused by synchronous processing.
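NexusProMail's actual retry schedule is not stated here, so the following is only a sketch of what exponential backoff generally looks like. The function name `backoff_delays` and the `base` and `cap` values are illustrative assumptions, not documented platform settings.

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 3600.0) -> list[float]:
    """Return the wait in seconds before each retry: base, 2*base, 4*base, ... up to cap.

    base and cap are illustrative defaults, not NexusProMail's documented values.
    """
    return [min(cap, base * 2 ** n) for n in range(attempts)]


print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Under a schedule like this the later retries are spaced far apart, so an endpoint that recovers quickly may still wait a while for redelivery.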
The fundamental rule: respond first, process second
The most common webhook mistake is doing slow work before returning 200. Slow work leads to timeouts, timeouts trigger retries, and retries cause duplicate processing.
- Verify the signature and reject immediately with 401 if it is invalid
- Push the event to a queue
- Return 200 as soon as the event is queued
- Process from the queue asynchronously
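The Flask example that follows stops at the enqueue step, so here is a framework-free sketch of the full split, including the background worker. It uses the standard library's in-memory `queue.Queue` as a stand-in for Redis. `handle_request`, `worker`, and `processed_ids` are hypothetical names chosen for illustration.

```python
import queue
import threading
import time

events: "queue.Queue[dict]" = queue.Queue()
processed_ids: list[str] = []


def handle_request(event: dict) -> int:
    """Stand-in for the HTTP handler: enqueue the event and acknowledge."""
    events.put(event)
    return 200


def worker() -> None:
    """Drain the queue in the background, where slow work cannot delay the response."""
    while True:
        event = events.get()
        time.sleep(0.05)  # simulated slow work (database writes, API calls)
        processed_ids.append(event["id"])
        events.task_done()


threading.Thread(target=worker, daemon=True).start()

status = handle_request({"id": "evt_1", "type": "bounced.hard"})
events.join()  # wait for the worker so the script can inspect the result
print(status, processed_ids)
```

The handler's response time depends only on the enqueue, not on the 50 ms of simulated work. In production the in-memory queue would be replaced by a durable one so queued events survive a restart.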
Python (Flask + Redis)
import hashlib, hmac, os, json
from flask import Flask, request, jsonify
import redis

app = Flask(__name__)
r = redis.Redis.from_url(os.environ["REDIS_URL"])
SECRET = os.environ["WEBHOOK_SECRET"]

def verify_sig(body: bytes, sig: str) -> bool:
    expected = hmac.new(SECRET.encode(), body, hashlib.sha256).hexdigest()
    # Compare as bytes so a non-ASCII header value cannot raise TypeError
    return hmac.compare_digest(expected.encode(), sig.encode())

@app.route("/webhooks/email", methods=["POST"])
def webhook():
    sig = request.headers.get("X-NexusProMail-Signature", "")
    if not verify_sig(request.data, sig):
        return jsonify({"error": "Invalid signature"}), 401
    r.lpush("webhook_queue", json.dumps(request.get_json()))
    return "", 200  # Respond BEFORE processing

def process_event(event: dict):
    # Called by a separate worker for each item popped from "webhook_queue"
    event_id = event.get("id")
    # SET NX EX claims the ID atomically and gives each ID its own 48h TTL
    if not r.set(f"event:{event_id}", "1", nx=True, ex=172800):
        return  # Already processed, so skip
    # suppress_address and update_subscription are application-specific helpers
    if event["type"] == "bounced.hard":
        suppress_address(event["data"]["email"])
    elif event["type"] == "complained":
        suppress_address(event["data"]["email"], reason="complaint")
    elif event["type"] == "unsubscribed":
        update_subscription(event["data"]["email"], False)

Node.js (Express + Bull)
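const crypto = require("crypto")
const express = require("express")
const Queue = require("bull")
const Redis = require("ioredis")

const app = express()
const redis = new Redis(process.env.REDIS_URL)
const webhookQueue = new Queue("webhooks", process.env.REDIS_URL)

function verifySignature(body, sig) {
  const expected = crypto.createHmac("sha256", process.env.WEBHOOK_SECRET).update(body).digest("hex")
  const a = Buffer.from(expected), b = Buffer.from(sig || "")
  return a.length === b.length && crypto.timingSafeEqual(a, b)
}

app.post("/webhooks/email", express.raw({ type: "application/json" }), (req, res) => {
  const sig = req.headers["x-nexuspromail-signature"]
  if (!verifySignature(req.body, sig)) return res.status(401).end()
  // Acknowledge once the event is queued; a queue outage becomes a retryable 500
  webhookQueue.add(JSON.parse(req.body), {
    attempts: 5,
    backoff: { type: "exponential", delay: 1000 }
  }).then(() => res.sendStatus(200), () => res.sendStatus(500))
})

webhookQueue.process(async (job) => {
  const event = job.data
  const already = await redis.get("event:" + event.id)
  if (already) return
  await processEvent(event) // processEvent is application-specific
  await redis.setex("event:" + event.id, 172800, "1")
})

Dead letter queue for exhausted retries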
Events that exhaust all retry attempts must not be silently dropped. Capture them for manual review or replay, and alert for critical event types (bounced.hard, complained).
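For readers following the Python example, the same rule could be sketched as below. This is a minimal in-memory illustration, not a production implementation: `run_with_dead_letter`, `dead_letters`, and `alerts` are hypothetical names, the attempt count of 5 is taken from the Bull configuration above, and a real worker would sleep with backoff between attempts and write to a durable store.

```python
from collections import deque

MAX_ATTEMPTS = 5
CRITICAL_TYPES = {"bounced.hard", "complained"}

dead_letters: deque = deque()  # stand-in for a durable dead letter queue
alerts: list[str] = []         # stand-in for an on-call alerting hook


def run_with_dead_letter(event: dict, handler, max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Try handler up to max_attempts times; dead-letter the event if every attempt fails."""
    last_error = None
    for _ in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:  # a real worker would back off before retrying
            last_error = exc
    dead_letters.append({"event": event, "error": str(last_error)})
    if event.get("type") in CRITICAL_TYPES:
        alerts.append("Critical webhook failed: " + event["type"])
    return False


def always_fails(event: dict) -> None:
    raise RuntimeError("database unavailable")


ok = run_with_dead_letter({"id": "evt_9", "type": "complained"}, always_fails)
print(ok, len(dead_letters), alerts)
```

With Bull, the equivalent routing can hang off the queue's failed event: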
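// "webhooks-dead-letter" is an example queue name
const deadLetterQueue = new Queue("webhooks-dead-letter", process.env.REDIS_URL)

webhookQueue.on("failed", async (job, err) => {
  // Bull emits "failed" on every failed attempt; act only on the final one
  if (job.attemptsMade >= job.opts.attempts) {
    await deadLetterQueue.add({ event: job.data, error: err.message })
    if (["bounced.hard", "complained"].includes(job.data.type)) {
      await alertOncall("Critical webhook failed: " + job.data.type) // alertOncall is application-specific
    }
  }
})

Checklist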
- Signature verification rejects invalid requests with 401
- Handler returns 200 before any processing
- Events pushed to queue on receipt
- Worker checks event ID for idempotency
- Processed IDs stored with 48h TTL
- Dead letter queue captures exhausted retries
- Critical event failures trigger alerts
For the full webhook event reference, see the email webhooks guide. For API integration context, see the API integration guide.