Why Background Jobs Matter for Shopify Apps
Shopify apps frequently need to do work outside the request-response cycle. Webhook handlers must return a 200 within five seconds, but the actual processing (syncing inventory to a warehouse, generating reports, sending notification emails) takes much longer. Without a queue, you either block the response (risking webhook failures and automatic retries from Shopify) or fire and forget with no guarantee the work completes.
BullMQ paired with Redis solves this cleanly. You push a job onto a queue with its payload, return 200 immediately, and a separate worker process picks up the job and executes it. If the worker crashes mid-processing, the job stays in Redis and gets retried automatically. If you deploy a new version, pending jobs survive because they live in Redis, not in application memory.
BullMQ Architecture
BullMQ is built on three components: the Queue (where jobs are pushed), the Worker (where jobs are processed), and Redis (the persistent broker connecting them). Jobs survive app restarts because they're stored in Redis, not process memory. Multiple workers can consume from the same queue for horizontal scaling, and each worker runs in its own Node.js process or container.
This separation means your web server never blocks on heavy computation. It enqueues work and moves on. The worker picks up jobs at its own pace, respecting concurrency limits you define.
```javascript
import { Queue, Worker } from 'bullmq';
import Redis from 'ioredis';

const connection = new Redis({
  host: process.env.REDIS_HOST || 'localhost',
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  maxRetriesPerRequest: null,
});

const orderSyncQueue = new Queue('order-sync', { connection });

const worker = new Worker('order-sync', async (job) => {
  const { orderId, shopDomain } = job.data;
  await syncOrderToWarehouse(orderId, shopDomain);
  await updateOrderMetafield(orderId, { syncedAt: new Date().toISOString() });
  return { success: true, orderId };
}, {
  connection,
  concurrency: 5,
});

worker.on('completed', (job, result) => {
  console.log(`Job ${job.id} completed:`, result);
});

worker.on('failed', (job, err) => {
  console.error(`Job ${job.id} failed:`, err.message);
});
```
The `concurrency: 5` setting means this worker processes up to five jobs simultaneously. The `maxRetriesPerRequest: null` option on the Redis connection is required by BullMQ: it disables the per-command retry limit that ioredis sets by default, letting BullMQ handle reconnection logic itself.
Cron-Like Scheduling
BullMQ supports repeatable jobs with cron expressions, giving you true scheduled task behavior without a separate cron daemon or external scheduler. A job defined with a cron pattern fires on schedule, and BullMQ ensures only one instance runs even if you scale to multiple worker replicas.
```javascript
await orderSyncQueue.add('nightly-inventory-sync',
  { type: 'full-sync' },
  {
    repeat: {
      pattern: '0 2 * * *',
      tz: 'UTC',
    },
    jobId: 'nightly-inventory-sync',
  }
);

await orderSyncQueue.add('hourly-price-check',
  { type: 'price-check' },
  {
    repeat: {
      pattern: '0 * * * *',
    },
    jobId: 'hourly-price-check',
  }
);
```
Always set a jobId for repeatable jobs. Without it, BullMQ creates a new repeatable job entry every time your app restarts and re-registers the schedule. With a jobId, it recognizes the existing schedule and skips duplicate registration.
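To keep that rule in one place, schedule registration can be wrapped in a small helper. This is a sketch, not a BullMQ API: `registerSchedule` is a name I've chosen, and the queue is passed in so the function can be exercised without a live Redis connection.

```javascript
// Hypothetical helper: register a repeatable job under a stable jobId so app
// restarts re-register the same schedule instead of creating duplicates.
async function registerSchedule(queue, name, data, pattern, tz = 'UTC') {
  return queue.add(name, data, {
    repeat: { pattern, tz },
    jobId: name, // stable ID: re-registration is recognized, not duplicated
  });
}

// Usage:
// await registerSchedule(orderSyncQueue, 'nightly-inventory-sync',
//   { type: 'full-sync' }, '0 2 * * *');
```

Using the job name as the `jobId` works as long as each schedule has a unique name, which also keeps the schedule easy to find in a dashboard.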
Common cron patterns for Shopify app automation:
| Pattern | Schedule |
|---|---|
| `0 */6 * * *` | Every 6 hours |
| `0 2 * * *` | Daily at 2 AM |
| `0 9 * * 1` | Weekly on Monday at 9 AM |
| `0 0 1 * *` | Monthly on the 1st |
| `*/5 * * * *` | Every 5 minutes |
Retry Strategies and Error Handling
Shopify API calls fail in production: rate limits (HTTP 429), transient 500 errors, and token expiry all happen regularly. BullMQ's retry configuration lets you handle these gracefully with exponential backoff, and you can add custom logic for specific error types like rate limiting.
```javascript
import { Worker, DelayedError } from 'bullmq';

const worker = new Worker('order-sync', processJob, {
  connection,
  concurrency: 5,
  limiter: {
    max: 10,        // at most 10 jobs...
    duration: 1000, // ...per 1000 ms
  },
});

async function processJob(job, token) {
  try {
    return await callShopifyAPI(job.data);
  } catch (error) {
    if (error.status === 429) {
      // Respect Shopify's Retry-After header, defaulting to 2 seconds
      const retryAfter = parseInt(error.headers?.['retry-after'] || '2', 10);
      // Move the job back to the delayed set, then throw DelayedError so the
      // worker knows the job is neither completed nor failed
      await job.moveToDelayed(Date.now() + retryAfter * 1000, token);
      throw new DelayedError();
    }
    throw error;
  }
}
```
The `limiter` setting caps the worker at 10 jobs per second, preventing your app from exceeding Shopify's API rate limits. For errors that need custom retry timing, such as a 429 with a specific Retry-After header, move the job back to the delayed set with `job.moveToDelayed()` and then throw `DelayedError`, which tells the worker the job is neither completed nor failed and should run again at the scheduled time.
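Shopify sends `Retry-After` as a number of seconds, but the HTTP spec also permits an HTTP date, so a defensive parser is worth having. A sketch, with `retryAfterMs` being a hypothetical helper name and `now` injected for testability:

```javascript
// Parse a Retry-After header value into milliseconds to wait.
// Accepts delta-seconds (e.g. "2") or an HTTP date; falls back to 2000 ms.
function retryAfterMs(value, now = Date.now()) {
  if (!value) return 2000;                           // header absent or empty
  const seconds = Number(value);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const date = Date.parse(value);                    // HTTP-date form
  if (!Number.isNaN(date)) return Math.max(0, date - now);
  return 2000;                                       // unparseable: safe default
}
```

The `Math.max(0, ...)` guard matters for the date form: a date in the past should mean "retry now", not a negative delay.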
Set queue-level defaults for retry behavior across all jobs:
```javascript
const orderSyncQueue = new Queue('order-sync', {
  connection,
  defaultJobOptions: {
    attempts: 5,
    backoff: {
      type: 'exponential',
      delay: 2000,
    },
    removeOnComplete: { count: 1000 }, // keep only the last 1000 completed jobs
    removeOnFail: { count: 5000 },     // keep only the last 5000 failed jobs
  },
});
```
With exponential backoff starting at 2 seconds, the four retries after the initial attempt are delayed by 2s, 4s, 8s, and 16s. After five failed attempts, the job moves to a "failed" state where you can inspect it, reprocess it manually, or route it to a dead letter queue.
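The delay before retry n is base × 2^(n−1). A quick sketch to sanity-check a retry schedule before committing to it (the helper name is mine, not a BullMQ API):

```javascript
// Delay before retry number `attempt` (1-based), mirroring exponential
// backoff with a configurable base: base * 2^(attempt - 1).
function exponentialDelay(attempt, baseMs = 2000) {
  return baseMs * 2 ** (attempt - 1);
}

// attempts: 5 means 4 retries after the initial attempt:
const schedule = [1, 2, 3, 4].map((a) => exponentialDelay(a));
// schedule → [2000, 4000, 8000, 16000]
```

Summing the schedule tells you the worst-case time before a job lands in the failed state, which is useful when deciding how many attempts a time-sensitive job can afford.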
Dead Letter Queues
When jobs exhaust all retry attempts, they need to go somewhere for investigation rather than silently disappearing. A dead letter queue captures failed jobs with their original payload and error context, giving your team a clear path to diagnose and reprocess failures.
```javascript
worker.on('failed', async (job, err) => {
  if (job.attemptsMade >= job.opts.attempts) {
    // All retries exhausted: capture the payload and error context
    await deadLetterQueue.add('failed-order-sync', {
      originalJob: job.data,
      error: err.message,
      failedAt: new Date().toISOString(),
      attempts: job.attemptsMade,
    });
    await sendAlertToSlack({
      channel: '#shopify-alerts',
      text: `Order sync failed after ${job.attemptsMade} attempts: ${err.message}`,
      orderId: job.data.orderId,
    });
  }
});
```
The dead letter queue is itself a BullMQ queue: you can build a separate worker that periodically attempts reprocessing, or an admin interface that lets your team inspect and retry individual failures. The Slack notification ensures nobody has to poll a dashboard to learn about critical failures.
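A reprocessing path can be as small as re-adding the captured payload onto the original queue. A sketch under my own naming (the single-retry option and `reprocessDeadLetter` are choices for illustration; the target queue is injected so the function is testable without Redis):

```javascript
// Re-enqueue a dead-letter entry onto the original queue for one manual retry.
// `dlqJob.data` has the shape written by the failed-handler above:
// { originalJob, error, failedAt, attempts }.
async function reprocessDeadLetter(targetQueue, dlqJob) {
  const { originalJob } = dlqJob.data;
  // attempts: 1 gives exactly one manual retry; if it fails again,
  // the failed-handler writes it back to the dead letter queue
  return targetQueue.add('order-sync-retry', originalJob, { attempts: 1 });
}
```

An admin route can call this per job, or a scheduled sweep can call it for every entry older than some age.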
Webhook Integration Pattern
The most common BullMQ integration point in a Shopify app is the webhook handler. Here's how it looks in a Remix app: the handler authenticates the webhook, enqueues a job with the relevant payload, and returns 200 immediately.
```javascript
import { orderSyncQueue } from '~/queue.server';
import { authenticate } from '~/shopify.server';

export const action = async ({ request }) => {
  const { payload, shop } = await authenticate.webhook(request);

  await orderSyncQueue.add(`order-${payload.id}`, {
    orderId: payload.admin_graphql_api_id,
    shopDomain: shop,
    orderData: {
      totalPrice: payload.total_price,
      lineItems: payload.line_items.length,
      customer: payload.customer?.email,
    },
  }, {
    priority: payload.total_price > 500 ? 1 : 5, // lower number = higher priority
  });

  return new Response(null, { status: 200 });
};
```
Return 200 immediately after enqueueing. Shopify expects webhook responses within 5 seconds. If your handler takes longer, Shopify retries the webhook, potentially creating duplicate jobs. The queue guarantees processing happens reliably in the background.
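Those duplicate deliveries can be collapsed at enqueue time by reusing Shopify's webhook delivery ID as the BullMQ job ID: while a job with a given `jobId` exists, a second `add()` with the same ID is ignored. A hypothetical helper (`dedupeJobOptions` is my name for it) that builds the options object:

```javascript
// Build job options that dedupe on Shopify's webhook delivery ID.
// `headers` is any object with a get() method (e.g. a Fetch API Headers).
function dedupeJobOptions(headers, extraOpts = {}) {
  const webhookId = headers.get('X-Shopify-Webhook-Id');
  // No header: omit jobId so BullMQ auto-generates a unique one
  return webhookId ? { jobId: webhookId, ...extraOpts } : { ...extraOpts };
}

// Usage in the action above (sketch):
// await orderSyncQueue.add(`order-${payload.id}`, jobData,
//   dedupeJobOptions(request.headers, { priority: 5 }));
```

Note that deduplication only holds while the original job is still retained in Redis, so pair this with sensible `removeOnComplete` settings.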
Redis Configuration for Production
For local development, a Docker Compose setup gets you running quickly with persistence enabled:
```yaml
version: '3.8'

services:
  redis:
    image: redis:7-alpine
    ports:
      - '6379:6379'
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy noeviction

volumes:
  redis-data:
```
Three Redis settings matter for job queues. `appendonly yes` enables AOF persistence, so jobs survive Redis restarts. `maxmemory-policy noeviction` is critical: it tells Redis to reject new writes rather than evict existing job data when memory is full; you want to fail loudly, not silently lose jobs. And the `maxmemory` limit prevents Redis from consuming all available system memory.
Use managed Redis in production. Upstash offers serverless Redis with pay-per-request pricing that pairs well with the database-free app architecture. Redis Cloud provides dedicated instances with automatic failover. Both handle persistence, backups, and scaling so you don't have to.
Monitoring with Bull Board
Visibility into queue health is non-negotiable for production apps. Bull Board provides a real-time dashboard showing active, waiting, completed, and failed jobs per queue; mount it behind authentication in your app.
```javascript
import { createBullBoard } from '@bull-board/api';
import { BullMQAdapter } from '@bull-board/api/bullMQAdapter';
import { ExpressAdapter } from '@bull-board/express';

const serverAdapter = new ExpressAdapter();
serverAdapter.setBasePath('/admin/queues');

createBullBoard({
  queues: [
    new BullMQAdapter(orderSyncQueue),
    new BullMQAdapter(inventoryQueue),
    new BullMQAdapter(deadLetterQueue),
  ],
  serverAdapter,
});

app.use('/admin/queues', serverAdapter.getRouter());
```
Key metrics to monitor: active jobs (current processing load), waiting count (backlog size), failed count (error rate), completed rate (throughput), and average processing time (performance baseline). A sudden spike in waiting count usually means your workers can't keep up: scale horizontally or increase concurrency.
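For alerting outside the dashboard, the same counts are available programmatically via `queue.getJobCounts()`. A hypothetical threshold check over that result (the helper name and default thresholds are mine):

```javascript
// Evaluate queue health from the counts object returned by
// queue.getJobCounts('active', 'waiting', 'failed').
function queueHealth(counts, { maxWaiting = 500, maxFailed = 50 } = {}) {
  const issues = [];
  if (counts.waiting > maxWaiting) issues.push(`backlog: ${counts.waiting} waiting`);
  if (counts.failed > maxFailed) issues.push(`errors: ${counts.failed} failed`);
  return { healthy: issues.length === 0, issues };
}

// Usage (sketch): run on an interval and alert on any issues.
// const counts = await orderSyncQueue.getJobCounts('active', 'waiting', 'failed');
// const { healthy, issues } = queueHealth(counts);
```

Tune the thresholds per queue: a notification queue might tolerate thousands waiting, while a billing queue should alert at a handful.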
Common Shopify Job Patterns
Most Shopify apps need a handful of recurring patterns. Here are the most common, along with whether each works best as a scheduled cron job or an event-driven webhook job:
- Bulk metafield updates: a merchant uploads a CSV and the app queues one job per product. Event-driven.
- Inventory sync: hourly reconciliation with a third-party warehouse API. Cron job.
- Order fulfillment polling: check the logistics provider API for tracking number updates. Cron job.
- Subscription billing: monthly billing cycle via the Shopify Billing API. Cron job.
- Report generation: weekly sales digest exported to merchant email. Cron job.
- Cache warming: precompute storefront data after product or collection changes. Event-driven (webhook on product/collection update).
Event-driven jobs respond to something that already happened: a webhook fires and you process it. Cron jobs run on a schedule regardless of events; they poll, aggregate, or maintain. Knowing which pattern applies shapes your queue design, concurrency settings, and retry strategy.
Queue Isolation and Concurrency
A slow inventory sync should never block a time-sensitive billing job. Separate queues with dedicated workers ensure that one bottleneck doesn't cascade across your entire background processing pipeline. Each queue gets its own concurrency limit tuned to the workload's characteristics.
```javascript
const inventoryQueue = new Queue('inventory-sync', { connection });
const billingQueue = new Queue('billing', { connection });
const notificationQueue = new Queue('notifications', { connection });

const inventoryWorker = new Worker('inventory-sync', processInventory, {
  connection,
  concurrency: 3,
});

const billingWorker = new Worker('billing', processBilling, {
  connection,
  concurrency: 1, // sequential: never risk double-charging a merchant
});

const notificationWorker = new Worker('notifications', sendNotification, {
  connection,
  concurrency: 10,
});
```
Billing gets `concurrency: 1` because charging a merchant twice is worse than processing slowly; sequential execution eliminates race conditions. Notifications get `concurrency: 10` because they're fast and idempotent, and a backlog degrades the merchant experience. Inventory sync sits in the middle at `concurrency: 3`, balancing throughput against warehouse API rate limits.
Edge Cases for Production
Several edge cases will bite you if you don't plan for them:
- Job deduplication: Shopify may send the same webhook multiple times. Use the webhook's `X-Shopify-Webhook-Id` header as the job ID to prevent duplicate processing.
- Graceful shutdown: call `worker.close()` before process exit to let in-progress jobs finish rather than being abruptly killed and retried.
- Redis connection loss: BullMQ auto-reconnects, but jobs that were mid-processing when the connection dropped may fail. Your retry configuration handles this automatically.
- Memory management: use `removeOnComplete` and `removeOnFail` with count limits to prevent Redis from accumulating unbounded job history.
- Time zones for cron jobs: always specify the `tz` option on repeatable jobs. Without it, schedules default to the server's local time, which may differ between environments.
- Rate limiting workers: use the `limiter` option to cap job throughput per second, keeping your workers within Shopify's API rate limits across all concurrent jobs.
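The graceful-shutdown item can be wired up once at worker startup. A sketch under my own naming (`registerGracefulShutdown` is hypothetical; the process object is injectable so the wiring is testable without real signals):

```javascript
// Close workers on SIGTERM/SIGINT so in-flight jobs finish before exit.
function registerGracefulShutdown(workers, proc = process) {
  const shutdown = async () => {
    // worker.close() waits for active jobs to complete, then disconnects
    await Promise.all(workers.map((w) => w.close()));
    proc.exit(0);
  };
  proc.once('SIGTERM', shutdown);
  proc.once('SIGINT', shutdown);
  return shutdown; // returned so it can be invoked directly in tests
}

// Usage (sketch):
// registerGracefulShutdown([inventoryWorker, billingWorker, notificationWorker]);
```

Container platforms typically send SIGTERM and wait a grace period before SIGKILL, so keep your longest job comfortably under that window.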
Key Takeaways
- BullMQ + Redis gives Shopify apps a persistent, crash-resistant job queue that decouples receiving work from processing it.
- Webhook handlers should enqueue and return `200` immediately; never block on processing.
- Repeatable jobs with cron expressions replace external cron daemons; always set a `jobId` to prevent duplicate schedules.
- Exponential backoff with a dead letter queue ensures transient failures are retried and permanent failures are investigated.
- Separate queues for different workloads prevent one bottleneck from cascading across your entire system.
- Use managed Redis with the `noeviction` policy in production; never silently lose job data.
- Monitor active, waiting, and failed counts in real time with Bull Board to catch processing bottlenecks early.
If your Shopify app needs reliable background processing, cron jobs, retry logic, or a Redis-backed queue architecture, I can help you design and build it. Let's talk about what your infrastructure actually needs.