Idempotency
Ensuring safe concurrent workflow execution
Distributed systems are messy. Jobs get delivered twice, API calls retry on timeout, users double-click buttons. Stevora uses two complementary mechanisms to handle this safely: idempotency keys for workflow creation and optimistic locking for step execution.
Idempotency Keys
When creating a workflow run, you can pass an optional idempotencyKey. If a run with the same key already exists in the workspace, Stevora returns the existing run instead of creating a duplicate.
POST /api/v1/workflow-runs
{
  "definitionId": "def_abc123",
  "input": { "orderId": "order_789" },
  "idempotencyKey": "process-order-789"
}
The key is scoped to your workspace. Two different workspaces can use the same key without conflict.
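To make the scoping rule concrete, here is a minimal in-memory sketch of the behaviour described above. The `InMemoryRunStore` class and its `create` method are hypothetical names used only for illustration; they are not part of Stevora's API, and the real service performs this lookup against its database.

```typescript
// Hypothetical in-memory model of workspace-scoped idempotency keys.
// None of these names come from Stevora; they only illustrate the behaviour.
interface Run {
  id: string;
  workspaceId: string;
  idempotencyKey?: string;
}

interface CreateResult {
  run: Run;
  created: boolean;
}

class InMemoryRunStore {
  private runsByScopedKey: Record<string, Run> = {};
  private nextId = 1;

  create(workspaceId: string, idempotencyKey?: string): CreateResult {
    if (idempotencyKey === undefined) {
      // No key supplied: every call creates a new run.
      return { run: { id: `run_${this.nextId++}`, workspaceId }, created: true };
    }
    // The lookup key combines workspace and idempotency key, so the same
    // idempotency key in another workspace maps to a different entry.
    const scopedKey = JSON.stringify([workspaceId, idempotencyKey]);
    const existing = this.runsByScopedKey[scopedKey];
    if (existing) {
      return { run: existing, created: false };
    }
    const run: Run = { id: `run_${this.nextId++}`, workspaceId, idempotencyKey };
    this.runsByScopedKey[scopedKey] = run;
    return { run, created: true };
  }
}

const store = new InMemoryRunStore();
const first = store.create("ws_a", "process-order-789");
const retry = store.create("ws_a", "process-order-789"); // same run as `first`
const otherWorkspace = store.create("ws_b", "process-order-789"); // a new run
```

In this sketch, `retry.created` is `false` and `retry.run` is the run created by the first call, while `otherWorkspace` gets a fresh run even though it uses the same key.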
How It Works
The check happens in createWorkflowRun before the run is created:
// Idempotency check
if (input.idempotencyKey) {
  const existing = await prisma.workflowRun.findUnique({
    where: {
      workspaceId_idempotencyKey: {
        workspaceId,
        idempotencyKey: input.idempotencyKey,
      },
    },
  });
  if (existing) {
    return { run: existing, created: false };
  }
}
The response includes a created flag so the caller knows whether the run is new or an existing duplicate:
{
  "run": { "id": "run_xyz", "status": "PENDING", "..." : "..." },
  "created": false
}
When to Use Idempotency Keys
Use them whenever the same logical trigger could fire more than once:
- Webhook handlers -- incoming webhooks may be delivered multiple times.
- API retries -- client-side retry logic after a timeout.
- Queue consumers -- at-least-once delivery means duplicate messages.
- User actions -- double-clicks, page refreshes, or repeated form submissions.
A good idempotency key is deterministic and derived from the triggering event:
// Derive from the event that triggered the workflow
const orderKey = `process-order-${order.id}`;
// Or combine multiple identifiers
const welcomeKey = `send-welcome-${user.id}-${campaign.id}`;
Schema Validation
The idempotency key is validated and limited to 255 characters:
const createWorkflowRunSchema = z.object({
  definitionId: z.string().min(1),
  input: z.record(z.unknown()).optional().default({}),
  idempotencyKey: z.string().max(255).optional(),
});
Optimistic Locking
Even after a workflow run is created, the engine must protect against concurrent modifications. If two workers pick up the same job (due to queue redelivery or a race condition), they must not both execute the same step.
Stevora solves this with optimistic locking via a version column. Every workflow run has an integer version that increments on each state change.
How It Works
When the engine transitions a workflow to RUNNING, it uses updateMany with a version guard:
const updated = await prisma.workflowRun.updateMany({
  where: { id: run.id, version: run.version },
  data: {
    status: 'RUNNING',
    currentStepName: stepName,
    startedAt: run.startedAt ?? new Date(),
    version: { increment: 1 },
  },
});
if (updated.count === 0) {
  throw new ConcurrencyError(`Workflow run ${run.id} was modified concurrently`);
}
The WHERE clause includes version: run.version. If another worker already incremented the version, the update affects zero rows, and the engine throws a ConcurrencyError. The losing worker backs off and the job is redelivered.
This same pattern is applied at every critical state transition:
- Workflow to RUNNING -- when starting a step.
- Step completion -- when saving state updates and output.
- Wait transition -- when a step enters the WAITING state.
- Retry scheduling -- when setting up the next retry attempt.
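The version guard behaves like a compare-and-set. The sketch below reproduces that behaviour in memory so the losing-worker case is easy to follow. `VersionedRun` and `tryTransition` are hypothetical names, and the plain object stands in for the database row that `updateMany` would touch.

```typescript
// Hypothetical in-memory compare-and-set; stands in for a guarded UPDATE.
interface VersionedRun {
  id: string;
  status: string;
  version: number;
}

// Applies the transition only if the stored version still matches the
// version the caller read earlier. Returns the number of rows "updated",
// mirroring the count that updateMany reports.
function tryTransition(
  stored: VersionedRun,
  expectedVersion: number,
  nextStatus: string
): number {
  if (stored.version !== expectedVersion) {
    return 0; // another writer changed the row first
  }
  stored.status = nextStatus;
  stored.version += 1;
  return 1;
}

const row: VersionedRun = { id: "run_xyz", status: "PENDING", version: 0 };

// Both workers read the row while it is still at version 0.
const versionSeenByA = row.version;
const versionSeenByB = row.version;

const countA = tryTransition(row, versionSeenByA, "RUNNING"); // wins, version becomes 1
const countB = tryTransition(row, versionSeenByB, "RUNNING"); // loses, guard expects 0
```

Worker B's zero count corresponds to the `updated.count === 0` branch, which is where the engine raises a ConcurrencyError.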
The ConcurrencyError
ConcurrencyError is a specialized error class that signals a version conflict:
class ConcurrencyError extends AppError {
  constructor(message: string = 'Concurrent modification detected') {
    super('CONCURRENCY_ERROR', message, 409);
    this.name = 'ConcurrencyError';
  }
}
It returns HTTP 409 (Conflict) and carries the error code CONCURRENCY_ERROR. When the execution engine catches this error, it knows the operation is safe to retry because no state was modified.
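One way a worker could act on this distinction is sketched below. It assumes only that the error carries the code CONCURRENCY_ERROR; the `runJob` wrapper, the `JobOutcome` type, and the `isConcurrencyError` helper are hypothetical and not taken from Stevora's engine. The sketch is synchronous for brevity, whereas a real worker would be asynchronous.

```typescript
// Hypothetical worker wrapper; the names below are illustrative only.
type JobOutcome = "completed" | "requeue";

// Identifies a version conflict by its error code.
function isConcurrencyError(err: unknown): err is { code: string } {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "CONCURRENCY_ERROR"
  );
}

function runJob(execute: () => void): JobOutcome {
  try {
    execute();
    return "completed";
  } catch (err) {
    if (isConcurrencyError(err)) {
      // Another worker won the version check, so this worker wrote nothing.
      // Leaving the job for redelivery is therefore safe.
      return "requeue";
    }
    throw err; // anything else is a genuine failure
  }
}

// Simulate a conflict with an error object that carries the documented code.
const conflict = new Error("Concurrent modification detected") as Error & { code: string };
conflict.code = "CONCURRENCY_ERROR";

const outcomeOk = runJob(() => {});
const outcomeConflict = runJob(() => {
  throw conflict;
});
```

Here `outcomeOk` is "completed" and `outcomeConflict` is "requeue"; any other error propagates to the caller unchanged.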
Version Increment Flow
Here is a concrete example of a step completing successfully. Notice that the guard checks run.version + 1: the in-memory run still holds the version read before the transitionToRunning call, which has already incremented the stored version once:
const updated = await tx.workflowRun.updateMany({
  where: { id: run.id, version: run.version + 1 },
  data: {
    state: stateToSave as Prisma.JsonObject,
    currentStepName: stepDef.name,
    version: { increment: 1 },
  },
});
if (updated.count === 0) {
  throw new ConcurrencyError(
    `Workflow run ${run.id} was modified concurrently during step completion`
  );
}
A typical successful step execution increments the version twice: once for the RUNNING transition and once for the completion.
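The two increments can be traced with a small in-memory model. This is a sketch under stated assumptions: `RunRow`, `guardedUpdate`, and the starting version of 4 are made up for illustration, and the labels only name the two transitions discussed above.

```typescript
// Hypothetical trace of the version counter across one successful step.
interface RunRow {
  version: number;
  applied: string[]; // labels of the updates that passed their guard
}

// Applies an update only when the stored version matches the guard.
function guardedUpdate(row: RunRow, expectedVersion: number, label: string): boolean {
  if (row.version !== expectedVersion) {
    return false;
  }
  row.applied.push(label);
  row.version += 1;
  return true;
}

const stored: RunRow = { version: 4, applied: [] };
const snapshotVersion = stored.version; // the worker's in-memory copy stays at 4

// First guard uses the snapshot as read; stored version goes 4 -> 5.
const startedOk = guardedUpdate(stored, snapshotVersion, "transitionToRunning");

// Second guard must account for the first increment; stored version goes 5 -> 6.
const completedOk = guardedUpdate(stored, snapshotVersion + 1, "completeStep");

// A late writer that still guards on the original snapshot is rejected.
const staleAttempt = guardedUpdate(stored, snapshotVersion, "staleWriter");
```

After both updates the stored version is `snapshotVersion + 2`, and the stale attempt leaves the row untouched.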
Step-Level Idempotency
The engine also guards against re-executing completed steps. Before dispatching a step handler, it checks the step run status:
// Idempotency: skip if already completed
if (stepRun.status === 'COMPLETED' || stepRun.status === 'SKIPPED') {
  log.info({ stepName, status: stepRun.status }, 'step already done, advancing');
  await advanceToNextStep(run, steps, stepName);
  return;
}
If a completed step is redelivered (e.g., due to a queue duplicate), the engine skips execution and advances to the next step. No work is duplicated.
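The effect of the status check can be demonstrated with a counter. In this sketch `dispatchStep`, the `StepRun` shape, the "charge-card" step name, and the PENDING status for step runs are assumptions made for illustration; only the COMPLETED and SKIPPED statuses come from the snippet above.

```typescript
// Hypothetical in-memory model of the step status check.
type StepStatus = "PENDING" | "COMPLETED" | "SKIPPED";

interface StepRun {
  stepName: string;
  status: StepStatus;
}

let handlerCalls = 0;

// Runs the handler only when the step has not already finished.
// Returns true when the handler actually ran.
function dispatchStep(stepRun: StepRun, handler: () => void): boolean {
  if (stepRun.status === "COMPLETED" || stepRun.status === "SKIPPED") {
    return false; // already done: nothing to execute
  }
  handler();
  stepRun.status = "COMPLETED";
  return true;
}

const chargeCard: StepRun = { stepName: "charge-card", status: "PENDING" };
const countingHandler = () => {
  handlerCalls += 1;
};

const firstDelivery = dispatchStep(chargeCard, countingHandler); // runs the handler
const duplicateDelivery = dispatchStep(chargeCard, countingHandler); // skipped
```

The handler runs once even though the job is delivered twice, which is the property the status check is meant to guarantee.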
Summary
| Mechanism | Protects Against | Scope |
|---|---|---|
| Idempotency key | Duplicate workflow creation | Per workspace, set at creation |
| Optimistic locking (version) | Concurrent step execution | Per workflow run, every state change |
| Step status check | Re-executing completed steps | Per step run, before dispatch |