Stevora

Idempotency

Ensuring safe concurrent workflow execution

Distributed systems are messy. Jobs get delivered twice, API calls retry on timeout, users double-click buttons. Stevora uses two complementary mechanisms to handle this safely: idempotency keys for workflow creation and optimistic locking for step execution.

Idempotency Keys

When creating a workflow run, you can pass an optional idempotencyKey. If a run with the same key already exists in the workspace, Stevora returns the existing run instead of creating a duplicate.

POST /api/v1/workflow-runs
{
  "definitionId": "def_abc123",
  "input": { "orderId": "order_789" },
  "idempotencyKey": "process-order-789"
}

The key is scoped to your workspace. Two different workspaces can use the same key without conflict.

How It Works

The check happens in createWorkflowRun before the run is created:

// Idempotency check
if (input.idempotencyKey) {
  const existing = await prisma.workflowRun.findUnique({
    where: {
      workspaceId_idempotencyKey: {
        workspaceId,
        idempotencyKey: input.idempotencyKey,
      },
    },
  });
  if (existing) {
    return { run: existing, created: false };
  }
}

The response includes a created flag so the caller knows whether the run was newly created or already existed:

{
  "run": { "id": "run_xyz", "status": "PENDING", "..." : "..." },
  "created": false
}
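The lookup-then-return behaviour, including the workspace scoping, can be modelled with an in-memory map. This is only a sketch under stated assumptions: InMemoryRunStore, the run_N id format, and the ws_a / ws_b workspace ids are hypothetical and are not part of Stevora.

```typescript
// Hypothetical in-memory model of the idempotency check; not Stevora's code.
interface WorkflowRun {
  id: string;
  workspaceId: string;
  definitionId: string;
  idempotencyKey: string | undefined;
}

interface CreateResult {
  run: WorkflowRun;
  created: boolean;
}

class InMemoryRunStore {
  private runsByKey = new Map<string, WorkflowRun>();
  private nextId = 1;

  createWorkflowRun(
    workspaceId: string,
    input: { definitionId: string; idempotencyKey?: string },
  ): CreateResult {
    // The composite key mirrors the (workspaceId, idempotencyKey) lookup.
    const compositeKey =
      input.idempotencyKey === undefined
        ? undefined
        : JSON.stringify([workspaceId, input.idempotencyKey]);

    if (compositeKey !== undefined) {
      const existing = this.runsByKey.get(compositeKey);
      if (existing) {
        return { run: existing, created: false };
      }
    }

    const run: WorkflowRun = {
      id: `run_${this.nextId++}`,
      workspaceId,
      definitionId: input.definitionId,
      idempotencyKey: input.idempotencyKey,
    };
    if (compositeKey !== undefined) {
      this.runsByKey.set(compositeKey, run);
    }
    return { run, created: true };
  }
}

const store = new InMemoryRunStore();
const request = { definitionId: 'def_abc123', idempotencyKey: 'process-order-789' };

const first = store.createWorkflowRun('ws_a', request);
const retry = store.createWorkflowRun('ws_a', request);
const otherWorkspace = store.createWorkflowRun('ws_b', request);

console.log(first.created, retry.created, otherWorkspace.created); // true false true
```

The retry in the same workspace gets the original run back, while the same key in a different workspace creates a separate run.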

When to Use Idempotency Keys

Use them whenever the same logical trigger could fire more than once:

  • Webhook handlers -- incoming webhooks may be delivered multiple times.
  • API retries -- client-side retry logic after a timeout.
  • Queue consumers -- at-least-once delivery means duplicate messages.
  • User actions -- double-clicks, page refreshes, or repeated form submissions.

A good idempotency key is deterministic and derived from the triggering event:

// Derive from the event that triggered the workflow
const orderKey = `process-order-${order.id}`;

// Or combine multiple identifiers
const welcomeKey = `send-welcome-${user.id}-${campaign.id}`;
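To illustrate what "deterministic" means in practice, the sketch below builds a key from a webhook payload and leaves out every field that changes between deliveries. The OrderWebhookEvent type, its field names, and idempotencyKeyFor are hypothetical, chosen only for illustration.

```typescript
// Hypothetical webhook payload; the field names are illustrative only.
interface OrderWebhookEvent {
  orderId: string;
  deliveryAttempt: number; // changes on every redelivery
  receivedAt: string; // changes on every redelivery
}

// Build the key only from fields that identify the logical event.
function idempotencyKeyFor(event: OrderWebhookEvent): string {
  return `process-order-${event.orderId}`;
}

const firstAttempt: OrderWebhookEvent = {
  orderId: 'order_789',
  deliveryAttempt: 1,
  receivedAt: '2024-01-01T10:00:00Z',
};
const redelivery: OrderWebhookEvent = {
  orderId: 'order_789',
  deliveryAttempt: 2,
  receivedAt: '2024-01-01T10:00:05Z',
};

const keyA = idempotencyKeyFor(firstAttempt);
const keyB = idempotencyKeyFor(redelivery);

console.log(keyA, keyA === keyB); // process-order-order_789 true
```

Including a timestamp, attempt counter, or random value in the key would give each delivery a different key and defeat the duplicate check.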

Schema Validation

The request schema makes the idempotency key optional and limits it to 255 characters:

import { z } from 'zod';

const createWorkflowRunSchema = z.object({
  definitionId: z.string().min(1),
  input: z.record(z.unknown()).optional().default({}),
  idempotencyKey: z.string().max(255).optional(),
});

Optimistic Locking

Even after a workflow run is created, the engine must protect against concurrent modifications. If two workers pick up the same job (due to queue redelivery or a race condition), they must not both execute the same step.

Stevora solves this with optimistic locking via a version column. Every workflow run has an integer version that increments on each state change.

How It Works

When the engine transitions a workflow to RUNNING, it uses updateMany with a version guard:

const updated = await prisma.workflowRun.updateMany({
  where: { id: run.id, version: run.version },
  data: {
    status: 'RUNNING',
    currentStepName: stepName,
    startedAt: run.startedAt ?? new Date(),
    version: { increment: 1 },
  },
});

if (updated.count === 0) {
  throw new ConcurrencyError(`Workflow run ${run.id} was modified concurrently`);
}

The WHERE clause includes version: run.version. If another worker already incremented the version, the update affects zero rows, and the engine throws a ConcurrencyError. The losing worker backs off and the job is redelivered.

This same pattern is applied at every critical state transition:

  • Workflow to RUNNING -- when starting a step.
  • Step completion -- when saving state updates and output.
  • Wait transition -- when a step enters the WAITING state.
  • Retry scheduling -- when setting up the next retry attempt.
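The version guard can be modelled without a database. In the sketch below, updateIfVersionMatches is a hypothetical stand-in for the guarded updateMany call: it returns the number of affected rows, and two workers that loaded the same snapshot race to start the run.

```typescript
// Hypothetical in-memory stand-in for a version-guarded UPDATE; not Stevora's code.
interface RunRow {
  id: string;
  status: string;
  version: number;
}

const initial: RunRow = { id: 'run_xyz', status: 'PENDING', version: 0 };
const table = new Map<string, RunRow>([[initial.id, initial]]);

// Mirrors updateMany({ where: { id, version }, ... }): returns the affected-row count.
function updateIfVersionMatches(
  id: string,
  expectedVersion: number,
  status: string,
): number {
  const row = table.get(id);
  if (!row || row.version !== expectedVersion) {
    return 0; // Someone else already bumped the version: nothing is written.
  }
  table.set(id, { ...row, status, version: row.version + 1 });
  return 1;
}

// Both workers load the same snapshot before either one writes.
const snapshotA: RunRow = { ...initial };
const snapshotB: RunRow = { ...initial };

const countA = updateIfVersionMatches(snapshotA.id, snapshotA.version, 'RUNNING');
const countB = updateIfVersionMatches(snapshotB.id, snapshotB.version, 'RUNNING');
const finalVersion = table.get('run_xyz')?.version;

console.log(countA, countB, finalVersion); // 1 0 1
```

Only the first write matches the stored version. The second affects zero rows, which is the condition that makes the engine throw a ConcurrencyError.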

The ConcurrencyError

ConcurrencyError is a specialized error class that signals a version conflict:

class ConcurrencyError extends AppError {
  constructor(message: string = 'Concurrent modification detected') {
    super('CONCURRENCY_ERROR', message, 409);
    this.name = 'ConcurrencyError';
  }
}

The error maps to HTTP 409 (Conflict) and carries the error code CONCURRENCY_ERROR. When the execution engine catches it, the operation is safe to retry: the guarded update matched zero rows, so the losing worker changed nothing.
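A worker can treat this error differently from other failures. The following is a minimal sketch, not Stevora's engine code: the AppError field names (code, statusCode) are assumed from the super(...) call above, and runJob and JobOutcome are hypothetical.

```typescript
// Assumed shape of AppError, inferred from super(code, message, status) above.
class AppError extends Error {
  readonly code: string;
  readonly statusCode: number;

  constructor(code: string, message: string, statusCode: number) {
    super(message);
    this.name = 'AppError';
    this.code = code;
    this.statusCode = statusCode;
  }
}

class ConcurrencyError extends AppError {
  constructor(message: string = 'Concurrent modification detected') {
    super('CONCURRENCY_ERROR', message, 409);
    this.name = 'ConcurrencyError';
  }
}

type JobOutcome = 'completed' | 'redeliver';

// Hypothetical worker wrapper: a version conflict means this worker wrote
// nothing, so the job can be handed back to the queue.
function runJob(handler: () => void): JobOutcome {
  try {
    handler();
    return 'completed';
  } catch (err) {
    if (err instanceof ConcurrencyError) {
      return 'redeliver';
    }
    throw err; // Other failures are not known to be safe to retry here.
  }
}

const ok = runJob(() => {});
const conflict = runJob(() => {
  throw new ConcurrencyError('Workflow run run_xyz was modified concurrently');
});

console.log(ok, conflict); // completed redeliver
```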

Version Increment Flow

Here is a concrete example of a step completing successfully. The guard checks run.version + 1 rather than run.version because run still holds the version read before the step started, and the transitionToRunning call has already incremented the stored version once:

const updated = await tx.workflowRun.updateMany({
  where: { id: run.id, version: run.version + 1 },
  data: {
    state: stateToSave as Prisma.JsonObject,
    currentStepName: stepDef.name,
    version: { increment: 1 },
  },
});

if (updated.count === 0) {
  throw new ConcurrencyError(
    `Workflow run ${run.id} was modified concurrently during step completion`
  );
}

A typical successful step execution increments the version twice: once for the RUNNING transition and once for the completion.
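The arithmetic can be traced with a single counter. In this sketch, storedVersion stands for the version column and guardedIncrement is a hypothetical helper that models one guarded write. The starting value of 3 is arbitrary.

```typescript
// Hypothetical model of one run row; only the version column matters here.
let storedVersion = 3;

// Guarded write: succeeds only if the stored version equals the expected one.
function guardedIncrement(expectedVersion: number): boolean {
  if (storedVersion !== expectedVersion) {
    return false;
  }
  storedVersion += 1;
  return true;
}

// In-memory snapshot loaded before the step starts; it is never refreshed.
const run = { version: storedVersion };

// RUNNING transition: guard on the snapshot version (3 -> 4).
const startedOk = guardedIncrement(run.version);

// Completing with the stale snapshot version would be rejected.
const staleCompletion = guardedIncrement(run.version);

// Completion: guard on snapshot version + 1 (4 -> 5).
const completedOk = guardedIncrement(run.version + 1);

console.log(startedOk, staleCompletion, completedOk, storedVersion); // true false true 5
```

The stored version ends two above the snapshot, matching the two increments of a successful step.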

Step-Level Idempotency

The engine also guards against re-executing completed steps. Before dispatching a step handler, it checks the step run status:

// Idempotency: skip if already completed
if (stepRun.status === 'COMPLETED' || stepRun.status === 'SKIPPED') {
  log.info({ stepName, status: stepRun.status }, 'step already done, advancing');
  await advanceToNextStep(run, steps, stepName);
  return;
}

If a completed step is redelivered (for example, because of a duplicate queue message), the engine skips execution and advances to the next step. No work is duplicated.
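The effect of the status check can be shown by counting handler invocations across a duplicate delivery. This is a sketch under assumptions: executeStep, the PENDING step status, and the charge-card step name are hypothetical, and advancing to the next step is left out.

```typescript
// Hypothetical step model; only COMPLETED and SKIPPED appear in the source.
type StepStatus = 'PENDING' | 'COMPLETED' | 'SKIPPED';

interface StepRun {
  stepName: string;
  status: StepStatus;
}

let handlerCalls = 0;

// Returns true when the handler actually ran, false when the step was already done.
function executeStep(stepRun: StepRun, handler: () => void): boolean {
  if (stepRun.status === 'COMPLETED' || stepRun.status === 'SKIPPED') {
    return false; // Already done: skip the handler, the caller just advances.
  }
  handler();
  stepRun.status = 'COMPLETED';
  return true;
}

const chargeCard: StepRun = { stepName: 'charge-card', status: 'PENDING' };
const handler = (): void => {
  handlerCalls += 1;
};

const firstDelivery = executeStep(chargeCard, handler);
const duplicateDelivery = executeStep(chargeCard, handler);

console.log(firstDelivery, duplicateDelivery, handlerCalls); // true false 1
```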

Summary

| Mechanism | Protects Against | Scope |
| --- | --- | --- |
| Idempotency key | Duplicate workflow creation | Per workspace, set at creation |
| Optimistic locking (version) | Concurrent step execution | Per workflow run, every state change |
| Step status check | Re-executing completed steps | Per step run, before dispatch |