Stevora

LLM Step

Orchestrate LLM calls with tools, guardrails, and fallbacks

The LLM step calls a language model with full support for tool use loops, guardrail checks, model fallbacks, structured outputs, and template variables. It is the core building block for AI agent workflows in Stevora.

Schema

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| type | "llm" | Yes | | Step type discriminator |
| name | string (1-100 chars) | Yes | | Unique step name within the workflow |
| model | string | Yes | | Primary model identifier (e.g., gpt-4o, claude-sonnet-4-20250514) |
| fallbackModels | string[] | No | | Models to try if the primary model fails |
| systemPrompt | string | No | | System message prepended to the conversation |
| messages | LlmMessage[] | Yes | | Conversation messages (min 1) |
| temperature | number | No | | Sampling temperature (0 -- 2) |
| maxTokens | number | No | | Maximum tokens in the response |
| responseFormat | "json" \| "text" | No | | Expected response format |
| outputSchema | Record<string, unknown> | No | | JSON Schema the LLM output must conform to |
| tools | LlmToolDef[] | No | | Tools the model can call |
| guardrails | GuardrailConfig | No | | Post-completion validation checks |
| maxToolRounds | number | No | 10 | Maximum tool call round-trips (1 -- 20) |
| retry | RetryPolicy | No | | Retry policy for failed executions |

LlmMessage

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | "system" \| "user" \| "assistant" | Yes | Message role |
| content | string | Yes | Message content |

LlmToolDef

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Tool function name |
| description | string | Yes | Description shown to the model |
| parameters | Record<string, unknown> | Yes | JSON Schema for tool parameters |

GuardrailConfig

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| postChecks | GuardrailCheck[] | No | [] | Checks to run after each LLM response |
| onFailure | "block" \| "warn" \| "retry_with_feedback" | No | "warn" | Action when a check fails |

GuardrailCheck

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | "content_safety" \| "schema_validation" \| "confidence_threshold" | Yes | Check type |
| config | Record<string, unknown> | No | Check-specific config |
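The schema tables can be collected into a single TypeScript sketch. These interfaces are illustrative, derived from the tables above; they are not Stevora's published SDK types, and RetryPolicy is left opaque since it is documented elsewhere.

```typescript
// Illustrative TypeScript shapes mirroring the schema tables; not Stevora's
// published SDK types.
interface LlmMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface LlmToolDef {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for tool parameters
}

interface GuardrailCheck {
  type: "content_safety" | "schema_validation" | "confidence_threshold";
  config?: Record<string, unknown>;
}

interface GuardrailConfig {
  postChecks?: GuardrailCheck[]; // default: []
  onFailure?: "block" | "warn" | "retry_with_feedback"; // default: "warn"
}

interface LlmStep {
  type: "llm";
  name: string; // 1-100 chars, unique within the workflow
  model: string;
  fallbackModels?: string[];
  systemPrompt?: string;
  messages: LlmMessage[]; // min 1
  temperature?: number; // 0 -- 2
  maxTokens?: number;
  responseFormat?: "json" | "text";
  outputSchema?: Record<string, unknown>;
  tools?: LlmToolDef[];
  guardrails?: GuardrailConfig;
  maxToolRounds?: number; // 1 -- 20, default 10
  retry?: unknown; // RetryPolicy, documented elsewhere
}
```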

Configuration Example

{
  "type": "llm",
  "name": "generate-outreach-email",
  "model": "gpt-4o",
  "fallbackModels": ["claude-sonnet-4-20250514"],
  "systemPrompt": "You are a sales development representative writing personalized outreach emails.",
  "messages": [
    {
      "role": "user",
      "content": "Write a cold outreach email to {{state.enrichedLead.name}} at {{state.enrichedLead.company}}. Their role is {{state.enrichedLead.title}}. Our product helps with {{input.valueProposition}}."
    }
  ],
  "temperature": 0.7,
  "maxTokens": 1000,
  "responseFormat": "json",
  "outputSchema": {
    "type": "object",
    "properties": {
      "subject": { "type": "string" },
      "body": { "type": "string" }
    },
    "required": ["subject", "body"]
  },
  "guardrails": {
    "postChecks": [
      { "type": "content_safety" },
      { "type": "schema_validation" }
    ],
    "onFailure": "retry_with_feedback"
  },
  "maxToolRounds": 5,
  "retry": {
    "maxAttempts": 3,
    "backoffMs": 2000,
    "backoffMultiplier": 2
  }
}

Template Variables

Message content and the system prompt support {{variable}} template syntax. Variables are resolved from the workflow context at execution time.

| Prefix | Source | Example |
| --- | --- | --- |
| state. | workflowState | {{state.enrichedLead.name}} |
| input. | workflowInput | {{input.companyName}} |

The template engine uses dot-notation traversal. If a path resolves to undefined or null, it is replaced with an empty string. Unrecognized prefixes are left as-is in the output.

// Template:
"Summarize this for {{state.recipientName}}: {{state.rawContent}}"

// With workflowState = { recipientName: "Alice", rawContent: "..." }
// Resolves to:
"Summarize this for Alice: ..."
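A minimal sketch of these resolution rules, assuming a simple regex-based engine (resolveTemplate is a hypothetical helper, not Stevora's actual implementation):

```typescript
// Hypothetical sketch of the template resolution rules described above.
type TemplateContext = {
  state: Record<string, unknown>; // workflowState
  input: Record<string, unknown>; // workflowInput
};

function resolveTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(/\{\{([^}]+)\}\}/g, (match, expr: string) => {
    const [prefix, ...path] = expr.trim().split(".");
    // Only the documented prefixes are resolved; anything else is left as-is.
    const root =
      prefix === "state" ? ctx.state :
      prefix === "input" ? ctx.input :
      undefined;
    if (root === undefined) return match;
    // Dot-notation traversal; undefined/null resolves to an empty string.
    const value = path.reduce<unknown>(
      (acc, key) => (acc == null ? undefined : (acc as Record<string, unknown>)[key]),
      root
    );
    return value == null ? "" : String(value);
  });
}
```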

Tool Use Loop

When tools are defined, the LLM step supports multi-round tool calling:

  1. The step sends the messages and tool definitions to the model.
  2. If the model responds with tool calls, Stevora executes them and appends the results as new messages.
  3. The updated conversation is sent back to the model for another round.
  4. This loop continues until the model responds without tool calls, or maxToolRounds is reached.

If the model exceeds maxToolRounds without producing a final response, the step fails with error code MAX_TOOL_ROUNDS.
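The four steps above can be sketched as a loop. callModel and executeTool are hypothetical stand-ins for the real model and tool integrations, and the sketch is synchronous for brevity (the actual loop is asynchronous):

```typescript
// Sketch of the tool-use loop described above; callModel and executeTool
// are hypothetical stand-ins, shown synchronously for brevity.
type LoopMessage = { role: "system" | "user" | "assistant" | "tool"; content: string };
type ModelResponse = { toolCalls: { name: string; args: unknown }[]; content: string };

function runToolLoop(
  messages: LoopMessage[],
  maxToolRounds: number,
  callModel: (msgs: LoopMessage[]) => ModelResponse,
  executeTool: (name: string, args: unknown) => string
): string {
  for (let round = 0; round < maxToolRounds; round++) {
    const response = callModel(messages);
    // No tool calls: this is the final response.
    if (response.toolCalls.length === 0) return response.content;
    // Execute each requested tool and append the results as new messages,
    // then send the updated conversation back for another round.
    for (const call of response.toolCalls) {
      messages.push({ role: "tool", content: executeTool(call.name, call.args) });
    }
  }
  // The model kept calling tools past the limit.
  throw new Error("MAX_TOOL_ROUNDS");
}
```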

{
  "type": "llm",
  "name": "research-company",
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "Research {{state.companyName}} and summarize their recent funding, team size, and tech stack."
    }
  ],
  "tools": [
    {
      "name": "search_web",
      "description": "Search the web for information",
      "parameters": {
        "type": "object",
        "properties": {
          "query": { "type": "string" }
        },
        "required": ["query"]
      }
    },
    {
      "name": "fetch_page",
      "description": "Fetch the content of a web page",
      "parameters": {
        "type": "object",
        "properties": {
          "url": { "type": "string" }
        },
        "required": ["url"]
      }
    }
  ],
  "maxToolRounds": 8,
  "responseFormat": "json"
}

Guardrails

Guardrails validate the LLM output before the step is marked complete. Three check types are available:

| Check Type | Description |
| --- | --- |
| content_safety | Checks the response for harmful or unsafe content |
| schema_validation | Validates the response against the outputSchema |
| confidence_threshold | Checks if the model's response meets a confidence threshold |

Failure modes

The onFailure field controls what happens when a guardrail check fails:

  • warn (default) -- Log the failure but allow the step to complete. The output is used as-is.
  • block -- Fail the step immediately with error code GUARDRAIL_BLOCKED.
  • retry_with_feedback -- Send the guardrail failure details back to the model as a user message and ask it to fix the response. Up to 2 guardrail retries are attempted before the step fails with GUARDRAIL_RETRIES_EXHAUSTED.
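A sketch of this dispatch logic, assuming a check runner that returns pass/fail results (handleGuardrailFailures and the retry counter are hypothetical names, not Stevora internals):

```typescript
// Hypothetical sketch of the onFailure dispatch described above.
type CheckResult = { type: string; passed: boolean; details: string };
type OnFailure = "block" | "warn" | "retry_with_feedback";

function handleGuardrailFailures(
  results: CheckResult[],
  onFailure: OnFailure,
  guardrailRetriesUsed: number
): "complete" | "retry" {
  const failures = results.filter((r) => !r.passed);
  if (failures.length === 0) return "complete";
  switch (onFailure) {
    case "warn":
      // Log the failure but allow the step to complete with the output as-is.
      console.warn("guardrail failed:", failures.map((f) => f.type).join(", "));
      return "complete";
    case "block":
      // Fail the step immediately.
      throw new Error("GUARDRAIL_BLOCKED");
    case "retry_with_feedback":
      // Up to 2 guardrail retries before the step fails.
      if (guardrailRetriesUsed >= 2) throw new Error("GUARDRAIL_RETRIES_EXHAUSTED");
      // Caller appends the failure details as a user message and re-calls the model.
      return "retry";
  }
}
```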

Model Fallbacks

When fallbackModels is set, the step tries models in order:

  1. Try the primary model.
  2. If it fails (API error, timeout, etc.), try the first fallback.
  3. Continue through fallbackModels until one succeeds.
  4. If all models fail, the step fails with error code LLM_ALL_FAILED.

Each attempt -- successful or failed -- is recorded as an LLM call for cost tracking.
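The fallback sequence can be sketched as follows. callWithFallbacks, callModel, and recordCall are hypothetical names, and the sketch is synchronous for brevity:

```typescript
// Sketch of the fallback sequence described above; callModel and recordCall
// are hypothetical stand-ins for the real model call and cost tracking.
function callWithFallbacks(
  model: string,
  fallbackModels: string[],
  callModel: (model: string) => string,
  recordCall: (model: string, ok: boolean) => void
): string {
  // Try the primary model first, then each fallback in order.
  for (const candidate of [model, ...fallbackModels]) {
    try {
      const result = callModel(candidate);
      recordCall(candidate, true); // every attempt is recorded for cost tracking
      return result;
    } catch {
      recordCall(candidate, false); // failed attempts are recorded too
    }
  }
  // All models failed.
  throw new Error("LLM_ALL_FAILED");
}
```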

{
  "model": "gpt-4o",
  "fallbackModels": ["claude-sonnet-4-20250514", "gpt-4o-mini"]
}

Output Format

The completed step output includes the model response plus LLM metadata:

{
  "subject": "Quick question about your infrastructure",
  "body": "Hi Alice, I noticed...",
  "_llm": {
    "model": "gpt-4o",
    "inputTokens": 342,
    "outputTokens": 187
  }
}

When responseFormat is "json", the engine parses the response content as JSON (handling markdown code fences that models sometimes add). The parsed fields become top-level keys in the output and are also written to stateUpdates so subsequent steps can access them from workflowState.
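A minimal sketch of that parsing step; the fence-stripping regex is an assumption about how such cleanup is typically done, not Stevora's exact behavior:

```typescript
// Hypothetical sketch: strip an optional markdown code fence that models
// sometimes wrap around JSON output, then parse the remainder.
function parseJsonResponse(raw: string): Record<string, unknown> {
  const fenced = raw.trim().match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  const body = fenced ? fenced[1] : raw.trim();
  return JSON.parse(body) as Record<string, unknown>;
}
```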

When responseFormat is "text" (or unset), the output is { content: "..." }.