LLM Step
Orchestrate LLM calls with tools, guardrails, and fallbacks
The LLM step calls a language model with full support for tool use loops, guardrail checks, model fallbacks, structured outputs, and template variables. It is the core building block for AI agent workflows in Stevora.
Schema
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `type` | `"llm"` | Yes | | Step type discriminator |
| `name` | string (1–100 chars) | Yes | | Unique step name within the workflow |
| `model` | string | Yes | | Primary model identifier (e.g., `gpt-4o`, `claude-sonnet-4-20250514`) |
| `fallbackModels` | string[] | No | | Models to try if the primary model fails |
| `systemPrompt` | string | No | | System message prepended to the conversation |
| `messages` | LlmMessage[] | Yes | | Conversation messages (min 1) |
| `temperature` | number | No | | Sampling temperature (0–2) |
| `maxTokens` | number | No | | Maximum tokens in the response |
| `responseFormat` | `"json"` \| `"text"` | No | | Expected response format |
| `outputSchema` | Record<string, unknown> | No | | JSON Schema the LLM output must conform to |
| `tools` | LlmToolDef[] | No | | Tools the model can call |
| `guardrails` | GuardrailConfig | No | | Post-completion validation checks |
| `maxToolRounds` | number | No | 10 | Maximum tool call round-trips (1–20) |
| `retry` | RetryPolicy | No | | Retry policy for failed executions |
LlmMessage
| Field | Type | Required | Description |
|---|---|---|---|
| `role` | `"system"` \| `"user"` \| `"assistant"` | Yes | Message role |
| `content` | string | Yes | Message content |
LlmToolDef
| Field | Type | Required | Description |
|---|---|---|---|
name | string | Yes | Tool function name |
description | string | Yes | Description shown to the model |
parameters | Record<string, unknown> | Yes | JSON Schema for tool parameters |
GuardrailConfig
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `postChecks` | GuardrailCheck[] | No | `[]` | Checks to run after each LLM response |
| `onFailure` | `"block"` \| `"warn"` \| `"retry_with_feedback"` | No | `"warn"` | Action when a check fails |
GuardrailCheck
| Field | Type | Required | Description |
|---|---|---|---|
| `type` | `"content_safety"` \| `"schema_validation"` \| `"confidence_threshold"` | Yes | Check type |
| `config` | Record<string, unknown> | No | Check-specific config |
Configuration Example
```json
{
  "type": "llm",
  "name": "generate-outreach-email",
  "model": "gpt-4o",
  "fallbackModels": ["claude-sonnet-4-20250514"],
  "systemPrompt": "You are a sales development representative writing personalized outreach emails.",
  "messages": [
    {
      "role": "user",
      "content": "Write a cold outreach email to {{state.enrichedLead.name}} at {{state.enrichedLead.company}}. Their role is {{state.enrichedLead.title}}. Our product helps with {{input.valueProposition}}."
    }
  ],
  "temperature": 0.7,
  "maxTokens": 1000,
  "responseFormat": "json",
  "outputSchema": {
    "type": "object",
    "properties": {
      "subject": { "type": "string" },
      "body": { "type": "string" }
    },
    "required": ["subject", "body"]
  },
  "guardrails": {
    "postChecks": [
      { "type": "content_safety" },
      { "type": "schema_validation" }
    ],
    "onFailure": "retry_with_feedback"
  },
  "maxToolRounds": 5,
  "retry": {
    "maxAttempts": 3,
    "backoffMs": 2000,
    "backoffMultiplier": 2
  }
}
```
Template Variables
Message content and the system prompt support `{{variable}}` template syntax. Variables are resolved from the workflow context at execution time.
| Prefix | Source | Example |
|---|---|---|
state. | workflowState | {{state.enrichedLead.name}} |
input. | workflowInput | {{input.companyName}} |
The template engine uses dot-notation traversal. If a path resolves to undefined or null, it is replaced with an empty string. Unrecognized prefixes are left as-is in the output.
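The resolution rules above can be sketched as follows. This is a minimal illustration, not Stevora's actual internals; the function name `resolveTemplate` and the context shape are assumptions:

```typescript
// Minimal template resolver sketch: dot-notation traversal over the
// state./input. prefixes; undefined or null paths become "", and
// unrecognized prefixes are left as-is.
type Context = { state: Record<string, unknown>; input: Record<string, unknown> };

function resolveTemplate(template: string, ctx: Context): string {
  return template.replace(/\{\{([^}]+)\}\}/g, (match: string, path: string) => {
    const [prefix, ...rest] = path.trim().split(".");
    const root = prefix === "state" ? ctx.state : prefix === "input" ? ctx.input : undefined;
    if (root === undefined) return match; // unrecognized prefix: leave as-is
    // Walk the remaining path segments
    let value: unknown = root;
    for (const key of rest) {
      if (value == null || typeof value !== "object") { value = undefined; break; }
      value = (value as Record<string, unknown>)[key];
    }
    return value == null ? "" : String(value); // undefined/null -> empty string
  });
}
```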
```
// Template:
"Summarize this for {{state.recipientName}}: {{state.rawContent}}"

// With workflowState = { recipientName: "Alice", rawContent: "..." }
// Resolves to:
"Summarize this for Alice: ..."
```
Tool Use Loop
When tools are defined, the LLM step supports multi-round tool calling:
- The step sends the messages and tool definitions to the model.
- If the model responds with tool calls, Stevora executes them and appends the results as new messages.
- The updated conversation is sent back to the model for another round.
- This loop continues until the model responds without tool calls, or `maxToolRounds` is reached.

If the model exceeds `maxToolRounds` without producing a final response, the step fails with error code `MAX_TOOL_ROUNDS`.
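The loop can be sketched as follows. This is hypothetical runtime internals; `callModel` and `executeTool` stand in for the provider API and tool executor:

```typescript
// Sketch of a tool-use loop: call the model, execute any requested
// tools, feed results back, and stop on a final answer or round cap.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelReply = { content?: string; toolCalls?: ToolCall[] };
type Msg = { role: string; content: string };

async function runToolLoop(
  messages: Msg[],
  callModel: (msgs: Msg[]) => Promise<ModelReply>,
  executeTool: (call: ToolCall) => Promise<string>,
  maxToolRounds = 10,
): Promise<string> {
  for (let round = 0; round < maxToolRounds; round++) {
    const reply = await callModel(messages);
    if (!reply.toolCalls || reply.toolCalls.length === 0) {
      return reply.content ?? ""; // final response: loop ends
    }
    // Execute each requested tool and append its result to the conversation
    for (const call of reply.toolCalls) {
      const result = await executeTool(call);
      messages.push({ role: "tool", content: result });
    }
  }
  throw new Error("MAX_TOOL_ROUNDS"); // round cap exceeded without a final answer
}
```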
```json
{
  "type": "llm",
  "name": "research-company",
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "Research {{state.companyName}} and summarize their recent funding, team size, and tech stack."
    }
  ],
  "tools": [
    {
      "name": "search_web",
      "description": "Search the web for information",
      "parameters": {
        "type": "object",
        "properties": {
          "query": { "type": "string" }
        },
        "required": ["query"]
      }
    },
    {
      "name": "fetch_page",
      "description": "Fetch the content of a web page",
      "parameters": {
        "type": "object",
        "properties": {
          "url": { "type": "string" }
        },
        "required": ["url"]
      }
    }
  ],
  "maxToolRounds": 8,
  "responseFormat": "json"
}
```
Guardrails
Guardrails validate the LLM output before the step is marked complete. Three check types are available:
| Check Type | Description |
|---|---|
content_safety | Checks the response for harmful or unsafe content |
schema_validation | Validates the response against the outputSchema |
confidence_threshold | Checks if the model's response meets a confidence threshold |
Failure modes
The `onFailure` field controls what happens when a guardrail check fails:
- `warn` (default) -- Log the failure but allow the step to complete. The output is used as-is.
- `block` -- Fail the step immediately with error code `GUARDRAIL_BLOCKED`.
- `retry_with_feedback` -- Send the guardrail failure details back to the model as a user message and ask it to fix the response. Up to 2 guardrail retries are attempted before the step fails with `GUARDRAIL_RETRIES_EXHAUSTED`.
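A sketch of how `retry_with_feedback` might drive the completion loop. The names `completeWithGuardrails`, `generate`, and `runChecks` are hypothetical; the 2-retry cap matches the behavior described above:

```typescript
// Sketch: run post-checks, and on failure feed the details back to the
// model as a user message, up to 2 guardrail retries.
type CheckResult = { ok: boolean; details?: string };

async function completeWithGuardrails(
  generate: (feedback?: string) => Promise<string>,
  runChecks: (output: string) => Promise<CheckResult>,
): Promise<string> {
  const maxGuardrailRetries = 2;
  let feedback: string | undefined;
  for (let attempt = 0; attempt <= maxGuardrailRetries; attempt++) {
    const output = await generate(feedback);
    const result = await runChecks(output);
    if (result.ok) return output;
    // Failure details become the feedback for the next attempt
    feedback = `A guardrail check failed: ${result.details}. Please fix your response.`;
  }
  throw new Error("GUARDRAIL_RETRIES_EXHAUSTED");
}
```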
Model Fallbacks
When `fallbackModels` is set, the step tries models in order:
- Try the primary `model`.
- If it fails (API error, timeout, etc.), try the first fallback.
- Continue through `fallbackModels` until one succeeds.
- If all models fail, the step fails with error code `LLM_ALL_FAILED`.
Each attempt -- successful or failed -- is recorded as an LLM call for cost tracking.
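The fallback order can be sketched as follows (hypothetical names; `callModel` stands in for the provider call):

```typescript
// Sketch: try the primary model, then each fallback in order;
// surface LLM_ALL_FAILED only after every model has failed.
async function callWithFallbacks(
  models: string[], // primary first, then fallbackModels in order
  callModel: (model: string) => Promise<string>,
): Promise<string> {
  for (const model of models) {
    try {
      return await callModel(model); // each attempt is recorded for cost tracking
    } catch {
      // fall through to the next model
    }
  }
  throw new Error("LLM_ALL_FAILED");
}
```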
```json
{
  "model": "gpt-4o",
  "fallbackModels": ["claude-sonnet-4-20250514", "gpt-4o-mini"]
}
```
Output Format
The completed step output includes the model response plus LLM metadata:
```json
{
  "subject": "Quick question about your infrastructure",
  "body": "Hi Alice, I noticed...",
  "_llm": {
    "model": "gpt-4o",
    "inputTokens": 342,
    "outputTokens": 187
  }
}
```
When `responseFormat` is `"json"`, the engine parses the response content as JSON (handling markdown code fences that models sometimes add). The parsed fields become top-level keys in the output and are also written to `stateUpdates` so subsequent steps can access them from `workflowState`.
When `responseFormat` is `"text"` (or unset), the output is `{ content: "..." }`.
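The fence-tolerant JSON parsing can be sketched as follows (a minimal illustration, not Stevora's exact parser):

```typescript
// Sketch: strip an optional markdown code fence before parsing JSON.
function parseLlmJson(raw: string): unknown {
  const trimmed = raw.trim();
  // Match ```json ... ``` or ``` ... ``` wrappers around the payload
  const fenced = trimmed.match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  return JSON.parse(fenced ? fenced[1] : trimmed);
}
```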