🛡️ AI Agent Safety Dashboard

Control Panel
📊 Dashboard
🔴 Live Sessions
📋 Agent Registry
📝 Audit Log
⚙️ Settings

⚡ Intervention — Alert

Alert Context: Signal, Agent, ARN, Session, Tokens, Duration

⚡ Session Intervention — agent

Session Context: Agent, ARN, Session, Status, Tokens, Duration

⚙️ Safety Thresholds
Configure alert thresholds for cost and evaluation signals. Changes apply on next sync. Observability thresholds are controlled by CloudWatch alarm configurations.

💰 Cost Thresholds

ℹ️ How it works
Each agent gets an AWS Budget that tracks monthly spend. The Default Budget Amount sets the dollar limit — changing it updates all existing budgets immediately. Warning and Critical % control when the dashboard shows alert badges based on how much of the budget is used.
Default Budget Amount ($): Monthly budget for all agents (updates AWS Budgets on save)
Budget Warning %: Triggers a warning when budget usage exceeds this value
Budget Critical %: Triggers a critical alert when budget usage exceeds this value
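A minimal sketch of what these three settings could map to on the AWS side, assuming boto3 and a hypothetical per-agent budget naming scheme (the dashboard's actual resource names and wiring may differ):

```python
import boto3

budgets = boto3.client("budgets")

def budget_name(agent_name: str) -> str:
    # Hypothetical naming scheme for the per-agent AWS Budgets.
    return f"agent-{agent_name}-monthly"

def apply_default_budget(account_id: str, agent_name: str, amount_usd: float) -> None:
    """Saving Default Budget Amount updates each agent's existing budget in place."""
    budgets.update_budget(
        AccountId=account_id,
        NewBudget={
            "BudgetName": budget_name(agent_name),
            "BudgetLimit": {"Amount": str(amount_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
    )

def cost_badge(account_id: str, agent_name: str,
               warning_pct: float, critical_pct: float) -> str:
    """Warning/Critical % compare actual monthly spend against the budget limit."""
    budget = budgets.describe_budget(
        AccountId=account_id, BudgetName=budget_name(agent_name))["Budget"]
    spent = float(budget["CalculatedSpend"]["ActualSpend"]["Amount"])
    limit = float(budget["BudgetLimit"]["Amount"])
    used_pct = 100 * spent / limit
    if used_pct > critical_pct:
        return "critical"
    if used_pct > warning_pct:
        return "warning"
    return "ok"
```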

🧪 Evaluation Thresholds

Configure when the evaluation alarm fires for each agent. One CloudWatch alarm is created per agent that monitors bad responses across 7 built-in evaluators.
ℹ️ How it works
Each agent has one CloudWatch alarm that monitors bad responses across 7 built-in evaluators. The alarm sums bad counts from 5 of them (Harmfulness, Correctness, Goal Success, Tool Selection, Tool Parameters); CloudWatch evaluates every 15 minutes by aggregating bad counts over that window.
The alarm fires when the total bad count reaches the lowest non-zero threshold you set below. For example, Harmfulness = 1 and Correctness = 3 means the alarm fires at ≥ 1 total bad. Set a value to 0 to exclude that evaluator from the alarm.
The remaining 2 evaluators (Helpfulness and Faithfulness) are tracked and shown in the dashboard evaluator scores but don't contribute to the CloudWatch alarm. Saving updates all agent alarms immediately.
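As a rough illustration (not the dashboard's actual implementation), the per-agent alarm described above could be built with a CloudWatch metric-math expression; the namespace, metric names, dimensions, and alarm name below are assumptions:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# The 5 evaluators that feed the alarm; Helpfulness and Faithfulness are
# tracked for the dashboard only and are deliberately left out.
ALARM_EVALUATORS = ["Harmfulness", "Correctness", "GoalSuccess",
                    "ToolSelection", "ToolParameters"]

def update_evaluation_alarm(agent_name: str, thresholds: dict) -> None:
    """Rebuild the single per-agent evaluation alarm from per-evaluator thresholds.

    The structure (sum bad counts over 15-minute windows, fire at the lowest
    non-zero threshold) follows the description above.
    """
    included = {e: thresholds.get(e, 0) for e in ALARM_EVALUATORS
                if thresholds.get(e, 0) > 0}
    if not included:
        return  # every evaluator set to 0: nothing to alarm on

    # e.g. Harmfulness=1 and Correctness=3 -> the alarm fires at >= 1 total bad
    alarm_threshold = min(included.values())

    queries = []
    for i, evaluator in enumerate(included):
        queries.append({
            "Id": f"m{i}",
            "ReturnData": False,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AgentEvaluations",        # assumed namespace
                    "MetricName": f"{evaluator}BadCount",   # assumed metric name
                    "Dimensions": [{"Name": "AgentName", "Value": agent_name}],
                },
                "Period": 900,   # 15-minute aggregation window
                "Stat": "Sum",
            },
        })
    # Metric-math expression summing the per-evaluator bad counts.
    queries.append({
        "Id": "total_bad",
        "Expression": " + ".join(q["Id"] for q in queries),
        "ReturnData": True,
    })

    cloudwatch.put_metric_alarm(
        AlarmName=f"{agent_name}-evaluation-bad-responses",  # assumed alarm name
        Metrics=queries,
        EvaluationPeriods=1,
        Threshold=alarm_threshold,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        TreatMissingData="notBreaching",
    )
```

With the worked example above, update_evaluation_alarm("billing-agent", {"Harmfulness": 1, "Correctness": 3}) (hypothetical agent name) would produce an alarm that fires once a single bad response is recorded in a window.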
Per-Evaluator Bad Count Thresholds
Maximum bad responses allowed before the alarm fires. Set to 0 to exclude from alarm. Minimum active value: 1.
Harmfulness: Max harmful responses. Most sensitive — even 1 harmful response is typically critical. Min: 1
Correctness: Max incorrect or partially correct responses. Counts both "Incorrect" and "Partially Correct" labels. Min: 1
Goal Success Rate: Max goal failures (agent failed to achieve the user's goal). Min: 1
Helpfulness: Max unhelpful responses. Dashboard display only — does not affect CloudWatch alarm
Faithfulness: Max unfaithful responses (hallucinations). Dashboard display only — does not affect CloudWatch alarm
Tool Selection Accuracy: Max times agent picked the wrong tool for the task. Min: 1
Tool Parameter Accuracy: Max times agent passed wrong parameters to a tool. Min: 1
Dashboard Display Thresholds
Controls severity badges in dashboard tables. Uses bad percentage (bad ÷ total × 100), not raw counts. These do not affect CloudWatch alarms.
Dashboard Warning %: Show ⚠️ warning badge when bad % exceeds this value (default: 20%)
Dashboard Critical %: Show 🔴 critical badge when bad % exceeds this value (default: 50%)
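For instance, a minimal sketch of the percentage-based badge logic (the dashboard's actual rendering code may differ):

```python
def dashboard_badge(bad: int, total: int,
                    warning_pct: float = 20.0, critical_pct: float = 50.0) -> str:
    """Severity badge for dashboard tables; independent of the CloudWatch alarm."""
    if total == 0:
        return "ok"
    bad_pct = bad / total * 100          # bad ÷ total × 100
    if bad_pct > critical_pct:
        return "critical"                # 🔴
    if bad_pct > warning_pct:
        return "warning"                 # ⚠️
    return "ok"

# With the defaults: 3 bad of 10 (30%) shows ⚠️, 6 of 10 (60%) shows 🔴.
```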

📡 Observability Thresholds

ℹ️ How it works
Each agent gets 4 static-threshold alarms (latency, errors, tokens, invocations). An alarm fires when its metric exceeds the threshold in at least Datapoints to Alarm of the last Evaluation Periods 5-minute windows. Saving updates all agent alarms immediately.
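As a rough sketch, assuming boto3 and hypothetical namespace, metric, and alarm names, saving could translate to something like the following for each agent:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def update_observability_alarms(agent_name: str, thresholds: dict,
                                evaluation_periods: int,
                                datapoints_to_alarm: int) -> None:
    """Recreate the 4 static-threshold alarms for one agent.

    The window logic (Datapoints to Alarm out of Evaluation Periods
    5-minute periods) follows the description above.
    """
    assert datapoints_to_alarm <= evaluation_periods
    # (metric key, CloudWatch statistic) pairs for the four alarms.
    metrics = [("Latency", "Average"), ("Errors", "Sum"),
               ("TokenUsage", "Sum"), ("Invocations", "Sum")]
    for metric_name, stat in metrics:
        cloudwatch.put_metric_alarm(
            AlarmName=f"{agent_name}-{metric_name.lower()}",   # assumed naming
            Namespace="AgentObservability",                    # assumed namespace
            MetricName=metric_name,                            # assumed metric name
            Dimensions=[{"Name": "AgentName", "Value": agent_name}],
            Statistic=stat,
            Period=300,                               # 5-minute windows
            EvaluationPeriods=evaluation_periods,     # e.g. 3
            DatapointsToAlarm=datapoints_to_alarm,    # e.g. 2
            Threshold=thresholds[metric_name],
            ComparisonOperator="GreaterThanThreshold",
            TreatMissingData="notBreaching",
        )
```

With the defaults listed below, this would be called with thresholds {"Latency": 10000, "Errors": 5, "TokenUsage": 100000, "Invocations": 200}, evaluation_periods=3, and datapoints_to_alarm=2.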
Per-Metric Thresholds
Latency (ms): Alarm when average response time exceeds this (default: 10000ms = 10s)
Error Count: Alarm when errors per period exceed this (default: 5)
Token Usage: Alarm when tokens per period exceed this (default: 100000)
Invocation Count: Alarm when invocations per period exceed this (default: 200)
Evaluation Window
Evaluation Periods: Number of 5-min periods to evaluate (default: 3)
Datapoints to Alarm: How many periods must breach before the alarm fires (default: 2). Must be ≤ Evaluation Periods