Policies & Guardrails
Aberon enforces policies on AI agent behavior in real time. Policies are configured in the dashboard, so no agent code changes are needed.
Policy Types
1. PII Detection
Automatically detect and mask personally identifiable information in traces.
How it works:
- Powered by Microsoft Presidio NLP engine
- Scans agent inputs/outputs before storage
- Detected PII is replaced with type labels: [PERSON], [US_SSN], [IBAN_CODE]
- Raw PII is NEVER stored
Supported entity types:
PERSON, PHONE_NUMBER, EMAIL_ADDRESS, CREDIT_CARD, US_SSN, IBAN_CODE, IP_ADDRESS, LOCATION, DATE_TIME, NRP, MEDICAL_LICENSE
Example:
Agent output (original):
"Employee John Smith (SSN: 123-45-6789) will receive payment to account DE89370400440532013000."
Stored in Aberon:
"Employee [PERSON] (SSN: [US_SSN]) will receive payment to account [IBAN_CODE]."
Audit log records: pii_detected: ["PERSON", "US_SSN", "IBAN_CODE"], fields_masked: 3
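The redaction step in the example can be sketched roughly as follows. This is a minimal illustration that uses regular expressions instead of Presidio, so it only covers pattern-based entities; PERSON and LOCATION need an NLP model. The function name mask_pii and the patterns are assumptions made for this sketch, not Aberon's implementation.

```python
import re

# Illustrative patterns only (assumption); a real deployment would rely on
# Presidio's recognizers rather than hand-written regexes.
PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IBAN_CODE": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "EMAIL_ADDRESS": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def mask_pii(text, entity_types):
    """Replace each match with its type label and report what was found."""
    detected = []
    masked_count = 0
    for entity in entity_types:
        pattern = PATTERNS.get(entity)
        if pattern is None:
            continue  # entity type not covered by this sketch
        text, n = pattern.subn(f"[{entity}]", text)
        if n:
            detected.append(entity)
            masked_count += n
    return text, {"pii_detected": detected, "fields_masked": masked_count}


original = "SSN: 123-45-6789, account DE89370400440532013000."
masked, audit = mask_pii(original, ["US_SSN", "IBAN_CODE"])
print(masked)
print(audit)
```

Only the masked string and the audit summary are kept in this sketch; the original text is never written anywhere, which mirrors the "raw PII is never stored" rule above.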
Dashboard: Policies → Create → PII Detection
- Select entity types to detect
- Apply to specific agent or all agents
- Action: redact (mask and store) or block (reject the trace)
2. Tool Restriction
Block or require approval for specific tool calls.
Use case: Your support agent has access to search_kb, create_ticket, send_email, execute_sql. You want to block the last two.
Dashboard: Policies → Create → Tool Restriction
- List blocked tools: send_email, execute_sql, delete_user
- Action: block (instant) or require_approval (human decides)
- Apply to specific agent or all agents
What happens when an agent calls a blocked tool:
14:23:01 search_kb("refund policy") ✅ Allowed
14:23:03 create_ticket(customer=...) ✅ Allowed
14:23:05 send_email(to="client@...") ❌ BLOCKED
Dashboard shows a guardrail block notification with:
- Which agent attempted it
- Which tool was called
- Which policy blocked it
- Full trace link
Every block is recorded in the audit trail.
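To make the decision logic concrete, here is a minimal sketch of a tool restriction check. The ToolPolicy class, the check_tool_call function, and the record fields are hypothetical names chosen for illustration; only the tool names, the two actions, and the event names come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    name: str
    blocked_tools: frozenset
    action: str  # "block" or "require_approval"


def check_tool_call(policy, agent_id, tool_name, trace_id):
    """Return a decision record for a single tool call."""
    if tool_name not in policy.blocked_tools:
        return {"allowed": True, "event": "guardrail.passed"}
    if policy.action == "block":
        event = "guardrail.blocked"
    else:
        event = "guardrail.approval_requested"
    # The record carries what the notification lists: agent, tool, policy, trace.
    return {
        "allowed": False,
        "event": event,
        "agent": agent_id,
        "tool": tool_name,
        "policy": policy.name,
        "trace_id": trace_id,
    }


policy = ToolPolicy(
    name="no-outbound-actions",  # hypothetical policy name
    blocked_tools=frozenset({"send_email", "execute_sql", "delete_user"}),
    action="block",
)
decisions = {
    tool: check_tool_call(policy, "support-agent", tool, "trace-001")
    for tool in ["search_kb", "create_ticket", "send_email"]
}
for tool, decision in decisions.items():
    print(tool, "allowed" if decision["allowed"] else decision["event"])
```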
3. Cost Limit
Pause agent execution when cost exceeds a threshold.
Use case: A data analysis agent processes large datasets via GPT-4. A run normally costs $0.50, but the agent sometimes enters a loop and a single run costs $500.
Dashboard: Policies → Create → Cost Limit
- Set max cost: $50.00 per run
- Action: require_approval
- Timeout: 600 seconds (10 minutes for human to decide)
What happens:
Step 1: Parse dataset $2.30 ✅
Step 2: Summarize $18.40 ✅
Step 3: Cross-reference $31.20 ✅
Step 4: Generate report ⏳ PAUSED ($51.90 exceeds $50 limit)
Dashboard shows approval request:
- Current cost: $51.90
- Limit: $50.00
- [Approve] [Reject] [8:42 remaining]
If approved: Step 4 completes. Total: $64.00. Audit: "Approved by analyst@company.com"
If rejected: Agent stops. No additional cost.
If timeout: Agent stops. Policy default applies.
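The pause in the example follows from adding up per-step costs: $2.30 + $18.40 + $31.20 = $51.90, which is over the $50.00 limit before Step 4 starts. A minimal sketch of that check is below. The function name and return shape are assumptions, and the $12.10 cost for Step 4 is inferred from the $64.00 total rather than stated on this page.

```python
from decimal import Decimal


def run_with_cost_limit(steps, max_cost):
    """Run steps in order; pause before any step that would start over the limit.

    The check happens between steps, so a step that is already running
    is allowed to finish even if it pushes the total past max_cost.
    """
    total = Decimal("0")
    completed = []
    for name, cost in steps:
        if total > max_cost:
            return {"status": "paused", "paused_before": name,
                    "cost": total, "completed": completed}
        total += cost
        completed.append(name)
    return {"status": "completed", "paused_before": None,
            "cost": total, "completed": completed}


steps = [
    ("Parse dataset", Decimal("2.30")),
    ("Summarize", Decimal("18.40")),
    ("Cross-reference", Decimal("31.20")),
    ("Generate report", Decimal("12.10")),  # inferred from the $64.00 total
]
result = run_with_cost_limit(steps, Decimal("50.00"))
print(result["status"], result["paused_before"], result["cost"])
```

Decimal is used instead of float so that currency sums compare exactly against the limit.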
4. Approval Workflows (Human-in-the-Loop)
Any policy with action "require_approval" creates an approval request.
Approval lifecycle:
- Policy triggers → approval request created
- Agent pauses (SDK polls for decision)
- Human sees request in Dashboard → Approvals
- Human approves or rejects with optional reason
- Agent receives decision and continues or stops
- Everything recorded in audit trail
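The lifecycle above can be read as a small state machine: a request starts as pending and ends as approved, rejected, or expired. The sketch below illustrates those transitions. The ApprovalRequest class and its fields are hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApprovalRequest:
    policy: str
    created_at: float  # seconds
    timeout: float     # seconds allowed for a human decision
    status: str = "pending"
    decided_by: Optional[str] = None
    reason: Optional[str] = None

    def expire_if_due(self, now):
        """Move a pending request to 'expired' once the timeout has passed."""
        if self.status == "pending" and now - self.created_at >= self.timeout:
            self.status = "expired"
        return self.status

    def decide(self, approved, user, now, reason=None):
        """Record a human decision; ignored if the request is no longer pending."""
        self.expire_if_due(now)
        if self.status != "pending":
            return self.status
        self.status = "approved" if approved else "rejected"
        self.decided_by = user
        self.reason = reason
        return self.status


in_time = ApprovalRequest(policy="cost-limit", created_at=0.0, timeout=600.0)
in_time.decide(True, "analyst@company.com", now=78.0, reason="large dataset")

too_late = ApprovalRequest(policy="cost-limit", created_at=0.0, timeout=600.0)
too_late.decide(True, "analyst@company.com", now=601.0)

print(in_time.status, too_late.status)
```

Terminal states are final in this sketch: a decision that arrives after expiry does not overwrite the expired status.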
Dashboard: Approvals page shows:
- Pending approvals with countdown timer
- Approved/rejected history
- Who decided, when, why
SDK integration:
result = agent.check_guardrails(tool_name="send_email", trace_id=t.trace_id)
if result.requires_approval:
    pending = PendingApproval(client._transport, result.requires_approval)
    decision = pending.wait(timeout=120, poll_interval=3)
    # Returns when human approves, or raises ApprovalDeniedError / ApprovalExpiredError
Policy Scope
Policies can target:
- All agents — global policy (target_agent_id = None)
- Specific agent — only applies to one agent
- Agent + children — applies to agent and all its sub-agents (apply_to_children = True)
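The three scopes can be expressed as one matching function. This is a sketch under assumptions: the field names target_agent_id and apply_to_children come from the list above, while the Policy class, the policy_applies function, and the parent_of mapping are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Policy:
    name: str
    target_agent_id: Optional[str] = None  # None means all agents
    apply_to_children: bool = False


def policy_applies(policy, agent_id, parent_of):
    """parent_of maps each agent id to its parent id (None for top-level agents)."""
    if policy.target_agent_id is None:
        return True  # global policy
    if agent_id == policy.target_agent_id:
        return True  # direct target
    if not policy.apply_to_children:
        return False
    # Walk up the ancestry; `seen` guards against cycles in bad data.
    seen = set()
    current = parent_of.get(agent_id)
    while current is not None and current not in seen:
        if current == policy.target_agent_id:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


parent_of = {
    "support-agent": None,
    "email-subagent": "support-agent",
    "billing-agent": None,
}
scoped = Policy("support-only", target_agent_id="support-agent", apply_to_children=True)
print(policy_applies(scoped, "email-subagent", parent_of))
print(policy_applies(scoped, "billing-agent", parent_of))
```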
Policy Priority
When multiple policies apply to the same action, they are evaluated in priority order (lower number = higher priority). The first policy that blocks or requires approval wins, and lower-priority policies are not consulted for that action.
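A minimal sketch of this evaluation order, assuming each policy exposes a numeric priority and a check that returns "pass", "block", or "require_approval". The dictionary layout and the evaluate function are illustrative only.

```python
def evaluate(policies, action):
    """Return (outcome, policy_name) for the first policy that does not pass."""
    for policy in sorted(policies, key=lambda p: p["priority"]):
        outcome = policy["check"](action)
        if outcome in ("block", "require_approval"):
            return outcome, policy["name"]
    return "pass", None


policies = [
    {
        "name": "block-email",
        "priority": 20,
        "check": lambda a: "block" if a["tool"] == "send_email" else "pass",
    },
    {
        "name": "approve-email",
        "priority": 10,
        "check": lambda a: "require_approval" if a["tool"] == "send_email" else "pass",
    },
]

outcome = evaluate(policies, {"tool": "send_email"})
print(outcome)
```

Here both policies match send_email, but approve-email has the lower priority number, so the call waits for approval instead of being blocked outright.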
Audit Trail for Policies
Every policy action is recorded:
- guardrail.passed: check passed, agent proceeds
- guardrail.blocked: action blocked by policy
- guardrail.approval_requested: human approval needed
- approval.approved: human approved with reason
- approval.rejected: human rejected with reason
- approval.expired: no decision within timeout
All entries are part of the SHA-256 hash chain, which makes the audit trail tamper-evident: changing any past entry invalidates the hashes of every entry after it. Learn more about the immutable audit trail.
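As a rough illustration of how a SHA-256 hash chain detects tampering, the sketch below links each entry to the hash of the previous one. The entry layout, the genesis value, and the function names are assumptions; Aberon's actual record format is not described on this page.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash before the first entry (assumption)


def entry_hash(prev_hash, event):
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain, event):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, event),
    })


def verify(chain):
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append(chain, {"type": "guardrail.blocked", "tool": "send_email"})
append(chain, {"type": "guardrail.approval_requested", "policy": "cost-limit"})
append(chain, {"type": "approval.approved", "by": "analyst@company.com"})
valid_before = verify(chain)

chain[0]["event"]["tool"] = "search_kb"  # simulate tampering with history
valid_after = verify(chain)
print(valid_before, valid_after)
```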