Guardrails
For creating custom guardrails, see Creating Custom Guardrails. For the underlying safety primitives, see Safety Primitives.
Guardrails Architecture
AgentOS guardrails internals — input/output dispatcher, ALLOW/SANITIZE/BLOCK/FLAG verdicts, two-phase scanning, fail-open and fail-closed semantics
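As a rough illustration of how the four verdicts and the fail-open/fail-closed distinction can interact, here is a minimal sketch. The `Guard` type, `runGuards` function, and the severity ordering are assumptions made for this example; they are not the AgentOS dispatcher API, and two-phase scanning is not modeled.

```typescript
// Hypothetical types for illustration only; not the AgentOS API.
type Verdict = "ALLOW" | "FLAG" | "SANITIZE" | "BLOCK";

interface GuardResult {
  verdict: Verdict;
  text: string; // text to pass on (rewritten when the verdict is SANITIZE)
}

type Guard = (text: string) => GuardResult;

// Assumed ordering: the strictest verdict seen so far wins.
const SEVERITY: Record<Verdict, number> = { ALLOW: 0, FLAG: 1, SANITIZE: 2, BLOCK: 3 };

function runGuards(text: string, guards: Guard[], failClosed: boolean): GuardResult {
  let current: GuardResult = { verdict: "ALLOW", text };
  for (const guard of guards) {
    let result: GuardResult;
    try {
      result = guard(current.text);
    } catch {
      // Fail-closed: a crashing guard blocks the content.
      if (failClosed) return { verdict: "BLOCK", text: "" };
      // Fail-open: a crashing guard is skipped.
      continue;
    }
    // BLOCK short-circuits the remaining guards.
    if (result.verdict === "BLOCK") return { verdict: "BLOCK", text: "" };
    const verdict =
      SEVERITY[result.verdict] > SEVERITY[current.verdict] ? result.verdict : current.verdict;
    // Only a SANITIZE verdict is allowed to rewrite the text.
    const nextText = result.verdict === "SANITIZE" ? result.text : current.text;
    current = { verdict, text: nextText };
  }
  return current;
}

// Two example guards: one masks digits, one simulates a scanner outage.
const maskDigits: Guard = (t) =>
  /\d/.test(t)
    ? { verdict: "SANITIZE", text: t.replace(/\d/g, "#") }
    : { verdict: "ALLOW", text: t };
const brokenGuard: Guard = () => {
  throw new Error("scanner unavailable");
};
```

With these example guards, `runGuards("pin 1234", [maskDigits, brokenGuard], false)` passes the sanitized text through despite the failing guard, while the same call with `failClosed` set to `true` returns a BLOCK verdict.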
Human-in-the-Loop (HITL)
Five approval triggers, six handler factories (cli, slack, webhook, llmJudge, autoApprove, autoReject), the workflow human step, and the runtime HumanInteractionManager. Pause AgentOS agent runs at any lifecycle event for human review.
PII Redaction (and PHI scrubbing)
Four-tier PII detection and redaction guardrail for AgentOS — regex, NER, optional LLM judge, and redaction engine — covering 18 entity types including SSN, payment information, dates of birth, names, locations, and clinical terminology. Designed to slot into HIPAA-PHI scrubbing pipelines without claiming HIPAA compliance itself.
Creating Custom Guardrails
This guide walks you through everything you need to create, package, test, and deploy a custom guardrail for AgentOS. By the end you will understand the full lifecycle, from implementing the IGuardrailService interface to publishing a self-contained extension pack.
Safety Primitives
Seven operational safety primitives that wrap every AgentOS LLM call, including a killswitch, cost guard, circuit breaker, provider health registry, stuck detection, and action audit log. They prevent runaway loops, money fires, and zombie agents, and can be used independently or as one guard chain via wrapLLMCallback().
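To make the idea of a guard chain concrete, the sketch below wraps a callback with three of the checks named above (killswitch, cost guard, circuit breaker). Everything here is hypothetical: `wrapWithGuards`, `GuardState`, and the synchronous `LLMCall` shape are stand-ins chosen for brevity and do not reflect the actual signature of wrapLLMCallback(); a real LLM callback would normally be asynchronous.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the AgentOS API.
type LLMCall = (prompt: string) => { text: string; costCents: number };

interface GuardState {
  killed: boolean;     // killswitch flag
  spentCents: number;  // running cost total
  budgetCents: number; // cost guard ceiling
  failures: number;    // consecutive failures counted by the circuit breaker
  maxFailures: number; // breaker opens once this count is reached
}

function wrapWithGuards(call: LLMCall, state: GuardState): LLMCall {
  return (prompt) => {
    // Each check runs before the underlying call is attempted.
    if (state.killed) throw new Error("killswitch engaged");
    if (state.spentCents >= state.budgetCents) throw new Error("cost budget exhausted");
    if (state.failures >= state.maxFailures) throw new Error("circuit open");
    try {
      const result = call(prompt);
      state.spentCents += result.costCents;
      state.failures = 0; // a success resets the breaker
      return result;
    } catch (err) {
      state.failures += 1;
      throw err;
    }
  };
}

// Example: each call costs 2 cents against a 3-cent budget.
const demoState: GuardState = {
  killed: false,
  spentCents: 0,
  budgetCents: 3,
  failures: 0,
  maxFailures: 2,
};
const guardedCall = wrapWithGuards(() => ({ text: "ok", costCents: 2 }), demoState);
```

In this example the first two calls to `guardedCall` succeed, because the budget is checked before each call; the third is rejected once the recorded spend has passed the ceiling.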