📄️Guardrails

For creating custom guardrails, see Creating Custom Guardrails. For the underlying safety primitives, see Safety Primitives.

📄️Guardrails Architecture

AgentOS guardrails internals — input/output dispatcher, ALLOW/SANITIZE/BLOCK/FLAG verdicts, two-phase scanning, fail-open and fail-closed semantics
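To make the verdict and failure semantics above concrete, here is a minimal sketch of a dispatcher loop. Only the four verdict names and the fail-open/fail-closed terms come from the description; the `Guardrail`, `GuardrailResult`, and `dispatch` names are hypothetical and are not the AgentOS API.

```typescript
// Hypothetical types: only the verdict names are taken from the docs.
type Verdict = "ALLOW" | "SANITIZE" | "BLOCK" | "FLAG";

interface GuardrailResult {
  verdict: Verdict;
  text?: string; // replacement text, meaningful for SANITIZE
  reason?: string; // note recorded for FLAG
}

type Guardrail = (text: string) => GuardrailResult;

interface DispatchOutcome {
  blocked: boolean;
  text: string;
  flags: string[];
}

// Runs guardrails in order. A guardrail that throws is either skipped
// (fail-open) or treated as a BLOCK (fail-closed).
function dispatch(
  input: string,
  guardrails: Guardrail[],
  onError: "fail-open" | "fail-closed",
): DispatchOutcome {
  let text = input;
  const flags: string[] = [];
  for (const guardrail of guardrails) {
    let result: GuardrailResult;
    try {
      result = guardrail(text);
    } catch {
      if (onError === "fail-closed") {
        return { blocked: true, text: "", flags };
      }
      continue; // fail-open: ignore the broken guardrail
    }
    switch (result.verdict) {
      case "BLOCK":
        return { blocked: true, text: "", flags };
      case "SANITIZE":
        text = result.text ?? text; // later guardrails see the sanitized text
        break;
      case "FLAG":
        flags.push(result.reason ?? "flagged"); // record, but let it through
        break;
      case "ALLOW":
        break;
    }
  }
  return { blocked: false, text, flags };
}
```

In this sketch, fail-open favors availability (a crashed scanner does not stop traffic), while fail-closed favors safety (a crashed scanner stops everything). Which default AgentOS uses for a given guardrail is covered in the Guardrails Architecture page, not here.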

📄️Human-in-the-Loop (HITL)

Five approval triggers, six handler factories (cli, slack, webhook, llmJudge, autoApprove, autoReject), the workflow human step, and the runtime HumanInteractionManager. Pause AgentOS agent runs at any lifecycle event for human review.

📄️PII Redaction (and PHI scrubbing)

Four-tier PII detection and redaction guardrail for AgentOS — regex, NER, optional LLM judge, and redaction engine — covering 18 entity types including SSN, payment information, dates of birth, names, locations, and clinical terminology. Designed to slot into HIPAA-PHI scrubbing pipelines without claiming HIPAA compliance itself.
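As an illustration of what the first (regex) tier of such a pipeline might look like, the sketch below replaces matches with bracketed labels. The patterns, labels, and the `redactWithRegex` function are assumptions made for this example; they are not the patterns or API that the AgentOS guardrail ships with, and real detection needs far broader coverage.

```typescript
// Hypothetical patterns: a US-formatted SSN and a numeric MM/DD/YYYY-style date.
const PATTERNS: Array<{ label: string; regex: RegExp }> = [
  { label: "SSN", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "DATE", regex: /\b\d{2}\/\d{2}\/\d{4}\b/g },
];

interface Redaction {
  label: string;
  count: number;
}

// Replaces every match with "[LABEL]" and reports how many of each were found.
function redactWithRegex(text: string): { text: string; found: Redaction[] } {
  let out = text;
  const found: Redaction[] = [];
  for (const { label, regex } of PATTERNS) {
    let count = 0;
    out = out.replace(regex, () => {
      count += 1;
      return `[${label}]`;
    });
    if (count > 0) found.push({ label, count });
  }
  return { text: out, found };
}
```

A regex tier like this is cheap and precise for rigidly formatted identifiers, which is presumably why free-form entities such as names and locations are left to the NER and optional LLM-judge tiers.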

📄️Creating Custom Guardrails

This guide walks you through everything you need to create, package, test, and deploy a custom guardrail for AgentOS. By the end you will understand the full lifecycle -- from implementing the IGuardrailService interface to publishing a self-contained extension pack.

📄️Safety Primitives

Seven operational safety primitives that wrap every AgentOS LLM call, including killswitch, cost guard, circuit breaker, provider health registry, stuck detection, and action audit log. Prevent runaway loops, money fires, and zombie agents, using each primitive independently or all of them as one guard chain via wrapLLMCallback().
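The idea of composing primitives into one guard chain can be sketched as follows. This is not the signature of wrapLLMCallback(); the `Guard` interface, the `KillSwitch` and `CostGuard` classes, and `guardedCall` are hypothetical names used only to show how independent checks might run before each call.

```typescript
// Hypothetical guard contract: check() throws to veto the next call.
interface Guard {
  check(): void;
}

class KillSwitch implements Guard {
  private engaged = false;
  engage(): void {
    this.engaged = true;
  }
  check(): void {
    if (this.engaged) throw new Error("killswitch engaged");
  }
}

class CostGuard implements Guard {
  private spentUsd = 0;
  private readonly limitUsd: number;
  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }
  // Called by the caller after each LLM call with that call's cost.
  record(costUsd: number): void {
    this.spentUsd += costUsd;
  }
  check(): void {
    if (this.spentUsd >= this.limitUsd) throw new Error("cost limit reached");
  }
}

// Runs every guard in order, then the wrapped call if none vetoed it.
function guardedCall<T>(guards: Guard[], call: () => T): T {
  for (const guard of guards) guard.check();
  return call();
}
```

Because each guard only exposes `check()`, any of them can also be used on its own, which mirrors the "independently or as one guard chain" wording above.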
