Guardrails and Security
Guardrails are controls that keep AI behavior within allowed boundaries.
Guardrail layers
- Input guardrails: block unsafe/malicious prompts.
- Processing guardrails: restrict which tools and data the model can access.
- Output guardrails: moderate and validate responses.
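The three layers above can be sketched as three small checks wrapped around a tool call. This is a minimal illustration, not a production design: the deny patterns, the tool names in ALLOWED_TOOLS, the MAX_OUTPUT_CHARS limit, and the guarded_call helper are all hypothetical, and a real system would likely use a trained classifier or a moderation service rather than regular expressions.

```python
import re

# Hypothetical deny patterns for the input layer (assumption, not a complete list).
BLOCKED_INPUT_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"reveal .*system prompt", re.IGNORECASE),
]

# Hypothetical allowlist for the processing layer.
ALLOWED_TOOLS = {"search_docs", "get_weather"}

# Hypothetical size limit for the output layer.
MAX_OUTPUT_CHARS = 2000


class GuardrailViolation(Exception):
    """Raised when any guardrail layer rejects a request."""


def check_input(prompt: str) -> str:
    """Input guardrail: reject prompts that match a deny pattern."""
    for pattern in BLOCKED_INPUT_PATTERNS:
        if pattern.search(prompt):
            raise GuardrailViolation("input blocked: matched deny pattern")
    return prompt


def check_tool_call(tool_name: str) -> str:
    """Processing guardrail: allow only tools on the allowlist."""
    if tool_name not in ALLOWED_TOOLS:
        raise GuardrailViolation(f"tool not allowed: {tool_name}")
    return tool_name


def check_output(response: str) -> str:
    """Output guardrail: reject empty or oversized responses."""
    if not response.strip():
        raise GuardrailViolation("output blocked: empty response")
    if len(response) > MAX_OUTPUT_CHARS:
        raise GuardrailViolation("output blocked: response too long")
    return response


def guarded_call(prompt: str, tool_name: str, run_tool) -> str:
    """Run one tool call with all three layers applied in order."""
    check_input(prompt)
    check_tool_call(tool_name)
    return check_output(run_tool(prompt))
```

A call such as guarded_call("hi", "search_docs", my_tool) returns the tool's result only if all three checks pass; any failure raises GuardrailViolation so the caller can decide how to respond.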
Security basics
- Use least privilege for tools and APIs.
- Never expose secrets in prompts or logs.
- Mask PII and sensitive enterprise data.
- Audit all critical tool calls.
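One way these four basics might fit together in code is sketched below. Everything named here is an assumption made for illustration: the role names, the tool names, the "tool_audit" logger name, and the two PII patterns (email addresses and US-style SSNs), which cover only a small fraction of real sensitive data.

```python
import json
import logging
import os
import re

# Audit logger; attaching handlers and storage is left to the application.
audit_log = logging.getLogger("tool_audit")

# Hypothetical PII patterns (assumption: emails and US-style SSNs only).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Least privilege: each hypothetical role gets only the tools it needs.
ROLE_TOOLS = {
    "support_agent": {"lookup_order"},
    "admin_agent": {"lookup_order", "issue_refund"},
}

# Hypothetical set of tools considered critical enough to audit.
CRITICAL_TOOLS = {"issue_refund"}


def mask_pii(text: str) -> str:
    """Replace matched emails and SSNs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return SSN_RE.sub("[SSN]", text)


def load_secret(name: str) -> str:
    """Read a secret from the environment instead of a prompt or source file."""
    value = os.environ.get(name)
    if value is None:
        raise KeyError(f"secret not set: {name}")
    return value


def call_tool(role: str, tool_name: str, args: dict, tools: dict):
    """Enforce the role's allowlist, audit critical calls, then run the tool."""
    if tool_name not in ROLE_TOOLS.get(role, set()):
        raise PermissionError(f"{role} may not call {tool_name}")
    if tool_name in CRITICAL_TOOLS:
        # Mask argument values before logging so PII does not reach the log.
        safe_args = {key: mask_pii(str(value)) for key, value in args.items()}
        audit_log.info(json.dumps({"role": role, "tool": tool_name, "args": safe_args}))
    return tools[tool_name](**args)
```

Here the tools dictionary maps tool names to plain callables. Secrets stay out of prompts and logs because they are read from the environment only at the point of use, and the audit record carries masked argument values rather than the raw ones.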