Blast Radius Policies
A BlastRadiusPolicy is a cluster-scoped guardrail that limits what chaos experiments can do. Before any experiment runs, the policy is evaluated as a 7-step chain. If a step rejects the experiment, the experiment is blocked (in Enforce mode) or the violation is logged and the experiment proceeds (in Audit mode).
Why policies matter
Without guardrails, a misconfigured experiment could target production namespaces, kill too many pods at once, or run at 3am. Policies let platform teams define safe boundaries while giving developers freedom to experiment within them.
The 7-step evaluation chain
- Namespace scope - Is the target namespace in the allowed list?
- Label scope - Does the target match the label selector?
- Action type - Is the action type in the allowed list?
- Max targets - Does the experiment exceed the absolute target count?
- Max percentage - Does the experiment exceed the percentage limit?
- Time windows - Is the current time inside an allowed window and outside every blocked window?
- Audit mode - If Audit, log and allow; if Enforce, block
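The chain above can be read as a short-circuiting sequence of checks followed by the enforcement decision. The Python sketch below is only an illustration of that reading, not ChaosPlane's implementation: `Decision`, `evaluate`, the check names, and the shape of the experiment dict are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical model: a check returns None when the experiment passes,
# or a reason string when the experiment is rejected.
Check = Callable[[dict], Optional[str]]


@dataclass
class Decision:
    allowed: bool
    rejected_by: Optional[str] = None  # name of the step that rejected, if any
    reason: Optional[str] = None


def evaluate(experiment: dict, checks: List[Tuple[str, Check]], enforcement: str) -> Decision:
    """Run the checks in order and stop at the first rejection."""
    for name, check in checks:
        reason = check(experiment)
        if reason is None:
            continue
        # Last step of the chain: Audit logs and allows, Enforce blocks.
        if enforcement == "Audit":
            print(f"audit: step {name!r} would reject: {reason}")
            return Decision(True, name, reason)
        return Decision(False, name, reason)
    return Decision(True)


# Two illustrative checks; a full chain would list all six checks in order.
checks = [
    ("action-type", lambda e: None if e["action"] in {"pod-kill", "network-delay"} else "action not allowed"),
    ("max-targets", lambda e: None if e["targets"] <= 2 else "too many targets"),
]

print(evaluate({"action": "pod-kill", "targets": 5}, checks, "Enforce"))
```

Keeping the enforcement decision at the very end means the same checks serve both modes, which is what makes switching a policy from Audit to Enforce a one-line change.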
Basic policy
apiVersion: chaos.chaosplane.io/v1alpha1
kind: BlastRadiusPolicy
metadata:
  name: production-guardrails
spec:
  enforcement: Enforce
  scope:
    namespaces:
      - production
      - staging
  targetLimits:
    maxTargets: 2
    maxPercentage: 20
  protectedResources:
    namespaces:
      - kube-system
      - chaosplane
    labels:
      chaosplane.io/protected: "true"
    names:
      - kind: Deployment
        name: payment-service
        namespace: production
  actionLimits:
    allowedActions:
      - pod-kill
      - network-delay
      - pod-cpu-stress
    maxDuration: 10m
  timeWindows:
    allowed:
      - name: business-hours
        schedule: "0 9 * * 1-5"
        duration: 8h
        timezone: UTC
    blocked:
      - name: peak-traffic
        schedule: "0 18 * * 1-5"
        duration: 2h
        timezone: America/New_York
Enforcement modes
Enforce
Experiments that violate the policy are rejected. The experiment moves to the Failed phase with a message explaining which policy step blocked it.
spec:
  enforcement: Enforce
Audit
Violations are logged but experiments are allowed to proceed. Use this when rolling out a new policy to understand its impact before enforcing it.
spec:
  enforcement: Audit
Scope
The scope field defines which experiments this policy applies to. A policy only evaluates experiments targeting resources within its scope.
spec:
  scope:
    namespaces:
      - production
    labelSelector:
      matchLabels:
        environment: production
If namespaces is empty, the policy applies to all namespaces. If labelSelector is empty, the policy applies to targets regardless of their labels.
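The scope rules can be sketched as a small predicate. This is a hedged illustration rather than the controller's actual logic: the function name `in_scope` is hypothetical, and the sketch assumes that the namespace list and the label selector must both match (a logical AND), which the text above does not state explicitly.

```python
def in_scope(target_namespace: str, target_labels: dict, scope: dict) -> bool:
    """Return True if a target falls inside a policy's scope.

    Assumption: namespaces and labelSelector are combined with AND.
    """
    namespaces = scope.get("namespaces") or []
    match_labels = (scope.get("labelSelector") or {}).get("matchLabels") or {}
    # An empty namespace list means "all namespaces".
    namespace_ok = not namespaces or target_namespace in namespaces
    # An empty selector matches every target; all() over no items is True.
    labels_ok = all(target_labels.get(key) == value for key, value in match_labels.items())
    return namespace_ok and labels_ok


scope = {
    "namespaces": ["production"],
    "labelSelector": {"matchLabels": {"environment": "production"}},
}
print(in_scope("production", {"environment": "production", "app": "web"}, scope))
print(in_scope("staging", {"environment": "production"}, scope))
```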
Target limits
Limit how many resources a single experiment can affect:
spec:
  targetLimits:
    maxTargets: 3       # absolute maximum
    maxPercentage: 25   # percentage of matching resources
Both limits are evaluated. The experiment is blocked if it would exceed either one.
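Because both limits apply, the effective cap is whichever limit is smaller for the current number of matching resources. The sketch below works through that arithmetic with the values from the snippet above. The function names are hypothetical, and rounding the percentage down is an assumption; the actual rounding behavior is not specified here.

```python
def effective_limit(matching: int, max_targets: int, max_percentage: int) -> int:
    """Largest target count that satisfies both limits.

    Assumption: the percentage limit is rounded down to a whole resource.
    """
    by_percentage = matching * max_percentage // 100
    return min(max_targets, by_percentage)


def exceeds_limits(requested: int, matching: int, max_targets: int, max_percentage: int) -> bool:
    """True if an experiment asking for `requested` targets would be blocked."""
    return requested > effective_limit(matching, max_targets, max_percentage)


# maxTargets: 3, maxPercentage: 25
print(effective_limit(10, 3, 25))  # 25% of 10 is 2.5, rounded down to 2, which is below 3
print(effective_limit(40, 3, 25))  # 25% of 40 is 10, so the absolute limit of 3 wins
```

Under the round-down assumption, a small workload can end up with an effective limit of zero (for example, 3 matching pods at 25%), so percentage limits deserve a second look for low replica counts.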
Protected resources
Resources that can never be targeted, regardless of the experiment spec:
spec:
  protectedResources:
    # Entire namespaces
    namespaces:
      - kube-system
      - monitoring
    # Resources with specific labels
    labels:
      chaosplane.io/protected: "true"
      tier: database
    # Specific named resources
    names:
      - kind: Pod
        name: critical-singleton
        namespace: production
      - kind: Node
        name: control-plane-1
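One way to read the three kinds of protection is as alternatives: a resource is protected if any one of them matches. The sketch below assumes exactly that, including that a single matching label is enough. The function name `is_protected` and the dict shapes are hypothetical, and the real matching rules may differ.

```python
def is_protected(resource: dict, protected: dict) -> bool:
    """Return True if the resource matches any protection rule.

    Assumption: namespaces, labels and names are ORed, and any one
    matching label protects the resource.
    """
    if resource.get("namespace") in protected.get("namespaces", []):
        return True
    labels = resource.get("labels", {})
    if any(labels.get(key) == value for key, value in protected.get("labels", {}).items()):
        return True
    for entry in protected.get("names", []):
        # Cluster-scoped kinds such as Node carry no namespace on either side.
        if (entry["kind"] == resource["kind"]
                and entry["name"] == resource["name"]
                and entry.get("namespace") == resource.get("namespace")):
            return True
    return False


protected = {
    "namespaces": ["kube-system", "monitoring"],
    "labels": {"chaosplane.io/protected": "true", "tier": "database"},
    "names": [
        {"kind": "Pod", "name": "critical-singleton", "namespace": "production"},
        {"kind": "Node", "name": "control-plane-1"},
    ],
}
print(is_protected({"kind": "Node", "name": "control-plane-1"}, protected))
```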
Action limits
Restrict which action types are allowed and cap experiment duration:
spec:
  actionLimits:
    allowedActions:
      - pod-kill
      - network-delay
      - pod-cpu-stress
    maxDuration: 5m
If allowedActions is empty, all actions are permitted. maxDuration applies to the experiment's spec.duration.
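The two action limits can be sketched as a single check. The names `parse_duration` and `check_action` are hypothetical, and the parser assumes durations are a whole number followed by `s`, `m`, or `h`, as in the examples on this page; compound forms are not handled.

```python
import re
from datetime import timedelta
from typing import Optional

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_duration(text: str) -> timedelta:
    """Parse simple durations such as '30s', '5m' or '8h' (assumed format)."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def check_action(action: str, duration: str, limits: dict) -> Optional[str]:
    """Return a rejection reason, or None if the experiment passes."""
    allowed = limits.get("allowedActions") or []
    # An empty allowedActions list permits every action type.
    if allowed and action not in allowed:
        return f"action {action!r} is not in allowedActions"
    max_duration = limits.get("maxDuration")
    if max_duration and parse_duration(duration) > parse_duration(max_duration):
        return f"duration {duration} exceeds maxDuration {max_duration}"
    return None


limits = {"allowedActions": ["pod-kill", "network-delay", "pod-cpu-stress"], "maxDuration": "5m"}
print(check_action("pod-kill", "10m", limits))
```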
Time windows
Control when experiments can run using cron expressions.
Allowed windows
Experiments can only run during these windows:
spec:
  timeWindows:
    allowed:
      - name: business-hours-utc
        schedule: "0 9 * * 1-5"   # 9am Monday-Friday
        duration: 8h
        timezone: UTC
      - name: weekend-testing
        schedule: "0 10 * * 6"    # 10am Saturday
        duration: 4h
        timezone: America/Los_Angeles
Blocked windows
Experiments are blocked during these windows. Blocked windows take precedence over allowed windows:
spec:
  timeWindows:
    blocked:
      - name: deployment-freeze
        schedule: "0 17 * * 5"    # 5pm Friday
        duration: 64h             # through the weekend, ending 9am Monday
        timezone: UTC
      - name: peak-hours
        schedule: "0 12 * * 1-5"  # noon weekdays
        duration: 2h
        timezone: America/New_York
The schedule field uses standard 5-field cron syntax: minute hour day-of-month month day-of-week.
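A window opens at each time the cron expression fires and stays open for `duration`. The sketch below shows one way to test whether a given instant falls inside such a window, and how blocked windows override allowed ones. It is deliberately limited: it only supports schedules with a numeric minute and hour, `*` for day-of-month and month, and a day-of-week field using 0-6 with 0 as Sunday. The function names are hypothetical, and treating an empty allowed list as "any time" is an assumption.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import List, Tuple

Window = Tuple[str, timedelta, tzinfo]  # (schedule, duration, timezone)


def _dow_matches(field: str, cron_dow: int) -> bool:
    """Match a cron day-of-week field such as '*', '6', '1-5' or '1,3,5'."""
    if field == "*":
        return True
    for part in field.split(","):
        low, _, high = part.partition("-")
        if int(low) <= cron_dow <= int(high or low):
            return True
    return False


def in_window(now: datetime, schedule: str, duration: timedelta, tz: tzinfo) -> bool:
    """True if timezone-aware `now` lies in [start, start + duration) for some start."""
    minute, hour, dom, month, dow = schedule.split()
    if dom != "*" or month != "*":
        raise ValueError("this sketch only handles '*' for day-of-month and month")
    local_now = now.astimezone(tz)
    # An open window started at most `duration` ago, so look back that many days.
    for days_back in range(duration.days + 2):
        day = (local_now - timedelta(days=days_back)).date()
        start = datetime(day.year, day.month, day.day, int(hour), int(minute), tzinfo=tz)
        cron_dow = (start.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
        start_utc = start.astimezone(timezone.utc)
        if _dow_matches(dow, cron_dow) and start_utc <= now < start_utc + duration:
            return True
    return False


def time_allowed(now: datetime, allowed: List[Window], blocked: List[Window]) -> bool:
    """Blocked windows win; with no allowed windows, any unblocked time passes (assumption)."""
    if any(in_window(now, *window) for window in blocked):
        return False
    return not allowed or any(in_window(now, *window) for window in allowed)


business_hours = ("0 9 * * 1-5", timedelta(hours=8), timezone.utc)
freeze = ("0 17 * * 5", timedelta(hours=64), timezone.utc)
monday_8am = datetime(2024, 1, 15, 8, 0, tzinfo=timezone.utc)
print(in_window(monday_8am, *freeze))  # True: 64h from Friday 17:00 runs until Monday 09:00
```

For a named zone such as America/New_York, a `zoneinfo.ZoneInfo("America/New_York")` object from the Python standard library can be passed as `tz`.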
Multiple policies
Multiple policies can apply to the same experiment. All policies are evaluated, and the experiment is blocked if any one of them rejects it.
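Combining several policies can be sketched as collecting every rejection and blocking if any of them is enforced. This illustration assumes that a rejection from a policy in Audit mode is logged but does not block, in line with the enforcement modes described earlier; the function name `combine` and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

# (policy name, enforcement mode, rejection reason or None)
PolicyResult = Tuple[str, str, Optional[str]]


def combine(results: List[PolicyResult]) -> Tuple[bool, List[str], List[str]]:
    """Return (allowed, blocking policy names, audit-only policy names)."""
    blocking = [name for name, mode, reason in results if reason and mode == "Enforce"]
    audited = [name for name, mode, reason in results if reason and mode == "Audit"]
    # All policies are evaluated; one enforced rejection is enough to block.
    return (not blocking, blocking, audited)


results = [
    ("dev-policy", "Audit", "too many targets"),
    ("prod-policy", "Enforce", None),
]
print(combine(results))
```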
Audit-first workflow
The recommended approach for new policies:
- Deploy in Audit mode
- Run experiments and observe which ones would be blocked
- Adjust the policy as needed
- Switch to Enforce mode
# Check policy evaluation results in audit mode
chaosctl events -n production | grep BlastRadiusPolicy
Example: development vs production
# Permissive policy for development
---
apiVersion: chaos.chaosplane.io/v1alpha1
kind: BlastRadiusPolicy
metadata:
  name: dev-policy
spec:
  enforcement: Audit
  scope:
    namespaces: [development, staging]
  targetLimits:
    maxTargets: 10
    maxPercentage: 50
---
# Strict policy for production
apiVersion: chaos.chaosplane.io/v1alpha1
kind: BlastRadiusPolicy
metadata:
  name: prod-policy
spec:
  enforcement: Enforce
  scope:
    namespaces: [production]
  targetLimits:
    maxTargets: 1
    maxPercentage: 10
  protectedResources:
    namespaces: [kube-system]
    labels:
      tier: database
  actionLimits:
    allowedActions: [pod-kill, network-delay]
    maxDuration: 5m
  timeWindows:
    allowed:
      - name: chaos-hours
        schedule: "0 10 * * 2-4"
        duration: 4h
        timezone: UTC