Your First Experiment
This tutorial walks you through creating and running a pod-kill experiment against a sample application. You'll see the full lifecycle: steady-state check, chaos injection, recovery, and verification.
Deploy a sample app
First, deploy something to target. We'll use a simple nginx deployment:
kubectl create namespace demo
kubectl create deployment nginx --image=nginx:latest --replicas=3 -n demo
kubectl wait --for=condition=available deployment/nginx -n demo --timeout=60s
Verify it's running:
kubectl get pods -n demo
# NAME READY STATUS RESTARTS AGE
# nginx-7d9f8b6c4-abc12 1/1 Running 0 30s
# nginx-7d9f8b6c4-def34 1/1 Running 0 30s
# nginx-7d9f8b6c4-ghi56 1/1 Running 0 30s
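If you'd rather check readiness from a script than by eye, the sketch below counts pods whose Ready condition is True. `count_ready` is a hypothetical helper name, not part of kubectl or ChaosPlane; it relies only on kubectl's JSONPath output.

```shell
# Print the number of pods whose Ready condition is "True".
# $1 = namespace, $2 = label selector (for example app=nginx)
count_ready() {
  kubectl get pods -n "$1" -l "$2" \
    -o jsonpath='{range .items[*]}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
    | grep -c '^True$'
}
```

Running `count_ready demo app=nginx` against the deployment above should print 3 once all replicas are ready.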
Create your first experiment
Save this as first-experiment.yaml:
apiVersion: chaos.chaosplane.io/v1alpha1
kind: ChaosExperiment
metadata:
  name: nginx-pod-kill
  namespace: demo
spec:
  target:
    kind: Pod
    namespace: demo
    labelSelector:
      matchLabels:
        app: nginx
  action:
    type: pod-kill
    parameters:
      gracePeriodSeconds: "0"
  duration: 30s
  rollback:
    enabled: false
  steadyState:
    before:
      - name: pods-available
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    after:
      - name: pods-recovered
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    recoveryTimeout: 2m
Apply it:
kubectl apply -f first-experiment.yaml
Watch the experiment run
# Watch the experiment status
kubectl get chaosexperiment nginx-pod-kill -n demo -w
# NAME PHASE TARGET ACTION AGE
# nginx-pod-kill Pending Pod pod-kill 1s
# nginx-pod-kill SteadyStateChecking Pod pod-kill 2s
# nginx-pod-kill Running Pod pod-kill 4s
# nginx-pod-kill Completing Pod pod-kill 34s
# nginx-pod-kill Recovering Pod pod-kill 35s
# nginx-pod-kill Completed Pod pod-kill 2m5s
Or use chaosctl:
chaosctl get experiments -n demo
chaosctl describe experiment nginx-pod-kill -n demo
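Watching works when you are at the terminal; in a script you may prefer to block until the experiment finishes. The sketch below polls the phase until it matches or a timeout passes. It assumes the PHASE column is backed by a `.status.phase` field, which this tutorial does not confirm, and `wait_for_phase` is a hypothetical helper name.

```shell
# Poll an experiment until it reaches the wanted phase or the timeout expires.
# $1 = experiment name, $2 = namespace, $3 = wanted phase, $4 = timeout in seconds
# ASSUMPTION: the phase is exposed at .status.phase on the ChaosExperiment resource.
wait_for_phase() {
  deadline=$(( $(date +%s) + $4 ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    phase=$(kubectl get chaosexperiment "$1" -n "$2" -o jsonpath='{.status.phase}')
    [ "$phase" = "$3" ] && return 0
    sleep 2
  done
  return 1   # timed out before the phase was seen
}
```

For example, `wait_for_phase nginx-pod-kill demo Completed 300` returns 0 once the experiment completes and 1 if five minutes pass first. Under the same field assumption, kubectl 1.23 or later can do this natively with `kubectl wait --for=jsonpath='{.status.phase}'=Completed`.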
What just happened
- SteadyStateChecking - ChaosPlane verified 3 nginx pods were ready before injecting chaos
- Running - One or more pods were deleted (the Deployment's ReplicaSet controller immediately created replacements, which the scheduler then placed on nodes)
- Completing - The experiment duration elapsed
- Recovering - ChaosPlane waited for the steady-state after probe to pass (3 pods ready again)
- Completed - Everything recovered successfully
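To see which pods were actually replaced, you can compare pod names from before and after the run, since replacement pods in a Deployment get new name suffixes. This is a minimal sketch; `pod_names` and `replaced_pods` are hypothetical helper names.

```shell
# List pod names (as pod/<name>) for a namespace and label selector, sorted.
# $1 = namespace, $2 = label selector
pod_names() {
  kubectl get pods -n "$1" -l "$2" -o name | sort
}

# Print names that appear in the "before" list but not in the "after" list.
# $1 = newline-separated names before, $2 = newline-separated names after
replaced_pods() {
  printf '%s\n' "$1" | grep -vxF -e "$2"
}
```

Capture `before=$(pod_names demo app=nginx)` ahead of applying the experiment and `after=$(pod_names demo app=nginx)` once it completes; `replaced_pods "$before" "$after"` then prints the pods that were killed.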
Check the results
chaosctl describe experiment nginx-pod-kill -n demo
You'll see the affected resources, timing, and probe results.
Try with abort conditions
This version automatically aborts if too many pods go down at once:
apiVersion: chaos.chaosplane.io/v1alpha1
kind: ChaosExperiment
metadata:
  name: nginx-pod-kill-safe
  namespace: demo
spec:
  target:
    kind: Pod
    namespace: demo
    labelSelector:
      matchLabels:
        app: nginx
  action:
    type: pod-kill
    parameters:
      gracePeriodSeconds: "0"
  duration: 60s
  rollback:
    enabled: false
  abortConditions:
    - name: too-many-pods-down
      type: k8s
      k8s:
        resource: pods
        namespace: demo
        labelSelector: app=nginx
        condition:
          minReady: 1
      action: abort
  steadyState:
    before:
      - name: pods-available
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    after:
      - name: pods-recovered
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    recoveryTimeout: 2m
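One way to read the abort condition is as a threshold check on the number of ready pods. The sketch below illustrates that reading only: it assumes the condition fires when fewer than `minReady` pods are ready, which this tutorial does not state explicitly, and `should_abort` is a hypothetical name rather than ChaosPlane code.

```shell
# Succeed (exit 0) when the ready count has dropped below the threshold.
# $1 = current number of ready pods, $2 = minReady from the abort condition
# ASSUMPTION: "abort" triggers when ready < minReady.
should_abort() {
  [ "$1" -lt "$2" ]
}
```

Under this reading, `minReady: 1` with three replicas would abort only when no nginx pod is ready, so `should_abort 0 1` succeeds while `should_abort 2 1` does not.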
Clean up
kubectl delete chaosexperiment nginx-pod-kill -n demo
kubectl delete chaosexperiment nginx-pod-kill-safe -n demo
kubectl delete namespace demo
Next steps
- Learn the core concepts
- Pod chaos guide for all 8 pod actions
- Workflows guide to chain experiments together