Your First Experiment

This tutorial walks you through creating and running a pod-kill experiment against a sample application. You'll see the full lifecycle: steady-state check, chaos injection, recovery, and verification.

Deploy a sample app

First, deploy something to target. We'll use a simple nginx deployment:

kubectl create namespace demo
kubectl create deployment nginx --image=nginx:latest --replicas=3 -n demo
kubectl wait --for=condition=available deployment/nginx -n demo --timeout=60s

Verify it's running:

kubectl get pods -n demo
# NAME                    READY   STATUS    RESTARTS   AGE
# nginx-7d9f8b6c4-abc12   1/1     Running   0          30s
# nginx-7d9f8b6c4-def34   1/1     Running   0          30s
# nginx-7d9f8b6c4-ghi56   1/1     Running   0          30s
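
If you would rather check readiness from a script than read the table by eye, a small helper can count the ready pods. This is a minimal sketch, not part of ChaosPlane: `count_ready_pods` is a hypothetical name, and it assumes the standard `kubectl get pods --no-headers` column order (NAME, READY, STATUS, ...). The `app=nginx` label is the one `kubectl create deployment nginx` sets by default.

```shell
# Hypothetical helper: counts pods whose READY column is n/n and whose
# STATUS is Running. Reads `kubectl get pods --no-headers` output on stdin.
count_ready_pods() {
  awk '{
    split($2, r, "/")                       # READY looks like "1/1"
    if (r[1] == r[2] && $3 == "Running") n++
  } END { print n + 0 }'                    # n + 0 prints 0 for empty input
}

# Usage against the demo namespace (requires cluster access):
#   kubectl get pods -n demo -l app=nginx --no-headers | count_ready_pods
```

For the deployment above, the count should reach 3 before you start the experiment.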

Create your first experiment

Save this as first-experiment.yaml:

apiVersion: chaos.chaosplane.io/v1alpha1
kind: ChaosExperiment
metadata:
  name: nginx-pod-kill
  namespace: demo
spec:
  target:
    kind: Pod
    namespace: demo
    labelSelector:
      matchLabels:
        app: nginx
  action:
    type: pod-kill
    parameters:
      gracePeriodSeconds: "0"
  duration: 30s
  rollback:
    enabled: false
  steadyState:
    before:
      - name: pods-available
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    after:
      - name: pods-recovered
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    recoveryTimeout: 2m

Apply it:

kubectl apply -f first-experiment.yaml

Watch the experiment run

# Watch the experiment status
kubectl get chaosexperiment nginx-pod-kill -n demo -w

# NAME             PHASE                 TARGET   ACTION     AGE
# nginx-pod-kill   Pending               Pod      pod-kill   1s
# nginx-pod-kill   SteadyStateChecking   Pod      pod-kill   2s
# nginx-pod-kill   Running               Pod      pod-kill   4s
# nginx-pod-kill   Completing            Pod      pod-kill   34s
# nginx-pod-kill   Recovering            Pod      pod-kill   35s
# nginx-pod-kill   Completed             Pod      pod-kill   2m5s
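
In a script or CI job you may want to block until the experiment finishes instead of watching interactively. The sketch below polls the phase until it reads Completed or a timeout expires. It rests on an assumption: that the PHASE column shown above is backed by a `.status.phase` field on the resource. Adjust the JSONPath if your version stores it elsewhere. `get_phase` and `wait_for_completed` are hypothetical helper names.

```shell
# Assumption: the PHASE column is backed by .status.phase on the resource.
get_phase() {
  kubectl get chaosexperiment "$1" -n "$2" -o jsonpath='{.status.phase}'
}

# Poll every 2 seconds until the experiment reports Completed,
# or give up after the timeout (third argument, default 180 seconds).
wait_for_completed() {
  name="$1"; ns="$2"; timeout="${3:-180}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    phase=$(get_phase "$name" "$ns")
    echo "phase: ${phase:-unknown}"
    [ "$phase" = "Completed" ] && return 0
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "timed out after ${timeout}s" >&2
  return 1
}

# Usage (requires cluster access):
#   wait_for_completed nginx-pod-kill demo 180
```

The helper only recognizes the Completed phase from the output above, so any other terminal state surfaces as a timeout.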

Or use chaosctl:

chaosctl get experiments -n demo
chaosctl describe experiment nginx-pod-kill -n demo

What just happened

  1. SteadyStateChecking - ChaosPlane verified 3 nginx pods were ready before injecting chaos
  2. Running - One or more pods were deleted (the Deployment's ReplicaSet controller immediately created replacements, which the scheduler then placed on nodes)
  3. Completing - The experiment duration elapsed
  4. Recovering - ChaosPlane waited for the steady-state after probe to pass (3 pods ready again)
  5. Completed - Everything recovered successfully

Check the results

chaosctl describe experiment nginx-pod-kill -n demo

You'll see the affected resources, timing, and probe results.

Try with abort conditions

This version automatically aborts if too many pods go down at once:

apiVersion: chaos.chaosplane.io/v1alpha1
kind: ChaosExperiment
metadata:
  name: nginx-pod-kill-safe
  namespace: demo
spec:
  target:
    kind: Pod
    namespace: demo
    labelSelector:
      matchLabels:
        app: nginx
  action:
    type: pod-kill
    parameters:
      gracePeriodSeconds: "0"
  duration: 60s
  rollback:
    enabled: false
  abortConditions:
    - name: too-many-pods-down
      type: k8s
      k8s:
        resource: pods
        namespace: demo
        labelSelector: app=nginx
        condition:
          minReady: 1
      action: abort
  steadyState:
    before:
      - name: pods-available
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    after:
      - name: pods-recovered
        type: k8s
        k8s:
          resource: pods
          namespace: demo
          labelSelector: app=nginx
          condition:
            minReady: 3
    recoveryTimeout: 2m
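
To reason about when this abort condition would fire, it can help to spell out the comparison. The sketch below assumes that a `minReady` condition is violated when the number of ready pods drops below the configured value. That reading is an assumption based on the field name, not confirmed ChaosPlane behavior, and `min_ready_violated` is a hypothetical helper.

```shell
# Hypothetical helper mirroring the assumed reading of minReady:
# the condition is violated when ready pods < minReady.
# Arguments: <ready pod count> <minReady>
min_ready_violated() {
  ready="$1"; min_ready="$2"
  [ "$ready" -lt "$min_ready" ]
}

# Example: with minReady 1, only zero ready pods violates the condition.
if min_ready_violated 0 1; then
  echo "0 ready: abort would fire"
fi
if ! min_ready_violated 1 1; then
  echo "1 ready: experiment continues"
fi
```

Under that reading, with three replicas and `minReady: 1`, the experiment aborts only when no nginx pod is ready. Raise the value to 2 if you want it to stop as soon as two pods are down at once.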

Clean up

kubectl delete chaosexperiment nginx-pod-kill -n demo
kubectl delete chaosexperiment nginx-pod-kill-safe -n demo
kubectl delete namespace demo

Next steps