Chaos Engineering

Increase confidence in your systems

Accept that every system will fail.

Building a perfect system is close to impossible, so accept that things will go wrong and engineer for them. Sometimes it is easy to foresee both the issue, for example an unexpected surge in customer requests, and the solution, such as autoscaling. But how confident are you that your solution will work? Will autoscaling scale out fast enough?

Gain confidence by introducing controlled chaos

This is where chaos engineering comes in. The goal of chaos engineering is to understand how your system behaves under specific failure conditions so that you can improve it, which in turn increases your confidence in the system. An experiment should either leave you feeling great about the job you have done architecting your system or show you what needs to improve.
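
To make the idea concrete, here is a minimal sketch of a controlled experiment in Python. It is illustrative only and not workshop material: the names `flaky_dependency`, `call_with_retries`, `run_experiment` and `hypothesis_holds`, the 30% injected failure rate and the 99% success threshold are all assumptions chosen for the example. The experiment states a hypothesis ("at least 99% of requests succeed"), injects failures into a simulated dependency, and checks whether the hypothesis still holds.

```python
import random


def flaky_dependency(rng, failure_rate):
    """Hypothetical downstream call that fails with the injected probability."""
    if rng.random() < failure_rate:
        raise ConnectionError("injected failure")
    return "ok"


def call_with_retries(rng, failure_rate, max_attempts):
    """The system under test: retry the dependency up to max_attempts times."""
    for _ in range(max_attempts):
        try:
            return flaky_dependency(rng, failure_rate)
        except ConnectionError:
            continue
    return None  # every attempt failed


def run_experiment(failure_rate, max_attempts, requests=10_000, seed=42):
    """Inject failures for a batch of requests and return the success rate."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    successes = sum(
        call_with_retries(rng, failure_rate, max_attempts) is not None
        for _ in range(requests)
    )
    return successes / requests


def hypothesis_holds(success_rate, threshold=0.99):
    """Steady-state hypothesis (assumed): at least 99% of requests succeed."""
    return success_rate >= threshold


if __name__ == "__main__":
    for attempts in (1, 5):
        rate = run_experiment(failure_rate=0.3, max_attempts=attempts)
        print(f"attempts={attempts} success_rate={rate:.4f} "
              f"hypothesis_holds={hypothesis_holds(rate)}")
```

In this toy setup the hypothesis fails without retries and holds with five attempts, which is the kind of evidence an experiment is meant to produce. A real experiment would inject failures into a running system and measure real traffic rather than a simulation.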

Chaos Engineering Workshop

We provide the expertise to teach you the principles of chaos engineering. You will learn how to run controlled experiments to gain more confidence in your system and make it more resilient. You will also learn why visibility into the system matters, and how to improve it so that you can debug issues more effectively.

The program

This two-day workshop consists of theory and practical assignments.

Workshop Day 1

  • Introduction to Chaos Engineering
    • Principles
    • Tools
    • Patterns
  • Deep dive into failures
    • What is failure?
    • What are patterns to mitigate failures?
  • Game time!
    • Define the experiment
    • Red team: attack the system
      • What are the weak spots of the system?
      • How can we introduce controlled failures?
    • Blue team: defend the system
      • Do we have enough visibility?
      • Can we find the root cause?
      • Can we prevent the failure in the future?

Workshop Day 2

  • Recap of Day 1
    • How difficult was it to cause the issue?
    • How difficult was it to fix the issue?
    • Patterns
  • Game time!
    • Swap teams this time
  • Continuously verifying the system
    • Can we automate chaos experiments?
    • Can we integrate it with the current way of working?
  • Improving the system
    • What areas can be improved?
    • Which tooling can help?
    • What are useful patterns?
  • Culture
    • How to avoid blame?
    • How to do proper post-mortems?

So, do you want to become more confident in building and running your systems? Book a workshop today!