The Seahorse writes code for a living. So, in the same way that I am obliged to follow sports ever so casually (for instance, the Chicago Cubs have hired a new, adorable player), I follow some software/dev/tech blogs so that I can hold my own in conversation, or at least return the ball over the net (see? sports metaphor!). Which brings me to a big h/t to Jenn Webb at O’Reilly, who clued me in to the notion of using Chaos Monkey on people.
Kind of Like a Fire Drill
Software isn’t one thing; it’s a lot of interconnected little ecosystems that all communicate with each other. Chaos Monkey randomly selects one of these systems and terminates it. This happens during the normal workweek, not at 3 a.m., so presumably you have the staff available to troubleshoot and correct the problem. It’s a way to force weaknesses to the surface so that you can plan to mitigate them when you are fully resourced and at your best.
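For the code-curious, here is roughly what that idea looks like as a toy Python sketch. To be clear, this is my own illustration and not how the real Chaos Monkey is built: the service names, the 9-to-5 window, and the function names are all made up for the example, and `terminate` only reports its pick instead of shutting anything down.

```python
import random
from datetime import datetime

# Hypothetical service names, for illustration only.
SERVICES = ["billing", "search", "login", "recommendations"]


def is_business_hours(now: datetime) -> bool:
    """True Monday through Friday, from 9 a.m. up to (not including) 5 p.m."""
    return now.weekday() < 5 and 9 <= now.hour < 17


def pick_victim(services, now, rng=random):
    """Return one randomly chosen service, or None outside business hours."""
    if not services or not is_business_hours(now):
        return None
    return rng.choice(services)


def terminate(service: str) -> str:
    # Stand-in for a real shutdown call; this sketch only reports the choice.
    return f"terminating {service}"


if __name__ == "__main__":
    victim = pick_victim(SERVICES, datetime.now())
    print(terminate(victim) if victim else "Outside business hours; nothing terminated.")
```

The part that matters is the business-hours check: nothing gets picked at 3 a.m., only when people are around to notice and respond.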
People are Interconnected Too
So what happens if you send somebody (anybody?) from one of your teams away on short notice? Would the team figure out how to fill the gaps? Could you test this tomorrow? What would happen if you picked someone at 9:47 a.m. and whispered in her ear, “Take the rest of the day off without pay. Do not respond to any messages from anyone at work. See you tomorrow morning”? What would happen if you whispered the same thing in a second person’s ear at 10:32 a.m.?
This Sounds Terrifying
I know, right? And yet doesn’t that point to the need to actually try it?