Chaos Minions: Harassing the World’s Largest Supercomputer

12:00 pm - 12:20 pm

Objective: Guidance
Audience Level: Beginner/Intermediate
Session Type: Presentation

As HPC systems increase in size and complexity, there is a growing need for resilience validation. Chaos Minions is a framework that combines fault injection and recovery into an automated solution. The framework runs harassers on targeted components and subjects the system to randomly generated harassment. A workload can run in parallel with the framework to identify resiliency bottlenecks.
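
The idea of pairing each fault injection with a recovery step and choosing harassers at random can be illustrated with a minimal Python sketch. This is not the Chaos Minions implementation: the `Harasser` and `Scheduler` classes, the component names, and the in-memory `state` dictionary are all hypothetical stand-ins chosen only to show the shape of such a loop.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Simulated component health; a real framework would act on actual hardware
# or services. These component names are assumptions for illustration only.
state: Dict[str, str] = {"compute-node": "up", "storage-server": "up"}


def take_down(component: str) -> None:
    """Hypothetical fault injection: mark the component as down."""
    state[component] = "down"


def bring_up(component: str) -> None:
    """Hypothetical recovery: mark the component as up again."""
    state[component] = "up"


@dataclass
class Harasser:
    """Pairs one fault injection with its recovery for a single target."""
    name: str
    target: str
    inject: Callable[[str], None]
    recover: Callable[[str], None]


@dataclass
class Scheduler:
    """Picks harassers at random and runs inject, then recover, each round."""
    harassers: List[Harasser]
    seed: int = 0
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, rounds: int) -> List[Tuple[str, str, str]]:
        rng = random.Random(self.seed)  # seeded so a run can be replayed
        for _ in range(rounds):
            harasser = rng.choice(self.harassers)
            harasser.inject(harasser.target)
            self.log.append(("inject", harasser.name, harasser.target))
            harasser.recover(harasser.target)
            self.log.append(("recover", harasser.name, harasser.target))
        return self.log


harassers = [
    Harasser("node-down", "compute-node", take_down, bring_up),
    Harasser("storage-outage", "storage-server", take_down, bring_up),
]
scheduler = Scheduler(harassers, seed=42)
events = scheduler.run(rounds=3)
```

Three rounds produce six log entries, alternating injection and recovery, and every simulated component ends in the "up" state. Seeding the random generator is a design choice in this sketch so that a harassment sequence that exposes a problem can be reproduced.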


HPC Systems Engineer, Intel