Chaos Minions: Harassing the World's Largest Supercomputer
10/06/2017
12:00 pm - 12:20 pm
OCCC W315A
Objective: Guidance
Audience Level: Beginner/Intermediate
Session Type: Presentation
As HPC systems increase in size and complexity, there is a growing need for resilience validation. Chaos Minions is a framework that combines fault injection and recovery into an automated solution. The framework runs harassers against targeted components and subjects the system to randomly generated harassment. A workload can run in parallel with the framework to identify resiliency bottlenecks.
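The abstract does not show the framework's interface, so the following is only a minimal sketch of the general idea, assuming a harasser is an inject/recover pair applied to a randomly chosen target. The names Harasser, ChaosRunner, and make_state_harasser are hypothetical and are not taken from Chaos Minions; the "components" are entries in an in-memory dictionary, not real hardware or services.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Harasser:
    """Hypothetical harasser: injects a fault into a component, then recovers it."""
    name: str
    inject: Callable[[str], None]
    recover: Callable[[str], None]


@dataclass
class ChaosRunner:
    """Hypothetical driver that applies randomly chosen harassers to random targets."""
    harassers: List[Harasser]
    targets: List[str]
    seed: int = 0
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, rounds: int) -> List[Tuple[str, str, str]]:
        # A seeded generator makes a random harassment sequence reproducible.
        rng = random.Random(self.seed)
        for _ in range(rounds):
            harasser = rng.choice(self.harassers)
            target = rng.choice(self.targets)
            harasser.inject(target)
            self.log.append(("inject", harasser.name, target))
            harasser.recover(target)
            self.log.append(("recover", harasser.name, target))
        return self.log


def make_state_harasser(name: str, state: Dict[str, str]) -> Harasser:
    """Build a simulated harasser that only flips an entry in a state dictionary."""
    def inject(target: str) -> None:
        state[target] = "down"

    def recover(target: str) -> None:
        state[target] = "up"

    return Harasser(name=name, inject=inject, recover=recover)


# Simulated components (assumed names, for illustration only).
state = {"node-1": "up", "node-2": "up", "switch-1": "up"}
runner = ChaosRunner(
    harassers=[make_state_harasser("power-cycle", state),
               make_state_harasser("link-flap", state)],
    targets=list(state),
    seed=42,
)
events = runner.run(rounds=5)
```

In a real deployment the inject and recover callables would act on actual components, and a workload would run alongside the loop so that slowdowns or failures can be correlated with the logged events.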
Speaker(s)
, HPC Systems Engineer, Intel