SC12 Presentation

SCHEDULE: NOV 10-16, 2012


Classifying Soft Error Vulnerabilities in Extreme-Scale Scientific Applications Using a Binary Instrumentation Tool

SESSION: Resilience

EVENT TYPE: Papers

TIME: 1:30PM - 2:00PM

SESSION CHAIR: Bronis R. de Supinski

AUTHOR(S): Dong Li, Jeffrey Vetter, Weikuan Yu

ROOM: 255-EF

ABSTRACT:
Extreme-scale scientific applications are at significant risk of being hit by soft errors on future supercomputers. To better understand soft error vulnerabilities in scientific applications, we have built BIFIT, an empirical fault injection and consequence analysis tool, to evaluate how soft errors impact applications. BIFIT is designed to inject faults at specific targets: a chosen execution point and a chosen data structure. We apply BIFIT to three scientific applications and investigate their vulnerability to soft errors. We classify each application's individual data structures in terms of their vulnerabilities and generalize these classifications. Our study reveals that these scientific applications have a wide range of sensitivities to both the time and the location of a soft error. Nevertheless, we are able to identify relationships between vulnerabilities and classes of data structures. These classifications can be used to apply appropriate resiliency solutions to each data structure within an application.
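
The idea of injecting a fault at a chosen execution point and data structure can be illustrated with a small source-level sketch. This is not BIFIT, which per the title works through binary instrumentation; the functions `flip_bit` and `run_with_fault`, the averaging kernel, and all parameter names are hypothetical and chosen only for illustration. The sketch flips one bit in one array element at one iteration and compares the result against a fault-free ("golden") run.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its IEEE 754 double encoding flipped."""
    if not 0 <= bit < 64:
        raise ValueError("bit must be in [0, 64)")
    # Reinterpret the double as a 64-bit unsigned integer, XOR one bit,
    # then reinterpret the result as a double again.
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return corrupted


def run_with_fault(data, steps, inject_step=None, index=0, bit=0):
    """Toy iterative kernel (hypothetical stand-in for an application).

    Each step replaces every element with the mean of itself and its two
    neighbors (periodic boundary). If `inject_step` is given, one bit of
    `state[index]` is flipped just before that step executes.
    """
    state = list(data)
    n = len(state)
    for step in range(steps):
        if step == inject_step:
            # The "soft error": chosen time (step) and location (index, bit).
            state[index] = flip_bit(state[index], bit)
        state = [
            (state[(i - 1) % n] + state[i] + state[(i + 1) % n]) / 3.0
            for i in range(n)
        ]
    return state


initial = [1.0, 2.0, 3.0, 4.0]
golden = run_with_fault(initial, steps=10)
faulty = run_with_fault(initial, steps=10, inject_step=3, index=1, bit=52)

# Consequence analysis in miniature: how far did the fault move the output?
max_error = max(abs(g - f) for g, f in zip(golden, faulty))
print(f"max absolute deviation from golden run: {max_error:.6f}")
```

Varying `inject_step`, `index`, and `bit` in this sketch gives a rough feel for why sensitivity depends on both when and where an error lands: a flip in a low mantissa bit barely changes the output, while a flip in an exponent bit can change it substantially.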

Chair/Author Details:

Bronis R. de Supinski (Chair) - Lawrence Livermore National Laboratory

Dong Li - Oak Ridge National Laboratory

Jeffrey Vetter - Oak Ridge National Laboratory

Weikuan Yu - Auburn University

