SC12 Home > SC12 Schedule > SC12 Presentation - Evaluating the Error Resilience of GPGPU Applications

SCHEDULE: NOV 10-16, 2012


Evaluating the Error Resilience of GPGPU Applications

SESSION: Research Poster Reception

EVENT TYPE: Posters and Electronic Posters

TIME: 5:15PM - 7:00PM

SESSION CHAIR: Torsten Hoefler

AUTHOR(S): Bo Fang, Jiesheng Wei, Karthik Pattabiraman, Matei Ripeanu

ROOM: East Entrance

ABSTRACT:
GPUs were originally designed for error-resilient workloads. Today, GPUs are used in error-sensitive applications, e.g., General Purpose GPU (GPGPU) applications. The goal of this project is to investigate the error resilience of GPGPU applications and understand their reliability characteristics. To this end, we employ fault injection on real GPU hardware. We find that, compared to CPUs, GPU platforms lead to a higher rate of silent data corruption, a major concern since these errors are not flagged at runtime and often remain latent. We also find that out-of-bounds memory accesses are the most critical cause of crashes in GPGPU applications.
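
The outcome categories mentioned in the abstract (silent data corruption versus crashes from out-of-bounds accesses) can be illustrated with a toy, CPU-only sketch. This is not the authors' injector or methodology: the gather-and-sum "kernel", the function names (`run_kernel`, `inject_index_fault`, `classify`, `campaign`), and the choice to flip a single bit in an array index are all assumptions made purely for illustration.

```python
from collections import Counter


def run_kernel(indices, data):
    """Hypothetical stand-in for a GPU kernel: gather data[i] and sum."""
    total = 0.0
    for i in indices:
        # Python raises IndexError for i >= len(data); this plays the role
        # of an out-of-bounds memory access that crashes the application.
        total += data[i]
    return total


def inject_index_fault(indices, which, bit):
    """Return a copy of indices with one bit flipped in element `which`."""
    faulty = list(indices)
    faulty[which] ^= 1 << bit
    return faulty


def classify(golden, indices, data):
    """Label one faulty run as 'crash', 'benign', or 'sdc'."""
    try:
        result = run_kernel(indices, data)
    except IndexError:
        return "crash"
    # A run that finishes with a wrong answer and no error signal is a
    # silent data corruption (SDC).
    return "benign" if result == golden else "sdc"


def campaign(indices, data, bits=8):
    """Inject every single-bit index fault once and count the outcomes."""
    golden = run_kernel(indices, data)
    outcomes = Counter()
    for which in range(len(indices)):
        for bit in range(bits):
            faulty = inject_index_fault(indices, which, bit)
            outcomes[classify(golden, faulty, data)] += 1
    return outcomes


if __name__ == "__main__":
    print(campaign([0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0]))
```

In this sketch, a flipped low-order index bit silently reads the wrong element, while a flipped higher-order bit pushes the index past the end of the array and raises an error. Real fault-injection studies on GPU hardware target actual instructions and registers; the numbers this toy produces say nothing about the rates reported in the poster.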

Chair/Author Details:

Torsten Hoefler (Chair) - ETH Zurich

Bo Fang - University of British Columbia

Jiesheng Wei - University of British Columbia

Karthik Pattabiraman - University of British Columbia

Matei Ripeanu - University of British Columbia

