SC12 SCHEDULE: NOV 10-16, 2012

Combining In-Situ and In-Transit Processing to Enable Extreme-Scale Scientific Analysis

SESSION: Optimizing I/O For Analytics

EVENT TYPE: Papers

TIME: 11:00AM - 11:30AM

SESSION CHAIR: Dean Hildebrand

AUTHOR(S): Janine C. Bennett, Hasan Abbasi, Peer-Timo Bremer, Ray W. Grout, Attila Gyulassy, Tong Jin, Scott Klasky, Hemanth Kolla, Manish Parashar, Valerio Pascucci, Philippe Pébay, David Thompson, Hongfeng Yu, Fan Zhang, Jacqueline Chen

ROOM: 355-D

ABSTRACT:
With the onset of extreme-scale computing, scientists are increasingly unable to save sufficient raw simulation data to persistent storage. Consequently, the community is shifting away from a post-process-centric data analysis pipeline to a combination of analysis performed in-situ (on primary compute resources) and in-transit (on secondary resources using asynchronous data transfers). In this paper we summarize algorithmic developments for three common analysis techniques: topological analysis, descriptive statistics, and visualization. We describe a resource scheduling system that supports various analysis workflows, and discuss our use of the DataSpaces and ADIOS frameworks to transfer data between in-situ and in-transit computations. We demonstrate the efficiency of our lightweight, flexible framework on the Jaguar XK6, analyzing data generated by S3D, a massively parallel turbulent combustion code. Our framework allows scientists dealing with the data deluge at extreme scale to perform analyses at increased temporal resolutions, mitigate I/O costs, and significantly improve time to insight.

Chair/Author Details:

Dean Hildebrand (Chair) - IBM Almaden Research Center

Janine C. Bennett - Sandia National Laboratories

Hasan Abbasi - Oak Ridge National Laboratory

Peer-Timo Bremer - Lawrence Livermore National Laboratory

Ray W. Grout - National Renewable Energy Laboratory

Attila Gyulassy - University of Utah

Tong Jin - Rutgers University

Scott Klasky - Oak Ridge National Laboratory

Hemanth Kolla - Sandia National Laboratories

Manish Parashar - Rutgers University

Valerio Pascucci - University of Utah

Philippe Pébay - Kitware, Inc.

David Thompson - Sandia National Laboratories

Hongfeng Yu - Sandia National Laboratories

Fan Zhang - Rutgers University

Jacqueline Chen - Sandia National Laboratories
