SCHEDULE: NOV 10-16, 2012
When viewing the Technical Program schedule, the column on the far right-hand side is labeled "PLANNER." Use this planner to build your own schedule. To add an event to your personal schedule, select it and click the calendar icon of your choice (Outlook, iCal, or Google Calendar); the event will be stored in that calendar. Events you add this way form a personal schedule to guide you through the week.
Design and Analysis of Data Management in Scalable Parallel Scripting
SESSION: Big Data
EVENT TYPE: Papers
TIME: 1:30PM - 2:00PM
SESSION CHAIR: Dennis Gannon
AUTHOR(S): Zhao Zhang, Daniel S. Katz, Justin M. Wozniak, Allan Espinosa, Ian Foster
ROOM: 255-EF
ABSTRACT:
We seek to enable efficient large-scale parallel execution of applications in which a shared filesystem abstraction is used to couple many tasks. Such parallel scripting (Many-Task Computing, MTC) applications suffer from poor performance and utilization on large parallel computers due to the volume of filesystem I/O and a lack of appropriate optimizations in the shared filesystem. Thus, we design and implement a scalable MTC data management system that uses aggregated compute-node local storage for more efficient data movement strategies. We co-design the data management system with the data-aware scheduler to enable dataflow pattern identification and automatic optimization. The framework reduces the time-to-solution of parallel stages of an astronomy data analysis application, Montage, by 83.2% on 512 cores, decreases time-to-solution of a seismology application, CyberShake, by 7.9% on 2,048 cores, and delivers BLAST performance better than mpiBLAST at various scales up to 32,768 cores, while preserving the flexibility of the original BLAST application.
Chair/Author Details:
Dennis Gannon (Chair) - Microsoft Corporation
Zhao Zhang - University of Chicago
Daniel S. Katz - University of Chicago
Justin M. Wozniak - Argonne National Laboratory
Allan Espinosa - University of Chicago
Ian Foster - University of Chicago