BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20121114T230000Z
DTEND:20121114T233000Z
LOCATION:355-D
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: Parallelization and locality optimization of affine loop nests has been successfully addressed for shared-memory machines. However, many large-scale simulation applications must be executed in a distributed environment, and use irregular/sparse computations where the control-flow and array-access patterns are data-dependent.=0A=0AIn this paper, we propose an approach for effective parallel execution of a class of irregular loop computations in a distributed memory environment, using a combination of static and run-time analysis. We discuss algorithms that analyze sequential code to generate an inspector and an executor. The inspector captures the data-dependent behavior of the computation in parallel and without requiring replication of any of the data structures used in the original computation. The executor performs the computation in parallel. The effectiveness of the framework is demonstrated on several benchmarks and a climate modeling application.
SUMMARY:Code Generation for Parallel Execution of a Class of Irregular Loops on Distributed Memory Systems
PRIORITY:3
END:VEVENT
END:VCALENDAR