BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20121114T180000Z
DTEND:20121114T183000Z
LOCATION:355-EF
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: Although present X-ray =
scattering techniques can provide tremendous information=0A=
on the nano-structural properties of materials that are valuable in =
the design=0A=
and fabrication of energy-relevant nano-devices, a primary challenge =
remains in=0A=
the analyses of such data. In this paper we describe a =
high-performance,=0A=
flexible, and scalable Grazing Incidence Small Angle X-ray Scattering =
simulation=0A=
algorithm and codes that we have developed on multi-core/CPU and =
many-core/GPU clusters. We discuss in detail our implementation, =
optimization and performance on these platforms. Our results show =
speedups of ~125x on a Fermi-GPU and ~20x on a Cray-XE6 24-core node, =
compared to a sequential CPU code, with near-linear scaling on =
multi-node clusters. To our knowledge, this is the=0A=
first GISAXS simulation code that is flexible enough to compute =
scattered light=0A=
intensities in all spatial directions, allowing full reconstruction of =
GISAXS=0A=
patterns for any complex structures and with high resolution while =
reducing=0A=
simulation times from months to minutes.
SUMMARY:Massively Parallel X-Ray Scattering Simulations
PRIORITY:3
END:VEVENT
END:VCALENDAR