BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20121114T001500Z
DTEND:20121114T020000Z
LOCATION:East Entrance
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: CUDA reached its maximum performance with CUDA 4.0. Since that release, NVIDIA has begun redesigning the CUDA framework, driven by the search for a framework whose compiler back-end is unified with OpenCL. However, our poster indicates that the new direction comes at a high performance cost. We use the molecular dynamics (MD) code FENZI as the benchmark for our performance analysis. We consider two versions of FENZI: a first version that was implemented for CUDA 4.0, and an optimized version on which we performed additional code optimizations by strictly following NVIDIA's guidelines. For the first version, we observed that CUDA 4.0 always outperforms CUDA 4.1, 4.2, and 5.0. We repeated the performance comparison for the optimized FENZI and the four CUDA versions. In that comparison, CUDA 5.0 provides the best performance; still, its performance across GPUs and molecular systems is lower than that of FENZI without optimizations on CUDA 4.0.
SUMMARY:On the Cost of a General GPU Framework - The Strange Case of CUDA 4.0 vs. CUDA 5.0
PRIORITY:3
END:VEVENT
END:VCALENDAR