BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20121113T213000Z
DTEND:20121113T220000Z
LOCATION:355-EF
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: The work presented here is driven by two observations. First, heterogeneous architectures that integrate the CPU and the GPU on the same chip are emerging and hold much promise for supporting power-efficient and scalable high-performance computing. Second, MapReduce has emerged as a suitable framework for simplified parallel application development for many classes of applications, including data mining and machine learning applications that benefit from accelerators.=0A=0AThis paper focuses on the challenge of scaling a MapReduce application using the CPU and GPU together in an integrated architecture. We use two methods for dividing the work: the map-dividing scheme, which divides map tasks across both devices, and the pipelining scheme, which pipelines the map and reduce stages on different devices. We develop dynamic work distribution schemes for both approaches. To achieve high performance, we use a runtime tuning method to adjust task block sizes.
SUMMARY:Accelerating MapReduce on a Coupled CPU-GPU Architecture
PRIORITY:3
END:VEVENT
END:VCALENDAR