BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20121113T180000Z
DTEND:20121113T183000Z
LOCATION:255-EF
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: In this paper, we describe the challenges involved in=0A=
designing a family of highly-efficient Breadth-First Search (BFS)=0A=
algorithms and in optimizing these algorithms on the latest two=0A=
generations of Blue Gene machines, Blue Gene/P and Blue Gene/Q.=0A=
With our recent winning Graph 500 submissions in November=0A=
2010, June 2011, and November 2011, we have achieved unprecedented=0A=
scalability results in both space and size. On Blue Gene/P,=0A=
we have been able to parallelize the largest BFS search presented=0A=
in the literature, running a scale 38 problem with 2^38 vertices and=0A=
2^42 edges on 131,072 processing cores. Using only four racks of an=0A=
experimental configuration of Blue Gene/Q, we have achieved the=0A=
fastest processing rate reported to date on a BFS search, 254 billion=0A=
edges per second on 65,536 processing cores. This paper describes=0A=
the algorithmic design and the main classes of optimizations that=0A=
we have used to achieve these results.
SUMMARY:Breaking the Speed and Scalability Barriers for Graph Exploration on Distributed-Memory Machines
PRIORITY:3
END:VEVENT
END:VCALENDAR