SC12 SCHEDULE: NOV 10-16, 2012

Hadoop's Adolescence: A Comparative Workload Analysis from Three Research Clusters

SESSION: Research Poster Reception

EVENT TYPE: Posters and Electronic Posters

TIME: 5:15PM - 7:00PM

SESSION CHAIR: Torsten Hoefler

AUTHOR(S): Kai Ren, Garth Gibson, YongChul Kwon, Magdalena Balazinska, Bill Howe

ROOM: East Entrance

ABSTRACT:
We analyze Hadoop workloads from three different research clusters from an application-level perspective, with two goals: (1) explore new issues in application patterns and user behavior and (2) understand key performance challenges related to I/O. Our analysis suggests that Hadoop usage is still in its adolescence. We see underuse of Hadoop features, extensions, and tools as well as significant opportunities for optimization. We see significant diversity in application styles, including some "interactive" workloads, motivating new tools in the ecosystem. We find that some conventional approaches to improving performance are not especially effective and suggest some alternatives. Overall, we find significant opportunity for simplifying the use and optimization of Hadoop.

Chair/Author Details:

Torsten Hoefler (Chair) - ETH Zurich

Kai Ren - Carnegie Mellon University

Garth Gibson - Carnegie Mellon University

YongChul Kwon - University of Washington

Magdalena Balazinska - University of Washington

Bill Howe - University of Washington
