SC12 Presentation - Usage Behavior of a Large-Scale Scientific Archive

SCHEDULE: NOV 10-16, 2012

Usage Behavior of a Large-Scale Scientific Archive

SESSION: Big Data

EVENT TYPE: Papers

TIME: 2:00PM - 2:30PM

SESSION CHAIR: Dennis Gannon

AUTHOR(S): Ian F. Adams, Brian A. Madden, Joel C. Frank, Mark W. Storer, Ethan L. Miller, Gene Harano

ROOM: 255-EF

ABSTRACT:
Archival storage systems for scientific data have been growing in both size and relevance over the past two decades, yet researchers and system designers alike must rely on limited and obsolete knowledge to guide archival management and design. To address this issue, we analyzed three years of file-level activities from the NCAR mass storage system, providing valuable insight into a large-scale scientific archive with over 1600 users, tens of millions of files, and petabytes of data. Our examination of system usage showed that, while a subset of users were responsible for most of the activity, this activity was widely distributed at the file level. We also show that the physical grouping of files and directories on media can improve archival storage system performance. Based on our observations, we provide suggestions and guidance for both future scientific archival system designs as well as improved tracing of archival activity.

Chair/Author Details:

Dennis Gannon (Chair) - Microsoft Corporation

Ian F. Adams - University of California, Santa Cruz

Brian A. Madden - University of California, Santa Cruz

Joel C. Frank - University of California, Santa Cruz

Mark W. Storer - NetApp

Ethan L. Miller - University of California, Santa Cruz

Gene Harano - National Center for Atmospheric Research
