SC12 SCHEDULE: NOV 10-16, 2012

Digitization and Search: A Non-Traditional Use of HPC

SESSION: Research Poster Reception

EVENT TYPE: Posters and Electronic Posters

TIME: 5:15PM - 7:00PM

SESSION CHAIR: Torsten Hoefler

AUTHOR(S): Liana Diesendruck, Luigi Marini, Rob Kooper, Mayank Kejriwal, Kenton McHenry

ROOM: East Entrance

ABSTRACT:
We describe our efforts to provide a form of automated search of handwritten content for digitized document archives. To carry out the search we use a computer vision technique called word spotting. A form of content-based image retrieval, word spotting avoids the still-difficult task of directly recognizing text: a user searches with a query image containing handwritten text, and the system ranks a database of images by how similar their content looks to the query. To make this search capability available on an archive, three computationally expensive pre-processing steps are required. We augment this automated portion of the process with a passive crowdsourcing element that mines queries from the system's users to improve the results of future queries. We benchmark the proposed framework on 1930s Census data, a collection of roughly 3.6 million forms and 7 billion individual units of information.
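
The ranking idea behind word spotting can be illustrated with a minimal sketch. It assumes each handwritten image has already been reduced to a fixed-length feature vector; the abstract does not say which features or which distance measure the authors use, so the Euclidean distance, the image ids, and the vectors below are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query_vec, database):
    """Return image ids ordered from most to least similar to the query.

    `database` maps an image id to its feature vector. Both the ids and
    the vectors are hypothetical stand-ins, not the authors' data.
    """
    scored = [(euclidean(query_vec, vec), image_id)
              for image_id, vec in database.items()]
    scored.sort()  # smallest distance (most similar) first
    return [image_id for _, image_id in scored]


# Toy database of three pre-processed image regions (made-up values).
database = {
    "cell_001": [0.90, 0.10, 0.30],
    "cell_002": [0.20, 0.80, 0.50],
    "cell_003": [0.85, 0.15, 0.35],
}

# Feature vector of the user's query image (made-up values).
query = [0.88, 0.12, 0.30]

ranking = rank_by_similarity(query, database)
print(ranking)
```

No text is recognized at any point in this sketch: the query and the database entries are compared only by appearance, which is the property that lets word spotting sidestep handwriting recognition. A linear scan like this one would be far too slow for billions of entries, which is consistent with the abstract's emphasis on expensive pre-processing.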

Chair/Author Details:

Torsten Hoefler (Chair) - ETH Zurich

Liana Diesendruck - University of Illinois at Urbana-Champaign

Luigi Marini - University of Illinois at Urbana-Champaign

Rob Kooper - University of Illinois at Urbana-Champaign

Mayank Kejriwal - University of Illinois at Urbana-Champaign

Kenton McHenry - University of Illinois at Urbana-Champaign

