[Lowerbounds, Upperbounds]

Algorithms are everywhere.

Luis von Ahn

12:00 PM, 3305 Newell-Simon Hall

Thesis Oral

Title: Human Computation

Abstract:

Tasks like image recognition are trivial for humans, but continue to challenge even the most sophisticated computer programs. This thesis introduces a paradigm for utilizing human processing power to solve problems that computers cannot yet solve. Traditional approaches to solving such problems focus on improving software. I advocate a novel approach: constructively channel human brainpower using computer games. For example, the ESP Game, introduced in this thesis, is an enjoyable online game — many people play over 40 hours a week — and when people play, they help label images on the Web with descriptive keywords. These keywords can be used to significantly improve the accuracy of image search. People play the game not because they want to help, but because they enjoy it.

I introduce three other examples of games with a purpose: Peekaboom, which helps determine the location of objects in images, Phetch, which collects paragraph descriptions of arbitrary images to help accessibility of the Web, and Verbosity, which collects common-sense knowledge. I also show that, in principle, every problem that could be solved by a computer, today or in the future, could be solved using enjoyable computer games.

In addition, I introduce CAPTCHAs, automated tests that humans can pass but computer programs cannot. CAPTCHAs take advantage of human processing power in order to differentiate humans from computers, an ability that has important applications in practice.

The results of this thesis are currently in use by hundreds of Web sites and companies around the world, and some of the games presented here have been played by over 100,000 people. Practical applications of this work include improvements in problems such as: image search, adult-content filtering, spam, common-sense reasoning, computer vision, accessibility, and security in general.

Thesis Committee:
Manuel Blum, Chair
Takeo Kanade
Michael Reiter
Josh Benaloh, Microsoft Research
Jitendra Malik, University of California, Berkeley

Daniel Blandford

3:00 PM, 3305 Newell-Simon Hall

Thesis Oral

Title: Compact Data Structures with Fast Queries

Abstract:

Many applications dealing with large data structures can benefit from keeping them in compressed form. Compression has many benefits: it can allow a representation to fit in main memory rather than swapping out to disk, and it improves cache performance since it allows more data to fit into the cache. However, a data structure is only useful if it allows the application to perform fast queries (and updates) to the data.

This thesis describes compact representations of several types of data structures including variable-length arrays and dictionaries, separable graphs, ordered sets, text indices, and meshes. All of the representations support fast queries; most support fast updates as well. Several structures come with strong theoretical results, and all of the structures come with experimental results showing good compression results. (This represents an improvement over many compressed structures, which can be too complex to implement or suffer from high constant factors.) The compressed data structures are usually close to as fast as their uncompressed counterparts, and sometimes are faster due to caching effects.

These data structures are united by a common theme: the use of difference coding to represent data by its difference from other, previously known, data. For example, a compact graph structure represents the neighbors of a vertex by the difference between the neighbor label and the original vertex label. For many structures this is combined with a relabeling scheme which ensures that most of the differences encoded are small. The variable-bit-length arrays and dictionaries represent a general framework for creating compressed queryable data structures. This represents an improvement for many structures, which would otherwise need to be built ad-hoc.

Thesis Committee:
Guy Blelloch, Chair
Christos Faloutsos
Danny Sleator
Ian Munro, Waterloo