Summer 2013 magazine



Jacob Thomas

Channeling the flood

So the obvious question: How is this all possible? Though there’s a slew of complex math and computer science involved, the short answer is information—lots and lots and lots of information. In 2010, former Google CEO Eric Schmidt told attendees at the Techonomy conference in Lake Tahoe that humanity had generated five exabytes of data from the dawn of time to 2003 (each exabyte equals one quintillion bytes), whereas now we create that much information every two days. Some have quibbled with Schmidt’s numbers, but there’s no denying that we’re living in the time of the flood.

“We have the ability to collect more data than ever,” says Assistant Professor of Computer Science John MacCormick, whose recent book, Nine Algorithms That Changed the Future, explains how big data’s complex tools work. “But we can also analyze the data now. You need both for this to be a meaningful trend.”

Without that ability to analyze, you get something like Jorge Luis Borges’ “The Library of Babel,” a fictional ever-growing collection of books holding all of the information in the universe and ultimately rendered useless because there’s simply too much to explore. Everything becomes nothing. All the information all the time becomes information overload. Enter big data, the means to turn that overload into a working load, the means to channel the flood, water the digital jungle and make it flower into something useful.

“I remember when I was in graduate school in the 1970s working as a research assistant in Harvard’s Joint Center for Urban Studies, and I was running a regression on 100,000 observations in a study of urban loan applications,” says Associate Professor of International Business & Management Stephen Erfle, who introduces students to data-driven decision making in his Managerial Economics course. “It took Harvard’s mainframe computer system half of the weekend to run. Now, you could do that in minutes.”

Even the building blocks of life—codes so immense and complicated they once looked impenetrable—are becoming increasingly easy to catalog. “When I started at Dickinson, that was the year when the Drosophila melanogaster [fruit fly] genome sequence was published,” says Kirsten Guss, John R. & Inge Paul Stafford Chair in Bioinformatics. “Now the genomes of 12 different Drosophila species have been published, as well as honey bees and flower beetles. That this information is so accessible—literally a click away—that’s amazing to me.”

Pattern power

The secret is pattern recognition, a relatively simple task that most children learn to perform by age 5 but that computers have taken to new heights thanks to advances in processing power. Driving much of big data, pattern-recognition algorithms fuel everything from IBM’s Jeopardy!-winning
