CIO December 15 2006 Issue


Data Mining

foreign government, says a counterintelligence official involved with the project who requested anonymity. The parameters for these searches are developed by counterintelligence officers, based on their experience of what suspicious activity looks like. As the technology improves, the DoD hopes to rely on artificial intelligence to decide which patterns warrant attention.

However, even systems with a more limited scope, such as the DoD’s security clearance system, are sending out mixed signals. “Right now, it’s information overload,” says the counterintelligence official. “With the rules we have now, we would have a ton of false positives.” His goal is to refine the system and eventually show that the concept works. This, he hopes, will encourage people to share more data. He doesn’t anticipate getting usable results for three or four years. The factors that will determine the project’s future are the same as with any IT project: how well the technology performs, the problems the DoD uses the system to solve, and what it does with the results it gets.
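The official’s complaint about loose rules producing “a ton of false positives” comes down to a threshold trade-off. The sketch below is entirely hypothetical — the records, indicators, and rule are invented for illustration and model nothing about actual DoD systems — but it shows the basic dynamic: the looser the matching rule, the more innocent records get flagged.

```python
# Hypothetical sketch of rule-based flagging. A record is flagged when
# enough "suspicious indicators" fire; a looser rule (lower threshold)
# flags more records, most of them false positives. All data invented.

def flag_records(records, min_indicators):
    """Flag records where at least `min_indicators` indicators fired."""
    return [r for r in records if sum(r["indicators"]) >= min_indicators]

# Invented population: which of three indicators fired, plus ground
# truth ("of_interest") used only to score the rule afterward.
records = [
    {"indicators": [1, 0, 0], "of_interest": False},
    {"indicators": [1, 1, 0], "of_interest": False},
    {"indicators": [0, 1, 0], "of_interest": False},
    {"indicators": [1, 1, 1], "of_interest": True},
    {"indicators": [0, 0, 0], "of_interest": False},
]

for threshold in (1, 2, 3):
    flagged = flag_records(records, threshold)
    false_positives = sum(not r["of_interest"] for r in flagged)
    print(f"threshold {threshold}: {len(flagged)} flagged, "
          f"{false_positives} false positives")
```

Tightening the threshold from 1 to 3 cuts the false positives in this toy population to zero, but in practice a stricter rule also risks missing real cases — which is why refining the rules is expected to take years rather than weeks.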

Projects Get the Axe

If anti-terrorism data mining is going to improve, the business rules aren’t the only thing that needs to change. After all, a system is nothing without good data. Sometimes law enforcement has a detailed profile of a terrorist suspect. But in other cases all they have is a name. “Names alone are not a helpful way to match people,” says Jeff Jonas, data mining’s acknowledged superstar, who made his name protecting Las Vegas casinos from cheats. Jonas, for example, shares his name with at least 30 other Americans.

After 9/11, the government began replacing the Computer Assisted Passenger Pre-Screening system (Capps) — which tracked only passenger data collected from airlines (names, credit card numbers, addresses) — with Capps II, which would add information culled from data brokers. Capps II first gained notoriety in 2003, when reports surfaced that Northwest Airlines and JetBlue had given passenger records to the Transportation Security Administration so it could test the new system. Critics asked about privacy safeguards, and in response to the outcry Congress withheld funds for Capps II until the GAO completed a study on how exactly the TSA intended to protect privacy.

In August 2004, the TSA pulled the plug on its Rs 450 crore-plus investment in Capps II in favor of a new system called Secure Flight. Secure Flight and its predecessor share many characteristics, most notably combining passenger records with data purchased from commercial databases. According to a recent government audit, DHS and the Department of Justice spent more than
Rs 112.5 crore in 2005 buying data for fighting crime and preventing terrorism.

In September 2005, the Secure Flight Working Group, a collection of data mining and privacy experts whom the TSA had asked to review the project, filed a confidential report that was highly critical of the system. Within a week, the report was on the Internet. It read: “First and foremost, TSA has not articulated what the specific goals of Secure Flight are.” Bruce Schneier, a security expert who was a member of the working group, sees Capps II and Secure Flight as examples of how a lack of proper scope has damaged anti-terror IT efforts.

Even if you managed to design a data mining system that could comb through phone records or credit card transactions and spot terrorists with a 99 percent success rate, it still wouldn’t be good enough, argues Schneier. For example, if 300 million Americans make 10 phone calls, purchases or other quantifiable events per day, that would produce about 1 trillion pieces of data a year for the government to mine. Even 99 percent accuracy would produce some 10 billion false positives a year. That’s why Schneier wasn’t surprised when he read a January article in The New York Times reporting that hundreds of FBI agents were looking into thousands of data mining–generated leads every month, almost all of which turned out to be dead ends. “[Data mining] is a lousy way to fight terrorism,” he says.

By contrast, says Schneier, data mining has worked to prevent credit card fraud because con artists act in predictable ways, and operators of credit card data mining systems have drawn a clear ROI line for an acceptable level of false negatives and positives, and

Even a data mining system with 99 percent accuracy could produce some 10 billion false positives a year.
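Schneier’s base-rate arithmetic can be checked directly using the article’s own figures: 300 million people, 10 quantifiable events per person per day, and a screen that misclassifies 1 percent of what it sees.

```python
# Base-rate arithmetic behind Schneier's argument, using the article's
# figures: 300 million people, 10 events per person per day, and a
# screen that wrongly flags about 1 in 100 innocent events.

people = 300_000_000
events_per_person_per_day = 10
days_per_year = 365

events_per_year = people * events_per_person_per_day * days_per_year

# With essentially every event innocent, a 1 percent error rate means
# roughly 1 in 100 of them is wrongly flagged.
false_positive_rate = 1 - 0.99
false_positives_per_year = events_per_year * false_positive_rate

print(f"events per year: {events_per_year:,}")
print(f"false positives: {false_positives_per_year:,.0f}")
```

The total comes to roughly 1.1 trillion events and on the order of 10 billion false alarms a year — a flood that dwarfs any plausible number of true hits, which is the scaling problem Schneier describes.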


