Sunday, May 14, 2006

Some Operational Pitfalls Of Relying On Data Mining

Simson L. Garfinkel of Harvard, techie extraordinaire and one of the founders of Wired Magazine, looks at the latest revelation to come out of the NSA's broader extra-legal warrantless eavesdropping program: the data mining of the call records of nearly all Americans in the name of keeping us "safe" from "terror."

Garfinkel shows how the effort can turn into an extremely wasteful wild goose chase.

Some cogent points:

(T)he real danger of this kind of unrestricted data mining isn't "false positives" -- that is, associations that don't really exist -- but meaningless positives. Investigating all of these positives takes time and money. And if the investigations are not done correctly, innocent lives can be ruined in the process. A principle of American jurisprudence is that it is better for a guilty person to go free than for an innocent person to be imprisoned. How will our society react to a system that requires many people to be investigated because only one of them might be a terrorist?...

Ultimately, (this) may be the greatest dilemma for those involved in collecting and mining data: What information does one not need to collect, and when is it safe to throw away a piece of data? It's human nature to hold on to information as long as possible -- once you eliminate it, you can't always get it back. And even if you could keep everything forever, would you want to? The cost of storing and protecting data is high. The more information you have, the more difficult it is to search and cross-reference it, which means you must spend more on computer systems. And perhaps the most insidious side effect of all: After spending all the money and effort to collect, keep and protect data, it is hard not to develop that nagging feeling that you really should be putting it to use.

