On (Mis)Trusting Google Desktop
Highly usable software, such as Google Desktop, can seem revolutionary, but the web-meets-desktop search capabilities are seductively porous and raise huge privacy concerns.
identifying data. It is identified by the IP address and the cookie only."
The problem is that search data itself might be sensitive even if it is completely disassociated from the person who submitted it. For example, Google Suggest, a feature included in the Google toolbar, attempts to associate words together to assist in searches by analyzing data about the overall popularity of various searches (ref: here). By searching to see if two pieces of information are associated on the Internet, you may actually disclose the fact that they are. I call this the search disclosure principle. It is a close cousin of the observer effect, in that the act of searching for information contributes to the information available. The mere fact that people are associating two pieces of information can be a potential privacy violation, national security risk, or corporate exposure even if the query is completely disassociated (or "sanitized") from the person who submitted it.
Imagine the CIA itself searching for the name "Valerie Plame" with "CIA" to see if an operative is exposed (on a blog, etc.). That search may end up associating those pieces of information together, provide a first breadcrumb to follow, and contribute to blowing that person's cover. This is of particular concern in high-sensitivity scenarios such as government work, medical trials, and confidential corporate information, where one doesn't need to know who connected the dots to turn the public, competitors, the press, or enemies on to a possible linkage.
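To make the search disclosure principle concrete, here is a minimal sketch of how a suggestion feature could surface an association from a query log that contains no IP addresses, cookies, or user IDs at all. It is an illustration only and not a description of how Google Suggest actually works: the function names, the `min_count` threshold, and the sample log (with a made-up person and company) are all assumptions.

```python
from collections import Counter
from itertools import combinations


def build_associations(anonymized_queries):
    """Count how often two terms appear together in the same query.

    The input is query text only: no IP address, cookie, or user ID.
    """
    pair_counts = Counter()
    for query in anonymized_queries:
        # De-duplicate and sort so each pair has one canonical ordering.
        terms = sorted(set(query.lower().split()))
        for a, b in combinations(terms, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def suggest(pair_counts, term, min_count=2):
    """Return terms associated with `term`, most frequent first.

    `min_count` is a hypothetical popularity threshold.
    """
    term = term.lower()
    related = Counter()
    for (a, b), n in pair_counts.items():
        if n < min_count:
            continue
        if a == term:
            related[b] += n
        elif b == term:
            related[a] += n
    return [t for t, _ in related.most_common()]


# Hypothetical, fully "sanitized" log: nobody can tell who typed what.
log = [
    "jane doe acme",
    "acme jane doe",
    "jane doe layoffs",
    "acme stock price",
]

associations = build_associations(log)
print(suggest(associations, "acme"))
```

Even though the log is anonymous, anyone who later looks up "acme" is pointed toward "jane" and "doe". The people who ran the first two searches stay hidden, but the linkage they were checking for does not.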
The privacy issues listed above assume that everything works as intended. A greater concern may be the aggregation point of sensitive data created on Google's servers. Consumers have seen so many breaches at data warehouses over the last few years (CardSystems, TJX, etc.) that one wonders how many financially driven attackers will soon be motivated to turn their sights on Google. One can only imagine that the combined Google/DoubleClick data pool would contain enough "big brother" data to have made George Orwell salivate. Perhaps most interesting is that much of the data housed by Google (such as search history) isn't covered by many disclosure laws (such as California Senate Bill 1386). This means that, depending on the breach, Google may be under no obligation to inform the public. Privacy International, a UK consumer protection group, came down particularly hard on Google in its privacy assessment, ranking it 23rd out of 23 companies studied. They went
Hugh Thompson



