Zeynep Tufekci writes: In light of the revelations of a massive data collection and snooping effort by the NSA, one response has been to suggest that privacy advocates are overreacting, and that, as a friend put it, “the scale of abuses reported is minimal/nonexistent” so this is not that big of a deal.
That the abuses of this massive data trove that we know of are very few is true — but that should not be a comfort as this hides a huge, uncomfortable problem. We don’t know what we don’t know, and not just in some abstract, philosophical sense that there will always be “unknown unknowns,” but very specifically in that what we know of the NSA’s data management practices strongly suggests that the NSA itself doesn’t really know how the data it is collecting is being used.
In a nutshell, here’s what we’ve learned, or what has been highlighted, as a result of Edward Snowden’s leaks: Almost all major software companies as well as telecommunications giants have created mechanisms by which the NSA has access to traffic and user information that goes through those companies. We have also learned that the NSA has been deliberately weakening internet security so that it can eavesdrop on it all more easily. We learned that the NSA also taps into the internet’s physical backbone and listens in on the traffic directly.
In short, the NSA is collecting a massive amount of data from multiple, varied sources. Each of these data surveillance methods produces massive amounts of complex, incongruous data in nonstop fashion. Just managing data storage at this scale is a humongous challenge, let alone categorizing and sorting it all, and then retrieving it on demand.
To manage this data beast, the NSA seems to have relied on highly competent “sysadmins,” in effect super users. The powerful wizards. What is increasingly clear, however, is that it did not find a way to provide effective oversight of these sysadmins, the custodians of it all. [Continue reading…]