Friday, June 21, 2013

Your own personal Watergate

Mr. Pierce points out that the President is being disingenuous. Obama does not do disingenuous well, and the whole "I used to disagree but then I saw what's really going on" thing is so ... naive that I just don't know what to think about the man.

There are some very basic points to watch here, points that most naive and/or well-intentioned defenders of the NSA ignore and that both Obama's quote and the Guardian pull highlight nicely:

1.  "Listening" or "reading" usually means analysis done by human beings; "storage and metadata" are analyses done by machines and do not require human beings.

2.  Keyword summarization of content is metadata, and not content.  Knowing that there's an 87.4% chance the word "XYZ" occurs fifteen times in a communication doesn't tell you what someone is saying, in the same way that more traditional metadata like "calls between subject and his mom one Sunday afternoon a month that last for twenty minutes" tells you nothing about the content of those calls.  
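To make (2) concrete, here is a minimal sketch of what a keyword summary looks like next to the message it came from. The watch list, the function name and the sample message are all made up for illustration. This version does plain counting rather than the probabilistic scoring I describe above, and it is not a description of any real system.

```python
import re
from collections import Counter

# Hypothetical watch list; these terms are placeholders only.
WATCH_TERMS = {"xyz", "abc"}

def keyword_summary(message: str) -> dict:
    """Count watch-list terms and total words; the message text is not kept."""
    words = re.findall(r"[a-z0-9']+", message.lower())
    counts = Counter(w for w in words if w in WATCH_TERMS)
    return {"word_count": len(words), "term_counts": dict(counts)}

# Made-up sample message.
summary = keyword_summary("XYZ called. Tell mom XYZ is fine, and abc too.")
print(summary)
```

The summary records how often each term appears and how long the message is, and nothing else survives. Whether that counts as content or metadata is exactly the question.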

So I put it to you, readers: does anyone know (a) whether FISA disallows the distinction in (1), and (b) whether FISA views keyword summarization as content or metadata?

Because if FISA disallows the distinction in (1) then there'd be no need for (2).  But I know, from personal experience, that the national security complex has put a lot of time, money and talent into technologies that can do (2), and so I have to suppose that the distinction in (1) is allowed.  It could be, and bear with me now, that NSA and CIA have wasted a ton of money on technologies they can't use!!  It wouldn't be the first time a large secretive government entity did something stupid.  It's hard to know whether that would be a good thing or not.

3.  It's very important to the NSA to know who's a US person and who isn't.  (A US person is a citizen or lawful resident, e.g. people on visas.)  This is an issue because the firehose is so big - we know from previous whistleblowers that one of those hoses is the AT&T backbone down on Folsom in SF - and so you need to know whom to exclude.  This is probably why the NSA needs Verizon's data, and also the PRISM data: it uses that stuff to identify which signals belong to US persons and which don't.  The NSA already gathers communications from "US-based machines", and it can easily connect the dots across those communications, but packets don't have country codes attached to them, and so they need the data from e.g. Verizon and PRISM to attach a country code to those graphs.
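As a rough illustration of "attaching a country code to a graph", here is a toy sketch. Everything in it is hypothetical: the endpoint names, the lookup table standing in for carrier records, and the function names. It also treats a country code as a stand-in for US-person status, which is a simplification, and it is my guess at the shape of the problem, not a description of what the NSA actually runs.

```python
# Hypothetical contact graph: edges between opaque endpoint identifiers.
edges = [("ep1", "ep2"), ("ep2", "ep3"), ("ep3", "ep4")]

# Hypothetical lookup of the kind subscriber records could provide.
subscriber_country = {"ep1": "US", "ep2": "US", "ep3": "DE"}

def label_edges(edges, lookup):
    """Attach a country code to both ends of each edge; unknown ends get None."""
    return [(a, lookup.get(a), b, lookup.get(b)) for a, b in edges]

def all_us(labeled):
    """Keep only the edges where both endpoints are known to be US."""
    return [(a, b) for a, ca, b, cb in labeled if ca == "US" and cb == "US"]

labeled = label_edges(edges, subscriber_country)
us_only = all_us(labeled)
print(labeled)
print(us_only)
```

Without the lookup table the graph is just anonymous dots and lines; with it, you can tell which edges you are supposed to exclude. Note that "ep4" stays unlabeled, which is the case the outside data is meant to shrink.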

(Note that contrary to some views, while there may be a strong legal distinction between the rules on tapping telephonic communications and the rules on email and web traffic, there is no operational difference in the handling.  Everything is transferred in packets, which come with a little tag that tells the various target devices whether they can read them or not.  I'm sitting at work listening to music on Spotify, listening to a conference call, typing a blog post, and all of it happens on my company's internal network and the same wires carry everything; there are not three kinds of data here, only one, with three different readers.  You can say there's a distinction legally, but from the standpoint of the data management people, that just tells me which folder I dump the packets into.)
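The "one kind of data, three readers" point can be sketched in a few lines. The port numbers, payloads and folder names here are invented for illustration, and real traffic classification is messier than a port lookup; the sketch only shows that the sorting happens after the fact, on a single stream.

```python
from collections import defaultdict

# Hypothetical packets on one wire: (destination port, payload bytes).
packets = [
    (4070, b"audio-chunk-1"),
    (5060, b"voice-frame-1"),
    (443, b"blog-post-part-1"),
    (4070, b"audio-chunk-2"),
]

# Hypothetical mapping from port to the reader (or folder) for that traffic.
READERS = {4070: "music", 5060: "conference_call", 443: "web"}

def sort_into_folders(packets):
    """Group payloads by reader; anything unrecognized goes to 'unknown'."""
    folders = defaultdict(list)
    for port, payload in packets:
        folders[READERS.get(port, "unknown")].append(payload)
    return dict(folders)

folders = sort_into_folders(packets)
print(folders)
```

The capture step never needs to know which reader a packet is for. Any legal distinction only shows up in the `READERS` table, i.e. in which folder things land.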

There are some obvious conclusions you can draw from all of this about what the NSA does, especially if you consider that (a) phone calls take up considerably less space, data-wise, than web traffic and (b) even small companies (<$5 billion in market cap) now routinely compress, store and analyze 30+ petabytes' worth of data.
