So the world is desperately excited by a programme called "PRISM", and we learn that - shockingly - the NSA reads people's emails. Can that possibly be true? Hard to believe, I realise, but stay with me.
The National Security Agency has obtained direct access to the systems of Google, Facebook, Apple and other US internet giants, according to a top secret document obtained by the Guardian
we have not joined any program that would give the U.S. government—or any other government—direct access to our servers. Indeed, the U.S. government does not have direct access or a “back door” to the information stored in our data centers.
Early Warning, who is usually sensible, says Google is lying. But I tend to trust Google, certainly more than I'd trust the Graun or WaPo to understand tech. EW's belief that Google is lying appears to stem from the US Govt confirming the existence of PRISM: but it's an awfully long way from "existence" to "details of the story are correct". And indeed the US have said explicitly that details are wrong.
I can't tell where the truth lies, but I suspect that the Graun has indulged in what Wiki would call "Original Research", which is to say connecting the dots a bit further than the sources permit. This is the key slide, and the key words are "Collection directly from the servers of...". Weeell, it's only a powerpoint slide, hardly a careful analysis. It looks like the real meaning of "directly from the servers of" is actually "we put in requests, following the law, and they comply with that law by providing data". Which is a very different thing to direct access. The former is known and boring (even if you don't like it); the latter would be new. The Graun knows about the distinction and is definitely claiming the latter (they have to be, otherwise there is no story): "Companies are legally obliged to comply with requests for users' communications under US law, but the Prism program allows the intelligence services direct access to the companies' servers."
Another thing that suggests strongly to me that this is only an analysis-of-received-data type operation is the price tag: $20M/y. That doesn't sound like the kind of money to fund searching through all of even just Google's vast hoards of data, let alone all the rest.
If you wanted a conspiracy theory, the one I'd offer would be that this is to deflect attention from the "Verizon revelation" about the phone records. You get people wildly excited about direct access, based on some ambiguous slides. That all turns out to be nonsense, and so people then start waving all the rest away.
[Update: According to Business Insider the WaPo has modified and weakened its story somewhat. It does indeed say "updated", though not in what way. I did like BI's "Many have questioned other aspects of the revelations, such as the amateurish appearance of the slides (though they are believable to those with government experience)".]
[UUpdate: there is a US govt factsheet. Some of it is potentially weaselly: "Under Section 702 of FISA, the United States Government does not..." - yeah, but what about things *not* done under section 702? However, it does make some direct positive statements: "PRISM is not an undisclosed collection or data mining program. It is an internal government computer system used to facilitate the government’s statutorily authorized collection of foreign intelligence information from electronic communication service providers..." So it looks more and more to me as though either the US govt, and Google, are lying to us directly; or (far more likely) the Graun and WaPo are wrong.]
[UUUpdate: the Graun sez "Technology giants struggle to maintain credibility over NSA Prism surveillance". The substance is the same: Graun makes claims, the companies say they're wrong, and the Graun has no evidence. The institution that is leaking credibility is the Graun, not the companies.
And: just when you thought they couldn't lose the plot any more, we have them calling this the biggest intelligence leak in the NSA's history. That's twaddle. So far, this is nothing: they have no substance.]
[UUUUpdate: at last, the dog that didn't bark in the night speaks, though softly. Bruce Schneier, who I'd have hoped would be on top of this, has some stuff to say. He praises whistleblowers in general; I agree. But he only talks about PRISM in an afterword, and it's pretty clear that he doesn't know what is going on either. He praises Edward Snowden but I think that is premature - some of the stuff the Graun has him saying makes him sound rather tin-foil-hat to me.]
[Late update: the Graun has now admitted that the original story was wrong, although to their discredit only by implication. They were not honest enough to publish an upfront correction - or, in other words, they are simply dishonest.
Kevin Drum points out that the Graun was misled by the words "direct access" in the original powerpoint - and makes the obvious point (that I've thought of, but not written down): why didn't Snowden tell the Graun this? It's hard to think of a reason that redounds to his credit. The most obvious are (a) he's clueless, or (b) he knew that with that error corrected, the powerpoint was dull. It's not possible that it was an oversight, since the Graun talked to him *after* the story was public, and this was a major point.
More: The Graun (or is it just Glenn Greenwald?) is claiming total accuracy and no backpedalling. Read his point (4). How odd.]
Much later: even though the "direct access" claim has been thoroughly refuted, the Graun is still peddling this crap on Friday 12 July 2013. Have they no shame?
Everybody needs a good conspiracy theory once in a while and it works when it plays to the fears of the audience. (Evil snooping government.)
I remember the "CIA owns the Uni Caf and is spying on students" meme that did the rounds back in the late 60s early 70s when I was a student. It was a much-loved leftist conspiracy theory.
I can certainly accept the possibility that PRISM is all above board, and I am not one of those people who really cares anyway. I doubt the NSA would be interested in my trivial communications.
On the other hand, if PRISM was all above board, then why does one slide show Skype being integrated one month after Microsoft purchased it? If PRISM just involves legal requests then why hadn't Skype been integrated earlier? Why did it apparently require Microsoft to purchase it? Or are the NSA so incompetent that they didn't realize Skype existed until Microsoft bought it?
As for how it could work, the NSA could have their own people working in sections of these companies so it wouldn't require a massive coverup, just a few people at the top of the companies having signed equivalents of official secrets acts, in which case they would be obliged to deny involvement.
It sounds all conspiracy-theory-like but so too would extraordinary rendition 10 years ago.
Google, Facebook, Microsoft touch so much data it's a security service's dream to have access to it. There is nothing the NSA can do to obtain so much useful (and useless I guess) data themselves. I admit it IS more than possible that they would have just ignored such an opportunity just because of US laws. But on the other hand we've seen "them" find loopholes to avoid laws in the past. Sending prisoners to other countries to torture them for example, or redefining combatants as insurgents to avoid the Geneva Convention and the like. It doesn't seem wild to me that they had found some loophole by which they believe they could argue tapping this data is legal.
Anyway as I said it doesn't really concern me, it's only slightly interesting because it's a mystery. If I knew the truth either way it would become instantly boring.
I was thinking it sounded cheap too but the $20m could just be for the access rather than the analysis.
$2 million each for 10 sets of optical taps at each provider's core backbone(s). Sounds reasonable.
[Not close. For the taps, yes. But for the crunching of the results? No -W]
It is of course technically feasible to reconstruct raw data streams to visible application data via packet contents (e.g., port 80/443 packets are for web servers, etc). In fact this technique has been commercially available for well more than a decade.
The US Government doesn't have to have direct access to the servers of Google, Facebook, et al., to get most of this information. They can position themselves as the man-in-the-middle and get the IP packets while they are in transit. I don't know that they are actually doing this, but they are certainly capable of it.
[But that adds in the burden of them decrypting it all, which is distinctly non-trivial -W]
Personally, I don't think they are reading everybody's e-mail (as opposed to the metadata, which program has been ongoing for nearly a decade). But if they have decided that you are a person of interest, it would not be too difficult for them to read yours. Given that the program is based on a broadly-written law which has never been successfully challenged (there is a catch-22 here: you have to be able to prove that you are a target of this surveillance to have standing to challenge the law, but the program is conducted in such a way that it is nearly impossible for anyone to prove that he is a target), it is almost certainly legal.
Here's a bit of background on how these types of programs are supposed to work
And now that there are ample hints about what kinds of data are being collected for each call, how long before some highly whacker starts testing out varying patterns of calls (playing with iterations of different to, from, length, whatever else), that get interpreted into strings of bits, and finds one that overflows a buffer in some interesting way?
More to the point -- a good government that with the best of motives accumulates a growing collection of material that would be a treasure to a later bad government -- lacks foresight.
"[But that adds in the burden of them decrypting it all, which is distinctly non-trivial -W]"
Possibly true, now that encryption of SMTP traffic and use of HTTPS has become common. Five years ago, sniffing traffic would give you a wealth of unencrypted data ...
What I've been able to gather, from various official and unofficial statements, is that they are performing traffic analysis (packet metadata is not encrypted) and if that analysis leads to something interesting, that can then lead to a subpoena which is then used to get to unencrypted content in [say] Google's database[s] with Google's cooperation.
"...they are performing traffic analysis (packet metadata is not encrypted) and if that analysis leads to something interesting, that can then lead to a subpoena which is then used to get to unencrypted content in [say] Google’s database[s] with Google’s cooperation."
#9 ...which seems to me to be nothing particularly novel, though not necessarily desirable. Plus M. Geist seems to think it's legal:
...though, again, not necessarily desirable.
DARPAnet - think there is no backdoor?
NSA is exceedingly good at decryption. That's its bread-n-butter reason for existence.
bigcitylib - agreed on all counts.
eli - the most commonly used software which implements the internet in all of its layered and multi-protocoled glory is open source and the protocols are published standards. Now there might be a back door through which one can enter a google datacenter and deploy sniffing hardware (and indeed ISPs can be forced to give this kind of access with proper search warrants and subpoenas and the like; the current kerfuffle largely stems from the belief by some that google, facebook et al have given NSA permanent access of this kind, and to all information flowing through the datacenters, which the companies are denying).
david - modern encryption algorithms used by those following best practices are provably hard (in the computational complexity sense) to decrypt. The only exploitable weakness in AES-256 that's been found, for instance, reduces the effective key length from 256 bits to about 254. A lot of sharp minds outside NSA and other government organs have tackled decryption with very little success. It is possible that NSA (for instance) has made a huge breakthrough in theoretical mathematics skewering the theoretical basis for strong encryption, but I doubt it. Weaker encryption (what was DES? 56-bit keys? I don't even remember any more) is breakable with less computational effort but is still expensive. I think William's point isn't that it's impossible to decrypt a small number of messages of extreme interest that are sloppily encrypted, the point is you can't routinely decrypt even a moderate volume of traffic. There's a reason why you can be forced to 'fess up your keys, etc.
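To put rough numbers on "breakable but expensive" versus "can't routinely decrypt", here is a minimal back-of-envelope sketch. The guess rate is a made-up assumption for illustration only, not a claim about any real machine, and the function name is hypothetical. It simply divides half the keyspace by the rate.

```python
# Rough estimate of exhaustive key search time.
# ASSUMED_RATE is a hypothetical figure chosen for illustration only.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
ASSUMED_RATE = 1e12  # assumption: one trillion key guesses per second


def years_to_search(key_bits: float, guesses_per_second: float) -> float:
    """Expected years to find a key, trying half the keyspace on average."""
    expected_guesses = 2.0 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


# Effective key lengths in bits: 56 and 64 for older ciphers, 128 as a
# common modern size, 254 for the slightly weakened 256-bit case above.
for bits in (56, 64, 128, 254):
    print(f"{bits}-bit key: about {years_to_search(bits, ASSUMED_RATE):.3g} years")
```

Each extra bit doubles the work, which is why shaving two bits off a 256-bit key changes nothing in practice, while a key in the 56-bit range falls within reach of a determined attacker under this assumed rate.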
I'm curious as to why you would trust google. I fundamentally wouldn't believe anything they say about this, given their own history of stealing data and lying about it.
[Apologies for the late approval of this (2013/06/18, since you ask). As to why: I don't recognise your "their own history of stealing data and lying about it". Citation needed -W]
dhogaza --- I'll just say that I know more about what NSA is capable of and actually does than I'll attempt to describe here. It is enough to say that I stand by my prior comment. NSA would, of course, prefer for you to doubt it.
Not a new story; does nobody remember Stellar Wind and Room 641A?
Bush II bypassed the FISA to collect all calls, emails and texts after 9/11 to monitor foreign agents, but people involved in the program revealed that they kept monitoring journalists... To exclude them as people who often contacted foreign sources!
Only the politically naive could think that the program actually stopped when it was exposed in 2008.
No modern government finds it acceptable for its citizens, or subjects, to be able to communicate secretly with other citizens or foreign sources. The content of any communication can be secret, but the fact that the communication occurred must be information that the government can access.
This was one of the problems the Blackberry phone and messaging system encountered. India especially objected to the strong encryption on the Blackberry messaging system that prevented government from knowing the source, destination and time of any messages being sent.
Pattern recognition programs that spot the type of communications made by terrorists, activists, journalists and sexual deviants only need the source, destination and time of any call, email, message or clicked link to detect a 'threat' to the state. Reading the content is not required to detect suspicious activity. Anybody that thinks that their emails and internet use plus phone and messaging activity is being read is paranoid and ignoring the logistical problems of analysing that much data. But it is not paranoid to assume that both the governments and commercial organisations are collecting the source, destination and time of all online and phone activity.
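As a toy illustration of how much falls out of source, destination and time alone, here is a minimal sketch. The records, names and helper functions are all invented for the example; real traffic analysis is far more elaborate, and nothing here reflects how any actual program works.

```python
from collections import Counter, defaultdict

# Hypothetical call records: (caller, callee, hour of day). No content at all.
records = [
    ("alice", "bob", 9),
    ("alice", "carol", 10),
    ("dave", "bob", 23),
    ("erin", "bob", 23),
    ("bob", "frank", 23),
    ("carol", "alice", 11),
]


def contact_graph(rows):
    """Map each party to the set of distinct parties it exchanged calls with."""
    graph = defaultdict(set)
    for src, dst, _hour in rows:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


def busiest_hours(rows, party):
    """Count the calls involving one party, grouped by hour of day."""
    return Counter(hour for src, dst, hour in rows if party in (src, dst))


graph = contact_graph(records)
# The party with the most distinct contacts stands out as a hub.
hub = max(graph, key=lambda p: len(graph[p]))
print(hub, sorted(graph[hub]), busiest_hours(records, hub).most_common(1))
```

Even this crude version picks out who sits at the centre of a group and when they are active, without a single message being read.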
modern encryption algorithms used by those following best practices are provably hard (in the computational complexity sense) to decrypt.
This is one reason (but not the only one--signal to noise issues are another) why I don't think the NSA is reading everybody's e-mail. It's not practical to decrypt every packet by brute force. But as has been pointed out above, those IP packets have to have metadata giving their origin and destination, and that metadata cannot be encrypted. Any man in the middle can read that metadata. In fact, that's what the internet was designed to do: find a way to get a packet to its destination by any intermediate path available, even if most of the potential intermediate paths are out of commission. And if the authorities have decided that they need to take a closer look at you, they can issue subpoenas to peek at the unencrypted data, or as a last resort they can brute force decrypt a key--that would take enough time that they can't do it for everybody, but not so much that they couldn't do it for a handful of cases they deem to be of high importance.
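The point that origin and destination travel in the clear can be seen from the IPv4 header layout itself. The sketch below hand-builds a packet with reserved documentation addresses and an opaque payload, then reads the header back without touching the payload. The helper name is hypothetical and the checksum is left at zero for brevity, so this is an illustration of the header format rather than a working capture tool.

```python
import socket
import struct

IPV4_HEADER = "!BBHHHBBH4s4s"  # fixed 20-byte IPv4 header, network byte order


def parse_ipv4_header(packet: bytes) -> dict:
    """Read the fixed IPv4 header that precedes the (possibly encrypted) payload."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack(IPV4_HEADER, packet[:20])
    return {
        "version": version_ihl >> 4,
        "header_bytes": (version_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,  # 6 means TCP
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hand-built example packet; addresses come from the documentation-only ranges.
payload = b"\x17\x03\x03 opaque encrypted bytes"
header = struct.pack(
    IPV4_HEADER,
    0x45, 0, 20 + len(payload), 0, 0, 64, 6, 0,  # checksum left as 0 here
    socket.inet_aton("192.0.2.10"), socket.inet_aton("198.51.100.7"),
)
info = parse_ipv4_header(header + payload)
print(info["source"], "->", info["destination"], "protocol", info["protocol"])
```

Whatever encryption is applied to the payload, any router along the path has to be able to read these fields in order to forward the packet.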
"But I tend to trust Google, certainly more than I’d trust the Graun or WaPo to understand tech". This is no longer about who knows more about technology, but simply about whether obfuscating the truth about PRISM is better than coming clean is better damage control. The person who exposed this now known, and he did work for the NSA, and he has infinitely more knowledge about what the NSA does as opposed to somebody who writes a blog.
[No, he doesn't have infinitely more knowledge. Indeed, it looks like he has very little knowledge. If he knows a lot, why is it that all he's released is one rather ambiguous powerpoint presentation? -W]
"dhogaza — I’ll just say that I know more about what NSA is capable of and actually does than I’ll attempt to describe here. It is enough to say that I stand by my prior comment. NSA would, of course, prefer for you to doubt it."
No, it is not enough to stand by your comment without backing it up. If, for instance, you believe the NSA has cracked the prime factorization problem, provide some evidence.
"... he did work for the NSA, and he has infinitely more knowledge about what the NSA does as opposed to somebody who writes a blog."
NSA is big on compartmentalization ...
"Pattern recognition programs that spot the type of communications made by terrorists, activist, journalists and sexual deviants only need the source, destination and time of any call, email, message or clicked link to detect a ‘threat’ to the state. Reading the content is not required to detect suspicious activity."
This is traffic analysis, which I mentioned above, and as you say, it can be surprisingly effective.
As is pointed out in this post (h/t Brad DeLong).
Summary: A bombshell story published in the Washington Post this week alleged that the NSA had enlisted nine tech giants, including Microsoft, Google, Facebook, and Apple, in a massive program of online spying. Now the story is unraveling, and the Post has quietly changed key details. What went wrong?
I tell ya, if "a jury of his peers" still meant what it used to (it doesn't) it would be a gas of a trial, having people who actually understand what _will_ be possible to do with the data -- and understand it.
Imagine seeing a dozen competent knowledgeable network-aware programmers sitting in the jury box listening to lawyers try to describe what's going on and what will be possible.
Despite your little reference at the end being the onion type - what if? What if in 1990 or so someone decided it's really hard to digitize millions of phone calls, and it would be so much easier if people typed everything up for the taking? Email must be the intelligence analyst's wet dream, no more steaming up envelopes, deciphering bad handwriting, gluing the envelope back shut ...
Probably too far fetched for a government agency to be that far seeing.
The Terrorism Database by Hendrik Hertzberg in the 2013 May 20 issue of The New Yorker, page 35: "… the more than three-million names on the government’s Terrorist Identities Datamart Environment list. ... This fall, the National Security Agency, the largest and most opaque component of the counterterrorism behemoth, will cut the ribbon on a billion-dollar facility called the Utah Data Center, near Bluffdale. The center, reportedly, will gobble up a galaxy of intercepted telecommunications. According to a rough estimate by Digital Fourth, an advocacy group based in Massachusetts, each of the Utah Data Center's two hundred (at most) professionals will be responsible for reviewing five hundred billion terabytes of information each year,..."
5 Basic Unknowns about the NSA 'Black Hole'
It's still unclear if the National Security Agency has been collecting all Americans' phone and other records, and for how long
By Justin Elliott, Theodoric Meyer and ProPublica
I have an idea that at least some of this is done exactly as Google has done with its webbots: crawl the web and download pages. This must certainly bring in a significant fraction of what is evaluated. And if legal for Google, and everyone else, is surely legal for the NSA. It is the parts where they get to see the stuff beyond that that is of concern.
Of course, Facebook people willingly gave (and give) it all up to anyone who cares to view their pages if viewable.
If the Government is collecting these records AND using them, you would expect to see drug seizures, Mafia dons going to prison, angry people with guns and on-line fantasies being arrested, tax evaders imprisoned. That is, if the data being collected was being used for the better running of society.
If the Government were looking for associations of interest, you'd also find Senators and Congressmen and Mayors under investigation for their repeated contacts with powerful but disreputable persons. But they are not.
So, what are the data-miners doing? Preparing files and lists, but not acting on them. Why? Knowledge, as J. Edgar Hoover knew, gives power to those who choose to use it.
If the streets, Wall Street included, and government officers were being cleaned up, I'll bet the American people would accept a lot of personal intrusion in their lives. If the Bad Guys had to go to writing letters again, they would be emasculated - just as Osama bin Laden was once he could not use modern media. Which would also be a good thing in most Americans' view. The problem with Prism etc. is that the power of social overview is NOT used for the good of the community. Ever.
And how effective is it, anyway? Nazi Germany had the power of the State behind it and could not detect an assassination plot against Hitler. Romania and East Germany successfully spied on their citizens and in the process shut down social development. North Korea controls everything, but only those at the top live reasonable lives. The ex-Soviet Union watched but could not stop the dissent they observed, despite, again, the coercive power of the State. And despite the power and the will, none of these examples led to a better, more efficient, more effective, more stable, more happy society.
If the data mining has any value, we need to see results. If it is political in nature, we are in for a terrible time if history has any relevance.
Doug Proctor --- Amen.
I am curious to know what "direct access to servers" means to everybody arguing about this.
[To me, it means the same as "direct access to the servers at work": that I can browse the entire directory tree freely from root downwards, and can read anything (of course I can't actually do that at work, not having the required permissions). That's assuming that Google's servers actually look like one directory tree; perhaps they don't -W]
"The Terrorism Database by Hendrik Hertzberg in the 2013 May 20 issue of The New Yorker, page 35: “… the more than three-million names on the government’s Terrorist Identities Datamart Environment list. …"
So less than 1% of the population of the US ... and this supports the "NSA monitors everyone" meme how, exactly?
I'm a bit depressed, I thought you'd come back with personal knowledge of classified information to support your case.
"I am curious to know what “direct access to servers” means to everybody arguing about this."
In reality, in response to a subpoena, it would probably mean the datacenter operator (google, facebook, etc) crafting scripts to filter logs or to probe databases for specific information regarding specific targets.
But you know that, why ask silly Qs? You know that tapping into the interhose is pretty much an exercise in traffic analysis, though I have to admit the Utah NSA datacenter makes one wonder (I suspect it basically boils down to storing traffic, which government statements essentially back up - store stuff, analyze deeply if you have enough for a subpoena. I don't like it, but it's not as horrific as some are stating).
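For what "crafting scripts to filter logs ... regarding specific targets" might look like in its simplest form, here is a minimal sketch. The log format, column names, accounts and function name are all invented for illustration; nothing here is known about how any of the companies actually handle such requests.

```python
import csv
import io

# Hypothetical access log; column names and accounts are invented.
LOG = """timestamp,account,remote_ip,action
2013-06-01T09:00:00Z,user1@example.com,192.0.2.1,login
2013-06-01T09:05:00Z,user2@example.com,192.0.2.2,send
2013-06-01T09:07:00Z,user1@example.com,192.0.2.1,send
2013-06-02T10:00:00Z,user3@example.com,192.0.2.3,login
"""


def rows_for_targets(log_text, targets):
    """Return only the rows whose account is named in the request."""
    wanted = {t.lower() for t in targets}
    reader = csv.DictReader(io.StringIO(log_text))
    return [row for row in reader if row["account"].lower() in wanted]


matches = rows_for_targets(LOG, ["user1@example.com"])
for row in matches:
    print(row["timestamp"], row["action"])
```

The contrast with "direct access" is that the operator runs the query and hands over only the matching rows; the requester never browses the rest.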
"And how effective is it, anyway? Nazi Germany had the power of the State behind it and could not detect an assassination plot against Hitler."
Strictly speaking, not true, they detected several such plots and in the case of the one that went off, were pretty close to it. Part of the power of the state was behind it, in actuality ...
David Benson, you disappoint me, I was really hoping for proof that the NSA has cracked the prime factorization problem.
Not that they have ears. They've listened to international calls for decades, ears are not a surprise.
(I've been in NSA, CIA, DSD, talked to the GCHQ folks.)
My question was NOT what was actually happening, but what it meant to people arguing about it. Note the difference.
How stories change. A recent Guardian article reads:
The Guardian understands that the NSA approached those companies and asked them to enable a "dropbox" system whereby legally requested data could be copied from their own server out to an NSA-owned system. That has allowed the companies to deny that there is "direct or indirect" NSA access, to deny that there is a "back door" to their systems, and that they only comply with "legal" requests – while not explaining the scope of that access.
They have silently dropped the claim of direct access. The focus of their article is how their reporting has prompted Google, Microsoft and others to call on the US government to allow more disclosure on how much data collection is done.
This is priceless.
Aside from the gold-plated hypocrisy of Google, Microsoft, et al pretending to care about privacy, we have people who post every detail of their lives on Facebook up in arms because the gummint can use data to figure out who their friends are.
Yes, I've painted with a slightly broad brush, but still...
[I won't defend micro$oft, obviously, but what makes you call Google hypocritical? What thing have they done one way but talked another? -W]
dhogaza --- Already in the 1980s a mid-level manager of NSA mathematicians told Peter Freyd that "they had solved 'the big one'". Peter thought the man was referring to having solved P NP. Based on some recent work regarding iterative solutions to NP problems, I'll now opine the opposite for many decrypting problems. The technique appears to be infeasibly slow, requiring massive amounts of parallel supercomputer time. Guess who has that.
> the Utah NSA datacenter
Should be running about the time Utah secedes.
"If you wanted a conspiracy theory, the one I’d offer would be that this is to deflect attention from the “Verizon revelation” about the phone records"
Yes I agree! Let me join your "Denier Community". Please. I believe in Arctic Sea Ice melting! Shrinking polar bears population. Rising temperatures!! And Tornadoooos!
Forget about html: P not equal NP.
I'll add that this may well be so in the limit, but that problem instances may well have polynomial time solutions, O(n^k) where k is quite large and depends upon the instance.
I might be all wet, but proofs in either direction are not forthcoming. The algorithm requires both more time and storage than I can easily set up; still thinking about an experimental lashup but it'll be discouragingly slow with a single hard disk.
...the Second Law of Thermodynamics...
[Sigh. You need to give up the OT spam. Last warning -W]