Now it appears that the NSA wants to mine data from sites like Myspace and Facebook. Frankly, I'm not really bothered by this. The more useless information they gather, the more impossible it is for such data mining to find anything useful. The more storage space and processing power they use up searching through vast databases of information like the fact that Mandy in Sioux City, Iowa likes Disturbed and is allergic to peaches, the less likely their surveillance is to do anything useful.
But, Ed, maybe this is the idea? Maybe the NSA is betting on others that think like you to not worry about such a meaningless site in hopes that they'll gain support when they start searching more meaningful areas? Personally, I think they should be stopped as soon as possible before they spread like the viruses they are!
My understanding of data mining in general is that the more raw data you have, the more effective it becomes. Marketing companies are already adept at this and can predict buying patterns for totally unrelated things. The example I saw in a presentation from HOPE 2004 by a privacy expert was that VW knows that if you buy crunchy peanut butter you're much more likely to buy a VW Bug than if you buy creamy. Storage space and processing power for the NSA are non-issues. They can have all they want, and both of those are relatively cheap. There's quite possibly a lot of information that they can gather from Myspace and Facebook, matching favorite bands with ethnicities or voting tendencies or sexual orientation or I don't know what else. Just as with cryptography, the more data you have, the more you can decipher from it and the faster your deciphering goes.
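For readers who want to see what an association like the peanut butter example looks like in numbers, here is a minimal sketch of "lift", one common way to measure it. Every count below is made up for illustration; none of it comes from VW or from the HOPE presentation, and the function name is hypothetical.

```python
def lift(n_total, n_a, n_b, n_both):
    """How much more likely B is among people with trait A than in general.

    A value of 1.0 means no association; above 1.0 means A and B
    occur together more often than chance would predict.
    """
    p_b = n_b / n_total            # share of everyone who has B
    p_b_given_a = n_both / n_a     # share of A-people who also have B
    return p_b_given_a / p_b


# Hypothetical counts: 100,000 shoppers, 20,000 buy crunchy peanut butter,
# 1,000 buy the car, and 400 do both.
example_lift = lift(n_total=100_000, n_a=20_000, n_b=1_000, n_both=400)
print(f"lift = {example_lift:.2f}")
```

With more records the counts get larger and the estimate gets steadier, which is the sense in which more raw data makes this kind of mining more effective.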
One point: While you are right that such data mining is singularly useless for picking people who're Up To Something out of all the background chatter, it is extremely well suited to finding out whether a specific person - such as the candidate who's going to run against you in your next election - has something in his past that you can use to smear him.
In short, the only possible use for such data banks is to harass people whom those in control of the data banks already know and dislike.
Not to mention the false positive issue. This is just another way to be flagged as 'suspicious', a flag which you won't even know about, or be able to contest.
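The false positive issue is easiest to see with some arithmetic. The sketch below is only an illustration: the population, the number of real targets, and both accuracy figures are assumptions picked for the example, not anything known about an actual NSA system, and the function name is hypothetical.

```python
def flagged_breakdown(population, true_targets, sensitivity, false_positive_rate):
    """Split the people a screening system flags into real targets and innocents."""
    true_pos = true_targets * sensitivity
    false_pos = (population - true_targets) * false_positive_rate
    precision = true_pos / (true_pos + false_pos)  # chance a flagged person is a real target
    return true_pos, false_pos, precision


# Assumed numbers: 300 million people, 1,000 real targets, and a system that
# catches 99% of targets while wrongly flagging 1% of everyone else.
tp, fp, precision = flagged_breakdown(300_000_000, 1_000, 0.99, 0.01)
print(f"real targets flagged: {tp:,.0f}")
print(f"innocent people flagged: {fp:,.0f}")
print(f"share of flags that are real: {precision:.4%}")
```

Under these assumed numbers, innocent people flagged outnumber real targets by roughly three thousand to one, even though the system is "99% accurate" in both directions.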
I tend not to believe in vast conspiracies, but I remember well the revelations about the LAPD's anti-terror squad of the 1980s, which kept files on Jesse Jackson, Jackson Browne, Stevie Wonder, the mayor, the Olympic Organizing Committee and a host of other non-threats, while employing as an undercover agent a man who grew up to become a member of the SLA (Patty Hearst's kidnapers). Consequently, JS's
doesn't seem that far off the mark.
Other revelations in the LAPD scandal included several officers keeping files at their homes. Since USB flash drives now allow one to transport huge quantities of data without even making a bulge under a sportscoat, the possibility of non-official uses of the data is almost as frightening as the prospect of official use.
I do believe that we should have very serious concerns. What exactly is the government going to do with this information? Does the fact that I comment to blog sites such as this place me in some sort of risk category? Will that trigger a cascade of searches for other web-sites that I comment on or download from? Will that permit the construction of an index on me?
Will this brand my family in some way and lead to the denial of a security clearance for my daughter in the future?
OK. Before everyone tells me to sit down and put my tinfoil hat back on, let me explain that I attended college in the late 60's. My parents pleaded with me not to sign any petitions or join any political organizations. They feared that this might prevent me from getting a security clearance in the future.
You must remember that both of my parents lived through the McCarthy era. They had seen that people who innocently signed petitions during the 1930's could get hauled up before the House Un-American Activities Committee in the 1950s. My father worked in the maritime industry, first as an engineering officer for a shipping company and then on shore-side maintenance. Based upon his knowledge about counter-smuggling efforts, he had little doubt that he had been "checked out."
I am a civilian volunteer for a Federal organization that has been absorbed into the Department of Homeland Security. Guess who needs a security clearance just to continue in the volunteer maritime first-responder role that I have enjoyed for 10 years. Fingerprints, FBI check, the whole nine yards.
We know from media reports that the current administration admits that it runs "psy-ops" directed towards foreign governments or networks that sometimes affect the domestic population.
Does anyone believe that Adm. Poindexter's Total Information Awareness (http://www.fas.org/irp/agency/dod/poindexter.html) is really gone? Hasn't it just gone to a new part of the security bureaucracy?
Communication via the internet permits disparate individuals to form groups and plan action: anything from planning a Renaissance Festival to a terrorist strike.
It also permits the planning of political actions, meetings, mobilizations, voter registration drives, get out the vote drives, etc.
The key to building any grassroots political movement is knowing who has interests similar to your own, where do they live, how committed are they to the cause? Planning of legitimate political activities is facilitated by the internet.
If you make everyone afraid, as my parents were afraid, to admit publicly that they sympathize with any political cause, you can shut the whole organizing process down. You shut the internet down as a useful means of organizing. You make it something to be shunned for fear of exposing individuals to exactly the type of "indexing" that I discussed in the first paragraph.
Take for example the recent focus on "eco-terrorism." Do Sierra Club members come into contact with persons who might engage in such direct actions as "tree spiking" to deter illegal logging? Maybe the Department of the Interior should use the TIA approach with conservation and environmental groups just to be sure.
Vegans get placed under surveillance? Why? Maybe they know someone who is a threat to the meat packing industry.
Worse, I just don't know if DARPA can really pull all this data mining off or they are just telling us that they can so that we will all "be careful." Either way, the repression is the same.
PS: After 12 years, I will be resigning from the organization that I described above. I don't need this crap.
Of course that will leave me some extra free time. Maybe I'll start to drink liberally or learn how to dance like a kossack. So many blogs; so little time.
OK, I'll go put my tinfoil hat on now and enjoy a glass of Kool-Aid. Oh look, there's another story about a missing white girl..........and look, they have music. I haven't heard this song in years. I didn't think that we needed to hear it any more. Guess I was wrong. The fourth verse is my favorite and expresses better than I can the thrust of this posting.
Cue Music:
Buffalo Springfield
For What It's Worth
Stephen Stills, 1966
There's something happening here
What it is ain't exactly clear
There's a man with a gun over there
Telling me I got to beware
I think it's time we stop, children, what's that sound
Everybody look what's going down
There's battle lines being drawn
Nobody's right if everybody's wrong
Young people speaking their minds
Getting so much resistance from behind
I think it's time we stop, hey, what's that sound
Everybody look what's going down
What a field-day for the heat
A thousand people in the street
Singing songs and carrying signs
Mostly say, hooray for our side
It's time we stop, hey, what's that sound
Everybody look what's going down
(This is my favorite lyric)
Paranoia strikes deep
Into your life it will creep
It starts when you're always afraid
You step out of line, the man come and take you away
We better stop, hey, what's that sound
Everybody look what's going down
Stop, hey, what's that sound
Everybody look what's going down
Stop, now, what's that sound
Everybody look what's going down
Stop, children, what's that sound
Everybody look what's going down
In the last year I saw a field-of-use-category list of all U.S. supercomputers -- meteorology, petroleum exploration, etc. Half of them were something like, "unknown". It took me a few minutes to figure that out. My bet is the NSA. Half. They have no problems with storage space or processing power, at all. Also, they're experimenting with technology. They don't have a s.o.p. datamining setup, but are trying to develop one.
While I'm barely conversant in data mining, I once worked in IT. My understanding is the NSA is, indirectly, attempting to use DM specifically to find and/or ID individuals. But the idea that this massive project, eventually costing billions, is to dig up political dirt doesn't make sense. There are far better, cheaper, more efficient ways to do that which don't involve a host of NSA professionals, software vendors, and outside IT and academic consultants.
The potential for political abuse is certainly there and eventually will be used. But most discussions on the topic I've seen show that few understand the nature of NSA's DM goals, or even what DM is. The NSA wants every fucking piece of data they can get, updated all the time, all constantly being re-evaluated by AI. They're not after "specific people", per se, but after identifying potential associates of terrorists, social networks and their activity, here and abroad, and getting to the terrorists by watching these connections, "watching" through DM.
That is, the NSA is trying to create a DM system that will bring in oceans-full of data 24/7 and connect all the dots, over and over and over again every day. These aren't traditional databases. NSA even wants them to be predictive, and they'll need robust, cutting-edge AI to do this and to simply flag interesting drops in this constantly changing ocean. It's conceivable, with enough data sources, they could watch people move about in near real-time this way. Unfortunately, every last poor bastard of us will be in the mix also, forever. Even after we're dead.
One example, which I've seen little mentioned, is that they not only got those domestic phone calling records, they also installed some fiber-optic diverters, so we must presume they are getting real-time calling records, without any mediation by those phone companies.
Everything we say and do that can be/is digitized, yes, like blog comments, will be grist for DM if they can physically get it. Nothing is stopping them from getting a lot of this right now. If they can plug directly into the phone lines, they can plug into the network.
They screwed up with Poindexter and Total Information Awareness. I doubt the momentum changed an iota in the NSA; they just went black with everything.
In the end the NSA anti-terrorist data-mining project will be the Digital Domesday Book of the 21st century.
I certainly don't mean this in a way that should be construed to support these kinds of illegal/unconstitutional surveillance, and I wouldn't want to see this thought appropriated as a way to spread more fear, but even before I knew any specifics of the NSA's counterterrorism programs, I've wondered about this.
If I were part of an organized and well-funded worldwide terrorist organization, and I had some operations upcoming that required communication, here's what I would do. I would "game the system". Computers have the capability to transmit tremendous amounts of data; that's why spam works. I could put so much incorrect, non-factual and out-of-date information into the system that the information that was accurate would A.) lose credibility quickly as so much other information turned out to be false and B.) get lost in the noise. Sure, the NSA's coders could keep developing software that would recognize disinformation, but my coders could keep updating the system also.
It just seems to me that for the US to become more and more reliant on data analysis systems is to allow for a potential enemy to completely cloud your vision. I think it's a problem worth thinking about.
mikey
As far as I can tell, the current administration positively Revels in cloudy thinking.