Lions, Tigers, and Crowds

By jwilbanks on March 26, 2009.

I gave a talk at eTech two weeks ago. It was a busy time - I was in the middle of my wedding, which was in Brazil, and I actually had to leave Brazil and fly to San Jose to give the talk, have a couple of meetings, and fly right back so that I could rejoin the wedding festivities. We were also announcing a collaboration with Microsoft (which has garnered its own attention and criticism, and deserves its own blog post here, which it will get).

I'm also trying out some new themes for talks. I gave over 70 talks last year and although I loved the three core talks I gave much of the year, when you talk that much you want to say something new. It was also one of my first talks to a truly general web 2.0-ish crowd as opposed to my normal audiences of scientists and policy wonks. I felt a pressure to be both accessible and challenging.

Well, one of the neat things about the Web is that you find out how successful you are in your goals. Cameron Neylon is a blogger I follow pretty closely - he's smart *and* funny, a rare combination - and he's posted a deep and thoughtful meditation on what was to me a minor chord progression in my talk. But apparently it resonated and sparked some conversation out in the luminiferous aether.

On Slide 17 I state: "there is no crowd." Cameron has a nice post on what crowds mean, two different kinds of crowdsourcing, and the Polymath project that Michael Nielsen's written about.

I wasn't talking about any of that stuff. I was simply trying to point out to an audience used to a potential market of billions that, comparatively speaking, there just aren't a lot of scientists, and that has to be taken into account in our strategies for creating open science.

The low total numbers of scientists, and the high barriers to becoming a scientist, represent a design challenge for open science. I tend to think the urge to share is distributed across people at different levels. Certainly, some of us want to share more than others. Maybe it's Gaussian, maybe it's not. But if x% of people like to share generally, then it's likely that some y% of scientists like to share, and it's probably within the same order of magnitude.

The difference is that 0.005% of all web users gets us Wikipedia. 0.005% of geneticists gets us a table at T.G.I. Friday's. My point was that the math breaks down for crowds and science.
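To make the arithmetic concrete, here is a rough sketch. Both population figures are assumptions for illustration only (an order-of-magnitude guess at web users in 2009 and a purely hypothetical head count of geneticists); only the 0.005% rate comes from the comparison above.

```python
# Illustrative only: both population figures are assumptions, not data from the post.
WEB_USERS = 1_500_000_000  # assumed order of magnitude for web users circa 2009
GENETICISTS = 100_000      # hypothetical head count of working geneticists

# 0.005% is 5 people in every 100,000; integer math keeps the result exact.
SHARERS_PER_100K = 5


def sharers(population: int, per_100k: int = SHARERS_PER_100K) -> int:
    """Return how many people in a population share at the given rate."""
    return population * per_100k // 100_000


web_sharers = sharers(WEB_USERS)
geneticist_sharers = sharers(GENETICISTS)

print(f"web contributors: {web_sharers:,}")
print(f"geneticist contributors: {geneticist_sharers:,}")
```

Under those assumed numbers the same sharing rate yields tens of thousands of contributors on the open web and a handful of geneticists, which is roughly one restaurant table.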

I don't make this point to discourage us from open science. I make it because I believe it creates a series of what Dennett would call "forced moves" for our strategies to achieve open science. And we have some guidance from how we arrived at today's open web and programming and cultural worlds.

First, we have to encode enough knowledge into abstract, reusable forms that science gets easier. This gets sniffed at all the time - "science is supposed to be hard" bellyaching, and "I don't want normal people to understand genetics, because it's complicated" and so forth.

In the immortal words of Socrates, f*ck that.

Programming used to be really hard. In the 1970s, it *was* really hard. In the 80s it was easier. In the 90s it was easy enough that I could do it, and enough people could do it together that we got free and open source software. Today, my aunt Fran can do things like add Facebook apps that cover up all sorts of complexity and allow her to do things that used to be the exclusive domain of programmers.

It is coming to the sciences. If you don't believe me, I would encourage you to attend an iGEM conference and have your mind blown by teenagers programming bacteria, or to do an eBay search for gene sequencers, or to order yourself a copy of a scratch-built gene sequence from Mr Gene...and that's just biology.

This change has got to happen in science. First, it makes the scientists themselves more powerful (as OOP made programmers more powerful). Then it lets us bring in the smart people who haven't been through the guild training sessions. Then it's going to let in crowds. This change requires an object-oriented approach to knowledge, which is why I think we need ontologies, and which is why I am a convert to the Semantic Web.

The second design constraint is access. If we only have a few people, we have to make damn sure that each of them is as powerful as possible. That's part of point one. But remember that we're talking about knowledge here, and there's a lot of it out there already. Unfortunately it's not object oriented.

Hell, it doesn't even have hyperlinks.

But we have to convert it to the new formats that let us empower scientists, who will in turn empower the emergence of crowds. That means we have to have access to it, and the rights to make the transformative changes to formats. It's unlikely that we've got the systems right just yet. Remember there were dozens upon dozens of hypertext systems developed before the right one evolved at the right time. We need to let a lot of experiments happen and let evolutionary selection work its magic. We can't count on one company, one school, one publisher to get it right.

Access and knowledge encoding. They're part of the same continuum, part of what lets us move forward culturally *and* technologically. They turn the lack of a crowd from a problem into a solution, because they let us increase the power of the crowd we have, and over time, dissolve the participation barriers that keep the crowd artificially small.

I repudiate the idea that science is special and hard because of science. I think it's special and hard because we have failed to imagine the world in which, 25 years from now, science is part of our lives just like the web is part of our lives. When I was a kid, the idea that we'd use computers in literally everything was fiction. But a group of people made it happen, by creating a world where the tools of programmers diffused out to a wider world, and by encoding principles of access into networks and documents.

For me, open science is about enabling precisely the same transformation. We have a model. We just have to have the will and the wits to make it work again.


Your point about hyperlinks is important. I have been involved in the conversation about your Microsoft collaboration and the lack of hyperlinks was quite apparent there. A commenter at my site used an example from the human diseases ontology - I wanted to refer to a particular disease but could not find a web endpoint, only an entire ontology to download in some format I know nothing about. One of the things Microsoft could help us with would be to put all the ontologies used by the plugin online so anyone can reference a node.

See my post asking why a word processor ontology plugin should not use hyperlinks to embed semantics into a document so that everyone can play.
