Cloud Computing

In general, I try to keep the content of this blog away from my work. That's not because it would get me in trouble, but because I spend enough time on work, and blogging is my hobby. But sometimes there's an overlap.

One thing that's come up in a lot of conversations and a lot of emails is the idea of cloud computing. A lot of people are interested in it, but they're not really sure what it is, or what it means.

So what do we mean when we talk about "cloud computing"? What's the cloud? How's it different from good old-fashioned client/server computing?

The idea of cloud computing is that there's a world of computers sitting in data centers, scattered around the world. The programs that you run, the data that you store, are somewhere out there - but you don't know where, and more importantly, you don't care where.

A simple example of this idea: I started writing this blog on Blogger. Blogger is a piece of
software run by Google on (probably) thousands of computers in Google's data centers. I don't
know where the server running the old blog is; I don't know where the data for it is stored. Blogger is "in the cloud".

That kind of thing is the basic point of cloud computing. Cloud computing is built
around the idea of resources: to run some program, to perform some task,
you need some set of resources. Resources are things like processing time, network
bandwidth, disk storage, or memory. As a user of the cloud, you don't
need to know or care where the resources are. You just know what you need,
and you buy that quantity of resources from whoever can provide it to you most
conveniently.

Cloud-based software is similar to client-server computing in many ways.
Both are based on the idea that you don't really run programs on your own computer. Your
computer provides a window into an application, but it doesn't run the application itself.
Instead of running the program on your computer, all you do on your own computer is
run some kind of user interface. The real program is running somewhere else,
on a computer called a server. You use the server because for some reason, the resources
necessary to run the program aren't available on your local computer - it's cheaper, faster,
or more convenient to run the program somewhere else, where the necessary resources
are easy to obtain.

The big difference is in what you know: in traditional client-server systems, you had a
specific computer that was your server, and that's where your stuff was running. The computer may
not have been sitting on your desk in front of you, but you knew where it was. For example, when I
was in college, one of the first big computers I used was a VAX 11/780, nicknamed "Gold". Gold lived in the Rutgers University computing lab in Hill Center. I used Gold pretty much daily for at least a year before I actually got to see it. The data center had at least 30 other computers - several DEC 20s, a couple of Pyramids, an S/390, and a bunch of Suns. But
of those machines, I specifically used Gold. Every program that I wrote, I wrote specifically
to run on Gold, and that's the only place that I could run it.

In the cloud, you don't have a specific server that you use. You have computing resources -
that is, someone is renting you a certain amount of computation on some collection of computers
somewhere. You don't know where they are; you don't know what kind of computers they are. You
could have 2 massive machines with 32 processors each, and 64 gigabytes of memory; or they could
be 64 dinky little single-processor machines with 2 gigabytes of memory. The computers where you
run your program could have great big disks of their own; or they could be diskless machines
accessing storage on dedicated storage servers. To you, as a user of the cloud, that doesn't
matter. You've got the resources you pay for, and where they are doesn't matter, so long
as you get what you need.
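
To make that concrete, here's a minimal sketch in Python. The names (`Machine`, `ResourceRequest`, `satisfies`) are made up for illustration and don't correspond to any real cloud provider's API. It just adds up the two hypothetical collections of machines described above and checks each against the same request. It also assumes that raw totals are all that matter, which is a simplification: real programs often care how the resources are divided up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """One computer in the cloud (hypothetical model)."""
    processors: int
    memory_gb: int


@dataclass(frozen=True)
class ResourceRequest:
    """What the user asks for: quantities, not specific machines."""
    processors: int
    memory_gb: int


def total_resources(fleet):
    """Add up the processors and memory across a collection of machines."""
    return (
        sum(m.processors for m in fleet),
        sum(m.memory_gb for m in fleet),
    )


def satisfies(fleet, request):
    """True if the fleet's combined resources cover the request."""
    processors, memory_gb = total_resources(fleet)
    return processors >= request.processors and memory_gb >= request.memory_gb


# Two massive machines: 32 processors and 64 GB of memory each.
big_fleet = [Machine(processors=32, memory_gb=64)] * 2

# Sixty-four dinky machines: 1 processor and 2 GB of memory each.
small_fleet = [Machine(processors=1, memory_gb=2)] * 64

# The user only states how much they need.
request = ResourceRequest(processors=64, memory_gb=128)

print(total_resources(big_fleet), satisfies(big_fleet, request))
print(total_resources(small_fleet), satisfies(small_fleet, request))
```

Both collections add up to the same 64 processors and 128 gigabytes of memory, so from the point of view of the request, they're interchangeable.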

The cloud metaphor is actually a good one. A cloud is a huge collection of tiny droplets of
water. Some of those droplets will fall on my yard, providing the trees and bushes with water.
Some will fall onto land where they will run off into the reservoir that my drinking water comes
from. Clouds grow from evaporated water, which comes from all over the place. When it
comes to clouds, what I care about is that enough water falls on my yard to keep
the plants alive, and that enough water winds up in my reservoir so that I have enough
to drink. I don't care which cloud drops water on my yard. I don't care
where on earth that water came from. To me, it's all just water - every droplet is
pretty much exactly the same, and I can't tell the difference. So long as I get enough,
I'm happy.

You can think of the various data centers around the world, where companies have swarms of computers, as clouds. Google, Amazon, Microsoft, IBM, Yahoo, and others all have thousands of machines connected to networks, running all sorts of software. Each of those centers is a cloud, and each processor, each disk drive, is a droplet of water in that cloud. In the cloud
world, when you write a program, you don't know what computer it's going to run on. You don't know where the disks that store the data are. And you don't need to care. You just need to know
how many droplets you need.

So cloud computing = Client/Server + Load Balancer. Somehow, that makes it seem less impressive. Like the Thin Client rage of the 1990's, this buzzword will soon pass.

I've always thought the cool stuff behind cloud computing was actually distributed computing. Tools like Bigtable, MapReduce, Dynamo, Erlang, etc. are the actual enabling technologies behind cloud computing.

It's kind of annoying to me that cloud computing has become this buzzword, and every company is saying "we should do something in the cloud", even if they aren't really using any of the technologies that make that possible.

Also, I'm a little skeptical of services like EC2, S3, and to some extent App Engine. With EC2, can you ensure your machines are running on the same rack or even the same datacenter? What about the location of S3 relative to EC2 nodes? Have you ever tried running a MapReduce over a big table in a different datacenter? Bad idea.

It seems like, irrespective of *where* the resources are stored, you need to know what the underlying topology is, and what the throughput and latency between nodes are.

As far as App Engine goes, what's the point of being able to scale up if you can't do sophisticated computations with a MapReduce or equivalent batch processing? It seems like it reduces you to scaling up CRUD websites that can't offer much in the way of computationally expensive aggregation of data. Certainly, you couldn't write something like Google or Facebook on top of App Engine...

I think the idea of infrastructure as a service is interesting, but the offerings I see out there right now sure look iffy.

Problem is, for a fair amount of data you do need to care where it resides. Some data may be legal in one jurisdiction but illegal somewhere else. Or data may enjoy very heavy legal protection in one area but have almost none in another. Would Google be happy storing its financial records and internal development documents on another company's cloud system, stored somewhere in the world without having any actual control over what happens to the data or who has access to it?

I'd say a minimal requirement for cloud computing to be used for anything weightier than blogs and email would be good, trustworthy encryption of all data happening at the client level (implying, for instance, that the client-level systems need to be open source). The cloud provider (and by implication, nobody else outside the organization) should not be able to find out what data is actually being stored with it.

This opens up a new era of weather-related metaphors.

Hurricane computing - sounds very powerful, but rather scary. Probably not.

Cold Snap computing - brisk, invigorating, may catch you unprepared. Be sure to keep the tomatoes under cover.

Contrail computing - a virtual cloud!

By Bayesian Bouff… (not verified) on 21 May 2009

I'm no computing expert, but wasn't there some buzz a while back about MS and/or Mac going to "cloud" type operating systems? So we would buy a basic dumb box and never own the operating system, it would run from some server in some cloud. I never really became a fan of that idea.

Hmm, so with this method, computational power becomes a commodity. Makes me think that this could lead to market-based computation, similar to how energy markets work right now, where you can buy energy from the grid as needed and sell it when you have excess. I'm sure there would be hurdles in terms of common architecture as well as security, but it's neat to think about at least...

This is all well and good, but what will we do when an artificial intelligence emerges from the cloud and decides to exterminate the human race?

By Valhar2000 (not verified) on 21 May 2009

Wow! Thanks a lot, that's the first clear and down-to-machine explanation of cloud computing that I've seen.

When there was a post in a Google Research Blog about the need for enabling communication among the 'clouds', I tried searching about what they really meant, but didn't end up with anything solid. Now I get the idea - they just meant we need a way for sharing information between the Cloud operators' servers and a clear protocol for that, isn't it? Doesn't seem as radical an idea as that article claims, once I understand it.

When you say:

You have computing resources - that is, someone is renting you a certain about of computation

I suppose you meant "amount" :-)

Janne raises an extremely good point. In addition to the example she gives, many companies, including the one I work for, have legal responsibilities (e.g. HIPAA) to guarantee the privacy of the data stored. With cloud computing that's impossible.

Even worse - we can't use cloud technologies such as blogger or twitter or google apps for our business communications because those communications frequently include private information.

By NoAstronomer (not verified) on 29 May 2009

Re #11:

First: no one ever said that cloud computing was good for everything. It's a useful new model of programming and resource allocation - but it's never going to replace private computers for all applications.

Second: of course you can encrypt everything. In fact, I believe that most of the cloud computing systems out there encrypt all data that gets written into storage. So for many applications, you can provide an acceptable quantity of security. But since it's generally unavoidable for most applications to require some information to be decrypted on one of the cloud servers, it's completely unacceptable for some applications to run on a server that's owned by some other organization.

Finally: "cloud" doesn't just mean "Google's data centers", or "data centers run by companies like Google". Cloud computing is really just about widely scattered computing resources - there's absolutely nothing stopping companies from setting up their own data centers, or providing data services that include stronger security guarantees. For example, I used to work for IBM. IBM provides a lot of services that we would now call cloud. Many of those services are used by hospitals, stock brokers, banks, and other organizations that have absolute security requirements. The key is that IBM is considered a trustworthy company, and they're willing to provide a contract guaranteeing a particular degree of security for the data. In other words, security is another resource that can be purchased, in a variety of different ways, if you need it.

Hey how about a private cloud. Taking those client server applications and making them hardware independent, hosting them yourself via the web and authenticating through your active directory. It is a huge opportunity to use those legacy pcs and provide anytime anywhere secure access to your network applications via the web. Lastly, end the hassle of managing your VPN.

Re #5;

Have you seen Bespin? The most obvious example of a cloud metaphor being taken to heart.

Re #13;

I think that is a part of the idea behind the new Google Wave (currently exists only as a preview video, closed developer trial and scattered early reviews on a few sites). The software is Open Source, so anyone could set up and run a Wave, which then operates as a rather nifty integrated cloud-style setup. It's mostly geared towards integrating the current Google Cloud apps into a single setup, so GMail, Google Documents, etc.

If only there was someone who knew more about it to give a fuller explanation...

By Paul Schofield (not verified) on 13 Jun 2009

Pardon My confusion!
Do you mean we get to step back to the days where we go to a dumb/smart terminal and login to some mainframe somewhere and get our data somewhere and save our precious data somewhere so we can go hunting our precious data somewhere when we need it only to find out our precious data is not there because the company that stored it has gone out of business and I have paid all this money to cloud compute?
Gee I think I'll keep all my data safely backed up and stored in a safe on my site!

I do not agree with this statement "... client-server computing in many ways. Both are based on the idea that you don't really run programs on your own computer. Your computer provides a window into an application"

This describes a Terminal Server type environment not client/server. Client/Server environments have software that runs on the client as well as the server. Processing can occur on the client or the server.

Hello, just saw this pic and it's awesome, would love to use it for a design. Would that be OK with you?

The idea of cloud computing is that there's a world of computers sitting in data centers, scattered around the world.

By cheap computers (not verified) on 10 Aug 2009

This sucks. I have a teacher named Mr. Carmody, and he makes us make a 20-slide PowerPoint on it. As far as I'm concerned, we can live without it.

Then what is the difference between grid computing and cloud computing?

By Anonymous (not verified) on 04 Jul 2010