Good Math, Bad Math

Finding the fun in good math; Shredding bad math and squashing the crackpots who espouse it.

Mark Chu-Carroll (aka MarkCC) is a PhD Computer Scientist, who works for Google as a Software Engineer. My professional interests center on programming languages and tools, and how to improve the languages and tools that are used for building complex software systems.

Cloud Computing

Category: Computation, Coolness, Programming
Posted on: May 20, 2009 1:50 PM, by Mark C. Chu-Carroll

In general, I try to keep the content of this blog away from my work. I don't do that because it would get me in trouble, but rather because I spend enough time on work, and blogging is my hobby. But sometimes there's an overlap.

One thing that's come up in a lot of conversations and a lot of emails is the idea of cloud computing. A lot of people are interested in it, but they're not really sure what it is, or what it means.

So what do we mean when we talk about "cloud computing"? What's the cloud? How's it different from good old-fashioned client/server computing?

The idea of cloud computing is that there's a world of computers sitting in data centers, scattered around the world. The programs that you run, the data that you store, are somewhere out there - but you don't know where, and more importantly, you don't care where.

A simple example of this idea: I started writing this blog on Blogger. Blogger is a piece of software run by Google on (probably) thousands of computers in Google's data centers. I don't know where the server running the old blog is; I don't know where the data for it is stored. Blogger is "in the cloud".

That kind of thing is the basic point of cloud computing. Cloud computing is built around the idea of resources: to run some program, to perform some task, you need some set of resources. Resources are things like processing time, network bandwidth, disk storage, or memory. As a user of the cloud, you don't need to know or care where the resources are. You just know what you need, and you buy that quantity of resources from whoever can provide it to you most conveniently.
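To make the "buy a quantity of resources" idea concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the `Resources` and `Offer` types, the provider names, and the prices are hypothetical, and I'm assuming that "most conveniently" just means "cheapest offer that covers what you need". Real providers price and package things very differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resources:
    """A bundle of resources (hypothetical units, for illustration only)."""
    cpu_hours: float
    storage_gb: float
    bandwidth_gb: float

    def covers(self, need: "Resources") -> bool:
        # True if this bundle has at least as much of everything as `need`.
        return (self.cpu_hours >= need.cpu_hours
                and self.storage_gb >= need.storage_gb
                and self.bandwidth_gb >= need.bandwidth_gb)


@dataclass(frozen=True)
class Offer:
    provider: str         # made-up provider name
    available: Resources  # what the provider can supply
    price: float          # assumed flat price for the whole request


def choose_offer(need: Resources, offers: List[Offer]) -> Optional[Offer]:
    """Pick the cheapest offer that covers the need, or None if none does."""
    candidates = [o for o in offers if o.available.covers(need)]
    if not candidates:
        return None
    return min(candidates, key=lambda o: o.price)


need = Resources(cpu_hours=100, storage_gb=50, bandwidth_gb=10)
offers = [
    Offer("provider-a", Resources(500, 100, 100), price=40.0),
    Offer("provider-b", Resources(80, 1000, 100), price=10.0),  # too little CPU
    Offer("provider-c", Resources(200, 60, 20), price=25.0),
]
best = choose_offer(need, offers)
print(best.provider)  # prints "provider-c"
```

Notice that nothing in the request says where the machines are or what they look like. The only question asked of a provider is whether it can supply the quantities.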

Cloud-based software is similar to client-server computing in many ways. Both are based on the idea that you don't really run programs on your own computer. Your computer provides a window into an application, but it doesn't run the application itself. Instead of running the program on your computer, all you do on your own computer is run some kind of user interface. The real program is running somewhere else, on a computer called a server. You use the server because for some reason, the resources necessary to run the program aren't available on your local computer - it's cheaper, faster, or more convenient to run the program somewhere else, where the necessary resources are easy to obtain.

The big difference is in what you know: in traditional client-server systems, you had a specific computer that was your server, and that's where your stuff was running. The computer may not have been sitting on your desk in front of you, but you knew where it was. For example, when I was in college, one of the first big computers I used was a VAX 11/780, nicknamed "Gold". Gold lived in the Rutgers University computing lab in Hill Center. I used Gold pretty much daily for at least a year before I actually got to see it. The data center had at least 30 other computers - several DEC 20s, a couple of Pyramids, an S/390, and a bunch of Suns. But of those machines, I specifically used Gold. Every program that I wrote, I wrote specifically to run on Gold, and that's the only place that I could run it.

In the cloud, you don't have a specific server that you use. You have computing resources - that is, someone is renting you a certain amount of computation on some collection of computers somewhere. You don't know where they are; you don't know what kind of computers they are. You could have 2 massive machines with 32 processors each and 64 gigabytes of memory; or you could have 64 dinky little single-processor machines with 2 gigabytes of memory. The computers where you run your program could have great big disks of their own; or they could be diskless machines accessing storage on dedicated storage servers. To you, as a user of the cloud, that doesn't matter. You've got the resources you pay for, and where they are doesn't matter, so long as you get what you need.
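Here's a small Python sketch of that point, using the two hardware layouts from the paragraph above. The `Machine` type and the `satisfies` check are hypothetical names of my own, and I'm assuming the memory figures are per machine. It also treats processors and memory as freely poolable across machines, which is a simplification: a program that needs 40 processors sharing one memory wouldn't be happy on the small boxes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Machine:
    processors: int
    memory_gb: int


def totals(pool: List[Machine]) -> Tuple[int, int]:
    """Total (processors, memory in GB) across every machine in the pool."""
    return (sum(m.processors for m in pool),
            sum(m.memory_gb for m in pool))


def satisfies(pool: List[Machine], processors: int, memory_gb: int) -> bool:
    """True if the pool, taken as a whole, provides the requested amounts."""
    have_procs, have_mem = totals(pool)
    return have_procs >= processors and have_mem >= memory_gb


# Two very different layouts (memory assumed to be per machine).
big_iron = [Machine(processors=32, memory_gb=64)] * 2
small_boxes = [Machine(processors=1, memory_gb=2)] * 64

print(totals(big_iron))     # (64, 128)
print(totals(small_boxes))  # (64, 128)

# A request for 64 processors and 128 GB is met by either pool.
print(satisfies(big_iron, 64, 128), satisfies(small_boxes, 64, 128))
```

From the point of view of someone asking for 64 processors and 128 gigabytes, the two pools are interchangeable, which is exactly why the user doesn't need to know which one they got.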

The cloud metaphor is actually a good one. A cloud is a huge collection of tiny droplets of water. Some of those droplets will fall on my yard, providing the trees and bushes with water. Some will fall onto land where they will run off into the reservoir that my drinking water comes from. Clouds grow from evaporated water, which comes from all over the place. When it comes to clouds, what I care about is that enough water falls on my yard to keep the plants alive, and that enough water winds up in my reservoir so that I have enough to drink. I don't care which cloud drops water on my yard. I don't care where on earth that water came from. To me, it's all just water - every droplet is pretty much exactly the same, and I can't tell the difference. So long as I get enough, I'm happy.

You can think of the various data centers around the world, where companies have swarms of computers, as clouds. Google, Amazon, Microsoft, IBM, Yahoo, and others all have thousands of machines connected to networks, running all sorts of software. Each of those centers is a cloud, and each processor, each disk drive, is a droplet of water in that cloud. In the cloud world, when you write a program, you don't know what computer it's going to run on. You don't know where the disks that store the data are. And you don't need to care. You just need to know how many droplets you need.

Comments

1

So cloud computing = Client/Server + Load Balancer. Somehow, that makes it seem less impressive. Like the Thin Client rage of the 1990's, this buzzword will soon pass.

Posted by: Nate | May 20, 2009 5:42 PM

2

I've always thought the cool stuff behind cloud computing was actually distributed computing. Tools like bigtable, mapreduce, dynamo, erlang, etc that are the actual enabling technologies behind cloud computing.

It's kind of annoying to me that cloud computing has become this buzzword, and every company is saying "we should do something in the cloud", even if they aren't really using any of the technologies that make that possible.

Also, I'm a little skeptical of services like EC2, S3, and to some extent app engine. With EC2, can you ensure your machines are running on the same rack or even the same datacenter? What about the location of S3 relative to EC2 nodes? Have you ever tried running a mapreduce over a big table in a different datacenter? Bad idea.

It seems like, irrespective of *where* the resources are stored, you need to know what the underlying topology, throughput, and latency between nodes are.

As far as app engine goes, what's the point of being able to scale up if you can't do sophisticated computations with a mapreduce or equivalent batch processing? It seems like it reduces you to scaling up CRUD websites that can't offer much in the way of computationally expensive aggregation of data. Certainly, you couldn't write something like google or facebook on top of app engine...

I think the idea of infrastructure as a service is interesting, but the offerings I see out there right now sure look iffy.

Posted by: Brendan | May 20, 2009 6:16 PM

3

Problem is, for a fair amount of data you do need to care where it resides. Some data may be legal in one jurisdiction but illegal somewhere else. Or data may enjoy very heavy legal protection in one area but have almost none in another. Would Google be happy storing its financial records and internal development documents on another company's cloud system, stored somewhere in the world without having any actual control over what happens to the data or who has access to it?

I'd say a minimal requirement for cloud computing to be used for anything weightier than blogs and email would be good, trustworthy encryption of all data happening at the client level (implying, for instance, that the client-level systems need to be open source). The cloud provider (and by implication, nobody else outside the organization) should not be able to find out what data is actually being stored with it.

Posted by: Janne | May 20, 2009 10:26 PM

4

Wow....you gave quite a lot of information about cloud computing. While going through other articles, I also came across one such article which talked about the challenges being addressed by cloud computing.

Would like to share it with u ...here is the link-

http://www.webguild.org/2009/05/is-cloud-computing-the-answer.php

Posted by: kristine | May 21, 2009 8:23 AM

5

This opens up a new era of weather-related metaphors.


Hurricane computing - sounds very powerful, but rather scary. Probably not.


Cold Snap computing - brisk, invigorating, may catch you unprepared. Be sure to keep the tomatoes under cover.


Contrail computing - a virtual cloud!

Posted by: Bayesian Bouffant, FCD | May 21, 2009 1:00 PM

6

I'm no computing expert, but wasn't there some buzz a while back about MS and/or Mac going to "cloud" type operating systems? So we would buy a basic dumb box and never own the operating system, it would run from some server in some cloud. I never really became a fan of that idea.

Posted by: Martym | May 21, 2009 5:56 PM

7

Hmm, so with this method, computational power becomes a commodity. Makes me think that this could lead to market-based computation, similar to how energy markets work right now, where you can buy energy from the grid as needed and sell it when you have excess. I'm sure there would be hurdles in terms of common architecture as well as security, but it's neat to think about at least...

Posted by: Chris | May 21, 2009 8:19 PM

8

This is all well and good, but what will we do when an artificial intelligence emerges from the cloud and decides to exterminate the human race?

Posted by: Valhar2000 | May 22, 2009 5:27 AM

9

Wow! Thanks a lot, that's the first clear and down-to-machine explanation of cloud computing that I've seen.

When there was a post in a Google Research Blog about the need for enabling communication among the 'clouds', I tried searching about what they really meant, but didn't end up with anything solid. Now I get the idea - they just meant we need a way for sharing information between the Cloud operators' servers and a clear protocol for that, isn't it? Doesn't seem as radical an idea as that article claims, once I understand it.

Posted by: Sundara Raman | May 22, 2009 1:34 PM

10

When you say:

You have computing resources - that is, someone is renting you a certain about of computation

I suppose you meant "amount" :-)

Posted by: Oscar | May 23, 2009 2:46 PM

11

Janne raises an extremely good point. In addition to the example she gives, many companies, including the one I work for, have legal responsibilities (e.g. HIPAA) to guarantee the privacy of the data stored. With cloud computing that's impossible.

Even worse - we can't use cloud technologies such as blogger or twitter or google apps for our business communications because those communications frequently include private information.

Posted by: NoAstronomer | May 29, 2009 4:18 PM

12

Re #11:

First: no one ever said that cloud computing was good for everything. It's a useful new model of programming and resource allocation - but it's never going to replace private computers for all applications.

Second: of course you can encrypt everything. In fact, I believe that most of the cloud computing systems out there encrypt all data that gets written into storage. So for many applications, you can provide an acceptable quantity of security. But since it's generally unavoidable for most applications to require some information to be decrypted on one of the cloud servers, it's completely unacceptable for some applications to run on a server that's owned by some other organization.

Finally: "cloud" doesn't just mean "Google's data centers", or "data centers run by companies like Google". Cloud computing is really just about widely scattered computing resources - there's absolutely nothing stopping companies from setting up their own data centers, or providing data services that include stronger security guarantees. For example, I used to work for IBM. IBM provides a lot of services that we would now call cloud. Many of those services are used by hospitals, stock brokers, banks, and other organizations that have absolute security requirements. The key is that IBM is considered a trustworthy company, and they're willing to provide a contract guaranteeing a particular degree of security for the data. In other words, security is another resource that can be purchased, in a variety of different ways, if you need it.

Posted by: Mark C. Chu-Carroll Author Profile Page | May 29, 2009 4:36 PM

13

Hey how about a private cloud. Taking those client server applications and making them hardware independent, hosting them yourself via the web and authenticating through your active directory. It is a huge opportunity to use those legacy pcs and provide anytime anywhere secure access to your network applications via the web. Lastly, end the hassle of managing your VPN.

Posted by: Greg Kaufman | May 29, 2009 5:43 PM

14

Yes, I did not expect such a topic. But seems to be interesting. Please read comments. Yours

Posted by: nasza-gwara | June 12, 2009 11:10 AM

15

Re #5;

Have you seen Bespin? The most obvious example of a cloud metaphor being taken to heart.

Re #13;

I think that is a part of the idea behind the new Google Wave (currently exists only as a preview video, closed developer trial and scattered early reviews on a few sites). The software is Open Source, so anyone could set up and run a Wave, which then operates as a rather nifty integrated cloud-style setup. It's mostly geared towards integrating the current Google Cloud apps into a single setup, so GMail, Google Documents, etc.

If only there was someone who knew more about it to give a fuller explanation...

Posted by: Paul Schofield | June 13, 2009 8:46 AM

16

Cells Are Like Robust Computational Systems, Scientists Report

"ScienceDaily (June 17, 2009) — Gene regulatory networks in cell nuclei are similar to cloud computing networks, such as Google or Yahoo!, researchers report today in the online journal Molecular Systems Biology."

http://www.sciencedaily.com/releases/2009/06/090616103205.htm

Posted by: Jonathan Vos Post | June 19, 2009 1:39 AM

17

The cell regulatory systems got there first.

Posted by: Eric | June 23, 2009 11:41 AM

18

Hi, I agree with the above entry. Greetings thread author. Waiting for the next entries.

Posted by: czarna lista allegro | June 26, 2009 10:32 AM

19

Pardon My confusion!
Do you mean we get to step back to the days where we go to a dumb/smart terminal and login to some mainframe somewhere and get our data somewhere and save our precious data somewhere so we can go hunting our precious data somewhere when we need it only to find out our precious data is not there because the company that stored it has gone out of business and I have paid all this money to cloud compute?
Gee I think I'll keep all my data safely backed up and stored in a safe on my site!


Posted by: DK Corky | July 2, 2009 1:56 PM

20

I do not agree with this statement "... client-server computing in many ways. Both are based on the idea that you don't really run programs on your own computer. Your computer provides a window into an application"

This describes a Terminal Server type environment not client/server. Client/Server environments have software that runs on the client as well as the server. Processing can occur on the client or the server.

Posted by: Tom | July 9, 2009 12:04 PM

21

Hello, just saw this pic and it's awesome, would love to use it for a design. Would that be ok with you?

Posted by: Tich | August 4, 2009 5:14 PM

22

The idea of cloud computing is that there's a world of computers sitting in data centers, scattered around the world.

Posted by: cheap computers | August 10, 2009 9:01 AM

23

Interesting! Please type this for another topic. Regards to the author

Posted by: hosting bez limitu | September 2, 2009 11:46 AM

24

this sucks i have a teacher named Mr.Carmody and he makes us make a 20 slide power point on it as far as I'm concerned we can live without it

Posted by: Bj | June 11, 2010 1:39 PM

25

then what is the difference between grid computing and cloud computing

Posted by: Anonymous | July 4, 2010 1:43 PM

© 2006-2011 ScienceBlogs LLC. ScienceBlogs is a registered trademark of ScienceBlogs LLC. All rights reserved.