Counting cycles

By wcrawford on June 14, 2009.

I picked up a little buzz about Google software engineers planning to rework the guts of some major open-source software to make it run faster. Since it wasn't software I use, I didn't read enough to remember what software, but it brought up memories...

Walking to school in the snow, 3 miles, uphill, both ways

No, this isn't going to be one of those posts. I only wish we'd had the kind of raw processing power in my early years (decades?) as a systems analyst/programmer that we take for granted now. Most people today spend more time on what needs to be done than on how few cycles it takes, and that's as it should be.

This is just a little harmless nostalgia, none of it longing for those days.

(If you want my take as of three years ago as to how I think I'd deal with being young again, here's your post.)

Early on, cycles really didn't count

As I've noted elsewhere, my first systems analysis and programming involved an IBM 188 Collator. (Hmm. 20% of all Google results for the search [IBM '188 collator'] are my handiwork. That may be depressing.) In some ways, the 188 was a marvelous machine, particularly in 1961 when it was introduced: IBM's first punch card equipment using solid-state circuitry and core.

That's right, core memory--visible devices, just a wee bit larger than today's RAM bits. I honestly don't remember how much core the 188 had--maybe 64 bytes, but that's vague memory. I do remember how you programmed it: with a double-wide board full of holes, into some of which you put jumpers to make circuit connections. Hard-wired programming...

For the circulation system, it wasn't a question of using too many computing cycles. You got 650 cycles per minute--that is, one cycle for each card feed. Your program made whatever logical comparisons it could between two cards (one from each reader), given the limited core and your ingenuity, then fed either both cards into a common bin or one or both cards into other bins.
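For readers who have never seen a collator, the compare-and-route cycle described above can be modeled in a few lines. This is only a toy sketch, not a description of how the 188 was actually wired: the function name `collate`, the three bin names, and the assumption that both decks arrive sorted on the same key are all hypothetical choices made for illustration.

```python
def collate(primary, secondary):
    """Toy model of a two-feed collator run.

    Assumes both decks are already sorted on the same key. Each pass
    through the loop stands in for one card-feed cycle: compare the
    front card of each deck, then route one or both cards to a bin.
    """
    matched, primary_only, secondary_only = [], [], []
    i = j = 0
    while i < len(primary) and j < len(secondary):
        a, b = primary[i], secondary[j]
        if a == b:
            # Keys match: both cards drop into the common bin.
            matched.extend([a, b])
            i += 1
            j += 1
        elif a < b:
            # Primary card has no partner: route it on its own.
            primary_only.append(a)
            i += 1
        else:
            # Secondary card has no partner: route it on its own.
            secondary_only.append(b)
            j += 1
    # Whatever is left in either hopper has no partner.
    primary_only.extend(primary[i:])
    secondary_only.extend(secondary[j:])
    return matched, primary_only, secondary_only
```

Calling `collate([1, 3, 5, 7], [3, 4, 7, 9])` puts the two 3s and two 7s in the common bin and leaves 1 and 5 in one side bin, 4 and 9 in the other.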

Sounds primitive. Was primitive. Worked.

(More technologically interesting, in some ways, was IBM's last card sorter--by far the fastest, and using vacuum feed rather than pushers to move the cards and an optical sensor rather than brush contact, so that a card would last for thousands of sorts without wearing out. Without the speed and gentleness of the IBM 84 [2000 cards per minute, which is fast for a mechanical device processing little pieces of stiff paper], the circ system would never have kept up with Doe Library's volume of business.)

A bit later, every cycle counted

Comparing the computing power of, say, the IBM 360/65 that I did early programming on (indirectly, sending decks of cards over from Berkeley to UCSF) with that of the Intel Core 2 Duo notebook I'm writing this on is a chump's game. Looking at some sources, I see "1.25 million calculations a second" for the '65, which had one megabyte of RAM (rather a lot in those days). How does that compare with two CPUs, each with 1.6 billion processing cycles per second, and 4 gigabytes of RAM? You got me; I'm not sure there is a real answer to that question.
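For what it's worth, the naive arithmetic on the figures quoted above looks like the sketch below. It is deliberately crude: it treats a "calculation" and a "processing cycle" as the same thing, which they are not, and it assumes binary megabytes and gigabytes. The variable names are illustrative only.

```python
# Figures quoted in the post; treating calculations and cycles as
# interchangeable is an assumption made only to get a rough ratio.
calcs_per_second_360 = 1.25e6        # IBM 360/65
cycles_per_second_core2 = 2 * 1.6e9  # two CPUs at 1.6 billion cycles each

naive_speed_ratio = cycles_per_second_core2 / calcs_per_second_360

# Memory, assuming 1 MB = 1024**2 bytes and 1 GB = 1024**3 bytes.
ram_ratio = (4 * 1024**3) // (1 * 1024**2)

print(f"speed, naively: {naive_speed_ratio:.0f}x")  # 2560x
print(f"memory: {ram_ratio}x")                      # 4096x
```

Roughly three and a half orders of magnitude on both counts, with all the caveats that make the comparison a chump's game in the first place.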

The thing is, doing library processing on a machine with that kind of power required a lot of optimization. The ideal language for the work I wanted to do was clearly PL/I, for its combination of logic and string processing--but the head of the systems office properly wouldn't let me use PL/I because the early compilers just didn't produce tight code. Instead, I used assembler (BAL)...

When PL/I (Optimizer) came along (and we'd moved up to a somewhat faster S/360), I could start using the high-level language--but not without paying attention. I remember a classic example: Cases where I needed to do translates to normalize characters for sorting purposes. The classy way to do that would be to include two strings of characters in the TRANSLATE statement, the source and the object. But, after trying that and seeing the results, I moved to using two 256-character strings (not variables), containing the source and object sets.

Why? Because it made a difference of at least 10:1 in the overall running time of the program--changing it from something we couldn't use to something we could. And once you understood some assembler and learned to read PL/I's pseudo-assembler output, you could see why:

If you were translating using variables, then the compiler would generate code that built two 256-character strings each time the translate was performed, then do the translate--a big, unwieldy loop of code.

If you were translating using fixed strings, then the compiler generated one assembler statement. One. I think the difference for the translation steps was at least two orders of magnitude, maybe even worse.
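The same trade-off is easy to reproduce in a modern language. The sketch below is in Python, not PL/I, and is only an analogy: `bytes.translate` with a table built once plays the role of the fixed 256-character strings, while the second function rebuilds the 256-byte table on every call, much like the code the compiler generated for variables. The normalization itself (fold lowercase to uppercase, hyphen to space) and the names `SOURCE`, `TARGET`, `normalize_fixed` and `normalize_rebuilt` are hypothetical.

```python
# Hypothetical normalization: fold lowercase ASCII to uppercase and turn
# hyphens into spaces so that sort keys compare consistently.
SOURCE = b"abcdefghijklmnopqrstuvwxyz-"
TARGET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ "

# Built once: a full 256-byte table, analogous to the fixed strings.
FIXED_TABLE = bytes.maketrans(SOURCE, TARGET)


def normalize_fixed(key: bytes) -> bytes:
    """Translate using the table that was built once at import time."""
    return key.translate(FIXED_TABLE)


def normalize_rebuilt(key: bytes) -> bytes:
    """Rebuild the 256-byte table on every call, then translate.

    Same result as normalize_fixed, but with the table-building loop
    repeated for each key, which is the overhead described above.
    """
    table = bytearray(range(256))  # identity mapping to start
    for src, dst in zip(SOURCE, TARGET):
        table[src] = dst
    return key.translate(bytes(table))
```

Both functions return the same bytes for any key; timing them over a large batch of keys (for example with the standard `timeit` module) shows how much of the work in the second version is table building and not translation.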

That's just one example. There were many others. In the '70s and early '80s, I'd probably spend as much time optimizing code as writing it in the first place, maybe more--and after the first two programs, my first-pass code was already fairly optimal.

Don't take me back...

With more abstract tools and less need to worry about cycles, I could have (potentially, at least) accomplished a lot more. So could we all. I think it's great that a modern PC (Mac, Unix or Vista) can devote perhaps 90% of its cycles to system overhead--and still have plenty left for actual computation.

Still, sometimes things really do run slower than you'd like--and there are still lots of programmers who understand code efficiency. (I'd bet Google has hundreds of them!) They may be counting cycles at a more abstract level, but they're still coming a little closer to the machine side of the man:machine boundary to get the job done.
