One thing that continually amazes me is the amount of email I get from
readers of this blog asking for career advice. I usually try to just politely
decline; I don’t think I’m particularly qualified to give personal
advice to people that I don’t know personally.
But one thing that I have done before is share a bit about my own
experience, and what I’ve learned about the different career paths that you
can follow as a computer science researcher. About six months after I started
this blog, I wrote a post about working in academia versus working in
industry. I’ve been meaning to update it, because I’ve learned
a bit more in the last few years. When I wrote the first version, I
was a research staff member at IBM’s T. J. Watson research center. Since
then, I left IBM, and I’ve been an engineer at Google for 2 1/2 years.
Having spent a couple of years as a real full-time developer has
been a seriously educational (and humbling) experience. If you’d like
to look at the original to see how my thinking has changed, you can find it
here.
At least as a computer scientist, there are basically three kinds of work
you can do that take advantage of a strong academic background like a PhD. You
can go into academia and do research; you can go into industry and do
research; or you can go into industry and do development. If you do
the last, you’ll likely be doing what’s sometimes called advanced
development (AD), which is building a system where you’ve got a specific
goal, where you need to produce something real – but it’s out on the edge of
what people really know how to do. You’re not really doing research, but
you’re not doing run-of-the-mill programming either: you’re doing full-scale
development of systems that require exploration and experimentation.
I’m going to talk about what the differences are between
academic research, industrial research, and advanced development in
terms of the basic tradeoffs. As I see it, there are really five fundamental
areas where the three career paths differ:
- Freedom: In academia, you’ve got a lot of freedom to do
what you want, to set your agenda. In industrial research, you’ve
still got a lot of freedom, but you’re much more constrained: you
actually need to answer to the company for what you do. And in AD,
you’re even more constrained: you’re expected to produce a particular
product. You generally have a decent amount of freedom to choose
a product to work on, but once you’ve done that, you’re pretty much
tied down.
- Funding: In academia, you frequently need to devote huge amounts
of your time to getting funding for your work. In industrial research,
there’s still a serious amount of work involved in getting and
maintaining your funding, but it’s not the same order of magnitude
as in academia. And in AD, you don’t really need to worry about funding
at all.
- Time and Scale: Academic projects frequently have to be limited
in scale – you’ve got finite resources, but you can plan out
a research agenda years in advance; in industrial
work (whether research or AD), you’ve got access to resources that
an academic can only dream of, but you need to produce results
now – forget about planning what you’ll be doing five years
from now.
- Results: What you produce in the end is very different
depending on which path you’re on. In academic research, you’ve got
three real goals: get money, publish papers, and graduate students.
In industry, you’re expected to produce something of value to
the company – whether that’s a product, patents, reputation, depends
on your circumstances – but you need to convince the company that
you’re worth what they’re paying to have you. And in AD, you’re
creating a product. You can publish papers along the way, and that’s
great, but if you don’t have a valuable product at the end, no number
of papers is going to convince anyone that your project wasn’t a failure.
- Impact: what kind of effect your work will have on
the world/people/computers/software if it’s successful.
Freedom
To many people, this is the fundamental tradeoff between industry and
academia. The short version is that academics have a lot more freedom
than industry folks, but it comes at a serious price.
When you’re a professor, you’ve got a huge amount of freedom. In an
important sense, you don’t really have a boss. You set your agenda, and you
pursue it however you want. You can decide what to work on. You can decide
what your goals are, and you can decide when to change them. You’re in
charge.
In industry, you don’t have nearly so much freedom. You’re constrained by
the needs of your company. Even in the most free-wheeling industrial
environment, you can’t just pick what you want to do; you’re expected to do
things that are at least potentially beneficial to the company. (And
that potential had better actually be a pretty high probability!)
The biggest strike here against industry is politics. (Not that there
aren’t politics in academia…) In an industrial setting, you’re stuck living
with the outcome of political struggles in which you aren’t even a
participant. In my time in industry, I’ve seen very good projects be cancelled
in favor of garbage as the result of random turf wars between executives. Your
work can be outstanding – but because of the outcome of some pointless
political struggle, you could have to completely change directions on
virtually no notice.
Funding
This is the biggest problem with academia: as a professor, you need to
find a way to raise money to provide the resources you need to do your work.
That’s a huge problem: most of the academic folks I know spend at least half
of their time writing grant proposals, grant reports, work summaries,
attending status meetings, and so on – doing all of the things that are
necessary to keep their work and their students funded. (And that means that
they’re not nearly as free as the general statement above about being able to
do what they want would imply: academics can do what they want provided they
can get someone to pay for it; but getting someone to pay for work is very
hard; and getting someone to pay for something very different from what you’ve
done before can be close to impossible.)
In industry, your funding generally comes from product development groups
within your company. As an industrial researcher, you are indirectly working
for the product groups. This tends to mean that you spend much less time going
around and begging for money; it also means that you have a lot fewer choices
about who to send an application to. If the product group for your research
area isn’t interested in what you’re doing, you’re going to have to find a new
project.
In development, you’ve got some of the same problems as academic research –
but in practice, it tends to be a lot less burdensome. Before you can start
a project, you need to get someone to agree to fund the project. But once you
get going, you don’t worry about budgets – it’s someone else’s problem. You
just get to focus on the work. (The “someone else’s problem” part is the key.
Obviously, money is still an issue: someone needs to pay for the development.
But in general, for product development, the money is allocated ahead of time,
so you don’t worry about it; some administrative person is responsible for
keeping it flowing.)
It’s a direct tradeoff with freedom: the more freedom you have, the more
you’re stuck working to get resources; the more constrained you are, the more
secure your funding situation is. Speaking as someone in development, this
tradeoff can cut both ways: sometimes, the fact that you don’t need to worry
about funding is enough freedom in itself to make up for the limitations on
what you can do; at other times, it’s frustrating enough to make you want to
bang your head against a wall.
Time and Scale
This is the most direct tradeoff of any of these.
In academia, you get to spend a lot of time working on something. Every
academic researcher I know has at least the next five years of their
work planned out – and usually considerably more than that. Academics get to
really create an ambitious, long-term agenda, and follow it. In contrast, in
industrial work, you rarely get to plan more than a year or two (if you’re
lucky) in advance.
On the other hand, industrial researchers tend to work on a scale that’s
almost unimaginable to academics. In my field (software engineering),
academics talk about what they call large systems, which are
typically a couple of thousand lines of code. (I can’t tell you how many
papers I’ve reviewed that talk about tools that work on “real-world large
scale systems”, but turn out to max out around 10,000 lines of code.) In
contrast, one of my first projects at IBM involved doing static analysis of
templates in a C++ compiler. The code base that I ran my initial
tests on was 1.5 million lines of code. At Google, I’ve got a
configuration file that specifies a set of source files to be spliced
together, and that configuration file is longer than the entire code base
used by most academic research projects.
My current project is building a system which processes terabytes of data
every day. I don’t even know how many machines it’s currently running on – but
it’s in the thousands. And around here, that’s routine.
If I were to get back into static program analysis, I could easily get
tens of millions of lines of code to test on – and I could use hundreds or
even thousands of machines to speed up the analysis if I wanted to! No
academic gets to do anything on that scale!
On the other hand: I never expected to wind up doing logs analysis. It’s
a huge change from the stuff I’ve done before. It’s still within the
general scope of things that I like to do, but it’s probably not an area that
I would have gravitated towards if I were free to choose anything I wanted.
Results
Results are the primary product of your work, and they’re hugely different depending
on your career path.
In academia, you produce two things: publications and students. And the students
mostly matter because they help you produce publications. Publications are pretty
much the thing that matters in academia, so producing papers is where
you focus your effort.
Industrial research produces three things: papers, patents, and products. The
last two count for a whole lot more than the first. Patents are a big deal in industry.
Partly, that’s because they bring in a lot of money; and partly because
they can save the company a lot of money. The way that patents end
up working in industry is sort of like the mutually assured destruction strategy of nuclear weapons in the real world. You want to have enough patents (bombs) to make sure that you can utterly obliterate your competition (opponents), so that they know that they
can’t obliterate you without killing themselves.
Products in industry research really means prototypes. In general, industry researchers don’t produce full-fledged products. What they do is create a new idea,
and build an implementation that demonstrates the idea. If it proves out, a product
development group will adopt the idea and implement it as a part of a product.
Obviously, if you’re an engineer working on a product, what you produce is
the product.
Impact
That leads to a final tradeoff, which makes a huge difference to me:
impact. What I mean by that is what kind of effect your work has
on other people.
The primary output of most academic work is publications. When academic
work is highly successful, it has an ideological impact – the ideas influence
other people. It’s a sort of indirect impact, but it shouldn’t be
under-appreciated. Most of the really great things to come out of research
were built on ideas that came from academic research. It can take a long time,
and it can be very indirect – but eventually, if it’s really good work, it can
have an impact. (But to be honest, most of the time, it doesn’t. Most academic
work produces papers that no more than a handful of people read, and which
never influence anything.)
Industry also produces ideas and papers, but they’re not the primary form
of impact. Most industrial research work produces two things: patents and
prototypes, which (if they’re successful) wind up influencing the company
and/or its products. Like academia, most of it dies an unmourned death: very
few research prototypes actually wind up making much difference. But when they
do, it tends to be more tangible than what happens in academia. In industry,
when your work gets picked up, it gets picked up right away, and
you’ll probably know the people doing it. Academic research tends to take
longer, and be much more indirect. As an academic, you probably won’t even
know when someone is building on what you did until after they’re
done.
Industrial development is very different: you tend to have
direct, immediate, tangible impact. There’s a directness to it
which is very different from anything else. In general, in the short
term, you get to see an immediate impact from your work, which is
extremely rewarding. But it’s likely to be short-lived: rarely does a
development project end up turning into an influential long-lived
product. But development tends to have a higher
success rate than research, and when it’s successful, it’s wonderful –
you get to see the product of your work helping other people.
In my 11 years at IBM, my research never produced anything that really got
used. Selling something new to an IBM product development group is incredibly
hard – the way the company is put together, it’s really hard to produce a new
product, and even for existing products, the development teams are usually so
overworked that they just don’t have time to look at incorporating some new
research prototype into their system. In my time there, there were two
times that I managed to get a development group to pick up my work; and both
times, the product never got released. (And both times, the reason that it
didn’t get released was completely political.)
In contrast, when I came to Google, my first project was a query language
for component dependency graphs in our build system. Within one day of when I
checked the first version of it into our code repository, I had people using
it. Within a week, I had over a hundred people actively using my in-progress
code. Now, I’d be surprised if there’s a single engineer at Google who’s never
used it.
Of course, the down-side of that is that my code got replaced. After
people used it for a few months, we realized that the syntax really just
wasn’t right for the way it was going to be used. Since it was a query
language, I’d tried to do something SQL-like, so that it would be familiar to
people; it turned out that people wrote much more complicated queries than we
anticipated, and it was really hard to write complicated queries over a
dependency graph using a SQL-like syntax. Since by then, my time was committed
to other things, someone else did a rewrite to change the syntax, and that’s
the version that people use. That’s pretty typical of development in my
experience: I got to do something really cool and exciting and useful,
and I got to put it into the hands of people who needed it.
Now, in my current project, I’ve got a couple of hundred internal
customers. People who actively use the product of my work. People who
are affected by what I do: who have access to the information that they
need to do their jobs, because of the tools that I produced.
My work actually matters to people. The importance of that can’t be overstated. I’m a person whose research
was always focused on how to build tools that help other people program. Now
I’m building tools that my coworkers really use to get their work
done. I never managed to do that in industrial research; and I never would
have been able to have such a direct impact on other people from academia.
So what do I do, and why?
These days, I’m a software engineer doing AD at Google. Doing AD at Google
isn’t something special or unusual the way it was when I was at IBM; the
overwhelming majority of engineers at Google are doing what I call advanced
development. It’s the nature of the company: most of what we do is on the edge
of what current technology can do. We’re working with quantities of data that
are almost incomprehensible, and it’s our job to make them
comprehensible. So almost everything we do winds up being on the edge.
As you can probably guess from the description above, I’m pretty happy
in advanced development. I won’t pretend that it’s without its frustrations.
There are definitely times when I miss the freedom that I had as a researcher.
But on balance, I’m a lot happier being an engineer at Google than I ever
was being a researcher at IBM – and my guess is that I’m happier than I would
be in academia.
Freedom is great, and I don’t mean to downplay it. I would like a bit more
freedom than I have. But my experience has been that when I’ve had more
freedom, I was less happy. That’s not because I like to have my work
dictated to me, or to be tied down to something specific and well-defined. I
know people who like development for those reasons, but they’re not my
reasons. For me, it comes down to the way that the tradeoffs between freedom
and impact work out for the kind of work that I want to do. My area is tools:
I want to build tools that help other people write code. For a person who
wants to build tools, the inability to get those tools into the hands of
people who would benefit from them is incredibly frustrating. The reality of
things is that you’re always stuck with tradeoffs – and in this case, the
tradeoff is pretty clear. A typical academic can’t devote the time to
producing tools that are solid enough to be used for real-world development –
doing that takes a huge amount of time and effort, and that time and effort
won’t generate any more publications or grants than a prototype.
For my area, the difference between academic research and industrial
research and development was demonstrated to me by an experience at a
conference back in (I think) 2001. I had papers at two workshops – one was an
industry-dominated workshop on software configuration management; the other
was an academia-dominated workshop on aspect-oriented software development. I
spent the morning at the AOSD workshop, and the afternoon at the SCM workshop. During
the morning, I heard about 8 different academics describe their tools for
“large-scale software development”, but the largest system that
anyone had used as a test-case for their system was about 1,200 lines
of code. That afternoon, at the SCM workshop, I saw someone present their
results from analyzing a “moderate sized” development project – the software
for the Paris Metro, which included 4 million lines of code developed by 4,000
engineers.
I want to be in the second group – writing the tools that get used by
4,000 people to build a better system. And the way to do that is to be
an engineer at a company like Google. If I’d stayed at IBM, I would definitely have
stayed in research: IBM isn’t (in my opinion) a good place to be an engineer.