Yesterday's bad graphic post spurred me to finally get around to doing the "Why Does Excel Suck So Much?" post I've been meaning to do for a while. I gripe about Excel a lot, as we're more or less forced to use it for data analysis in the intro labs (students who have taken the intro engineering course supposedly are taught how to work with Excel, and it's kind of difficult to buy a computer without it these days, so it eliminates the "I couldn't do anything with the data" excuse for not doing lab reports). This is a constant source of irritation, as the default settings are carefully chosen so as to make it difficult for students to do a good job of data presentation.
Let's say, for example, that we're doing a lab that asks students to measure two things that are proportional to one another-- the magnetic field along the axis of a coil for various currents, for example. The graph I get from students is almost always something like this:
These students have done a terrible job of taking the data, right? No, they did the data collection right, but were tripped up by Excel.
I've constructed my fake data so as to make the effect as bad as possible, but this is the sort of thing that really happens with a lot of graphs. What we have here simulates data that was collected in a perfectly responsible manner, by taking half the points while increasing the current, and the second half filling in the gaps between the first set while decreasing the current. This is a fairly standard experimental technique to guard against things changing over time.
The garbage graph your see results from students using the "Chart Wizard" and choosing the first thing that looks reasonable, a "Line" plot. This is helpfully indicated by a little graphic showing points connected by lines, and it draws students like rotting meat draws flies.
The problem is, a "Line" plot as defined by Excel is a type of plot that is utterly unsuited to scientific data: it plots a series of points in the order in which they occur, spaced equally along the horizontal axis. The horizontal axis labels are just that: labels. This may superficially resemble a graph of magnetic field vs. current, but in fact it's a graph of magnetic field vs. row number, with the points helpfully labelled by the current corresponding to that row number.
What you want if you're attempting to do science with Excel is a "Scatter" plot, a little farther down the list. If you choose "Scatter," the resulting graph looks like this:
This is at least the right kind of plot, but it's awful in almost every other respect. There's the godawful grey background, the inexplicable pastel color scheme, and the axes running right through the middle of the plot, rendering them completely illegible.
There is no way that I know of to change any of these defaults, and fixing them takes an inordinate amount of work. You have to right-click to get "Format Chart Area" to remove the grey, and that's the easiest of the lot. Changing the axes requires you to format each individual axis, change the maximum and minimum values (or at least tell it not to auto-scale the axes, in case anything else changes), and set the crossing point of the other axis. Changing the data points to colors that don't resemble one of those eye tests they give you in grade school requires you to change both the "foreground" and "background" colors of each series, and "foreground" and "background" are not defined the way you might think.
After a good deal of pointing and clicking, you can end up with something halfway presentable:
Of course, when you realize that you're going to need to do all this every time you make a graph, the several hundred dollar price tag for a real scientific graphing package doesn't seem so bad.
Of course, it's rare to have a lab in which you just plot points with no further analysis. Usually, we really care about the slope of the line, or something like that, so let's look at how Excel does there. It's actually quite simple to superficially do the right thing, by right-clicking a point and adding a "Trendline." You can even get it to display some information (I've trimmed the graph down to a single series, for clarity):
Of course, what's shown on that graph is the sum total of the information you can get about the trendline: an equation, and an R2 value. Which is fine, if you're doing biology or something, but in physics, we care about the numerical values of things, and the uncertainties in those numerical values.
Having the slope of the line is nice, and all, but if you're going to do anything with it, you need the uncertainty in the slope. Maybe there's some way to convert that R2 value into an uncertainty, but I don't know what it is, and you can be sure that our students have absolutely no idea. Which sort of blows the whole idea of uncertainty analysis out of the water, unless you want to make them do least-squares regressions by hand.
There is another tool that can be brought to bear here-- if you have the right version of Excel, and have installed it from the disks, you can get the "Data Analysis ToolPak," which will do a linear regression relatively painlessly, and give you the uncertainty in the slope and intercept. This brings uncertainty propagation back into play, pedagogically, though it forces some contortions in order to get the data into the right form to show a linear relationship. That's not entirely a bad thing, but it does make for some additional headaches.
(And, of course, the Data Analysis ToolPak was entirely too useful, and thus has been removed from the new version of Excel. Probably because it didn't fit nicely in their godawful "ribbon" concept for the interface.)
Now, you might be saying "Well, of course Excel isn't appropriate for scientific data analysis. It's not really for scientists, though." Which is true, but here's the thing: the things I've complained about here aren't good for anything. The color schemes and axis settings lead to illegible plots no matter what sort of data you're working with. And I'm completely at a loss as to the purpose of the "Line" plot, or making it difficult to find uncertainties in fitted quantitites.
Another thing you might be saying is "But all of these pictures are from the old Excel-- the newer version is much more swizzy!" I used the older version for my demo pictures above, because that's what's on the computers in the teaching labs. But just for fun, here's what you get from the new version:
I'll admit, there are a few improvements. The dreadful grey background is gone, and the colors, while still weirdly muted, at least have a reasonable contrast with the background. And at least one of the axes is moved out of the middle of the goddamn plot, making it halfway readable. The data points have become cartoonishly large, but that's a relatively minor problem.
If you're perceptive, though, you'll notice something missing. Something... kind of important. That's right, it defaults to no axis labels at all. Fan-tastic.
Happily, the lack of the analysis tools probably renders this all a moot point, starting next year when ITS upgrades the computers in the teaching labs. Unless somebody in Redmond picks up a clue and provides a simple tool to do fits with uncertainties, I'm scrapping the whole stupid program, and forcing the engineers to learn SigmaPlot. Which has its own stupid defaults, but at least generates something approximately like proper data presentation on the first attempt.
- Log in to post comments
The thing that astounds me about this, is these defaults didn't happen by accident. *Somebody at Microsoft thinks that they are a good idea*! And is in a position to force their likes on everyone else!
We need to find this person. And have a discussion with them.
Possibly involving baseball bats.
At the *very least*, they need a clear and obvious way to reset the damnable defaults to something that looks less half-assed.
We need to find this person. And have a discussion with them.
Possibly involving baseball bats.
We also need access to a time machine, so that we can have this discussion back in 1996 or so (as Chad points out, these defaults have propagated through multiple versions of Excel). At least we would know to look for him in Redmond that year. Today, he might be anywhere, as there is no guarantee that he still works for Microsoft.
Is OpenOffice Calc any better?
And, when you add to these complaints the fact that for years Excel would report an R-sq value for regression through the origin (which makes no sense), and had difficulties with multiple regression results, means that it really is not well suited for much statistical work past the simplest problems.
The decision to make Excel a tool that could be used on a huge number of problems seems
to have had a serious impact on its success rate in several areas.
I don't know if the problems have been addressed in the newest version because I don't use it.
Hey... Microcrap Excel can be used to great effect by a 5-year old Chinese child with pigtails. I saw it on TV so it must be true. If you have a 172 IQ and a donated Intel 'Hard-Core' V8 Eight-Core Platform, so can you. Your problem is that you are not importing the results into an 8 GB PowerPoint presentation.
(Admittedly she was trying to send grandma pictures of her father doing the maid. You know how tricky, er, sophisticated those icon shortcuts can be.)
Matlab!
"Matlab!"
Or Octave, or Sage, or .... - just so it could be affordable for students.
The graphing functionality in EXCEL is at best mediocre. The legume enumerators, who never draw anything but pie charts, tell me that it has many wonderful features that I never use. I would however state that the ability to write Quick Basic code inside EXCEL is its single salvation. I find that 0.9 of the times when I need to write code to crunch something, I can get by in EXCEL and avoid fighting the problems of stand alone code - like a useless GUI that has replaced command line, at least on most WINDOWS compilers/languages.
Back when I was trying to do technical stuff on WINDOWS - a mandate of the information security apparat - I had a regular annual subscription to SIGMAPLOT. Now on Ubuntu I find three or four near equivalents at no cost other than a bit of my time and bandwidth.
Of course, when you realize that you're going to need to do all this every time you make a graph, the several hundred dollar price tag for a real scientific graphing package doesn't seem so bad.
I suspect a macro might facilitate applying that bunch of changes more efficiently.
Still, the lack of support for error bars does suck giant hairy moose nads.
"Still, the lack of support for error bars does suck giant hairy moose nads."
Never, I repeat, never, did I ever expect to see the words "error bars" and "suck giant hairy moose nads" in the same paragraph: certainly not the same sentence.
It is a good thing I had swallowed my coffee BEFORE I read abb3w's post.
You certainly can change the default chart type using all of the colors and formatting that you prefer, even in Excel 2000.
Here's how:
1) Make a beautiful chart the way you like it.
2) Right click on it, then click the "Custom Types" tab.
3) Click the "User-defined" radio button.
4) Click the "Add.." button. Give your chart type a name and description. Click "OK."
5) With your new chart-type name highlighted under the "Chart type:" column, click on the "Set as default chart" button. (This step is optional, of course.)
And if you really want to show mercy to your students, you could make a perfect chart, save it, and give it to them. Then give them these instructions so that they can open it and save all of the formatting you have done as their default chart type.
We use LoggerPro almost exclusively in our labs now. The software is fairly inexpensive, is pretty easy for the Excel people to figure out, lets us set the default graph appearance, and gives uncertainties on the fits. Overall, it is a much better way to go than Excel.
Also, warn your students about the fact that EXCEL's precedence rules are not the one learned in math classes, e.g., in EXCEL, -13^2 equals 169 not -169.
Man, I'm only 35, but back when I was last doing lab reports, we used graph paper, a pencil and a ruler... Kids nowadays!
There is no way that I know of to change any of these defaults, and fixing them takes an inordinate amount of work.
In old-style Excel, it is actually pretty easy.
A good explanation is here.
In the new Excel you can save your chart as a template and use that to generate any future charts. This includes changing the color palate, the size of the data points, the appearance of the axis labels and even where the x axis crosses the y axis.
Gnuplot continues to be free. Also free: matplotlib, which makes gorgeous plots if one is willing to mess around with python code a little.
The problem is, Excel is not designed for science. It's designed for accounting and business. In fact, I keep having to remind one of my technicians not to graph line data that way, but rather to do scatterplots and set Excel to connect the dots. It drives me crazy when I'm putting together a talk and trying to make one of this technician's graphs into something presentable.
Excel's graphing has always sucked. Moreover, it's counterintuitive how to turn rows and columns into a graph. On the Mac, Cricket Graph was far superior. Unfortunately, i don't think it exists anymore.
Orac:
Excel's graphing is actually pretty good, better than much of what is available out there. If you spend some time learning the tool it can do most of what you want.
Moreover, it's counterintuitive how to turn rows and columns into a graph.
"counterintuitive" is a fairly meaningless statement. It only applies to how you approach the product, not how everyone approaches it and the latter is how most people use the term. What is "counterintuitive" for one person is "intuitive" for another.
One more item - as a reference this site
http://www.daheiser.info/excel/frontpage.html
discusses many of the ins-and-outs of Excel's features. The organization of the site could be better, but there is some good information to be found.
Vastly. The default "line chart" options always present you with the choice, up front, of using one series as the independent axis. What's more, linear regression is trivially easy; the results are presented in a matrix that itself is easily extracted on a per-series basis.
I use it quite a bit for reduction of Monte Carlo results to separate trends (voltage, temperature) from uncontrollably random (process) influences to get statistical predictions. It's not quite as good as a full-up Octave script but is good enough, and frankly I just haven't taken the time to do it in Octave.
Which reminds: yes, Octave is sweet. Then again, the student license for Matlab is so cheap that a physics student (esp. one who will be continuing in the physical sciences) really should just buy one.
I've generated some very nice publication ready graphs with Excel. The problem is most students don't know how to work past the default settings and end up with the crap you first posted.
Prizm generates a nice graph as an Excel alternative though.
"the student license for Matlab is so cheap that a physics student (esp. one who will be continuing in the physical sciences) really should just buy one."
I wasn't aware of that - sweet.
Our school has a similar license with Minitab: it's on all the school machines, and students can burn a copy for their own use from any of them. It expires in the summer, but they can get a new copy when the school renews.
well, since no one else has mentioned it, i must take strong exception to part of this post:
"Of course, what's shown on that graph is the sum total of the information you can get about the trendline: an equation, and an R2 value. Which is fine, if you're doing biology or something, but in physics...."
as a biologist, i have made nearly identical complaints concerning excel as all those listed here, and explored the same solutions, for the same reasons. i teach bio at an undergrad institution, and have to struggle thru making excel do real science properly for the same reasons. so physicists, keep the sniping under control; we're in this together.
I agree that Excelâs graphs are horrible, but I use it a lot for basic data entry and manipulation, which it does quite well. So I end up using three different programs for handling numbers: One for data entry, one for statistics, and one for graphics. (I blogged about this here: http://neurodojo.blogspot.com/2008/08/tools-of-trade.html).
Which is fine, if you're doing biology or something
What RBurke said, which is considerably more polite than my initial response.
As for how doth Excel suck, as others have pointed out it's reasonably powerful, just user-unfriendly -- you clearly haven't invested much time into learning it or you'd have known about default graph styles and so on.
Plus, if you want to see some serious user-unfriendliness, try R (http://cran.fhcrc.org/). It's got real scientifical physicist-level cred, even.
You might enjoy this thread on Metafilter about why excel + science = suck (and why people still use it so much)
As for suggestions, two or three lines of R will give you beautiful results. Once you get a reasonable template, you can even just give them the two or three lines of code and have them substitute in their values for their own - easy enough.
Oh and best of all, it doesn't come with a "several hundred dollar price tag".
I gripe about Excel a lot, as we're more or less forced to use it for data analysis in the intro labs (students who have taken the intro engineering course supposedly are taught how to work with Excel, and it's kind of difficult to buy a computer without it these days, so it eliminates the "I couldn't do anything with the data" excuse for not doing lab reports)
This bugs me. Why are students being taught to use inadequate tools in the first place? I understand that most people have some version of Windows on their computer and that usually implies that they have some version of the Microsoft "productivity" tools. But the problem is that folks get taught to use these things in school (either high school or college), but that's all their taught to use and for most folks, that's all they are aware of.
This has real world consequences: people graduate and get jobs in offices and start sending around Excel spreadsheets (and 99 times out of 100 the spreadsheet contains things that shouldn't belong in a spreadsheet (for instance, step by step procedures)).
Another example: when I was in college I would see students write up math proofs in Microsoft Word (now to be fair, our math department didn't teach using any computer software for writing proofs). I would go absolutely insane trying to write even a simple proof in Word. Sure, I'm sure it's easy in the sense that you can possibly find everything you need in a GUI menu somewhere, but I've found that the extra effort I invested in learning LaTeX on my own has paid off in spades over the long run.
Full Disclosure: I think Microsoft is generally awful and try my best to avoid coming into contact with any of their products.
I let my freshmen use Excel, because it's a big class for non-majors and the campus computer labs have Excel. I think they also have Mathematica, but a lot of my students already know Excel, very few know Mathematica, and there's no real time to teach it. Besides, if I'm going to graph data, I'd rather use something like Igor anyway.
For any other class, I insist that they use something, ANYTHING other than Excel.
I think that Physics students would be in a minority of Excel users. I would say that the majority would be business users and besides the ugliness of the default settings, the default settings are probably fine.
I second the recommendation of Prism. It also has Kaplan-Meier survival statistics. For a biological scientist who also does clinical trials, it's the poor man's alternative to SPSS.
A Random Excel complaint that's always irritated me: Why is there no subscript/superscript toolbar button? It's there for Word. It's there for Powerpoint. Why not Excel? Surely you would like to quickly format column titles such as sigma-squared. Why must you by default highlight the letters and dive into the format menus?
Yes, this can be solved with a macro, but it's always bugged me that within the universal MSOffice Suite they could put the code in PowerPoint, Word, and Publisher, but not in Excel.
For comparison's sake, SigmaPlot does have subscript/superscript buttons that allow you to format text on a graph on the fly, but the buttons are inactive on a Data sheet, where again you might want to format column titles. At least Excel is consistent in its user unfriendliness...
The first thing that students in the physical sciences should learn about Excel is to never, ever, choose a line plot. The second is how to change default graph values, and the third should be to never, ever use trendline other than as a rough fist pass at data. The LINEST function works well to provide a great deal of statistical information from a linear regression.
Supermongo!
Excel 2007 does have an Analysis ToolPak which can be added in Excel options under add-ins. Select it and hit Go. It looks quite a bit different so there will be some relearning, but it should have most of the same stuff. This along with jake's advice on how to set up defaults just might keep you in Excel for a while. Your welcome...?
Dave@13 - I hate that. And why does Excel not allow you to use something other than parentheses? I know it color codes them but it would still be easier to see if I could put alternate with brackets. That said, I use Excel a lot at work as well as Minitab.
As for how doth Excel suck, as others have pointed out it's reasonably powerful, just user-unfriendly -- you clearly haven't invested much time into learning it or you'd have known about default graph styles and so on.
And pray tell me, how am I supposed to RTFM when Microsoft (who are hardly the only or even worst offenders in this regard) does not provide TFM? Yes, the capability is there, buried in some obscure submenu. How does Joe Average User know to look for it there, as opposed to the hundreds of other submenus in Excel? Oh, and don't tell me to go looking under "Help". IME Microsoft's "help" is not very helpful (again, they are hardly the only or even worst offenders) if you don't already have a pretty good idea of what Microsoft's programming team decided to call that feature.
If Microsoft had gone to the effort to make the program user friendly, or even had a microgram of aesthetic sense, maybe I would think it worth the effort. But if you think I will go to extraordinary lengths to figure out something I don't use very much anyway, let me introduce you to our administrative assistant, Shirley U. Jest.
I'm with Orac; Excel's graphing has sucked for decades, at least since 1991 when I started using it. Perhaps business students understand their terminology and maybe that's Microsoft's target market, but there's no reason the graphing part of Excel should have continued to suck so bad as it has become used as a general numerical analysis tool. They spent how much effort on the completely baffling Office 2007 ribbon and gratuitously wrecking (again) their user interface experience, but they still put zero effort into making sensible scientific graphics, the kind that Gnuplot was making when I was a wee undergraduate.
But maybe the problem is that while the spreadsheet interface (or an active visual grid) is a pretty nice tool for entering and visualizing textual numerics (i.e. columns of numbers), the underlying software beyond the interface is still aimed at bookkeepers. Nobody would complain that (say) Quicken doesn't properly graph electromagnetic data because it isn't sold on that basis. Yet Excel has plenty of statistical functions which implies Microsoft knows damn well it is being used for data analysis beyond the financial sector. While Excel is an easy choice, it's not the tool for the job; Microsoft has had years to fix or extend the scientific plotting capabilities of Excel and haven't. They mangled the UI of their entire Office suite so I don't buy any argument that they don't want to confuse their users by changing the way Excel makes graphs. If it's not a priority for them to fix (and they have to know how awful it is) then we need to find a different tool.
I've been pushing R Project (http://www.r-project.org/) and Octave (http://www.gnu.org/software/octave/) for mathematical work; I'm still looking for a Mathcad replacement (I'm still evaluating Maxima and Scilab.) Both R and Octave are free and cross-platform (Win/Mac/Linux) so there's no excuse for not using them. And while I don't wish (La)TeX on the casual user, if you spend a lot of time writing equations or proofs, you need something more precise (if not more shiny) than Word's equation editor.
LyX does a reasonable job of putting a humane face on TeX (recall that TeX and Metafont were developed specifically for typesetting mathematics.) The biggest problem with TeX is that again, we have been trained that word processors are useful. And they may well be for casual use. Either formatting is not very important (or to be done by someone else) in which case a good text editor is the proper tool, or formatting is extremely important in which case you need a typesetting program such as TeX. The word processor is the bastard stepchild of the text editor and the typesetter, doing neither job very well but covering the easy 80% of business writing - three page memos from HR.
TeX and its variants are difficult to learn, or at least take concerted effort, and they take a rethinking of the nature of the writing you do to really understand why they work they way they do. The payoff is precision of communication - there are some ideas that cannot be expressed well through Word, Excel, or Powerpoint.
Science is much more than just taking data, it's also about showing relationships and communicating ideas. The popular office tools are simply not adequate for the specialized communications of the practicing scientist or engineer. Sorry to get all Marshal McLuhan here and don't take this as a he-man anti-Microsoft rant. As ubiquitous as it is, Microsoft Office is not the right tool for the job; better tools exist, they require effort to find and learn but they produce far better results.
Use R
Its Free
Runs on windows, Mac and Linux
http://cran.r-project.org/
Can't answer for Microsoft. OpenOffice.org has subscript/superscript available for the text formatting toolbar, although it's not shown by default. Trivial to add in the config menu.
What Bob #37 and Kevin #38 said - use R. It's free, it's powerful, it can replace SPSS, and it makes beautiful graphs. It's cross-platform, too, and all the algorithms have been vetted by statisticians.
Just LOOK at these beauties! sure, you can do it in EXCEL if you try hard enough, but the point is that it shouldn't be hard - it should be straightforward.
OK, so there's a learning curve, but anybody who studies physics should be able to get to grips with it. I studied computer science at the University of Bonn and took my introductory maths courses with mathematicians and physicists. Almost all of them used LaTeX / gnuplot or similar free tools. So it is absolutely mindboggling to me to see physics students use a Microsoft product out of their own free will.
(Also, I would recommend LaTeX for all documents that are longer than 4 pages.)
I second Igor. It has a silly name, but it has become a very good plotting program with decent stats and curve fitting. It was originally for the Mac (it had an icon of a box that opened like a jack-in-the-box when you double clicked). Now it's available for Windows as well.
On the other hand, I use Excel's admittedly difficult plotting to do color-coded contour plots. Of course I haven't really done one since I figured out the original one. I just load in the new data right over the old set.
I want my copy of Origin! We've been soliciting the department to see if anyone else wants a couple of licenses, since if we buy 10 or more the cost per license drops substantially. Haven't heard anything back yet.
Sometimes I've dealt with this problem by recording a macro of the changes I make (on standard graph formats that I'll use a lot) and then assigning that to a key. I can pump out lots of identical-looking graphs that way, which has been useful at work.
Wow, I came it quite late to this discussion. Clearly excel has pushed people's buttons. Let me just add that for students (in particular non-physics majors) - I encourage them to use LoggerPro. The site licenses is essentially free (like $200). The license allows students to install it on their home computer. I think this is the best balance between cheap-easy-good.
For me, I like matplotlib or stuff like that (although, if I need something quick I also use LoggerPro).
I am very fond of R as a statistics system, and the output of its graphic libraries is damn nice if you put in the hours of effort to figure out how to make them do anything at all, but I would never, ever inflict the input language on undergraduates. It's just barely this side of the "will not use this ever" line myself.
(Don't even mention MATLAB. When I am in charge of hell there will be a special section for its designers. There is no way they could have gotten it that wrong by accident. No, they knew all of the normal affordances of programming languages and they deliberately violated every single one of them.)
One word (four syllables): Gnumeric.
One letter: R.
I usually use SigmaPlot, but I don't suppose undergraduates can afford such overpriced software. There is more affordable software that can do a good job, however.
But the basic reason Excel sucks is because it was designed for shoe salesmen, by shoe salesmen.
There is a nice little freebie similar to Sigmaplot called KyPlot available at http://www.pricelesswarehome.org/WoundedMoon/win32/kyplot.html.
I think it was originally developed as a commercial project then became orphanware. Like a lot of orphanware, it's not well documented by I found that my experience with Sigmaplot helped a lot. On the plus side, it makes good scientific graphs -- far better than Excel. I use to visualize data, also.
You can also get a free Matlab clone called Scilab that does a lot of the same graphing that Matlab does. The statistical R language has pretty good graphing capability, but the learning curve is steep.
One letter: R.
One word: Touche.
And gnu-plot for additional graphing options. Endter data with cat, recode with tr and sed. We don't need no stinking Excel.
I tend to find Microsoft programs as a whole to be very frustrating. The programs are not nearly as intuitive as the programmers think they are, and the help facility seems to be going from poor to almost completely useless. Recently I spent an inordinate amount of time trying to turn the text in a text box through 90 degrees. In the end a colleague discovered it in a manual put out by a different company.
I've also tried to combine a line graph with a bar chart (to demonstrate the normal approximation to the binomial distribution) without success. Today I was trying to figure out how to do matrix multiplication in Excel and got the message 'You cannot change part of a matrix'. No matter what I did, I kept getting this message (and could not find any help on the problem) which in the end was 'fixed' by a [CTRL][ALT][DEL].
In my case, I am teaching in an adult ed. centre and do not have permission to add or update any software so I'm afraid I'm stuck with making do with MS.
Or have a canned perl or gawk script to translate. IMHO they're a bit cleaner than trying to do it all in tr and sed -- at least for anything nontrivial.
de gustibus and all that. "There is no One True Way," or, as the poet taught us: "There are nine and sixty ways of composing tribal lays, and every single one of them is right!"
Back when I took Junior Lab, they gave us some Matlab scripts for fitting curves and such and told us to figure the rest out. You learned quickly. (Or you gave up and transferred to management school.) Later, when I had to do much the same kind of work in a place which happened not to have a site license, I downloaded R and got up to speed in about a day. I don't think the getting-up-to-speed part was substantially harder in either language, but hey, I'm a sample of one. If we'd been given a from-the-can R script in the first place, I probably could have skipped all that wasted time in between trying to deal with a software package which managed to be the suboptimal solution for every problem I attacked.
Nowadays, I use SciPy and Matplotlib for most of the data-lifting. It's an easy combination to start using, if you already know a bit of Python.
I noticed you can't plot a *function* in excel. Several times I've searched the web trying to find information on how to do this.
I think there is simply no support for it in excel.
Matlab (again!). You'll be using it again and again. I used it in too many subjects to mention, so spending the money on a legal (student) version got distributed over about a decade.
Just for what it's worth, people who do accounting hate Excel, too. It does all financial calculations assuming that an amount of money is a real number. Money is an integer, so every calculation causes small errors that can cascade into hilarious results.
Don't let this crisis go to waste!
The amount of ignorance out there with respect to graphing and charting is appalling (I was somewhat guilty myself before I discovered Tufte). Make this into a "teachable moment" - talk to your students about what makes for a good graph and why, explain "chartjunk," the uselessness of pie charts, and why not to use those godawful 3-D charts.
Do your students a real service: Select and support a free open source program that does what you need. Finding a program that is "relatively inexpensive" is not a good solution; students are inundated with tuition, fees, textbook costs, etc. - what is just a few extra bucks to you might be a huge cost to a student (especially when it's an unnecessary cost). Do YOUR homework and select good free software for the task - and then SUPPORT that software with tutorials, templates, examples, bug fixes, etc.
uqbar (and others here) are spot on in rejecting Excel in favour of something else which is (1) more powerful, (2) easier to use, and (3) free. And, as he (she?) says, you should be leading the charge. To become facile with tools such as R (and, yes, Latex) should be part of any serious engineering curriculum.
Whenever I have to teach students anything quantitative. the first thing to be thrown out is Excel. Followed by the rest of MS Office for good measure.
There are any number of graphing alternatives on all platforms; all are way better than Excel for scientific purposes (and for most conceivable other uses).
I find it simply unconscionable to teach students Excel instead of something truly useful, like R or Octave or Mathematica. It's an expensive waste of time and resources.
When I was a student in the mid-'80s, I was part of a team at our university's computing centre that evaluated installing APL-2 on our then IBM mainframe. The alternative was to issue TRS100 laptops with Lotus 1-2-3 to the students. We opted for APL. None of those lucky enough to learn and use APL-2, while it was available, would ever be at a loss when it came to stats/maths computing - they learnt a way of thinking, of handling conceptual tools, not recipes for filling cells and selecting templates. And this for students with often a very scant scientific or computing background: it made them free, it kept them curious and creative. When Excel was introduced, performances dropped accordingly.
The classic essay by Edward Tufte at www.edwardtufte.com on Microsoft PowerPoint is still pertinent to the phimotic aesthetics and lobotomized cognitive concepts of all MS Office products to date.
Whenever I have to teach students anything quantitative. the first thing to be thrown out is Excel. Followed by the rest of MS Office for good measure.
There are any number of graphing alternatives on all platforms; all are way better than Excel for scientific purposes (and for most conceivable other uses).
I find it simply unconscionable to teach students Excel instead of something truly useful, like R or Octave or Mathematica. It's an expensive waste of time and resources.
When I was a student in the mid-'80s, I was part of a team at our university's computing centre that evaluated installing APL-2 on our then IBM mainframe. The alternative was to issue TRS100 laptops with Lotus 1-2-3 to the students. We opted for APL. None of those lucky enough to learn and use APL-2, while it was available, would ever be at a loss when it came to stats/maths computing - they learnt a way of thinking, of handling conceptual tools, not recipes for filling cells and selecting templates. And this for students with often a very scant scientific or computing background: it made them free, it kept them curious and creative. When Excel was introduced, performances dropped accordingly.
The classic essay by Edward Tufte at www.edwardtufte.com on Microsoft PowerPoint is still pertinent to the phimotic aesthetics and lobotomized cognitive concepts of all MS Office products to date.
Weirdly, Microsoft had a pretty brilliant idea built into its incarnation which is that it was a software company when most technology companies of the day were hardware companies. But since then the organization seems to have misunderstood the purpose of all software which is the manipulation and storage of information. The fact that there's a save option in most Microsoft applications is the most glaring example of this total failure to understand their purpose. Save!? It's a paradigm of software design that we all got really used to but also one that is fundamentally wrong. When should your data ever not be saved? Why is it up to you, as the user, to decide when data should be saved? I know Microsoft has some autosave features but they're implemented badly and often confusing to users. It should be much simpler. Simpler as in you never have to think about it. Even their word processor is horribly trapped in ancient paradigms like paper and fonts as opposed to structures and outlines (like the lovely Scrivener for Mac--see http://tr.im/hFZ1). Unlike Apple that has a mission statement as clear as day--make beautiful things--Microsoft long ago lost their sense of purpose. They're really just grasping at straws (like cloud computing) because they long ago forgot the simple purpose of what software is supposed to accomplish.
Another issue I've seen is that Excel doesn't layer stacked line series correctly...
http://www.consultantninja.com/2007/08/excel-stacked-bar-vs-stacked-lin…
The fact that there's a save option in most Microsoft applications is the most glaring example of this total failure to understand their purpose. Save!? It's a paradigm of software design that we all got really used to but also one that is fundamentally wrong. When should your data ever not be saved?
First, this is not something you can blame on Microsoft. Unix, which dates from the 1970s, has the same paradigm in its interactive text editors.
Second, you may decide that on second thought, the previous version made more sense, and so you want to be able to revert to that version. The emacs text editor on Unix does this automatically. Most other programs overwrite the old version when you save the revised version, so it can be hard to go back--but you can choose not to save the changes, which at least allows you to revert to the previous version. It's not as elegant as what emacs does, but it works until you have saved the modified file.
@#54 re: Plot Functions.
Yes, you can. See
http://groups.google.com/group/microsoft.public.excel.charting/browse_t…
and go get ChtFrmla.zip
...mrt
It's probably inadequate for sophisticated users, but Vernier Software's Graphical Analysis is cheap and effective for simple plotting and data fits. I used it for years in my HS physics courses. The interface is clean and understandable, and it will import and export to Excel using CSV format. If you need even moderately sophisticated stats analysis -- anything more than regressions and curve fits -- then skip GA. For what Chad is talking about here, it's a no-brainer.
Meanwhile, I second LoggerPro. My chem colleague used it and loved it. It's more expensive than GA, which is about $15 - $20 the last I checked.
Lyza (www.lyzasoft.com) is free for students and profs, is better behaved for your charting/regression issues, is scalable on the desktop to hundreds of millions of rows, provides easy peer-to-peer sharing, and handles just about any data transformation/integration task. Give it a try.
Hey man,
To get the coefficient of indetermination the formula is 1-R^2. And that is the amount of variance in your dependent variable that is unaccounted for by variation in your independent variable.
As a computer-savvy guy with physics and biology training who has desperately avoided using Excel as much as possible for years, I must say I am very pleased to see that others here have already vented spleen about most of the problems with Microsoft's decrepit old dog. I will add that people who claim Excel is good for data entry clearly have not tried it for even moderately large data sets or they have failed to notice the inconsistencies that begin to creep in as the software starts to "auto-format" random entries as dates and copy-paste errors wreak havoc with column and row alignment.
Several people have commented that Excel was designed for "bookkeeping," "finance" or "business users". Perhaps if the people on Wall Street had better tools for analyzing and understanding data we would not be in quite such a big financial mess as we currently find ourselves.
2andahater:To get the coefficient of indetermination the formula is 1-R^2. And that is the amount of variance in your dependent variable that is unaccounted for by variation in your independent variable.
A physicist nearly always wants the uncertainty on the SLOPE of the fitted line. That has nothing obvious to do with R^2 or 1- R^2.
R^2 in physics is nearly always 0.99. That's why we don't use it.
"Perhaps if the people on Wall Street had better tools for analyzing and understanding data we would not be in quite such a big financial mess as we currently find ourselves."
Nah, if it was only about Excel... but too many people rely on clunky spreadsheets when doing business. This said, Excel is a great tool, but not for everything. Scientific plots are not one of those things Excel excels at. If you really want to do charts in Excel, give these guys a try: think-cell.com
There's a great tool, which is unfortunately Mac-only, called OmniGraphSketcher. It's designed as a simple graph production app that takes ranges of data pasted into it and produces, by default, clean simple line graphs.
http://www.omnigroup.com/applications/omnigraphsketcher/
ARGHHHH!! I used to work in a research institute and got to use ROOT, which is fantastic as it's based on C++ so it lets you do anything you want when doing analysis. Admittedly it takes time to learn but once you have. Now I work in a 'corporate' research environment and have to use EXCEL, help. It's just pants. It's very easy to make a quick plot of your data. However, it's a plot that gives you know information of use, I'm not even sure Microsoft can spell uncertainty!
The crux of your students' issues is that they're using an accounting program for scientific analysis. And by the way, real accountants don't even use Excel for any real work.
It's like trying to drive a nail with a hammer and then writing an article "Why this free screwdriver sucks for driving nails."
Correction to my post above:
It's like trying to drive a nail with a screwdriver and then writing an article "Why this free screwdriver sucks for driving nails."
This is a very necessary post. Not just so we can all collectively vent, but also to warn others and to assure everyone who thought this was fixable (or their fault) that it isn't. "Don't bother" is a helpful instruction. And it was done with humour and solid examples, too. Gotta love it. Jeff (#73 above) misses the point that numerous students use Excel because it's there and they make the very reasonable assumption that charts will look like charts and that data will be displayed in ways that maximize understanding. It didn't have to be this way. It's not a question of doing "scientific analysis." It's a question of simply asking a very sophisticated program used in wide variety of settings to get the simplest thing right. This screwdriver isn't being used as a hammer! It's being used as a screwdriver and it's failing. And it wasn't free.
Thank you!! It's so good to hear people speak out about Excel.
Excel is fine if you're an office receptionist or something and basically need an electronic version of lined notebook paper, and it's great for enabling the technologically challenged to summarize and maintain simple data electronically. But it's not a tool for anyone with serious data to analyze.
A year on, there's still no good solution.
I'm retired & try to use Antiques like SigmaPlot 2001, Mathcad 7 and StatView 5 (which is free because it's no longer ditributed). Graphs are always a pain because some function is missing in the package, or the thing crashes under M$ V*sta (the biggest scanadal is the absence of retro-compatibility with their OS; you don't pay 500 euros for a package that will be dead in 3 years).
I'll try OOo again, but still havn't found how to write code (the object oriented aspects of VBA are too intrusive for an occasional user).
One aspect to this problem is that outside of the scientific community, nobody really uses Excel as a Spreadsheet program. In most offices it's used as a lame database, with people simply populating the cells with words and notes. I've seen this countless times, and when you ask why they're not using a proper database program, you simply get a blank look.
I teach ICT to beginners, and trying to show my learners how to produce charts in Excel, especially Excel 2007 is just a nightmare. So all I can do is agree that Excel sucks. Sucks like a grandmother on an egg. Sucks like a nuclear-powered Dyson. Sucks like a hole in the International Space Station. Sucks like..... well, you get the idea!
why excel sucks is because its abigger harder to use pain in the ass calculator
I don't particularly like MS as a company but Excel is a sweet tool. The problems described in the article are just typical problems ecountered by anyone using any new software. You don't need to format a chart every time you make it. One needs to be really naive to believe that. As a math or phisics student or teacher you only use maybe 2-3 types of charts with few colors. You just keep a copy having few templates and use them all the time (it is called copy-and-paste). About the R and the like it takes minutes to create your own to your preferences. No big deal.
Matlab, mathematica, octave etc are more or less black boxes which to a scientist brain are like sugar water (coke) to a obese diabetic. Excel is a very plain tool. It allows you to get from point A to point B your way. But of course you need to use your brain in the process. Nowadays using your brain has become a no-no not only in industry (there it has been a standard for decades) but even in schools where you just have to regurgitate your guru's mantras. Visit my site and see what one can do in excel. By the way excel is the most used statistical software in the world and in many engineering companies it is still the most used software by engineers (mechanical, structural, aerospace, etc). These people have all the packages available but smart people hate black boxes.
excelunusual dot com
Give me a break.
Excel has major weaknesses and annoyances, but it's incredibly useful even for extremely detailed scientific data analysis. It is very easy to use for the purpose of getting from complex raw data to a format that can be used by SAS, etc., unless you live in an imaginary fairy world where your datalogger spits out perfectly formed SAS-able data, in which case you're doing something so simple that you really should be able to get by with an abacus.
All the defaults complained about above can in fact be changed. The oh-so-onerous formatting changes in graphs can in fact be completed in seconds. And although I agree publication quality graphs are better made in SigmaPlot or whatever, for a lot of graphs -- including anything an undergrad is going to put together -- Excel can do just fine. Furthermore, all of those other graphing programs are more unstable than Excel. Excel does freeze and crash, but nowhere near as often as SigmaPlot, Origin, or the good old Axum back in the day. And they all have their own, bizarre, totally unique and proprietary navigational and editing schemas, whereas the schema in Excel is as close to a universal one as there is.
VBA is slow and limited, but it enables a degree of modeling and automation that provides otherwise clueless and frightened students an easy first step into that world.
So the idea that it's evil to teach students Excel is just nuts. It's insane to teach most students anything else for the truly trivial data analysis and graphing tasks they need to do. And if you push the boundaries just a bit, those students will go out into the world with Excel skills that are impressively rare, which helps with getting and keeping jobs.
To the arrogant physicist who suggested biologists aren't concerned with real statistical parameters: get a life. You don't have a clue what you're talking about. We have to deal with the real, ugly, complex world, so I guarantee you that biologists are, on average, far more savvy about statistics than you are. Excel's stats abilities are indeed extremely limited... but providing more info RE regressions is not among those limitations. Use the bloody Data Analysis package.
I process disgustingly large and complicated datasets on a daily basis. I model extensively in C++. I am a highly quantitative mathematical biologist who curses at Excel at least once per day, usually quite loudly and profanely. But I've also tried to live without it, and frankly I am forced to admit that it makes my life and job much easier.
Stick to teaching physics, professor.
This is the single most illinformed and ridiculous blog entry I have ever read on Excel.
By your own words, you prove that you have no idea how to use Excel as a tool.
On the other hand, I have used it virtually everyday since it was first released 25 years ago.
For data analysis and presentation it towers over all other platforms.
I'll agree that the defaults are more situated toward business uses, but if you do not know how to change these then you have no business penning an article on Excel.
Damn, man, take two seconds and do a Google search on advanced Excel charting.
And by the way, any uncertainty data you can think of regarding regression analyses can easily be displayed on an Excel chart.
As a engineering student, I have grown to loath the way excel handles and presents data. I much prefer MatLab over excel. My only disdain for MatLab is the way in which it retrieves data from excel. I would much rather eat nails than be forced to work with excel.
One thing not mentioned above: It is hard to make a series of plots look the same e.g. if you change the font size or the format of the axis labels the axis move around.
Wow
Ever hear of the concept that there is a right tool for every job?
I'm a scientist, and I use excel all the time. Why? It's easy 90%+ of all graphs I need to make are for quick - 'what do you think about THIS' discussions. Does my collaborator need a perfect graph? No, does he need to see *something* so we can talk. Yes!
That being said - trying to do anything beyond really simple mathematics in excel is a waste of time.
If I need a basic r^2 for a teleconference in an hour, go go excel! if I need anything else, no.
Finally, I think you can actually make very sharp looking graphs and charts in excel, you just need to know how. I would say that it is no harder to learn to do it 'right' in excel that it is to learn any of the other arcane packages out there.
I'll often do my analysis in matlab, set the output of the system so that it creates a nice easy-to-convert output that I can shove straight into one of three or four excel templates to turn into pretty pictures.
Finally - everyone and their brother has access to excel (or openoffice) - so that means if I make a SIMPLE chart in excel, I can pass it along, and people can actually use it.
You can probably create nice plots after a lot of fine tuning, but try to do a series of plots which should look all the same. You are lost. The plot area changes e.g. when you plot data in different ranges in two plots. In one plot y-data is 1-10 range in the other 100-1000, the axis moves as "1000" needs more space. The axis should be at fixed position. There are many more examples when this happens.
I wish excel would die a very quick death. It is a time waster and who cares or looks at graphs. The written word is far better.
I have so many skills and everyone wants excel and powerpoint. Why, Why, Why??? Please explain. Drives me crazy as everyone uses it in a different way it is temperamental and can cause huge stuff ups. Hate it and spreadsheets as well.
Accountants have taken over the world. I want my skills on stage now for a change.
Was using the excellent, very above expectations excel 2003 until this year when my old Pc died R.I.P in heaven, avoided all this time the crappy excel 2007 all this years. And now i´m forced to endure this horrendous cousin of the ultra crappy internet explorer that is called Excel 2010. I looks like they upgraded the program for people to just hate Microsoft ...I can´t understand why they like to destroy their top programs... Why they made all this buttons?!? Just to look beautiful..well programs are no whores you know, they don´t fucking need to be beautiful...they just need to be practical MFs!!
I think what would be worthwhile is a way to save those settings down. They could offer setting for science, business, and other communities. Or, we could build them ourselves and provide them either for a nominal fee or free.
Excel 2007 issues...
1) The macro record sucks; it doesn't record all of the steps now. It used to be a powerful tool to quickly churn out functional VBA code. Now, it skips steps, making it fairly worthless and time-consuming filling in the gaps for it.
2) The help file is about as useful as a tick on a dog's a**. The "offline" help file only covers the most rudimentary stuff. If you use "online" help, all it does it take you to Bing, MS' search engine. It's as if MS is saying "why should we bother writing up a real help file, since folks just google up answers anyways... I KNOW! Let's force them to Bing up answers, that way the amount of confused people trying to find answers will artificially inflate the Bing usage statistics!" Gee...wonderful. Excel help files were amazingly helpful in 95/97. They started sucking in 2000. Now, they practically non-existent, leaving folks roaming through pages of online forums reading about people bantering on problems instead of getting to the point of resolution.
3) Reorg or buttons to the ribbon makes sense when you get used to it, but some of the most powerful buttons, like Print Preview, are non-existent; you have to manually add it to your "quick" bar. Sloppy.
4) Custom Toolbars and Custom Buttons are now a nightmare. They used to be very simple... right-click a menu, tell it to create a new toolbar or button, play around with the picture editor to make an icon, assign a macro, you're done. You could create an .xla file to store your toolbars and macros in as an add-in. Now, you want to make a custom button? Well, you better get comfortable coding XML, b/c you have to manually code some into certain files to get your buttons added to the interface. Hope you're a programmer.
They basically add a new coat of paint to a product, and in doing so broke a lot of things. Then, they had the audacity to charge people money for a regression by making it seem like "progress". It's lame, and it falls right in line with MS's "rest on your laurels and just repaint the fence" Vista days.