A Rant About Straight Lines

I'm teaching a lot of calculus this term, and we just spent the last class period or two talking about straight lines. That makes sense. Calculus is especially concerned with measuring the slopes of functions, and straight lines are just about the simplest functions there are.

Now, the textbook we're using this term, like pretty much all textbooks, defines a linear function as one that can be written in the form


$latex y=mx+b$

I hate that!

It's not that it's wrong. It is perfectly true that straight lines, and only straight lines, can be expressed with equations of that form. But defining a straight line in terms of the form of its equation is like defining an even number as one whose final digit is 0, 2, 4, 6 or 8. It's perfectly correct, but it doesn't really get at what an even number is. Do we really want to suggest that cultures that use a different system for writing down numbers cannot understand the concept of an even number?

But it gets worse. The problem is not simply that y=mx+b fails to make clear what is special or interesting about linear functions. It is that even if you are going to define linear functions in terms of their equations, that is hardly the most natural one to use.

So let's philosophize a bit. If I am thinking of a particular straight line, and I want you to be thinking about the same line, what information do I need to give you? The answer is that I need to give you two things: one point on the line, and the line's slope. One point and the slope uniquely determine the line. Of course, I could also give you two points on the line, but that is because you can use the two points to determine the slope.

The key observation is that any point will do. There is no privileged point that plays a special role in defining the line. And when you appreciate that, you come to see the full horror of the y=mx+b form. It's that b at the end. It represents the y-intercept of the line, which is to say the point where the line hits the y-axis. But why should we care about that point? Why should the y-intercept be considered the be-all end-all of linear functions? It's madness! Heck, I could just move the axes, and suddenly the line has a different y-intercept. But I haven't changed the line itself, now have I?

So how should we think about the equation of a straight line? Since the thing that makes a straight line special is that it has a constant slope, I would say the general equation is this:


$latex \textrm{slope}=\textrm{constant}$

Now that's an equation I can get behind!

Let's flesh it out a little bit more. I said before that you need to be given a point and a slope to determine a line. So let's call the slope m and the given point


$latex (x_0,y_0)$.

You probably remember that the slope of the line between two points is defined as rise over run. So, if (x,y) is any other point on the line then we have


$latex m=\textrm{slope}=\dfrac{\textrm{rise}}{\textrm{run}}=\dfrac{y-y_0}{x-x_0}$

And this gives us the equation

$latex m=\dfrac{y-y_0}{x-x_0}$.

We could stop there, but fractions are annoying. So let's cross-multiply to get

$latex y-y_0=m(x-x_0)$

And we're done.

Textbooks call this the point-slope form of the equation of a line. If the given point happens to be the y-intercept, so that its coordinates are (0,b), then our point-slope form quickly reduces to y=mx+b.
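To spell out that reduction, which is nothing more than a substitution: take the given point to be $latex (x_0,y_0)=(0,b)$ in the point-slope form. Then

$latex y-b=m(x-0)\quad\Longrightarrow\quad y=mx+b$

So the slope-intercept form is just the point-slope form with one particular choice of point.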

So, let's wrap this up. The y=mx+b form certainly has its uses. For example, if you want to cook up an algebraic proof of the fact that two lines are parallel if and only if they have the same slope (an exercise we did in class!), then this is the most convenient form to use. But please, let us stop the nonsense that this is the most appropriate way to define the notion of a linear function.

Of course, in this post I have been assuming that our goal is to describe a line in a two-dimensional plane. We could also ask how to describe a line that's slashing through three-space. But that's a subject for a different post...


Jason - you are, as usual, absolutely correct. However, you are either missing the point, or else deliberately obfuscating in order to make the point seem otherwise - or, at least, more important than it actually is. y = mx + b is a neat and convenient way for students to learn that straight lines have constant slopes. Any half intelligent student will realise (if only sub-consciously) that its reference to the point where the line cuts the y axis is arbitrary, and crops up purely for the sake of ease of expression. In other words, if he/she has not actually worked it through that y - y0 = m (x - x0), then it won't matter because they've got the idea anyway. Are you possibly being a little pedantic, or even obscurantist on this point?

Why should the y-intercept be considered the be-all end-all of linear functions? It’s madness!

No it's not! It's just a convenient reference, and one which you are perfectly free to reject if you please. What suggestion of "be-all end-all" (let alone madness!) does this introduce?

By the way, back over in good old England we usually write y = mx + c, the reason being that c refers to a constant, not a bonstant (Oops - having said that, who's being pedantic now?)

By the way, am much interested in your crazy American Creationist movement, which leaves me and most of my Limey friends over here simply open-mouthed in amazement (but please don't send any more of them over here, or I may have to start my own sojourn amongst the creationists) - love your blog, have followed it a long time. Might offer you a bit more in future. You and PZ, and Jerry C etc., just you mind to keep up the good work! - and we'll agree to keep letting Dawkins loose on your side of the pond.

Just as "b" references a privileged point, so does "m" reference a privileged set of axes. In a change of coordinates, "m" transforms in a non-obvious way. The essence of a line is that it is drawn through two points. The equation (y-y1)/(x-x1) = (y1-y2)/(x1-x2) captures the geometric meaning.

Why on Earth do you use 'm' in America? Wouldn't it be more natural to use 'a'? Wouldn't that also generalize nicer to second-degree polynomials (y=ax²+bx+c)?

By Peter Lund (not verified) on 05 Sep 2013 #permalink

Peter - (I assume you're also from UK)

- hang on a minute!
When did we stop using "m" ? For quadratics, sure - a,b,c etc. But for calculus step 1? Was always "m" for me.

So straight lines must have no ends?

By Rosie Redfield (not verified) on 05 Sep 2013 #permalink

Pardon my intervention in this exciting topic, but I became enlightened in calculus 15 years ago when my kids had it in high school. I personally had calculus and diff E "up to here" in college, but studied for the test. I "knew" that in the real world of engineering, most of it would collapse to formulas that worked.
When my kids asked for help, I was finally forced to consider, what is the point of all of this?
It finally hit me, that calculus is a nifty mathematical way to describe the world as it CHANGES. Maybe that is old news to most, but consider this. Much of math describes things as they ARE - formulas for spheres, relationships between sides of things (trig), trains approaching from different directions, etc. Calculus describes the where, when and how of things in transition. That is pretty cool. I think if kids today could be taught to see the "cool" side of it, they might embrace it more easily as a field of study.
Of course, the engineers will always cut to the formulas, but others might see the world, and its constant state of change, differently.

Like Phil, I don't see how y=mx+b "privileges" the y axis. Before it does that, you have to privilege x=0, and who says anyone does that? You should take your own advice: "The key observation is that any point will do." Any x will do.

Now, I do indeed like your idea of focusing students on the key features of a straight line rather than the mathematical form. I.e., teach that a straight line is one for which slope = constant, and you need to know two bits of data to fully determine it (slope and one point). But I disagree that the point-slope form is a more natural or superior teaching aid; like Phil, I think it's going to be more obscure and less natural to the students. I expect that if you use that form instead, you are going to have to explain more, not less, to get across the fundamental concepts you are trying to convey.

It all depends on when you come in -:) the textbook Jason is referring to certainly uses y=mx+b, but has a previous precalculus section on the derivation of straight lines that starts with constant slope, then builds the two point form from that. At the end of the day students need a formula, and this one is the easiest to remember!

By Stephen Lucas (not verified) on 05 Sep 2013 #permalink

My main issue with this form is that people keep using it to write code in the software I work on. That software naturally has cases where the line is vertical, and then the functions, which work by computing the slope of the line connecting the two points passed as arguments, all break. Whenever I come across one of those slope calculations, I recast the function with the line expressed parametrically,
x = x0 + K1 s
y = y0 + K2 s

By Winter Toad (not verified) on 05 Sep 2013 #permalink
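The parametric recasting in Winter Toad's comment might look something like the following in code. This is only a minimal sketch: the names `ParametricLine`, `line_through`, and `point_at` are made up for illustration, and the direction components `k1`, `k2` stand in for the comment's K1 and K2. The point is that no slope is ever computed, so a vertical line is not a special case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParametricLine:
    """Line x = x0 + k1*s, y = y0 + k2*s (hypothetical helper, not from any library)."""
    x0: float
    y0: float
    k1: float
    k2: float

    def point_at(self, s):
        # Evaluate both coordinates at parameter value s.
        return (self.x0 + self.k1 * s, self.y0 + self.k2 * s)


def line_through(p, q):
    """Build the line through two distinct points without dividing by (x2 - x1)."""
    (x1, y1), (x2, y2) = p, q
    if (x1, y1) == (x2, y2):
        raise ValueError("two distinct points are needed to determine a line")
    # The direction is simply the difference of the two points.
    return ParametricLine(x1, y1, x2 - x1, y2 - y1)


# A vertical line, which a slope-based function would choke on.
vertical = line_through((2.0, 1.0), (2.0, 5.0))
print(vertical.point_at(0.5))  # (2.0, 3.0)
```

Here s = 0 gives the first point and s = 1 the second, so s = 0.5 is the midpoint.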

Winter Toad, you're rapidly sneaking up on the full vector formulation:

X=X0+X's (vectors in bold, X constant)

To me, the most important thing about linear functions is superposition. But then I'm an electrical engineer and use the Linear Systems Theorem a lot.

By D. C. Sessions (not verified) on 05 Sep 2013 #permalink

Correction: that should have been "X0 and X' constant."

By D. C. Sessions (not verified) on 05 Sep 2013 #permalink

Stephen Lucas --

What section are you talking about? I'm talking about section 0.5, which as far as I can tell is the first place in the book that linear functions are discussed. And the first thing that is said is that a linear function is one that has the form y=mx+b.

Phil B --

y = mx + b is a neat and convenient way for students to learn that straight lines have constant slopes. Any half intelligent student will realise (if only sub-consciously) that its reference to the point where the line cuts the y axis is arbitrary, and crops up purely for the sake of ease of expression.

While I do think my students are generally quite a bit better than “half intelligent,” I also think you are misjudging what they find easy and what they find hard. Most of the time they come to class with the mindset that math is just a matter of memorizing formulas and then applying them with machine-like precision when called upon to do so. They do not naturally think about the big picture, or ask themselves about “what is really going on.”

I am a visual person and the geometrical motivations always worked for me like the point slope intuition that you mentioned which can be also used to digress slightly into Euclidean vs non-Euclidean line definitions...and depending on the type of calculus class even manifolds.

Of course the y = mx + b form can be used to express linear relationships without plotting on a coordinate system.

A Gallup study (2008-2008, 143 countries, “Is religion important in your daily life?”) had this linear form:

religiosity = -0.1129x + 0.983 where x = educational level.

I have to side with Jason on this one. The "y=mx+b" equation becomes so automatic for students that they often forget where it comes from, or they forget about "point-slope" form, and this creates problems.

When I ask my calculus students, at the beginning of the term, what two pieces of information one needs to determine a line, the first answer I hear is almost invariably: slope and y-intercept. This is exactly what Jason is talking about.

This creates real problems. Suppose you want to find the tangent line to a curve at a point. You've used your new calculus skills to find the slope of the tangent line. Now, to write down the equation for the tangent line, you're going to need a point on the line. Of course, if you understand that any point will do, then this is easy. If you think you need to know what the y-intercept is.....ugh.
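The tangent-line situation described above is easy to sketch in code. The function name `tangent_line` and its arguments `f`, `df`, `x0` are illustrative assumptions, and the derivative is supplied by hand rather than computed. The line is built directly from the point of tangency and the slope, and the y-intercept never has to be found.

```python
def tangent_line(f, df, x0):
    """Return the tangent to f at x0 as a function, via y - y0 = m (x - x0)."""
    y0 = f(x0)   # the point of tangency is the point we already have
    m = df(x0)   # the slope comes from the derivative
    return lambda x: y0 + m * (x - x0)


# Example: f(x) = x^2 at x0 = 3, where f'(x) = 2x.
tangent = tangent_line(lambda x: x**2, lambda x: 2 * x, 3.0)
print(tangent(3.0))  # 9.0: the tangent touches the curve at x0
print(tangent(4.0))  # 15.0: one unit to the right, the line has risen by the slope, 6
```

If the y-intercept is wanted afterwards, it is just `tangent(0.0)`, a consequence of the line rather than an input to it.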

Given that, as far as I am aware, the only 2 people to have posted so far who teach mathematics at a high level (i.e. Jason and Jeff) are not agreeing with me, I think I am going to have to concede with respect to what their students find more or less difficult – Eric’s support notwithstanding. Far be it from me to assume a knowledge over their long experience, in attempting to tell them how to do what must be a difficult job.

That said, I am still a little curious:-

-While I do think my students are generally quite a bit better than “half intelligent,”

Jason – My comment certainly was not meant to impugn the intelligence of your students. I think a careful reading of what I said will reveal the underlying assumption (for the purpose of highlighting my subsequent point), is that your students would indeed be expected to be persons of considerable intelligence –no doubt much higher than average.

-the mindset that math is just a matter of memorizing formulas and then applying them with machine-like precision when called upon to do so. They do not naturally think about the big picture, or ask themselves about “what is really going on.”

This bit, however, puzzles me. Surely such a mindset is one out of which you would wish to lead your students as soon as possible – perhaps by halfway through 1st year? As a teacher myself (in my case of med students) getting them to think about what they are learning – even to challenge it – is paramount. If this applies to a subject which largely revolves around learning huge factual shopping lists relating to human biological sciences – anatomy, physiology etc., then how much more so in the rarefied atmosphere of mathematics, where abstraction of thought is of the essence?

There's another important issue here:

"Linear functions" have two parts -- the "linear" and the "function." When you teach them, you do them differently depending on whether you're emphasizing the "linear" part or the "function" part. They make a pretty good intro to functions, too, where one probably would want to emphasize the f(x)=mx+b form. The vertical lines in software aren't functions (at least y as a function of x), and that's one of the first things one learns about what makes something a "function." But when you're learning about lines, you need to know all the different ways to describe a line, and what each would be used for.

There is always going to be some question about whether the motivation is more geometric or more algebraic/analytical -- think of all the ways to describe a given ellipse in two dimensions.

By Another Matt (not verified) on 09 Sep 2013 #permalink

I'm having a very hard time understanding this:

Heck, I could just move the axes, and suddenly the line has a different y-intercept. But I haven’t changed the line itself, now have I?

But since the equation for any line will always be relative to a coordinate system, you have "changed the line itself" when you switch coordinate systems. Your line's (x_0, y_0) point will change with it, and if anything besides translation of the axes is involved (rotation, e.g.), you'll have changed its slope as well, with respect to the xy-plane.

Someone help me out?

By Another Matt (not verified) on 09 Sep 2013 #permalink

@19: If you translate the axes rigidly (no stretching or rotation), then the slope remains the same although the intercept changes (and of course an arbitrary point (x0,y0) on the line would change as well). The modification to the point-slope form would be more symmetric, since both x0 and y0 would change in similar ways, whereas the change in b would be a little more complicated.

One other annoyance is that in linear algebra and beyond, these things are called "affine" functions, not "linear" unless b = 0. Maybe "linear" should refer to straight line graphs and the more restrictive notion of function/transformation should have another name (like "homogeneous" is used for linear equations with constant term 0), but affine and linear are the standard terms for transformations.

By Ned Rosen (not verified) on 10 Sep 2013 #permalink
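The effect of translating the axes, as discussed in the comment above, can be worked out explicitly. Suppose, purely for illustration (the symbols h and k are not in the original comments), that the origin is moved to the point (h,k), so the new coordinates are $latex x'=x-h$ and $latex y'=y-k$. Substituting into the point-slope form gives

$latex y'-(y_0-k)=m\bigl(x'-(x_0-h)\bigr)$

so the given point simply shifts along with the axes and the slope is untouched. The same substitution in the slope-intercept form gives

$latex y'=mx'+(b+mh-k)$

so the new intercept depends on the shift in both directions and on the slope.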

I agree that the point-slope form captures the geometry of non-vertical lines better than slope-intercept, but if you want to be fully general, then ax + by = c or even a(x-x0) + b(y-y0) = 0 apply to vertical lines as well.

If we're only considering lines which are (graphs of) functions, then one of the functional forms y = f(x) = mx + b = m(x-x0) + y0 probably should be emphasized along with point-slope, since point-slope is not in the form y = f(x).

By Ned Rosen (not verified) on 10 Sep 2013 #permalink
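As a rough illustration of the general form ax + by = c mentioned above, here is one way to get the coefficients from two points. The helper names `general_form` and `on_line` are hypothetical, and the choice of (a, b) as a vector perpendicular to the line's direction is one convention among several (any nonzero multiple of the triple describes the same line).

```python
def general_form(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y = c through distinct points p, q."""
    (x1, y1), (x2, y2) = p, q
    if (x1, y1) == (x2, y2):
        raise ValueError("two distinct points are needed to determine a line")
    a = y2 - y1          # (a, b) is perpendicular to the direction (x2 - x1, y2 - y1)
    b = x1 - x2
    c = a * x1 + b * y1  # chosen so that p satisfies the equation
    return a, b, c


def on_line(coeffs, point, tol=1e-12):
    """True if the point satisfies a*x + b*y = c to within tol."""
    a, b, c = coeffs
    x, y = point
    return abs(a * x + b * y - c) <= tol


vertical = general_form((2, 1), (2, 5))
print(vertical)                    # (4, 0, 8), i.e. 4x = 8, the line x = 2
print(on_line(vertical, (2, -7)))  # True
```

No division occurs anywhere, which is why the vertical case needs no special handling.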

As a college math professor, I'm going to agree with the original post. I also have plenty of calculus students who can't manage the equation of a line because at no point (or at least no point they were required to pay attention to) were they required to understand where that equation comes from or what it's telling them. I'm not convinced many of them really understand, at a fundamental level, the point slope form either, but of the two, it's certainly the formula that comes the closest to how students should think about lines for the purpose of calculus. So it's the form I talk about almost exclusively.

All that said, there's still a problem here. Either of the "point-slope" or "slope-intercept" forms could be considered equations that describe lines in the plane, but neither of those formulas corresponds to a "linear function". A linear function F is one that satisfies the property F(ax+y)=aF(x)+F(y), where a is a scalar and x and y are usually some kind of vectors. The formula f(x)=mx+b (or your favorite other variant in functional form) should probably be called an "affine function" or something like that.
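A quick numerical check makes the distinction concrete. This is a sketch under obvious assumptions: the helper `is_linear_at` is invented for illustration, it tests the property F(ax+y)=aF(x)+F(y) at a single choice of values only, and passing such a spot check does not prove linearity (failing it does disprove it).

```python
def is_linear_at(F, a, x, y, tol=1e-9):
    """Check F(a*x + y) == a*F(x) + F(y) for one choice of scalars a, x, y."""
    return abs(F(a * x + y) - (a * F(x) + F(y))) <= tol


m, b = 3.0, 2.0
through_origin = lambda x: m * x        # y = mx
with_intercept = lambda x: m * x + b    # y = mx + b

print(is_linear_at(through_origin, 2.0, 1.0, 5.0))  # True
print(is_linear_at(with_intercept, 2.0, 1.0, 5.0))  # False: the right side picks up an extra a*b
```

Expanding both sides for f(x)=mx+b shows they differ by exactly ab, so the property holds for every scalar a only when b = 0.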

Yes, let's talk about affine transformations...

Nice! As a colleague I disagree with you (keep in mind though that my pupils are 14, 15 years old).

"you come to see the full horror of the y=mx+b form. It’s that b at the end."
At 14 or 15 years old, kids are far more open to a systematic inductive approach. So I start with lines going through the origin O, all with a different slope. They have to draw four or five of them in one figure. Then they immediately see what the slope, i.e. the parameter a, actually does. Of course thus we are researching even simpler functions: those of the form y = mx (I prefer to use a instead of m, but whatever).
The next step is to make clear that linear functions with the same slope are parallel. For this I use the translation vector (0, b) - of course they already have met some simple vector math. Again they have to draw four or five of them, each with the same slope.
Then the final step to the general form y = mx + b is a very easy one. The most important thing: they never forget.

"But why should we care about that point?"
Didactics, my dear JR, it's all about didactics. The equations you give, especially the one for the slope, make my kids' eyes glaze over if I present them at such an early stage.
At a somewhat higher level, where pupils are capable of more abstract thinking and have more calculating skills (you don't want to know how bad mine are at cross-multiplying; almost all of them are simply too young for it) I'd prefer your approach too.

"the nonsense that this is the most appropriate way...."
Here is where you go wrong. The most appropriate way to define something in math and physics is by far not always the best one from a didactical point of view. As is hopefully clear, the age of the pupils (Piaget!) plays a decisive role.

"Like Phil, I don’t see how y=mx+b “privileges” the y axis."
Oh, I do. See my post above: I derive the equation by inductively researching all points of the y-axis. Only after my kids have mastered the form y = ax + b I show them how it relates to every single point of the xy-plane.
That's the essence of the inductive method: I begin with origin O, then include all points of the y-axis and finally all points of the xy-plane.
Btw from here it's very easy to expand to the xyz-space as well. JR's method after all also includes a privilege - namely for all straight lines in one specific plane ......

"Most of the time they come to class with the mindset that math is just a matter of memorizing formulas and then applying them with machine-like precision when called upon to do so."
Wow, then the American educational system rather sucks. Dutch books try very hard to avoid this; instead they try to build up gradually the big picture (we use Dutch books in Suriname).

"the first answer I hear is almost invariably: slope and y-intercept."
The first answer I always get is two points, because that's where I end the subject when presented for the first time. It can take some time for my pupils to recall though.
As a physics teacher (I teach two subjects) I can think of another reason to privilege the y-intercept. In classical mechanics, more specifically in a velocity-time diagram, this intercept represents the initial velocity.

"They make a pretty good intro to functions"
Again too abstract for my kids. My introduction is the Surinamese equivalent of states and their capitals (to distinguish functions from non-functions I also ask a pupil to give me the names of the mother plus some aunts and their daughters).