Good Math, Bad Math

Basics: Limits

One of the fundamental branches of modern math – differential and integral calculus – is based on the concept of limits. In some ways, limits are a very intuitive concept – but the formalism of limits can be extremely confusing to many people.

Limits are basically a tool that allows us to get a handle on certain kinds
of equations or series that involve some kind of infinity, or some kind of value that is almost defined. The informal idea is very simple; the formalism is also pretty simple, but it’s often obscured by so much jargon that it’s hard to relate it to the intuition.

The use of limits for finding almost defined values sounds tricky, but
it’s really pretty simple, and it makes for a very good illustration.

Think of the simple function: f(x)=(x-1)/(sqrt(x)-1). Just looking at it, you should be able to quickly see that its value at x=1 is undefined – f(1)=(1-1)/(1-1)=0/0.

But let’s look at what happens as we get close to x=1.

f(2)=2.414. f(1.5)=2.22. f(1.2)=2.09. f(1.1)=2.05. f(1.01)=2.005… As we look at values of numbers greater than 1, but ever closer and closer to x=1, we can see that as x gets closer to 1, f(x) gets closer and closer to 2.

The same thing happens from the other direction. f(0.5)=1.71. f(0.9)=1.95. f(0.99)=1.995….

At exactly x=1, the value of the function is undefined – it’s a division by zero. But from either side of 1 – greater or lesser – the closer we get to x=1, the closer f(x) gets to 2. So we say that the limit of f(x) as x approaches 1 is 2 – more traditionally written: lim(x→1) f(x) = 2.
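This numerical approach is easy to check with a short script. Here’s a minimal sketch in Python (the choice of language is incidental) that tabulates f(x) on both sides of x=1:

```python
import math

def f(x):
    # f(x) = (x - 1) / (sqrt(x) - 1); undefined at exactly x = 1
    return (x - 1) / (math.sqrt(x) - 1)

# Approach x = 1 from above, then from below; f(x) heads toward 2 both ways.
for x in [2, 1.5, 1.1, 1.01, 1.001, 0.5, 0.9, 0.99, 0.999]:
    print(f"f({x}) = {f(x):.4f}")
```

Note that no amount of tabulation proves the limit – it only suggests the value that the formal definition below pins down.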

A simple example of managing infinity with limits is the function f(x) = (1/x)+4. As x gets larger, f(x) obviously gets closer and closer to 4. It never actually reaches four – but it gets closer and closer. For any number ε, no matter how small, you can find some value x so that f(x)<4+ε, and past that x, f(x) will always be less than 4+ε. Epsilon can be 10^-800 – and there’s some value x after which f(x) is always less than 4+10^-800.
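For this particular f, the threshold is easy to compute in closed form: f(x) − 4 = 1/x, so any x beyond 1/ε works. A small Python sketch of that reasoning:

```python
def f(x):
    return 1.0 / x + 4

def threshold(eps):
    # f(x) - 4 = 1/x, so f(x) < 4 + eps for every x > 1/eps.
    return 1.0 / eps

for eps in [0.1, 0.001, 1e-9]:
    n = threshold(eps)
    # any x beyond the threshold stays within eps of 4
    assert f(2 * n) < 4 + eps
    print(f"eps = {eps}: f(x) < 4 + eps for all x > {n}")
```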

Which brings us at last to the formal definition of a limit. We’ll start with
the case where we get infinitely close to a real value. Suppose we have a real-valued function f(x), defined on an open interval around a value p (but not necessarily at p itself). Then lim(x→p) f(x) (the limit of f(x) as x approaches p) = L if and only if for all ε>0, there exists some value δ>0 such that for all x where 0<|x-p|<δ, |f(x)-L|<ε.

That’s really just restating what we did with the example. It’s just a formal way of saying that as x gets closer and closer to p, f(x) gets closer and closer to L. The ε and δ make that precise: no matter how small an ε you choose, you can find a δ small enough that whenever x is within δ of p (but not equal to p), f(x) stays within ε of L.
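One way to get a feel for the ε–δ game is a sampling check. This sketch (Python, purely illustrative) tests a particular (ε, δ) pair for our earlier example; sampling can only refute a pair, never prove the limit – that’s what the formal definition is for:

```python
import math
import random

def f(x):
    # Away from x = 1 this equals sqrt(x) + 1, so the limit at 1 is 2.
    return (x - 1) / (math.sqrt(x) - 1)

def check_limit(f, p, L, eps, delta, trials=10000):
    # Sample points with 0 < |x - p| < delta and test |f(x) - L| < eps.
    for _ in range(trials):
        x = p + random.uniform(-delta, delta)
        if x == p:
            continue
        if abs(f(x) - L) >= eps:
            return False
    return True

# For this f, delta = eps works, since |sqrt(x) - 1| <= |x - 1| for x >= 0.
print(check_limit(f, p=1, L=2, eps=0.01, delta=0.01))
```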

For dealing with infinity, as in our second example, the formal definition is:
Given a real-valued function f, lim(x→∞) f(x) = L if and only if for all ε>0, there exists some real number n such that for all x>n, |f(x)-L|<ε.

This is exactly the same kind of trick that we used in our example of a limit as x approaches infinity – no matter how small a value you pick for epsilon, there is some point on the curve after which f(x) will never be farther than ε away from L.


  1. #1 Blake Stacey
    February 14, 2007

    Just looking at it, you should be able to quickly see that its value at x=1 is defined – f(1)=(1-1)/(1-1)=0/0.

    “Defined” is equivalent to “undefined” for small values of definition.

  2. #2 Blake Stacey
    February 14, 2007

    Keith Devlin had a good essay on limits, titled “Letter to a calculus student”.

    The expression to the right of the equal sign in [the definition of the derivative] represents the result of a process. Not an actual process that you can carry out step-by-step, but an idealized, abstract process, one that exists only in the mind. It’s the process of computing the ratio

    [f(x + h) - f(x)] / h

    for increasingly smaller nonzero values of h and then identifying the unique number that those quotient values approach, in the sense that the difference between those quotients and that number can be made as small as you please by taking values of h sufficiently small. (Part of the mathematical theory of the derivative is to decide when there is such a number, and to show that if it exists it is unique.) The reason you can’t actually carry out this procedure is that it is infinite: it asks you to imagine taking smaller and smaller values of h ad infinitum.

    The subtlety that appears to have eluded Bishop Berkeley is that, although we initially think of h as denoting smaller and smaller numbers, the “lim” term in [the definition formula] asks us to take a leap (and it’s a massive one) to imagine not just calculating quotients infinitely many times, but regarding that entire process as a single entity. It’s actually a breathtaking leap.
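    The shrinking-h process Devlin describes is easy to watch numerically. A quick sketch (Python, purely illustrative – the function and the point x = 3 are arbitrary choices):

    ```python
    def diff_quotient(f, x, h):
        # The ratio [f(x + h) - f(x)] / h from Devlin's description.
        return (f(x + h) - f(x)) / h

    def square(x):
        return x ** 2   # derivative at x = 3 is exactly 6

    # As h shrinks, the quotients approach 6 -- but no finite step
    # of the process ever reaches the limit itself.
    for h in [0.1, 0.01, 0.001, 1e-6]:
        print(h, diff_quotient(square, 3, h))
    ```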

  3. #3 Doug
    February 14, 2007

    I was under the impression that limit theory was developed to explain why Newton [fluxions] and Leibniz [calculus] were able to essentially divide by zero.

    “… (If nothing else, it took some of the most brilliant mathematicians 150 years to arrive at it) …”

    I am unsure who deserves credit:
    “… The rule is believed to be the work of Johann Bernoulli since l’Hôpital, a nobleman, paid Bernoulli a retainer of 300 Francs per year to keep him updated on developments in calculus and to solve problems he had. Among these problems was that of limits of indeterminate forms. When l’Hôpital published his book, he gave due credit to Bernoulli and, not wishing to take credit for any of the mathematics in the book, he published the work anonymously. Bernoulli, who was known for being extremely jealous, claimed to be the author of the entire work, and until recently, it was believed to be so. Nevertheless, the rule was named for l’Hôpital, who never claimed to have invented it in the first place …”

  4. #4 bza
    February 14, 2007

    The wikipedia entry on l’Hopital is actually too generous. The retainer he paid Bernoulli was not only for being kept up to date on current developments. It was a condition of the payments that Bernoulli send l’Hopital copies of his own discoveries and not publish anything on his own. L’Hopital then frequently reported these discoveries to luminaries such as Huygens without mentioning Bernoulli.

    There’s a short account of all this in section IV of this review of Bernoulli’s collected works (JSTOR; university access required).

  5. #6 Mikael Johansson
    February 14, 2007

    Oooooooooh, THAT kind of limits.

    And here I was hoping for direct and inverse limits. :P

  6. #7 Billy
    February 14, 2007


    You can get more manageable (and more reliable) links for JSTOR articles by going to “Article Information,” then copying the “Stable URL” link.

    Here’s a nicer link for the article you cited:

    (When you print an article, the Stable URL appears on the cover page; so that’s another way to find it.)

    And thanks for the link!

  7. #8 Jonathan Vos Post
    February 25, 2007

    Jonathan Vos Post

    The problem of pain, screaming for solution
    for dissolution from the reign of fire
    dendron : burning bush : the axon wire
    carries no reprieve from execution

    Failing to solve the pain equation
    searching for signs where blood might balance
    cries might cancel — what good are these talents
    when the unknown’s launched the last invasion?

    Formulas raped. Irrational fact
    barbecues brain on vector spearhead, turns
    the white-hot axis where the body burns
    real roots of pain which one cannot extract

    Calculating with a final breath
    the limit as life approaches death…

    26 Sep 1983

  8. #9 Misha Livshits
    February 2, 2008

    Ho-hum, here we go again… Limits, limits, limits… Away with them! We don’t need them to differentiate!
    True, when we plug x=1 into (x^2-1)/(x-1) we get 0/0, which is undefined (because for any number c, 0*c=0). But we can divide the numerator by the denominator first, as polynomials, to get (x^2-1)/(x-1)=(x+1)(x-1)/(x-1)=x+1, and plug in x=1. This works fine for polynomials when we want to differentiate them, because p(x)-p(a)=0 for x=a, and therefore p(x)-p(a) is divisible by x-a, so we can divide p(x)-p(a) by x-a and then plug x=a into the result to get the derivative, p’(a). So we can look at differentiation as division in a certain class of functions (say, polynomials). Take the class of continuous functions instead of polynomials, and you get classical differentiation. So we can look at classical differential calculus as just the algebra of continuous functions.
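    This divide-then-substitute recipe can be sketched directly for polynomials. A minimal Python illustration (representing a polynomial as a coefficient list is my choice, not part of the comment above):

    ```python
    def poly_eval(coeffs, x):
        # Horner evaluation; coeffs are highest degree first: [1, 0, -1] is x^2 - 1.
        result = 0
        for c in coeffs:
            result = result * x + c
        return result

    def derivative_at(coeffs, a):
        # Synthetic division: divide p(x) - p(a) by (x - a), then plug x = a
        # into the quotient. No limits anywhere -- pure polynomial algebra.
        quotient = []
        carry = 0
        for c in coeffs:
            carry = carry * a + c
            quotient.append(carry)
        quotient.pop()  # the final carry is the remainder p(a); discard it
        return poly_eval(quotient, a)

    # (x^2 - 1)/(x - 1) = x + 1, so the derivative of x^2 - 1 at x = 1 is 2.
    print(derivative_at([1, 0, -1], 1))  # 2
    ```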

    What are the advantages of looking at differentiation like this? First, the rules of differentiation become almost obvious. Second, you don’t have to torture the students with limits to teach them how to differentiate and how to use differentiation. Third, you start with simple examples that are already interesting and allow us to solve some nontrivial problems, and generalize gradually.

    But you’d say: “Aha, you are cheating, you say ‘away with limits’ and then you mention continuity!” True, but continuity is a simpler notion than limits (we don’t have to worry about what the limit is and whether it exists), and we can even get away without the general notion of continuity for most calculus applications, because most of the functions that we deal with are better than continuous – they are usually Lipschitz or Hölder, and their derivatives are too. But I’m getting a bit ahead of myself; let’s go back to polynomials and see why the tangent clings to the graph.

    To do that, we just rewrite the expression p(a+h) as a polynomial in h (with coefficients depending on a, but we treat a as fixed). So p(a+h)=p(a)+p’(a)h+q(a,h)h^2, where q(a,h) is a polynomial in the two variables a and h. This polynomial is bounded if a and h are, and we arrive at the estimate |p(a+h)-p(a)-p’(a)h| ≤ Kh^2. It can be rewritten in a more symmetric form, as |p(x)-p(a)-p’(a)(x-a)| ≤ K(x-a)^2. This estimate tells us that the vertical distance between the graph of the polynomial, y=p(x), and its tangent line at the point (a,p(a)), y=p(a)+p’(a)(x-a), is at most K times the square of the horizontal distance to the point of tangency. This pretty much captures our idea of tangency. For example, the x axis is the tangent of the parabola y=x^2 at (0,0).
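    The h^2 estimate is concrete enough to test on an example. A sketch (the particular polynomial, the interval [-2, 2], and the constant K = 8 are my choices for illustration, not from the comment above):

    ```python
    def p(x):
        return x ** 3 - 2 * x

    def dp(x):
        return 3 * x ** 2 - 2

    # For p(x) = x^3 - 2x: p(a+h) - p(a) - p'(a)h = (3a + h) h^2.
    # If a and a + h stay within [-2, 2], then |3a + h| <= 8, so K = 8 works.
    K = 8
    for a in [-1.7, -1, 0, 0.5, 1.7]:
        for h in [-0.3, -0.01, 0.01, 0.3]:
            assert abs(p(a + h) - p(a) - dp(a) * h) <= K * h ** 2
    print("tangent-line estimate holds")
    ```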

    Now we can take our estimate, which holds for polynomials, and turn it into a definition of the derivative: we call g uniformly Lipschitz differentiable (ULD) on the interval [A,B] if the inequality |g(x)-g(a)-g’(a)(x-a)| ≤ K(x-a)^2 holds for A ≤ x, a ≤ B. By dividing our inequality by |x-a| we get |(g(x)-g(a))/(x-a)-g’(a)| ≤ K|x-a|. Switching x and a gives us |(g(a)-g(x))/(a-x)-g’(x)| ≤ K|x-a|, and taking into account that (g(a)-g(x))/(a-x)=(g(x)-g(a))/(x-a), we conclude that |g’(x)-g’(a)| ≤ 2K|x-a|; in other words, g’(x), the derivative of a ULD function, is Lipschitz. So we arrive at a differentiation theory in the realm of Lipschitz functions, which already includes all the analytic functions.
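    The conclusion |g’(x)-g’(a)| ≤ 2K|x-a| can also be spot-checked numerically. A sketch, reusing a cubic on [-2, 2] with K = 8 (again, my illustrative choices):

    ```python
    def g(x):
        return x ** 3 - 2 * x

    def dg(x):
        return 3 * x ** 2 - 2

    # On [A, B] = [-2, 2] the ULD constant K = 8 works for this g, so the
    # derivation above predicts |g'(x) - g'(a)| <= 2*K*|x - a| = 16|x - a|.
    K = 8
    points = [-2, -1.3, -0.5, 0, 0.4, 1.1, 2]
    for x in points:
        for a in points:
            assert abs(dg(x) - dg(a)) <= 2 * K * abs(x - a)
    print("derivative is Lipschitz with constant 2K")
    ```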

    From the estimate that defines ULD it is easy to derive the increasing function theorem that says that a ULD function with non-negative derivative must be non-decreasing. This theorem takes the central role in the whole theory of uniform Lipschitz calculus. See the slides for my talk for details, as well as my calculus writeup from a subpage of my home page where you can find some details about my project and more references. I especially recommend the articles by Mark Bridger, by Harold Edwards and by Hermann Karcher, and his German Lecture Notes. There is also a forthcoming book by Qun Lin who has simplified calculus along the similar lines.

    Some people may object that this ULD theory is too restrictive. Well, we can use Hölder estimates instead of Lipschitz; the basic inequality becomes |g(x)-g(a)-g’(a)(x-a)| ≤ K|x-a||x-a|^γ, and if we want to capture all the continuously differentiable functions with a given modulus of continuity m, the estimate becomes |g(x)-g(a)-g’(a)(x-a)| ≤ K|x-a|m(|x-a|). All the proofs stay virtually the same. This calculus via uniform estimates is much simpler than the classical one and puts a true understanding of the subject within the reach of a much wider audience, since the subtle theorems involving compactness are not needed. It also emphasizes the aspects of the subject that are most important for computations. The other attractive feature is that most of the textbook calculus problems can be solved within the new approach; all the formulas stay the same, and the only things that change are the mathematical mantras that accompany the calculations. Generalizing to many variables is also straightforward.

    So we CAN do calculus without limits, epsilons and deltas can be learned towards the end of the course, or in a separate course of introductory analysis, and they become more understandable after the uniform approach is digested. So it looks like we have a fantastic opportunity to truly simplify calculus and make it more accessible to the public. What are we waiting for? Maybe a stamp of approval from “pure” mathematicians?