Chapter 4
Differential Calculus and Its Uses





4.5 The Chain Rule

4.5.3 Differentiating a Function-of-a-Function: The Chain Rule

Now that we have a formula for differentiating the square root function, we need to extend that to a formula for the derivative of `sqrt(u)`, where `u` is itself a function of `x`. Actually, square root has little to do with this step; this is a problem that we will encounter over and over with many different functions. If we write `y=sqrt(u)`, then our problem takes this form: `y` is a function of `u`, and `u` is a function of `x`. Therefore, `y` is a function of `x` also. How do we find `dy/dx` when we know `dy/du` and `du/dx`?

Let's relate this question to something we have done before. In Chapter 2 we saw that, for any constant `k`,

$$\frac{d}{dx}\,e^{kx} = k\,e^{kx}.$$

If we set `u=kx` and `y=e^u`, then this calculation takes the form

$$
\begin{aligned}
\frac{dy}{dx} &= e^{kx}\,k \\
&= e^{u}\,\frac{du}{dx} \\
&= \frac{dy}{du}\,\frac{du}{dx}.
\end{aligned}
$$

We show next that the formula

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}$$

holds for any function, not just the exponential function, and for any dependence of `u` on `x`. In words, this says that the rate of change of `y` as a function of `x` is the rate of change of `y` as a function of `u` times the rate of change of `u` as a function of `x`.

Suppose we fix a number `x` at which we want to know `dy/dx`, and we compute an approximating difference quotient for a small increment `Delta x`. We write simply `u` for the value of `u` at `x` and `u+Delta u` for the value at `x+Delta x`. That is, `Delta u` is the corresponding increment in the intermediate variable. Similarly, we write `y` for the value of the outer variable at `x` and `y+Delta y` for the value at `x+Delta x`, so `Delta y` is the corresponding increment in the outer variable. Then `dy/dx` is approximated by `Delta y/Delta x`, and simple algebra tells us that

$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u}\,\frac{\Delta u}{\Delta x}.$$

We may not know yet what `du` is — or why it appears to cancel in the derivative formula — but `Delta u` is an ordinary numerical quantity, so, whenever it is not zero, it is subject to the algebraic cancellation law.

The two factors `Delta y/Delta u` and `Delta u/Delta x` in the last equation approximate, respectively, the rate of change of `y` with respect to `u` and the rate of change of `u` with respect to `x`. Because `u` varies continuously with `x`, the increment `Delta u` shrinks to zero along with `Delta x`. Thus the approximations to instantaneous rates of change all get better as the increment in `x` shrinks to zero, so when we take limiting values of all three quotients, we find

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx},$$

as predicted.
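To see this argument numerically, here is a minimal Python sketch. It returns to the earlier example `u=kx`, `y=e^u`. The choices `k=3` and `x=0.5`, and the helper name `difference_quotients`, are our own assumptions for illustration and do not come from the text.

```python
import math

def difference_quotients(f, g, x, dx):
    """Return (Dy/Dx, Dy/Du, Du/Dx) for y = f(u), u = g(x), with increment dx."""
    u = g(x)
    du = g(x + dx) - u          # increment in the intermediate variable
    dy = f(u + du) - f(u)       # increment in the outer variable
    return dy / dx, dy / du, du / dx

K = 3.0    # sample constant (an assumption for illustration)
X0 = 0.5   # sample point (an assumption for illustration)

def g(x):
    return K * x                # u = kx

f = math.exp                    # y = e^u

exact = K * math.exp(K * X0)    # k e^(kx), the known derivative

for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    dy_dx, dy_du, du_dx = difference_quotients(f, g, X0, dx)
    print(f"dx={dx:g}  Dy/Dx={dy_dx:.6f}  "
          f"(Dy/Du)(Du/Dx)={dy_du * du_dx:.6f}  exact={exact:.6f}")
```

For each increment, the product of the two quotients agrees with `Delta y/Delta x` up to rounding, and both columns approach the exact value `k e^(kx)` as the increment shrinks.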

Simple as it is, this equation is perhaps the most important formula of differential calculus, because so many other formulas and calculations depend on it. Important results have names, and the name of this one is the Chain Rule. It is called that because it tells us how to differentiate chains of functions, i.e., how to find `dy/dx` when `y` is a function of `u` and `u` is a function of `x`.


The Chain Rule   If `y` is a function of `u` and `u` is a function of `x`, then

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}.$$

In functional notation, if `u=g(x)` and `y=f(u)=f(g(x))`, then

$$\frac{d}{dx}\,f(g(x)) = f'(g(x))\,g'(x).$$
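The functional form of the rule translates almost word for word into code. The following Python sketch is only an illustration; the helper name `chain_rule` and the sample value `k=2` are assumptions of ours. It builds the derivative of `f(g(x))` from `f'`, `g`, and `g'`, and it tries the result on the exponential example `f(u)=e^u`, `g(x)=kx`.

```python
import math

def chain_rule(f_prime, g, g_prime):
    """Return x -> f'(g(x)) * g'(x), the derivative of x -> f(g(x))."""
    def derivative(x):
        # Evaluate f' at the inner value g(x), then multiply by g'(x).
        return f_prime(g(x)) * g_prime(x)
    return derivative

K = 2.0  # sample constant, chosen only for illustration

# f(u) = e^u is its own derivative; g(x) = K*x has derivative K.
d_exp_kx = chain_rule(math.exp, lambda x: K * x, lambda x: K)

x = 0.3
print(d_exp_kx(x), K * math.exp(K * x))  # the two values should agree
```

Note that `f_prime` is evaluated at `g(x)`, not at `x`. That is the step that is easiest to forget when working by hand.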

Example 2

Calculate the derivative of `sqrt(p^2+x^2)`, where `p` is a constant.

Solution   As we did with the Product Rule, we solve the problem in both functional and variable notation, again to make the point that you don't have to do it both ways. The two calculations do, however, proceed slightly differently in terms of what you have to think about and when.

Our function `sqrt(p^2+x^2)` is a composite of the square root function, say, `f(u)=sqrt(u)`, and a polynomial function, say, `g(x)=p^2+x^2`. The derivatives of these functions are, respectively, `f'(u)=1/(2 sqrt(u))` and `g'(x)=2x`. The Chain Rule tells us to evaluate `f'` at `g(x)` and multiply the result by `g'(x)`:

$$\frac{d}{dx}\,f(g(x)) = f'(g(x))\,g'(x) = \frac{1}{2\sqrt{p^2+x^2}}\cdot 2x = \frac{x}{\sqrt{p^2+x^2}}.$$

If we set `u=p^2+x^2` and `y=sqrt(u)=sqrt(p^2+x^2)`, then `du/dx=2x`, and `dy/du=1/(2 sqrt(u))`, so

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = \frac{1}{2\sqrt{u}}\cdot 2x = \frac{x}{\sqrt{p^2+x^2}}.$$

There is not a great deal of difference between these two ways to solve the problem except in terms of when you have to think about the fact that `f'` must be evaluated at `u=g(x)`. The Chain Rule appears to be simpler in variable notation, but the simpler notation disguises the fact that the answer has to be in terms of the sole independent variable, `x`. Thus, the calculation is not finished until you replace the “intermediate” variable `u` by its equivalent as a function of `x`.
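As a numerical check on Example 2, the following Python sketch evaluates the derivative in three ways: by the variable-notation product, by the final closed form, and by a central difference quotient. The values `p=3` and `x=4`, the step `h`, and the function names are assumptions chosen for illustration.

```python
import math

def derivative_variable_notation(x, p):
    """Compute dy/dx = (dy/du)(du/dx) for y = sqrt(u), u = p^2 + x^2."""
    u = p**2 + x**2                      # replace u by its value in terms of x
    dy_du = 1.0 / (2.0 * math.sqrt(u))
    du_dx = 2.0 * x
    return dy_du * du_dx

def derivative_closed_form(x, p):
    """The simplified answer x / sqrt(p^2 + x^2)."""
    return x / math.sqrt(p**2 + x**2)

def central_difference(x, p, h=1e-6):
    """Approximate dy/dx directly from y = sqrt(p^2 + x^2)."""
    def y(t):
        return math.sqrt(p**2 + t**2)
    return (y(x + h) - y(x - h)) / (2.0 * h)

P, X = 3.0, 4.0  # sample values, so that sqrt(p^2 + x^2) = 5
print(derivative_variable_notation(X, P))
print(derivative_closed_form(X, P))
print(central_difference(X, P))
```

With these sample values the exact derivative is 4/5, and all three computations should agree with it to within the accuracy of the difference quotient.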

