Section 4-5-3

Chapter 4
Differential Calculus and Its Uses

4.5 The Chain Rule

4.5.3 Differentiating a Function-of-a-Function:
The Chain Rule

Now that we have a formula for differentiating the square root function, we need to extend it to a formula for the derivative of `sqrt(u)`, where `u` is itself a function of `x`. Actually, the square root has little to do with this step; this is a problem that we will encounter over and over with many different functions. If we write `y=sqrt(u)`, then our problem takes this form: `y` is a function of `u`, and `u` is a function of `x`. Therefore, `y` is a function of `x` also. How do we find `dy/dx` when we know `dy/du` and `du/dx`?

Let's relate this question to something we have done before. In Chapter 2 we saw that, for any constant `k`,

\frac{d}{d x} e^{k x} = k e^{k x} .

If we set `u=kx` and `y=e^u`, then this calculation takes the form

\begin{aligned}
\frac{d y}{d x} &= e^{k x} \cdot k \\
&= e^{u} \frac{d u}{d x} \\
&= \frac{d y}{d u} \frac{d u}{d x} .
\end{aligned}
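This special case can be checked numerically. The following is a minimal sketch, assuming a symmetric difference quotient with step `h = 1e-6` as the approximation to the derivative; the helper name `central_difference` and the sample values `k = 3` and `x = 0.5` are illustrative choices, not part of the text.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

k = 3.0   # illustrative constant
x = 0.5   # illustrative point at which to differentiate

def y_of_x(t):
    """y as a function of x directly: y = e^(k x)."""
    return math.exp(k * t)

numeric = central_difference(y_of_x, x)

# The two factors from the calculation above, with u = k x and y = e^u.
dy_du = math.exp(k * x)   # dy/du = e^u, evaluated at u = k x
du_dx = k                 # du/dx = k
product = dy_du * du_dx

print(f"difference quotient: {numeric:.8f}")
print(f"(dy/du)(du/dx):      {product:.8f}")
```

The two printed values should agree to several decimal places, which is what the product form of the derivative predicts for this function.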

We show next that the formula

\frac{d y}{d x} = \frac{d y}{d u} \frac{d u}{d x}

holds for any function, not just the exponential function, and for any dependence of `u` on `x`. In words, this says that the rate of change of `y` as a function of `x` is the rate of change of `y` as a function of `u` times the rate of change of `u` as a function of `x`.

Suppose we fix a number `x` at which we want to know `dy/dx`, and we compute an approximating difference quotient for a small increment `Delta x`. We write simply `u` for the value of `u` at `x` and `u+Delta u` for the value at `x+Delta x`. That is, `Delta u` is the corresponding increment in the intermediate variable. Similarly, we write `y` for the value of the outer variable at `x` and `y+Delta y` for the value at `x+Delta x`, so `Delta y` is the corresponding increment in the outer variable. Then `dy/dx` is approximated by `Delta y/Delta x`, and simple algebra tells us that

\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \frac{\Delta u}{\Delta x} .

We may not know yet what `du` is — or why it appears to cancel in the derivative formula — but `Delta u` is an ordinary numerical quantity, so, whenever it is not zero, it is subject to the algebraic cancellation law.

The two factors `Delta y/Delta u` and `Delta u/Delta x` in the last equation approximate, respectively, the rate of change of `y` with respect to `u` and the rate of change of `u` with respect to `x`. Furthermore, as the increment in `x` shrinks to zero, so does the increment in `u`, and the approximations to instantaneous rates of change all get better. Thus, when we take limiting values of all three quotients, we find

\frac{d y}{d x} = \frac{d y}{d u} \frac{d u}{d x}

as predicted.
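The limiting argument above can be watched numerically. The sketch below assumes an illustrative pair of functions that is not taken from the text, `u = x^2 + 1` and `y = sqrt(u)`, evaluated at `x = 2`. For each increment it computes `Delta y/Delta x` and the product `(Delta y/Delta u)(Delta u/Delta x)`, then compares both with the value of `(dy/du)(du/dx)`.

```python
import math

def g(x):
    """Inner function u = g(x); an illustrative choice."""
    return x * x + 1.0

def f(u):
    """Outer function y = f(u)."""
    return math.sqrt(u)

x = 2.0
u = g(x)
y = f(u)

# Limit predicted by the derivation: (dy/du) * (du/dx) at this x.
exact = (1.0 / (2.0 * math.sqrt(u))) * (2.0 * x)

rows = []
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    du = g(x + dx) - u          # increment in the intermediate variable
    dy = f(u + du) - y          # increment in the outer variable
    lhs = dy / dx               # Delta y / Delta x
    rhs = (dy / du) * (du / dx) # product of the two quotients (du is nonzero here)
    rows.append((dx, lhs, rhs))
    print(f"dx={dx:g}  dy/dx={lhs:.8f}  (dy/du)(du/dx)={rhs:.8f}")

print(f"predicted limit: {exact:.8f}")
```

For every increment the two quotient expressions agree up to rounding, because the cancellation is plain algebra. Only the comparison with the predicted limit improves as `Delta x` shrinks.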

Simple as it is, this equation is perhaps the most important formula of differential calculus, because so many other formulas and calculations depend on it. Important results have names; the name of this one is the Chain Rule. It is called that because it tells us how to differentiate chains of functions, i.e., how to find `dy/dx` when `y` is a function of `u` and `u` is a function of `x`.

The Chain Rule If `y` is a function of `u` and `u` is a function of `x`, then

\frac{d y}{d x} = \frac{d y}{d u} \frac{d u}{d x} .

In functional notation, if `u=g(x)` and `y=f(u)=f(g(x))`, then

\frac{d}{d x} f(g(x)) = f'(g(x)) \, g'(x) .
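The functional form of the rule translates almost word for word into code. Here is a minimal sketch, with a hypothetical helper `chain_rule_derivative` and an illustrative pair of functions not taken from the text: `f(u) = u^3` and `g(x) = x^2 + 1`, whose derivatives are assumed from the power rule. The composite expands to `x^6 + 3x^4 + 3x^2 + 1`, so the result can be compared with the derivative of the expanded polynomial.

```python
def chain_rule_derivative(f_prime, g, g_prime):
    """Return the function x -> f'(g(x)) * g'(x)."""
    def derivative(x):
        u = g(x)                        # evaluate the inner function first
        return f_prime(u) * g_prime(x)  # f' is evaluated at u = g(x), not at x
    return derivative

# Illustrative functions (assumptions, not from the text).
def g(x):
    return x**2 + 1          # inner function

def g_prime(x):
    return 2 * x             # derivative of the inner function

def f_prime(u):
    return 3 * u**2          # derivative of the outer function f(u) = u^3

composite_prime = chain_rule_derivative(f_prime, g, g_prime)

def expanded_prime(x):
    """Derivative of x^6 + 3x^4 + 3x^2 + 1, differentiated term by term."""
    return 6 * x**5 + 12 * x**3 + 6 * x

for x in (0, 1, 2, -3):
    print(x, composite_prime(x), expanded_prime(x))
```

The point of the helper is the order of evaluation: `f_prime` receives `g(x)`, while `g_prime` receives `x` itself.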

Example 2

Calculate the derivative of `sqrt(p^2+x^2)`, where `p` is a constant.

Solution As we did with the Product Rule, we solve the problem in both functional and variable notation, again to make the point that you do not have to do both. The two calculations do proceed slightly differently, however, in terms of what you have to think about and when.

Our function `sqrt(p^2+x^2)` is a composite of the square root function, say, `f(u)=sqrt(u)`, and a polynomial function, say, `g(x)=p^2+x^2`. The derivatives of these functions are, respectively, `f'(u)=1/(2 sqrt(u))` and `g'(x)=2x`. The Chain Rule tells us to evaluate `f'` at `g(x)` and multiply the result by `g'(x)`:

\frac{d}{d x} f(g(x)) = f'(g(x)) \, g'(x) = \frac{1}{2 \sqrt{p^{2} + x^{2}}} \cdot 2 x = \frac{x}{\sqrt{p^{2} + x^{2}}} .

If we set `u=p^2+x^2` and `y=sqrt(u)=sqrt(p^2+x^2)`, then `du/dx=2x`, and `dy/du=1/(2 sqrt(u))`, so

\frac{d y}{d x} = \frac{d y}{d u} \frac{d u}{d x} = \frac{1}{2 \sqrt{u}} \cdot 2 x = \frac{x}{\sqrt{p^{2} + x^{2}}} .

There is not a great deal of difference between these two ways to solve the problem except in terms of when you have to think about the fact that `f'` must be evaluated at `u=g(x)`. The Chain Rule appears to be simpler in variable notation, but the simpler notation disguises the fact that the answer has to be in terms of the sole independent variable, `x`. Thus, the calculation is not finished until you replace the “intermediate” variable `u` by its equivalent as a function of `x`.
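The result of Example 2 can be sanity-checked against a difference quotient. This is a minimal sketch, assuming the illustrative constant `p = 3` and a symmetric difference quotient with step `h = 1e-6`; the function names are hypothetical and chosen only for this check.

```python
import math

def derivative_formula(x, p):
    """Closed form from Example 2: x / sqrt(p^2 + x^2)."""
    return x / math.sqrt(p * p + x * x)

def difference_quotient(x, p, h=1e-6):
    """Symmetric difference quotient for sqrt(p^2 + x^2) at x."""
    def func(t):
        return math.sqrt(p * p + t * t)
    return (func(x + h) - func(x - h)) / (2 * h)

p = 3.0  # illustrative value of the constant
checks = []
for x in (-4.0, 0.0, 1.5, 4.0):
    formula = derivative_formula(x, p)
    numeric = difference_quotient(x, p)
    checks.append((x, formula, numeric))
    print(f"x={x:5.1f}  formula={formula:.8f}  difference quotient={numeric:.8f}")
```

With `p = 3` and `x = 4`, the formula gives `4/sqrt(9+16) = 4/5`, a value that is easy to confirm by hand.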

