2.1 The Second Derivative Test


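I shall work only with functions

\[ f : \mathbb{R}^2 \to \mathbb{R}, \qquad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto f\begin{bmatrix} x \\ y \end{bmatrix} \]

[eg. $f\begin{bmatrix} x \\ y \end{bmatrix} = z = xy - x^5/5 - y^3/3 + 4$ ]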

The graph of such a function is always a surface; see figure 2.1.

Figure 2.1: Graph of a function from $\mathbb{R}^2$ to $\mathbb{R}$


10 CHAPTER 2. OPTIMISATION

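The first derivative is

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] \]

a $1 \times 2$ matrix which is, at a point $\begin{bmatrix} a \\ b \end{bmatrix}$, just a pair of numbers.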

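[eg. For $f\begin{bmatrix} x \\ y \end{bmatrix} = z = xy - x^5/5 - y^3/3 + 4$ we have

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] = \left[ y - x^4,\; x - y^2 \right] \]

and, at the point $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$,

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\begin{bmatrix} 2 \\ 3 \end{bmatrix}} = [3 - 16,\; 2 - 9] = [-13, -7] \]

]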

This matrix should be thought of as a linear map from $\mathbb{R}^2$ to $\mathbb{R}$:

\[ [-13, -7] \begin{bmatrix} x \\ y \end{bmatrix} = -13x - 7y \]

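It is the linear part of an affine map from $\mathbb{R}^2$ to $\mathbb{R}$

\[ z = [-13, -7] \begin{bmatrix} x - 2 \\ y - 3 \end{bmatrix} + \left( (2)(3) - \frac{2^5}{5} - \frac{3^3}{3} + 4 \right) \]

where the bracketed constant is $f\begin{bmatrix} 2 \\ 3 \end{bmatrix} = -5.4$.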

This is just the two dimensional version of $y = mx + c$, and its graph is a plane which is tangent to the graph of $f\begin{bmatrix} x \\ y \end{bmatrix} = xy - x^5/5 - y^3/3 + 4$ at the point $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.

So this generalises the familiar case of $y = mx + c$ being tangent to $y = f(x)$ at a point, with $m$ being the derivative at that point, as in figure 2.2.
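The derivative and tangent plane computed above can be checked numerically. What follows is a minimal sketch in Python, not part of the original notes: the helper names `gradient` and `tangent_plane` and the step size `h` are assumptions made for illustration, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
def f(x, y):
    """The running example f(x, y) = xy - x^5/5 - y^3/3 + 4."""
    return x * y - x**5 / 5 - y**3 / 3 + 4


def gradient(func, x, y, h=1e-5):
    """Approximate [df/dx, df/dy] at (x, y) by central differences (h is an assumed step)."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy


def tangent_plane(x, y):
    """The affine map from the text: z = [-13, -7] [x - 2, y - 3]^T + f(2, 3)."""
    return -13 * (x - 2) - 7 * (y - 3) + f(2, 3)


gx, gy = gradient(f, 2.0, 3.0)
print("approximate derivative at (2, 3):", round(gx, 4), round(gy, 4))

# If the plane really is tangent, the gap between surface and plane
# should shrink roughly like the square of the step away from (2, 3).
for step in (0.1, 0.01, 0.001):
    gap = abs(f(2 + step, 3 + step) - tangent_plane(2 + step, 3 + step))
    print("step", step, "gap", gap)
```

The printed derivative should agree with $[-13, -7]$ to several decimal places, and each tenfold reduction in the step should cut the gap by roughly a factor of a hundred.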

To find a critical point of this function, that is, a maximum, minimum or saddle point, we want the tangent plane to be horizontal. Hence:


Figure 2.2: Graph of a function from $\mathbb{R}$ to $\mathbb{R}$

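Definition 2.1. If $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable and $\begin{bmatrix} a \\ b \end{bmatrix}$ is a critical point of $f$ then

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\begin{bmatrix} a \\ b \end{bmatrix}} = [0, 0]. \]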

Remark 2.1.1. I deal with maps $f : \mathbb{R}^n \to \mathbb{R}$ when $n = 2$, but generalising to larger $n$ is quite trivial. We would have that $f$ is differentiable at $a \in \mathbb{R}^n$ if and only if there is a unique affine (linear plus a shift) map from $\mathbb{R}^n$ to $\mathbb{R}$ tangent to $f$ at $a$. The linear part of this then has a (row) matrix representing it

\[ \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \cdots, \frac{\partial f}{\partial x_n} \right]_{x = a} \]

Remark 2.1.2. We would like to carry the old second derivative test through

from one dimension to two (at least) to distinguish between maxima, minima

and saddle points. This remark will make more sense if you play with

the DEMO’s program on the Graphing Calculator and plot the five cases

mentioned in the introduction.

I sure hope you can draw the surfaces $x^2 + y^2$ and $x^2 - y^2$, because if not you are DEAD MEAT.


Definition 2.2. A quadratic form on $\mathbb{R}^2$ is a function $f : \mathbb{R}^2 \to \mathbb{R}$ which is a sum of real multiples of terms $x^p y^q$, where $p, q \in \mathbb{N}$ (the natural numbers: 0, 1, 2, 3, ...) and $p + q \le 2$, and at least one term has $p + q = 2$.

Definition 2.3. [alternative 1:] A quadratic form on $\mathbb{R}^2$ is a function

\[ f : \mathbb{R}^2 \to \mathbb{R} \]

which can be written

\[ f\begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + bxy + cy^2 + dx + ey + g \]

for some numbers $a, b, c, d, e, g$, where not all of $a, b, c$ are zero.

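Definition 2.4. [alternative 2:] A quadratic form on $\mathbb{R}^2$ is a function

\[ f : \mathbb{R}^2 \to \mathbb{R} \]

which can be written

\[ f\begin{bmatrix} x \\ y \end{bmatrix} = [x - \alpha,\; y - \beta] \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x - \alpha \\ y - \beta \end{bmatrix} + c \]

for real numbers $\alpha$, $\beta$, $c$, $a_{ij}$, $1 \le i, j \le 2$, and with $a_{12} = a_{21}$.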

Remark 2.1.3. You might want to check that all three of these definitions are equivalent. Notice that a quadratic form in this sense is just a polynomial function of degree two in two variables.
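One direction of this check can be sketched by multiplying out the matrix expression of Definition 2.4. The names $\alpha$ and $\beta$ for the two shifts are assumed here, and the symmetry condition $a_{12} = a_{21}$ is used to combine the cross terms:

```latex
\begin{align*}
[x - \alpha,\; y - \beta]
\begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix}
\begin{bmatrix} x - \alpha \\ y - \beta \end{bmatrix} + c
&= a_{11}(x - \alpha)^2 + 2a_{12}(x - \alpha)(y - \beta) + a_{22}(y - \beta)^2 + c \\
&= a_{11}x^2 + 2a_{12}xy + a_{22}y^2 \\
&\quad - 2(a_{11}\alpha + a_{12}\beta)\,x - 2(a_{12}\alpha + a_{22}\beta)\,y \\
&\quad + \left( a_{11}\alpha^2 + 2a_{12}\alpha\beta + a_{22}\beta^2 + c \right)
\end{align*}
```

This has the shape required by Definition 2.3: the coefficients of $x^2$, $xy$ and $y^2$ are $a_{11}$, $2a_{12}$ and $a_{22}$, and the remaining terms are linear or constant. Going the other way means solving two linear equations for $\alpha$ and $\beta$, which can be done whenever $a_{11}a_{22} - a_{12}^2 \neq 0$.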

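Definition 2.5. If $f : \mathbb{R}^2 \to \mathbb{R}$ is twice differentiable at $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$, the second derivative is the matrix in the quadratic form

\[ [x - a,\; y - b] \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix} \]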

Remark 2.1.4. When the first derivative is zero, it is the “best fitting quadratic” to $f$ at $\begin{bmatrix} a \\ b \end{bmatrix}$, although we need to add in a constant to lift it up so that it is “more than tangent” to the surface which is the graph of $f$ at $\begin{bmatrix} a \\ b \end{bmatrix}$. You met this in first semester in Taylor’s theorem for functions of two variables.
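For reference, the second order Taylor approximation being alluded to can be written out as follows. This is the standard statement, given as a reminder rather than taken from these notes; it assumes the first derivative vanishes at the point, and the constant that lifts the quadratic up is $f$ evaluated there:

```latex
\[
f\begin{bmatrix} x \\ y \end{bmatrix} \approx
f\begin{bmatrix} a \\ b \end{bmatrix}
+ \frac{1}{2}\,[x - a,\; y - b]
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2ex]
\dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2}
\end{bmatrix}_{\begin{bmatrix} a \\ b \end{bmatrix}}
\begin{bmatrix} x - a \\ y - b \end{bmatrix}
\]
```

The factor of $\frac{1}{2}$ rescales the quadratic form but does not change its shape, nor the signs of the determinant and diagonal entries of the matrix.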


Theorem 2.1. If the determinant of the second derivative is positive at $\begin{bmatrix} a \\ b \end{bmatrix}$ for a twice continuously differentiable function $f$ having first derivative zero at $\begin{bmatrix} a \\ b \end{bmatrix}$, then in a neighbourhood of $\begin{bmatrix} a \\ b \end{bmatrix}$, $f$ has either a maximum or a minimum, whereas if the determinant is negative then in a neighbourhood of $\begin{bmatrix} a \\ b \end{bmatrix}$, $f$ has a saddle point. If the determinant is zero, the test is uninformative.

“Proof” by arm-waving: We have that if the first derivative is zero, the second derivative at $\begin{bmatrix} a \\ b \end{bmatrix}$ of $f$ is the approximating quadratic form from Taylor’s theorem, so we can work with this (second order) approximation to $f$ in order to decide what shape (approximately) the graph of $f$ has.

The quadratic approximation is determined by a symmetric matrix, and all the information about the shape of the surface is contained in it. Because it is symmetric, it can be diagonalised by an orthogonal matrix (ie we can rotate the surface until the quadratic form matrix is just

\[ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \]

where $a$ and $b$ now denote the diagonal entries, not the coordinates of the critical point). We can now rescale the new $x$ and $y$ axes by dividing the $x$ by $\sqrt{|a|}$ and the $y$ by $\sqrt{|b|}$, assuming neither is zero. This won’t change the shape of the surface in any essential way.

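This means all quadratic forms with non-zero determinant are, up to shifting, rotating and stretching,

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \text{ or } \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \text{ or } \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \text{ or } \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}, \]

ie. $x^2 + y^2$, $-x^2 - y^2$, $x^2 - y^2$, $-x^2 + y^2$, since

\[ [x, y] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = x^2 + y^2 \]

et cetera. We do not have to actually do the diagonalisation. We simply note that the determinant in the first two cases is positive and in the last two is negative, and that the determinant is not changed by rotations, nor is the sign of the determinant changed by scalings. $\Box$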
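The two invariance claims used in this argument can be tried out numerically. The following is a small sketch in plain Python, under the assumption that rotating the axes by an angle $\theta$ replaces the matrix $M$ of the form by $R^T M R$, and that stretching the axes by factors $p$ and $q$ replaces it by $S M S$ with $S$ diagonal. The function names and the sample matrix are chosen here for illustration only.

```python
import math


def rotate_form(m, theta):
    """Return R^T M R for a 2x2 matrix m, where R is rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]
    # First compute M R, then R^T (M R), using plain nested lists.
    mr = [[sum(m[i][k] * r[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(r[k][i] * mr[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rescale_form(m, p, q):
    """Return S M S for S = diag(p, q): the form after stretching the axes."""
    s = [p, q]
    return [[s[i] * m[i][j] * s[j] for j in range(2)] for i in range(2)]


def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


example = [[0.0, 1.0], [1.0, -2.0]]  # an arbitrary symmetric matrix

for theta in (0.3, 1.0, 2.5):
    print("rotation", theta, "det", round(det(rotate_form(example, theta)), 10))

# Stretching multiplies the determinant by (p*q)^2, so its sign survives.
print("rescaled det", det(rescale_form(example, 3.0, 0.5)))
```

Every rotated determinant should agree with the determinant of `example` up to rounding, and the rescaled determinant should keep the same sign although its size changes.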

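Proposition 2.1.1. If $f : \mathbb{R}^2 \to \mathbb{R}$ is a function which has

\[ Df\begin{bmatrix} a \\ b \end{bmatrix} = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\begin{bmatrix} a \\ b \end{bmatrix}} \]

zero, and if

\[ D^2 f\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}_{\begin{bmatrix} a \\ b \end{bmatrix}} \]

is continuous on a neighbourhood of $\begin{bmatrix} a \\ b \end{bmatrix}$, and if $\det D^2 f\begin{bmatrix} a \\ b \end{bmatrix} > 0$, then if

\[ \left. \frac{\partial^2 f}{\partial x^2} \right|_{\begin{bmatrix} a \\ b \end{bmatrix}} > 0 \]

$f$ has a local minimum at $\begin{bmatrix} a \\ b \end{bmatrix}$, and if

\[ \left. \frac{\partial^2 f}{\partial x^2} \right|_{\begin{bmatrix} a \\ b \end{bmatrix}} < 0 \]

$f$ has a local maximum at $\begin{bmatrix} a \\ b \end{bmatrix}$.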

“Proof” The trace of a matrix is the sum of the diagonal terms and this is also unchanged by rotations, and the sign of it is unchanged by scalings. So again we reduce to the four possible basic quadratic forms

\[ \begin{array}{cccc} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} & \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} & \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} & \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \\[2ex] x^2 + y^2 & -x^2 - y^2 & x^2 - y^2 & -x^2 + y^2 \end{array} \]

and the trace distinguishes between the first two, being positive at a minimum and negative at a maximum. Since the two diagonal terms have the same sign we need only look at the sign of the first. $\Box$
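Taken together, Theorem 2.1 and Proposition 2.1.1 amount to a short decision procedure. Here is a minimal sketch of it in Python; the function name `classify` and the returned strings are choices made for this illustration, and the three arguments are assumed to be the entries of the symmetric second derivative matrix evaluated at a point where the first derivative is already known to be zero.

```python
def classify(fxx, fxy, fyy):
    """Classify a critical point from the entries of the symmetric matrix D^2 f.

    fxx, fxy, fyy are the second partial derivatives at the critical point.
    """
    determinant = fxx * fyy - fxy * fxy
    if determinant < 0:
        return "saddle point"
    if determinant > 0:
        # A positive determinant forces fxx and fyy to share a sign,
        # so looking at fxx alone is enough.
        return "local minimum" if fxx > 0 else "local maximum"
    return "test is uninformative"


# Second derivatives at the origin of the basic surfaces from the proof,
# plus x^4 + y^4 as a case where the matrix is zero.
cases = [
    ("x^2 + y^2", (2, 0, 2)),
    ("-x^2 - y^2", (-2, 0, -2)),
    ("x^2 - y^2", (2, 0, -2)),
    ("x^4 + y^4", (0, 0, 0)),
]
for name, entries in cases:
    print(name, "->", classify(*entries))
```

The last case shows why the zero determinant branch is needed: $x^4 + y^4$ plainly has a minimum at the origin, but the second derivative there is the zero matrix and the test cannot see it.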

Example 2.1.1. Find and classify all critical points of

\[ f\begin{bmatrix} x \\ y \end{bmatrix} = xy - x^4 - y^2 + 2. \]


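Solution

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] = \left[ y - 4x^3,\; x - 2y \right] \]

At a critical point, this is the zero matrix $[0, 0]$, so $y = 4x^3$ and $y = \frac{1}{2}x$, so

\[ \tfrac{1}{2}x = 4x^3 \;\Rightarrow\; x = 0 \text{ or } x^2 = \tfrac{1}{8} \]

so $x = 0$ or $x = \frac{1}{\sqrt{8}}$ or $x = \frac{-1}{\sqrt{8}}$, when $y = 0$, $y = \frac{1}{2\sqrt{8}}$, $y = \frac{-1}{2\sqrt{8}}$ respectively, and there are three critical points

\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} \frac{1}{\sqrt{8}} \\[1ex] \frac{1}{2\sqrt{8}} \end{bmatrix}, \qquad \begin{bmatrix} \frac{-1}{\sqrt{8}} \\[1ex] \frac{-1}{2\sqrt{8}} \end{bmatrix}. \]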

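\[ D^2 f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} = \begin{bmatrix} -12x^2 & 1 \\ 1 & -2 \end{bmatrix} \]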

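\[ D^2 f\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & -2 \end{bmatrix} \]

and $\det = -1$, so $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ is a saddle point.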

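\[ D^2 f\begin{bmatrix} \frac{1}{\sqrt{8}} \\[1ex] \frac{1}{2\sqrt{8}} \end{bmatrix} = \begin{bmatrix} -\frac{12}{8} & 1 \\ 1 & -2 \end{bmatrix} \]

and $\det = 3 - 1 = 2$, so the point is either a maximum or a minimum.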

$D^2 f$ at $\begin{bmatrix} \frac{-1}{\sqrt{8}} \\[1ex] \frac{-1}{2\sqrt{8}} \end{bmatrix}$ is the same. Since the trace is $-3\frac{1}{2}$, both are maxima. $\Box$

Remark 2.1.5. Only a wild optimist would believe I have got this all correct

without making a slip somewhere. So I recommend strongly that you try

checking it on a computer with Mathematica, or by using a graphics calculator

(or the software on the Mac).
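In the spirit of this remark, here is one way such a check might look in plain Python rather than Mathematica. It is a sketch only: the derivative formulas are typed in by hand from the solution above, so it confirms the arithmetic rather than deriving anything independently, and the names `grad`, `second_derivative` and `results` are made up for this illustration.

```python
import math


def f(x, y):
    """The function from Example 2.1.1."""
    return x * y - x**4 - y**2 + 2


def grad(x, y):
    """First derivative [y - 4x^3, x - 2y], copied from the solution."""
    return (y - 4 * x**3, x - 2 * y)


def second_derivative(x, y):
    """Second derivative matrix [[-12x^2, 1], [1, -2]], copied from the solution."""
    return [[-12 * x**2, 1.0], [1.0, -2.0]]


r = 1 / math.sqrt(8)
critical_points = [(0.0, 0.0), (r, r / 2), (-r, -r / 2)]

results = []  # (determinant, trace) for each critical point, in order
for x, y in critical_points:
    gx, gy = grad(x, y)
    m = second_derivative(x, y)
    determinant = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    trace = m[0][0] + m[1][1]
    results.append((determinant, trace))
    print("point", (round(x, 4), round(y, 4)),
          "first derivative", (round(gx, 12), round(gy, 12)),
          "det", round(determinant, 6), "trace", round(trace, 6))

# At the origin the surface rises along the line y = x/2 and falls along x = 0,
# which is the behaviour expected of a saddle.
print(f(0.1, 0.05) > f(0.0, 0.0) > f(0.0, 0.1))
```

The first derivative should print as zero (up to rounding) at all three points, with the determinants and traces matching the values found by hand.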
