Taylor Theorem

Introduction

Taylor theorem is widely used for the approximation of a k-times differentiable function around a given point by a polynomial of degree k, called the kth-order Taylor polynomial. In addition, it is also useful for proving some of the convex function properties.

In this blog post, I would like to discuss and prove the univariate Taylor theorem followed by touching some of the basic expressions for the commonly used quadratic multivariate Taylor theorem.

Prerequisites

The following theorem and lemmas are prerequisite knowledge for proving the univariate Taylor theorem.

Rolle’s Theorem

Rolle’s theorem is a special case of mean value theorem.

If a real-valued function f is continuous on a proper closed interval [a,b], differentiable on the open interval (a,b), and f(a)=f(b), then there exists at least one c in the open interval (a,b) such that f(c)=0.

Lemma 1

Let f:RR be 2 times differentiable on the open interval with f and f continuous on the closed interval I between a and x0 (a function of class C2 on I). If f(a)=0, and x0 is a point such that f(x0)=f(a) and x0a, then there exists t(0,1) such that f(a+t(x0a))=0.

Proof

Because f(x0)=f(a) and x0a, according to Rolle’s Theorem, there exists t1(0,1) such that f(a+t1(x0a))=0. Now, because f(a)=f(a+t1(x0a)) and a+t1(x0a)a, again, according to Rolle’s Theorem, there exists t2(0,1) such that f(a+t2(a+t1(x0a)a))=0.

f(a+t2(a+t1(x0a)a))=f(a+t2t1(x0a))=f(a+t(x0a))=0

where t=t1t2.

This concludes the proof.

Lemma 2

Let f of class Ck on the closed interval I between a and x0. If f(a)=f(a)==f(k1)(a)=0, and x0 is a point such that f(x0)=f(a) and x0a, then there exists t(0,1) such that f(k)(a+t(x0a))=0.

Proof

Because f(x0)=f(a), x0a, and f(a)=0, according to Lemma 1, there exists t1(0,1) such that f(a+t1(x0a))=0. Because f(a+t1(x0a))=f(a), a+t1(x0a)a, and f(a)=0, according to Lemma 1, there exists t2(0,1) such that f(3)(a+t2(a+t1(x0a)a))=0. We rearrange the terms, so there must exists t2(0,1) such that f(3)(a+t2(x0a))=0, where t2=t1t2.

Iterating this process, there exists tk1(0,1) such that f(k)(a+tk1(x0a))=0.

This concludes the proof.

Univariate Taylor Theorem

General Univariate Taylor Theorem

Let k1 be an integer and let the function f:RR be k times differentiable at the point aR. Then there exists a function hk:RR such that

f(x)=Pk(x)+Rk(x)

where Pk(x) is the k-th order Taylor polynomial,

Pk(x)=f(a)+f(a)(xa)+f(a)2!(xa)2++f(k)(a)k!(xa)k

and Rk(x) is the remainder term.

The remainder term could have many different forms. The commonly seen form is the Peano form of the remainder, where the formula of the remainder is not explicitly defined.

Rk(x)=hk(x)(xa)k,limxahk(x)=0

There are explicit formula for the remainder, with some additional assumptions. Among the mean-value forms of the remainder, the Lagrange form of the remainder is widely used.

Let f:RR be k+1 times differentiable on the open interval with f(k) continuous on the closed interval between a and x, then the remainder term could be expressed as

Rk(x)=f(k+1)(ξL)(k+1)!(xa)k+1=f(k+1)(a+t(xa))(k+1)!(xa)k+1

where ξL is some real value between a and x, and 0t1.

It is easy to show that Taylor theorem with the Lagrange form remainder implies Taylor theorem with the Peano form remainder.

f(x)=Pk(x)+Rk(x)=Pk1(x)+Rk1(x)

We will use Peano form for Rk(x) and Lagrange form for Rk1(x), we then have

f(k)(a)k!(xa)k+hk(x)(xa)k=f(k)(a+t(xa))k!(xa)kf(k)(a)k!+hk(x)=f(k)(a+t(xa))k!

hk(x)=f(k)(a+t(xa))f(k)(a)k

Apparently,

limxahk(x)=0

This concludes the proof that Taylor theorem with the Lagrange form remainder implies Taylor theorem with the Peano form remainder.

Let’s check Taylor’s theorem quickly. When k=1, using the Peano form of the remainder, we have

f(x)=P1(x)+R1(x)=f(a)+f(a)(xa)+h1(x)(xa)

We have the expression for h(x)

h1(x)=f(x)f(a)f(a)(xa)xa=f(x)f(a)xaf(a)

Because of the definition of derivative,

f(a)=limxaf(x)f(a)xa

we have

limxah1(x)=limxa(f(x)f(a)xaf(a))=limxa(f(x)f(a)xa)f(a)=f(a)f(a)=0

Let’s try to prove Taylor’s theorem more formally using the Lagrange form now. (To make the proof somewhat easier, it is not using the more general Peano form remainder.)

Proof

We define new functions g1 and g2.

g1(x)=f(x)(f(a)+f(a)(xa)++f(k1)(a)(k1)!(xa)k1)

g2(x)=g1(x)(xax0a)kg1(x0)

where a and x0 are some arbitrary real values and ax0.

The derivatives of g1(x) are as follows.

g1(1)(x)=f(1)(x)(f(1)(a)+f(2)(a)(xa)+f(3)(a)2!(xa)2++f(k1)(a)(k2)!(xa)k2)g1(2)(x)=f(2)(x)(f(2)(a)+f(3)(a)(xa)++f(k1)(a)(k3)!(xa)k3)g1(i)(x)=f(i)(x)(f(i)(a)+f(i+1)(a)(xa)++f(k1)(a)(ki1)!(xa)ki1)g1(k1)(x)=f(k1)(x)f(k1)(a)g1(k)(x)=f(k)(x)

It is easy to verify that g1(i)(a)=0 for i[1,k1].

The derivatives of g2(x) are as follows.

g2(1)(x)=g1(1)(x)k(xa)k1(x0a)kg1(x0)g2(2)(x)=g1(2)(x)k(k1)(xa)k2(x0a)kg1(x0)g2(i)(x)=g1(i)(x)k!(ki)!(xa)ki(x0a)kg1(x0)g2(k1)(x)=g1(k1)(x)k!xa(x0a)kg1(x0)g2(k)(x)=g1(k)(x)k!1(x0a)kg1(x0)

Given g1(i)(a)=0 for i[1,k1], it is also easy to verify that g2(i)(a)=0 for i[1,k1].

Further more,

g1(a)=f(a)f(a)=0

g2(a)=g1(a)=0

g2(x0)=g1(x0)g1(x0)=0

So we have g2(x0)=g2(a) and ax0.

So we have g2(x0)=g2(a) and ax0, and g2(a)=g2(a)==g2(k1)(a)=0. According to Lemma 2, there exists t(0,1) such that g2(k)(a+t(x0a))=0.

Therefore,

g2(k)(a+t(x0a))=g1(k)(a+t(x0a))k!1(x0a)kg1(x0)=f(k)(a+t(x0a))k!1(x0a)k(f(x0)(f(a)+f(a)(x0a)++f(k1)(a)(k1)!(x0a)k1))=0

f(x0)=f(a)+f(a)(x0a)++f(k1)(a)(k1)!(x0a)k1+f(k)(a+t(x0a))k!(x0a)k

Note that we have assumed a and x0 to be arbitrary as long as ax0. So we basically have,

f(x)=f(a)+f(a)(xa)++f(k1)(a)(k1)!(xa)k1+f(k)(a+t(xa))k!(xa)k

where xa.

However, it is easy to see that when x=a, the above equation still holds. So ultimately, we have

f(x)=f(a)+f(a)(xa)++f(k1)(a)(k1)!(xa)k1+f(k)(a+t(xa))k!(xa)k

The above verifies Taylor theorem with the Lagrange form remainder.

This concludes the proof.

Multivariate Taylor Theorem

The multivariate Taylor theorem is a little bit complicated. We will only focus on the quadratic form here without discussing the proofs.

Multi-Index Notation

There are some notations in order to express the multivariate Taylor theorem conveniently.

Suppose αNn, α={α1,α2,,αn}, and xRn, x={x1,x2,,xn}, we have the following notations.

|α|=α1+α2++αn

α!=α1!α2!αn!

(nα)=n!α!=n!α1!α2!αn!

xα=x1α1x2α2xnαn

Suppose αNn and a constant natural number k, how many different α are there such that |α|=k?

Let’s look at an example. Assume n=3 and k=2, the α such that |α|=k are

α=(2,0,0)α=(0,2,0)α=(0,0,2)α=(1,1,0)α=(1,0,1)α=(0,1,1)

Therefore, we have 6 different α for n=3 and k=2.

The formula for the general case is actually not hard to derive. This combination problem is actually equivalent to the combination problem that we have k+n identical balls, and n different boxes, how many unique ways to put these balls into the boxes such that each box has at least one ball. To solve it, we would need to put all the k+n balls as a sequence, and insert n1 barriers between the balls, such that each two adjacent balls could have at most one barrier. The number of combinations for this problem is, obviously, (k+n1n1).

Let’s verify if the formula we derived is valid for the example we have seen above.

(k+n1n1)=(2+3131)=(42)=6

k-th Order Partial Derivatives

Suppose a function f:RnR is k times differentiable, the k-th order partial derivatives of f are

Dαf=|α|fx1α1x2α2xnαn

for all α such that |α|=k.

Suppose a function f:RnR is k times differentiable at point xRn, the k-th order partial derivatives of f evaluated at point x are

Dαf(x)=|α|fx1α1x2α2xnαn(x)

for all α such that |α|=k.

Let’s look at an example. Assume n=3 and k=2,

f(x)=5x13+7x22x34

From the example in the Multi-Index Notation section, we know that all the α are

α=(2,0,0)α=(0,2,0)α=(0,0,2)α=(1,1,0)α=(1,0,1)α=(0,1,1)

So the 2-th order partial derivatives of f evaluated at point x are

D(2,0,0)f(x)=2fx12x20x30=30x1+7x22x34D(0,2,0)f(x)=2fx10x22x30=5x13+14x34D(0,0,2)f(x)=2fx10x20x32=5x13+84x22x32D(1,1,0)f(x)=2fx11x21x30=15x12+14x2x34D(1,0,1)f(x)=2fx11x20x31=15x12+28x22x33D(0,1,1)f(x)=2fx10x21x31=5x13+56x2x33

Quadratic Multivariate Taylor Theorem

Let f:RnR be a 2-times differentiable function at the point aRn. Using the Lagrange form of the remainder, there exists t(0,1) such that

f(x)=f(a)+f(a)(xa)+12(xa)H(a+t(xa))(xa)

where f(a) is a row vector and xa is a column vector, H is the Hessian matrix and H(a+t(xa)) is the Hessian matrix evaluated at a+t(xa). Concretely,

f(a)=[fx1(a),fx2(a),,fxn(a)]

and

xa=[x1a1,x2a2,,xnan]

General Multivariate Taylor Theorem

Let f:RnR be a k-times differentiable function at the point aRn, we have

f(x)=Pk(x)+Rk(x)

where the k-th order Taylor polynomial

Pk(x)=j=0k1j!|α|=j(|α|α)Dαf(a)(xa)α=j=0k|α|=j1j!(jα)Dαf(a)(xa)α=j=0k|α|=jDαf(a)α!(xa)α=0|α|kDαf(a)α!(xa)α

and the Peano form of the remainder

Rk(x)=|α|=khα(x)(xa)α,limxahα(x)=0

where hα:RnR. So instead of having one single h in the univariate case, here we have multiple different h, each for one α whose |α|=k.

For example, assume n=3 and k=2, for the 2-th order Taylor polynomial,

P2(x)=0|α|2Dαf(a)α!(xa)α=(|α|=0Dαf(a)α!(xa)α)+(|α|=1Dαf(a)α!(xa)α)+(|α|=2Dαf(a)α!(xa)α)=f(a)+D(1,0,0)f(a)(1,0,0)!(xa)(1,0,0)+D(0,1,0)f(a)(0,1,0)!(xa)(0,1,0)+D(0,0,1)f(a)(0,0,1)!(xa)(0,0,1)+D(2,0,0)f(a)(2,0,0)!(xa)(2,0,0)+D(2,0,0)f(a)(2,0,0)!(xa)(2,0,0)+D(0,0,2)f(a)(0,0,2)!(xa)(0,0,2)+D(1,1,0)f(a)(1,1,0)!(xa)(1,1,0)+D(1,0,1)f(a)(1,0,1)!(xa)(1,0,1)+D(0,1,1)f(a)(0,1,1)!(xa)(0,1,1)

Each of the term could be expanded using the multi-index notation and the k-th order partial derivatives.

For the Peano form of the remainder,

R2(x)=|α|=2hα(x)(xa)α=h(2,0,0)(x)(xa)(2,0,0)+h(0,2,0)(x)(xa)(0,2,0)+h(0,0,2)(x)(xa)(0,0,2)+h(1,1,0)(x)(xa)(1,1,0)+h(1,0,1)(x)(xa)(1,0,1)+h(0,1,1)(x)(xa)(0,1,1)

References

Author

Lei Mao

Posted on

06-23-2021

Updated on

06-23-2021

Licensed under


Comments