5. Convergence of finite element approximations

In this section we develop tools to prove convergence of finite element approximations to the exact solutions of PDEs.

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

5.1. Weak derivatives

Consider a triangulation T with recursively refined triangulations Th and corresponding finite element spaces Vh. Given stable finite element variational problems, we have a sequence of solutions uh as h0, satisfying the h-independent bound

uhH1(Ω)C.

What are these solutions converging to? We need to find a Hilbert space that contains all Vh as h0, that extends the H1 norm to the h0 limit of finite element functions.

Our first task is to define a derivative that works for all finite element functions, without reference to a mesh. This requires some preliminary definitions, starting by considering some very smooth functions that vanish on the boundaries together with their derivatives (so that we can integrate by parts as much as we like).

Definition 5.1 (Compact support on (Omega))

A function u has compact support on Ω if there exists ϵ>0 such that u(x)=0 when minyΩ|xy|<ϵ.

Definition 5.2 ((C^infty_0(Omega)))

We denote by C0(Ω) the subset of C(Ω) corresponding to functions that have compact support on Ω.

Next we will define a space containing the generalised derivative.

Definition 5.3 ((L^1_{loc}))

For triangles Kint(Ω), we define

uL1(K)=K|u|dx,

and

L1K={u:uL1(K)<}.

Then

L1loc={f:fL1(K)Kint(Ω)}.

Finally we are in a position to introduce the generalisation of the derivative itself.

Definition 5.4 (Weak derivative)

The weak derivative DαwfL1loc(Ω) of a function fL1loc(Ω) is defined by

ΩϕDαwfdx=(1)|α|ΩDαϕfdx,ϕC0(Ω).

Not that we do not see any boundary terms since ϕ vanishes at the boundary along with all derivatives.

Now we check that the derivative agrees with our finite element derivative definition.

Lemma 5.5

Let V be a C0 finite element space. Then, for uV, the finite element derivative of u is equal to the weak derivative of u.

Proof

Taking any ϕC0(Ω), we have

Ωϕxi|FEudx=KKϕuxidx,=K(Kϕxiudx+KϕniudS),=KKϕxiudx=Ωϕxiudx,

as required.

Exercise 5.6

Let V be a C1 finite element space. For uV, show that the finite second derivatives of u is equal to the weak second derivative of u.

Exercise 5.7

Let V be a discontinuous finite element space. For uV, show that the weak derivative does not coincide with the finite element derivative in general (find a counter-example).

Lemma 5.8

For uC|α|(Ω), the usual “strong” derivative Dα of u is equal to the weak derivative Dαw of u.

Exercise 5.9

Prove this lemma.

Due to these equivalences, we do not need to distinguish between strong, weak and finite element first derivatives for C0 finite element functions. All derivatives are assumed to be weak from now on.

5.2. Sobolev spaces

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

We are now in a position to define a space that contains all C0 finite element spaces. This means that we can consider the limit of finite element approximations as h0.

Definition 5.10 (The Sobolev space (H^1))

H1(Ω) is the function space defined by

H1(Ω)={uL1loc:uH1(Ω)<}.

Going further, the Sobolev space Hk is the space of all functions with finite Hk norm.

Definition 5.11 (The Sobolev space (H^k))

Hk(Ω) is the function space defined by

Hk(Ω)={uL1loc:uHk(Ω)<}

Since uHk(Ω)uHl(Ω) for k<l, we have HlHk for k<l.

If we are to consider limits of finite element functions in these Sobolev spaces, then it is important that they are closed, i.e. limits remain in the spaces.

Lemma 5.12 ((H^k) spaces are Hilbert spaces)

The space Hk(Ω) is closed.

Let {ui} be a Cauchy sequence in Hk. Then {Dαui} is a Cauchy sequence in L2(Ω) (which is closed), so vαL2(Ω) such that Dαuivα for |α|k. If wjw in L2(Ω), then for ϕC0(Ω),

Ω(wjw)ϕdxwjwL2(Ω)ϕL0.

We use this equation to get

Ωvαϕdx=limiΩϕDαuidx,=limi(1)|α|ΩuiDαϕdx,=(1)|α|ΩvDαϕdx,

i.e. vα is the weak derivative of u as required.

We quote the following much deeper results without proof.

Theorem 5.13 ((H=W))

Let Ω be any open set. Then Hk(Ω)C(Ω) is dense in Hk(Ω).

The interpretation is that for any function uHk(Ω), we can find a sequence of C functions ui converging to u. This is very useful as we can compute many things using C functions and take the limit.

Theorem 5.14 (Sobolev’s inequality)

Let Ω be an n-dimensional domain with Lipschitz boundary, let k be an integer with k>n/2. Then there exists a constant C such that

uL(Ω)=esssupxΩ|u(x)|CuHk(Ω).

Further, there is a C0 continuous function in the L(Ω) equivalence class of u.

Previously we saw this result for continuous functions. Here it is presented for Hk functions, with an extra statement about the existence of a C0 function in the equivalence class. The interpretation is that if uHk then there is a continuous function u0 such that the set of points where uu0 has zero area/volume.

Corollary 5.15 (Sobolev’s inequality for derivatives)

Let Ω be a n-dimensional domain with Lipschitz boundary, let k be an integer with km>n/2. Then there exists a constant C such that

uWm(Ω):=|α|mDαuL(Ω)CuHk(Ω).

Further, there is a Cm continuous function in the L(Ω) equivalence class of u.

Proof

Just apply Sobolev’s inequality to the m derivatives of u.

5.3. Variational formulations of PDEs

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

We can now consider linear variational problems defined on Hk spaces, by taking a bilinear form b(u,v) and linear form F(v), seeking uHk (for chosen Hk) such that

b(u,v)=F(v),vHk.

Since Hk is a Hilbert space, the Lax-Milgram theorem can be used to analyse, the existence of a unique solution to an Hk linear variational problem.

For example, the Helmholtz problem solvability is immediate.

Theorem 5.16 (Well-posedness for (modified) Helmholtz))

The Helmholtz variational problem on H1 satisfies the conditions of the Lax-Milgram theorem.

Proof

The proof for C0 finite element spaces extends immediately to H1.

Next, we develop the relationship between solutions of the Helmholtz variational problem and the strong-form Helmholtz equation,

u2u=f,un=0, on Ω.

The basic idea is to check that when you take a solution of the Helmholtz variational problem and integrate by parts (provided that this makes sense) then you reveal that the solution solves the strong form equation. Functions in Hk make boundary values hard to interpret since they are not guaranteed to have defined values on the boundary. We make the following definition.

Definition 5.17 (Trace of (H^1) functions)

Let uH1(Ω) and choose uiC(Ω) such that uiu. We define the trace u|Ω on Ω as the limit of the restriction of ui to Ω. This definition is unique from the uniqueness of limits.

We can extend our trace inequality for finite element functions directly to H1 functions.

Lemma 5.18 (Trace theorem for (H^1) functions)

Let uH1(Ω) for a polygonal domain Ω. Then the trace u|Ω satisfies

uL2(Ω)CuH1(Ω).

The interpretation of this result is that if uH1(Ω) then u|ΩL2(Ω).

Proof

Adapt the proof for C0 finite element functions, choosing uC(Ω), and pass to the limit in H1(Ω).

This tells us when the integration by parts formula makes sense.

Lemma 5.19

Let uH2(Ω), vH1(Ω). Then

Ω(2u)vdx=ΩuvdxΩunvdS.
Proof

First note that uH2(Ω)u(H1(Ω))d. Then

Then, take viC(Ω) and uiC(Ω) converging to v and u, respectively, and viuiC(Ω) converges to vu. These satisfy the equation; we obtain the result by passing to the limit.

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

Now we have everything we need to show that solutions of the strong form equation also solve the variational problem. It is just a matter of substituting into the formula and applying integration by parts.

Lemma 5.20

For fL2, let uH2(Ω) solve

u2u=f,un=0 on Ω,

in the L2 sense, i.e. u2ufL2=0. Then u solves the variational form of the Helmholtz equation.

Proof

uH2uH2<uH1<uH1. Multiplying by test function vH1, and using the previous proposition gives

Ωuv+uvdx=Ωfvdx,vH1(Ω),

as required.

Now we go the other way, showing that solutions of the variational problem also solve the strong form equation. To do this, we need to assume a bit more smoothness of the solution, that it is in H2 instead of just H1.

Theorem 5.21

Let fL2(Ω) and suppose that uH2(Ω) solves the variational Helmholtz equation on a polygonal domain Ω. Then u solves the strong form Helmholtz equation with zero Neumann boundary conditions.

Proof

Using integration by parts for uH2, vC0(Ω)H1, we have

Ω(u2uf)vdx=Ωuv+uvvfdx=0.

It is a standard result that C0(Ω) is dense in L2(Ω) (i.e., every L2 function can be approximated arbitrarily closely by a C0 function), and therefore we can choose a sequence of v converging to u2uf and we obtain u2ufL2(Ω)=0.

Now we focus on showing the boundary condition is satisfied. We have

0=Ωuv+uvfvdx=Ωuv+uv(u2u)vdx=ΩunvdS.

We can find arbitrary vL2(Ω), hence unL2(Ω)=0.

5.4. Galerkin approximations of linear variational problems

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

Going a bit more general again, assume that we have a well-posed linear variational problem on Hk, connected to a strong form PDE. Now we would like to approximate it. This is done in general using the Galerkin approximation.

Definition 5.22 (Galerkin approximation)

Consider a linear variational problem of the form:

find uHk such that

b(u,v)=F(v),vHk.

For a finite element space VhV=Hk(Ω), the Galerkin approximation of this Hk variational problem seeks to find uhVh such that

b(uh,v)=F(v),vVh.

We just restrict the trial function u and the test function v to the finite element space. C0 finite element spaces are subspaces of H1, C1 finite element spaces are subspaces of H2 and so on.

If b(u,v) is continuous and coercive on Hk, then it is also continuous and coercive on Vh by the subspace property. Hence, we know that the Galerkin approximation exists, is unique and is stable. This means that it will be possible to solve the matrix-vector equation.

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

Moving on, if we can solve the equation, we would like to know if it is useful. What is the size of the error uuh? For Galerkin approximations this question is addressed by Céa’s lemma.

Theorem 5.23 (Céa’s lemma.)

Let VhV, and let u solve a linear variational problem on V, whilst uh solves the equivalent Galerkin approximation on Vh. Then

uuhVMγminvVhuvV,

where M and γ are the continuity and coercivity constants of b(u,v), respectively.

Proof

We have

b(u,v)=F(v)vV,b(uh,v)=F(v)vVh.

Choosing vVhV means we can use it in both equations, and subtraction and linearity lead to the “Galerkin orthogonality” condition

b(uuh,v)=0,vVh.

Then, for all vVh,

γuuh2Vb(uuh,uuh),=b(uuh,uv)+b(uuh,vuh)=0,MuuhVuvV.

So,

γuuhVMuvV.

Minimising over all v completes the proof.

5.5. Interpolation error in Hk spaces

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

The interpretation of Céa’s lemma is that the error is proportional to the minimal error in approximating u in Vh. To do this, we can simply choose v=Ihu in Céa’s lemma, to get

uuhVMγminvVhuvVMγuIhuV.

Hence, Céa’s lemma reduces the problem of estimating the error in the numerical solution to estimating error in the interpolation of the exact solution. We have already examined this in the section on interpolation operators, but in the context of continuous functions. The problem is that we do not know that the solution u is continuous, only that it is in Hk for some k.

We now quickly revisit the results of the interpolation section to extend them to Hk spaces. The proofs are mostly identical, so we just give the updated result statements and state how to modify the proofs.

Firstly we recall the averaged Taylor polynomial. Since it involves only integrals of the derivatives, we can immediately use weak derivatives here.

Definition 5.24 (Averaged Taylor polynomial with weak derivatives)

Let ΩRn be a domain with diameter d, that is star-shaped with respect to a ball B with radius ϵ, contained within Ω. For fHk+1(Ω) the averaged Taylor polynomial Qk,BfPk is defined as

Qk,Bf(x)=1|B|BTkf(y,x)dy,

where Tkf is the Taylor polynomial of degree k of f,

Tkf(y,x)=|α|kDαf(y)(xy)αα!,

evaluated using weak derivatives.

This definition makes sense since the Taylor polynomial coefficients are in L1loc(Ω) and thus their integrals over B are defined.

The next step was to examine the error in the Taylor polynomial.

Theorem 5.25

Let ΩRn be a domain with diameter d, that is star-shaped with respect to a ball B with radius ϵ, contained within Ω. There exists a constant C(k,n) such that for 0|β|k+1 and all fHk+1(Ω),

Dβ(fQk,Bf)L2C|Ω|1/2|B|1/2dk+1|β|k+1fL2(Ω).
Proof

To show this, we assume that fC(Ω), in which case the result of Theorem 3.15 applies. Then we obtain the present result by approximating f by a sequence of C(Ω) functions and passing to the limit.

We then repeat the following corollary.

Corollary 5.26

Let K1 be a triangle with diameter 1. There exists a constant C(k,n) such that

fQk,BfHk(K1)C|k+1f|Hk+1(K1).
Proof

Same as Lemma 3.16.

The next step was the bound on the interpolation operator. Now we just have to replace Cl, with Wl as derivatives may not exist at every point.

Lemma 5.27

Let (K1,P,N) be a finite element such that K1 is a triangle with diameter 1, and such that the nodal variables in N involve only evaluations of functions or evaluations of derivatives of degree l, and NiWl(K1)<,

NiWl(K1)=supuWl(K1)>0|Ni(u)|uWl(K1).

Let uHk(K1) with k>l+n/2. Then

IK1uHk(K1)CuHk(K1).
Proof

Same as Lemma 3.22. replacing Cl, with Wl, and using the full version of the Sobolev inequality in Lemma 5.14.

The next steps then just follow through.

Lemma 5.28

Let (K1,P,N) be a finite element such that K1 has diameter 1, and such that the nodal variables in N involve only evaluations of functions or evaluations of derivatives of degree l, and P contain all polynomials of degree k and below, with k>l+n/2. Let uHk+1(K1). Then for ik, the local interpolation operator satisfies

|IK1uu|Hi(K1)C1|u|Hk+1(K1).
Proof

Same as Lemma 3.23.

Lemma 5.29

Let (K,P,N) be a finite element such that K has diameter d, and such that the nodal variables in N involve only evaluations of functions or evaluations of derivatives of degree l, and P contains all polynomials of degree k and below, with k>l+n/2. Let uHk+1(K). Then for ik, the local interpolation operator satisfies

|IKuu|Hi(K)CKdk+1i|u|Hk+1(K).

where CK is a constant that depends on the shape of K but not the diameter.

Proof

Repeat the scaling argument of Lemma 3.24.

Theorem 5.30

Let T be a triangulation with finite elements (Ki,Pi,Ni), such that the minimum aspect ratio r of the triangles Ki satisfies r>0, and such that the nodal variables in N involve only evaluations of functions or evaluations of derivatives of degree l, and P contains all polynomials of degree k and below, with k>l+n/2. Let uHk+1(Ω). Let h be the maximum over all of the triangle diameters, with 0h<1. Let V be the corresponding Cr finite element space. Then for ik and ir+1, the global interpolation operator satisfies

IhuuHi(Ω)Chk+1i|u|Hk+1(Ω).
Proof

Identical to Theorem 3.25.

5.6. Convergence of the finite element approximation to the Helmholtz problem

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

Now that we have the required interpolation operator results, we can return to applying Céa’s lemma to the convergence of the finite element approximation to the Helmholtz problem.

Corollary 5.31

The degree k Lagrange finite element approximation uh to the solution u of the variational Helmholtz problem satisfies

uhuH1(Ω)ChkuHk+1(Ω).
Proof

We combine Céa’s lemma with the previous estimate, since

minvVhuvH1(Ω)uIhuH1(Ω)ChkuHk+1(Ω),

having chosen i=1.

Exercise 5.32

Consider the variational problem of finding uH1([0,1]) such that

10vu+vudx=10vxdx+v(1)v(0),vH1([0,1]).

After dividing the interval [0,1] into N equispaced cells and forming a P1 C0 finite element space VN, the error uuhH1=0 for any N>0.

Explain why this is expected.

Exercise 5.33

Let ˚H1([0,1]) be the subspace of H1([0,1]) such that u(0)=0. Consider the variational problem of finding u˚H1([0,1]) with

10vudx=1/20vdx,v˚H([0,1]).

The interval [0,1] is divided into 2N+1 equispaced cells (where N is a positive integer). After forming a P2 C0 finite element space VN, the error uuhH1 only converges at a linear rate. Explain why this is expected.

Exercise 5.34
Let Ω be a convex polygonal 2D domain. Consider the

following two problems.

  1. Find uH2 such that

    2u+fL2(Ω)=0,uL2(Ω)=0,

    which we write in a shorthand as

    2u=f,u|Ω=0.
  2. Find u˚H1(Ω) such that

    Ωuvdx=Ωfvdx,v˚H1(Ω),

    where ˚H1(Ω) is the subspace of H1(Ω) consisting of functions whose trace vanishes on the boundary.

Under assumptions on u which you should state, show that a solution to problem (1.) is a solution to problem (2.).

Let h be the maximum triangle diameter of a triangulation Th of Ω, with Vh the corresponding linear Lagrange finite element space. Construct a finite element approximation to Problem (2.) above. Briefly give the main arguments as to why the H1(Ω) norm of the error converges to zero linearly in h as h0, giving your assumptions.

Céa’s lemma gives us error estimates in the norm of the space where the variational problem is defined, where the continuity and coercivity results hold. In the case of the Helmholtz problem, this is H1. We would also like estimates of the error in the L2 norm, and it will turn out that these will have a more rapid convergence rate as h0.

A video recording of the following material is available here.

Imperial students can also watch this video on Panopto

To do this we quote the following without proof.

Theorem 5.35 (Elliptic regularity)

Let w solve the equation

w2w=f,wn=0 on Ω,

on a convex (results also hold for other types of “nice” domains) domain Ω, with fL2. Then there exists constant C>0 such that

|w|H2(Ω)CfL2(Ω).

Similar results hold for general elliptic operators, such as Poisson’s equation with the types of boundary conditions discussed above. Elliptic regularity is great to have, because it says that the solution of the H1 variational problem is actually in H2, provided that fL2.

We now use this to obtain the following result, using the Aubin-Nitsche trick.

Theorem 5.36

The degree k Lagrange finite element approximation uh to the solution u of the variational Helmholtz problem satisfies

uhuL2(Ω)Chk+1uHk+1(Ω).
Proof

We use the Aubin-Nitsche duality argument. Let w be the solution of

w2w=uuh,

with the same Neumann boundary conditions as for u.

Since uuhH1(Ω)L2(Ω), we have wH2(Ω) by elliptic regularity.

Then we have (by multiplying by a test function an integrating by parts),

b(w,v)=(uuh,v)L2(Ω),vH1(Ω),

and so

uuh2L2(Ω)=(uuh,uuh)=b(w,uuh),=b(wIhw,uuh) (orthogonality) ,CuuhH1(Ω)wIhwH1(Ω),ChuuhH1(Ω)|w|H2(Ω)C1hk+1|u|Hk+1(Ω)uuhL2(Ω)

and dividing both sides by uuhL2(Ω) gives the result.

Thus we gain one order of convergence rate with h by using the L2 norm instead of the H1 norm.

5.7. Epilogue

This completes our analysis of the convergence of the Galerkin finite element approximation to the Helmholtz problem. Similar approaches can be applied to analysis of other elliptic PDEs, using the following programme.

  1. Find a variational formulation of the PDE with a bilinear form that is continuous and coercive (and hence well-posed by Lax-Milgram) on Hk for some k.

  2. Find a finite element space VhHk. For H1, this requires a C0 finite element space, and for H2, a C1 finite element space is required.

  3. The Galerkin approximation to the variational formulation is obtained by restricting the solution and test functions to Vh.

  4. Continuity and coercivity (and hence well-posedness) for the Galerkin approximation is assured since VhHk. This means that the Galerkin approximation is solvable and stable.

  5. The estimate of the error estimate in terms of h comes from Céa’s lemma plus the error estimate for the nodal interpolation operator.

This course only describes the beginning of the subject of finite element methods, for which research continues to grow in both theory and application. There are many methods and approaches that go beyond the basic Galerkin approach described above. These include

  • Discontinuous Galerkin methods, which use discontinuous finite element spaces with jump conditions between cells to compensate for not having the required continuity. These problems do not fit into the standard Galerkin framework and new techniques have been developed to derive and analyse them.

  • Mixed finite element methods, which consider systems of partial differential equations such as the Poisson equation in first-order form,

    up=0,u=f.

    The variational forms corresponding to these systems are not coercive, but they are well-posed anyway, and additional techniques have been developed.

  • Non-conforming methods, which work even though VhHk. For example, the Crouzeix-Raviart element uses linear functions that are only continuous at edge centres, so the functions are not in C0 and the functions do not have a weak derivative. However, using the finite element derivative in the weak form for H1 elliptic problems still gives a solvable system that converges at the optimal rate. Additional techniques have been developed to analyse this.

  • Interior penalty methods, which work even though VhHk. These methods are used to solve Hk elliptic problems using Hl finite element spaces with l<k, using jump conditions to obtain a stable discretisation. Additional techniques have been developed to analyse this.

  • Stabilised and multiscale methods for finite element approximation of PDEs whose solutions have a wide range of scales, for example they might have boundary layers, turbulent structures or other phenomena. Resolving this features is often too expensive, so the goal is to find robust methods that behave well when the solution is not well resolved. Additional techniques have been developed to analyse this.

  • Hybridisable methods that involve flux functions that are supported only on cell facets.

  • Currently there is a lot of activity around discontinuous Petrov-Galerkin methods, which select optimal test functions to maximise the stability of the discrete operator. This means that they can be applied to problems such as wave propagation which are otherwise very challenging to find stable methods for. Also, these methods come with a bespoke error estimator that can allow for adaptive meshing starting from very coarse meshes. Another new and active area is virtual element methods, where the basis functions are not explicitly defined everywhere (perhaps just on the boundary of cells). This facilitates the use of arbitrary polyhedra as cells, leading to very flexible mesh choices.

All of these methods are driven by the requirements of different physical applications.

Other rich areas of finite element research include

  • the development of bespoke, efficient iterative solver algorithms on parallel computers for finite element discretisations of PDEs. Here, knowledge of the analysis of the discretisation can lead to solvers that converge in a number of iterations that is independent of the mesh parameter h.

  • adaptive mesh algorithms that use analytical techniques to estimate or bound the numerical error after the numerical solution has been computed, in order to guide iterative mesh refinement in particular areas of the domain.