Journal of Modern Physics, 2018, 9, 1713-1723
http://www.scirp.org/journal/jmp
ISSN Online: 2153-120X
ISSN Print: 2153-1196
Two-Level Block Decompositions for Solving Helmholtz Equation via Chebyshev Pseudo Spectral Method
Hsin-Chu Chen
Department of Computer and Information Science, Clark Atlanta University, Atlanta, GA, USA
How to cite this paper: Chen, H.-C. (2018) Two-Level Block Decompositions for Solving Helmholtz Equation via Chebyshev Pseudo Spectral Method. Journal of Modern Physics, 9, 1713-1723.
https://doi.org/10.4236/jmp.2018.99107
Received: June 14, 2018
Accepted: July 29, 2018
Published: August 1, 2018
Copyright © 2018 by author and
Scientific Research Publishing Inc.
This work is licensed under the Creative
Commons Attribution International
License (CC BY 4.0).
http://creativecommons.org/licenses/by/4.0/
Open Access
Abstract

In this paper, we consider solving the Helmholtz equation $\nabla^2 u + \omega^2 u = f(x, y)$ in the Cartesian domain $\Omega = [-1,1] \otimes [-1,1]$, subject to homogeneous Dirichlet boundary condition, discretized with the Chebyshev pseudo-spectral
method. The main purpose of this paper is to present the formulation of a
two-level decomposition scheme for decoupling the linear system obtained
from the discretization into independent subsystems. This scheme takes ad-
vantage of the homogeneity property of the physical problem along one di-
rection to reduce a 2D problem to several 1D problems via a block diagonali-
zation approach and the reflexivity property along the second direction to
decompose each of the 1D problems into two independent subproblems using a
reflexive decomposition, effectively doubling the number of subproblems.
Based on the special structure of the coefficient matrix of the linear system
derived from the discretization and a reflexivity property of the second-order
Chebyshev differentiation matrix, we show that the decomposed submatrices
exhibit a similar property, enabling the system to be decomposed using reflexive
decompositions. Explicit forms of the decomposed submatrices are derived. The
decomposition not only yields a more efficient algorithm but also introduces
coarse-grain parallelism. Furthermore, it preserves all eigenvalues
the original matrix.
Keywords
Helmholtz Equation, Chebyshev Pseudo-Spectral Method, Chebyshev
Differentiation Matrix, Coarse-Grain Parallelism, Reflexive Matrix
1. Introduction
In this paper, we consider the numerical solution to the Helmholtz equation
$\nabla^2 u + \omega^2 u = f(x, y)$ in the Cartesian domain $\Omega = [-1,1] \otimes [-1,1]$ with homogeneous Dirichlet boundary condition, where $\nabla^2$ is the Laplacian operator, $\omega$ is a real parameter representing the wavenumber, $u$ is the amplitude associated with wave propagation, and $f(x, y)$ is a forcing function. Without loss of generality we assume $u = 0$ on the boundary, i.e.,

$$\nabla^2 u + \omega^2 u = f(x, y), \quad (x, y) \in \Omega \quad \text{and} \quad u = 0 \ \text{on} \ \partial\Omega. \qquad (1)$$
We present a highly parallel numerical algorithm for solving the Helmholtz
equation discretized by the Chebyshev pseudospectral scheme. Chebyshev pseudo-
spectral methods have long been used to numerically solve partial differential
equations [1] [2] [3]. This discretization yields a linear system whose coefficient
matrix has a special block structure that allows for a block diagonalization which,
along with a proper permutation of the decoupled matrix, decouples the original
matrix into several independent submatrices. By exploiting a reflexivity
property that is inherent in the Chebyshev differentiation matrix, it is possible to
further decompose each of the submatrices into two independent submatrices
when the boundary conditions at both ends of the decoupled 1D subproblems
are the same. This second level of decomposition effectively doubles the
number of independent submatrices, yielding a higher degree of coarse-grain
parallelism.
This paper is organized as follows. In Section 2, we present the general
structure of the linear system derived from the Chebyshev collocation method
discretized independently in the x and y directions. In Section 3, we address the
two-level block decomposition. This decomposition was proposed by the author
in [4] for solving Poisson equations with identical grids in either direction. In
this paper, we reformulate the decomposition to allow for the number of grid
points in the x direction to be different from that in the y direction and derive
the explicit forms for the decomposed submatrices, making applications of this
approach more flexible. The computational procedure and a numerical example
to demonstrate its usefulness are presented in Sections 4 and 5, respectively.
2. Linear System from the Chebyshev Collocation Method
In this section, we briefly describe the Chebyshev collocation method for solving
the Helmholtz equation in two dimensions. We assume that the problem is discretized with a tensor product grid $(x_i, y_j)$, $0 \le i \le N_x$, $0 \le j \le N_y$, where $x_i$ and $y_j$ are the Chebyshev points in the x and y directions, respectively. Let $D_x$, indexed from 0 to $N_x$, be the $(N_x + 1) \times (N_x + 1)$ Chebyshev spectral differentiation matrix associated with the x direction. The entries of $D_x$ are given as [1] [2] [3]

$$(D_x)_{00} = \frac{2 N_x^2 + 1}{6}, \quad (D_x)_{N_x N_x} = -\frac{2 N_x^2 + 1}{6}, \quad (D_x)_{jj} = \frac{-x_j}{2 \left( 1 - x_j^2 \right)}, \quad j = 1, 2, \ldots, N_x - 1,$$
$$(D_x)_{ij} = \frac{c_i}{c_j} \frac{(-1)^{i+j}}{x_i - x_j}, \quad i \ne j, \quad i, j = 0, 1, \ldots, N_x,$$

where $c_i = 2$ for $i = 0$ or $N_x$ and $c_i = 1$ otherwise. The Chebyshev spectral differentiation matrix $D_y$ associated with the y direction can be obtained in a similar way.
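For illustration, a minimal NumPy sketch of this construction might look as follows (the function name is ours, the Chebyshev points $x_j = \cos(j\pi/N)$ are assumed, and the diagonal is recovered with the standard negative row-sum trick rather than the closed forms above, which is numerically equivalent):

import numpy as np

def cheb_diff_matrix(N):
    # (N+1) x (N+1) Chebyshev differentiation matrix on the points x_j = cos(j*pi/N).
    # Off-diagonal entries follow the formula quoted above; the diagonal is then
    # fixed so that every row sums to zero, which reproduces the closed forms.
    j = np.arange(N + 1)
    x = np.cos(np.pi * j / N)                      # Chebyshev points, x_0 = 1, x_N = -1
    c = np.ones(N + 1)
    c[0] = c[N] = 2.0                              # c_i = 2 at the two endpoints, 1 otherwise
    C = np.outer(c, 1.0 / c)                       # C[i, j] = c_i / c_j
    S = (-1.0) ** (j[:, None] + j[None, :])        # S[i, j] = (-1)^(i+j)
    D = C * S / (x[:, None] - x[None, :] + np.eye(N + 1))   # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                    # fix the diagonal entries
    return D, x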
Now, let $A_x$ be the stripped matrix of $D_x^2$, obtained by removing the first and last rows and columns of $D_x^2$, and $A_y$ be the stripped matrix of $D_y^2$. We have

$$(A_x)_{ij} = \left( D_x^2 \right)_{ij}, \quad i, j = 1, 2, \ldots, N_x - 1, \qquad (A_y)_{ij} = \left( D_y^2 \right)_{ij}, \quad i, j = 1, 2, \ldots, N_y - 1. \qquad (2)$$
With the grid just described, the Chebyshev collocation method yields, after
removing the boundary nodes whose values are already known from the system,
the following linear system
$$Ku = f, \quad K = I_p \otimes A_y + A_x \otimes I_q + \omega^2 \left( I_p \otimes I_q \right), \qquad (3)$$

where $p = N_x - 1$, $q = N_y - 1$, and $I_k$ represents the identity matrix of dimension k.
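Equations (2) and (3) could then be assembled with Kronecker products as in the following sketch, which reuses the hypothetical cheb_diff_matrix helper above; the wrapper name is ours:

import numpy as np

def helmholtz_matrix(Nx, Ny, omega):
    # Assemble K = I_p (x) A_y + A_x (x) I_q + omega^2 (I_p (x) I_q) of Equation (3),
    # where A_x and A_y are D_x^2 and D_y^2 with their first and last rows and
    # columns removed, as in Equation (2).
    Dx, _ = cheb_diff_matrix(Nx)
    Dy, _ = cheb_diff_matrix(Ny)
    Ax = (Dx @ Dx)[1:-1, 1:-1]                     # p x p, p = Nx - 1
    Ay = (Dy @ Dy)[1:-1, 1:-1]                     # q x q, q = Ny - 1
    p, q = Nx - 1, Ny - 1
    K = (np.kron(np.eye(p), Ay)
         + np.kron(Ax, np.eye(q))
         + omega ** 2 * np.eye(p * q))
    return K, Ax, Ay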
Many solution schemes can be employed to solve Equation (3), including
Gaussian eliminations, Gauss-Seidel iterations, ADI iterations, and matrix dia-
gonalizations [5] [6] [7] [8]. In this paper, the block version of the matrix
diagonalization method will be employed to first decompose the matrix K into a
block diagonal matrix consisting of q diagonal blocks. Each diagonal block will
then be further decomposed into two smaller diagonal sub-blocks using reflexive
decompositions, yielding a total of 2q diagonal blocks. Note that q is the number
of internal grid points along the y direction. This two-step block decomposition
scheme allows for the linear system Equation (3) to be decoupled into 2q linear
subsystems which can then be solved in parallel with coarse-grain parallelism
using multiprocessors or networked computers.
3. Two-Level Block Decompositions
3.1. Level-1 Decomposition: Block Diagonalization
The diagonalization scheme has been used elsewhere [7] [8] [9]. In this section,
however, we address the decomposition of the matrix K into q diagonal blocks
using a different representation. To begin with, let
$Q_y$ be such a matrix that the similarity transformation $Q_y^{-1} A_y Q_y$ diagonalizes $A_y$ into $\Lambda_y$: $Q_y^{-1} A_y Q_y = \Lambda_y$. A natural choice of $Q_y$ is to have its columns formed from the eigenvectors of $A_y$ so that $\Lambda_y$ is a diagonal matrix that consists of the eigenvalues of $A_y$.

Now split the coefficient matrix K in Equation (3) into three parts $K_1$, $K_2$, and $K_3$:

$$K_1 = I_p \otimes A_y, \quad K_2 = A_x \otimes I_q, \quad K_3 = \omega^2 \left( I_p \otimes I_q \right),$$
and let $Q = I_p \otimes Q_y$. We have the transformed matrix

$$\bar{K} = Q^{-1} K Q = \bar{K}_1 + \bar{K}_2 + \bar{K}_3 \qquad (4)$$

with $\bar{K}_1 = Q^{-1} K_1 Q$, $\bar{K}_2 = Q^{-1} K_2 Q$, and $\bar{K}_3 = Q^{-1} K_3 Q$, where $Q^{-1} = I_p \otimes Q_y^{-1}$.
We now show that the transformed matrix $\bar{K}$ can be permuted to yield a block diagonal matrix. First, we observe that both $\bar{K}_1$ and $\bar{K}_3$ are diagonal matrices and $\bar{K}_2$ is a diagonal block matrix since

$$\bar{K}_1 = Q^{-1} K_1 Q = \left( I_p \otimes Q_y^{-1} \right) \left( I_p \otimes A_y \right) \left( I_p \otimes Q_y \right) = I_p \otimes \left( Q_y^{-1} A_y Q_y \right) = I_p \otimes \Lambda_y,$$

$$\bar{K}_3 = Q^{-1} K_3 Q = \left( I_p \otimes Q_y^{-1} \right) \left( \omega^2 \left( I_p \otimes I_q \right) \right) \left( I_p \otimes Q_y \right) = \omega^2 \left( I_p \otimes I_q \right),$$

and

$$\bar{K}_2 = Q^{-1} K_2 Q = \left( I_p \otimes Q_y^{-1} \right) \left( A_x \otimes I_q \right) \left( I_p \otimes Q_y \right) = A_x \otimes I_q.$$

Accordingly,

$$\bar{K} = Q^{-1} K Q = I_p \otimes \Lambda_y + A_x \otimes I_q + \omega^2 \left( I_p \otimes I_q \right). \qquad (5)$$
Let $a_{ij}$ be the $(i,j)$th element of $A_x$, $i, j = 1, 2, \ldots, p$. In its matrix form, $\bar{K}$ can be expressed as

$$\bar{K} = \begin{bmatrix} a_{11} I + \Lambda_y + \omega^2 I & a_{12} I & \cdots & a_{1p} I \\ a_{21} I & a_{22} I + \Lambda_y + \omega^2 I & \cdots & a_{2p} I \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} I & a_{p2} I & \cdots & a_{pp} I + \Lambda_y + \omega^2 I \end{bmatrix},$$
where I is of dimension q. Apparently, $\bar{K}$ is also a diagonal block matrix because both I and $\Lambda_y$ are diagonal matrices. The diagonal block matrix $\bar{K}$ can be rearranged to yield a block diagonal matrix $\hat{K}$:

$$\hat{K} = \begin{bmatrix} \hat{K}_1 & 0 & \cdots & 0 \\ 0 & \hat{K}_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \hat{K}_q \end{bmatrix}$$
via appropriate permutations of rows and columns as shown below. Let $K_{i,j}$ be the $(i,j)$th block of $\bar{K}$ and P be such a permutation matrix that the transformation $P^T \bar{K} P$ rearranges the entries of $\bar{K}$ in such a way that $\hat{K}_l(i,j) = K_{i,j}(l)$, $i, j = 1, 2, \ldots, p$, where $\hat{K}_l(i,j)$ is the $(i,j)$th element of $\hat{K}_l$ and $K_{i,j}(l)$ is the $l$th diagonal element of $K_{i,j}$. Then we have

$$P^T \bar{K} P = \hat{K} = \hat{K}_1 \oplus \hat{K}_2 \oplus \cdots \oplus \hat{K}_q, \qquad (6)$$

where

$$\hat{K}_l = A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p, \quad l = 1, 2, \ldots, q,$$
with $\lambda_y^{(l)}$ being the $l$th eigenvalue of $A_y$. Accordingly, $\hat{K} = P^T \bar{K} P = P^T Q^{-1} K Q P$ since $\bar{K} = Q^{-1} K Q$. In other words, the matrix K has been transformed to a block diagonal matrix consisting of q blocks. This decomposition takes advantage of the homogeneity property of the problem with the boundary conditions mentioned in Equation (1) along the y direction to decouple the 2D problem into q independent 1D problems. A similar decomposition can be performed by using the homogeneity property of the problem along the x direction.
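As an illustrative sketch (not the author's code), the level-1 decomposition amounts to one eigen-decomposition of $A_y$ followed by forming the q blocks of Equation (6); the function name below is ours:

import numpy as np

def level1_blocks(Ax, Ay, omega):
    # Level-1 block diagonalization: eigen-decompose A_y (its eigenvalues are real
    # for this problem) and form the q decoupled 1D blocks of Equation (6),
    # K_hat_l = A_x + (omega^2 + lambda_y^(l)) I_p.
    p = Ax.shape[0]
    lam_y, Qy = np.linalg.eig(Ay)                  # columns of Qy are eigenvectors of A_y
    K_hat = [Ax + (omega ** 2 + lam) * np.eye(p) for lam in lam_y]
    return K_hat, lam_y, Qy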
3.2. Level-2 Decomposition: Reflexive Decomposition
As seen from the previous section, the matrix K can be transformed to a block diagonal matrix, implying that the original problem can be decomposed into independent subproblems which can then be solved in parallel using multiprocessors. Although each diagonal block $\hat{K}_l$ in Equation (6) can be further decomposed into a diagonal matrix using a single transformation matrix $Q_x$ that consists of the eigenvectors of $A_x$, this diagonalization yields little advantage in general, especially in a multiprocessing environment, unless the eigenpairs are readily available, which is not the case for the matrix $A_x$.

In this section, we shall employ the reflexivity property of the diagonal blocks $\hat{K}_l$ to further decompose each of them into two independent sub-blocks using another decomposition technique called the reflexive decomposition [10].
Let $J_n$ be the cross identity matrix of dimension n, i.e., the matrix with ones on the cross (anti-) diagonal and zeros elsewhere. It is trivial to see that $J_n J_n = I_n$. Now, we note that the Chebyshev differentiation matrix $D_x$ is anti-reflexive and $D_x^2$ is reflexive, with respect to $J_{N_x+1}$. In other words,

$$J_{N_x+1} D_x J_{N_x+1} = -D_x \quad \text{and} \quad J_{N_x+1} D_x^2 J_{N_x+1} = D_x^2.$$

Accordingly, the $p \times p$ matrix $A_x$, which is the stripped matrix of $D_x^2$ (Equation (2)), is reflexive with respect to $J_p$. In fact, it is a centro-symmetric matrix [11] [12]. Recall that $\hat{K}_l = A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p$. Therefore, $A_x$ and $\hat{K}_l$ satisfy the reflexivity property

$$\hat{K}_l = J_p \hat{K}_l J_p, \quad l = 1, 2, \ldots, q,$$
indicating that each $\hat{K}_l$ can be decomposed via reflexive decompositions into two block diagonal submatrices of almost equal size, depending on whether p is even or odd. The decompositions can be done through orthogonal transformations which have been shown in [10]. Here we just present the results.

First consider the case when p is even. Assume $p = 2k$ for some $k > 0$ and $A_x$ is evenly partitioned into $2 \times 2$ sub-blocks:

$$A_x = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}.$$

Taking advantage of the reflexivity property of $\hat{K}_l$ by applying the orthogonal transformation $X_A^T \hat{K}_l X_A$ to $\hat{K}_l$, where

$$X_A = \frac{1}{\sqrt{2}} \begin{bmatrix} I_k & J_k \\ J_k & -I_k \end{bmatrix},$$

one can easily decompose $\hat{K}_l$ into two decoupled diagonal subblocks of equal size:
$$D_l = X_A^T \hat{K}_l X_A = X_A^T \left( A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p \right) X_A = X_A^T A_x X_A + \left( \omega^2 + \lambda_y^{(l)} \right) I_p = \begin{bmatrix} D_{l1} & 0 \\ 0 & D_{l2} \end{bmatrix},$$

where

$$D_{l1} = A_{11} + A_{12} J_k + \left( \omega^2 + \lambda_y^{(l)} \right) I_k \quad \text{and} \quad D_{l2} = A_{22} - A_{21} J_k + \left( \omega^2 + \lambda_y^{(l)} \right) I_k. \qquad (7)$$

In the case when p is odd, say $p = 2k + 1$, $k > 0$, we consistently partition $A_x$ and J as

$$A_x = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & a_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}, \quad J = \begin{bmatrix} & & J_k \\ & 1 & \\ J_k & & \end{bmatrix}.$$
By taking the transformation matrix $X_A$ to be

$$X_A = \frac{1}{\sqrt{2}} \begin{bmatrix} I_k & 0 & J_k \\ 0 & \sqrt{2} & 0 \\ J_k & 0 & -I_k \end{bmatrix},$$

the matrix $\hat{K}_l$ can be decoupled via the orthogonal transformation $X_A^T \hat{K}_l X_A$ as

$$D_l = X_A^T \hat{K}_l X_A = X_A^T \left( A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p \right) X_A = X_A^T A_x X_A + \left( \omega^2 + \lambda_y^{(l)} \right) I_p = \begin{bmatrix} D_{l1} & 0 \\ 0 & D_{l2} \end{bmatrix},$$

where

$$D_{l1} = \begin{bmatrix} A_{11} + A_{13} J_k & \sqrt{2}\, A_{12} \\ \sqrt{2}\, A_{21} & a_{22} \end{bmatrix} + \left( \omega^2 + \lambda_y^{(l)} \right) I_{k+1}$$

and

$$D_{l2} = \left( A_{33} - A_{31} J_k \right) + \left( \omega^2 + \lambda_y^{(l)} \right) I_k.$$
In either case, let $X = I_q \otimes X_A = X_A \oplus X_A \oplus \cdots \oplus X_A$. The orthogonal transformation $X^T \hat{K} X$ yields

$$X^T \hat{K} X = X_A^T \hat{K}_1 X_A \oplus X_A^T \hat{K}_2 X_A \oplus \cdots \oplus X_A^T \hat{K}_q X_A = D_1 \oplus D_2 \oplus \cdots \oplus D_q = \left( D_{11} \oplus D_{12} \right) \oplus \left( D_{21} \oplus D_{22} \right) \oplus \cdots \oplus \left( D_{q1} \oplus D_{q2} \right),$$

which obviously consists of 2q independent diagonal blocks since each $D_l$ has two independent diagonal blocks.
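The level-2 step can be sketched as follows for a single block $\hat{K}_l$, using the even/odd forms of $X_A$ given above; the helper name is ours, and in floating point the off-diagonal blocks of $X_A^T \hat{K}_l X_A$ vanish only up to round-off:

import numpy as np

def reflexive_split(K_hat_l):
    # Split one centro-symmetric block K_hat_l (p x p) into the two decoupled
    # blocks D_l1 and D_l2 via the orthogonal matrix X_A described above.
    p = K_hat_l.shape[0]
    k = p // 2
    J = np.fliplr(np.eye(k))                       # cross identity J_k
    if p % 2 == 0:                                 # even p: X_A = (1/sqrt(2)) [[I, J], [J, -I]]
        XA = np.block([[np.eye(k), J], [J, -np.eye(k)]]) / np.sqrt(2)
    else:                                          # odd p: extra middle row and column
        XA = np.zeros((p, p))
        XA[:k, :k] = np.eye(k)
        XA[:k, k + 1:] = J
        XA[k, k] = np.sqrt(2)
        XA[k + 1:, :k] = J
        XA[k + 1:, k + 1:] = -np.eye(k)
        XA /= np.sqrt(2)
    D = XA.T @ K_hat_l @ XA                        # block diagonal up to round-off
    m = k + (p % 2)                                # D_l1 is k x k (even p) or (k+1) x (k+1) (odd p)
    return D[:m, :m], D[m:, m:], XA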
4. Computational Procedure
The numerical solution to the Helmholtz Equation (1) discretized by the
Chebyshev pseudospectral method yields the linear system
$$Ku = f, \quad K = I_p \otimes A_y + A_x \otimes I_q + \omega^2 \left( I_p \otimes I_q \right),$$

where $I_p$ is the identity matrix of dimension p, $A_y$ is a $q \times q$, $A_x$ a $p \times p$, and K a $pq \times pq$ matrix. Here p (q) is the number of internal grid points in the
x (y) direction. The two-level decomposition scheme described in this paper
yields the following transformed linear system consisting of 2q independent
subsystems, each of dimension $\frac{p}{2}$ when p is even:

$$Dv = g, \quad D = \left( D_{11} \oplus D_{12} \right) \oplus \left( D_{21} \oplus D_{22} \right) \oplus \cdots \oplus \left( D_{q1} \oplus D_{q2} \right).$$
The sizes of the sub-blocks in D differ at most by 1 when p is an odd number.
Computationally, this decomposition is very attractive. Not only can the
decoupled subsystems be solved much more efficiently, they can also be solved
in parallel with coarse-grain parallelism using a computer system with multi-
processors. This two-level decomposition consists of the following three stages.
1) Transform $Ku = f$ to $\bar{K}\bar{u} = \bar{f}$, where $\bar{K} = Q^{-1} K Q$, $\bar{u} = Q^{-1} u$, and $\bar{f} = Q^{-1} f$.

This is a typical similarity transformation that is achieved by premultiplying both sides of $Ku = f$ with $Q^{-1}$ and inserting $Q Q^{-1}$, which is simply an identity matrix, between K and u. The matrix Q has been defined in Section 3.1 and $\bar{K}$ can be found in Equation (5).
2) Transform $\bar{K}\bar{u} = \bar{f}$ to $\hat{K}\hat{u} = \hat{f}$, where $\hat{K} = P^T \bar{K} P$, $\hat{u} = P^T \bar{u}$, and $\hat{f} = P^T \bar{f}$.

The main purpose of this transformation is to reorder the unknowns of the linear system so that the decoupled coefficient matrix $\bar{K}$ is converted to a block diagonal matrix ($\hat{K}$) for the sake of computational convenience. The matrix P, which is just a permutation matrix, has been explained in Section 3.1 and $\hat{K}$ can be found in Equation (6).
3) Transform $\hat{K}\hat{u} = \hat{f}$ to $Dv = g$, where $D = X^T \hat{K} X$, $v = X^T \hat{u}$, and $g = X^T \hat{f}$.

This transformation is analogous to that in stage 1 except that X here is an orthogonal matrix. Therefore we replace $X^{-1}$ by $X^T$. The matrices X and D can be found in Section 3.2.
The advantage of this approach is that the decomposed form of D is explicitly
known and, therefore, the actual decompositions from K to $\bar{K}$, from $\bar{K}$ to $\hat{K}$, and from $\hat{K}$ to D are not necessary. Furthermore, the diagonal blocks of D
can be obtained without any matrix-matrix multiplications. The computational
procedure of this two-level decomposition boils down to the following three
typical steps.
1) Decomposition step. In this step, we shall compute D and g. Since each
diagonal block of D,

$$D = \left( D_{11} \oplus D_{12} \right) \oplus \left( D_{21} \oplus D_{22} \right) \oplus \cdots \oplus \left( D_{q1} \oplus D_{q2} \right),$$

is independent and explicitly known, D can be obtained in parallel using 2q processors, without any difficulty. The main task here is, thus, to obtain g. This
can be achieved by transforming f to $\bar{f}$ first, which involves solving the linear system $Q\bar{f} = f$. It is essential to note that Q, $Q = I_p \otimes Q_y$, is a block diagonal matrix and, therefore, $\bar{f}$ can be obtained by solving p independent linear subsystems, each of size q, instead of solving the linear system as a whole, which is of size pq. This step can be performed in parallel with large granularity. The matrix $Q_y$, consisting of the eigenvectors of $A_y$, is in general not known explicitly and, thus, has to be computed from $A_y$. Fortunately, $A_y$ is only of dimension q. Once $\bar{f}$ has been computed, $\hat{f}$ and g can be obtained easily because $\hat{f}$ is simply a permuted version of $\bar{f}$ and $g = X^T \hat{f}$ involves no matrix-vector multiplications due to the special form of X.
2) Solution step. In this step, we solve the linear system $Dv = g$ for v. Note that D consists of 2q diagonal blocks: $D_{l1}$ and $D_{l2}$, $l = 1, 2, \ldots, q$. By consistently partitioning the vectors v and g into 2q parts, based on the size of the blocks, the linear system $Dv = g$ can now be expressed as

$$D_{l1} v_{2l-1} = g_{2l-1}, \quad D_{l2} v_{2l} = g_{2l}, \quad l = 1, 2, \ldots, q.$$

Since all of the subsystems are independent, the linear system $Dv = g$ can be solved in parallel, using 2q processors, one for each subsystem.
3) Retrieval step. Once v has been obtained, the solution u to the original system can be retrieved from v as follows. We first compute $\hat{u}$ from v using $\hat{u} = Xv$, which again involves no matrix-vector multiplication. We then obtain $\bar{u}$ by simply permuting $\hat{u}$ using $\bar{u} = P\hat{u}$. The final solution u can now be computed by the matrix multiplication $u = Q\bar{u}$. Note that $X = I_q \otimes X_A$ and $Q = I_p \otimes Q_y$, implying that $\hat{u}$ and u can be obtained in parallel with coarse-grain parallelism.
To end this section, it is worth mentioning that this two-level decomposition
is a similarity transformation and, therefore, all eigenvalues of the original matrix
K are preserved in matrix D. Obtaining eigenvalues from D is far more efficient
than from K.
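Putting the three stages together, a compact sketch of the full procedure might read as follows; it reuses the hypothetical reflexive_split helper from Section 3.2 and assumes the right-hand side is ordered lexicographically with the y index varying fastest, as implied by Equation (3):

import numpy as np

def solve_two_level(Ax, Ay, omega, f):
    # Sketch of the three stages. f is the right-hand side of Equation (3), ordered
    # lexicographically with the y index varying fastest (length p*q).
    p, q = Ax.shape[0], Ay.shape[0]
    lam_y, Qy = np.linalg.eig(Ay)

    # Stage 1: f_bar solves Q f_bar = f with Q = I_p (x) Q_y, i.e. p systems of size q.
    F_bar = np.linalg.solve(Qy, f.reshape(p, q).T).T

    # Stage 2 (permutation) + stage 3 + solution step: the l-th subsystem collects the
    # l-th eigen-component from every x-block, is split reflexively, and solved.
    U_bar = np.zeros_like(F_bar)
    for l in range(q):
        K_hat_l = Ax + (omega ** 2 + lam_y[l]) * np.eye(p)
        Dl1, Dl2, XA = reflexive_split(K_hat_l)
        g = XA.T @ F_bar[:, l]
        m = Dl1.shape[0]
        v = np.concatenate([np.linalg.solve(Dl1, g[:m]),
                            np.linalg.solve(Dl2, g[m:])])
        U_bar[:, l] = XA @ v                       # u_hat for the l-th subsystem

    # Retrieval step: u = Q u_bar, i.e. multiply every x-block by Q_y.
    return (Qy @ U_bar.T).T.reshape(p * q)

With the helmholtz_matrix sketch from Section 2, a consistency check such as np.allclose(K @ solve_two_level(Ax, Ay, omega, f), f) for a random right-hand side f should then hold up to round-off.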
5. Numerical Example
To demonstrate the usefulness of the two-level decomposition scheme, we present a numerical example derived from the Helmholtz equation with $\omega = 3$ over a square domain on $[-1,1] \times [-1,1]$, subject to homogeneous Dirichlet boundary conditions on a grid with $N_x = 5$ and $N_y = 4$. The numerical results
presented in this section were obtained using GNU Octave, which is software
that offers many powerful high-level programming features for scientific
computations, similar to Matlab. It suffices to show the final decomposed
submatrices to illustrate the advantage of this approach. By excluding the grid
points on the boundary and using the lexicographic ordering to number the
internal nodes [3], the matrix K, of dimension 12, yields
$$K = I_4 \otimes A_y + A_x \otimes I_3 + \omega^2 \left( I_4 \otimes I_3 \right).$$
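As a hypothetical cross-check of this example (not reported in the paper), one could verify numerically that the eigenvalues of the 12 x 12 matrix K coincide with those of the 2q decomposed blocks, reusing the helmholtz_matrix and reflexive_split sketches from the earlier sections:

import numpy as np

# Build the 12 x 12 matrix of this example (Nx = 5, Ny = 4, omega = 3) and verify that
# its eigenvalues coincide with those of the 2q decomposed blocks D_l1, D_l2.
omega = 3.0
K, Ax, Ay = helmholtz_matrix(5, 4, omega)
lam_y, _ = np.linalg.eig(Ay)

block_eigs = []
for lam in lam_y:
    Dl1, Dl2, _ = reflexive_split(Ax + (omega ** 2 + lam) * np.eye(Ax.shape[0]))
    block_eigs.extend(np.linalg.eigvals(Dl1))
    block_eigs.extend(np.linalg.eigvals(Dl2))

same = np.allclose(np.sort(np.linalg.eigvals(K).real),
                   np.sort(np.real(block_eigs)))
print(same)   # expected: True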