Journal of Modern Physics, 2018, 9, 1713-1723
http://www.scirp.org/journal/jmp
ISSN Online: 2153-120X
ISSN Print: 2153-1196
Two-Level Block Decompositions for Solving Helmholtz Equation via Chebyshev Pseudo Spectral Method
Hsin-Chu Chen
Department of Computer and Information Science, Clark Atlanta University, Atlanta, GA, USA
How to cite this paper: Chen, H.-C. (2018) Two-Level Block Decompositions for Solving Helmholtz Equation via Chebyshev Pseudo Spectral Method. Journal of Modern Physics, 9, 1713-1723.
https://doi.org/10.4236/jmp.2018.99107
Received: June 14, 2018
Accepted: July 29, 2018
Published: August 1, 2018
Copyright © 2018 by author and
Scientific Research Publishing Inc.
This work is licensed under the Creative
Commons Attribution International
License (CC BY 4.0).
http://creativecommons.org/licenses/by/4.0/
Open Access
Abstract

In this paper, we consider solving the Helmholtz equation $\nabla^2 u + \omega^2 u = f(x, y)$ in the Cartesian domain $\Omega = [-1,1] \otimes [-1,1]$, subject to homogeneous Dirichlet boundary condition, discretized with the Chebyshev pseudo-spectral
method. The main purpose of this paper is to present the formulation of a
two-level decomposition scheme for decoupling the linear system obtained
from the discretization into independent subsystems. This scheme takes ad-
vantage of the homogeneity property of the physical problem along one di-
rection to reduce a 2D problem to several 1D problems via a block diagonali-
zation approach and the reflexivity property along the second direction to
decompose each of the 1D problems into two independent subproblems using a
reflexive decomposition, effectively doubling the number of subproblems.
Based on the special structure of the coefficient matrix of the linear system
derived from the discretization and a reflexivity property of the second-order
Chebyshev differentiation matrix, we show that the decomposed submatrices
exhibit a similar property, enabling the system to be decomposed using reflexive
decompositions. Explicit forms of the decomposed submatrices are derived. The
decomposition not only yields a more efficient algorithm but also introduces
coarse-grain parallelism. Furthermore, it preserves all eigenvalues
the original matrix.
Keywords
Helmholtz Equation, Chebyshev Pseudo-Spectral Method, Chebyshev
Differentiation Matrix, Coarse-Grain Parallelism, Reflexive Matrix
1. Introduction
In this paper, we consider the numerical solution to the Helmholtz equation
$\nabla^2 u + \omega^2 u = f(x, y)$ in the Cartesian domain $\Omega = [-1,1] \otimes [-1,1]$ with homogeneous Dirichlet boundary condition, where $\nabla^2$ is the Laplacian operator, $\omega$ is a real parameter representing the wavenumber, $u$ is the amplitude associated with wave propagation, and $f(x, y)$ is a forcing function. Without loss of generality we assume $u = 0$ on the boundary, i.e.,

$$\nabla^2 u + \omega^2 u = f(x, y), \quad (x, y) \in \Omega \quad \text{and} \quad u = 0 \ \text{on} \ \partial\Omega. \qquad (1)$$
We present a highly parallel numerical algorithm for solving the Helmholtz
equation discretized by the Chebyshev pseudospectral scheme. Chebyshev pseudo-
spectral methods have long been used to numerically solve partial differential
equations [1] [2] [3]. This discretization yields a linear system whose coefficient
matrix has a special block structure that allows for a block diagonalization which,
along with a proper permutation of the decoupled matrix, decouples the original
matrix into several independent submatrices. By exploiting a reflexivity
property that is inherent in the Chebyshev differentiation matrix, it is possible to
further decompose each of the submatrices into two independent submatrices
when the boundary conditions at both ends of the decoupled 1D subproblems
are the same. This second level of decomposition effectively doubles the
number of independent submatrices, yielding a higher degree of coarse-grain
parallelism.
This paper is organized as follows. In Section 2, we present the general
structure of the linear system derived from the Chebyshev collocation method
discretized independently in the x and y directions. In Section 3, we address the
two-level block decomposition. This decomposition was proposed by the author
in [4] for solving Poisson equations with identical grids in either direction. In
this paper, we reformulate the decomposition to allow for the number of grid
points in the x direction to be different from that in the y direction and derive
the explicit forms for the decomposed submatrices, making applications of this
approach more flexible. The computational procedure and a numerical example
to demonstrate its usefulness are presented in Sections 4 and 5, respectively.
2. Linear System from the Chebyshev Collocation Method
In this section, we briefly describe the Chebyshev collocation method for solving
the Helmholtz equation in two dimensions. We assume that the problem is discretized with a tensor product grid $(x_i, y_j)$, $0 \le i \le N_x$, $0 \le j \le N_y$, where $x_i$ and $y_j$ are the Chebyshev points in the x and y directions, respectively. Let $D_x$, indexed from 0 to $N_x$, be the $(N_x + 1) \times (N_x + 1)$ Chebyshev spectral differentiation matrix associated with the x direction. The entries of $D_x$ are given as [1] [2] [3]

$$(D_x)_{00} = \frac{2 N_x^2 + 1}{6}, \quad (D_x)_{N_x N_x} = -\frac{2 N_x^2 + 1}{6}, \quad (D_x)_{jj} = \frac{-x_j}{2 \left( 1 - x_j^2 \right)}, \quad j = 1, 2, \ldots, N_x - 1,$$
$$(D_x)_{ij} = \frac{c_i}{c_j} \frac{(-1)^{i+j}}{x_i - x_j}, \quad i \ne j, \quad i, j = 0, 1, \ldots, N_x,$$

where $c_i = 2$ for $i = 0$ or $N_x$ and $c_i = 1$ otherwise. The Chebyshev spectral differentiation matrix $D_y$ associated with the y direction can be obtained in a similar way.
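For illustration, a minimal NumPy sketch of this construction might look as follows (the function name is ours, the Chebyshev points $x_j = \cos(j\pi/N)$ are assumed, and the diagonal is recovered with the standard negative row-sum trick rather than the closed forms above, which is numerically equivalent):

import numpy as np

def cheb_diff_matrix(N):
    # (N+1) x (N+1) Chebyshev differentiation matrix on the points x_j = cos(j*pi/N).
    # Off-diagonal entries follow the formula quoted above; the diagonal is then
    # fixed so that every row sums to zero, which reproduces the closed forms.
    j = np.arange(N + 1)
    x = np.cos(np.pi * j / N)                      # Chebyshev points, x_0 = 1, x_N = -1
    c = np.ones(N + 1)
    c[0] = c[N] = 2.0                              # c_i = 2 at the two endpoints, 1 otherwise
    C = np.outer(c, 1.0 / c)                       # C[i, j] = c_i / c_j
    S = (-1.0) ** (j[:, None] + j[None, :])        # S[i, j] = (-1)^(i+j)
    D = C * S / (x[:, None] - x[None, :] + np.eye(N + 1))   # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                    # fix the diagonal entries
    return D, x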
Now, let $A_x$ be the stripped matrix of $D_x^2$, obtained by removing the first and last rows and columns of $D_x^2$, and $A_y$ be the stripped matrix of $D_y^2$. We have

$$(A_x)_{ij} = \left( D_x^2 \right)_{ij}, \quad i, j = 1, 2, \ldots, N_x - 1, \qquad (A_y)_{ij} = \left( D_y^2 \right)_{ij}, \quad i, j = 1, 2, \ldots, N_y - 1. \qquad (2)$$
With the grid just described, the Chebyshev collocation method yields, after
removing the boundary nodes whose values are already known from the system,
the following linear system
$$Ku = f, \quad K = I_p \otimes A_y + A_x \otimes I_q + \omega^2 \left( I_p \otimes I_q \right), \qquad (3)$$

where $p = N_x - 1$, $q = N_y - 1$, and $I_k$ represents the identity matrix of dimension k.
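Equations (2) and (3) could then be assembled with Kronecker products as in the following sketch, which reuses the hypothetical cheb_diff_matrix helper above; the wrapper name is ours:

import numpy as np

def helmholtz_matrix(Nx, Ny, omega):
    # Assemble K = I_p (x) A_y + A_x (x) I_q + omega^2 (I_p (x) I_q) of Equation (3),
    # where A_x and A_y are D_x^2 and D_y^2 with their first and last rows and
    # columns removed, as in Equation (2).
    Dx, _ = cheb_diff_matrix(Nx)
    Dy, _ = cheb_diff_matrix(Ny)
    Ax = (Dx @ Dx)[1:-1, 1:-1]                     # p x p, p = Nx - 1
    Ay = (Dy @ Dy)[1:-1, 1:-1]                     # q x q, q = Ny - 1
    p, q = Nx - 1, Ny - 1
    K = (np.kron(np.eye(p), Ay)
         + np.kron(Ax, np.eye(q))
         + omega ** 2 * np.eye(p * q))
    return K, Ax, Ay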
Many solution schemes can be employed to solve Equation (3), including
Gaussian eliminations, Gauss-Seidel iterations, ADI iterations, and matrix dia-
gonalizations [5] [6] [7] [8]. In this paper, the block version of the matrix
diagonalization method will be employed to first decompose the matrix K into a
block diagonal matrix consisting of q diagonal blocks. Each diagonal block will
then be further decomposed into two smaller diagonal sub-blocks using reflexive
decompositions, yielding a total of 2q diagonal blocks. Note that q is the number
of internal grid points along the y direction. This two-step block decomposition
scheme allows for the linear system Equation (3) to be decoupled into 2q linear
subsystems which can then be solved in parallel with coarse-grain parallelism
using multiprocessors or networked computers.
3. Two-Level Block Decompositions
3.1. Level-1 Decomposition: Block Diagonalization
The diagonalization scheme has been used elsewhere [7] [8] [9]. In this section,
however, we address the decomposition of the matrix K into q diagonal blocks
using a different representation. To begin with, let
$Q_y$ be such a matrix that the similarity transformation $Q_y^{-1} A_y Q_y$ diagonalizes $A_y$ into $\Lambda_y$: $Q_y^{-1} A_y Q_y = \Lambda_y$. A natural choice of $Q_y$ is to have its columns formed from the eigenvectors of $A_y$ so that $\Lambda_y$ is a diagonal matrix that consists of the eigenvalues of $A_y$.

Now split the coefficient matrix K in Equation (3) into three parts $K_1$, $K_2$, and $K_3$:

$$K_1 = I_p \otimes A_y, \quad K_2 = A_x \otimes I_q, \quad K_3 = \omega^2 \left( I_p \otimes I_q \right),$$
and let $Q = I_p \otimes Q_y$. We have the transformed matrix

$$\bar{K} = Q^{-1} K Q = \bar{K}_1 + \bar{K}_2 + \bar{K}_3 \qquad (4)$$

with $\bar{K}_1 = Q^{-1} K_1 Q$, $\bar{K}_2 = Q^{-1} K_2 Q$, and $\bar{K}_3 = Q^{-1} K_3 Q$, where $Q^{-1} = I_p \otimes Q_y^{-1}$.
We now show that the transformed matrix $\bar{K}$ can be permuted to yield a block diagonal matrix. First, we observe that both $\bar{K}_1$ and $\bar{K}_3$ are diagonal matrices and $\bar{K}_2$ is a diagonal block matrix since

$$\bar{K}_1 = Q^{-1} K_1 Q = \left( I_p \otimes Q_y^{-1} \right) \left( I_p \otimes A_y \right) \left( I_p \otimes Q_y \right) = I_p \otimes \left( Q_y^{-1} A_y Q_y \right) = I_p \otimes \Lambda_y,$$

$$\bar{K}_3 = Q^{-1} K_3 Q = \left( I_p \otimes Q_y^{-1} \right) \left( \omega^2 \left( I_p \otimes I_q \right) \right) \left( I_p \otimes Q_y \right) = \omega^2 \left( I_p \otimes I_q \right),$$

and

$$\bar{K}_2 = Q^{-1} K_2 Q = \left( I_p \otimes Q_y^{-1} \right) \left( A_x \otimes I_q \right) \left( I_p \otimes Q_y \right) = A_x \otimes I_q.$$

Accordingly,

$$\bar{K} = Q^{-1} K Q = I_p \otimes \Lambda_y + A_x \otimes I_q + \omega^2 \left( I_p \otimes I_q \right). \qquad (5)$$
Let $a_{ij}$ be the $(i,j)$th element of $A_x$, $i, j = 1, 2, \ldots, p$. In its matrix form, $\bar{K}$ can be expressed as

$$\bar{K} = \begin{bmatrix} a_{11} I + \Lambda_y + \omega^2 I & a_{12} I & \cdots & a_{1p} I \\ a_{21} I & a_{22} I + \Lambda_y + \omega^2 I & \cdots & a_{2p} I \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} I & a_{p2} I & \cdots & a_{pp} I + \Lambda_y + \omega^2 I \end{bmatrix},$$
where I is of dimension q. Apparently, $\bar{K}$ is also a diagonal block matrix because both I and $\Lambda_y$ are diagonal matrices. The diagonal block matrix $\bar{K}$ can be rearranged to yield a block diagonal matrix $\hat{K}$:

$$\hat{K} = \begin{bmatrix} \hat{K}_1 & 0 & \cdots & 0 \\ 0 & \hat{K}_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \hat{K}_q \end{bmatrix}$$
via appropriate permutations of rows and columns as shown below. Let $K_{i,j}$ be the $(i,j)$th block of $\bar{K}$ and P be such a permutation matrix that the transformation $P^T \bar{K} P$ rearranges the entries of $\bar{K}$ in such a way that $\hat{K}_l(i,j) = K_{i,j}(l)$, $i, j = 1, 2, \ldots, p$, where $\hat{K}_l(i,j)$ is the $(i,j)$th element of $\hat{K}_l$ and $K_{i,j}(l)$ is the $l$th diagonal element of $K_{i,j}$. Then we have

$$P^T \bar{K} P = \hat{K} = \hat{K}_1 \oplus \hat{K}_2 \oplus \cdots \oplus \hat{K}_q, \qquad (6)$$

where

$$\hat{K}_l = A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p, \quad l = 1, 2, \ldots, q,$$
with $\lambda_y^{(l)}$ being the $l$th eigenvalue of $A_y$. Accordingly, $\hat{K} = P^T \bar{K} P = P^T Q^{-1} K Q P$ since $\bar{K} = Q^{-1} K Q$. In other words, the matrix K has been transformed to a block diagonal matrix consisting of q blocks. This decomposition takes advantage of the homogeneity property of the problem with the boundary conditions mentioned in Equation (1) along the y direction to decouple the 2D problem into q independent 1D problems. A similar decomposition can be performed by using the homogeneity property of the problem along the x direction.
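As an illustrative sketch (not the author's code), the level-1 decomposition amounts to one eigen-decomposition of $A_y$ followed by forming the q blocks of Equation (6); the function name below is ours:

import numpy as np

def level1_blocks(Ax, Ay, omega):
    # Level-1 block diagonalization: eigen-decompose A_y (its eigenvalues are real
    # for this problem) and form the q decoupled 1D blocks of Equation (6),
    # K_hat_l = A_x + (omega^2 + lambda_y^(l)) I_p.
    p = Ax.shape[0]
    lam_y, Qy = np.linalg.eig(Ay)                  # columns of Qy are eigenvectors of A_y
    K_hat = [Ax + (omega ** 2 + lam) * np.eye(p) for lam in lam_y]
    return K_hat, lam_y, Qy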
3.2. Level-2 Decomposition: Reflexive Decomposition
As seen from the previous section, the matrix K can be transformed to a block diagonal matrix, implying that the original problem can be decomposed into independent subproblems which can then be solved in parallel using multiprocessors. Although each diagonal block $\hat{K}_l$ in Equation (6) can be further decomposed into a diagonal matrix using a single transformation matrix $Q_x$ that consists of the eigenvectors of $A_x$, this diagonalization yields little advantage in general, especially in a multiprocessing environment, unless the eigenpairs are readily available, which is not the case for the matrix $A_x$.

In this section, we shall employ the reflexivity property of the diagonal blocks $\hat{K}_l$ to further decompose each of them into two independent sub-blocks using another decomposition technique called the reflexive decomposition [10].
Let $J_n$ be the cross identity matrix of dimension n, i.e., the matrix with ones on the cross (anti-) diagonal and zeros elsewhere. It is trivial to see that $J_n J_n = I_n$. Now, we note that the Chebyshev differentiation matrix $D_x$ is anti-reflexive and $D_x^2$ is reflexive, with respect to $J_{N_x+1}$. In other words,

$$J_{N_x+1} D_x J_{N_x+1} = -D_x \quad \text{and} \quad J_{N_x+1} D_x^2 J_{N_x+1} = D_x^2.$$

Accordingly, the $p \times p$ matrix $A_x$, which is the stripped matrix of $D_x^2$ (Equation (2)), is reflexive with respect to $J_p$. In fact, it is a centro-symmetric matrix [11] [12]. Recall that $\hat{K}_l = A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p$. Therefore, $A_x$ and $\hat{K}_l$ satisfy the reflexivity property

$$\hat{K}_l = J_p \hat{K}_l J_p, \quad l = 1, 2, \ldots, q,$$
indicating that each $\hat{K}_l$ can be decomposed via reflexive decompositions into two block diagonal submatrices of almost equal size, depending on whether p is even or odd. The decompositions can be done through orthogonal transformations which have been shown in [10]. Here we just present the results.

First consider the case when p is even. Assume $p = 2k$ for some $k > 0$ and $A_x$ is evenly partitioned into $2 \times 2$ sub-blocks:

$$A_x = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}.$$

Taking advantage of the reflexivity property of $\hat{K}_l$ by applying the orthogonal transformation $X_A^T \hat{K}_l X_A$ to $\hat{K}_l$, where

$$X_A = \frac{1}{\sqrt{2}} \begin{bmatrix} I_k & J_k \\ J_k & -I_k \end{bmatrix},$$

one can easily decompose $\hat{K}_l$ into two decoupled diagonal subblocks of equal size:
$$D_l = X_A^T \hat{K}_l X_A = X_A^T \left( A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p \right) X_A = X_A^T A_x X_A + \left( \omega^2 + \lambda_y^{(l)} \right) I_p = \begin{bmatrix} D_{l1} & 0 \\ 0 & D_{l2} \end{bmatrix},$$

where

$$D_{l1} = A_{11} + A_{12} J_k + \left( \omega^2 + \lambda_y^{(l)} \right) I_k \quad \text{and} \quad D_{l2} = A_{22} - A_{21} J_k + \left( \omega^2 + \lambda_y^{(l)} \right) I_k. \qquad (7)$$

In the case when p is odd, say $p = 2k + 1$, $k > 0$, we consistently partition $A_x$ and J as

$$A_x = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & a_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}, \quad J = \begin{bmatrix} & & J_k \\ & 1 & \\ J_k & & \end{bmatrix}.$$
By taking the transformation matrix $X_A$ to be

$$X_A = \frac{1}{\sqrt{2}} \begin{bmatrix} I_k & 0 & J_k \\ 0 & \sqrt{2} & 0 \\ J_k & 0 & -I_k \end{bmatrix},$$

the matrix $\hat{K}_l$ can be decoupled via the orthogonal transformation $X_A^T \hat{K}_l X_A$ as

$$D_l = X_A^T \hat{K}_l X_A = X_A^T \left( A_x + \left( \omega^2 + \lambda_y^{(l)} \right) I_p \right) X_A = X_A^T A_x X_A + \left( \omega^2 + \lambda_y^{(l)} \right) I_p = \begin{bmatrix} D_{l1} & 0 \\ 0 & D_{l2} \end{bmatrix},$$

where

$$D_{l1} = \begin{bmatrix} A_{11} + A_{13} J_k & \sqrt{2}\, A_{12} \\ \sqrt{2}\, A_{21} & a_{22} \end{bmatrix} + \left( \omega^2 + \lambda_y^{(l)} \right) I_{k+1}$$

and

$$D_{l2} = \left( A_{33} - A_{31} J_k \right) + \left( \omega^2 + \lambda_y^{(l)} \right) I_k.$$
In either case, let $X = I_q \otimes X_A = X_A \oplus X_A \oplus \cdots \oplus X_A$. The orthogonal transformation $X^T \hat{K} X$ yields

$$X^T \hat{K} X = X_A^T \hat{K}_1 X_A \oplus X_A^T \hat{K}_2 X_A \oplus \cdots \oplus X_A^T \hat{K}_q X_A = D_1 \oplus D_2 \oplus \cdots \oplus D_q = \left( D_{11} \oplus D_{12} \right) \oplus \left( D_{21} \oplus D_{22} \right) \oplus \cdots \oplus \left( D_{q1} \oplus D_{q2} \right),$$

which obviously consists of 2q independent diagonal blocks since each $D_l$ has two independent diagonal blocks.
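The level-2 step can be sketched as follows for a single block $\hat{K}_l$, using the even/odd forms of $X_A$ given above; the helper name is ours, and in floating point the off-diagonal blocks of $X_A^T \hat{K}_l X_A$ vanish only up to round-off:

import numpy as np

def reflexive_split(K_hat_l):
    # Split one centro-symmetric block K_hat_l (p x p) into the two decoupled
    # blocks D_l1 and D_l2 via the orthogonal matrix X_A described above.
    p = K_hat_l.shape[0]
    k = p // 2
    J = np.fliplr(np.eye(k))                       # cross identity J_k
    if p % 2 == 0:                                 # even p: X_A = (1/sqrt(2)) [[I, J], [J, -I]]
        XA = np.block([[np.eye(k), J], [J, -np.eye(k)]]) / np.sqrt(2)
    else:                                          # odd p: extra middle row and column
        XA = np.zeros((p, p))
        XA[:k, :k] = np.eye(k)
        XA[:k, k + 1:] = J
        XA[k, k] = np.sqrt(2)
        XA[k + 1:, :k] = J
        XA[k + 1:, k + 1:] = -np.eye(k)
        XA /= np.sqrt(2)
    D = XA.T @ K_hat_l @ XA                        # block diagonal up to round-off
    m = k + (p % 2)                                # D_l1 is k x k (even p) or (k+1) x (k+1) (odd p)
    return D[:m, :m], D[m:, m:], XA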
4. Computational Procedure
The numerical solution to the Helmholtz Equation (1) discretized by the
Chebyshev pseudospectral method yields the linear system
$$Ku = f, \quad K = I_p \otimes A_y + A_x \otimes I_q + \omega^2 \left( I_p \otimes I_q \right),$$

where $I_p$ is the identity matrix of dimension p, $A_y$ is a $q \times q$, $A_x$ a $p \times p$, and K a $pq \times pq$ matrix. Here p (q) is the number of internal grid points in the
x (y) direction. The two-level decomposition scheme described in this paper
yields the following transformed linear system consisting of 2q independent
subsystems, each of dimension $\frac{p}{2}$ when p is even:

$$Dv = g, \quad D = \left( D_{11} \oplus D_{12} \right) \oplus \left( D_{21} \oplus D_{22} \right) \oplus \cdots \oplus \left( D_{q1} \oplus D_{q2} \right).$$
The sizes of the sub-blocks in D differ at most by 1 when p is an odd number.
Computationally, this decomposition is very attractive. Not only can the
decoupled subsystems be solved much more efficiently, they can also be solved
in parallel with coarse-grain parallelism using a computer system with multi-
processors. This two-level decomposition consists of the following three stages.
1) Transform $Ku = f$ to $\bar{K}\bar{u} = \bar{f}$, where $\bar{K} = Q^{-1} K Q$, $\bar{u} = Q^{-1} u$, and $\bar{f} = Q^{-1} f$.

This is a typical similarity transformation that is achieved by premultiplying both sides of $Ku = f$ with $Q^{-1}$ and inserting $Q Q^{-1}$, which is simply an identity matrix, between K and u. The matrix Q has been defined in Section 3.1 and $\bar{K}$ can be found in Equation (5).
2) Transform $\bar{K}\bar{u} = \bar{f}$ to $\hat{K}\hat{u} = \hat{f}$, where $\hat{K} = P^T \bar{K} P$, $\hat{u} = P^T \bar{u}$, and $\hat{f} = P^T \bar{f}$.

The main purpose of this transformation is to reorder the unknowns of the linear system so that the decoupled coefficient matrix $\bar{K}$ is converted to a block diagonal matrix ($\hat{K}$) for the sake of computational convenience. The matrix P, which is just a permutation matrix, has been explained in Section 3.1 and $\hat{K}$ can be found in Equation (6).
3) Transform $\hat{K}\hat{u} = \hat{f}$ to $Dv = g$, where $D = X^T \hat{K} X$, $v = X^T \hat{u}$, and $g = X^T \hat{f}$.

This transformation is analogous to that in stage 1 except that X here is an orthogonal matrix. Therefore we replace $X^{-1}$ by $X^T$. The matrices X and D can be found in Section 3.2.
The advantage of this approach is that the decomposed form of D is explicitly
known and, therefore, the actual decompositions from K to $\bar{K}$, from $\bar{K}$ to $\hat{K}$, and from $\hat{K}$ to D are not necessary. Furthermore, the diagonal blocks of D
can be obtained without any matrix-matrix multiplications. The computational
procedure of this two-level decomposition boils down to the following three
typical steps.
1) Decomposition step. In this step, we shall compute D and g. Since each
diagonal block of D,

$$D = \left( D_{11} \oplus D_{12} \right) \oplus \left( D_{21} \oplus D_{22} \right) \oplus \cdots \oplus \left( D_{q1} \oplus D_{q2} \right),$$

is independent and explicitly known, D can be obtained in parallel using 2q processors, without any difficulty. The main task here is, thus, to obtain g. This
can be achieved by transforming f to $\bar{f}$ first, which involves solving the linear system $Q\bar{f} = f$. It is essential to note that Q, $Q = I_p \otimes Q_y$, is a block diagonal matrix and, therefore, $\bar{f}$ can be obtained by solving p independent linear subsystems, each of size q, instead of solving the linear system as a whole, which is of size pq. This step can be performed in parallel with large granularity. The matrix $Q_y$, consisting of the eigenvectors of $A_y$, is in general not known explicitly and, thus, has to be computed from $A_y$. Fortunately, $A_y$ is only of dimension q. Once $\bar{f}$ has been computed, $\hat{f}$ and g can be obtained easily because $\hat{f}$ is simply a permuted version of $\bar{f}$ and $g = X^T \hat{f}$ involves no matrix-vector multiplications due to the special form of X.
2) Solution step. In this step, we solve the linear system $Dv = g$ for v. Note that D consists of 2q diagonal blocks: $D_{l1}$ and $D_{l2}$, $l = 1, 2, \ldots, q$. By consistently partitioning the vectors v and g into 2q parts, based on the size of the blocks, the linear system $Dv = g$ can now be expressed as

$$D_{l1} v_{2l-1} = g_{2l-1}, \quad D_{l2} v_{2l} = g_{2l}, \quad l = 1, 2, \ldots, q.$$

Since all of the subsystems are independent, the linear system $Dv = g$ can be solved in parallel, using 2q processors, one for each subsystem.
3) Retrieval step. Once v has been obtained, the solution u to the original system can be retrieved from v as follows. We first compute $\hat{u}$ from v using $\hat{u} = Xv$, which again involves no matrix-vector multiplication. We then obtain $\bar{u}$ by simply permuting $\hat{u}$ using $\bar{u} = P\hat{u}$. The final solution u can now be computed by the matrix multiplication $u = Q\bar{u}$. Note that $X = I_q \otimes X_A$ and $Q = I_p \otimes Q_y$, implying that $\hat{u}$ and u can be obtained in parallel with coarse-grain parallelism.
To end this section, it is worth mentioning that this two-level decomposition
is a similarity transformation and, therefore, all eigenvalues of the original matrix
K are preserved in matrix D. Obtaining eigenvalues from D is far more efficient
than from K.
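Putting the three stages together, a compact sketch of the full procedure might read as follows; it reuses the hypothetical reflexive_split helper from Section 3.2 and assumes the right-hand side is ordered lexicographically with the y index varying fastest, as implied by Equation (3):

import numpy as np

def solve_two_level(Ax, Ay, omega, f):
    # Sketch of the three stages. f is the right-hand side of Equation (3), ordered
    # lexicographically with the y index varying fastest (length p*q).
    p, q = Ax.shape[0], Ay.shape[0]
    lam_y, Qy = np.linalg.eig(Ay)

    # Stage 1: f_bar solves Q f_bar = f with Q = I_p (x) Q_y, i.e. p systems of size q.
    F_bar = np.linalg.solve(Qy, f.reshape(p, q).T).T

    # Stage 2 (permutation) + stage 3 + solution step: the l-th subsystem collects the
    # l-th eigen-component from every x-block, is split reflexively, and solved.
    U_bar = np.zeros_like(F_bar)
    for l in range(q):
        K_hat_l = Ax + (omega ** 2 + lam_y[l]) * np.eye(p)
        Dl1, Dl2, XA = reflexive_split(K_hat_l)
        g = XA.T @ F_bar[:, l]
        m = Dl1.shape[0]
        v = np.concatenate([np.linalg.solve(Dl1, g[:m]),
                            np.linalg.solve(Dl2, g[m:])])
        U_bar[:, l] = XA @ v                       # u_hat for the l-th subsystem

    # Retrieval step: u = Q u_bar, i.e. multiply every x-block by Q_y.
    return (Qy @ U_bar.T).T.reshape(p * q)

With the helmholtz_matrix sketch from Section 2, a consistency check such as np.allclose(K @ solve_two_level(Ax, Ay, omega, f), f) for a random right-hand side f should then hold up to round-off.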
5. Numerical Example
To demonstrate the usefulness of the two-level decomposition scheme, we present a numerical example derived from the Helmholtz equation with $\omega = 3$ over a square domain on $[-1,1] \times [-1,1]$, subject to homogeneous Dirichlet boundary conditions on a grid with $N_x = 5$ and $N_y = 4$. The numerical results
presented in this section were obtained using GNU Octave, which is software
that offers many powerful high-level programming features for scientific
computations, similar to Matlab. It suffices to show the final decomposed
submatrices to illustrate the advantage of this approach. By excluding the grid
points on the boundary and using the lexicographic ordering to number the
internal nodes [3], the matrix K, of dimension 12, yields
$$K = I_4 \otimes A_y + A_x \otimes I_3 + \omega^2 \left( I_4 \otimes I_3 \right).$$
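As a hypothetical cross-check of this example (not reported in the paper), one could verify numerically that the eigenvalues of the 12 x 12 matrix K coincide with those of the 2q decomposed blocks, reusing the helmholtz_matrix and reflexive_split sketches from the earlier sections:

import numpy as np

# Build the 12 x 12 matrix of this example (Nx = 5, Ny = 4, omega = 3) and verify that
# its eigenvalues coincide with those of the 2q decomposed blocks D_l1, D_l2.
omega = 3.0
K, Ax, Ay = helmholtz_matrix(5, 4, omega)
lam_y, _ = np.linalg.eig(Ay)

block_eigs = []
for lam in lam_y:
    Dl1, Dl2, _ = reflexive_split(Ax + (omega ** 2 + lam) * np.eye(Ax.shape[0]))
    block_eigs.extend(np.linalg.eigvals(Dl1))
    block_eigs.extend(np.linalg.eigvals(Dl2))

same = np.allclose(np.sort(np.linalg.eigvals(K).real),
                   np.sort(np.real(block_eigs)))
print(same)   # expected: True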