IEEE  TRANSACTIONS  ON  COMPUTER-AIDED  DESIGN.  VOL. 
IO,  NO.  I I .   NOVEMBER  1991 
1441 
FastCap: A Multipole Accelerated 3-D Capacitance 
Extraction Program 
Keith Nabors and Jacob White, Associate Member, IEEE 
Abstract-In 
this paper  a  fast  algorithm  for  computing the 
capacitance of a complicated 3-D geometry of ideal conductors 
in a uniform dielectric is described and its performance in the 
capacitance extractor  FastCap  is examined.  The algorithm  is 
an acceleration of the boundary-element technique for solving 
the  integral  equation  associated  with  the  multiconductor  ca- 
pacitance extraction problem.  Boundary-element methods be- 
come slow when a large number of elements are used because 
they lead to dense matrix problems, which are typically solved 
with some form of Gaussian elimination.  This implies that the 
computation grows as n3, where n is the  number of panels or 
tiles  needed  to  accurately  discretize  the  conductor  surface 
charges.  In  this  paper we present a generalized conjugate  re- 
sidual  iterative  algorithm  with  a  multipole  approximation  to 
compute the iterates.  This combination reduces the complexity 
so that accurate multiconductor capacitance calculations grow 
nearly  as nm, where  m is the  number of  conductors.  Perfor- 
mance comparisons on integrated circuit bus crossing problems 
show that for problems with as few as 12 conductors the mul- 
tipole accelerated boundary element method can be nearly 500 
times faster than  Gaussian elimination  based algorithms,  and 
five  to  ten  times  faster  than  the  iterative  method  alone,  de- 
pending on required accuracy. 
If” 
I.  INTRODUCTION 
the design of high-performance  integrated circuits and 
ntegrated circuit packaging,  there are many cases where 
accurate  estimates  of  the  capacitances  of  complicated 
three-dimensional  structures  are  important  for determin- 
ing final circuit  speeds or functionality. Two examples of 
complicated  three-dimensional  structures  for  which  ca- 
pacitance strongly affects performance are dynamic mem- 
ory cells and the chip carriers commonly used in high den- 
sity packaging.  In these problems, capacitance extraction 
is  made  tractable  by  assuming  the  conductors  are  ideal 
and are embedded in  a piecewise-constant  dielectric  me- 
dium. Then to compute the capacitances, Laplace’s  equa- 
tion  is  solved  numerically  over  the  charge-free  region, 
with the conductors  providing  boundary  conditions. 
Although there are a variety  of numerical  methods  that 
can be used to solve Laplace’s equation, for three-dimen- 
Manuscript  received  May  29,  1990. This  work  was  supported  by  the 
Defense Advanced  Research Projects Agency  under Contract  N00014-87- 
K-825,  by  the  National  Science  Foundation,  and  by  grants from  I.B.M. 
and Analog Devices. This paper was recommended by Associate Editor A. 
E. Ruehli. 
The  authors  are  with  the  Research  Laboratory  of  Electronics,  Depart- 
ment of  Electrical  Engineering  and  Computer Science, Massachusetts  In- 
stitute of Technology, Cambridge, MA, 02139. 
IEEE Log Number 9100300. 
sional  capacitance calculations  the  usual  approach  is  to 
apply  a boundary-element  technique to the  integral  form 
of  Laplace’s  equation  [ I l l ,   [12], 
[14].  In  these  ap- 
proaches  the  surfaces  or edges of  all  the  conductors  are 
broken into small panels or tiles and it is assumed  that on 
each panel i, a charge,  q i ,  is uniformly  or piecewise  lin- 
early distributed. The potential on each panel is then com- 
puted by  summing the contributions  to the potential  from 
all the panels using Laplace’s equation Green’s functions. 
In this way, a matrix of potential  coefficients, P ,  relating 
the set of n panel potentials  and the set of n panel charges 
is constructed. The resulting  n  X  n  system  of  equations 
must  be  solved  to  compute  capacitances.  Typically, 
Gaussian  elimination  or Cholesky  factorization  is used to 
solve the  system  of equations,  in which  case the  number 
of operations  is order n3. Clearly, this approach  becomes 
computationally  intractable  if  the  number  of  panels  ex- 
ceeds several hundred, and this limits the size of the prob- 
lem that  can be analyzed  to one with  a few conductors. 
An  approach  to reducing  the  computation  time  that  is 
particularly  effective for computing the diagonal terms of 
the capacitance matrix, also referred  to as the self-capac- 
itances,  is  to  ignore  small contributions  to  the  potential 
coefficient  matrix  from  pairs  of  panels  which  are  sepa- 
rated  by  large  distances  [l]. In  this  paper  we  present  a 
similar  approach,  which  approximates  small  potential 
coefficients  with multipole  expansions. We show that this 
approach  produces  an  algorithm  which  accurately  com- 
putes both the self  and coupling capacitances, and has  a 
computational  complexity  of  nearly  mn, where m  is  the 
number of conductors. Our algorithm, which is really the 
pasting  together  of  three  well-known  algorithms  [ 131, is 
presented  in three  sections. To begin,  in the next  section 
one of the standard integral equation approaches is briefly 
described, and it is shown that the algorithm  requires  the 
solution of an n  x n dense nearly symmetric matrix. Then, 
in Section 111, a generalized  conjugate residual  algorithm 
is described  and is shown to reduce the complexity  of the 
calculation  to  roughly  order  mn’.  In  Section  IV,  it  is 
shown that the major computation  of the conjugate resid- 
ual algorithm,  evaluation of a potential field from a charge 
distribution, can be computed in order n time using a mul- 
tipole algorithm. In  Section  V, we describe some exper- 
imental  results  and  in  Section  VI  we present  our conclu- 
sions. Finally, some implementation  details are presented 
in an appendix. 
0278-0070/91/1100-1447$01.00 0 1991 IEEE 
1448 
IEEE  TRANSACTIONS  ON  COMPUTER-AIDED  DESIGN.  VOL.  IO. NO.  I I ,   NOVEMBER  1991 
11.  THE INTEGRAL EQUATION APPROACH 
Consider a system  of m  ideal  conductors embedded  in 
a uniform  lossless  dielectric  medium. For such a system, 
the relation  between  the m conductor potentials,  denoted 
by  fi  E  91m, and  thg  m  total  charges  on  each  conductor, 
denoted by 4 E 91m, is given by 9 = Cb, where  C E !Rm x m  
is  referred  to  as the  capacitance matrix.  Thejth column 
of  C can be calculated  by  solving  for the total charges on 
each  of  the  conductors when  the j t h  conductor is at  unit 
potential and all the other conductors are at zero potential. 
Then the charge on conductor i, q, , is equal to  C,, . 
To  find  the  conductor  charge  distributions  given  the 
conductor potentials, it is necessary to solve the first-kind 
integral equation 
$ ( x )   = 
G(x, x ’ ) u ( x ’ )  du’ 
(1) 
for the surface charge density  U ,  where x ,  x ’  E 913 and are 
positions in three-dimensional  space, du’  is the incremen- 
tal  surface area, rl. is the  surface potential  and  is known, 
and  G(x, x ’ )  is  the  Green’s  function,  which  is  l/llx  - 
x ’ 11  in free space. I  Here,  (Ix - x ’ 11  denotes the Euclidean 
distance  between x and x ’. Given the surface charge den- 
sity  (T,  the total  charge on the  ith conductor,  Q, , can  be 
computed  from 
i 
surfaces 
Q, = 3 
P 
I  th conductor’s Furface 
u ( x ’ )  du‘. 
(2) 
There are a variety of approaches for numerically  com- 
puting  the  conductor  surface  charge  density  given  the 
conductor  potentials, some of  which  involve  reformulat- 
ing  (1)  as a partial  differential  equation  and  using  finite 
difference  methods  in  three  space dimensions  [4], [16]. 
We will  focus on  the boundary-element  methods  applied 
directly to solving (1)  [ 111,  [ 121, [ 141 as they have proved 
to be efficient and accurate when applied to problems with 
ideal  conductors  in  a  uniform dielectric  medium.  These 
methods  are also referred  to as panel  methods  [6] or the 
method of moments [5] in other application domains.  This 
class  of  method  exploits  the  fact  that  the  charge  is  re- 
stricted  to the  surface of the  conductors, and rather  than 
discretizing  all  of  free space, just  the  surface charge  on 
the conductors is discretized.  The surface potential,  which 
is  known,  is  related  to  the  discretized  surface  charge 
through  integrals  of  Green’s  functions.  The  system  thus 
constructed  can then be solved for the discretized  surface 
charge. 
The simplest  commonly  used  approach  to constructing 
a  system  of  equations  that  can  be  solved  for the  discre- 
tized  surface charge is the  “point-matching”  or colloca- 
tion method. In this method, the surfaces of m conductors 
in free space are discretized  into a total  of n two-dimen- 
sional panels (see, for example, Fig. 5(b)). For each panel 
k ,  an  equation  is  written  that  relates  the  potential  at  the 
‘Note that  the  scale  factor  1 /47rc0  can be  ignored here and reintroduced 
later to give the results in units of  farads. 
center of that kth panel  to the sum of the contributions  to 
that potential from the charge distribution  on all n panels. 
That is, 
( 3 )  
where xk is the center of panel  k, x ’ is the position  on the 
surface of panel 1, pk is the potential  at the center of panel 
k, and U /  (x ’) is the surface charge density on the 1 th panel. 
The integral in ( 3 )  is the free-space Green’s function mul- 
tiplied by  the  charge density  and integrated  over the  sur- 
face of  the  Eth  panel.  Note  that  as the  distance between 
panel k and panel 1 becomes large compared with the sur- 
face area of panel I, the integral  reduces to q l /  llxl  - xkll, 
where xI  is the  center of  the  lth panel  and q/ is the total 
charge on panel  1. 
In a first-order  collocation  method  (higher order meth- 
ods are rarely  used), it is assumed  that the surface charge 
density on the panel  is constant [ 121. In that  case ( 3 )  can 
be  simplified  to 
1 
da I ,  
(4) 
where  a/ is  the  surface area of  panel  1.  When  applied  to 
the collection  of n panels, a dense linear system  results, 
p q  = p ,  
where P  E 91n  ‘; q, p  E V; and 
Pk,  = - 
1 
da I .  
(5) 
(6) 
Note  that  q  and p  are  the  vectors  of panel  charges  and 
potentials  rather  than  the conductor charge and potential 
vectors, q and fi, mentioned  above. The dense linear sys- 
tem  of  (5)  can  be  solved,  typically  by  some  form  of 
Gaussian  elimination,  to  compute  panel  charges  from  a 
given set of panel potentials. To compute thejth column 
of the capacitance matrix, (5) must be solved for q, given 
a p  vector where Pk  = 1 if panel k is on t h e j  th conductor, 
and Pk  = 0 otherwise. Then the  ijth term  of  the  capaci- 
tance  matrix  is  computed  by  summing  all  the  panel 
charges on the ith conductor, that is, 
c,, = 
q k .  
keconductor, 
(7) 
111.  SOLUTION  BY  THE  GENERALIZED CONJUGATE 
RESIDUAL METHOD 
In  order  to  solve for  a  complete m  X  m  capacitance 
matrix, the n  x n matrix of potential coefficients, P ,  must 
be  factored  once, usually  into  P  = LU, where  L  and  U 
are strictly  lower and  upper triangular  respectively,  and 
this  requires  order  n3 operations.  Then,  as  there  are  m 
conductors, the factored  system must solved m times with 
m different  right-hand  sides, and this  requites order mn2 
NABORS  AND  WHITE:  FASTCAP 
1449 
operations.  Since  n  is  the  total  number  of  panels  into 
which  the conductor surfaces are discretized, m is neces- 
sarily much smaller than n. Therefore, the n3 time for fac- 
torization  dominates  for large problems, but factorization 
can be avoided by  using  iterative  methods  to solve the m 
charge distribution  problems. 
From the definition of P given by  (6), it is clear that  P 
is a positive nonsymmetric  matrix and that the largest ele- 
ment  in  each  row  is the  diagonal,  although  the  matrix  is 
not  diagonally  dominant.  Therefore,  conjugate-descent 
methods,  such  as  the  generalized  conjugate  residual 
(GCR) algorithm  [15] given  below  in  Algorithm  1,  are 
likely  to be more effective than the more familiar Gauss- 
Seidel or Gauss-Jacobi  style algorithms. 
Algorithm  1: GCR Algorithm for Solving  Pq  = p 
/* Setup. Note  that uifer's are search directions  and  */ 
/* w is the residual.  */ 
w = p ; q  = o .  
/* GCR Loop.  */ 
For iter  = 0, 1,  2, 
*  until  converged  { 
U l f e r   = w .  
pu"er = Pw, 
/* P-orthogonalize  pufter with  respect to pu".  */ 
Form  = 0 to iter { 
p  = p u " e r T p u m .  
U i f e r   = Ulfer  - p u m .  
puiter  = pu iter  - Dpu". 
3 
/* Normalize the direction.  */ 
pu Iter  = pu [ f e r /  11 pu lferll . 
ifer  = 
/* Update the charge and the residual. */ 
q  = q  + CYutter. 
Ly  = W T P U r f e r .  
w  = w  + (YpUrfer. 
ifer/  1) pu iter11 . 
I 
IV.  ACCELERATING 
THE  MATRIX-VECTOR PRODUCT 
As can be seen from examining  Algorithm  1 ,  assuming 
the number of iterations  required is small, the major costs 
of the GCR algorithm  are initially  forming the dense ma- 
trix  P  and  in  each iteration  computing the  matrix-vector 
product  Pw, both  of  which  require  order n2 operations. 
Computing  the  capacitance  matrix  with  Algorithm  1  is 
therefore at least order mn2 and may be higher if the num- 
ber  of  GCR  iterations  increases  with  the  problem  size. 
Note  that  if  the  number  of  panels  per  conductor is  low, 
Algorithm  1 may  not  be  much  more  efficient  than  using 
direct  factorization. 
An  approach  that  avoids  forming  most  of  P ,  and  re- 
duces  the  cost  of  computing  the  matrix-vector  product 
Pw, can be derived by  recalling  that  if w is thought  of as 
representing  charges  distributed  on  panels,  then  Pw  is  a 
potential  arising  from  that  charge  distribution.  In  addi- 
tion,  if  the  distance between  the  centers of  panel  i  and 
panelj is large compared  with the panel  sizes, then P,, = 
n, chargepoints 
\ 
*
*
 
Fig.  1 .   The evaluation point  potentials  are approximated  with  a  multipole 
expansion. 
l / l l x ,   - x, 11.  That is, for widely separated panels, thejth 
panel  charge has the  same effect  on  the  potential  at x ,  as 
would  a point charge of value w, located at panel j ' s  cen- 
ter. 
To  see  how  these  observations  can  help  simplify  the 
computation  of Pw, consider the situation  (depicted  in 2- 
D  for simplicity)  in  Figs.  1  and  2.  In  either  figure,  the 
obvious  approach  to  determining  the  potential  at  the  n I  
evaluation  points  from  the n2 point  charges  involves  n1 * 
n2 operations;  at each of the n 1  evaluation  points one sim- 
ply sums the contribution  to the potential from n2 charges. 
An accurate approximation  for the potentials  for the case 
of Fig.  1  can be computed in many fewer operations using 
multipole  expansions, which  exploit the fact that  r  >> R 
(defined  in Fig.  1). That is, the details of the distribution 
of the charges  in  the inner circle of  radius R in Fig.  1 do 
not  strongly  affect  the  potentials  at the  evaluation  points 
outside  the  outer circle of  radius  r. It  is also possible  to 
compute  an  accurate approximation  for the  potentials  at 
the evaluation  points  in the inner circle of  Fig. 2 in many 
fewer  than  n l   *  n2  operations  using  local  expansions, 
which  again exploit  the  fact  that  r  >>  R  (as in  Fig.  2). 
In this second case, what can be ignored are the details of 
the evaluation  point distribution. 
A.  Multipole  Expansions 
A  rough  approximation  to the  effect  of  the  n2 charges 
in the  inner circle  in  Fig.  1 can  be derived  by  replacing 
those  charges  with  a  single  charge  equal  to  their  sum, 
placed at the inner circle's  center (see Fig.  3). The num- 
ber of  operations  to  compute the  n l  potentials  given  this 
simplification  is  then  n2 + n I ,  n2 operations to  compute 
the sum of charges, and n l  operations to calculate the po- 
tentials at the evaluation  points. Note that the accuracy of 
this approximation  improves as the separation, r, between 
the  nearest  evaluation  point  and  the  center  of  the  inner 
circle  containing  the  charges increases  relative  to the  in- 
ner circle's  radius. 
1450 
IEEE  TRANSACTIONS  ON  COMPUTER-AIDED DESIGN,  VOL.  10,  NO.  11,  NOVEMBER  1991 
are  called  spherical  harmonics  [2],  [8]  and  the  M:  are 
complex weights known as the multipole coefficients. The 
coefficients  are related to the  charges  by 
M ;   = c S,P:'  y i r n  (CY, 3  Or), 
n2 
r = l  
(10) 
where  p, , CY,, and p, are the  spherical  coordinates  of  the 
ith  charge  relative  to  the  sphere's  center.  It  has  been 
shown  that  the  truncated  multipole  expansion  error  is 
bounded by 
' 
l$(r,,  e,,  4,)  - Zo ,, 
1' 
-+ y;ce,> @/)I 
Mln 
Fig  2  The evaluation  point  potentials  are  approximated with  a  local  ex- 
pansion 
@
a
 
< Kl(:)'" 
5 Kl(;)'+' 
(1 1) 
where Kl is independent  of 1, and rand R are, as in Figs. 
1  and  3 ,  the  distance  to the  nearest  evaluation  point  and 
the radius of the sphere of charge, respectively  [2]. If the 
nearest  evaluation  point  is  well  outside the  sphere,  then 
(1 1) implies that all the evaluation  point potentials  can be 
accurately  computed  using just a few  terms  of the  multi- 
pole expansion. 
B.  Local Expansions 
Multipole  expansions  cannot  be  used  to  simplify  cal- 
culating  the  potentials  for  the  evaluation  points  in  the 
smaller circle of  Fig. 2 ,  since the charges are too widely 
distributed.  However,  it  is  still  possible  to  compute ap- 
proximate  potentials  at  the  n l  evaluation  points  because 
of the n2 charges  in n, + n2 operations. To see this, con- 
sider that  the  potential  at  any  of the  n l  evaluation  points 
in the  smaller circle is  roughly  the  Same as the  potential 
evaluated  at the center of the circle. Thus the potential  at 
an evaluation  point can be approximated  by 
where pi is the distance  from the  ith charge to the  center 
of the circle containing  the evaluation  points.  Estimating 
the potentials at the n l  evaluation points therefore requires 
n2 operations  to compute the potential  at the circle's cen- 
ter by (12), and n l  additional operations to copy that result 
to  the  n l  evaluation  points.  Note  that  the  approximation 
improves  as the  separation  between  the  charges and  the 
circle's  center increases  relative  to the circle's  radius. 
Just  as in the  multipole  case, it  is possible  to  improve 
the  accuracy  of  the  above  local  expansion  by  including 
the effect of the distance between  an evaluation  point and 
the  enclosing  sphere's  center.  In  general,  the  truncated 
local expansion  approximation  for the exact potential  in a 
sphere caused by  charges outside the radius of the sphere 
is given by 
@
a
@ 
 
Fig.  3 .   The charges  are  replaced  by  the  first  multipole  expansion  coeffi- 
cient. 
In the  simplified  approach  above, the  potential  arising 
from the charges  in Fig.  1  is approximated  by 
17 2 
where r,  is the  distance  between  the center of  the  charge 
circle and thejth evaluation point. Such an approximation 
is referred to as a monopole  approximation  and is the first 
term  in  the  general  multipole  approximation  for  charge 
distributions.  In  general,  the  true  potential,  $,  arising 
from  point  charges inside  a  sphere at  a  location  outside 
the  radius  of  the  sphere  can  be  approximated  arbitrarily 
accurately  by a truncated  multipole  expansion, 
where  1 is the  order of  the  expansion,  and  rj , e,,  and  $j 
are the spherical coordinates  of t h e j  th evaluation  location 
in  relation  to the  sphere's  center. The  Y:(Oj, 4j) factors 
NABORS  AND  WHITE:  FASTCAP 
where  1 is  the  order of  the  expansion,  rJ, e,,  and  $J are 
the  spherical  coordinates  of  the j t h   evaluation  location 
with respect  to the sphere's center, and the Lr factors are 
the  complex  local  expansion  coefficients.  For a set  of  n2 
charges  outside  the  sphere,  the  local  expansion  coeffi- 
cients are given by 
n2 
where  p i ,  a i ,  and Pi are the  spherical  coordinates  of  the 
ith charge  relative  to  the center of  the  sphere  containing 
the evaluation points.  As for the multipole expansion, the 
error introduced by the local expansion is related to a ratio 
of distances, 
I 
I 
ti 
I 
where K2 is independent  of  1,  and  r and R are, as in Fig. 
2, the  distance  to the  nearest  charge  location  and  the  ra- 
dius of  the  sphere of  evaluation  points,  respectively  [2]. 
Therefore, if the charges are well outside the sphere, then 
the potential inside the sphere can be accurately computed 
using just a few terms of the  local  expansion. 
C.  The Multipole Algorithm 
Low-order  multipole  and local  expansions  can  be used 
to accurately compute the potentials at n evaluation points 
arising  from  n  panel  charges in order n  operations, even 
for general evaluation  point  and charge distributions,  but 
the  multipole  and  local  expansions  have  to  be  applied 
carefully,  both to ensure adequate  separation and to avoid 
excess  calculation.  Below  we give a multipole  algorithm 
for computing  the  potentials  at the  n  panel  center points 
arising from n panel charges. The algorithm requires O ( n )  
operations,  and  was  originally  presented  in  [ 2 ]  with  var- 
iants in  [9], [13], and  [17]. The algorithm  is  reproduced 
here, modified  to fit the boundary-element  calculations. 
To begin, the smallest cube that contains the entire col- 
lection of panels for the problem of interest is determined. 
This cube will be referred to as the level 0, or root,  cube. 
Then,  the  volume  of  the  cube  is  broken  into  eight  child 
cubes of equal size, referred to as level  1 cubes, and each 
has the  level 0 cube as its parent. The panels  are divided 
among the child cubes by  associating  a panel with  a cube 
if the panel's  center point  is contained  in the  cube. Each 
of  the  level  1 cubes  is then  subdivided  into eight  level  2 
child cubes and the panels  are again distributed  based on 
their  center point  locations. The result  is a  collection  of 
64 level 2 cubes and a 64-way partition of the panels.  This 
process  is  repeated  to  produce  L  levels  of  cubes  and  L 
partitions  of  panels  starting  with  an  eight-way  partition 
and ending with  an gL-way partition.  The number of lev- 
els, L, is chosen  so that  the maximum  number  of  panels 
I45 I 
in the finest, or Lth, level cube is less than some threshold 
(four is a typical  default). 
The following terms are used  to concisely  describe the 
multipole algorithm. 
Dejinition  I :   The evaluation  points  of  a  cube are the 
center points of the panels associated  with  the cube. 
Dejinition 2:  The nearest neighbors of a cube are those 
cubes on the same level which  have a comer in common 
with the given cube. 
Dejinition  3:  The second  nearest  neighbors  of  a  cube 
are  those  cubes  on  the  same level  which  are not  nearest 
neighbors  but  have  a  comer  in  common  with  a  nearest 
neighbor. 
Note that there are at most  124 nearest and second near- 
est neighbors  of a cube, excluding  the cube itself. 
Dejinirion  4:  The interaction cubes of a given cube are 
those cubes which are either the second nearest neighbors 
of the given cube's  parent,  or children of the given cube's 
parent's  nearest  neighbors,  excluding  nearest  or second 
nearest  neighbors  of the given cube. 
There  are  a  maximum  of  189 interaction  cubes  for  a 
given cube; roughly  half are from a level one coarser than 
the  level  of  the  given  cube,  the  rest  being  on  the  same 
level.  The interaction  cubes have  two important  proper- 
ties.  When  combined  with  the  given  cube's  nearest  and 
second  nearest  neighbors,  they  entirely  cover  the  same 
volume as the given cube's parent and the parent's  nearest 
and second nearest neighbors.  Also, the interaction cubes 
are  such  that  the  distance  between  a  point  in  the  given 
cube and a point  in the interaction  cube is more than half 
the  distance  between  the  centers  of  the  given  and  inter- 
action  cubes.  This  latter  property  guarantees  that  when 
multipole  expansions  are used  to approximate the effects 
of  interaction  cubes,  and  when  these  multipole  expan- 
sions  are  gathered  together  in  a  local  expansion  for the 
given  cube,  the  resulting  approximation  will  converge 
rapidly  with  increasing expansion  order. 
Remark:  As the  charges  in this  problem  are not  point 
charges, but  are distributed  on panels,  it  is  necessary  to 
ensure  that  each  panel  is  entirely  considered  in  a  finest 
level  cube in  order to  ensure that  evaluation  points  in  a 
cube  are  well  separated  from  panel  charges  in  an  inter- 
action  cube.  This  may  require  breaking  a  panel  up  into 
several panels,  but  as the multipole  algorithm  grows  lin- 
early  with  the  number  of  panels,  this  is  not  a significant 
computational  burden. 
The structure of the multipole  algorithm  for computing 
the  n panel  potentials  from n  panel  charges is given  be- 
low. The formulas  for various  transformations  and shifts 
required  are given in the Appendix. A three-letter  key for 
each transformation  is given  to  simplify  finding  the  cor- 
responding  appendix  formula. 
1452 
IEEE  TRANSACTIONS  ON  COMPUTER-AIDED  DESIGN,  VOL. 
IO,  NO.  I I ,   NOVEMBER  1991 
Algorithm 2: Multipole Algorithm for Computing Pw 
I* 
THE  DIRECT  PASS:  The  potentials  arising  from 
nearby  charges are computed  directly. 
*/ 
For each cube of  the 8L cubes on the finest level 
I* Map panel charge distributions to potentials (Q2P). 
*I 
Compute the potential  at all the evaluation  points  in 
the  cube arising  from  the charge distributions  on all 
the  panels  in  the  cube, in  the  cube’s  nearest  neigh- 
bors, and in the cube’s second  nearest  neighbors. 
3 
/* 
THE  UPWARD  PASS:  Computes a  multipole  expan- 
sion for every  cube at every  level. The computation  is 
order n because  the  multipole  expansion  for any  cube 
at  a  level  coarser than  the  finest  level  is  computed  by 
combining  the  multipole expansions  of  its children. 
*/ 
For each cube of the  8L cubes on the  finest level  { 
I*  Map  panel  charges  to  multipole  coefficients 
(Q2M).  *I 
Construct  a  multipole  expansion  for the  charge  dis- 
tributions  on  all  the  panels  in  the  cube,  about  the 
cube’s center. 
3 
For each level  i  = L  - 1 to 2  { 
For each  cube of the  8’ cubes on level  i  { 
For each of the 8 children  of  the cube  { 
/* Map multipole coefficients  to multipole coef- 
ficients (M2M). * I  
Shift  the  multipole  expansion  about  the  child 
cube’s center to a multipole expansion about the 
cube’s center and add it to the multipole expan- 
sion for the cube. 
3 
1 
3 
/* 
THE  DOWNWARD  PASS:  Computes  a  local  expan- 
sion  for every  cube.  The local  expansion  includes  the 
effects of all panel charges not in the cube or its nearest 
and  second  nearest  neighbors.  Note  that  at  the  finest 
level this  includes  the effects  of  all  panels  that  are not 
treated in the direct pass. 
*/ 
For each level i  = 2 to L  { 
For each cube of the 8’ cubes on level  i { 
/*  Map  local  coefficients  to  local  coefficients 
(LZL). * I  
If the cube’s parent has a local expansion, shift that 
expansion  to  a  local  expansion  about  the  cube’s 
center. 
For each of the cube’s interaction  cubes { 
/*  Map  multipole  coefficients  to  local  coeffi- 
cients  (M2L).  *I 
Convert the multipole expansion  about the cen- 
ter of the  interaction  cube to a local  expansion 
about  the cube’s  center and  add  it to the local 
expansion  for the cube. 
3 
3 
3 
/* 
THE  EVALUATION  PASS:  Evaluates  the  local  ex- 
pansions  at the finest level. 
*I 
For each cube of the 8L cubes on the lowest  level  { 
I* Map local coefficients  into potentials  (L2P).  *I 
Evaluate  the cube’s local expansion  for the potential 
at all the evaluation  points in the cube, and add those 
computed  potentials  to  the  evaluation  point  poten- 
tials. 
3 
V.  IMPLEMENTATION 
I N   FASTCAP 
Our  implementation  of  the  multipole-accelerated  ca- 
pacitance extraction algorithm uses an optimization  which 
exploits  the  fact that  the  conversion  and  shift  operations 
are linear functions of the charges or the expansion  coef- 
ficients  when  the  geometry  is fixed.  That  is, the  compli- 
cated  evaluations  involved  in  converting  charges  to  po- 
tentials  or  multipole  coefficients,  shifting  multipole 
coefficients,  converting  multipole  coefficients  to  local 
coefficients,  shifting  local  coefficients,  converting  local 
coefficients to potentials are all computed once and stored 
as matrices  which operate on charges or coefficients. 
As an example, consider forming a first-order multipole 
expansion  for a collection  of k  charges. Following  (9), a 
first-order multipole  expansion  has the form 
where M i ,  - 
, MI are complex  multipole  coefficients. 
Since M: is real for all n ,  and M i ”  is always the complex 
conjugate  of M f ,  the multipole  expansion  can be written 
in terms of real  coefficients  as 
+ Mi 
P 1 (cos e) sin 4 
2 r2 
NABORS  AND  WHITE:  FASTCAP 
where  a;, My, MI, and  il?; are  real  coefficients  and 
Pr(cos 0)  is the associated  Legendre function  of  degree 
n  and order m. This equation  appears  as (A9) in the  Ap- 
pendix,  where  it  is  discussed  in  more  detail.  This  low- 
order expansion  can be more simply represented  as 
rl/(x,y,z) ~ M ; - + M O 4 - a I ~ - f
1 
r 
I  r 3  
2 r 3  
i p -  
2 r 3 ’  
1453 
ces for all the other multipole  algorithm  conversions  and 
shifts  can  be  constructed  and  used  repeatedly  in  subse- 
quent Pw product  calculations. 
In  our  implementation  of  the  complete  multipole-ac- 
celerated  capacitance  extraction  algorithm, given  below, 
the  shift  and  conversion  coefficients  are  computed  once 
and  stored. 
( 18) 
where x ,  y ,  and z are the Cartesian coordinates of the eval- 
uation points and r  = dx’  + y 2  + z2, as usual. 
The  real  coefficients  are  calculated  using  formulas 
(A1 1) and (A12) in the Appendix, which are analogous  to 
(10). Writing  the four equations for the  four real  coeffi- 
cients as one matrix equation  yields the 4  x  k  linear sys- 
tem 
1 
rp;(cos 
*  . . Po,(cos ax) 
. * 
Plp:(cos 
2p, PI (cos al) cos p1  . * 
*  pkPy(C0s ax) 
Q I )  
2p,PI  (cos q) cos 
[:I, 
[;?I 
- 
. 
= 
MI 
(19) 
where q l ,  . 
, qk are the values  of the k  charges. The 4 
X  k  matrix  is called  the  Q2M conversion  matrix.  Its  ith 
column depends only on the coordinates of the ith charge. 
Substituting  for the  associated  Legendre functions  using 
(A5) and  (A6)  from the  Appendix  and  switching  to  rect- 
angular coordinates  simplifies the  matrix to 
Note that the first row of the matrix implies that 
is the 
sum of all the charge strengths, making  it  identical to the 
coefficient M; in (16). 
Since the  Q2M  matrix  is a  function  of  the  charge  po- 
sitions  alone,  its  entries need  be  calculated  only  once  if 
several  multipole  algorithm  potential  evaluations are  re- 
quired  for the  same charge geometry.  In  the  notation  of 
Algorithm  2 ,  this  amounts  to  using  the  multipole  algo- 
rithm  to  compute  several  Pw  products  with  the  same P 
but  with  different  w vectors.  Each  time the  multipole  al- 
gorithm  is  used  to  form  a  different  Pw  product,  a  new 
vector  of charge  strengths  is  multiplied  by  the  Q2M ma- 
trix,  yielding  a  vector  of  updated  multipole  expansion 
coefficients. In a similar way, geometry-dependent  matri- 
I 
Algorithm  3: Multipole-Accelerated  Capacitance 
Extraction  Algorithm 
/* Setup Phase.  * I  
Divide the m conductors  into a total of n panels. 
Divide  the  problem  domain  into  a  hierarchy  of  cubes, 
so that  each of the finest  level cubes has a maximum 
of 4 panels. 
Compute the conversion  and shifting matrices  from the 
topology. 
/* Loop Through  all the  Conductors. */ 
F o r j  = 1 t o m  { 
/*  Set the  Potential  of  the  Panels  on  Conductorj to 
one.  */ 
Fork  = 1 to n  { 
If panel  k  is on conductor j ,  set Pk  = 1 
Elsepk = 0. 
} 
/*  Solve for the Panel  Charges using  MGCR.  */ 
Use GCR (Algorithm  1) to solve Pq = p ,  using Mul- 
/*  Sum  the  Charges on  Conductor i to  Compute C,, 
tipole  (Algorithm  2) to compute Pw. 
*/ 
For i  = 1 to m 
c,, = Ckmnductor,  q k .  
j, 
To  determine  the  effectiveness  of  this  approach,  the 
multipole  accelerated  algorithm  was  tested  on the  easily 
parameterized bus structure given in Fig. 4, for buses with 
2  X  2  conductors  through  6  x  6  conductors.  The con- 
ductor  surfaces  are discretized  by  first  cutting  each con- 
ductor  into  sections  based  on  where  pairs  of  conductors 
overlap. In the 2  x  2 bus  example, each conductor is cut 
into  five  sections  (see  Fig.  5(a)), and  in  the  6  x  6  ex- 
ample each conductor is divided into  13 sections. The dis- 
cretization is then completed by dividing each face of each 
section  into  nine  panels,  as  demonstrated  in  Fig.  5(b). 
The  edge  panels  have  widths  that  are  10% of  the  inner 
panel  widths  to  accurately  discretize  the  expected  in- 
creased charge density  near conductor  edges [ 141. 
In Table I we report the results of our experiments with 
the various approaches to solving ( 5 ) ,  the matrix problem 
associated  with  the  boundary  element method.  In the ta- 
ble, the total number of panels resulting from the conduc- 
tor surface discretization  is given,  followed  by  the CPU 
times  (on an IBM 6000) required  to compute the entire m 
X  m capacitance matrix, where  m is the  number  of  con- 
ductors. Three methods  for solving (5) are compared:  di- 
rect or LU factorization, GCR, and multipole  accelerated 
GCR  (MGCR).  The  MGCR  algorithm’s  CPU  times  are 
1454 
IEEE  TRANSACTIONS  ON  COMPUTER-AIDED  DESIGN.  VOL.  IO,  NO.  I I ,   NOVEMBER  1991 
Fig. 4.  Bus structure test problem  with  2  x  2 conductors 
(a) 
(b) 
Fig. 5 .   Conductor sections are divided into panels. 
COMPARISON OF EXTRACTION METHODS (CPU TIMES IN IBM 6000 
TABLE I 
SECONDS) 
Test Problem 
2 x 2   3 x 3   4 x 4   5 x 5  
6 x 6  
Panels 
Direct time 
GCR time 
MGCRtime(1 = 2) 
MGCR time ( I   = 1) 
MGCR time  ( I  = 0) 
GCR iters 
MGCR  iters ( I   = 2) 
MGCR  iters  (/  =  1) 
MGCR  iters (1  = 0) 
792 
275 
121 
55 
29 
9 
48 
48 
54 
58 
1620 
2700 
570 
218 
108 
48 
78 
82 
88 
90 
2736 
12969 
2115 
378 
245 
98 
120 
120 
120 
120 
4140 
44345 
4881 
790 
436 
216 
150 
150 
150 
150 
5832 
(141603) 
(14877) 
1412 
775 
356 
(180) 
180 
180 
180 
strongly dependent  on the number of expansion  terms, so 
the time required  when zero-, first-, and second-order  ex- 
pansions  (1  = 0, 1, 2) are used is given. Also in the table 
are the  total  number of  iterations  required  to  reduce  the 
max  norm  of  the residual  in  GCR and MGCR  to  1 %  of 
its  original  value.  The CPU times  in parentheses  are ex- 
trapolated.  They correspond  to calculations  that were not 
possible  because  of  the  excessive  memory  required  to 
store the entire potential  coefficient  matrix, and our lack 
of  patience. 
As  is clear from  the  results,  the  multipole-accelerated 
GCR algorithm  is very  effective for the  larger  problems, 
particularly  if the expansion order is low. To examine the 
effect  of  expansion  order  on  accuracy,  in  Table  I1 we 
compare  the resulting  capacitances  computed  by  solving 
( 5 )  for the 4 X  4 conductor problem  with  LU  factoriza- 
tion,  GCR, and MGCR  for expansion  orders 0,  1, and 2. 
One row of the 4 X  4 capacitance matrix2 is given for the 
five  different  solution  methods.  The  row  chosen  repre- 
sents  the capacitance  associated  with  one of the  conduc- 
tors  on  the outer edge.  Taking  the  direct  method  results 
as  exact,  the  data  indicate  that  MGCR  can  attain  better 
than  10% accuracy  in  the  diagonal  entry  of  the  capaci- 
tance matrix with only a zero-order (1 = 0) expansion. To 
'In  the 4  X  4 conductor example the lengths  have  been  normalized  so 
that  the  conductors are each 5  m long,  1 m high,  and  1 m  wide,  and  all 
interconductor  spaces are 1 m. The capacitances are given in picofarads by 
scaling the program  results by  4neo =  111.27 pF/m.