Capacity of Multi-antenna Gaussian Channels

I. Emre Telatar
Lucent Technologies, Bell Laboratories, Mountain Avenue, Murray Hill, NJ, USA. telatar@lucent.com

Abstract

We investigate the use of multiple transmitting and/or receiving antennas for single user communications over the additive Gaussian channel with and without fading. We derive formulas for the capacities and error exponents of such channels, and describe computational procedures to evaluate such formulas. We show that the potential gains of such multi-antenna systems over single-antenna systems are rather large under independence assumptions for the fades and noises at different receiving antennas.

1 Introduction

We will consider a single user Gaussian channel with multiple transmitting and/or receiving antennas. We will denote the number of transmitting antennas by $t$ and the number of receiving antennas by $r$. We will exclusively deal with a linear model in which the received vector $y \in \mathbb{C}^r$ depends on the transmitted vector $x \in \mathbb{C}^t$ via
$$y = Hx + n,$$
where $H$ is an $r \times t$ complex matrix and $n$ is zero-mean complex Gaussian noise with independent, equal variance real and imaginary parts. We assume $E[nn^\dagger] = I_r$, that is, the noises corrupting the different receivers are independent. The transmitter is constrained in its total power to $P$,
$$E[x^\dagger x] \le P.$$
Equivalently, since $x^\dagger x = \operatorname{tr}(xx^\dagger)$, and expectation and trace commute,
$$\operatorname{tr}\bigl(E[xx^\dagger]\bigr) \le P.$$
This second form of the power constraint will prove more useful in the upcoming discussion.
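To make the model concrete, the following minimal sketch (assuming NumPy; the dimensions, power budget, channel matrix and input covariance are arbitrary illustrative choices, not taken from the text) simulates $y = Hx + n$ with $E[nn^\dagger] = I_r$ and checks numerically that the two forms of the power constraint agree.

```python
import numpy as np

rng = np.random.default_rng(0)
t, r, P = 4, 6, 10.0               # illustrative dimensions and power budget (arbitrary)

# An admissible input covariance with tr(Q) <= P; here simply Q = (P/t) I_t.
Q = (P / t) * np.eye(t)

# Circularly symmetric complex Gaussian input x with E[x x^dagger] = Q.
n_samples = 200_000
L = np.linalg.cholesky(Q / 2)
z = rng.standard_normal((n_samples, t)) + 1j * rng.standard_normal((n_samples, t))
x = z @ L.T

# A fixed channel matrix H (values arbitrary) and noise with E[n n^dagger] = I_r.
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
noise = (rng.standard_normal((n_samples, r)) + 1j * rng.standard_normal((n_samples, r))) / np.sqrt(2)
y = x @ H.T + noise                # each row is one channel use of y = Hx + n

# The two forms of the power constraint agree: E[x^dagger x] = tr(E[x x^dagger]).
print(np.mean(np.sum(np.abs(x) ** 2, axis=1)))   # ~= P (Monte Carlo)
print(np.trace(Q).real)                           # = P exactly
```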
We will consider several scenarios for the matrix $H$:

1. $H$ is deterministic.
2. $H$ is a random matrix (for which we shall use the notation $\mathbf{H}$), chosen according to a probability distribution, and each use of the channel corresponds to an independent realization of $\mathbf{H}$.
3. $H$ is a random matrix, but is fixed once it is chosen.

The main focus of this paper is on the last two of these cases. The first case is included so as to expose the techniques used in the later cases in a more familiar context. In the cases when $H$ is random, we will assume that its entries form an i.i.d. Gaussian collection with zero-mean, independent real and imaginary parts, each with variance $1/2$. Equivalently, each entry of $H$ has uniform phase and Rayleigh magnitude. This choice models a Rayleigh fading environment with enough separation within the receiving antennas and the transmitting antennas such that the fades for each transmitting-receiving antenna pair are independent. In all cases, we will assume that the realization of $H$ is known to the receiver, or, equivalently, the channel output consists of the pair $(y, H)$, and the distribution of $H$ is known at the transmitter.

2 Preliminaries

A complex random vector $x \in \mathbb{C}^n$ is said to be Gaussian if the real random vector $\hat{x} \in \mathbb{R}^{2n}$ consisting of its real and imaginary parts, $\hat{x} = \bigl[\begin{smallmatrix}\operatorname{Re}(x)\\ \operatorname{Im}(x)\end{smallmatrix}\bigr]$, is Gaussian. Thus, to specify the distribution of a complex Gaussian random vector $x$, it is necessary to specify the expectation and covariance of $\hat{x}$, namely,
$$E[\hat{x}] \in \mathbb{R}^{2n} \quad\text{and}\quad E\bigl[(\hat{x} - E[\hat{x}])(\hat{x} - E[\hat{x}])^\dagger\bigr] \in \mathbb{R}^{2n \times 2n}.$$
We will say that a complex Gaussian random vector $x$ is circularly symmetric if the covariance of the corresponding $\hat{x}$ has the structure
$$E\bigl[(\hat{x} - E[\hat{x}])(\hat{x} - E[\hat{x}])^\dagger\bigr] = \tfrac{1}{2}\begin{bmatrix} \operatorname{Re}(Q) & -\operatorname{Im}(Q) \\ \operatorname{Im}(Q) & \operatorname{Re}(Q) \end{bmatrix}$$
for some Hermitian non-negative definite $Q \in \mathbb{C}^{n \times n}$. Note that the real part of a Hermitian matrix is symmetric and the imaginary part of a Hermitian matrix is anti-symmetric, and thus the matrix appearing above is real and symmetric. In this case $E\bigl[(x - E[x])(x - E[x])^\dagger\bigr] = Q$, and thus, a circularly symmetric complex Gaussian random vector $x$ is specified by prescribing $E[x]$ and $E\bigl[(x - E[x])(x - E[x])^\dagger\bigr]$.
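The block structure of the real covariance can be checked numerically. The following sketch (a Monte Carlo check assuming NumPy; the covariance $Q$ is an arbitrary example) draws a circularly symmetric complex Gaussian $x$ with $E[xx^\dagger] = Q$ and compares the empirical covariance of $\hat{x}$ against the predicted block structure.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 500_000

# An arbitrary Hermitian non-negative definite Q (illustrative only).
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q = A @ A.conj().T

# Circularly symmetric complex Gaussian x with E[x x^dagger] = Q.
L = np.linalg.cholesky(Q / 2)
z = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))   # i.i.d., real/imag variance 1 each
x = z @ L.T

# Real representation x_hat = [Re x; Im x] and its empirical covariance.
x_hat = np.hstack([x.real, x.imag])
cov_hat = x_hat.T @ x_hat / m

# Predicted structure: (1/2) [[Re Q, -Im Q], [Im Q, Re Q]].
predicted = 0.5 * np.block([[Q.real, -Q.imag], [Q.imag, Q.real]])
print(np.max(np.abs(cov_hat - predicted)))   # small (Monte Carlo error only)
```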
For any $z \in \mathbb{C}^n$ and $A \in \mathbb{C}^{n \times m}$ define
$$\hat{z} = \begin{bmatrix} \operatorname{Re}(z) \\ \operatorname{Im}(z) \end{bmatrix} \quad\text{and}\quad \hat{A} = \begin{bmatrix} \operatorname{Re}(A) & -\operatorname{Im}(A) \\ \operatorname{Im}(A) & \operatorname{Re}(A) \end{bmatrix}.$$

Lemma 1. The mappings $z \to \hat{z}$ and $A \to \hat{A}$ have the following properties:
(a) $C = AB \implies \hat{C} = \hat{A}\hat{B}$
(b) $C = A + B \implies \hat{C} = \hat{A} + \hat{B}$
(c) $C = A^\dagger \implies \hat{C} = \hat{A}^\dagger$
(d) $C = A^{-1} \implies \hat{C} = \hat{A}^{-1}$
(e) $\det \hat{A} = |\det A|^2 = \det(AA^\dagger)$
(f) $z = x + y \implies \hat{z} = \hat{x} + \hat{y}$
(g) $y = Ax \implies \hat{y} = \hat{A}\hat{x}$
(h) $\operatorname{Re}(x^\dagger y) = \hat{x}^\dagger \hat{y}$

Proof. The properties (a), (b) and (c) are immediate. (d) follows from (a) and the fact that $\hat{I}_n = I_{2n}$. (e) follows from
$$\det \hat{A} = \det\left( \begin{bmatrix} I & iI \\ 0 & I \end{bmatrix} \hat{A} \begin{bmatrix} I & -iI \\ 0 & I \end{bmatrix} \right) = \det \begin{bmatrix} A & 0 \\ \operatorname{Im}(A) & \bar{A} \end{bmatrix} = \det(A)\,\det(\bar{A}).$$
(f), (g) and (h) are immediate.

Corollary 1. $U \in \mathbb{C}^{n \times n}$ is unitary if and only if $\hat{U} \in \mathbb{R}^{2n \times 2n}$ is orthonormal.

Proof. $U^\dagger U = I_n \iff \hat{U}^\dagger \hat{U} = \hat{I}_n = I_{2n}$.

Corollary 2. If $Q \in \mathbb{C}^{n \times n}$ is non-negative definite then so is $\hat{Q} \in \mathbb{R}^{2n \times 2n}$.

Proof. Given $x = (x_1, \ldots, x_{2n})^\dagger \in \mathbb{R}^{2n}$, let $z = (x_1 + jx_{n+1}, \ldots, x_n + jx_{2n})^\dagger \in \mathbb{C}^n$, so that $x = \hat{z}$. Then by (g) and (h),
$$x^\dagger \hat{Q} x = \operatorname{Re}(z^\dagger Q z) = z^\dagger Q z \ge 0.$$
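A few of the properties in Lemma 1, in particular (a), (e) and (h), are easy to exercise numerically. The sketch below (assuming NumPy; the random test matrices are arbitrary) does so.

```python
import numpy as np

def hat_vec(z):
    """z in C^n  ->  z_hat = [Re z; Im z] in R^{2n}."""
    return np.concatenate([z.real, z.imag])

def hat_mat(A):
    """A in C^{n x m}  ->  A_hat = [[Re A, -Im A], [Im A, Re A]] in R^{2n x 2m}."""
    return np.block([[A.real, -A.imag], [A.imag, A.real]])

rng = np.random.default_rng(2)
n, m = 4, 3
A = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
B = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# (a) hat(AB) = hat(A) hat(B)
print(np.allclose(hat_mat(A @ B), hat_mat(A) @ hat_mat(B)))
# (e) det(hat(S)) = |det S|^2, for square S
S = A[:m, :]                     # a square m x m block, just for the determinant check
print(np.allclose(np.linalg.det(hat_mat(S)), abs(np.linalg.det(S)) ** 2))
# (h) Re(x^dagger y) = x_hat^T y_hat
print(np.allclose((x.conj() @ y).real, hat_vec(x) @ hat_vec(y)))
```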
The probability density with respect to the standard Lebesgue measure on $\mathbb{C}^n$ of a circularly symmetric complex Gaussian with mean $\mu$ and covariance $Q$ is given by
$$\gamma_{\mu,Q}(x) = \det(\pi \hat{Q})^{-1/2} \exp\bigl(-(\hat{x} - \hat{\mu})^\dagger \hat{Q}^{-1} (\hat{x} - \hat{\mu})\bigr) = \det(\pi Q)^{-1} \exp\bigl(-(x - \mu)^\dagger Q^{-1} (x - \mu)\bigr),$$
where the second equality follows from properties (d)–(h) of Lemma 1. The differential entropy of a complex Gaussian $x$ with covariance $Q$ is given by
$$\begin{aligned}
H(\gamma_Q) &= E_{\gamma_Q}\bigl[-\log \gamma_Q(x)\bigr] \\
&= \log \det(\pi Q) + (\log e)\, E[x^\dagger Q^{-1} x] \\
&= \log \det(\pi Q) + (\log e)\, \operatorname{tr}\bigl(E[xx^\dagger] Q^{-1}\bigr) \\
&= \log \det(\pi Q) + (\log e)\, \operatorname{tr}(I) \\
&= \log \det(\pi e Q).
\end{aligned}$$
For us, the importance of the circularly symmetric complex Gaussians is due to the following lemma: circularly symmetric complex Gaussians are entropy maximizers.

Lemma 2. Suppose the complex random vector $x \in \mathbb{C}^n$ is zero-mean and satisfies $E[xx^\dagger] = Q$, i.e., $E[x_i x_j^*] = Q_{ij}$, $1 \le i, j \le n$. Then the entropy of $x$ satisfies $H(x) \le \log \det(\pi e Q)$, with equality if and only if $x$ is a circularly symmetric complex Gaussian with $E[xx^\dagger] = Q$.

Proof. Let $p$ be any density function satisfying $\int_{\mathbb{C}^n} p(x)\, x_i x_j^*\, dx = Q_{ij}$, $1 \le i, j \le n$. Let
$$\gamma_Q(x) = \det(\pi Q)^{-1} \exp(-x^\dagger Q^{-1} x).$$
Observe that $\int_{\mathbb{C}^n} \gamma_Q(x)\, x_i x_j^*\, dx = Q_{ij}$, and that $\log \gamma_Q(x)$ is a linear combination of the terms $x_i x_j^*$. Thus $E_{\gamma_Q}[\log \gamma_Q(x)] = E_p[\log \gamma_Q(x)]$. Then,
$$\begin{aligned}
H(p) - H(\gamma_Q) &= -\int_{\mathbb{C}^n} p(x) \log p(x)\, dx + \int_{\mathbb{C}^n} \gamma_Q(x) \log \gamma_Q(x)\, dx \\
&= -\int_{\mathbb{C}^n} p(x) \log p(x)\, dx + \int_{\mathbb{C}^n} p(x) \log \gamma_Q(x)\, dx \\
&= \int_{\mathbb{C}^n} p(x) \log \frac{\gamma_Q(x)}{p(x)}\, dx \\
&\le 0,
\end{aligned}$$
with equality only if $p = \gamma_Q$. Thus $H(p) \le H(\gamma_Q)$.
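As a sanity check on the entropy formula, the sketch below (assuming NumPy; the covariance is an arbitrary example, and the entropy is computed in nats, i.e. with the natural logarithm) estimates $-E[\log \gamma_Q(x)]$ by Monte Carlo and compares it with $\log\det(\pi e Q)$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 200_000

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q = A @ A.conj().T                      # arbitrary Hermitian positive definite covariance

# Samples of a circularly symmetric complex Gaussian with covariance Q.
L = np.linalg.cholesky(Q / 2)
x = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) @ L.T

# log gamma_Q(x) = -log det(pi Q) - x^dagger Q^{-1} x   (natural log)
Qinv = np.linalg.inv(Q)
quad = np.einsum('ij,jk,ik->i', x.conj(), Qinv, x).real
log_density = -np.log(np.linalg.det(np.pi * Q).real) - quad

print(-log_density.mean())                               # Monte Carlo estimate of H(x)
print(np.log(np.linalg.det(np.pi * np.e * Q).real))      # log det(pi e Q)
```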
Lemma 3. If $x \in \mathbb{C}^n$ is a circularly symmetric complex Gaussian then so is $y = Ax$ for any $A \in \mathbb{C}^{m \times n}$.

Proof. We may assume $x$ is zero-mean. Let $Q = E[xx^\dagger]$. Then $y$ is zero-mean, $\hat{y} = \hat{A}\hat{x}$, and
$$E[\hat{y}\hat{y}^\dagger] = \hat{A}\, E[\hat{x}\hat{x}^\dagger]\, \hat{A}^\dagger = \tfrac{1}{2} \hat{A} \hat{Q} \hat{A}^\dagger = \tfrac{1}{2} \hat{K},$$
where $K = AQA^\dagger$.

Lemma 4. If $x$ and $y$ are independent circularly symmetric complex Gaussians, then $z = x + y$ is a circularly symmetric complex Gaussian.

Proof. Let $A = E[xx^\dagger]$ and $B = E[yy^\dagger]$. Then $E[\hat{z}\hat{z}^\dagger] = \tfrac{1}{2}\hat{C}$ with $C = A + B$.

3 The Gaussian channel with fixed transfer function

We will start by reminding ourselves of the case of deterministic $H$. The results of this section can be inferred from [ , Ch. ].

3.1 Capacity

We will first derive an expression for the capacity $C(H, P)$ of this channel. To that end, we will maximize the average mutual information $I(x; y)$ between the input and the output of the channel over the choice of the distribution of $x$.

By the singular value decomposition theorem, any matrix $H \in \mathbb{C}^{r \times t}$ can be written as
$$H = UDV^\dagger,$$
where $U \in \mathbb{C}^{r \times r}$ and $V \in \mathbb{C}^{t \times t}$ are unitary, and $D \in \mathbb{R}^{r \times t}$ is non-negative and diagonal. In fact, the diagonal entries of $D$ are the non-negative square roots of the eigenvalues of $HH^\dagger$, the columns of $U$ are the eigenvectors of $HH^\dagger$ and the columns of $V$ are the eigenvectors of $H^\dagger H$. Thus, we can write the channel equation as
$$y = UDV^\dagger x + n.$$
Let $\tilde{y} = U^\dagger y$, $\tilde{x} = V^\dagger x$, $\tilde{n} = U^\dagger n$. Note that $U$ and $V$ are invertible, $\tilde{n}$ has the same distribution as $n$, and $E[\tilde{x}^\dagger \tilde{x}] = E[x^\dagger x]$. Thus, the original channel is equivalent to the channel
$$\tilde{y} = D\tilde{x} + \tilde{n},$$
where $\tilde{n}$ is zero-mean, Gaussian, with independent, identically distributed real and imaginary parts and $E[\tilde{n}\tilde{n}^\dagger] = I_r$.
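The singular value decomposition step is easy to verify numerically. The following sketch (assuming NumPy; the dimensions are arbitrary) checks that with $H = UDV^\dagger$ the rotated quantities $\tilde{y} = U^\dagger y$, $\tilde{x} = V^\dagger x$, $\tilde{n} = U^\dagger n$ satisfy $\tilde{y} = D\tilde{x} + \tilde{n}$, and that the $\lambda_i$ introduced below are the squared singular values of $H$.

```python
import numpy as np

rng = np.random.default_rng(4)
r, t = 5, 3

H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
U, s, Vh = np.linalg.svd(H)                      # H = U D V^dagger; s holds the singular values
k = min(r, t)
D = np.zeros((r, t))
D[:k, :k] = np.diag(s)

x = rng.standard_normal(t) + 1j * rng.standard_normal(t)
n = (rng.standard_normal(r) + 1j * rng.standard_normal(r)) / np.sqrt(2)
y = H @ x + n

y_t = U.conj().T @ y                              # y_tilde = U^dagger y
x_t = Vh @ x                                      # x_tilde = V^dagger x
n_t = U.conj().T @ n                              # n_tilde = U^dagger n
print(np.allclose(y_t, D @ x_t + n_t))            # the equivalent parallel channel holds exactly
print(np.allclose(np.sort(s ** 2)[::-1],
                  np.sort(np.linalg.eigvalsh(H.conj().T @ H))[::-1]))   # lambda_i = s_i^2
```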
Since $H$ is of rank at most $\min\{r, t\}$, at most $\min\{r, t\}$ of its singular values are non-zero. Denoting these by $\lambda_i^{1/2}$, $i = 1, \ldots, \min\{r, t\}$, we can write the equivalent channel component-wise, to get
$$\tilde{y}_i = \lambda_i^{1/2} \tilde{x}_i + \tilde{n}_i, \quad 1 \le i \le \min\{r, t\},$$
and the rest of the components of $\tilde{y}$ (if any) are equal to the corresponding components of $\tilde{n}$. We thus see that $\tilde{y}_i$ for $i > \min\{r, t\}$ is independent of the transmitted signal and that $\tilde{x}_i$ for $i > \min\{r, t\}$ play no role. To maximize the mutual information, we need to choose $\{\tilde{x}_i : 1 \le i \le \min\{r, t\}\}$ to be independent, with each $\tilde{x}_i$ having independent Gaussian, zero-mean real and imaginary parts. The variances need to be chosen via "water-filling" as
$$E\bigl[\operatorname{Re}(\tilde{x}_i)^2\bigr] = E\bigl[\operatorname{Im}(\tilde{x}_i)^2\bigr] = \tfrac{1}{2}\bigl(\mu - \lambda_i^{-1}\bigr)^+,$$
where $\mu$ is chosen to meet the power constraint. Here, $a^+$ denotes $\max\{0, a\}$. The power $P$ and the maximal mutual information can thus be parametrized as
$$P(\mu) = \sum_i \bigl(\mu - \lambda_i^{-1}\bigr)^+, \qquad C(\mu) = \sum_i \bigl(\ln(\mu \lambda_i)\bigr)^+.$$

Remark 1 (Reciprocity). Since the non-zero eigenvalues of $H^\dagger H$ are the same as those of $HH^\dagger$, we see that the capacities of the channels corresponding to $H$ and $H^\dagger$ are the same.

Example 1. Take $H_{ij} = 1$ for all $i, j$. We can write $H$ as
$$H = \begin{bmatrix} \sqrt{1/r} \\ \vdots \\ \sqrt{1/r} \end{bmatrix} \sqrt{rt}\, \begin{bmatrix} \sqrt{1/t} & \cdots & \sqrt{1/t} \end{bmatrix},$$
and we thus see that in the singular value decomposition of $H$ the diagonal matrix $D$ will have only one non-zero entry, $\sqrt{rt}$. We also see that the first column of $U$ is $(\sqrt{1/r}, \ldots, \sqrt{1/r})^\dagger$ and the first column of $V$ is $(\sqrt{1/t}, \ldots, \sqrt{1/t})^\dagger$. Thus,
$$C = \log(1 + rtP).$$
The $x = V\tilde{x}$ that achieves this capacity satisfies $E[x_i x_j^*] = P/t$ for all $i, j$, i.e., the transmitters are all sending the same signal. Note that, even though each transmitter is sending a power of $P/t$, since their signals add coherently at the receiver, the power received at each receiver is $Pt$. Since each receiver sees the same signal and the noises at the receivers are uncorrelated, the overall signal-to-noise ratio is $Prt$.
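The water-filling solution is straightforward to evaluate numerically. The sketch below (assuming NumPy; the bisection on $\mu$ and the helper defined here are implementation choices) computes $C$ in nats for a given $H$ and $P$, and confirms Example 1, where $H_{ij} = 1$ gives $C = \log(1 + rtP)$.

```python
import numpy as np

def waterfill_capacity(H, P, tol=1e-12):
    """Capacity (in nats) of y = Hx + n with total power P and fixed, known H."""
    lam = np.linalg.eigvalsh(H.conj().T @ H)       # eigenvalues lambda_i of H^dagger H
    lam = lam[lam > 1e-12]                          # only the non-zero singular values matter
    # Find mu by bisection so that sum_i (mu - 1/lambda_i)^+ = P.
    lo, hi = 0.0, P + np.max(1.0 / lam)
    while hi - lo > tol:
        mu = (lo + hi) / 2
        if np.sum(np.maximum(mu - 1.0 / lam, 0.0)) < P:
            lo = mu
        else:
            hi = mu
    mu = (lo + hi) / 2
    return np.sum(np.maximum(np.log(mu * lam), 0.0))

# Example 1: H_ij = 1 for all i, j, so C = log(1 + r t P).
r, t, P = 4, 3, 2.0
H = np.ones((r, t), dtype=complex)
print(waterfill_capacity(H, P), np.log(1 + r * t * P))   # the two values agree
```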
Example 2. Take $r = t = n$ and $H = I_n$. Then
$$C = n \log(1 + P/n).$$
For the $x$ that achieves this capacity, $E[x_i x_j^*] = \delta_{ij} P/n$, i.e., the components of $x$ are i.i.d. However, it is incorrect to infer from this conclusion that to achieve capacity one has to do independent coding for each transmitter. It is true that the capacity of this channel can be achieved by splitting the incoming data stream into $t$ streams, coding and modulating these streams separately, and then sending the $t$ modulated signals over the different transmitters. But, suppose $Nt$ bits are going to be transmitted, and we will either separate them into $t$ groups of $N$ bits each and use each group to select one of $2^N$ signals for each transmitter, or we will use all $Nt$ bits to select one of $2^{Nt}$ signal vectors. The second of these alternatives will yield a probability of error much smaller than the first, at the expense of much greater complexity. Indeed, the log of the error probability in the two cases will differ by a factor of $t$. See the error exponents of parallel channels in [ , pp. ].

3.2 Alternative Derivation of the Capacity

The mutual information $I(x; y)$ can be written as
$$I(x; y) = H(y) - H(y \mid x) = H(y) - H(n),$$
and thus maximizing $I(x; y)$ is equivalent to maximizing $H(y)$. Note that if $x$ satisfies $E[x^\dagger x] \le P$, so does $x - E[x]$, so we can restrict our attention to zero-mean $x$. Furthermore, if $x$ is zero-mean with covariance $E[xx^\dagger] = Q$, then $y$ is zero-mean with covariance $E[yy^\dagger] = HQH^\dagger + I_r$, and by Lemma 2, among such $y$ the entropy is largest when $y$ is a circularly symmetric complex Gaussian, which is the case when $x$ is a circularly symmetric complex Gaussian (Lemmas 3 and 4). So, we can further restrict our attention to circularly symmetric complex Gaussian $x$. In this case the mutual information is given by
$$I(x; y) = \log \det\bigl(I_r + HQH^\dagger\bigr) = \log \det\bigl(I_t + QH^\dagger H\bigr),$$
where the second equality follows from the determinant identity $\det(I + AB) = \det(I + BA)$, and it only remains to choose $Q$ to maximize this quantity subject to the constraints $\operatorname{tr}(Q) \le P$ and that $Q$ is non-negative definite. The quantity $\log \det(I + HQH^\dagger)$ will occur frequently enough in this document that we will let
$$\Psi(Q, H) = \log \det\bigl(I + HQH^\dagger\bigr)$$
denote it.

Since $H^\dagger H$ is Hermitian, it can be diagonalized, $H^\dagger H = U^\dagger \Lambda U$, with unitary $U$ and non-negative diagonal $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_t)$. Applying the determinant identity again we see that
$$\det\bigl(I_r + HQH^\dagger\bigr) = \det\bigl(I_t + \Lambda^{1/2} U Q U^\dagger \Lambda^{1/2}\bigr).$$
Observe that $\tilde{Q} = UQU^\dagger$ is non-negative definite when and only when $Q$ is, and that $\operatorname{tr}(\tilde{Q}) = \operatorname{tr}(Q)$; thus the maximization over $Q$ can be carried out equally well over $\tilde{Q}$. Note also that for any non-negative definite matrix $A$, $\det(A) \le \prod_i A_{ii}$; thus
$$\det\bigl(I_t + \Lambda^{1/2} \tilde{Q} \Lambda^{1/2}\bigr) \le \prod_i \bigl(1 + \tilde{Q}_{ii} \lambda_i\bigr),$$
with equality when $\tilde{Q}$ is diagonal. Thus we see that the maximizing $\tilde{Q}$ is diagonal, and the optimal diagonal entries can be found via "water-filling" to be
$$\tilde{Q}_{ii} = \bigl(\mu - \lambda_i^{-1}\bigr)^+, \quad i = 1, \ldots, t,$$
where $\mu$ is chosen to satisfy $\sum_i \tilde{Q}_{ii} = P$. The corresponding maximum mutual information is given by $\sum_i \bigl(\log(\mu \lambda_i)\bigr)^+$, as before.
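This derivation also tells us how to construct a maximizing input covariance. The self-contained sketch below (assuming NumPy; the dimensions and the bisection on $\mu$ are illustrative implementation choices) builds $Q$ from the eigenvectors of $H^\dagger H$ and the water-filled diagonal $\tilde{Q}$, and confirms that $\log\det(I_r + HQH^\dagger) = \log\det(I_t + QH^\dagger H) = \sum_i (\log \mu\lambda_i)^+$.

```python
import numpy as np

rng = np.random.default_rng(5)
r, t, P = 5, 4, 3.0
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))

# Diagonalize H^dagger H and water-fill the diagonal of Q_tilde.
lam, W = np.linalg.eigh(H.conj().T @ H)          # H^dagger H = W diag(lam) W^dagger
safe = np.maximum(lam, 1e-12)
lo, hi = 0.0, P + np.max(1.0 / safe)
for _ in range(200):                              # bisection on the water level mu
    mu = (lo + hi) / 2
    if np.sum(np.maximum(mu - 1.0 / safe, 0.0)) < P:
        lo = mu
    else:
        hi = mu
mu = (lo + hi) / 2
q = np.maximum(mu - 1.0 / safe, 0.0)              # Q_tilde_ii = (mu - 1/lambda_i)^+
Q = W @ np.diag(q) @ W.conj().T                   # optimal Q back in the original coordinates

# The three expressions for the maximal mutual information agree, and tr(Q) ~= P.
c1 = np.log(np.linalg.det(np.eye(r) + H @ Q @ H.conj().T).real)
c2 = np.log(np.linalg.det(np.eye(t) + Q @ H.conj().T @ H).real)
c3 = np.sum(np.log(np.maximum(mu * lam, 1.0)))    # sum_i (log mu*lambda_i)^+
print(c1, c2, c3)
print(np.trace(Q).real)                           # ~= P
```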
3.3 Error Exponents

Knowing the capacity of a channel is not always sufficient. One may be interested in knowing how hard it is to get close to this capacity. Error exponents provide a partial answer to this question by giving an upper bound on the probability of error achievable by block codes of a given length $n$ and rate $R$. The upper bound is known as the random coding bound and is given by
$$P(\text{error}) \le \exp\bigl(-n E_r(R)\bigr),$$
where the random coding exponent $E_r(R)$ is given by
$$E_r(R) = \max_{0 \le \rho \le 1} E_0(\rho) - \rho R,$$
where, in turn, $E_0(\rho)$ is given by the supremum, over all input distributions $q_x$ satisfying the energy constraint, of
$$E_0(\rho, q_x) = -\log \int \left[ \int q_x(x)\, p(y \mid x)^{1/(1+\rho)}\, dx \right]^{1+\rho} dy.$$
In our case $p(y \mid x) = \det(\pi I_r)^{-1} \exp\bigl(-(y - Hx)^\dagger (y - Hx)\bigr)$. If we choose $q_x$ as the Gaussian distribution $\gamma_Q$ we get, after some algebra,
$$E_0(\rho, Q) = \rho \log \det\bigl(I_r + (1+\rho)^{-1} HQH^\dagger\bigr) = \rho\, \Psi\bigl((1+\rho)^{-1} Q, H\bigr).$$
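The closed form for $E_0$ makes the random coding bound easy to evaluate for a given input covariance. The sketch below (assuming NumPy; the choice $Q = (P/t)I_t$ and the grid over $\rho$ are illustrative, so it evaluates $E_0(\rho, Q)$ for one admissible $Q$ rather than the supremum) computes $E_r(R)$ and the bound $\exp(-nE_r(R))$ for a few rates below the mutual information.

```python
import numpy as np

def E0(rho, Q, H):
    """E_0(rho, Q) = rho * log det(I_r + (1+rho)^{-1} H Q H^dagger), in nats."""
    r = H.shape[0]
    M = np.eye(r) + (H @ Q @ H.conj().T) / (1.0 + rho)
    return rho * np.log(np.linalg.det(M).real)

rng = np.random.default_rng(6)
r, t, P = 4, 4, 5.0
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
Q = (P / t) * np.eye(t)                    # one admissible input covariance, tr(Q) = P

rhos = np.linspace(0.0, 1.0, 1001)
mutual_info = np.log(np.linalg.det(np.eye(r) + H @ Q @ H.conj().T).real)
n_block = 100                              # block length for the bound exp(-n E_r(R))
for R in [0.25 * mutual_info, 0.5 * mutual_info, 0.75 * mutual_info]:
    Er = max(E0(rho, Q, H) - rho * R for rho in rhos)
    print(f"R = {R:.2f} nats: E_r(R) = {Er:.3f}, bound = {np.exp(-n_block * Er):.2e}")
```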