Orthogonal Matching Pursuit: Recursive Function Approximation with Applications to Wavelet Decomposition

Y. C. Pati
Information Systems Laboratory, Dept. of Electrical Engineering
Stanford University, Stanford, CA 94305

R. Rezaiifar and P. S. Krishnaprasad
Institute for Systems Research, Dept. of Electrical Engineering
University of Maryland, College Park, MD 20742

Abstract

In this paper we describe a recursive algorithm to compute representations of functions with respect to nonorthogonal and possibly overcomplete dictionaries of elementary building blocks, e.g. affine (wavelet) frames. We propose a modification to the Matching Pursuit algorithm of Mallat and Zhang (1992) that maintains full backward orthogonality of the residual (error) at every step and thereby leads to improved convergence. We refer to this modified algorithm as Orthogonal Matching Pursuit (OMP). It is shown that all additional computation required for the OMP algorithm may be performed recursively.

1 Introduction and Background

Given a collection of vectors $D = \{x_n\}$ in a Hilbert space $H$, let us define $V = \mathrm{Span}\{x_n\}$ and $W = V^\perp$ (in $H$). We shall refer to $D$ as a dictionary, and will assume the vectors $x_n$ are normalized ($\|x_n\| = 1$). In [3] Mallat and Zhang proposed an iterative algorithm, which they termed Matching Pursuit (MP), to construct representations of the form

$$P_V f = \sum_n a_n x_n, \qquad (1)$$

where $P_V$ is the orthogonal projection operator onto $V$. Each iteration of the MP algorithm results in an intermediate representation of the form

$$f = f_k + R_k f,$$

where $f_k$ is the current approximation and $R_k f$ the current residual (error). Using initial values of $R_0 f = f$, $f_0 = 0$, and $k = 1$, the MP algorithm comprises the following steps:

(I) Compute the inner products $\{\langle R_k f, x_n \rangle\}_n$.

(II) Find $n_{k+1}$ such that $|\langle R_k f, x_{n_{k+1}} \rangle| \ge \alpha \sup_j |\langle R_k f, x_j \rangle|$, where $0 < \alpha \le 1$.

(III) Set

$$f_{k+1} = f_k + \langle R_k f, x_{n_{k+1}} \rangle x_{n_{k+1}}, \qquad R_{k+1} f = R_k f - \langle R_k f, x_{n_{k+1}} \rangle x_{n_{k+1}}.$$

(IV) Increment $k$ ($k \leftarrow k+1$), and repeat steps (I)-(IV) until some convergence criterion has been satisfied.

The proof of convergence [3] of MP relies essentially on the fact that $\langle R_{k+1} f, x_{n_{k+1}} \rangle = 0$. This orthogonality of the residual to the last vector selected leads to the following "energy conservation" equation:

$$\|R_k f\|^2 = \|R_{k+1} f\|^2 + |\langle R_k f, x_{n_{k+1}} \rangle|^2. \qquad (2)$$

It has been noted that the MP algorithm may be derived as a special case of a technique known as Projection Pursuit (cf. [2]) in the statistics literature.
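To make steps (I)-(IV) concrete, the following sketch implements MP over a finite dictionary stored as the columns of a matrix. This example is ours rather than the paper's: the function name, the stopping tolerance, and the choice $\alpha = 1$ in step (II) are all assumptions.

```python
import numpy as np

def matching_pursuit(f, X, n_iter=100, tol=1e-6):
    """Matching Pursuit sketch: X is (d, M) with unit-norm columns."""
    residual = np.asarray(f, dtype=float).copy()
    coeffs = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # (I) inner products of the residual with every dictionary vector
        inner = X.T @ residual
        # (II) pick the best-matching vector (alpha = 1)
        n_next = int(np.argmax(np.abs(inner)))
        # (III) update the approximation and the residual
        coeffs[n_next] += inner[n_next]
        residual -= inner[n_next] * X[:, n_next]
        # (IV) repeat until a convergence criterion is met
        if np.linalg.norm(residual) < tol:
            break
    return coeffs, residual
```

Note that MP orthogonalizes the residual only against the most recently selected vector; this is the source of the suboptimality discussed next.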
A shortcoming of the Matching Pursuit algorithm in its originally proposed form is that, although asymptotic convergence is guaranteed, the resulting approximation after any finite number of iterations will in general be suboptimal in the following sense. Let $N < \infty$ be the number of MP iterations performed. Thus we have

$$f_N = \sum_{k=0}^{N-1} \langle R_k f, x_{n_{k+1}} \rangle x_{n_{k+1}}.$$

Define $V_N = \mathrm{Span}\{x_{n_1}, \ldots, x_{n_N}\}$. We shall refer to $f_N$ as an optimal $N$-term approximation if $f_N = P_{V_N} f$, i.e. $f_N$ is the best approximation we can construct using the selected subset $\{x_{n_1}, \ldots, x_{n_N}\}$ of the dictionary $D$. (Note that this notion of optimality does not involve the problem of selecting an "optimal" $N$-element subset of the dictionary.) In this sense, $f_N$ is an optimal $N$-term approximation if and only if $R_N f \in V_N^\perp$. As MP only guarantees that $R_N f \perp x_{n_N}$, the $f_N$ generated by MP will in general be suboptimal.

The difficulty with such suboptimality is easily illustrated by a simple example in $\mathbb{R}^2$. Let $x_1$ and $x_2$ be two vectors in $\mathbb{R}^2$, and take $f \in \mathbb{R}^2$, as shown in Figure 1(a). Figure 1(b) is a plot of $\|R_k f\|^2$ versus $k$. Hence, although asymptotic convergence is guaranteed, after any finite number of steps the error may still be quite large.

Figure 1: Matching Pursuit example in $\mathbb{R}^2$: (a) dictionary $D = \{x_1, x_2\}$ and a vector $f \in \mathbb{R}^2$; (b) $\|R_k f\|^2$ versus $k$.

In this paper we propose a refinement of the Matching Pursuit (MP) algorithm that we refer to as Orthogonal Matching Pursuit (OMP). For nonorthogonal dictionaries, OMP will in general converge faster than MP. For any finite dictionary of $N$ elements, OMP converges to the projection onto the span of the dictionary elements in no more than $N$ steps. Furthermore, after any finite number of iterations, OMP gives the optimal approximation with respect to the selected subset of the dictionary. This is achieved by ensuring full backward orthogonality of the error, i.e. at each iteration $R_k f \in V_k^\perp$. For the example in Figure 1, OMP ensures convergence in exactly two iterations. It is also shown that the additional computation required for OMP takes a simple recursive form. We demonstrate the utility of OMP by example of applications to representing functions with respect to time-frequency localized affine wavelet dictionaries. We also compare the performance of OMP with that of MP on two numerical examples.

(A similar difficulty with the Projection Pursuit algorithm was noted by Donoho et al. [1], who suggested that backfitting may be used to improve the convergence of PPR. Although the technique is not fully described in [1], it appears to be in the same spirit as the technique we present here.)

2 Orthogonal Matching Pursuit

Assume we have the following $k$th-order model for $f \in H$:

$$f = \sum_{n=1}^{k} a_n^k x_n + R_k f, \quad \text{with } \langle R_k f, x_n \rangle = 0, \; n = 1, \ldots, k. \qquad (3)$$

The superscript $k$ in the coefficients $a_n^k$ shows the dependence of these coefficients on the model order. We would like to update this $k$th-order model to a model of order $k+1$:

$$f = \sum_{n=1}^{k+1} a_n^{k+1} x_n + R_{k+1} f, \quad \text{with } \langle R_{k+1} f, x_n \rangle = 0, \; n = 1, \ldots, k+1. \qquad (4)$$

Since elements of the dictionary $D$ are not required to be orthogonal, to perform such an update we also require an auxiliary model for the dependence of $x_{k+1}$ on the previous $x_n$'s ($n = 1, \ldots, k$). Let

$$x_{k+1} = \sum_{n=1}^{k} b_n^k x_n + \gamma_k, \quad \text{with } \langle \gamma_k, x_n \rangle = 0, \; n = 1, \ldots, k. \qquad (5)$$

Thus $\sum_{n=1}^{k} b_n^k x_n = P_{V_k} x_{k+1}$, and $\gamma_k = P_{V_k^\perp} x_{k+1}$ is the component of $x_{k+1}$ which is unexplained by $\{x_1, \ldots, x_k\}$.
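Because (5) is an orthogonal projection onto $V_k$, the coefficients $b_n^k$ and the vector $\gamma_k$ can be obtained from a least-squares fit. Here is a minimal sketch (ours, with assumed names; it solves the projection from scratch, whereas Section 2.3 derives a recursive update):

```python
import numpy as np

def auxiliary_model(X_sel, x_new):
    """Decompose x_new = X_sel @ b + gamma, with gamma orthogonal to the
    columns of X_sel (the previously selected vectors), as in model (5)."""
    b, *_ = np.linalg.lstsq(X_sel, x_new, rcond=None)
    gamma = x_new - X_sel @ b   # the part of x_new unexplained by X_sel
    return b, gamma
```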
Using the auxiliary model (5), it may be shown that the correct update from the $k$th-order model to the model of order $k+1$ is given by

$$a_n^{k+1} = a_n^k - \alpha_k b_n^k, \quad n = 1, \ldots, k, \qquad a_{k+1}^{k+1} = \alpha_k, \qquad \text{where } \alpha_k = \frac{\langle R_k f, x_{k+1} \rangle}{\|\gamma_k\|^2}. \qquad (6)$$

It also follows that the residual satisfies

$$R_k f = R_{k+1} f + \alpha_k \gamma_k,$$

and hence, since $R_{k+1} f \perp \gamma_k$,

$$\|R_k f\|^2 = \|R_{k+1} f\|^2 + \frac{|\langle R_k f, x_{k+1} \rangle|^2}{\|\gamma_k\|^2}. \qquad (7)$$

2.1 The OMP Algorithm

The results of the previous section may be used to construct the following algorithm, which we will refer to as Orthogonal Matching Pursuit (OMP).

Initialization: $f_0 = 0$, $R_0 f = f$, $D_0 = \{\,\}$, $x_0 = 0$, $a^0 = 0$, $k = 0$.

(I) Compute the inner products $\{\langle R_k f, x_n \rangle\}_n$.

(II) Find $x_{n_{k+1}}$ such that $|\langle R_k f, x_{n_{k+1}} \rangle| \ge \alpha \sup_j |\langle R_k f, x_j \rangle|$, where $0 < \alpha \le 1$.

(III) If $|\langle R_k f, x_{n_{k+1}} \rangle| < \delta$ ($\delta > 0$), then stop.

(IV) Reorder the dictionary $D$ by applying the permutation $k+1 \leftrightarrow n_{k+1}$.

(V) Compute $\{b_n^k\}_{n=1}^{k}$ such that

$$x_{k+1} = \sum_{n=1}^{k} b_n^k x_n + \gamma_k, \qquad \langle \gamma_k, x_n \rangle = 0, \quad n = 1, \ldots, k,$$

and set $\alpha_k = \|\gamma_k\|^{-2} \langle R_k f, x_{k+1} \rangle$.

(VI) Set

$$a_n^{k+1} = a_n^k - \alpha_k b_n^k, \quad n = 1, \ldots, k, \qquad a_{k+1}^{k+1} = \alpha_k,$$

and update the model:

$$f_{k+1} = \sum_{n=1}^{k+1} a_n^{k+1} x_n, \qquad R_{k+1} f = f - f_{k+1}, \qquad D_{k+1} = D_k \cup \{x_{k+1}\}.$$

(VII) Set $k \leftarrow k+1$, and repeat (I)-(VII).

2.2 Some Properties of OMP

As in the case of MP, convergence of OMP relies on an energy conservation equation, which now takes the form (7). The following theorem summarizes the convergence properties of OMP.

Theorem 2.1 For $f \in H$, let $R_k f$ be the residuals generated by OMP. Then

(i) $\lim_{N \to \infty} R_N f = P_W f$, so that $f_N \to P_V f$;

(ii) $f_N = P_{V_N} f$, $N = 0, 1, 2, \ldots$.

Proof: The proof of convergence parallels the proof of Theorem 1 in [3]. The proof of the second property follows immediately from the orthogonality conditions of Equation (3).

Remarks: The key difference between MP and OMP lies in Property (ii) of Theorem 2.1. Property (ii) implies that at the $N$th step we have the best approximation we can get using the $N$ vectors we have selected from the dictionary. Therefore, in the case of finite dictionaries of size $M$, OMP converges in no more than $M$ iterations to the projection of $f$ onto the span of the dictionary elements. As mentioned earlier, Matching Pursuit does not possess this property.
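The steps above can be condensed into a short program. The sketch below is our illustration, not the authors' code: it fixes $\alpha = 1$, skips the explicit reordering of step (IV), and refits the coefficients with a least-squares solve rather than the recursive updates of (6) and Section 2.3, which keeps the logic transparent at the cost of extra work per iteration.

```python
import numpy as np

def orthogonal_matching_pursuit(f, X, n_iter=100, delta=1e-6):
    """OMP sketch: X is (d, M) with unit-norm columns; f is a length-d signal."""
    f = np.asarray(f, dtype=float)
    residual = f.copy()
    selected = []                                 # indices of chosen vectors
    coeffs = np.zeros(X.shape[1])
    for _ in range(n_iter):
        inner = X.T @ residual                    # step (I)
        n_next = int(np.argmax(np.abs(inner)))    # step (II), alpha = 1
        if np.abs(inner[n_next]) < delta:         # step (III)
            break
        selected.append(n_next)
        # steps (V)-(VI): refit all coefficients so that the new residual
        # is orthogonal to every selected vector (full backward orthogonality)
        X_sel = X[:, selected]
        a, *_ = np.linalg.lstsq(X_sel, f, rcond=None)
        coeffs[:] = 0.0
        coeffs[selected] = a
        residual = f - X_sel @ a
    return coeffs, residual
```

The refit computes exactly $f_k = P_{V_k} f$, so Property (ii) of Theorem 2.1 holds by construction at every iteration.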
2.3 Some Computational Details

As in the case of MP, the inner products $\{\langle R_k f, x_j \rangle\}$ may be computed recursively. For OMP we may express these recursions implicitly in the formula

$$\langle R_k f, x_j \rangle = \langle f - f_k, x_j \rangle = \langle f, x_j \rangle - \sum_{n=1}^{k} a_n^k \langle x_n, x_j \rangle. \qquad (8)$$

The only additional computation required for OMP arises in determining the $b_n^k$'s of the auxiliary model (5). To compute the $b_n^k$'s we rewrite the normal equations associated with (5) as a system of $k$ linear equations,

$$v_k = A_k b_k, \qquad (9)$$

where

$$A_k = \begin{pmatrix} \langle x_1, x_1 \rangle & \cdots & \langle x_k, x_1 \rangle \\ \vdots & & \vdots \\ \langle x_1, x_k \rangle & \cdots & \langle x_k, x_k \rangle \end{pmatrix}, \qquad b_k = \begin{pmatrix} b_1^k \\ \vdots \\ b_k^k \end{pmatrix}, \qquad v_k = \begin{pmatrix} \langle x_{k+1}, x_1 \rangle \\ \vdots \\ \langle x_{k+1}, x_k \rangle \end{pmatrix}.$$

Note that the positive constant $\delta$ used in Step (III) of OMP guarantees nonsingularity of the matrix $A_k$; hence we may write

$$b_k = A_k^{-1} v_k. \qquad (10)$$

However, since $A_{k+1}$ may be written as

$$A_{k+1} = \begin{pmatrix} A_k & v_k \\ v_k^* & 1 \end{pmatrix}$$

(where $*$ denotes conjugate transpose), it may be shown using the block matrix inversion formula that

$$A_{k+1}^{-1} = \begin{pmatrix} A_k^{-1} + \beta\, b_k b_k^* & -\beta\, b_k \\ -\beta\, b_k^* & \beta \end{pmatrix}, \qquad \text{where } \beta = \frac{1}{1 - v_k^* b_k}.$$

Hence $A_{k+1}^{-1}$, and therefore $b_{k+1}$, may be computed recursively using $A_k^{-1}$ and $b_k$ from the previous step.
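The recursion for $A_{k+1}^{-1}$ translates directly into code. The following sketch (ours; real-valued case, so the conjugate transpose reduces to a transpose) applies the block inversion formula above:

```python
import numpy as np

def grow_gram_inverse(A_inv, v):
    """Given A_inv = A_k^{-1} and v = v_k (inner products of the new unit-norm
    vector with the k selected vectors), return b_k and A_{k+1}^{-1}."""
    b = A_inv @ v                        # b_k = A_k^{-1} v_k, eq. (10)
    beta = 1.0 / (1.0 - v @ b)           # reciprocal Schur complement of A_k
    k = A_inv.shape[0]
    A_next_inv = np.empty((k + 1, k + 1))
    A_next_inv[:k, :k] = A_inv + beta * np.outer(b, b)
    A_next_inv[:k, k] = -beta * b
    A_next_inv[k, :k] = -beta * b
    A_next_inv[k, k] = beta
    return b, A_next_inv
```

Each update costs $O(k^2)$ operations, compared with $O(k^3)$ for solving (9) afresh at every iteration.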
3 Examples

In the following examples we consider representations with respect to an affine wavelet frame constructed from dilates and translates of the second derivative of a Gaussian, i.e. $D = \{\psi_{m,n},\; m, n \in \mathbb{Z}\}$, where

$$\psi_{m,n}(t) = 2^{m/2}\,\psi(2^m t - n),$$

and the analyzing wavelet $\psi$ is the second derivative of a Gaussian,

$$\psi(t) = (1 - t^2)\, e^{-t^2/2},$$

up to normalization so that $\|\psi_{m,n}\| = 1$. Note that for wavelet dictionaries, the initial set of inner products $\{\langle f, \psi_{m,n} \rangle\}$ is readily computed by one convolution followed by sampling at each dilation level $m$. The dictionary used in these examples consists of a total of 351 vectors.

In our first example, both OMP and MP were applied to the signal shown in Figure 2(a). We see from Figure 2(b) that OMP clearly converges in far fewer iterations than MP.

Figure 2: Example I: (a) original signal $f$, with the OMP approximation superimposed; (b) squared $L^2$ norm of the residual $R_k f$ versus iteration number $k$, for both OMP (solid line) and MP (dashed line).

The squared magnitudes of the coefficients $\alpha_k$ of the resulting representation are shown in Figure 3.

Figure 3: Distribution of coefficients obtained by applying OMP in Example I, plotted against translation index. Shading is proportional to the squared magnitude of the coefficients $\alpha_k$, with dark colors indicating large magnitudes.

We could also compare the two algorithms on the basis of the computational effort required to compute representations of signals to within a prespecified error. However, such a comparison can only be made for a given signal and dictionary, as the number of iterations required by each algorithm depends on both the signal and the dictionary. For example, for the signal of Example I, we see from Figure 4 that it is 3 to 8 times more expensive to achieve a prespecified error using OMP, even though OMP converges in fewer iterations.

Figure 4: Computational cost (FLOPS) versus approximation error for both OMP (solid line) and MP (dashed line) applied to the signal in Example I.

On the other hand, for the signal shown in Figure 5, which lies in the span of three dictionary vectors, it is approximately 20 times more expensive to apply MP. In this case OMP converges in exactly three iterations.

Figure 5: Example II: (a) original signal $f$; (b) squared $L^2$ norm of the residual $R_k f$ versus iteration number $k$, for both OMP (solid line) and MP (dashed line).
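For readers who wish to experiment, here is one way such a dictionary might be assembled. This construction is our sketch: the sampling grid, the ranges of $m$ and $n$, and the unit-norm normalization are assumptions; the paper specifies only that 351 vectors were used in total.

```python
import numpy as np

def mexican_hat(t):
    """Analyzing wavelet: second derivative of a Gaussian (up to sign/scale)."""
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

def wavelet_dictionary(num_samples=256, scales=range(-2, 3), shifts=range(-8, 9)):
    """Columns are unit-norm samples of 2^(m/2) psi(2^m t - n)."""
    t = np.linspace(-10.0, 10.0, num_samples)
    atoms = []
    for m in scales:
        for n in shifts:
            atom = 2.0**(m / 2.0) * mexican_hat(2.0**m * t - n)
            norm = np.linalg.norm(atom)
            if norm > 1e-12:        # drop atoms with no support on the grid
                atoms.append(atom / norm)
    return np.column_stack(atoms)
```

Passing such a matrix to the `orthogonal_matching_pursuit` sketch above reproduces the kind of MP/OMP comparison shown in Figures 2, 4, and 5.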
4 Summary and Conclusions

In this paper we have described a recursive algorithm, which we refer to as Orthogonal Matching Pursuit (OMP), to compute representations of signals with respect to arbitrary dictionaries of elementary functions. The algorithm we have described is a modification of the Matching Pursuit (MP) algorithm of Mallat and Zhang [3] that improves convergence using an additional orthogonalization step. The main benefit of OMP over MP is the fact that it is guaranteed to converge in a finite number of steps for a finite dictionary. We also demonstrated that all additional computation required for OMP may be performed recursively.

The two algorithms, MP and OMP, were compared on two simple examples of decomposition with respect to a wavelet dictionary. It was noted that although OMP converges in fewer iterations than MP, the computational effort required by each algorithm depends on both the class of signals and the choice of dictionary. Although we do not provide a rigorous argument here, it seems reasonable to conjecture that OMP will be computationally cheaper than MP for very redundant dictionaries, as knowledge of the redundancy is exploited in OMP to reduce the error as much as possible at each step.

Acknowledgements

The research of Y.C.P. was supported in part by NASA Headquarters, Center for Aeronautics and Space Information Sciences (CASIS) under Grant NAGW419, S6, and in part by the Advanced Research Projects Agency of the Department of Defense, monitored by the Air Force Office of Scientific Research under Contract F49620-93-1-0085. The research of R.R. and P.S.K. was supported in part by the Air Force Office of Scientific Research under Contract F49620-92-J-0500, the AFOSR University Research Initiative Program under Grant AFOSR-90-0105, the Army Research Office under Smart Structures URI Contract no. DAAL03-92-G-0121, and the National Science Foundation's Engineering Research Centers Program, NSFD CDR 8803012.

References

[1] D. Donoho, I. Johnstone, P. Rousseeuw, and W. Stahel. Discussion following the article by P. Huber. The Annals of Statistics, 13(2):496-500, 1985.

[2] P. J. Huber. Projection pursuit. The Annals of Statistics, 13(2):435-475, 1985.

[3] S. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. Preprint, submitted to IEEE Transactions on Signal Processing, 1992.