Speech Coding and Decoding Protocol G.729 and Its Modeling.pdf

Digital Coding of Speech Signals and Images, Spring 2004
ITU-T G.729/G.729A CS-ACELP 8 kbps Speech Coder
Presented by: Lior Shadhan

Agenda
- Introduction
- The G.729 Encoder
  - Overview
  - Basic Block Diagram
  - General Algorithm Description
  - Quantized Parameters and Quantization Methods
- The G.729 Decoder
  - Overview
  - Basic Block Diagram
  - General Algorithm Description
- Extensions (G.729 Standard Annexes, G.729A)
- Performance

Introduction
G.729 is the result of combined work by ITU-T Study Groups 15 and 12, academic institutions, and industry between 1990 and 1996.
Goal: define an 8 kbps speech coder with speech quality equivalent to that of 32 kbps G.726 ADPCM under most operating conditions.
Main features:
- Short algorithmic delay (10 ms frame + 5 ms look-ahead).
- Low-bandwidth toll-quality coder (8 kbps at ~3.9 MOS).
- Provision for concealment of detected frame erasures.
- Improved robustness against channel errors.
- Bit stream interoperable with G.729A.

Encoder: Basic Block Diagram
[Figure: G.729 encoder block diagram; not recoverable from the extracted text]

Encoder: Algorithm Overview

Input samples
- Band-limited speech, sampled at 8 kHz.
- 16-bit linear PCM.

Preprocessing
- High-pass filter with a 140 Hz cutoff.
- Scaling (division by 2).

LP analysis (10th order)
- Performed once per 10 ms frame (80 samples) using a 30 ms (240-sample) asymmetric window.
- The LP coefficients are computed from the modified autocorrelation with the Levinson-Durbin algorithm, which also yields the PARCOR (reflection) coefficients.
- The LSFs are computed from the LP coefficients.
- The LSFs are quantized (codewords L0, L1, L2, L3).
- The computed LSFs are used for the second sub-frame. The first sub-frame uses LSFs interpolated in the cosine domain.
- The LSFs are converted back to LP coefficients to construct the synthesis and weighting filters for each sub-frame.

Perceptual weighting
- The weighting filter is $W(z) = \dfrac{A(z/\gamma_1)}{A(z/\gamma_2)}$.
- It is based on the unquantized LP coefficients.
- The values of $\gamma_1$ and $\gamma_2$ are a function of the spectral shape of the input signal.
- $\gamma_1$ and $\gamma_2$ are adapted once per frame; the values are interpolated for the first sub-frame.
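
The Levinson-Durbin step of the LP analysis above can be sketched as follows. This is a minimal floating-point illustration, not the fixed-point routine of the standard. The function name `levinson_durbin`, the input list `r` of autocorrelation values, and the sign convention A(z) = 1 + sum of a[i] z^-i are assumptions made for this example.

```python
def levinson_durbin(r, order=10):
    """Solve the LP normal equations from autocorrelations r[0..order].

    Returns (a, k, err):
      a   - predictor polynomial a[0..order] with a[0] = 1,
            i.e. A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
      k   - reflection (PARCOR) coefficients, one per recursion step
      err - final prediction-error energy
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor with the next lag.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        # Order update: a_i[j] = a_{i-1}[j] + k_i * a_{i-1}[i - j].
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + ki * prev[i - j]
        a[i] = ki
        err *= (1.0 - ki * ki)
    return a, k, err
```

For a first-order-like autocorrelation such as `r[m] = 0.9 ** m`, the recursion gives `a[1]` close to -0.9 and all higher coefficients close to zero, which is a quick sanity check for the sign convention.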

Encoder: Algorithm Overview (continued)

Open-loop pitch estimation
- Calculated once per frame from the weighted speech signal $s_w(n)$.
- Three maxima of the correlation $R(k) = \sum_{n=0}^{79} s_w(n)\, s_w(n-k)$ are chosen, one from each of the ranges k = 20, ..., 39; k = 40, ..., 79; k = 80, ..., 143.
- The maxima are normalized and weighted, favoring the delays in the lower ranges.

Closed-loop pitch search (adaptive codebook)
- For each sub-frame, search the adaptive codebook for the excitation that minimizes the weighted MSE, i.e. maximizes
  $R(k) = \dfrac{\sum_{n=0}^{39} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{39} y_k(n)\, y_k(n)}}$,
  where $y_k(n)$ is the filtered adaptive-codebook vector at delay k.
- The target signal x(n) is the LP residual filtered through the quantized synthesis filter and the unquantized weighting filter: $x(n) = r(n) * h(n) - \mathrm{ZIR}$, where ZIR is the zero-input response.
- The lag for the first sub-frame is searched around the open-loop lag.
- The lag for the second sub-frame is searched around the lag of the first sub-frame.
- The adaptive-codebook gain is given by $g_p = \dfrac{\sum_{n=0}^{39} x(n)\, y(n)}{\sum_{n=0}^{39} y(n)\, y(n)}$.
- The lags and gains for each sub-frame are quantized.
- A new target signal is computed for the fixed-codebook search: $x'(n) = x(n) - g_p\, y(n)$.
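
The last two steps of the closed-loop search above (the adaptive-codebook gain and the updated target) can be sketched as follows. This is a floating-point illustration only. The function names and the zero-energy guard are assumptions of this sketch, and gain quantization is left out.

```python
def adaptive_codebook_gain(x, y):
    """g_p = sum x(n) y(n) / sum y(n) y(n) over one sub-frame."""
    num = sum(xn * yn for xn, yn in zip(x, y))
    den = sum(yn * yn for yn in y)
    # Guard against an all-zero filtered vector (an assumption of this sketch).
    return num / den if den > 0.0 else 0.0


def fixed_codebook_target(x, y, gp):
    """x'(n) = x(n) - g_p y(n): remove the adaptive-codebook contribution."""
    return [xn - gp * yn for xn, yn in zip(x, y)]
```

With the unquantized gain, the updated target x'(n) is orthogonal to y(n), which is what makes the gain optimal in the least-squares sense.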

Encoder: Algorithm Overview (continued)

Algebraic codebook search
- Codebook vectors are determined from the transmitted index.
- Each code-vector contains four non-zero pulses, each positive or negative.
- The optimum codeword maximizes
  $\dfrac{\left(\sum_{n=0}^{39} d(n)\, c_k(n)\right)^2}{c_k^T H^T H\, c_k}$, with $d(n) = \sum_{i=n}^{39} x'(i)\, h(i-n)$.
- Because lag values can be shorter than the sub-frame size, h(n) is filtered with a long-term prediction filter before the codebook search.
- The search is performed in four nested loops; each loop adds the contribution of a new pulse.

Bit Allocation
Bit allocation for a 10 ms frame:

Parameter                    Codeword         Subframe 1  Subframe 2  Total per frame  Description
Line Spectrum Pairs          L0, L1, L2, L3   -           -           18               LP filter coefficients
Adaptive Codebook Index      P1, P2           8           5           13               Lag / delay
Parity for P1                P0               1           -           1                Parity bit for P1 (over its 6 most significant bits)
Fixed Codebook Index         C1, C2           13          13          26               Indexes of 4 pulses
Fixed Codebook Pulse Signs   S1, S2           4           4           8                Amplitudes (signs) of the pulses
Codebook Gains (stage 1)     F1, F2           3           3           6                Encoded gains
Codebook Gains (stage 2)     G1, G2           4           4           8                Encoded gains
Total                                                                 80               bits/frame

Total number of bits per 10 ms frame: 80. Resulting rate: 8 kbps.
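
The arithmetic behind the bit allocation above can be checked with a few lines. This is only a bookkeeping sketch; the table layout `ALLOCATION` and the function names are made up for the example, while the bit counts are the ones listed above.

```python
FRAME_MS = 10  # frame duration in milliseconds

# (codeword, bits in sub-frame 1, bits in sub-frame 2, frame-level bits)
ALLOCATION = [
    ("L0-L3", 0, 0, 18),   # line spectrum pairs, sent once per frame
    ("P1/P2", 8, 5, 0),    # adaptive-codebook lags
    ("P0", 1, 0, 0),       # parity bit protecting P1
    ("C1/C2", 13, 13, 0),  # fixed-codebook pulse positions
    ("S1/S2", 4, 4, 0),    # fixed-codebook pulse signs
    ("F1/F2", 3, 3, 0),    # codebook gains, stage 1
    ("G1/G2", 4, 4, 0),    # codebook gains, stage 2
]


def bits_per_frame(allocation):
    """Sum sub-frame and frame-level bits over all parameters."""
    return sum(s1 + s2 + frame for _, s1, s2, frame in allocation)


def bit_rate_bps(allocation, frame_ms=FRAME_MS):
    """Bits per frame divided by the frame duration, in bits per second."""
    return bits_per_frame(allocation) * 1000 // frame_ms
```

The sub-frame columns contribute 33 and 29 bits, and the 18 LSP bits bring the frame to 80 bits, i.e. 8000 bit/s at one frame every 10 ms.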

Quantization Methods

LSFs
- Quantized with predictive VQ using a 4th-order MA predictor.
- Two MA predictor modes (L0): strong or mild correlation of the LSF coefficients between frames.
- The difference between the computed and predicted coefficients is quantized with a two-stage vector quantizer:
  - First stage: a 10-D VQ using a codebook with 128 entries (L1). The vector is chosen with an MSE criterion.
  - Second stage: a 10-bit VQ implemented as a split VQ using two 5-D codebooks of 32 entries each (L2, L3). The two vectors are chosen with a weighted MSE criterion.
- The MA predictor that minimizes the weighted MSE is chosen.

Adaptive codebook indexes
- The lag for the first sub-frame is encoded using 8 bits plus a parity bit.
- The lag for the second sub-frame is encoded relative to the lag of the first sub-frame using 5 bits.

Fixed codebook indexes
- Binary encoding of the pulse location (3 or 4 bits) for each pulse.

Fixed codebook pulse signs
- One sign bit for each pulse.

Codebook gains
- Quantized together using a 7-bit CS-VQ (3-bit + 4-bit 2-D VQ).
- For the fixed codebook, only the correction factor $\gamma$ of the MA-predicted gain is quantized.
- Criterion: minimize $E_w = \sum_{n=0}^{39} \left(x(n) - g_p\, y(n) - g_c\, z(n)\right)^2$, where y(n) and z(n) are the filtered adaptive- and fixed-codebook vectors.
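
The gain criterion above can be illustrated with a small exhaustive search. This is a sketch under stated assumptions, not the procedure of the standard: the codebooks `cb_a` and `cb_b` are hypothetical toy tables, each entry is assumed to be a pair (pitch-gain part, correction-factor part), and the two stages are assumed to combine by addition. A real encoder may also prune the search rather than test every pair.

```python
def gain_error(x, y, z, gp, gc):
    """E = sum over the sub-frame of (x(n) - gp*y(n) - gc*z(n))^2."""
    return sum((xn - gp * yn - gc * zn) ** 2
               for xn, yn, zn in zip(x, y, z))


def search_gains(x, y, z, gc_pred, cb_a, cb_b):
    """Try every pair of entries and keep the one with the smallest error.

    gc_pred is the MA-predicted fixed-codebook gain; only the correction
    factor applied to it comes from the codebooks.
    Returns (error, index_a, index_b, gp, gc).
    """
    best = None
    for i, (pa, ga) in enumerate(cb_a):
        for j, (pb, gb) in enumerate(cb_b):
            gp = pa + pb                # quantized adaptive-codebook gain
            gc = gc_pred * (ga + gb)    # predicted gain times correction
            e = gain_error(x, y, z, gp, gc)
            if best is None or e < best[0]:
                best = (e, i, j, gp, gc)
    return best
```

With a 3-bit and a 4-bit table the loop visits 8 x 16 = 128 pairs per sub-frame, which matches the 7 bits spent on the gains.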

Decoder: Block Diagram
[Figure: G.729 decoder block diagram; not recoverable from the extracted text]

Decoder: Algorithm Overview
- Generate the LP filter coefficients.
- For each sub-frame, the scaled adaptive- and fixed-codebook vectors are filtered with the LP synthesis filter to generate the reconstructed speech signal.
- Post-processing is applied to the weighted residual of the reconstructed speech signal as a cascade of:
  - Long-term filter.
  - Short-term filter.
  - Tilt compensation filter.
  - Adaptive gain control.
  - High-pass filter.
  - Upscaling (by 2).
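
The sub-frame reconstruction step of the decoder described above can be sketched as follows. This is a floating-point illustration that leaves out the post-processing. The function name, the argument layout, and the convention A(z) = 1 + sum of a[i] z^-i (so the synthesis filter is 1/A(z)) are assumptions of this sketch.

```python
def synthesize_subframe(v, c, gp, gc, a, mem):
    """Build the excitation and run it through the LP synthesis filter 1/A(z).

    v, c   - adaptive- and fixed-codebook vectors for the sub-frame
    gp, gc - their decoded gains
    a      - LP coefficients a[0..order] with a[0] = 1
    mem    - the last `order` output samples, most recent last
    Returns (output samples, updated filter memory).
    """
    order = len(a) - 1
    hist = list(mem)
    out = []
    for vn, cn in zip(v, c):
        u = gp * vn + gc * cn  # scaled sum of the two codebook vectors
        s = u - sum(a[i] * hist[-i] for i in range(1, order + 1))
        hist.append(s)
        out.append(s)
    return out, hist[-order:]
```

Feeding a unit pulse through a first-order filter with `a = [1.0, -0.5]` produces the decaying sequence 1, 0.5, 0.25, ..., a simple way to confirm the sign convention before using 10th-order coefficients.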

Extensions: G.729 Annexes
A         Reduced complexity 8 kbit/s CS-ACELP speech codec
B         A silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70
C         Reference floating-point implementation for G.729 CS-ACELP 8 kbit/s speech coding
D         6.4 kbit/s CS-ACELP speech coding algorithm
E         11.8 kbit/s CS-ACELP speech coding algorithm
F         Reference implementation of G.729 Annex B DTX functionality for Annex D
G         Reference implementation of G.729 Annex B DTX functionality for Annex E
H         Reference implementation of switching procedure between G.729 Annexes D and E
I         Reference fixed-point implementation for integrating G.729 CS-ACELP speech coding main body with Annexes B, D and E
Appendix  External synchronous reset performance for G.729 codecs in systems using external VAD/DTX/CNG

G.729 Annex A
- Bit stream interoperable with G.729.
- Reduces the complexity of G.729 by about 40-50% without significant degradation (small degradation in the case of three tandem encodings and in the presence of background noise).
- Main algorithmic changes:
  - Weighting filter: $W(z) = \dfrac{\hat{A}(z)}{\hat{A}(z/\gamma)}$, with $\gamma = 0.75$.
  - Open-loop pitch analysis is simplified using decimation: $R(k) = \sum_{n=0}^{39} s_w(2n)\, s_w(2n-k)$.
  - Simplified adaptive codebook search, i.e. maximize $\sum_{n=0}^{39} x(n)\, y_k(n) = \sum_{n=0}^{39} x_b(n)\, u_k(n)$, where $x_b(n)$ is the backward-filtered target and $u_k(n)$ is the past excitation at delay k.
  - The algebraic codebook search is simplified.
  - The long-term filter at the receiver uses only integer delays.
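
The decimated open-loop correlation of Annex A listed above can be sketched as follows. This is an illustration only: the buffer layout (past samples followed by the current frame), the function names, and the per-range maximum search are assumptions of this sketch.

```python
def decimated_correlation(sw, start, k, frame_len=80):
    """R(k) = sum_{n=0}^{frame_len/2 - 1} s_w(2n) s_w(2n - k).

    `sw` holds past weighted-speech samples followed by the current frame.
    `start` is the index of the first sample of the current frame, so it
    must be at least k to reach far enough into the past.
    """
    return sum(sw[start + 2 * n] * sw[start + 2 * n - k]
               for n in range(frame_len // 2))


def best_lag_in_range(sw, start, lo, hi):
    """Return the delay in [lo, hi] with the largest decimated correlation."""
    return max(range(lo, hi + 1),
               key=lambda k: decimated_correlation(sw, start, k))
```

Only every second sample enters the sum, so each delay costs 40 multiply-adds instead of 80. Calling `best_lag_in_range` once per delay range gives one candidate per range, as in the open-loop search of the encoder overview.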