Digital Coding of Speech Signals and Images – Spring 2004
ITU-T G.729/G.729A
CS-ACELP 8kbps Speech Coder
Presented by:
Lior Shadhan
Agenda
Introduction
The G.729 Encoder Overview
    Basic Block Diagram
    General Algorithm Description
    Quantized Parameters and Quantization Methods
The G.729 Decoder Overview
    Basic Block Diagram
    General Algorithm Description
Extensions (G.729 Standard Annexes, G.729A)
Performance
Introduction
G.729 is the result of the combined work of ITU-T Study Groups 15 and 12,
academic institutions and industry between 1990 and 1996.
Their goal was:
Define an 8 kbit/s speech coder with speech quality equivalent
to that of 32 kbit/s G.726 ADPCM
for most operating conditions.
Main Features:
Short algorithmic delay (10 ms frame + 5 ms look-ahead).
Low-bandwidth toll-quality coder (8 kbit/s at ~3.9 MOS).
Provision for concealment of detected frame erasures.
Improved robustness against channel errors.
Bit stream interoperable with G.729A.
Encoder --- Basic Block Diagram
[Figure: encoder block diagram]
Algorithm Overview --- Encoder
Input Samples
Band-limited, sampled at 8 kHz.
Linear PCM, 16 bits.
Preprocessing
High-pass filter, 140 Hz cutoff.
Scaling (division by 2).
LP Analysis (10th order)
Frame: 10 ms (80 samples). Analysis window: 30 ms (240 samples).
Performed once per 10 ms frame using a 30 ms asymmetric window.
PARCOR coefficients are computed from the modified autocorrelation and
converted to LP coefficients using the Levinson-Durbin algorithm.
The LP coefficients are then converted to LSFs.
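The Levinson-Durbin step above can be sketched as follows. This is a minimal floating-point illustration rather than the fixed-point routine of the standard: the function names are hypothetical, the windowing that produces the modified autocorrelation is omitted, and the sign convention of the reflection (PARCOR) coefficients is an assumption, since conventions differ between texts.

```python
def autocorrelation(frame, order):
    """Return r[0..order] for an already windowed frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i - k] for i in range(k, n))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a1 z^-1 + ... + ap z^-p from autocorrelations r.

    Returns (a, k, err): LP coefficients with a[0] == 1, the reflection
    coefficients found at each step, and the final prediction error.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err                  # reflection coefficient of step i
        k.append(ki)
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + ki * prev[i - j]
        a[i] = ki
        err *= (1.0 - ki * ki)           # prediction error shrinks each step
    return a, k, err


# Toy autocorrelation sequence; G.729 itself uses order 10.
a, k, err = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For a real frame, `autocorrelation(windowed_frame, 10)` would supply the input to `levinson_durbin(r, 10)`.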
Algorithm Overview --- Encoder
LP Analysis (continued)
The LSFs are quantized (codewords L0, L1, L2, L3).
The calculated LSFs are used for the second sub-frame.
For the first sub-frame, interpolated LSFs (interpolation in the cosine domain) are
used.
The LSFs are converted back to LP coefficients in order to construct
the synthesis and weighting filters for each sub-frame.
Perceptual Weighting
$$W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}$$
Based on the unquantized LP coefficients.
The values of $\gamma_1$ and $\gamma_2$ are a function of the spectral shape of the
input signal.
$\gamma_1$ and $\gamma_2$ are adapted once per frame.
The values are interpolated for the first sub-frame.
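Replacing z by z/γ in A(z) scales the i-th LP coefficient by γ^i, so the weighting filter is cheap to build once the LP coefficients are known. The sketch below assumes a direct-form implementation with zero initial state. The function names and the γ values in the example are illustrative assumptions, not the adapted values the coder actually selects.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a[i] * gamma**i, with a[0] == 1."""
    return [c * gamma ** i for i, c in enumerate(a)]


def weighting_filter(x, a, gamma1, gamma2):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2), zero initial state."""
    num = bandwidth_expand(a, gamma1)
    den = bandwidth_expand(a, gamma2)
    y = []
    for n in range(len(x)):
        # Feed-forward part: A(z/gamma1) applied to the input.
        acc = sum(num[i] * x[n - i] for i in range(len(num)) if n - i >= 0)
        # Feedback part: 1 / A(z/gamma2) applied to past outputs.
        acc -= sum(den[i] * y[n - i] for i in range(1, len(den)) if n - i >= 0)
        y.append(acc)
    return y


# Illustrative first-order A(z) and gamma values (assumptions).
weighted = weighting_filter([1.0, 0.0, 0.0, 0.0], [1.0, -0.9], 0.94, 0.6)
```

When γ1 equals γ2 the numerator and denominator cancel and the filter passes the signal unchanged, which is a convenient sanity check.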
Algorithm Overview --- Encoder
Open Loop Pitch Estimation
Calculated once per frame from the weighted speech signal $s_w(n)$.
Find three maxima of the correlation
$$R(k) = \sum_{n=0}^{79} s_w(n)\, s_w(n-k)$$
one in each of the ranges:
k = 20, ..., 39; k = 40, ..., 79; k = 80, ..., 143
The maxima are normalized and weighted (favoring the
delays in the lower ranges).
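A minimal sketch of the three-range search described above. The ranges 20-39, 40-79 and 80-143 and the 0.85 bias toward shorter delays are stated here as assumptions taken from the G.729 recommendation as I recall it, and the function names are hypothetical.

```python
import math

FRAME = 80
# Searched from the longest delays to the shortest (assumed order).
RANGES = [(80, 143), (40, 79), (20, 39)]


def correlation(sw, start, k):
    """R(k) over one frame of weighted speech beginning at index `start`."""
    return sum(sw[start + n] * sw[start + n - k] for n in range(FRAME))


def open_loop_pitch(sw, start, bias=0.85):
    """Pick one normalized maximum per range, favoring shorter delays.

    `sw` must hold at least 143 samples of history before `start`.
    """
    best_lag, best_score = None, None
    for lo, hi in RANGES:
        lag = max(range(lo, hi + 1), key=lambda k: correlation(sw, start, k))
        energy = sum(sw[start + n - lag] ** 2 for n in range(FRAME))
        score = correlation(sw, start, lag) / math.sqrt(energy) if energy > 0 else 0.0
        # A shorter delay wins if it comes within `bias` of the current best.
        if best_score is None or score >= bias * best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy input: a pulse train with a period of 50 samples.
pulse_train = [1.0 if m % 50 == 0 else 0.0 for m in range(230)]
lag = open_loop_pitch(pulse_train, 150)
```

The bias is what prevents the search from locking onto a multiple of the true pitch period: for the pulse train, delays of 50 and 100 score equally, and the shorter one is kept.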
Closed Loop Pitch Search (Adaptive Codebook)
Search for the excitation in the adaptive codebook that
minimizes the weighted MSE for each sub-frame,
i.e. maximize:
$$\frac{\sum_{n=0}^{39} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{39} y_k(n)\, y_k(n)}}$$
Algorithm Overview --- Encoder
Closed Loop Pitch Search (continued)
The target signal x(n) is the LP residual filtered
through the quantized synthesis filter and the unquantized weighting filter,
with the zero-input response (ZIR) removed:
x(n) = r(n) * h(n) - ZIR
The lag for the first sub-frame is searched around the open-loop lag.
The lag for the second sub-frame is searched around the lag from the
first sub-frame.
The adaptive codebook gain is given by:
$$g_p = \frac{\sum_{n=0}^{39} x(n)\, y(n)}{\sum_{n=0}^{39} y(n)\, y(n)}$$
The lags and gains for each sub-frame are quantized.
A new target signal is computed for the fixed-codebook search:
$$x'(n) = x(n) - g_p\, y(n)$$
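The closed-loop search, the gain formula and the target update can be put together in a short sketch. It is restricted to integer lags, ignores fractional delays and gain bounds, and handles lags shorter than the sub-frame by simple periodic repetition, all of which are simplifying assumptions. The function names are hypothetical.

```python
import math

SUBFRAME = 40


def filtered_codevector(past_exc, k, h):
    """y_k(n): past excitation delayed by k, convolved with h (truncated)."""
    # For k < 40 the delayed segment is repeated periodically (simplification).
    u = [past_exc[len(past_exc) - k + (n % k)] for n in range(SUBFRAME)]
    return [sum(u[i] * h[n - i] for i in range(n + 1)) for n in range(SUBFRAME)]


def closed_loop_search(x, past_exc, h, lags):
    """Return (lag, g_p, new_target) for the lag with the best normalized correlation."""
    best = None
    for k in lags:
        y = filtered_codevector(past_exc, k, h)
        energy = sum(v * v for v in y)
        if energy <= 0.0:
            continue
        score = sum(p * q for p, q in zip(x, y)) / math.sqrt(energy)
        if best is None or score > best[0]:
            best = (score, k, y)
    if best is None:
        raise ValueError("no candidate lag produced a non-zero codevector")
    _, lag, y = best
    gain = sum(p * q for p, q in zip(x, y)) / sum(v * v for v in y)
    new_target = [p - gain * q for p, q in zip(x, y)]   # x'(n) = x(n) - g_p y(n)
    return lag, gain, new_target


# Toy data: one pulse in the excitation history, decaying impulse response.
past = [0.0] * 160
past[107] = 1.0
h = [0.5 ** n for n in range(SUBFRAME)]
target = [0.5 * v for v in filtered_codevector(past, 60, h)]
lag, gain, residual_target = closed_loop_search(target, past, h, range(40, 81))
```

In the toy example the target is exactly half of the codevector at lag 60, so the search should recover that lag with a gain of 0.5 and leave nothing for the fixed codebook.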
Algorithm Overview --- Encoder
Algebraic Codebook Search
Code-vectors are not stored; each one is determined from the transmitted index.
Each code-vector contains four non-zero pulses, each with amplitude +1 or -1.
The optimum codeword will maximize:
$$\frac{\left(\sum_{n=0}^{39} d(n)\, c_k(n)\right)^2}{c_k^T H^T H\, c_k}, \qquad d(n) = \sum_{i=n}^{39} x'(i)\, h(i-n)$$
where H is the convolution matrix built from h(n).
Because lag values can be less than the sub-frame size, h(n) is filtered
with a long-term prediction filter before the codebook search.
The search is performed in four nested loops, where in each loop
the contribution of a new pulse is added.
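The criterion above can be illustrated with an exhaustive search over all pulse positions. This is a swapped-in technique: the standard uses a focused nested-loop search that does not visit every combination. The track layout (three tracks of 8 positions, one of 16) and the rule that each pulse takes the sign of d(n) at its position are assumptions from the G.729 recommendation as I recall it, and the function names are hypothetical.

```python
from itertools import product

# Assumed pulse tracks: 3 bits for pulses 0-2, 4 bits for pulse 3.
TRACKS = [
    list(range(0, 40, 5)),
    list(range(1, 40, 5)),
    list(range(2, 40, 5)),
    list(range(3, 40, 5)) + list(range(4, 40, 5)),
]


def backward_filter(target, h):
    """d(n) = sum_{i=n}^{N-1} x'(i) h(i - n)."""
    size = len(target)
    return [sum(target[i] * h[i - n] for i in range(n, size)) for n in range(size)]


def correlation_matrix(h, size):
    """Phi = H^T H for the lower-triangular convolution matrix H of h."""
    return [[sum(h[n - i] * h[n - j] for n in range(max(i, j), size))
             for j in range(size)] for i in range(size)]


def search_algebraic_codebook(target, h):
    """Exhaustively maximize C^2 / E; returns four (position, sign) pairs."""
    d = backward_filter(target, h)
    phi = correlation_matrix(h, len(target))
    sign = [1.0 if v >= 0 else -1.0 for v in d]
    best_ratio, best = -1.0, None
    for pos in product(*TRACKS):
        c = sum(abs(d[m]) for m in pos)          # correlation with signs preset
        e = sum(phi[m][m] for m in pos)          # energy: diagonal terms
        for i in range(4):
            for j in range(i + 1, 4):            # energy: cross terms
                e += 2.0 * sign[pos[i]] * sign[pos[j]] * phi[pos[i]][pos[j]]
        if e > 0 and c * c / e > best_ratio:
            best_ratio, best = c * c / e, pos
    return [(m, sign[m]) for m in best]


# Toy target with one dominant sample per track and h(n) = unit impulse.
toy_target = [0.0] * 40
toy_target[5], toy_target[11], toy_target[17], toy_target[24] = 3.0, -2.0, 4.0, 1.0
pulses = search_algebraic_codebook(toy_target, [1.0] + [0.0] * 39)
```

The exhaustive loop visits 8 x 8 x 8 x 16 = 8192 combinations, which is affordable in a sketch but is the cost the nested-loop search is designed to avoid.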
Bit Allocation
Bit Allocation for a 10 ms frame:

| Parameter | Codeword | Subframe 1 | Subframe 2 | Total per frame | Description |
|---|---|---|---|---|---|
| Line Spectrum Pairs | L0, L1, L2, L3 | | | 18 | LP filter coefficients |
| Adaptive Codebook Index | P1, P2 | 8 | 5 | 13 | Lag / delay |
| Parity for P1 | P0 | 1 | | 1 | Parity bit over the 6 MSBs of P1 |
| Fixed Codebook Index | C1, C2 | 13 | 13 | 26 | Indexes of 4 pulses |
| Fixed Codebook Pulse Signs | S1, S2 | 4 | 4 | 8 | Signs (amplitudes) of the pulses |
| Codebook Gains (stage 1) | F1, F2 | 3 | 3 | 6 | Encoded gains |
| Codebook Gains (stage 2) | G1, G2 | 4 | 4 | 8 | Encoded gains |
| Total | | | | 80 | bits/frame |

Total number of bits per 10 ms frame: 80.
Resulting rate: 8 kbit/s.
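The arithmetic behind the 8 kbit/s figure can be checked directly from the allocation. The tuple layout and the names below are only an illustrative encoding of the bit counts above.

```python
FRAME_MS = 10

# (codeword, per-frame bits, subframe-1 bits, subframe-2 bits)
ALLOCATION = [
    ("L0-L3", 18, 0, 0),   # line spectrum pairs, sent once per frame
    ("P1/P2", 0, 8, 5),    # adaptive codebook lag
    ("P0", 0, 1, 0),       # parity for P1
    ("C1/C2", 0, 13, 13),  # fixed codebook pulse positions
    ("S1/S2", 0, 4, 4),    # fixed codebook pulse signs
    ("F1/F2", 0, 3, 3),    # gains, stage 1
    ("G1/G2", 0, 4, 4),    # gains, stage 2
]


def bits_per_frame(allocation):
    """Sum all per-frame and per-subframe bits."""
    return sum(frame + sub1 + sub2 for _, frame, sub1, sub2 in allocation)


def bit_rate(allocation, frame_ms=FRAME_MS):
    """Bit rate in bit/s for one frame every `frame_ms` milliseconds."""
    return bits_per_frame(allocation) * 1000 / frame_ms


total_bits = bits_per_frame(ALLOCATION)
rate = bit_rate(ALLOCATION)
```

Eighty bits every 10 ms corresponds to 100 frames per second, hence 8000 bit/s.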
Quantization Methods
LSFs:
Quantization using predictive VQ (with a 4th-order MA predictor).
Two MA predictor modes (L0): strong / mild correlation of the
LSF coefficients between frames.
The difference between the computed and predicted coefficients is
quantized using a two-stage vector quantizer.
First stage: a 10-D VQ using a codebook with 128 entries (L1).
MSE criterion for choosing the vector.
Second stage: a 10-bit VQ implemented as a split VQ
using two 5-D codebooks, each containing 32 entries (L2, L3).
Weighted MSE criterion for choosing the two vectors.
The MA predictor that minimizes the weighted MSE is chosen.
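The two-stage split search can be sketched as below for a single MA predictor mode. The codebooks in the example are tiny made-up placeholders (the real ones have 128 and 32 entries and are defined in the standard), the MA prediction itself is left out, and the function names are hypothetical.

```python
def nearest(codebook, vec, weights=None):
    """Index of the entry with the smallest (optionally weighted) squared error."""
    if weights is None:
        weights = [1.0] * len(vec)

    def dist(entry):
        return sum(w * (v - e) ** 2 for w, v, e in zip(weights, vec, entry))

    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def two_stage_split_vq(residual, stage1, stage2_low, stage2_high, weights):
    """Quantize a prediction residual; returns ((L1, L2, L3), quantized vector)."""
    half = len(residual) // 2
    i1 = nearest(stage1, residual)                          # stage 1: plain MSE
    err = [r - c for r, c in zip(residual, stage1[i1])]
    i2 = nearest(stage2_low, err[:half], weights[:half])    # stage 2: weighted MSE
    i3 = nearest(stage2_high, err[half:], weights[half:])
    refinement = stage2_low[i2] + stage2_high[i3]           # concatenate the halves
    quantized = [c + d for c, d in zip(stage1[i1], refinement)]
    return (i1, i2, i3), quantized


# Placeholder codebooks with two entries each (assumptions for illustration).
stage1 = [[0.0] * 10, [1.0] * 10]
stage2_low = [[0.0] * 5, [0.1] * 5]
stage2_high = [[0.0] * 5, [-0.1] * 5]
indices, quantized = two_stage_split_vq(
    [1.1] * 5 + [0.9] * 5, stage1, stage2_low, stage2_high, [1.0] * 10)
```

To choose between the two MA predictor modes, the same search would be run once per mode and the mode with the lower weighted error kept.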
Quantization Methods
Adaptive codebook indexes:
The lag for the first sub-frame is encoded using 8 bits plus a parity bit.
The lag for the second sub-frame is encoded relative to the lag of the
first sub-frame using 5 bits.
Fixed codebook indexes:
Binary encoding of each pulse position (3 bits for each of the first three pulses, 4 bits for the fourth).
Fixed codebook pulse signs:
One sign bit for each pulse.
Codebook gains:
Quantized jointly using a 7-bit conjugate-structure VQ (CS-VQ: a 3-bit and a 4-bit 2-D codebook).
For the fixed codebook, only the correction factor γ of the MA
predicted gain is quantized.
Criterion: minimize
$$E_w = \sum_{n=0}^{39} \bigl(x(n) - g_p\, y(n) - g_c\, z(n)\bigr)^2$$
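A sketch of the joint gain search under the criterion above. It assumes the conjugate-structure rule that each candidate pair is the sum of one entry from each 2-D codebook, and that the fixed-codebook gain is the quantized correction factor times the MA-predicted gain. The codebook values are made-up placeholders, the search is exhaustive rather than pre-selected, and the names are hypothetical.

```python
def gain_error(x, y, z, gp, gc):
    """Squared error between target x and gp*y + gc*z."""
    return sum((xn - gp * yn - gc * zn) ** 2 for xn, yn, zn in zip(x, y, z))


def search_gains(x, y, z, predicted_gc, book_a, book_b):
    """Try every pair of entries; returns (index_a, index_b, g_p, g_c)."""
    best = None
    for i, (gp_a, gamma_a) in enumerate(book_a):
        for j, (gp_b, gamma_b) in enumerate(book_b):
            gp = gp_a + gp_b                           # conjugate structure: sum
            gc = (gamma_a + gamma_b) * predicted_gc    # correction times prediction
            err = gain_error(x, y, z, gp, gc)
            if best is None or err < best[0]:
                best = (err, i, j, gp, gc)
    return best[1:]


# Placeholder data: x is exactly 0.75*y + 1.5*z.
y = [1.0, 0.0, 0.0, 0.0]
z = [0.0, 1.0, 0.0, 0.0]
x = [0.75, 1.5, 0.0, 0.0]
book_a = [(0.25, 0.25), (0.5, 0.5)]
book_b = [(0.0, 0.0), (0.25, 0.25)]
choice = search_gains(x, y, z, 2.0, book_a, book_b)
```

Quantizing the correction factor rather than the gain itself keeps the codebook entries in a narrow range, since the predictor absorbs most of the frame-to-frame level variation.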
Decoder --- Block Diagram
[Figure: decoder block diagram]
Algorithm Overview --- Decoder
Generate the LP filter coefficients.
For each sub-frame, the scaled adaptive- and fixed-codebook vectors
are filtered with the LP synthesis filter to generate the reconstructed
speech signal.
Post-processing is then applied to the weighted residual of the
reconstructed speech signal as a cascade of:
Long-term filter.
Short-term filter.
Tilt compensation filter.
Adaptive gain control.
High-pass filter.
Up-scaling (by 2).
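The synthesis step that precedes post-processing can be sketched as follows: the excitation is the gain-scaled sum of the two codebook vectors, and it drives the all-pole filter 1/A(z). The function name and argument layout are hypothetical, and the post-processing stages are not included.

```python
def synthesize_subframe(adaptive_vec, fixed_vec, gp, gc, a, memory):
    """Reconstruct one sub-frame of speech.

    a      -- LP coefficients with a[0] == 1 (A(z) = 1 + a1 z^-1 + ...)
    memory -- the last len(a) - 1 output samples, most recent last
    Returns (excitation, speech).
    """
    order = len(a) - 1
    if len(memory) != order:
        raise ValueError("memory must hold exactly `order` samples")
    out = list(memory)
    excitation = [gp * v + gc * c for v, c in zip(adaptive_vec, fixed_vec)]
    for e in excitation:
        # s(n) = e(n) - sum_i a_i * s(n - i)
        out.append(e - sum(a[i] * out[-i] for i in range(1, order + 1)))
    return excitation, out[order:]


# Toy first-order filter driven by a single pulse.
exc, speech = synthesize_subframe(
    [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], 1.0, 0.0, [1.0, -0.5], [0.0])
```

The excitation returned here is also what a decoder would append to its adaptive-codebook history for the next sub-frame, and the tail of `speech` becomes the next `memory`.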
Extensions
G.729 Annexes:
Annex A: Reduced complexity 8 kbit/s CS-ACELP speech codec
Annex B: A silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70
Annex C: Reference floating-point implementation for G.729 CS-ACELP 8 kbit/s speech coding
Annex D: 6.4 kbit/s CS-ACELP speech coding algorithm
Annex E: 11.8 kbit/s CS-ACELP speech coding algorithm
Annex F: Reference implementation of G.729 Annex B DTX functionality for Annex D
Annex G: Reference implementation of G.729 Annex B DTX functionality for Annex E
Annex H: Reference implementation of switching procedure between G.729 Annexes D and E
Annex I: Reference fixed-point implementation for integrating G.729 CS-ACELP speech coding main body with Annexes B, D and E
Appendix: External synchronous reset performance for G.729 codecs in systems using external VAD/DTX/CNG
G.729 --- Annex A
Bit stream interoperable with G.729.
Reduces the complexity of G.729 by ~40-50% without significant
degradation (small degradation in the case of three tandem encodings and in the
presence of background noise).
Main algorithmic changes:
Weighting filter:
$$W(z) = \frac{\hat{A}(z)}{\hat{A}(z/\gamma)}, \qquad \gamma = 0.75$$
Open loop pitch analysis is simplified using "decimation":
$$R(k) = \sum_{n=0}^{39} s_w(2n)\, s_w(2n-k)$$
Simplified adaptive codebook search, i.e. maximize:
$$\sum_{n=0}^{39} x(n)\, y_k(n) = \sum_{n=0}^{39} x_b(n)\, u_k(n)$$
The algebraic codebook search is simplified.
The long-term filter at the receiver uses only integer delays.
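The saving in the simplified adaptive codebook search comes from filtering the target once, backward, instead of filtering every candidate excitation. The sketch below checks that identity numerically and also shows the decimated correlation. It assumes y_k is the truncated convolution of u_k with h and that x_b is the backward-filtered target; the function names are hypothetical.

```python
def convolve_truncated(u, h):
    """y(n) = sum_{i=0}^{n} u(i) h(n - i), truncated to len(u) samples."""
    return [sum(u[i] * h[n - i] for i in range(n + 1)) for n in range(len(u))]


def backward_filter(x, h):
    """x_b(n) = sum_{i=n}^{N-1} x(i) h(i - n)."""
    return [sum(x[i] * h[i - n] for i in range(n, len(x))) for n in range(len(x))]


def correlation_direct(x, u, h):
    """sum x(n) y(n): needs one convolution per candidate excitation u."""
    return sum(p * q for p, q in zip(x, convolve_truncated(u, h)))


def correlation_backward(x_b, u):
    """sum x_b(n) u(n): x_b is computed once and reused for every candidate."""
    return sum(p * q for p, q in zip(x_b, u))


def decimated_correlation(sw, start, k):
    """Open-loop correlation using only every second sample of the frame."""
    return sum(sw[start + 2 * n] * sw[start + 2 * n - k] for n in range(40))


# Small worked example: both forms give the same correlation.
x, u, h = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [1.0, 0.5, 0.25]
direct = correlation_direct(x, u, h)
backward = correlation_backward(backward_filter(x, h), u)
```

With the backward-filtered target available, each candidate lag costs a single 40-term dot product and no convolution.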