SPro
Speech Signal Processing Toolkit, release 5.0.
Last updated 9 November 2010.
Guillaume Gravier
Copyright © 1996 – 2010, Guillaume Gravier.
Table of Contents
1 Introduction
  1.1 What is SPro?
  1.2 How to read this manual?
  1.3 Installing SPro
  1.4 License
  1.5 Reporting bugs
  1.6 Contributors
2 Speech analysis techniques
  2.1 Pre-emphasis and windowing
  2.2 Variable resolution spectral analysis
  2.3 Filter-bank analysis
  2.4 Linear predictive analysis
  2.5 PLP analysis
  2.6 Cepstral analysis
  2.7 Deltas and normalization
3 The SPro tools
  3.1 File formats
    3.1.1 Waveform streams
    3.1.2 Feature streams
  3.2 Common options
    3.2.1 I/O options
    3.2.2 Waveform framing options
    3.2.3 Feature vector options
    3.2.4 Miscellaneous options
  3.3 I/O via stdin and stdout
  3.4 Extracting features
    3.4.1 Filter-bank analysis tools
      Filter-bank log-magnitude features
      Filter-bank cepstral features
      Options
    3.4.2 LPC analysis tools
      Linear prediction coefficients
      Linear prediction cepstrum
      PLP cepstrum
      Options
  3.5 Manipulating feature streams
    3.5.1 Operations on feature streams
    3.5.2 Exporting features
    3.5.3 Importing from a previous SPro release
    3.5.4 Copy options
4 The SPro library
  4.1 Waveform streams
    4.1.1 Memory allocation
    4.1.2 Opening streams
    4.1.3 Reading frames
    4.1.4 Computing frame energy
  4.2 Feature description flags
  4.3 Feature streams
    4.3.1 Opening feature streams
      Conversion flags
      Opening for I/O
      Accessing stream attributes
    4.3.2 Reading and writing feature vectors
    4.3.3 Seeking into a stream
  4.4 Storing features without streams
    4.4.1 Buffer allocation
    4.4.2 Accessing buffer elements
    4.4.3 Buffer I/O
    4.4.4 Buffers and streams
  4.5 Feature conversion
  4.6 FFT-based functions
    4.6.1 Fourier transform
    4.6.2 Filter-bank
    4.6.3 Cosine transform
  4.7 LPC-based functions
    4.7.1 Linear prediction
    4.7.2 LPC conversion
  4.8 Miscellaneous functions
5 Quick reference guide
  5.1 sfbank
    Usage
    Synopsis
    Options
  5.2 sfbcep
    Usage
    Synopsis
    Options
  5.3 slpc
    Usage
    Synopsis
    Options
  5.4 slpcep
    Usage
    Synopsis
    Options
  5.5 splp
    Usage
    Synopsis
    Options
  5.6 scopy
    Usage
    Synopsis
    Options
6 Changes
  6.1 History
  6.2 Changes from previous version
  6.3 Compatibility
Index
1 Introduction
1.1 What is SPro?
SPro is a speech signal processing toolkit. It provides runtime commands implementing
standard feature extraction algorithms for speech and speaker recognition applications,
and a C library for implementing new algorithms and for using SPro files within your own
programs.
SPro was originally designed for variable resolution spectral analysis but also provides the
feature extraction techniques classically used in speech applications. There are commands
for the following representations:
• filter-bank energies
• cepstral coefficients (filter-bank or linear prediction)
• linear prediction derived representations (prediction and reflection coefficients, log area
ratios and line spectrum pairs)
Though the toolkit has been designed as a front-end for applications such as speech or
speaker recognition, we believe the library provides enough building blocks to implement
various other feature extraction algorithms easily (e.g. zero crossing rate). However, no
command is provided for such features.
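As an illustration of how small such a feature can be, here is a minimal sketch of a zero crossing rate computation over one frame of samples. It is plain ANSI C and does not use the SPro library: the function name zero_crossing_rate and the choice of a float array as the frame type are assumptions made for this example only.

```c
#include <stddef.h>

/* Zero crossing rate of one frame: the fraction of adjacent sample
 * pairs whose signs differ. Samples equal to zero count as positive.
 * Returns 0.0 for a NULL frame or a frame shorter than two samples.
 * Hypothetical helper, not part of the SPro API. */
double zero_crossing_rate(const float *frame, size_t n)
{
    size_t i, crossings = 0;

    if (frame == NULL || n < 2)
        return 0.0;

    for (i = 1; i < n; i++) {
        int prev_neg = frame[i - 1] < 0.0f;
        int curr_neg = frame[i] < 0.0f;
        if (prev_neg != curr_neg)
            crossings++;
    }
    /* n samples give n - 1 adjacent pairs */
    return (double)crossings / (double)(n - 1);
}
```

A frame that alternates in sign at every sample yields a rate of 1, while a frame that never changes sign yields 0.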
The library, written in ANSI C, provides functions for the following:
• waveform signal input
• low-level signal processing (FFT, LPC analysis, etc.)
• low-level feature processing (lifter, CMS, variance normalization, deltas, etc.)
• feature I/O
The library does not provide high-level feature extraction functions which directly
convert a waveform into features, mainly because such functions would require a
tremendous number of arguments in order to be versatile. However, it is fairly simple to write
such a function for your particular needs using the SPro library.
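To give an idea of what such a purpose-built function might look like, the sketch below turns a waveform held in memory into one energy value per frame, with only three tunable parameters. It deliberately avoids the SPro API: the frame_opts structure, the frame_energies function and the convention that the sample before the start of the signal is zero are all assumptions made for illustration. A versatile version would also need window type, feature type, normalization and many other arguments, which is the difficulty described above.

```c
#include <stddef.h>

/* Hypothetical option set; none of these names come from SPro. */
typedef struct {
    size_t frame_length; /* samples per frame */
    size_t frame_shift;  /* samples between two frame starts */
    float  preemphasis;  /* coefficient k in y[t] = x[t] - k * x[t-1] */
} frame_opts;

/* Compute the energy (sum of squares) of each pre-emphasized frame
 * of the n-sample signal x. Writes at most max_frames values to out
 * and returns the number of values written. */
size_t frame_energies(const float *x, size_t n, const frame_opts *opt,
                      float *out, size_t max_frames)
{
    size_t start, t, count = 0;

    if (!x || !opt || !out || opt->frame_length == 0 || opt->frame_shift == 0)
        return 0;

    for (start = 0; start + opt->frame_length <= n && count < max_frames;
         start += opt->frame_shift) {
        float energy = 0.0f;
        for (t = start; t < start + opt->frame_length; t++) {
            /* the sample before the start of the signal is taken as zero */
            float prev = (t > 0) ? x[t - 1] : 0.0f;
            float y = x[t] - opt->preemphasis * prev;
            energy += y * y;
        }
        out[count++] = energy;
    }
    return count;
}
```

Keeping the parameters in a single structure is one way to stop the argument list from growing each time a new option is needed.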
1.2 How to read this manual?
The manual is divided into three main parts:
1. user manual
2. programmer manual
3. reference manual
Chapter 3 [SPro tools], page 11 is the user manual. It provides a description of the
speech analysis algorithms involved (see Chapter 2 [Speech analysis], page 5) and explains
in detail the use and the implementation of the SPro commands sfbank, sfbcep, slpc,
slpcep and scopy. Section 3.1 [File formats], page 11 describes the supported waveform
file formats and the SPro feature file format. The following sections are dedicated to the detailed
description of the SPro tools.
Chapter 4 [SPro library], page 23 is the programmer manual, which describes the main
data structures of the library and the associated functions.
Chapter 5 [Reference guide], page 41 provides a quick reference manual for the syntax of
the SPro tools.
If you have been using a former version of SPro, read Section 6.3 [Compatibility], page 55
carefully for crucial information on the (in)compatibility of SPro 5.0 with the previous
versions.
Finally, to learn more about the evolution of SPro, the history of the various SPro
releases is detailed in Chapter 6 [Changes], page 55.
1.3 Installing SPro
Installation follows the standard GNU installation procedure. The following two lines, typed in
your favorite shell,
./configure
make
will build the library and the runtimes. SPro supports a few extra features that rely on
external packages. These features can be turned on or off (depending on whether the packages
are already installed on your machine) using the ‘--with-xxx’ options of the configure
script. Supported options are:
--with-sphere[=path]
SPHERE 2.6 file format support
If the sphere library is installed in a standard place on your system (e.g.
‘/usr/local/include’ and ‘/usr/local/lib’), there is no need to specify path.
Otherwise, path should point to the directory where the sphere library has been installed.
configure will search for the library includes in path/include and for the archives in
path/lib. Compiling SPro with the ‘-O3’ option of the gcc compiler (CFLAGS=-O3) is a
good idea for the sake of speed.
Before installing, you may want to check your build by typing
make check
Finally, the library, the runtimes and the info documentation can be installed by
running
make install
The installation path is specified by the configuration script (try ./configure --help
for details) and defaults to ‘/usr/local’.
See file ‘INSTALL’ in the distribution top directory for more details.
To the author's knowledge, SPro has been successfully built and used on Linux,
SPARC/SunOS, and HP-UX. It should also work on AIX, though this has not been tested
so far.
1.4 License
As of release 5.0, SPro is distributed as free software under the MIT License:
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in