Society for Industrial and Applied Mathematics, 2008. — 145 p.
This book arose from a pair of symposia on hidden Markov models (HMMs) that Kevin Vixie organized at the 2001 SIAM Conference on Applications of Dynamical Systems. At the end of the first symposium someone asked the speakers for a simple reference that explains the basic ideas and algorithms for applying HMMs. We were stumped. A group of the participants suggested writing this book to answer the question. I have aimed the book at readers who have backgrounds and interests typical of those who attended that conference. In my view, HMMs are discrete-state, discrete-time stochastic dynamical systems, and they are often used to approximate dynamical systems with continuous state spaces operating in continuous time. Thus, by using familiar analogies, it is easy to explain HMMs to readers who have studied dynamical systems.
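The view of an HMM as a discrete-state, discrete-time stochastic dynamical system can be made concrete with a short simulation. The following is a minimal sketch, not code from the book: the state sizes, matrices, and seed are illustrative assumptions. A hidden state evolves by a transition matrix A, and each state emits a symbol according to an observation matrix B.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hidden states; row s of A gives the next-state probabilities,
# row s of B gives the probabilities of emitting symbols 0, 1, 2.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])

def sample(T, s=0):
    """Simulate T steps of the HMM: emit from B[s], then move by A[s]."""
    states, obs = [], []
    for _ in range(T):
        obs.append(int(rng.choice(3, p=B[s])))
        states.append(s)
        s = int(rng.choice(2, p=A[s]))
    return states, obs

states, obs = sample(10)
```

The dynamical-systems analogy is visible in the structure: A plays the role of the (stochastic) evolution map on the state space, while B is the measurement function.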
The basic techniques were well developed in the 1970s for work on speech and language processing. Many in speech research learned about the techniques at a symposium held at the Institute for Defense Analysis in Princeton, NJ. J. D. Ferguson edited the proceedings [4], and copies were given to the attendees. The volume was called the blue book by workers in the field. I was not part of that community, but I have a copy of the blue book. It explains the basic algorithms and illustrates them with simple concrete examples. I hope this book is as simple, useful, and clear.
Although there are other books and papers that deal with HMMs, exclusively or in part, I hope that readers find the following features of the present volume useful:
It is introductory. An undergraduate background in engineering, mathematics, or science that includes work in probability, linear algebra, and differential equations provides the prerequisites for most of the book. The exceptions are ideas from dynamical systems and information theory. In particular, I use the Gibbs inequality (see (2.53)) in developing the estimate maximize (EM) algorithm in Chapter 2. Although Chapter 5 deals with Lyapunov exponents and entropies, it is not a prerequisite for any other chapter.
Algorithms are explained and justified. I present enough of the theory behind the basic algorithms in Chapter 2 so that readers can use it as a guide to developing their own variants.
I provide code implementing the algorithms and data for the examples. Although algorithms are given in pseudocode in the text, a working implementation of each of the algorithms that I describe is available on the Web [56]. I have chosen to write the programs in the Python language [59] because it is easy to read and the interpreter is free software. I have written the programs to follow the descriptions and notation in the text. I provide data and scripts (makefiles, shell scripts, etc.) that make all of the figures in the text. On a GNU system, issuing make book.pdf from a command line compiles the software, runs the numerical experiments, makes the figures, and formats the entire book.
It uses analogies to dynamical systems. For example, I demonstrate the HMM training algorithm by applying it to data derived from the Lorenz system. The result, as Fig. 1.9 illustrates, is that the algorithm estimates a discrete-state generating mechanism that is an approximation to the state space of the original Lorenz system.
I illustrate with a practical example. In Chapter 6, I present an application to experimental measurements of electrocardiograms (ECGs).
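The Lorenz example mentioned above starts from a simulated trajectory that is quantized into discrete symbols before training. The sketch below is illustrative and not the book's exact procedure: the forward-Euler integrator, step size, and four-bin quantization of the x coordinate are assumptions (the standard parameters sigma = 10, beta = 8/3, rho = 28 are used).

```python
import numpy as np

def lorenz_step(xyz, dt=0.005, sigma=10.0, beta=8.0 / 3.0, rho=28.0):
    """One forward-Euler step of the Lorenz equations (a production
    run would use a Runge-Kutta or adaptive integrator)."""
    x, y, z = xyz
    return np.array([x + dt * sigma * (y - x),
                     y + dt * (x * (rho - z) - y),
                     z + dt * (x * y - beta * z)])

# Integrate, discard a transient, then quantize x into 4 bins,
# yielding the kind of symbol sequence an HMM can be trained on.
xyz = np.array([1.0, 1.0, 1.0])
xs = []
for _ in range(4000):
    xyz = lorenz_step(xyz)
    xs.append(xyz[0])
x = np.array(xs[1000:])
edges = np.linspace(x.min(), x.max(), 5)[1:-1]  # 3 interior edges -> 4 bins
symbols = np.digitize(x, edges)                 # values in {0, 1, 2, 3}
```

Training an HMM on such a sequence, as in Fig. 1.9, recovers a discrete-state generating mechanism whose states approximate regions of the continuous Lorenz state space.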
Basic Algorithms
Variants and Generalizations
Continuous States and Observations and Kalman Filtering
Performance Bounds and a Toy Problem
Obstructive Sleep Apnea
A Formulas for Matrices and Gaussians
B Notes on Software