Dynamic Bayesian Networks: Representation, Inference and Learning

by

Kevin Patrick Murphy

B.A. Hon. (Cambridge University) 1992
M.S. (University of Pennsylvania) 1994

A dissertation submitted in partial satisfaction of the requirements for the degree of
Doctor of Philosophy
in
Computer Science
in the
GRADUATE DIVISION
of the
UNIVERSITY OF CALIFORNIA, BERKELEY

Committee in charge:
Professor Stuart Russell, Chair
Professor Michael Jordan
Professor Peter Bickel
Professor Jeffrey Bilmes

Fall 2002
The dissertation of Kevin Patrick Murphy is approved:

Chair

University of California, Berkeley
Fall 2002
Dynamic Bayesian Networks: Representation, Inference and Learning

Copyright 2002
by
Kevin Patrick Murphy
ABSTRACT

Dynamic Bayesian Networks: Representation, Inference and Learning

by

Kevin Patrick Murphy
Doctor of Philosophy in Computer Science
University of California, Berkeley
Professor Stuart Russell, Chair

Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian.

In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data. In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
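In sketch form, the factored representation referred to above can be written as a two-slice factorization of the transition model, where the notation Z_t^i for the i'th variable in slice t, and Pa(.) for its parents, follows the definitions developed in Chapter 2:

P(Z_t \mid Z_{t-1}) \;=\; \prod_{i=1}^{N} P\bigl(Z_t^i \mid \mathrm{Pa}(Z_t^i)\bigr)

An HMM corresponds to the special case N = 1, in which the entire hidden state is a single discrete variable; a KFM corresponds to restricting each conditional distribution to be linear-Gaussian.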
DEDICATION

To my parents
for letting me pursue my dream
for so long
so far away from home

&

To my wife
for giving me new dreams to pursue
ACKNOWLEDGMENTS

I would like to thank my advisor, Stuart Russell, for supporting me over the years, and for giving me so much freedom to explore and discover new areas of probabilistic AI.

My other committee members have also been very supportive. Michael Jordan has long been an inspiration to me. His classes and weekly meetings have proved to be one of my best learning experiences at Berkeley. Jeff Bilmes proved to be a most thorough reviewer, as I expected, and has kept me honest about all the details. Peter Bickel brought a useful outsider’s perspective to the thesis, and encouraged me to make it more accessible to non-computer scientists (although any failings in this regard are of course my fault).

I would like to thank my many friends and colleagues at Berkeley with whom I have had the pleasure of working over the years. These include Eyal Amir, David Andre, Serge Belongie, Jeff Bilmes, Nancy Chang, Nando de Freitas, Nir Friedman, Paul Horton, Srini Narayanan, Andrew Ng, Mark Paskin, Sekhar Tatikonda, Yair Weiss, Eric Xing, Geoff Zweig, and all the members of the RUGS and IR groups.

I would like to thank Jim Rehg for hiring me as an intern at DEC/Compaq/HP Cambridge Research Lab in 1997, where my Bayes Net Toolbox (BNT) was born. I would like to thank Gary Bradski for hiring me as an intern at Intel in 2000 to work on BNT, and for providing me with the opportunity to work with people spanning three countries formerly known as superpowers — USA, China and Russia. In particular, I would like to thank Wei Hu and Yimin Zhang, of ICRC, for their help with BNT. I would also like to thank the many people on the web who have contributed bug fixes to BNT. By chance, I was able to work with Sebastian Thrun during part of my time with Intel, for which I am very grateful.

I would like to thank my friends in Jennie Nation and beyond for providing a welcome distraction from school.

Finally, I would like to thank my wife Margaret for putting up with my weekends in the office, for listening to my sagas from Soda land, and for giving me the motivation to finish this thesis.
Contents

1 Introduction
  1.1 State-space models
    1.1.1 Representation
    1.1.2 Inference
    1.1.3 Learning
  1.2 Hidden Markov Models (HMMs)
    1.2.1 Representation
    1.2.2 Inference
    1.2.3 Learning
    1.2.4 The problem with HMMs
  1.3 Kalman Filter Models (KFMs)
    1.3.1 Representation
    1.3.2 Inference
    1.3.3 Learning
    1.3.4 The problem with KFMs
  1.4 Overview of the rest of the thesis
  1.5 A note on software
  1.6 Declaration of previous work

2 DBNs: Representation
  2.1 Introduction
  2.2 DBNs defined
  2.3 Representing HMMs and their variants as DBNs
    2.3.1 HMMs with mixture-of-Gaussians output
    2.3.2 HMMs with semi-tied mixtures
    2.3.3 Auto-regressive HMMs
    2.3.4 Buried Markov Models
    2.3.5 Mixed-memory Markov models
    2.3.6 Input-output HMMs
    2.3.7 Factorial HMMs
    2.3.8 Coupled HMMs
    2.3.9 Hierarchical HMMs (HHMMs)
    2.3.10 HHMMs for Automatic speech recognition (ASR)
    2.3.11 Asynchronous IO-HMMs
    2.3.12 Variable-duration (semi-Markov) HMMs
    2.3.13 Mixtures of HMMs
    2.3.14 Segment models
    2.3.15 Abstract HMMs
  2.4 Continuous-state DBNs
    2.4.1 Representing KFMs as DBNs
    2.4.2 Vector autoregressive (VAR) processes
    2.4.3 Switching KFMs
    2.4.4 Fault diagnosis in hybrid systems
    2.4.5 Combining switching KFMs with segment models
    2.4.6 Data association
    2.4.7 Tracking a variable, unknown number of objects
  2.5 First order DBNs

3 Exact inference in DBNs
  3.1 Introduction
  3.2 The forwards-backwards algorithm
    3.2.1 The forwards pass
    3.2.2 The backwards pass
    3.2.3 An alternative backwards pass
    3.2.4 Two-slice distributions
    3.2.5 A two-filter approach to smoothing
    3.2.6 Time and space complexity of forwards-backwards
    3.2.7 Abstract forwards and backwards operators
  3.3 The frontier algorithm
    3.3.1 Forwards pass
    3.3.2 Backwards pass