498 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 2, FEBRUARY 2001

Factor Graphs and the Sum-Product Algorithm

Frank R. Kschischang, Senior Member, IEEE, Brendan J. Frey, Member, IEEE, and Hans-Andrea Loeliger, Member, IEEE

Abstract—Algorithms that must deal with complicated global functions of many variables often exploit the manner in which the given functions factor as a product of "local" functions, each of which depends on a subset of the variables. Such a factorization can be visualized with a bipartite graph that we call a factor graph. In this tutorial paper, we present a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph. Following a single, simple computational rule, the sum-product algorithm computes, either exactly or approximately, various marginal functions derived from the global function. A wide variety of algorithms developed in artificial intelligence, signal processing, and digital communications can be derived as specific instances of the sum-product algorithm, including the forward/backward algorithm, the Viterbi algorithm, the iterative "turbo" decoding algorithm, Pearl's belief propagation algorithm for Bayesian networks, the Kalman filter, and certain fast Fourier transform (FFT) algorithms.

Index Terms—Belief propagation, factor graphs, fast Fourier transform, forward/backward algorithm, graphical models, iterative decoding, Kalman filtering, marginalization, sum-product algorithm, Tanner graphs, Viterbi algorithm.

I. INTRODUCTION

THIS paper provides a tutorial introduction to factor graphs and the sum-product algorithm, a simple way to understand a large number of seemingly different algorithms that have been developed in computer science and engineering. We consider algorithms that deal with complicated "global" functions of many variables and that derive their computational efficiency by exploiting the way in which the global function factors into a product of simpler "local" functions, each of which depends on a subset of the variables. Such a factorization can be visualized using a factor graph, a bipartite graph that expresses which variables are arguments of which local functions.

Manuscript received August 3, 1998; revised October 17, 2000. The work of F. R. Kschischang was supported in part, while on leave at the Massachusetts Institute of Technology, by the Office of Naval Research under Grant N00014-96-1-0930, and by the Army Research Laboratory under Cooperative Agreement DAAL01-96-2-0002. The work of B. J. Frey was supported, while a Beckman Fellow at the Beckman Institute of Advanced Science and Technology, University of Illinois at Urbana-Champaign, by a grant from the Arnold and Mabel Beckman Foundation. The material in this paper was presented in part at the 35th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, September 1997.

F. R. Kschischang is with the Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada (e-mail: frank@comm.utoronto.ca).

B. J. Frey is with the Faculty of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada, and the Faculty of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801-2307 USA (e-mail: frey@uwaterloo.ca).

H.-A. Loeliger is with the Signal Processing Lab (ISI), ETH Zentrum, CH-8092 Zürich, Switzerland (e-mail: loeliger@isi.ee.ethz.ch).

Communicated by T. E. Fuja, Associate Editor At Large.
The aim of this tutorial paper is to introduce factor graphs and to describe a generic message-passing algorithm, called the sum-product algorithm, which operates in a factor graph and attempts to compute various marginal functions associated with the global function. The basic ideas are very simple; yet, as we will show, a surprisingly wide variety of algorithms developed in the artificial intelligence, signal processing, and digital communications communities may be derived as specific instances of the sum-product algorithm, operating in an appropriately chosen factor graph.

Genealogically, factor graphs are a straightforward generalization of the "Tanner graphs" of Wiberg et al. [31], [32]. Tanner [29] introduced bipartite graphs to describe families of codes which are generalizations of the low-density parity-check (LDPC) codes of Gallager [11], and also described the sum-product algorithm in this setting. In Tanner's original formulation, all variables are codeword symbols and hence "visible"; Wiberg et al. introduced "hidden" (latent) state variables and also suggested applications beyond coding. Factor graphs take these graph-theoretic models one step further, by applying them to functions. From the factor-graph perspective (as we will describe in Section III-A), a Tanner graph for a code represents a particular factorization of the characteristic (indicator) function of the code.

While it may seem intuitively reasonable that some algorithms should exploit the manner in which a global function factors into a product of local functions, the fundamental insight that many well-known algorithms essentially solve the "MPF" (marginalize product-of-functions) problem, each in their own particular setting, was first made explicit in the work of Aji and McEliece [1]. In a landmark paper [2], Aji and McEliece develop a "generalized distributive law" (GDL) that in some cases solves the MPF problem using a "junction tree" representation of the global function. Factor graphs may be viewed as an alternative approach with closer ties to Tanner graphs and previously developed graphical representations for codes. Essentially, every result developed in the junction tree/GDL setting may be translated into an equivalent result in the factor graph/sum-product algorithm setting, and vice versa. We prefer the latter setting not only because it is better connected with previous approaches, but also because we feel that factor graphs are in some ways easier to describe, giving them a modest pedagogical advantage. Moreover, the sum-product algorithm can often be applied successfully in situations where exact solutions to the MPF problem (as provided by junction trees) become computationally intractable, the most prominent example being the iterative decoding of turbo codes and LDPC codes. In Section VI we do, however, discuss strategies for achieving exact solutions to the MPF problem in factor graphs.
There are also close connections between factor graphs and graphical representations (graphical models) for multidimensional probability distributions such as Markov random fields [16], [18], [26] and Bayesian (belief) networks [25], [17]. Like factor graphs, these graphical models encode in their structure a particular factorization of the joint probability mass function of several random variables. Pearl's powerful "belief propagation" algorithm [25], which operates by "message-passing" in a Bayesian network, translates immediately into an instance of the sum-product algorithm operating in a factor graph that expresses the same factorization. Bayesian networks and belief propagation have been used previously to explain the iterative decoding of turbo codes and LDPC codes [9], [10], [19], [21], [22], [24], the most powerful practically decodable codes known. Note, however, that Wiberg [31] had earlier described these decoding algorithms as instances of the sum-product algorithm; see also [7].

We begin the paper in Section II with a small worked example that illustrates the operation of the sum-product algorithm in a simple factor graph. We will see that when a factor graph is cycle-free, then the structure of the factor graph not only encodes the way in which a given function factors, but also encodes expressions for computing the various marginal functions associated with the given function. These expressions lead directly to the sum-product algorithm.

In Section III, we show how factor graphs may be used as a system and signal-modeling tool. We see that factor graphs are compatible both with "behavioral" and "probabilistic" modeling styles. Connections between factor graphs and other graphical models are described briefly in Appendix B, where we recover Pearl's belief propagation algorithm as an instance of the sum-product algorithm.

In Section IV, we apply the sum-product algorithm to trellis-structured (hidden Markov) models, and obtain the forward/backward algorithm, the Viterbi algorithm, and the Kalman filter as instances of the sum-product algorithm. In Section V, we consider factor graphs with cycles, and obtain the iterative algorithms used to decode turbo-like codes as instances of the sum-product algorithm.

In Section VI, we describe several generic transformations by which a factor graph with cycles may sometimes be converted, often at great expense in complexity, to an equivalent cycle-free form. We apply these ideas to the factor graph representing the discrete Fourier transform (DFT) kernel, and derive a fast Fourier transform (FFT) algorithm as an instance of the sum-product algorithm. Some concluding remarks are given in Section VII.

II. MARGINAL FUNCTIONS, FACTOR GRAPHS, AND THE SUM-PRODUCT ALGORITHM

Throughout this paper we deal with functions of many variables. Let $x_1, x_2, \ldots, x_n$ be a collection of variables, in which, for each $i$, $x_i$ takes on values in some (usually finite) domain (or alphabet) $A_i$. Let $g(x_1, \ldots, x_n)$ be an $R$-valued function of these variables, i.e., a function with domain $S = A_1 \times A_2 \times \cdots \times A_n$ and codomain $R$. The domain $S$ of $g$ is called the configuration space for the given collection of variables, and each element of $S$ is a particular configuration of the variables, i.e., an assignment of a value to each variable. The codomain $R$ of $g$ may in general be any semiring [2], [31, Sec. 3.6]; however, at least initially, we will lose nothing essential by assuming that $R$ is the set of real numbers.
Assuming that summation in $R$ is well defined, then associated with every function $g(x_1, \ldots, x_n)$ are $n$ marginal functions $g_i(x_i)$. For each $a \in A_i$, the value of $g_i(a)$ is obtained by summing the value of $g(x_1, \ldots, x_n)$ over all configurations of the variables that have $x_i = a$.

This type of sum is so central to this paper that we introduce a nonstandard notation to handle it: the "not-sum" or summary. Instead of indicating the variables being summed over, we indicate those variables not being summed over. For example, if $h$ is a function of three variables $x_1$, $x_2$, and $x_3$, then the "summary for $x_2$" is denoted by

$$\sum_{\sim\{x_2\}} h(x_1, x_2, x_3) := \sum_{x_1 \in A_1} \sum_{x_3 \in A_3} h(x_1, x_2, x_3).$$

In this notation we have

$$g_i(x_i) = \sum_{\sim\{x_i\}} g(x_1, \ldots, x_n),$$

i.e., the $i$th marginal function associated with $g(x_1, \ldots, x_n)$ is the summary for $x_i$ of $g$.

We are interested in developing efficient procedures for computing marginal functions that a) exploit the way in which the global function factors, using the distributive law to simplify the summations, and b) reuse intermediate values (partial sums). As we will see, such procedures can be expressed very naturally by use of a factor graph.

Suppose that $g(x_1, \ldots, x_n)$ factors into a product of several local functions, each having some subset of $\{x_1, \ldots, x_n\}$ as arguments; i.e., suppose that

$$g(x_1, \ldots, x_n) = \prod_{j \in J} f_j(X_j) \qquad (1)$$

where $J$ is a discrete index set, $X_j$ is a subset of $\{x_1, \ldots, x_n\}$, and $f_j(X_j)$ is a function having the elements of $X_j$ as arguments.

Definition: A factor graph is a bipartite graph that expresses the structure of the factorization (1). A factor graph has a variable node for each variable $x_i$, a factor node for each local function $f_j$, and an edge connecting variable node $x_i$ to factor node $f_j$ if and only if $x_i$ is an argument of $f_j$.

A factor graph is thus a standard bipartite graphical representation of a mathematical relation, in this case, the "is an argument of" relation between variables and local functions.

Example 1 (A Simple Factor Graph): Let $g(x_1, x_2, x_3, x_4, x_5)$ be a function of five variables, and suppose that $g$ can be expressed as a product

$$g(x_1, x_2, x_3, x_4, x_5) = f_A(x_1)\, f_B(x_2)\, f_C(x_1, x_2, x_3)\, f_D(x_3, x_4)\, f_E(x_3, x_5) \qquad (2)$$

of five factors, so that $J = \{A, B, C, D, E\}$, $X_A = \{x_1\}$, $X_B = \{x_2\}$, $X_C = \{x_1, x_2, x_3\}$, $X_D = \{x_3, x_4\}$, and $X_E = \{x_3, x_5\}$. The factor graph that corresponds to (2) is shown in Fig. 1.
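As a concrete illustration (not part of the original text), the following Python sketch builds the edge list of the factor graph of Example 1 directly from the "is an argument of" relation in (2), and also shows the "not-sum" (summary) of a small table-represented function; the code names and the particular table are our own choices.

```python
from itertools import product

# The "is an argument of" relation for the factorization (2) of Example 1.
arguments = {
    "fA": ("x1",),
    "fB": ("x2",),
    "fC": ("x1", "x2", "x3"),
    "fD": ("x3", "x4"),
    "fE": ("x3", "x5"),
}
edges = [(f, x) for f, args in arguments.items() for x in args]
print(edges)  # one factor-graph edge per (factor, argument) pair

# The "not-sum" (summary) of a table-represented function: sum over every
# variable EXCEPT the one kept, here illustrated for a function h(x1, x2, x3).
def summary(table, keep_index, keep_alphabet):
    out = {v: 0.0 for v in keep_alphabet}
    for cfg, value in table.items():
        out[cfg[keep_index]] += value
    return out

h = {cfg: 1.0 + cfg[0] + 2 * cfg[1] + 3 * cfg[2] for cfg in product((0, 1), repeat=3)}
print(summary(h, keep_index=1, keep_alphabet=(0, 1)))  # the summary for x2 of h
```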
Fig. 1. A factor graph for the product $f_A(x_1)\, f_B(x_2)\, f_C(x_1, x_2, x_3)\, f_D(x_3, x_4)\, f_E(x_3, x_5)$.

A. Expression Trees

In many situations (for example, when $g(x_1, \ldots, x_5)$ represents a joint probability mass function), we are interested in computing the marginal functions $g_i(x_i)$. We can obtain an expression for each marginal function by using (2) and exploiting the distributive law.

To illustrate, we write $g_1(x_1)$ from Example 1 as

$$g_1(x_1) = f_A(x_1)\Biggl(\sum_{x_2} f_B(x_2)\Biggl(\sum_{x_3} f_C(x_1, x_2, x_3)\Bigl(\sum_{x_4} f_D(x_3, x_4)\Bigr)\Bigl(\sum_{x_5} f_E(x_3, x_5)\Bigr)\Biggr)\Biggr)$$

or, in summary notation

$$g_1(x_1) = f_A(x_1) \times \sum_{\sim\{x_1\}}\Biggl(f_B(x_2)\, f_C(x_1, x_2, x_3) \times \sum_{\sim\{x_3\}} f_D(x_3, x_4) \times \sum_{\sim\{x_3\}} f_E(x_3, x_5)\Biggr). \qquad (3)$$

Similarly, we find that

$$g_3(x_3) = \Biggl(\sum_{\sim\{x_3\}} f_A(x_1)\, f_B(x_2)\, f_C(x_1, x_2, x_3)\Biggr) \times \sum_{\sim\{x_3\}} f_D(x_3, x_4) \times \sum_{\sim\{x_3\}} f_E(x_3, x_5). \qquad (4)$$

In computer science, arithmetic expressions like the right-hand sides of (3) and (4) are often represented by ordered rooted trees [28, Sec. 8.3], here called expression trees, in which internal vertices (i.e., vertices with descendants) represent arithmetic operators (e.g., addition, multiplication, negation, etc.) and leaf vertices (i.e., vertices without descendants) represent variables or constants. For example, the tree of Fig. 2 represents the expression $x(y + z)$. When the operators in an expression tree are restricted to those that are completely symmetric in their operands (e.g., multiplication and addition), it is unnecessary to order the vertices to avoid ambiguity in interpreting the expression represented by the tree.

Fig. 2. An expression tree representing $x(y + z)$.

In this paper, we extend expression trees so that the leaf vertices represent functions, not just variables or constants. Sums and products in such expression trees combine their operands in the usual (pointwise) manner in which functions are added and multiplied. For example, Fig. 3(a) unambiguously represents the expression on the right-hand side of (3), and Fig. 4(a) unambiguously represents the expression on the right-hand side of (4). The operators shown in these figures are the function product and the summary, having various local functions as their arguments.

Also shown in Figs. 3(b) and 4(b) are redrawings of the factor graph of Fig. 1 as a rooted tree with $x_1$ and $x_3$ as root vertex, respectively. This is possible because the global function defined in (2) was deliberately chosen so that the corresponding factor graph is a tree. Comparing the factor graphs with the corresponding trees representing the expression for the marginal function, it is easy to note their correspondence. This observation is simple, but key: when a factor graph is cycle-free, the factor graph not only encodes in its structure the factorization of the global function, but also encodes arithmetic expressions by which the marginal functions associated with the global function may be computed.

Formally, as we show in Appendix A, to convert a cycle-free factor graph representing a function $g(x_1, \ldots, x_n)$ to the corresponding expression tree for $g_i(x_i)$, draw the factor graph as a rooted tree with $x_i$ as root. Every node in the factor graph then has a clearly defined parent node, namely, the neighboring node through which the unique path from that node to $x_i$ must pass. Replace each variable node in the factor graph with a product operator. Replace each factor node $f$ in the factor graph with a "form product and multiply by $f$" operator, and between a factor node $f$ and its parent $x$, insert a $\sum_{\sim\{x\}}$ summary operator. These local transformations are illustrated in Fig. 5(a) for a variable node, and in Fig. 5(b) for a factor node $f$ with parent $x$.
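As a numerical check on the distributive-law rearrangement in (3) (our own illustration, not from the original text), the following Python sketch computes $g_1(x_1)$ for the factorization (2) both by brute-force summation and by the nested form of (3); the particular local functions are arbitrary stand-ins for $f_A, \ldots, f_E$.

```python
from itertools import product

A = (0, 1)  # a common binary alphabet for x1..x5 (an arbitrary choice)

# Arbitrary positive local functions standing in for f_A .. f_E of Example 1.
fA = lambda x1: 1.0 + x1
fB = lambda x2: 2.0 - x2
fC = lambda x1, x2, x3: 1.0 + x1 * x2 + x3
fD = lambda x3, x4: 1.0 + 2 * x3 + x4
fE = lambda x3, x5: 1.5 + x3 * x5

def g(x1, x2, x3, x4, x5):
    return fA(x1) * fB(x2) * fC(x1, x2, x3) * fD(x3, x4) * fE(x3, x5)

def g1_brute(x1):
    # Marginal by direct summation over all other variables.
    return sum(g(x1, x2, x3, x4, x5)
               for x2, x3, x4, x5 in product(A, repeat=4))

def g1_factored(x1):
    # The nested form of (3): sums pushed inside the product.
    return fA(x1) * sum(
        fB(x2) * sum(
            fC(x1, x2, x3)
            * sum(fD(x3, x4) for x4 in A)
            * sum(fE(x3, x5) for x5 in A)
            for x3 in A)
        for x2 in A)

for x1 in A:
    assert abs(g1_brute(x1) - g1_factored(x1)) < 1e-9
    print(x1, g1_brute(x1))
```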
Trivial products (those with one or no operand) act as identity operators, or may be omitted if they are leaf nodes in the expression tree. A summary operator $\sum_{\sim\{x\}}$ applied to a function with a single argument $x$ is also a trivial operation, and may be omitted. Applying this transformation to the tree of Fig. 3(b) yields the expression tree of Fig. 3(a), and similarly for Fig. 4. Trivial operations are indicated with dashed lines in these figures.

B. Computing a Single Marginal Function

Every expression tree represents an algorithm for computing the corresponding expression. One might describe the algorithm as a recursive "top-down" procedure that starts at the root vertex and evaluates each subtree descending from the root, combining the results as dictated by the operator at the root. Equivalently, we prefer to describe the algorithm as a "bottom-up" procedure that begins at the leaves of the tree, with each operator vertex combining its operands and passing on the result as an operand for its parent.
Fig. 3. (a) A tree representation for the right-hand side of (3). (b) The factor graph of Fig. 1, redrawn as a rooted tree with $x_1$ as root.

Fig. 4. (a) A tree representation for the right-hand side of (4). (b) The factor graph of Fig. 1, redrawn as a rooted tree with $x_3$ as root.

For example, $x(y + z)$, represented by the expression tree of Fig. 2, might be evaluated by starting at the leaf nodes $y$ and $z$, evaluating $y + z$, and passing on the result as an operand for the $\times$ operator, which multiplies the result with $x$.

Rather than working with the expression tree, it is simpler and more direct to describe such marginalization algorithms in terms of the corresponding factor graph. To best understand such algorithms, it helps to imagine that there is a processor associated with each vertex of the factor graph, and that the factor-graph edges represent channels by which these processors may communicate. For us, "messages" sent between processors are always simply some appropriate description of some marginal function. (We describe some useful representations in Section V-E.)

We now describe a message-passing algorithm that we will temporarily call the "single-$i$ sum-product algorithm," since it computes, for a single value of $i$, the marginal function $g_i(x_i)$ in a rooted cycle-free factor graph, with $x_i$ taken as root vertex.

The computation begins at the leaves of the factor graph. Each leaf variable node sends a trivial "identity function" message to its parent, and each leaf factor node $f$ sends a description of $f$ to its parent. Each vertex waits for messages from all of its children before computing the message to be sent to its parent. This computation is performed according to the transformation shown in Fig. 5; i.e., a variable node simply sends the product of messages received from its children, while a factor node $f$ with parent $x$ forms the product of $f$ with the messages received from its children, and then operates on the result with a $\sum_{\sim\{x\}}$ summary operator. By a "product of messages" we mean an appropriate description of the (pointwise) product of the corresponding functions. If the messages are parametrizations of the functions, then the resulting message is the parametrization of the product function, not (necessarily) literally the numerical product of the messages.
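The single-$i$ procedure can be sketched in a few lines of Python for functions represented as tables over finite alphabets (an illustration with our own data structures, not code from the original text; the local-function tables are arbitrary stand-ins for those of Example 1). The mutual recursion below terminates precisely because the factor graph is a tree; memoizing the messages would recover the reuse of partial sums described above.

```python
from itertools import product

# A tree-structured factor graph over binary alphabets (Example 1's structure).
# Each factor maps a scope (tuple of variable names) to a table of values.
alphabet = {v: (0, 1) for v in ("x1", "x2", "x3", "x4", "x5")}
factors = {
    "fA": (("x1",), {(0,): 1.0, (1,): 2.0}),
    "fB": (("x2",), {(0,): 2.0, (1,): 1.0}),
    "fC": (("x1", "x2", "x3"), {cfg: 1.0 + sum(cfg) for cfg in product((0, 1), repeat=3)}),
    "fD": (("x3", "x4"), {cfg: 1.0 + cfg[0] * cfg[1] for cfg in product((0, 1), repeat=2)}),
    "fE": (("x3", "x5"), {cfg: 2.0 - cfg[1] for cfg in product((0, 1), repeat=2)}),
}

def msg_var_to_factor(x, f):
    """Message from variable x toward factor f: the product of messages from
    x's other neighboring factors (the unit function if x is a leaf)."""
    out = {v: 1.0 for v in alphabet[x]}
    for name, (scope, _) in factors.items():
        if name != f and x in scope:
            m = msg_factor_to_var(name, x)
            out = {v: out[v] * m[v] for v in alphabet[x]}
    return out

def msg_factor_to_var(f, x):
    """Message from factor f toward variable x: multiply f by the messages
    from its other arguments, then summarize for x."""
    scope, table = factors[f]
    incoming = {y: msg_var_to_factor(y, f) for y in scope if y != x}
    out = {v: 0.0 for v in alphabet[x]}
    for cfg, val in table.items():
        assignment = dict(zip(scope, cfg))
        for y, m in incoming.items():
            val *= m[assignment[y]]
        out[assignment[x]] += val
    return out

def marginal(x):
    """Single-i computation: the product, at the root, of all incoming messages."""
    out = {v: 1.0 for v in alphabet[x]}
    for name, (scope, _) in factors.items():
        if x in scope:
            m = msg_factor_to_var(name, x)
            out = {v: out[v] * m[v] for v in alphabet[x]}
    return out

print(marginal("x1"))  # the unnormalized marginal g_1(x_1)
```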
Fig. 5. Local substitutions that transform a rooted cycle-free factor graph to an expression tree for a marginal function at (a) a variable node and (b) a factor node.

Similarly, the summary operator is applied to the functions, not necessarily literally to the messages themselves.

The computation terminates at the root node $x_i$, where the marginal function $g_i(x_i)$ is obtained as the product of all messages received at $x_i$.

It is important to note that a message passed on the edge $\{x, f\}$, either from variable $x$ to factor $f$, or vice versa, is a single-argument function of $x$, the variable associated with the given edge. This follows since, at every factor node, summary operations are always performed for the variable associated with the edge on which the message is passed. Likewise, at a variable node, all messages are functions of that variable, and so is any product of these messages.

The message passed on an edge during the operation of the single-$i$ sum-product algorithm can be interpreted as follows. If $\{x, f\}$ is an edge in the tree, where $x$ is a variable node and $f$ is a factor node, then the analysis of Appendix A shows that the message passed on $\{x, f\}$ during the operation of the sum-product algorithm is simply a summary for $x$ of the product of the local functions descending from the vertex that originates the message.

C. Computing All Marginal Functions

In many circumstances, we may be interested in computing $g_i(x_i)$ for more than one value of $i$. Such a computation might be accomplished by applying the single-$i$ algorithm separately for each desired value of $i$, but this approach is unlikely to be efficient, since many of the subcomputations performed for different values of $i$ will be the same. Computation of $g_i(x_i)$ for all $i$ simultaneously can be efficiently accomplished by essentially "overlaying" on a single factor graph all possible instances of the single-$i$ algorithm. No particular vertex is taken as a root vertex, so there is no fixed parent/child relationship among neighboring vertices. Instead, each neighbor $w$ of any given vertex $v$ is at some point regarded as a parent of $v$. The message passed from $v$ to $w$ is computed just as in the single-$i$ algorithm, i.e., as if $w$ were indeed the parent of $v$ and all other neighbors of $v$ were children.

As in the single-$i$ algorithm, message passing is initiated at the leaves. Each vertex $v$ remains idle until messages have arrived on all but one of the edges incident on $v$. Just as in the single-$i$ algorithm, once these messages have arrived, $v$ is able to compute a message to be sent on the one remaining edge to its neighbor (temporarily regarded as the parent), just as in the single-$i$ algorithm, i.e., according to Fig. 5. Let us denote this temporary parent as vertex $w$. After sending a message to $w$, vertex $v$ returns to the idle state, waiting for a "return message" to arrive from $w$. Once this message has arrived, the vertex is able to compute and send messages to each of its neighbors (other than $w$), each being regarded, in turn, as a parent.

The algorithm terminates once two messages have been passed over every edge, one in each direction. At variable node $x_i$, the product of all incoming messages is the marginal function $g_i(x_i)$, just as in the single-$i$ algorithm. Since this algorithm operates by computing various sums and products, we refer to it as the sum-product algorithm.

Fig. 6. A factor-graph fragment, showing the update rules of the sum-product algorithm.
The sum-product algorithm operates according to the following simple rule:

The message sent from a node $v$ on an edge $e$ is the product of the local function at $v$ (or the unit function if $v$ is a variable node) with all messages received at $v$ on edges other than $e$, summarized for the variable associated with $e$.

Let $\mu_{x \to f}(x)$ denote the message sent from node $x$ to node $f$ in the operation of the sum-product algorithm, and let $\mu_{f \to x}(x)$ denote the message sent from node $f$ to node $x$. Also, let $n(v)$ denote the set of neighbors of a given node $v$ in a factor graph. Then, as illustrated in Fig. 6, the message computations performed by the sum-product algorithm may be expressed as follows:

variable to local function:
$$\mu_{x \to f}(x) = \prod_{h \in n(x) \setminus \{f\}} \mu_{h \to x}(x) \qquad (5)$$

local function to variable:
$$\mu_{f \to x}(x) = \sum_{\sim\{x\}} \Biggl(f(X) \prod_{y \in n(f) \setminus \{x\}} \mu_{y \to f}(y)\Biggr) \qquad (6)$$

where $X = n(f)$ is the set of arguments of the function $f$.

The update rule at a variable node $x$ takes on the particularly simple form given by (5) because there is no local function to include, and the summary for $x$ of a product of functions of $x$ is simply that product.
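The two update rules translate almost line for line into code. The sketch below (our own illustration, assuming messages and local functions are stored as tables over finite alphabets) implements (5) and (6) as standalone helpers and applies them once at a single factor node; the names and the tiny example factor are hypothetical.

```python
import math
from itertools import product

def variable_to_factor(var_alphabet, incoming):
    """Rule (5): the message from variable x to factor f is the pointwise
    product of the messages arriving at x from its other neighboring factors
    (the unit function if there are none)."""
    return {v: math.prod(m[v] for m in incoming) for v in var_alphabet}

def factor_to_variable(scope, table, target, alphabets, incoming):
    """Rule (6): multiply the local function by the messages from its other
    arguments, then summarize for the target variable. `scope` orders the
    arguments of the factor; `incoming` maps each non-target argument to its
    message (a dict value -> weight)."""
    out = {v: 0.0 for v in alphabets[target]}
    for cfg in product(*(alphabets[y] for y in scope)):
        assignment = dict(zip(scope, cfg))
        weight = table[cfg]
        for y, m in incoming.items():
            weight *= m[assignment[y]]
        out[assignment[target]] += weight
    return out

# Tiny usage example: one factor f(x, y) receiving a unit message from leaf y.
alphabets = {"x": (0, 1), "y": (0, 1)}
f_table = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}
mu_y_to_f = variable_to_factor(alphabets["y"], [])  # unit message from a leaf
mu_f_to_x = factor_to_variable(("x", "y"), f_table, "x", alphabets, {"y": mu_y_to_f})
print(mu_f_to_x)  # the summary for x of f(x, y) * 1
```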
On the other hand, the update rule at a local function node given by (6) in general involves nontrivial function multiplications, followed by an application of the summary operator. We also observe that variable nodes of degree two perform no computation: a message arriving on one (incoming) edge is simply transferred to the other (outgoing) edge.

In the termination step, we compute $g_i(x_i)$ as the product of all messages directed toward $x_i$. Equivalently, since the message passed on any given edge is equal to the product of all but one of these messages, we may compute $g_i(x_i)$ as the product of the two messages that were passed (in opposite directions) over any single edge incident on $x_i$. Thus, for example, we may compute $g_3(x_3)$ in three other ways as follows:

$$g_3(x_3) = \mu_{x_3 \to f_C}(x_3)\,\mu_{f_C \to x_3}(x_3) = \mu_{x_3 \to f_D}(x_3)\,\mu_{f_D \to x_3}(x_3) = \mu_{x_3 \to f_E}(x_3)\,\mu_{f_E \to x_3}(x_3).$$

D. A Detailed Example

Fig. 7 shows the flow of messages that would be generated by the sum-product algorithm applied to the factor graph of Fig. 1. The messages may be generated in five steps, as indicated with circles in Fig. 7. In detail, the messages are generated as follows.

Fig. 7. Messages generated in each (circled) step of the sum-product algorithm.

Step 1:
$$\mu_{f_A \to x_1}(x_1) = \sum_{\sim\{x_1\}} f_A(x_1) = f_A(x_1), \qquad \mu_{f_B \to x_2}(x_2) = \sum_{\sim\{x_2\}} f_B(x_2) = f_B(x_2)$$
$$\mu_{x_4 \to f_D}(x_4) = 1, \qquad \mu_{x_5 \to f_E}(x_5) = 1$$

Step 2:
$$\mu_{x_1 \to f_C}(x_1) = \mu_{f_A \to x_1}(x_1), \qquad \mu_{x_2 \to f_C}(x_2) = \mu_{f_B \to x_2}(x_2)$$
$$\mu_{f_D \to x_3}(x_3) = \sum_{\sim\{x_3\}} f_D(x_3, x_4), \qquad \mu_{f_E \to x_3}(x_3) = \sum_{\sim\{x_3\}} f_E(x_3, x_5)$$

Step 3:
$$\mu_{f_C \to x_3}(x_3) = \sum_{\sim\{x_3\}} \mu_{x_1 \to f_C}(x_1)\,\mu_{x_2 \to f_C}(x_2)\, f_C(x_1, x_2, x_3)$$
$$\mu_{x_3 \to f_C}(x_3) = \mu_{f_D \to x_3}(x_3)\,\mu_{f_E \to x_3}(x_3)$$

Step 4:
$$\mu_{f_C \to x_1}(x_1) = \sum_{\sim\{x_1\}} \mu_{x_2 \to f_C}(x_2)\,\mu_{x_3 \to f_C}(x_3)\, f_C(x_1, x_2, x_3)$$
$$\mu_{f_C \to x_2}(x_2) = \sum_{\sim\{x_2\}} \mu_{x_1 \to f_C}(x_1)\,\mu_{x_3 \to f_C}(x_3)\, f_C(x_1, x_2, x_3)$$
$$\mu_{x_3 \to f_D}(x_3) = \mu_{f_C \to x_3}(x_3)\,\mu_{f_E \to x_3}(x_3), \qquad \mu_{x_3 \to f_E}(x_3) = \mu_{f_C \to x_3}(x_3)\,\mu_{f_D \to x_3}(x_3)$$

Step 5:
$$\mu_{x_1 \to f_A}(x_1) = \mu_{f_C \to x_1}(x_1), \qquad \mu_{x_2 \to f_B}(x_2) = \mu_{f_C \to x_2}(x_2)$$
$$\mu_{f_D \to x_4}(x_4) = \sum_{\sim\{x_4\}} \mu_{x_3 \to f_D}(x_3)\, f_D(x_3, x_4), \qquad \mu_{f_E \to x_5}(x_5) = \sum_{\sim\{x_5\}} \mu_{x_3 \to f_E}(x_3)\, f_E(x_3, x_5)$$

Termination:
$$g_1(x_1) = \mu_{f_A \to x_1}(x_1)\,\mu_{f_C \to x_1}(x_1), \qquad g_2(x_2) = \mu_{f_B \to x_2}(x_2)\,\mu_{f_C \to x_2}(x_2)$$
$$g_3(x_3) = \mu_{f_C \to x_3}(x_3)\,\mu_{f_D \to x_3}(x_3)\,\mu_{f_E \to x_3}(x_3), \qquad g_4(x_4) = \mu_{f_D \to x_4}(x_4), \qquad g_5(x_5) = \mu_{f_E \to x_5}(x_5)$$

III. MODELING SYSTEMS WITH FACTOR GRAPHS

We describe now various ways in which factor graphs may be used to model systems, i.e., collections of interacting variables.

In probabilistic modeling of systems, a factor graph can be used to represent the joint probability mass function of the variables that comprise the system. Factorizations of this function can give important information about statistical dependencies among these variables.

Likewise, in "behavioral" modeling of systems, as in the work of Willems [33], system behavior is specified in set-theoretic terms by specifying which particular configurations of variables are valid. This approach can be accommodated by a factor graph that represents the characteristic (i.e., indicator) function for the given behavior. Factorizations of this characteristic function can give important structural information about the model.

In some applications, we may even wish to combine these two modeling styles. For example, in channel coding, we model both the valid behavior (i.e., the set of codewords) and the a posteriori joint probability mass function over the variables that define the codewords given the received output of a channel. (While it may even be feasible to model complicated channels with memory [31], in this paper we will model only memoryless channels.)

In behavioral modeling, "Iverson's convention" [14, p. 24] can be useful. If $P$ is a predicate (Boolean proposition) involving some set of variables, then $[P]$ is the $\{0, 1\}$-valued function that indicates the truth of $P$, i.e.,

$$[P] = \begin{cases} 1, & \text{if } P \text{ is true} \\ 0, & \text{otherwise.} \end{cases} \qquad (7)$$

For example, a predicate such as $x + y = z$ yields the function $[x + y = z]$, which takes a value of $1$ if the condition is satisfied, and $0$ otherwise.
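In code, Iverson's convention is simply a Boolean test mapped to $\{0, 1\}$. The following minimal Python sketch (an illustration with a hypothetical predicate, not an example from the original text) shows an indicator built this way serving as the characteristic function of a small behavior.

```python
def iverson(condition):
    """Iverson's convention (7): [P] = 1 if the predicate P is true, else 0."""
    return 1 if condition else 0

# A small behavioral model: a configuration (x, y, z) is valid iff x XOR y == z.
def chi(x, y, z):
    return iverson((x ^ y) == z)

# Enumerate the valid configurations (the behavior).
print([(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1) if chi(x, y, z)])
```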
If we let $\wedge$ denote the logical conjunction or "AND" operator, then an important property of Iverson's convention is that

$$[P_1 \wedge P_2] = [P_1][P_2] \qquad (8)$$

(assuming the usual multiplication of $\{0, 1\}$ values). Thus, if a predicate can be written as a logical conjunction of predicates, then its indicator function can be factored according to (8), and hence represented by a factor graph.

A. Behavioral Modeling

Let $x_1, x_2, \ldots, x_n$ be a collection of variables with configuration space $S = A_1 \times A_2 \times \cdots \times A_n$. By a behavior in $S$, we mean any subset $B$ of $S$. The elements of $B$ are the valid configurations. Since a system is specified via its behavior $B$, this approach is known as behavioral modeling [33].

Behavioral modeling is natural for codes. If the domain of each variable is some finite alphabet $A$, so that the configuration space is the $n$-fold Cartesian product $S = A^n$, then a behavior $B \subseteq S$ is called a block code of length $n$ over $A$, and the valid configurations are called codewords.

The characteristic (or set membership indicator) function for a behavior $B$ is defined as

$$\chi_B(x_1, \ldots, x_n) := [(x_1, \ldots, x_n) \in B].$$

Obviously, specifying $\chi_B$ is equivalent to specifying $B$. (We might also give $\chi_B$ a probabilistic interpretation by noting that $\chi_B$ is proportional to a probability mass function that is uniform over the valid configurations.)

In many important cases, membership of a particular configuration in a behavior $B$ can be determined by applying a series of tests (checks), each involving some subset of the variables. A configuration is deemed valid if and only if it passes all tests; i.e., the predicate $(x_1, \ldots, x_n) \in B$ may be written as a logical conjunction of a series of "simpler" predicates. Then $\chi_B$ factors according to (8) into a product of characteristic functions, each indicating whether a particular subset of variables is an element of some "local behavior."

Example 2 (Tanner Graphs for Linear Codes): The characteristic function for any linear code defined by an $r \times n$ parity-check matrix $H$ can be represented by a factor graph having $n$ variable nodes and $r$ factor nodes. For example, if $C$ is the binary linear code with the parity-check matrix $H$ given in (9) (a particular matrix with three rows, not reproduced here), then $C$ is the set of all binary $n$-tuples $x = (x_1, \ldots, x_n)$ that satisfy three simultaneous equations expressed in matrix form as $Hx^T = 0$. (This is a so-called kernel representation, since the linear code is defined as the kernel of a particular linear transformation.) Membership in $C$ is completely determined by checking whether each of the three equations is satisfied. Therefore, using (8) and (9) we have

$$\chi_C(x_1, \ldots, x_n) = [Hx^T = 0],$$

which factors into a product of three indicator functions, one per row of $H$, each of the form $[x_{i_1} \oplus x_{i_2} \oplus \cdots = 0]$, where $\oplus$ denotes the sum in GF(2). The corresponding factor graph is shown in Fig. 8, where we have used a special symbol for the parity checks (a square with a "+" sign).

Fig. 8. A Tanner graph for the binary linear code of Example 2.

Although strictly speaking the factor graph represents the factorization of the code's characteristic function, we will often refer to the factor graph as representing the code itself. A factor graph obtained in this way is often called a Tanner graph, after [29].

It should be obvious that a Tanner graph for any linear block code may be obtained from a parity-check matrix $H$ for the code. Such a parity-check matrix has $n$ columns and at least $n - k$ rows. Variable nodes correspond to the columns of $H$ and factor nodes (or checks) to the rows of $H$, with an edge connecting factor node $j$ to variable node $i$ if and only if $h_{ji} \neq 0$. Of course, since there are, in general, many parity-check matrices that represent a given code, there are, in general, many Tanner graph representations for the code.
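The construction of a Tanner graph from a parity-check matrix is mechanical, as the following Python sketch illustrates. The matrix below is an arbitrary 3 x 6 stand-in, not the matrix (9) of the paper; the characteristic function is evaluated as a product of per-row parity indicators, exactly in the spirit of (8).

```python
import numpy as np
from itertools import product

# An arbitrary 3x6 binary parity-check matrix (a stand-in, not the matrix (9)).
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]], dtype=int)

# Tanner graph: one variable node per column, one check node per row,
# with an edge wherever H[j, i] == 1.
edges = [(f"check{j}", f"x{i + 1}") for j in range(H.shape[0])
         for i in range(H.shape[1]) if H[j, i]]
print(edges)

def chi(x):
    """Characteristic function of the code as a product of local checks,
    one Iverson-style indicator per row of H, as in (8)."""
    value = 1
    for row in H:
        value *= 1 if (row @ x) % 2 == 0 else 0
    return value

# Enumerate the codewords (valid configurations) of this small code.
codewords = [x for x in product((0, 1), repeat=H.shape[1]) if chi(np.array(x))]
print(len(codewords), "codewords, e.g.", codewords[:4])
```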
Given a collection of general nonlinear local checks, it may be a computationally intractable problem to determine whether the corresponding behavior is nonempty. For example, the canonical NP-complete problem SAT (Boolean satisfiability) [13] is simply the problem of determining whether or not a collection of Boolean variables satisfies all clauses in a given set. In effect, each clause is a local check.

Often, a description of a system is simplified by introducing hidden (sometimes called auxiliary, latent, or state) variables. Nonhidden variables are called visible. A particular behavior $B$ with both auxiliary and visible variables is said to represent a given (visible) behavior $V$ if the projection of the elements of $B$ on the visible variables is equal to $V$. Any factor graph for $B$ is then considered to be also a factor graph for $V$. Such graphs were introduced by Wiberg et al. [31], [32] and may be called Wiberg-type graphs. In our factor graph diagrams, as in Wiberg, hidden variable nodes are indicated by a double circle.

An important class of models with hidden variables are the trellis representations (see [30] for an excellent survey). A trellis for a block code $C$ is an edge-labeled directed graph with distinguished root and goal vertices, essentially defined by the property that each sequence of edge labels encountered in any directed path from the root vertex to the goal vertex is a codeword of $C$, and that each codeword of $C$ is represented by at least one such path. Trellises also have the property that all paths from the root to any given vertex should have the same fixed length $d$, called the depth of the given vertex. The root vertex has depth $0$, and the goal vertex has depth $n$. The set of depth-$d$ vertices can be viewed as the domain of a state variable $s_d$. For example,
Fig. 9. (a) A trellis and (b) the corresponding Wiberg-type graph for the code of Fig. 8.

Fig. 9(a) is a trellis for the code of Example 2. Vertices at the same depth are grouped vertically. The root vertex is leftmost, the goal vertex is rightmost, and edges are implicitly directed from left to right.

A trellis divides naturally into $n$ sections, where the $i$th trellis section $T_i$ is the subgraph of the trellis induced by the vertices at depth $i - 1$ and depth $i$. The set of edge labels in $T_i$ may be viewed as the domain of a (visible) variable $x_i$. In effect, each trellis section $T_i$ defines a "local behavior" that constrains the possible combinations of $s_{i-1}$, $x_i$, and $s_i$.

Globally, a trellis defines a behavior in the configuration space of the variables $s_0, x_1, s_1, x_2, s_2, \ldots, x_n, s_n$. A configuration of these variables is valid if and only if it satisfies the local constraints imposed by each of the trellis sections. The characteristic function for this behavior thus factors naturally into $n$ factors, where the $i$th factor corresponds to the $i$th trellis section and has $s_{i-1}$, $x_i$, and $s_i$ as its arguments.

The following example illustrates these concepts in detail for the code of Example 2.

Example 3 (A Trellis Description): Fig. 9(a) shows a trellis for the code of Example 2, and Fig. 9(b) shows the corresponding Wiberg-type graph. In addition to the visible variable nodes $x_1, \ldots, x_n$, there are also hidden (state) variable nodes $s_0, \ldots, s_n$. Each local check, shown as a generic factor node (black square), corresponds to one section of the trellis.

In this example, the local behavior corresponding to the second trellis section from the left in Fig. 9 consists of the triples $(s_1, x_2, s_2)$ listed in (10), where the domains of the state variables $s_1$ and $s_2$ are identified with the trellis vertices at their respective depths, numbered from bottom to top in Fig. 9(a). Each element of the local behavior corresponds to one trellis edge. The corresponding factor node in the Wiberg-type graph is the indicator function $[(s_1, x_2, s_2)$ is an element of the local behavior in (10)$]$.

It is important to note that a factor graph corresponding to a trellis is cycle-free. Since every code has a trellis representation, it follows that every code can be represented by a cycle-free factor graph. Unfortunately, it often turns out that the state-space sizes (the sizes of the domains of the state variables) can easily become too large to be practical. For example, trellis representations of turbo codes have enormous state spaces [12]. However, such codes may well have factor graph representations with reasonable complexities, but necessarily with cycles. Indeed, the "cut-set bound" of [31] (see also [8]) strongly motivates the study of graph representations with cycles.

Fig. 10. Generic factor graph for a state-space model of a time-invariant or time-varying system.

Trellises are basically conventional state-space system models, and the generic factor graph of Fig. 10 can represent any state-space model of a time-invariant or time-varying system. As in Fig. 9, each local check represents a trellis section; i.e., each check is an indicator function for the set of allowed combinations of left (previous) state, input symbol, output symbol, and right (next) state. (Here, we allow a trellis edge to have both an input label and an output label.)
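The local-behavior view of a trellis section described above can be made concrete with a few lines of Python. The sketch below uses a toy two-section trellis of our own (not the trellis of Fig. 9): each section is a set of allowed (previous state, symbol, next state) triples, and global validity is the product of the section indicators, mirroring the factorization of the trellis characteristic function.

```python
# A toy two-section trellis (our own example, not the trellis of Fig. 9).
# Each section is a "local behavior": a set of allowed (s_prev, x, s_next) triples.
sections = [
    {(0, 0, 0), (0, 1, 1)},   # section 1: s0 -> s1, labeled by x1
    {(0, 0, 0), (1, 1, 0)},   # section 2: s1 -> s2, labeled by x2
]

def section_indicator(triple, local_behavior):
    """Indicator factor for one trellis section: 1 iff the triple is allowed."""
    return 1 if triple in local_behavior else 0

def valid(states, symbols):
    """Global validity: the product of the section indicators, as in the
    factorization of the trellis characteristic function."""
    value = 1
    for i, behavior in enumerate(sections):
        value *= section_indicator((states[i], symbols[i], states[i + 1]), behavior)
    return value

# The label sequence read off the path 0 -> 1 -> 0 is (1, 1); check validity.
print(valid(states=(0, 1, 0), symbols=(1, 1)))   # 1: every section allows its triple
print(valid(states=(0, 1, 0), symbols=(1, 0)))   # 0: section 2 disallows (1, 0, 0)
```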
Example 4 (State-Space Models): For example, the classical linear time-invariant state-space model is given by the equations

$$x(j + 1) = A\,x(j) + B\,u(j)$$
$$y(j) = C\,x(j) + D\,u(j) \qquad (11)$$

where $j$ is the discrete time index, $u(j)$ are the time-$j$ input variables, $y(j)$ are the time-$j$ output variables, $x(j)$ are the time-$j$ state variables, $A$, $B$, $C$, and $D$ are matrices of appropriate dimension, and the equations are over some field $F$.
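A minimal simulation of (11) makes the correspondence with Fig. 10 concrete: each time step combines the previous state, the input, the output, and the next state, i.e., exactly the quantities constrained by one check node of the generic state-space factor graph. The matrices below are arbitrary illustrative choices over the reals.

```python
import numpy as np

# Arbitrary example matrices for the model (11): 2 states, 1 input, 1 output.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[1.0],
              [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

x = np.zeros((2, 1))                         # initial state x(0)
inputs = [np.array([[u]]) for u in (1.0, 0.0, -1.0, 0.5)]

for j, u in enumerate(inputs):
    y = C @ x + D @ u                        # output equation of (11)
    print(f"j={j}  state={x.ravel()}  output={y.ravel()}")
    x = A @ x + B @ u                        # state update equation of (11)
```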