Understanding Bayesian Networks
with Examples in R
Marco Scutari
scutari@stats.ox.ac.uk
Department of Statistics
University of Oxford
January 23–25, 2017
Definitions
A Graph and a Probability Distribution
Bayesian networks (BNs) are defined by:
a network structure, a directed acyclic graph G = (V, A), in which
each node vi ∈ V corresponds to a random variable Xi;
a global probability distribution X with parameters Θ, which can
be factorised into smaller local probability distributions according to
the arcs aij ∈ A present in the graph.
The main role of the network structure is to express the conditional
independence relationships among the variables in the model through
graphical separation, thus specifying the factorisation of the global
distribution:
$$P(X) = \prod_{i=1}^{N} P(X_i \mid \Pi_{X_i}; \Theta_{X_i})$$
where ΠXi = {parents of Xi}.
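As a minimal sketch of how the arcs determine the factorisation, the snippet below builds a small DAG with bnlearn and writes out one local distribution per node, each conditional on that node's parents. The three-node structure and the labels A, B and C are assumptions made for illustration; they do not come from the slides.

```r
library(bnlearn)

# Hypothetical DAG with arcs A -> B, A -> C and B -> C.
dag <- model2network("[A][B|A][C|A:B]")

# Build one term P(Xi | parents of Xi) per node.
local.terms <- sapply(nodes(dag), function(node) {
  pa <- sort(parents(dag, node))
  if (length(pa) == 0) {
    paste0("P(", node, ")")
  } else {
    paste0("P(", node, " | ", paste(pa, collapse = ", "), ")")
  }
})

# The global distribution is the product of the local terms.
factorisation <- paste(local.terms, collapse = " ")
print(factorisation)
```

Nodes without parents contribute an unconditional term, so the number of factors always equals the number of nodes N.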
Where to Look: Book References
(Best perused as ebooks; the Koller & Friedman is ≈ 2½ inches thick.)
How to Use: Software References
DISCLAIMER: I am the author of the bnlearn R package
and I will use it for the most part in this course.
install.packages("bnlearn")
For displaying graphs, I will use the Rgraphviz package from
Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("graph", "Rgraphviz"))
For exact inference on discrete Bayesian networks:
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("graph", "Rgraphviz", "RBGL"))
install.packages("gRain")
Other packages from CRAN:
install.packages(c("pcalg", "catnet", "abn"))
Graphs
The first component of a BN is a graph. A
graph G is a mathematical object with:
a set of nodes V = {v1, . . . , vN};
a set of arcs A, which are identified by
pairs of nodes in V, e.g. aij = (vi, vj).
Given V, a graph is uniquely identified by A.
The arcs in A can be:
undirected if (vi, vj) is an unordered pair
and the arc vi − vj has no direction;
directed if (vi, vj) ≠ (vj, vi) is an ordered
pair and the arc has a specific direction
vi → vj.
The assumption is that there is at most one
arc between a pair of nodes.
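The two kinds of arc can be sketched in bnlearn as follows. The node labels are hypothetical; `set.arc()` adds a directed arc and `set.edge()` an undirected one, and the two accessor functions list each kind separately.

```r
library(bnlearn)

# Node set V: three hypothetical labels, no arcs yet.
g <- empty.graph(c("A", "B", "C"))

# Directed arc A -> B, i.e. the ordered pair (A, B).
g <- set.arc(g, from = "A", to = "B")

# Undirected arc B - C, i.e. the unordered pair of B and C.
g <- set.edge(g, from = "B", to = "C")

# Inspect the arc set A, split by kind.
print(directed.arcs(g))
print(undirected.arcs(g))

# The graph mixes both kinds of arc, so it is not completely directed.
print(directed(g))
```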
Directed Acyclic Graphs
BNs use a specific kind of graph called a directed acyclic graph (DAG), which:
contains only directed arcs;
does not contain any loop (e.g. an arc vi → vi from a node to
itself);
does not contain any cycle (e.g. a sequence of arcs
vi → vj → . . . → vk → vi that starts and ends in the same node).
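A small sketch of the acyclicity requirement, assuming a hypothetical chain A → B → C: adding the arc C → A would close a cycle. By default `set.arc()` checks for cycles and refuses such an arc; switching the check off with `check.cycles = FALSE` lets the arc in, and `acyclic()` then reports the problem.

```r
library(bnlearn)

# Hypothetical chain A -> B -> C, which is a valid DAG.
dag <- model2network("[A][B|A][C|B]")
is.dag <- acyclic(dag)

# Force the arc C -> A in by disabling the cycle check;
# the result contains the cycle A -> B -> C -> A.
cyclic <- set.arc(dag, from = "C", to = "A", check.cycles = FALSE)
has.cycle <- !acyclic(cyclic)

# With the default cycle check, the same arc is refused with an error.
outcome <- tryCatch({
  set.arc(dag, from = "C", to = "A")
  "accepted"
}, error = function(e) "rejected")

print(c(is.dag = is.dag, has.cycle = has.cycle))
print(outcome)
```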
How the DAG Maps to the Probability Distribution
Formally, the DAG is an independence map of the probability
distribution of X, with graphical separation (⊥⊥G) implying probabilistic
independence (⊥⊥P).
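Graphical separation can be queried directly on a DAG. The sketch below uses bnlearn's `dsep()` on two hypothetical three-node structures, a chain and a collider, that are not taken from the slides. Since the DAG is an independence map, each separation found in the graph implies the corresponding probabilistic independence; the absence of separation by itself implies nothing.

```r
library(bnlearn)

# Two hypothetical structures over the same nodes.
chain <- model2network("[A][B|A][C|B]")      # A -> B -> C
collider <- model2network("[A][B][C|A:B]")   # A -> C <- B

# Chain: A and C are connected through B, and separated once B is given.
chain.marginal <- dsep(chain, x = "A", y = "C")
chain.given.b <- dsep(chain, x = "A", y = "C", z = "B")

# Collider: A and B are separated marginally, and connected once C is given.
collider.marginal <- dsep(collider, x = "A", y = "B")
collider.given.c <- dsep(collider, x = "A", y = "B", z = "C")

print(c(chain.marginal, chain.given.b, collider.marginal, collider.given.c))
```

In the chain, conditioning on the middle node blocks the path; in the collider, conditioning on the common child opens it.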
[Figure: a DAG over nodes A to F, in which graphical separation maps to probabilistic independence.]