logo资料库

Inferring Phylogenies.pdf

第1页 / 共681页
第2页 / 共681页
第3页 / 共681页
第4页 / 共681页
第5页 / 共681页
第6页 / 共681页
第7页 / 共681页
第8页 / 共681页
资料共681页,剩余部分请下载后查看
01
IMG_0001
IMG_0002
IMG_0003
IMG_0004
02
IMG_0004
IMG_0005
03
04
05
06
07
IMG_0001
IMG_0002
IMG_0003
IMG_0004
IMG_0005
IMG_0006
IMG_0007
IMG_0008
08
IMG
IMG_0001
IMG_0002
09
IMG_0009
IMG_0010
10
IMG
IMG_0001
Inferring Phylogenies Joseph Felsenstein University of Washington Sinauer Associates, Inc. • Publishers Sunderland, Massachusetts
COVER: "Far Flung Fir" 27" x 19" x 19" painted plaster over rebar and recycled styrofoam. Copyright © Joan A. Rudd 1985, exhibited Oregon Biennial 1985, Portland Art Museum. Collection of the author. INFERRING PHYLOGENIES Copyright © 2004 by Joseph Felsenstein All rights reserved. This book may not be reproduced in whole or in part without permission from the publisher. For information or to order, address: Sinauer Associates, Inc., PO Box 407,23 Plumtree Road, Sunderland, MA, 01375 U.S.A. FAX: 413-549-1118 Internet: publish@sinauer.com; http://www.sinauer.com Printed in U.S.A. 543
Contents Preface 1 Parsimony methods A simple example . Evaluating a particular tree . Rootedness and unrootedness Methods of rooting the tree Branch lengths . . . . . Unresolved questions . . . . 2 Counting evolutionary changes The Fitch algorithm . . . The Sankoff algorithm . . . . . . . . . . . . . Connection between the two algorithms . . . . . . . . . . . . Using the algorithms when modifying trees Views Using views when a tree is altered Further economies . . 3 How many trees are there? Rooted bifurcating trees Unrooted bifurcating trees Multifurcating trees .... Tree shapes Unrooted trees with multifurcations . Rooted bifurcating tree shapes .. Rooted multifurcating tree shapes Unrooted Shapes Labeled histories Perspective . . . . . . . v xix 1 1 1 4 6 8 9 11 11 13 16 16 16 17 18 19 20 24 25 28 29 29 30 32 35 36
vi 4 Finding the best tree by heuristic search Nearest-neighbor interchanges Subtree pruning and regrafting Tree bisection and reconnection Other tree rearrangement methods . . . . . Tree-fusing . . . . Genetic algorithms Tree windows and sectorial search . . . Speeding up rearrangements Sequential addition Star decomposition . . Tree space Search by reweighting of characters. Simulated annealing History . . . . . . 5 Finding the best tree by branch and bound A nonbiological example Finding the optimal solution. NP-hardness . Branch and bound methods Phylogenies: Despair and hope Branch and bound for parsimony Improving the bound . . . . . Using still-absent states. Using compatibility. Rules limiting the search 6 Ancestral states and branch lengths Reconstructing ancestral states . Accelerated and delayed transformation. Branch lengths . . . . . . . . . . . . . . . . 7 Variants of parsimony Camin-Sokal parsimony Parsimony on an ordinal scale Dollo parsimony. Polymorphism parsimony .. Unknown ancestral states .. Multiple states and binary coding. Dollo parsimony and multiple states . . . . . 37 38 41 44 44 44 45 46 46 47 48 50 51 52 53 54 54 57 59 60 60 61 64 64 64 65 67 67 70 70 73 73 74 75 76 78 78 80
Polymorphism parsimony and multiple states . Transformation series analysis Weighting characters . . . . . . . Successive weighting and nonlinear weighting . . . . . . . . Successive weighting . . . Nonsuccessive algorithms 8 Compatibility Testing compatibility . The Pairwise Compatibility Theorem . Cliques of compatible characters ... Finding the tree from the clique . . . . Other cases where cliques can be used Where cliques cannot be used . . . . . . . . . . . . . Perfect phylogeny. Using compatibility on molecules anyway. 9 Statistical properties of parsimony Likelihood and parsimony . . Consistency and parsimony . . . . . . . . . .. . . . . . . . The weights . Unweighted parsimony .... Limitations of this justification of parsimony Farris's proofs . . . . . No common mechanism ..... Likelihood and compatibility .. Parsimony versus compatibility. . . Character patterns and parsimony Observed numbers of the patterns Observed fractions of the patterns Expected fractions of the patterns. Inconsistency . When inconsistency is not a problem . The nucleotide sequence case . . . . . Other situations where consistency is guaranteed Does a molecular clock guarantee consistency? The Farris zone Some perspective . . . . . . . . . . . . . . 10 A digression on history and philosophy How phylogeny algorithms developed. Sokal and Sneath . . . . . Edwards and Cavalli-Sforza ... Camin and Sokal and parsimony . . . . vii 81 81 82 83 83 85 87 88 89 91 92 94 94 95 96 97 97 100 100 101 102 103 105 107 107 107 110 111 111 113 114 115 117 118 120 121 123 123 123 125 129
viii . . . . . . Eck and Dayhoff and molecular parsimony . . Fitch and Margoliash popularize distance matrix methods Wilson and Le Quesne introduce compatibility Jukes and Cantor and molecular distances .. Farris and Kluge and unordered parsimony . Fitch and molecular parsimony . Further work. . . What about Willi Hennig and Walter Zimmerman? . . . . . . . . . . . . . . . . Different philosophical frameworks H ypothetico-deductive Logical parsimony . Logical probability? . . Criticisms of statistical inference The irrelevance of classification . . . . . 11 Distance matrix methods Branch lengths and times. . The least squares methods . . . Least squares branch lengths Finding the least squares tree topology . . . . . . . . The statistical rationale .. Generalized least squares Distances The Jukes-Cantor model-an example Why correct for multiple changes? . . Minimum evolution. Clustering algorithms. . . UPGMA and least squares . A clustering algorithm An example . UPGMA on nonclocklike trees. . . Neighbor-joining . . . . . . . . . Performance . Using neighbor-joining with other methods Relation of neighbor-joining to least squares Weighted versions of neighbor-joining . . Other approximate distance methods. . . Distance Wagner method. . A related family . . Minimizing the maximum discrepancy Two approaches to error in trees . . . . . . . . . . . . . . . A puzzling formula . Consistency and distance methods . . . . . . . . . . . 130 131 133 134 134 136 136 136 138 139 141 142 143 145 147 147 148 148 153 153 154 155 156 158 159 161 161 162 162 165 166 168 169 169 170 171 171 171 172 172 173 174
A limitation of distance methods 12 Quartets of species The four point metric The split decomposition Related methods Short quartets methods. The disk-covering method Challenges for the short quartets and DCM methods . Three-taxon statement methods. . . Other uses of quartets with parsimony Consensus supertrees . . . . eighborliness. . . . . . . . . . . . . . De Soete's search method Quartet puzzling and searching tree space . Perspective. . . . . . . . . . . . . . . . . . . 13 Models of DNA evolution Kimura's two-parameter model Calculation of the distance. . . The Tamura-Nei model, F84, and HKY The general time-reversible model . Distances from the GTR model . . . . . . . . . . . . . . The general 12-parameter model .. LogDet distances . . . Other distances Variance of distance. Rate variation between sites or loci Different rates at different sites Distances with known rates . . Distribution of rates. . . . . . Gamma- and lognormally distributed rates Distances from gamma-distributed rates. . Models with nonindependence of sites . 14 Models of protein evolution . . . . . . . . . . Amino acid models . The Dayhoff model . Other empirically-based models. Models depending on secondary structure . . . . Codon-based models . . . . . . . . . . . . Inequality of synonymous and nonsynonymous substitutions . . . . Protein structure and correlated change . . . . . . . . . . . . ix 175 176 177 178 182 182 183 185 186 188 189 191 192 193 194 196 196 198 200 204 206 210 211 213 214 215 215 216 216 217 217 221 222 222 222 223 225 226 227 228
x 15 Restriction sites, RAPDs, AFLPs, and microsatellites Restriction sites . . . . . . . . . . . . . . Nei and Tajima's model . Distances based on restriction sites Issues of ascertainment . Parsimony for restriction sites . . . . . . . . . . . Modeling restriction fragments Parsimony with restriction fragments RAPDs and AFLPs . . . . . The issue of dominance. Unresolved problems . . Microsatellite models ..... The one-step model . . . Microsatellite distances. A Brownian motion approximation. Models with constraints on array size Multi-step and heterogeneous models Snakes and Ladders . Complications . 16 Likelihood methods Maximum likelihood . . . . . . . Anexample Computing the likelihood of a tree . . Economizing on the computation Handling ambiguity and error .. . Unrootedness Finding the maximum likelihood tree Inferring ancestral sequences Rates varying among sites .. Hidden Markov models Autocorrelation of rates HMMs for other aspects of models Estimating the states ... . Models with clocks . . . . . . Relaxing molecular clocks Models for relaxed clocks Covarions Empirical approaches to change of rates . Are ML estimates consistent? ... Comparability of likelihoods A nonexistent proof? A simple proof. . . . . . . . . 230 230 230 233 234 235 236 239 239 240 240 241 241 242 244 246 246 246 247 248 248 249 251 253 255 256 256 259 260 262 264 265 265 266 266 267 268 269 269 270 270 271
分享到:
收藏