Cover
Preface
Contents
Biographies
Chapter 1 Introduction
1.1 Why Networks?
1.2 Types of Network Analysis
1.2.1 Visualizing and Characterizing Networks
1.2.2 Network Modeling and Inference
1.2.3 Network Processes
1.3 Why Use R for Network Analysis?
1.4 About This Book
1.5 About the R code
Chapter 2 Manipulating Network Data
2.1 Introduction
2.2 Creating Network Graphs
2.2.1 Undirected and Directed Graphs
2.2.2 Representations for Graphs
2.2.3 Operations on Graphs
2.3 Decorating Network Graphs
2.3.1 Vertex, Edge, and Graph Attributes
2.3.2 Using Data Frames
2.4 Talking About Graphs
2.4.1 Basic Graph Concepts
2.4.2 Special Types of Graphs
2.5 Additional Reading
Chapter 3 Visualizing Network Data
3.1 Introduction
3.2 Elements of Graph Visualization
3.3 Graph Layouts
3.4 Decorating Graph Layouts
3.5 Visualizing Large Networks
3.6 Using Visualization Tools Outside of R
3.7 Additional Reading
Chapter 4 Descriptive Analysis of Network Graph Characteristics
4.1 Introduction
4.2 Vertex and Edge Characteristics
4.2.1 Vertex Degree
4.2.2 Vertex Centrality
4.2.3 Characterizing Edges
4.3 Characterizing Network Cohesion
4.3.1 Subgraphs and Censuses
4.3.2 Density and Related Notions of Relative Frequency
4.3.3 Connectivity, Cuts, and Flows
4.4 Graph Partitioning
4.4.1 Hierarchical Clustering
4.4.2 Spectral Partitioning
4.4.3 Validation of Graph Partitioning
4.5 Assortativity and Mixing
4.6 Additional Reading
Chapter 5 Mathematical Models for Network Graphs
5.1 Introduction
5.2 Classical Random Graph Models
5.3 Generalized Random Graph Models
5.4 Network Graph Models Based on Mechanisms
5.4.1 Small-World Models
5.4.2 Preferential Attachment Models
5.5 Assessing Significance of Network Graph Characteristics
5.5.1 Assessing the Number of Communities in a Network
5.5.2 Assessing Small World Properties
5.6 Additional Reading
Chapter 6 Statistical Models for Network Graphs
6.1 Introduction
6.2 Exponential Random Graph Models
6.2.1 General Formulation
6.2.2 Specifying a Model
6.2.3 Model Fitting
6.2.4 Goodness-of-Fit
6.3 Network Block Models
6.3.1 Model Specification
6.3.2 Model Fitting
6.3.3 Goodness-of-Fit
6.4 Latent Network Models
6.4.1 General Formulation
6.4.2 Specifying the Latent Effects
6.4.3 Model Fitting
6.4.4 Goodness-of-Fit
6.5 Additional Reading
Chapter 7 Network Topology Inference
7.1 Introduction
7.2 Link Prediction
7.3 Association Network Inference
7.3.1 Correlation Networks
7.3.2 Partial Correlation Networks
7.3.3 Gaussian Graphical Model Networks
7.4 Tomographic Network Topology Inference
7.4.1 Constraining the Problem: Tree Topologies
7.4.2 Tomographic Inference of Tree Topologies: An Illustration
7.5 Additional Reading
Chapter 8 Modeling and Prediction for Processes on Network Graphs
8.1 Introduction
8.2 Nearest Neighbor Methods
8.3 Markov Random Fields
8.3.1 General Characterization
8.3.2 Auto-Logistic Models
8.3.3 Inference and Prediction for Auto-Logistic Models
8.3.4 Goodness of Fit
8.4 Kernel Methods
8.4.1 Designing Kernels on Graphs
8.4.2 Kernel Regression on Graphs
8.5 Modeling and Prediction for Dynamic Processes
8.5.1 Epidemic Processes: An Illustration
8.6 Additional Reading
Chapter 9 Analysis of Network Flow Data
9.1 Introduction
9.2 Modeling Network Flows: Gravity Models
9.2.1 Model Specification
9.2.2 Inference for Gravity Models
9.3 Predicting Network Flows: Traffic Matrix Estimation
9.3.1 An Ill-Posed Inverse Problem
9.3.2 The Tomogravity Method
9.4 Additional Reading
Chapter 10 Dynamic Networks
10.1 Introduction
10.2 Representation and Manipulation of Dynamic Networks
10.3 Visualization of Dynamic Networks
10.4 Characterization of Dynamic Networks
10.5 Modeling Dynamic Networks
References
Index
Use R!
Eric D. Kolaczyk, Gábor Csárdi
Statistical Analysis of Network Data with R
Use R!
Series Editors: Robert Gentleman, Kurt Hornik, Giovanni Parmigiani
For further volumes: http://www.springer.com/series/6991
Use R!
Albert: Bayesian Computation with R (2nd ed. 2009)
Bivand/Pebesma/Gómez-Rubio: Applied Spatial Data Analysis with R (2nd ed. 2013)
Cook/Swayne: Interactive and Dynamic Graphics for Data Analysis: With R and GGobi
Hahne/Huber/Gentleman/Falcon: Bioconductor Case Studies
Paradis: Analysis of Phylogenetics and Evolution with R (2nd ed. 2012)
Pfaff: Analysis of Integrated and Cointegrated Time Series with R (2nd ed. 2008)
Sarkar: Lattice: Multivariate Data Visualization with R
Spector: Data Manipulation with R
Eric D. Kolaczyk • Gábor Csárdi
Statistical Analysis of Network Data with R
Eric D. Kolaczyk, Professor
Department of Mathematics and Statistics
Boston University
Boston, MA, USA

Gábor Csárdi, Research Associate
Department of Statistics
Harvard University
Cambridge, MA, USA

ISSN 2197-5736
ISSN 2197-5744 (electronic)
ISBN 978-1-4939-0982-7
ISBN 978-1-4939-0983-4 (eBook)
DOI 10.1007/978-1-4939-0983-4
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2014936989

© Springer Science+Business Media New York 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
Pour Josée, sans qui ce livre n’aurait pas vu le jour - E.D.K.
Z-nak, a gyémántért és az aranyért - G.CS.
Preface

Networks and network analysis are arguably one of the largest recent growth areas in the quantitative sciences. Despite roots in social network analysis going back to the 1930s and roots in graph theory going back centuries, the phenomenal rise and popularity of the modern field of ‘network science’, as it is sometimes called, is something that likely could not have been predicted 10–15 years ago. Networks have permeated everyday life, far beyond the realm of research and methodology, through now-familiar realities like the Internet, social networks, viral marketing, and more.

Measurement and data analysis are integral components of network research. As a result, there is a critical need for all sorts of statistics for network analysis, both common and sophisticated, ranging from applications, to methodology, to theory. As with other areas of statistics, there are both descriptive and inferential statistical techniques available, aimed at addressing a host of network-related tasks, including basic visualization and characterization of network structure; sampling, modeling, and inference of network topology; and modeling and prediction of network-indexed processes, both static and dynamic.

Software for performing many such network-related analyses is now available in various languages and environments, across different platforms. Not surprisingly, the R community has been particularly active in the development of software for doing statistical analysis of network data. As of this writing there are already dozens of contributed R packages devoted to some aspect of network analysis. Together, these packages address tasks ranging from standard manipulation, visualization, and characterization of network data (e.g., igraph, network, and sna), to modeling of networks (e.g., igraph, eigenmodel, ergm, and mixer), to network topology inference (e.g., glasso and huge). In addition, there is a great deal of analysis that can be done using tools and functions from the R base package.

In this book we aim to provide an easily accessible introduction to the statistical analysis of network data, by way of the R programming language. As a result, this book is not, on the one hand, a detailed manual for using the various R packages encountered herein, nor, on the other hand, does it provide exhaustive coverage of the conceptual and technical foundations of the topic area. Rather, we have attempted