Social Network Analysis for Startups [2011].pdf

Table of Contents
Preface
Prerequisites
Open-Source Tools
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Thanks
Chapter 1. Introduction
Analyzing Relationships to Understand People and Groups
Binary and Valued Relationships
Symmetric and Asymmetric Relationships
Multimode Relationships
From Relationships to Networks—More Than Meets the Eye
Social Networks vs. Link Analysis
The Power of Informal Networks
Terrorists and Revolutionaries: The Power of Social Networks
Social Networks in Prison
Informal Networks in Terrorist Cells
The Revolution Will Be Tweeted
Social Media and Social Networks
Egyptian Revolution and Twitter
Chapter 2. Graph Theory—A Quick Introduction
What Is a Graph?
Adjacency Matrices
Edge-Lists and Adjacency Lists
7 Bridges of Königsberg
Graph Traversals and Distances
Depth-First Traversal
Implementation
DFS with NetworkX
Breadth-First Traversal
Algorithm
BFS with NetworkX
Paths and Walks
Dijkstra’s Algorithm
Graph Distance
Graph Diameter
Why This Matters
6 Degrees of Separation is a Myth!
Small World Networks
Chapter 3. Centrality, Power, and Bottlenecks
Sample Data: The Russians are Coming!
Get Oriented in Python and NetworkX
Read Nodes and Edges from LiveJournal
Snowball Sampling
Saving and Loading a Sample Dataset from a File
Centrality
Who Is More Important in this Network?
Find the “Celebrities”
Degree centrality in the LiveJournal network
Find the Gossipmongers
Find the Communication Bottlenecks and/or Community Bridges
Putting It Together
Who Is a “Gray Cardinal?”
In practice
Klout Score
PageRank—How Google Measures Centrality
Simplified PageRank algorithm
What Can’t Centrality Metrics Tell Us?
Chapter 4. Cliques, Clusters and Components
Components and Subgraphs
Analyzing Components with Python
Islands in the Net
Subgraphs—Ego Networks
Extracting and Visualizing Ego Networks with Python
Triads
Fraternity Study—Tie Stability and Triads
Triads and Terrorists
The “Forbidden Triad” and Structural Holes
Structural Holes and Boundary Spanning
Triads in Politics
Directed Triads
Analyzing Triads in Real Networks
Real Data
Cliques
Detecting Cliques
Hierarchical Clustering
The Algorithm
Clustering Cities
Preparing Data and Clustering
Block Models
Triads, Network Density, and Conflict
Chapter 5. 2-Mode Networks
Does Campaign Finance Influence Elections?
Theory of 2-Mode Networks
Affiliation Networks
Attribute Networks
A Little Math
2-Mode Networks in Practice
PAC Networks
Candidate Networks
Expanding Multimode Networks
Exercise
Chapter 6. Going Viral! Information Diffusion
Anatomy of a Viral Video
What Did Facebook Do Right?
How Do You Estimate Critical Mass?
Wikinomics of Critical Mass
Content is (Still) King
Heterogeneous Preferences
How Does Information Shape Networks (and Vice Versa)?
Birds of a Feather?
Homophily vs. Curiosity
Boundary Spanners
Weak Ties
Dunbar Number and Weak Ties
A Simple Dynamic Model in Python
Influencers in the Midst
Exercises for the Reader
Coevolution of Networks and Information
Exercises for the Reader
Why Model Networks?
Chapter 7. Graph Data in the Real World
Medium Data: The Tradition
Big Data: The Future, Starting Today
“Small Data”—Flat File Representations
EdgeList Files
.net Format
GML, GraphML, and other XML Formats
Ancient Binary Format—##h Files
“Medium Data”: Database Representation
What are Cursors?
What are Transactions?
Names
Nodes as Data, Attributes as ?
The Class
Functions and Decorators
Decorator notation
The Adaptor
Working with 2-Mode Data
Exercises for the Reader
Social Networks and Big Data
NoSQL
Structural Realities
Plain text is king
The freedom to store
Computational Complexities
Big Data is Big
Big Data at Work
What Are We Distributing?
Hadoop, S3, and MapReduce
Hive
SQL is Still Our Friend
Appendix A. Data Collection
A Note on the Ethics of Data Collection
The Old-Fashioned Way
Mining Server Logs
Mining Social Media Sites
Business and Investments
Politics, Elections, and Courts
Blogosphere and Social Bookmarking
Twitter Data Collection
Facebook
Private Ego-Networks
Facebook Social Graph API
Appendix B. Installing Software
Why (We Love) Python?
Exploratory Programming
Python
IPython
NetworkX
matplotlib
pylab: matplotlib with IPython
Social Network Analysis for Startups
Maksim Tsvetovat and Alexander Kouznetsov
Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo

Social Network Analysis for Startups
by Maksim Tsvetovat and Alexander Kouznetsov

Copyright © 2011 Maksim Tsvetovat and Alexander Kouznetsov. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.

Editors: Shawn Wallace and Mike Hendrickson
Production Editor: Kristen Borg
Proofreader: O’Reilly Production Services
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Robert Romano

Revision History for the First Edition:
See http://oreilly.com/catalog/errata.csp?isbn=9781449306465 for release details.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Social Network Analysis for Startups, the image of a hawfinch, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

ISBN: 978-1-449-30646-5
[LSI]
1316789838