Graph Databases - New Opportunities for Connected Data(2nd) 无水印p....pdf

Cover
Copyright
Table of Contents
Foreword
Graphs Are Everywhere, or the Birth of Graph Databases as We Know Them
Preface
About the Second Edition
About This Book
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Acknowledgments
1. Introduction
What Is a Graph?
A High-Level View of the Graph Space
Graph Databases
Graph Compute Engines
The Power of Graph Databases
Performance
Flexibility
Agility
Summary
2. Options for Storing Connected Data
Relational Databases Lack Relationships
NOSQL Databases Also Lack Relationships
Graph Databases Embrace Relationships
Summary
3. Data Modeling with Graphs
Models and Goals
The Labeled Property Graph Model
Querying Graphs: An Introduction to Cypher
Cypher Philosophy
MATCH
RETURN
Other Cypher Clauses
A Comparison of Relational and Graph Modeling
Relational Modeling in a Systems Management Domain
Graph Modeling in a Systems Management Domain
Testing the Model
Cross-Domain Models
Creating the Shakespeare Graph
Beginning a Query
Declaring Information Patterns to Find
Constraining Matches
Processing Results
Query Chaining
Common Modeling Pitfalls
Email Provenance Problem Domain
A Sensible First Iteration?
Second Time's the Charm
Evolving the Domain
Identifying Nodes and Relationships
Avoiding Anti-Patterns
Summary
4. Building a Graph Database Application
Data Modeling
Describe the Model in Terms of the Application's Needs
Nodes for Things, Relationships for Structure
Fine-Grained versus Generic Relationships
Model Facts as Nodes
Represent Complex Value Types as Nodes
Time
Iterative and Incremental Development
Application Architecture
Embedded versus Server
Clustering
Load Balancing
Testing
Test-Driven Data Model Development
Performance Testing
Capacity Planning
Optimization Criteria
Performance
Redundancy
Load
Importing and Bulk Loading Data
Initial Import
Batch Import
Summary
5. Graphs in the Real World
Why Organizations Choose Graph Databases
Common Use Cases
Social
Recommendations
Geo
Master Data Management
Network and Data Center Management
Authorization and Access Control (Communications)
Real-World Examples
Social Recommendations (Professional Social Network)
Authorization and Access Control
Geospatial and Logistics
Summary
6. Graph Database Internals
Native Graph Processing
Native Graph Storage
Programmatic APIs
Kernel API
Core API
Traversal Framework
Nonfunctional Characteristics
Transactions
Recoverability
Availability
Scale
Summary
7. Predictive Analysis with Graph Theory
Depth- and Breadth-First Search
Path-Finding with Dijkstra's Algorithm
The A* Algorithm
Graph Theory and Predictive Modeling
Triadic Closures
Structural Balance
Local Bridges
Summary
Appendix A. NOSQL Overview
The Rise of NOSQL
ACID versus BASE
The NOSQL Quadrants
Document Stores
Key-Value Stores
Column Family
Query versus Processing in Aggregate Stores
Graph Databases
Property Graphs
Hypergraphs
Triples
Index
About the Authors
Graph Databases: New Opportunities for Connected Data
Second Edition
Ian Robinson, Jim Webber & Emil Eifrem
Graph Databases
by Ian Robinson, Jim Webber, and Emil Eifrem

Copyright © 2015 Neo Technology, Inc. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editor: Marie Beaugureau
Production Editor: Kristen Brown
Proofreader: Christina Edwards
Indexer: WordCo Indexing Services
Interior Designer: David Futato
Cover Designer: Ellie Volckhausen
Illustrator: Rebecca Demarest

June 2013: First Edition
June 2015: Second Edition

Revision History for the Second Edition
2015-05-04: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491930892 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Graph Databases, the cover image of a European octopus, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-93200-1
[LSI]
Foreword

Graphs Are Everywhere, or the Birth of Graph Databases as We Know Them

It was 1999 and everyone worked 23-hour days. At least it felt that way. It seemed like each day brought another story about a crazy idea that just got millions of dollars in funding. All our competitors had hundreds of engineers, and we were a 20-ish person development team. As if that was not enough, 10 of our engineers spent the majority of their time just fighting the relational database.

It took us a while to figure out why. As we drilled deeper into the persistence layer of our enterprise content management application, we realized that our software was managing not just a lot of individual, isolated, and discrete data items, but also the connections between them. And while we could easily fit the discrete data in relational tables, the connected data was more challenging to store and tremendously slow to query.

Out of pure desperation, my two Neo cofounders, Johan and Peter, and I started experimenting with other models for working with data, particularly those that were centered around graphs. We were blown away by the idea that it might be possible to replace the tabular SQL semantic with a graph-centric model that would be much easier for developers to work with when navigating connected data. We sensed that, armed with a graph data model, our development team might not waste half its time fighting the database.

Surely, we said to ourselves, we can’t be unique here. Graph theory has been around for nearly 300 years and is well known for its wide applicability across a number of diverse mathematical problems. Surely, there must be databases out there that embrace graphs!