Mining and Learning with
Graphs at Scale
https://gm-neurips-2020.github.io/
Welcome + Agenda
Introduction to Graphs + Application Stories
What are graphs? Why are they important?
Graph Mining: Basic tools and algorithms
How do we build, cluster, and use graphs at scale?
Graph Neural Networks
How can we use deep learning on graphs? How can
we use graphs in deep learning?
Systems, Algorithms and Scalability
How do we deal with massive graphs? How can
graphs help us organize Google-scale data?
An Introduction to Mining and
Learning with Graphs
Vahab Mirrokni
Graph Mining | go/graph-mining | December 2019
What are graphs?
Graphs are representations of relationships
(edges) between entities (nodes).
In the most general case, graphs have:
- varying numbers of edges…
- with different edge types going to
different node types…
- with a highly complex structure.
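As a rough illustration of the definition above, the sketch below stores a small graph whose nodes and edges each carry a type label. The class name `TypedGraph`, its methods, and the example node and edge types are all hypothetical, chosen only for this example; they are not taken from any particular library.

```python
from collections import defaultdict


class TypedGraph:
    """Toy heterogeneous graph: nodes and edges each carry a type label."""

    def __init__(self):
        self.node_type = {}            # node -> node type
        self.adj = defaultdict(list)   # node -> list of (edge type, neighbor)

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, src, dst, etype):
        # Directed edge; call twice for an undirected relationship.
        self.adj[src].append((etype, dst))

    def degree(self, node):
        # Out-degree: nodes may have very different numbers of edges.
        return len(self.adj[node])


g = TypedGraph()
g.add_node("alice", "user")
g.add_node("bob", "user")
g.add_node("video_1", "video")
g.add_edge("alice", "bob", "follows")      # user -> user
g.add_edge("alice", "video_1", "watched")  # user -> video
g.add_edge("bob", "video_1", "watched")
```

Here `alice` has two outgoing edges of different types going to different node types, while `bob` has only one.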
Social Networks
Traffic, maps (Google Maps)
Image Pixels
Disease Spread
https://www.pnas.org/content/116/2/401
Types of Graphs
Natural graphs are graphs in which the edge
relationship comes from an external source. Think:
payments, social networks, roadways,
coclick/cowatch.
By contrast, similarity graphs are graphs in which the
edge relationship is based on some measure of
similarity/distance between nodes. In these cases, we
start with a blob of (meta-)data and attempt to give
that blob structure via graph representation.
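One common way to build a similarity graph is to connect each item to its k most similar items. The sketch below is a minimal brute-force version, assuming each item is a small feature vector and using cosine similarity as the measure; the function names `cosine` and `knn_graph` and the toy data are illustrative assumptions. Comparing all pairs costs quadratic time, so this shows the idea only and would not work at large scale.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_graph(points, k):
    """Return {node: [k most similar other nodes]} by comparing all pairs."""
    graph = {}
    for name, vec in points.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in points.items()
            if other != name
        ]
        # Highest similarity first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        graph[name] = [other for _, other in scored[:k]]
    return graph


points = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.1, 0.9],
}
edges = knn_graph(points, k=1)
```

With `k=1`, `a` and `b` point at each other, as do `c` and `d`, so the unstructured points acquire a two-cluster structure.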
Why Graphs?
Computation on abstract concepts
Most data is fundamentally about relationships,
and graphs can help us represent them. Graphs can also help
us abstract local information and use it to
extract useful global information from data.
Computation on different data types
We constantly deal with visual, textual, and
semantic information, and all of this data relates
to each other. Graphs provide a natural way to
handle multi-modal data.
Social Network Analysis, Wikimedia Commons
Search Query: Apple
Why Graphs? Global and Local View
Apple Inc.
Global view:
Graph structure/topology can tell us a lot
about our data such as uncovering clusters
of data points, or providing distance
measures for otherwise intangible
concepts.
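Two simple instances of the global view are hop distance (how many edges separate two nodes) and connected components (a crude form of clustering). The sketch below computes both with breadth-first search on a toy undirected graph stored as an adjacency dictionary; the function names and the example graph are assumptions made for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def connected_components(adj):
    """Group nodes into sets that are mutually reachable."""
    seen, components = set(), []
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            components.append(comp)
    return components


# Undirected toy graph: each edge is listed in both directions.
adj = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "x": ["y"], "y": ["x"],
}
dist_from_a = bfs_distances(adj, "a")
clusters = connected_components(adj)
```

In this example `c` is two hops from `a`, and the graph splits into two groups: `{a, b, c}` and `{x, y}`.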
Local view:
Local edges to and from a node can tell us
something useful about a node --
something that is difficult to express with a
single element.
The black center pixel is part of an eye, but that is
only apparent when you can see nearby pixels.
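The pixel example can be made concrete by treating an image as a grid graph in which each pixel is connected to its adjacent pixels. The sketch below, using a hypothetical helper `pixel_neighbors` and a made-up 3x3 grayscale image, collects the values of the pixels surrounding a given position.

```python
def pixel_neighbors(image, row, col):
    """Return the values of the up-to-8 pixels surrounding (row, col)."""
    height, width = len(image), len(image[0])
    values = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            # Skip the pixel itself and anything outside the image.
            if (dr, dc) != (0, 0) and 0 <= r < height and 0 <= c < width:
                values.append(image[r][c])
    return values


# 0 = black, 255 = white: one black pixel surrounded by white ones.
image = [
    [255, 255, 255],
    [255,   0, 255],
    [255, 255, 255],
]
center_context = pixel_neighbors(image, 1, 1)
corner_context = pixel_neighbors(image, 0, 0)
```

The center pixel's value alone is just `0`; its neighborhood is what shows that it is an isolated dark spot on a light background.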
Graphs at Scale: Algorithms, Learning, & Systems
for Impact
Because graph representations are so flexible, we
often want to use them on Google-scale data.
We are often dealing with billions of nodes and
many more edges. To work with data at this scale,
we have to combine algorithmic ideas with the right
systems and ML models.
This can be very hard, and the devil is in the details.
These tools power hundreds of projects at Google
in Search, Ads, YouTube, Play, Cloud, Maps,
Payments, and more.
Same-meaning queries for Keyword
matching systems
Better caching, saving 32% of Flash
I/O for Search Infra (VLDB’19).
Collaborative Filtering for
YouTube Recommendations
Finding micro-markets for
designing A/B experiments
[KDD’19, NeurIPS’19]