Scalable Keyword Search
on Large RDF Data
In TKDE, 2013
woniu317
Outline
ØMotivation
ØAuthor’s method
ØConclusion
ØExperiments
2/32
Preliminaries
l RDF (Resource Description Framework)
l Triples (subject, predicate, object)
l Condense Graph
One node conclude
one keyword.
3/32
Problem definition
l Given an RDF graph G=(V, E) and a query Q = (w1, w2, …, wm)
l Candidate: (r, v1, v2, …, vm)
l r ∈ V is called a root answer node which is reachable by vi
l w(vi) = wi
l Answer
g = {r=v4, v1, v2, v6, v7}
s(g) = 2 + 2 + 2 + 2 = 8
g` = {r=v3, v1, v2, v6, v7}
s(g`) = 1 + 1 + 3 + 1 = 6
4/32
Motivation
l 1. The RDF is the de-facto standard for data representation
on the web.
l 2. Keyword search is an important tool for exploring and
searching large data corpuses whose structure is either
unknown, or constantly changing.
l 3. Existing solutions
l (1) Returning incorrect answers
l (2) Inability to handle large RDF
5/32
Outline
üMotivation
ØAuthor’s method
ØConclusion
ØExperiments
6/32
Backward search(existing)
Termination:
(1) whenever meet at a node r for the first time
Schema
0
1
2
w1 v1 v3 v2 v4 v7
w2 v2 v3 v1 v4 v7
w3 v6 v5
w4 v7 v3 v1 v2 v4
v4
g = {r=v4, v1, v2, v6, v7} s(g) = 8
g` = {r=v3, v1, v2, v6, v7} s(g`) = 6
g`` = {r=v12, v8, v10, v14, v15}
s(g``) = 7
7/32
Baseline method(author’s)
Termination:
(1) whenever meet at a node r for the first time
(2) s(r) ≤ s(r`)
0
1
2
w1 v1 v3 v2 v4 v7
w2 v2 v3 v1 v4 v7
w3 v6 v5
w4 v7 v3 v1 v2 v4
v4
g = {r=v4, v1, v2, v6, v7} s(g) = 8
s(r=v3) = 1 + 1 + 3 + 1 = 6
s(r=v1) = 0 + 2 + 3 + 2 = 7
……
8/32