SPARQL fast as hell
In the last two months we shifted the gnowsis search services to SPARQL. Our problem was that the common ARQ implementation is slow and does not work in a named graph scenario.
By a happy twist of history, Richard Cyganiak worked at HP Labs for a while and hacked sparql2sql there, a mapping of SPARQL to the Jena database schema with support for named graphs.
I asked him if I could use it in gnowsis, and he was quite happy to have it as a test case. The result: Richard added a special framework for fulltext search, and I hacked some stuff around MySQL's fine FULLTEXT index functions (if you don't know them: they are similar to Lucene and can do relevance ranking, query expansion and so on).
Outcome: an hour ago I converted the SQL index to the new fulltext format and got hit in the face by the new response times:
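For readers who have not used MySQL's FULLTEXT functions, here is a rough sketch of what such a query looks like. This is not what sparql2sql generates; it only shows the MATCH ... AGAINST syntax with relevance ranking and optional query expansion. The table and column names (`literals`, `value`) are made up for illustration, and the column is assumed to carry a FULLTEXT index.

```python
def fulltext_sql(table, column, expand=False):
    """Build a MySQL fulltext query string with DB-API style %s placeholders.

    `table` and `column` are hypothetical names; the column is assumed
    to have a FULLTEXT index. The search term is not embedded here: it
    has to be passed twice as a query parameter when executing.
    """
    # WITH QUERY EXPANSION lets MySQL re-run the search with terms taken
    # from the best-matching rows of a first pass.
    modifier = " WITH QUERY EXPANSION" if expand else ""
    match = f"MATCH({column}) AGAINST (%s{modifier})"
    return (
        f"SELECT *, {match} AS relevance "
        f"FROM {table} WHERE {match} "
        f"ORDER BY relevance DESC"
    )


plain_query = fulltext_sql("literals", "value")
expanded_query = fulltext_sql("literals", "value", expand=True)
```

With a MySQL driver this would be run roughly as `cursor.execute(plain_query, ("test", "test"))`, once for the relevance column and once for the WHERE clause.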
SPARQL of this kind works practically instantly (10-20 ms):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x ?label WHERE
{ GRAPH ?source {
?x rdfs:label ?label.
FILTER REGEX(?label, "test", "i")}}
But the really astonishing thing is that SPARQL of this kind is also fast as hell (10-500 ms):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x ?p ?label WHERE
{ GRAPH ?source {
?x ?p ?label.
FILTER REGEX(?label, "test", "i")}}
In simple words: this is a fulltext scan over all properties of all statements. Don't be bothered by the warmup time: the thing needs about 10-20 sec to warm up, but then it's great.
triplecount:
371994
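To make clear what the two queries ask for, here is a naive in-memory sketch of the same filter in Python. It only illustrates the semantics (a case-insensitive regex over the object of every statement in every named graph), not how sparql2sql or MySQL evaluate it. The sample quads are invented, and all objects are assumed to be plain literals.

```python
import re

# Invented sample data: (graph, subject, predicate, object) quads.
QUADS = [
    ("urn:g1", "urn:doc1", "rdfs:label", "Test report"),
    ("urn:g1", "urn:doc1", "dc:creator", "leobard"),
    ("urn:g2", "urn:doc2", "rdfs:comment", "my latest draft"),
    ("urn:g2", "urn:doc3", "rdfs:label", "Holiday photos"),
]


def regex_scan(quads, pattern, only_predicate=None):
    """Return (subject, predicate, object) for every matching statement.

    Mirrors FILTER REGEX(?label, pattern, "i"). With only_predicate set,
    this corresponds to the first query (a fixed property); without it,
    to the second query (any property).
    """
    rx = re.compile(pattern, re.IGNORECASE)
    return [
        (s, p, o)
        for (_graph, s, p, o) in quads
        if (only_predicate is None or p == only_predicate) and rx.search(o)
    ]


labels_only = regex_scan(QUADS, "test", only_predicate="rdfs:label")
all_properties = regex_scan(QUADS, "test")
```

Note that the regex also hits "latest" in the comment, because REGEX matches substrings. A linear scan like this touches every statement, which is why an index-backed version answering in milliseconds on this many triples is such a nice surprise.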
For the freaks, we will probably build a little executable jar that packs everything into a nice demo. For the real freaks:
http://gnowsis.opendfki.de/repos/gnowsis/trunk/sparql2sql/
demo app
leobard - 3. Aug, 23:17