Initial performance comparison: Pellet vs Prolog in Bioclipse

I have started some initial performance testing on RDF data, comparing Pellet and Prolog, which are now both available integrated in Bioclipse.

Importing data

Total time for importing nmrshiftdata with Pellet: 56.791 s

Loading the same data into Prolog, with the Pellet data already loaded:

Total time for importing nmrshiftdata with Prolog: 49.091 s
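The import timings can be taken with the same start/stop pattern used in the scripts further down. Here is a minimal sketch of that pattern in plain JavaScript. Note that timeIt and fakeImport are hypothetical names of my own, not part of Bioclipse; inside Bioclipse the stand-in workload would be replaced by the actual import call, and console.log by js.say.

```javascript
// Hypothetical helper (not a Bioclipse API): run fn once and measure wall-clock time.
function timeIt(label, fn) {
  var start = new Date().getTime();
  var result = fn();
  var elapsed = (new Date().getTime() - start) / 1000; // milliseconds -> seconds
  return {
    result: result,
    seconds: elapsed,
    message: "Total time for " + label + ": " + elapsed + " s"
  };
}

// Stand-in workload so the sketch runs outside Bioclipse. Inside Bioclipse the
// function body would instead be the import call, e.g.
// rdf.importFile(store, "runningbioclipse/nmrshiftdata.100.R2.rdf.xml", "RDF/XML").
function fakeImport() {
  var count = 0;
  for (var i = 0; i < 100000; i++) {
    count += 1;
  }
  return count;
}

var timing = timeIt("importing nmrshiftdata with Pellet", fakeImport);
console.log(timing.message); // in Bioclipse this would be js.say(timing.message)
```

Since this measures a single run, the numbers include whatever else the JVM and Bioclipse happen to be doing at the time, so they are best read as rough indications.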

Listing all predicates


Bioclipse JS Script

// JavaScript
var nmrShiftDBStore = pellet.createStore();
rdf.importFile(nmrShiftDBStore, "runningbioclipse/nmrshiftdata.100.R2.rdf.xml", "RDF/XML");
var start = new Date().getTime();
var sparql = "SELECT distinct ?predicate WHERE {   ?x ?predicate ?y. }";
js.say(rdf.sparql(nmrShiftDBStore, sparql));
var elapsed = (new Date().getTime() - start)/1000;
js.say("Total time for retreiving all predicates, with Pellet: " + elapsed + " s");


[], [],
Total time for retreiving all predicates, with Pellet: 111.68 s

Note that the last seven predicates are specific to Pellet. That is, they are built-in predicates that do not come from the NMRShift data.


Prolog function

listAllPredicates(Ps) :-
  setof(P, rdf_db:S^O^rdf( S, P, O ), Ps ).

(This code is placed in the file

Bioclipse JS Script

var start = new Date().getTime();
js.say(blipkit.queryProlog( [ "listAllPredicates", "100", "Ps" ] ));
var elapsed = (new Date().getTime() - start)/1000;
js.say("Total time for retreiving all predicates with Prolog: " + elapsed + " s");


'.'('', []))))))))))))))))]]
Total time for retreiving all predicates with Prolog: 0.023 s

Note that this Prolog method returns a list rather than a set of instances/atoms, which explains the difference in output.

There is obviously a problem with Pellet here: it takes 111.68 s to retrieve all predicates, whereas Prolog does the same thing in 0.023 s. Talking to Egon, we figured it is most probably related to the fact that Pellet/Jena keeps the whole RDF store in memory only, so the thing to do would be to implement a database backend, or similar, for the RDF store.

  • This seems to be a good starting point.
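To put the numbers in proportion, here is a small sketch that computes the Pellet/Prolog ratios from the timings reported in this post. The variable and function names are my own, and the ratios say nothing about the cause; they are just arithmetic on the single-run measurements.

```javascript
// Timings in seconds, copied from the measurements reported in this post.
var timings = {
  importPellet: 56.791,
  importProlog: 49.091,
  queryPellet: 111.68,
  queryProlog: 0.023
};

// How many times longer the slower run took compared to the faster one.
function slowdown(slower, faster) {
  return slower / faster;
}

var importFactor = slowdown(timings.importPellet, timings.importProlog);
var queryFactor = slowdown(timings.queryPellet, timings.queryProlog);

console.log("Import: Pellet/Prolog = " + importFactor.toFixed(2));
console.log("Predicate query: Pellet/Prolog = " + queryFactor.toFixed(0));
```

So importing is about equally fast in both (a factor of roughly 1.16), while the predicate query differs by a factor of several thousand (roughly 4856).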