A usage strategy emerges

A strategy for how to work with the Bioclipse/JPL/Prolog/Blipkit combination I'm setting up is becoming clear.

The main idea with Bioclipse, as well as with having a Prolog engine available in it, is flexible and "interactive" knitting together of knowledge. One of the main questions regarding how to use a Bioclipse/JPL/Prolog/Blipkit combination has been where to put the bulk of the knowledge integration/reasoning code. There are, in principle, three options for that:

  1. Bioclipse (Javascript environment)
  2. The Blipkit-Prolog/Bioclipse integration plugin (Java code, a.k.a. "Manager methods")
  3. The Prolog engine (as a Prolog file)

Number 2 is unrealistic

Number 2 in the list was quite quickly abandoned as unrealistic, since writing new Java code would be just too complicated for the end user (it would require checking out the source code and restarting Bioclipse for each change, etc.).

Furthermore, JPL (the Java/Prolog API) is not overly flexible. For example, it requires you to decide the type of a term before passing it to a Prolog query, whereas Prolog itself would have been able to determine this on the fly.
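
To make the difference concrete, here is a minimal sketch of such a call in Java (using the org.jpl7 package naming of recent JPL releases; older releases use the plain jpl package). The point is that the caller has to commit to Atom, org.jpl7.Integer, etc. up front, instead of handing Prolog a piece of text and letting it read the type itself:

    import org.jpl7.Atom;
    import org.jpl7.Compound;
    import org.jpl7.Query;
    import org.jpl7.Term;
    import org.jpl7.Variable;

    public class TypedTermExample {
        public static void main(String[] args) {
            // The concrete Term subclass must be chosen by the caller:
            // an Atom here, but it would have to be org.jpl7.Integer for 42, and so on.
            Term arg = new Atom("hello");

            // Build the goal atom_length(hello, Len) as an explicit Compound term.
            Term goal = new Compound("atom_length", new Term[] { arg, new Variable("Len") });

            Query q = new Query(goal);
            System.out.println(q.oneSolution().get("Len"));   // prints 5
        }
    }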

This CAN be addressed, though, and I have already done so with a wrapper class for JPL's Query class that replicates part of Prolog's type-handling behaviour. But there are more problems that are not as easily solved. One of them is the use of namespaces, as mentioned in this blog post. There was in fact a solution to this, put forward on the SWI-Prolog mailing list (link to be added), but it required constructing a convenience predicate in Prolog (which does not seem like a good general solution), and I did not get it to work anyway.
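
For illustration, the idea behind such a wrapper (this is a sketch of the idea, not my actual wrapper code) is simply to guess the JPL term type from a plain string, roughly the way Prolog's reader would:

    import org.jpl7.Atom;
    import org.jpl7.Term;
    import org.jpl7.Variable;

    /** Sketch: guess the JPL Term type from a plain string, roughly as Prolog's reader would. */
    public class TermGuesser {
        public static Term toTerm(String s) {
            // Integers and floats become numeric terms.
            try {
                return new org.jpl7.Integer(Long.parseLong(s));
            } catch (NumberFormatException ignored) {
            }
            try {
                return new org.jpl7.Float(Double.parseDouble(s));
            } catch (NumberFormatException ignored) {
            }
            // Words starting with an upper-case letter or underscore become variables.
            if (!s.isEmpty() && (Character.isUpperCase(s.charAt(0)) || s.charAt(0) == '_')) {
                return new Variable(s);
            }
            // Everything else is treated as an atom.
            return new Atom(s);
        }
    }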

Rather, it seems best to keep the Java methods general enough that they can be used flexibly to call any Prolog predicate from Bioclipse's JavaScript environment.
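
A sketch of what such a general method could look like (the method name queryProlog is made up for illustration and is not the actual manager API): the goal arrives as plain Prolog text from the JavaScript side, and the solutions go back as strings, so no Prolog-specific Java types leak into the scripting environment:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    import org.jpl7.Query;
    import org.jpl7.Term;

    public class PrologManagerSketch {
        /** Run any Prolog goal given as text, e.g. "rdf(S, P, O)", returning one string per solution. */
        public List<String> queryProlog(String goalText) {
            List<String> results = new ArrayList<String>();
            Query q = new Query(goalText);
            while (q.hasMoreSolutions()) {
                Map<String, Term> solution = q.nextSolution();
                results.add(solution.toString());
            }
            return results;
        }
    }

From the JavaScript console this would then be a single call along the lines of prolog.queryProlog("rdf(S, P, O)").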

Number 1 was my first thought

Number 1 was my first idea, based on the fact that this is the place where one has access to most other functionality in Bioclipse. Some code can surely be put here, but it turns out that I had too high hopes for what can easily be done from here. Putting too much of the reasoning code in the Bioclipse JS environment comes with a number of problems:

  • Loading data back and forth between Prolog and Java over and over leads to slower execution, since Prolog cannot benefit from its own optimizations, and on top of that there is the overall overhead of shuffling data between different locations in memory.
  • JPL is not well suited to handling large amounts of data at once. For example, trying to load RDF data into a JPL variable (rather than just importing it into the Prolog engine) gives Java stack size errors above roughly 3000 triples, as I have documented on the SWI-Prolog mailing list (link to be added). This is of course problematic, since in many reasoning problems you want to do exhaustive searches in order to know for sure whether something exists OR NOT in the database (see the sketch just after this list).
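
To make the contrast concrete, here is a rough sketch (assuming the RDF data has already been loaded into the Prolog engine, so that SWI-Prolog's rdf/3 predicate can be queried) of the problematic pattern versus the one that keeps the exhaustive search inside Prolog:

    import java.util.Map;

    import org.jpl7.Atom;
    import org.jpl7.Query;
    import org.jpl7.Term;

    public class WhereToSearchSketch {

        /** Problematic: pull every rdf/3 solution over the JPL bridge into Java. */
        public static Map<String, Term>[] allTriplesIntoJava() {
            // Every triple crosses the Java/Prolog boundary; with thousands of
            // triples this is where the stack size errors show up.
            return new Query("rdf(S, P, O)").allSolutions();
        }

        /** Better: let Prolog do the exhaustive search and pass only the answer back. */
        public static boolean hasTripleWithPredicate(String predicateUri) {
            // Prolog decides whether such a triple exists or not, without
            // shipping the whole triple store across the bridge.
            return new Query("rdf(_, ?, _)", new Term[] { new Atom(predicateUri) }).hasSolution();
        }
    }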

Number 3 seems to be the choice

So, number 3 seems to be the choice. RDF data is conveniently loaded directly into the Prolog engine using the rdf_load/1 predicate from the rdf_db library, and convenience predicates, indeed any "knowledge integration" predicates, are quickly developed in a Prolog file (this can be done in Bioclipse's text editor, after which Prolog is told to consult that file via a manager method). Those predicates can then conveniently be queried from Bioclipse using the general query methods in the plugin (which do on-the-fly type checking and Prolog query construction). By keeping the actual execution inside Prolog, one can be sure to keep up the performance.
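
As a sketch of the whole round trip (the file name integration.pl and the predicate interesting_protein/1 are made up for illustration), the Java side only needs to consult the Prolog file and then query whatever predicates it defines:

    import java.util.Map;

    import org.jpl7.Query;
    import org.jpl7.Term;

    public class ConsultAndQuerySketch {
        public static void main(String[] args) {
            // Tell Prolog to consult the knowledge-integration file edited in Bioclipse.
            // The file itself would load the RDF data, e.g. with
            // :- use_module(library('semweb/rdf_db')). followed by rdf_load('mydata.rdf').
            new Query("consult('integration.pl')").hasSolution();

            // Query a convenience predicate defined in that file; the actual
            // search and reasoning stay inside the Prolog engine.
            Query q = new Query("interesting_protein(P)");
            while (q.hasMoreSolutions()) {
                Map<String, Term> solution = q.nextSolution();
                System.out.println(solution.get("P"));
            }
        }
    }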

So, now that the usage strategy is becoming clear, there's "just" the list of things to decide, reported in this previous blog post, before we have a useful integration of Blipkit/Prolog.