GSoC2010

SMWRDFConnector Architecture - mental picture

This is my current mental picture of the architecture of the parts included in the RDF import/export functionality I'm implementing for Semantic MediaWiki as part of my Google Summer of Code project. I just got the ARC2-based store functional. The functionality still to be implemented is drawn with dashed lines:

 - - - - -   - - - - -
 | Export |  | Import |   
 - - - - -   - - - - -
      ^          |     
      |          v     
  - - - - - - - - - -   ---------------
 | Equiv URI handler |->|  SMW Writer |
  - - - - - - - - - -   ---------------
          ^                   |
          |                   v
 ---------------------  ---------------
 | SPARQL+ Interface |  |     SMW     |
 ---------------------  ---------------
          ^        ______/    |
          |       v           v
  -----------------     ----------------
  |  ARC2 Store   |     | MediaWiki DB |
  -----------------     ---------------- 
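
To make the import path in the picture (Import -> Equiv URI handler -> SMW Writer) a bit more concrete, here is a rough sketch in Python. It is not the actual PHP code, and every class, method and URI in it (EquivUriHandler, WikiWriter, import_triples, the example.org URIs) is made up for illustration. It only shows the idea of translating URIs into wiki page titles before facts are written to the wiki.

```python
class EquivUriHandler:
    """Hypothetical stand-in for the 'Equiv URI handler' box."""

    def __init__(self, uri_to_title=None):
        # Known mappings from external URIs to wiki page titles.
        self.uri_to_title = dict(uri_to_title or {})

    def to_title(self, uri):
        # Use a registered equivalent URI if there is one ...
        if uri in self.uri_to_title:
            return self.uri_to_title[uri]
        # ... otherwise fall back to the last path or fragment segment.
        return uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


class WikiWriter:
    """Hypothetical stand-in for the 'SMW Writer' box: collects facts per page."""

    def __init__(self):
        self.pages = {}

    def add_fact(self, page, prop, value):
        self.pages.setdefault(page, set()).add((prop, value))


def import_triples(triples, handler, writer):
    """Translate each (subject, predicate, object) triple and hand it to the writer."""
    for s, p, o in triples:
        page = handler.to_title(s)
        prop = handler.to_title(p)
        # Objects that look like URIs become page titles; literals are kept as-is.
        value = handler.to_title(o) if o.startswith("http://") else o
        writer.add_fact(page, prop, value)


# Example data, assumed for illustration only.
handler = EquivUriHandler({"http://example.org/chem/CH4": "Methane"})
writer = WikiWriter()
import_triples(
    [("http://example.org/chem/CH4", "http://example.org/chem#boilingPoint", "-161.5")],
    handler,
    writer,
)
print(writer.pages)
```

The export direction would go the other way around, turning page titles back into URIs before the triples leave the wiki.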

Working ARC2 RDF Store connector committed

Now I have a working RDF store connector for Semantic MediaWiki that uses ARC2's RDF store rather than SMW's built-in store. This will make it possible to take advantage of functionality in ARC2, such as the possibility to set up a SPARQL endpoint.

Thanks to Alfredas Chmieliauskas for the Joseki store connector in the SparqlExtension for SMW, which this connector is heavily based upon.

The ARC2 connector implements the same subset of the SMWStore API as the JosekiStore, but I'm not yet sure whether more needs to be implemented for the things we want to do (general RDF import/export). Gotta figure that out.

The code is available in the Google Code repository trunk, and installation instructions are on the Google Code wiki.

Feel free to try it out, but be warned that it has only been very briefly tested so far!

Back on track GSoC:ing

Back on track GSoC:ing. Follow my progress on Twitter.

GSoC project started

I started the actual coding for GSoC this Wednesday (the start was delayed 2 weeks because of exams, which I'll catch up on). I'm still just getting up to speed, but am now looking into the PHP RDF framework ARC, whose RDF store will replace the currently used RAP store in Semantic MediaWiki. Using ARC itself looks very straightforward. I just have to figure out the SMW Store API. Looking at SMWRapStore2.php now, to get an idea.

If you want to follow my progress in (approximate) real time, follow me on Twitter.

First look at code - Thoughts and questions

Coding for my GSoC project will start for real around June 9th, but I just had a first look at the code, to start wrapping my head around the things involved. I installed the following on my local SMW:

I tested the RAP-based SPARQL endpoint, played a bit with SMWWriter, and tried to get a grip on how to best use existing functionality for implementing RDF import/export.

Some questions that arose (for Denny in the first place, I guess, but feel free to comment):

  1. For implementing a SPARQL endpoint, should an ARC RDF store be implemented as a separate store that mirrors the content in SMW, similar to how the RAP store does it, or is there any way to get around the need for an extra store?
  2. An alternative starting point for implementing an ARC store, in addition to the RAP store connector already available in SMW, seems to be the JosekiStore connector in SparqlExtension. But of course, one can look at both.
  3. Where can the functionality related to the equivalent URI property be found? In SMW_Exporter.php? Other places too?
  4. SMWWriter depends on an outdated version of PageObjectModel, while the latter
    seems to have stabilized (no changes since 2008). What to do about that?
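
For question 1, a rough Python sketch of what a separate store that mirrors the wiki content could look like. This is not SMW, RAP or ARC code; the class and method names (MirrorStore, update_page, delete_page, match) and the example facts are assumptions made up for illustration. The idea shown is simply delete-then-insert: whenever a page is saved, all triples about that page are dropped and replaced with the current ones, so the extra store never drifts away from the wiki.

```python
class MirrorStore:
    """Hypothetical triple store kept in sync with wiki pages."""

    def __init__(self):
        self.triples = set()

    def delete_page(self, page):
        # Drop every triple whose subject is the given page.
        self.triples = {t for t in self.triples if t[0] != page}

    def update_page(self, page, facts):
        # Delete-then-insert keeps the mirror consistent after each page save.
        self.delete_page(page)
        self.triples |= {(page, prop, value) for prop, value in facts}

    def match(self, s=None, p=None, o=None):
        # Very small stand-in for a query: None acts as a wildcard.
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


# Example page saves, assumed for illustration only.
store = MirrorStore()
store.update_page("Methane", [("Has formula", "CH4")])
store.update_page("Methane", [("Has formula", "CH4"), ("Is a", "Alkane")])
print(store.match(s="Methane"))
```

The cost of this approach is that the data lives in two places, which is exactly what the question about avoiding an extra store is getting at.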

Playing with SMWWriter

We are probably going to use SMWWriter for extending the RDF import/export functionality of Semantic MediaWiki, so I wanted to test it out a bit.

With some copy and paste of code from this page, I quickly had a MediaWiki special page set up, where I could make use of SMWWriter's internal API to implement a crude form for adding or removing "triples" in my Semantic MediaWiki. See screenshot:

And the result, on the Methane page:

Looks promising. Connecting this with some ARC functionality for parsing SPARQL and RDF/XML should be a big step in the right direction.

Testing set up of SPARQL endpoint for SMW using RAP and NetAPI

(For my internal documentation, mostly)

Got Eclipse for PHP up and running with XDebug

Got Eclipse for PHP up and running now, with XDebug. Yay :) It was a snap to install on my Ubuntu box. I basically followed this and this blog post. (The Ubuntu package for XDebug is php5-xdebug.)

The Eclipse dialogs had changed location and structure a little, so for my documentation, I included a screenshot of the dialog under "Run > Debug configurations" below.

GSoC Project accepted

Just learned that my proposed GSoC 2010 project, "General RDF export/import in Semantic MediaWiki" (as documented here), was accepted! That's some good news! :) My mentor will be Denny Vrandečić from the Semantic MediaWiki community.

The project is surely going to be a challenge, but it is a highly motivating one, so I'm very much looking forward to it, and hopefully to solving things together with my mentor and the community, and to learning a lot.

I posted a (slightly shortened) copy of the project proposal and my bio here.

The project will be continuously documented here on this blog, so keep an eye on it if you are interested (use the GSoC2010 tag to filter out relevant posts). Community discussion will likely happen on the SMW-Devel mailing list, and if you want to contact me directly, you can do that at samuel dot lampa at gmail dot com or on Skype: samuel_lampa.

My current status/schedule is:

  • This week: Very busy, finishing thesis report.
  • Next 2 weeks (though starting a little this week): Get the dev. environment up and running (leaning towards Eclipse with PDT) and look at the code
  • 12/5: Briefing with Denny
  • Up until June 9th: Very busy period with exams on 24/5 and 8/6.
  • On June 9th: Start coding! (So the coding start will be a little delayed, but I will make up for that, no worries! :) I'm not too used to having spare time anyway.)