I have a new blog!

Editing Semantic MediaWiki from Bioclipse (with Screencast)

The original use case behind the RDFIO Semantic MediaWiki extension, which I developed as part of Google Summer of Code 2010, was to hook up SMW with Bioclipse, and that is now taking concrete shape. With the new Bioclipse SMW Module (code here) it is now possible, for the first time, to add and remove SMW facts from inside Bioclipse, using a little Bioclipse JS script:

var wikiURL = "http://drugmet.rilspace.org/wiki/";
smw.addTriple( "w:Caffeine", "w:is_a", "w:Molecule", wikiURL );

Removing triples is similar:

var wikiURL = "http://drugmet.rilspace.org/wiki/";
smw.removeTriple( "w:Caffeine", "w:is_a", "w:Molecule", wikiURL );

You can use full URIs too, but the "w" prefix references wiki article titles directly. Thus you can view the result of the addition at http://drugmet.rilspace.org/wiki/Caffeine
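
To make the prefix idea a bit more concrete, here is a minimal sketch of how a "w:"-prefixed name could be mapped to the URL of the wiki page it refers to. Note that wikiPageURL is a hypothetical helper, not part of the Bioclipse SMW module, and it simply assumes that the article title is appended to the wiki base URL, as in the Caffeine example:

```javascript
// Hypothetical helper (not part of the Bioclipse SMW module):
// maps a "w:"-prefixed name to the URL of the wiki page it refers to,
// assuming the title is simply appended to the wiki base URL.
function wikiPageURL(name, wikiURL) {
    var prefix = "w:";
    if (name.indexOf(prefix) !== 0) {
        // Not a "w:" name; treat it as a full URI and return it unchanged.
        return name;
    }
    return wikiURL + name.substring(prefix.length);
}

var wikiURL = "http://drugmet.rilspace.org/wiki/";
var caffeinePage = wikiPageURL("w:Caffeine", wikiURL);
// caffeinePage is "http://drugmet.rilspace.org/wiki/Caffeine"
```

Titles with spaces or special characters would probably need extra escaping, which this sketch leaves out.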

What does this mean? Well, one thing is that with Bioclipse you can edit facts in SMW with the ease and power of JavaScript! This could enable scenarios where an SMW is prepopulated with data for subsequent community editing, after which the data can be transferred back to Bioclipse again (as Egon Willighagen has already blogged about), possibly making community editing of scientific data mainstream!
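
As a rough illustration of the prepopulation scenario, the sketch below adds a whole batch of facts in a loop. It only assumes the addTriple( subject, predicate, object, wikiURL ) call shown earlier; the addFacts helper, the fakeSmw stand-in and the Theobromine fact are made up for the example, so that the script can also be run outside Bioclipse:

```javascript
// Sketch: add a batch of facts through any object that offers
// addTriple(subject, predicate, object, wikiURL), as the smw manager does.
function addFacts(manager, facts, wikiURL) {
    for (var i = 0; i < facts.length; i++) {
        var f = facts[i];
        manager.addTriple(f[0], f[1], f[2], wikiURL);
    }
    return facts.length;
}

// Stand-in for the Bioclipse smw manager so the sketch runs anywhere;
// inside Bioclipse you would pass the real smw object instead.
var recorded = [];
var fakeSmw = {
    addTriple: function (s, p, o, url) {
        recorded.push([s, p, o, url]);
    }
};

// Illustrative facts; only the first one comes from the example above.
var facts = [
    ["w:Caffeine", "w:is_a", "w:Molecule"],
    ["w:Theobromine", "w:is_a", "w:Molecule"]
];
var added = addFacts(fakeSmw, facts, "http://drugmet.rilspace.org/wiki/");
```

Inside Bioclipse the call would presumably be addFacts( smw, facts, wikiURL ), with one request to the wiki per fact.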

There's a little convenience method for getting all RDF data from the SMW too:

rdfStore = rdf.createInMemoryStore();
rdfStore = smw.getRDF( "http://drugmet.rilspace.org/wiki/" );

... which you can then query locally with SPARQL, using the rdf manager:

result = rdf.sparql( rdfStore,
                     "SELECT DISTINCT ?p WHERE { ?s ?p ?o } LIMIT 10" );
js.print( result );

...getting some output like so:

[["p"],
["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"],
["http://www.w3.org/2000/01/rdf-schema#domain"],
["http://www.w3.org/2000/01/rdf-schema#range"],
["http://www.w3.org/2000/01/rdf-schema#subPropertyOf"],
["http://www.w3.org/2000/01/rdf-schema#subClassOf"],
["http://semantic-mediawiki.org/swivt/1.0#wikiPageModificationDate"],
["http://semantic-mediawiki.org/swivt/1.0#wikiNamespace"],
["http://www.w3.org/2000/01/rdf-schema#isDefinedBy"],
["http://semantic-mediawiki.org/swivt/1.0#page"],
["http://www.w3.org/2000/01/rdf-schema#label"]
]
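
If you want to work with such a result further in a script, something like the following sketch could be used to turn it into a list of objects keyed by variable name. It assumes that the result can be treated as an array of rows where the first row holds the variable names, as the printed output suggests; I have not verified what type rdf.sparql actually returns, and rowsToBindings is just a name made up for this example:

```javascript
// Sketch: turn a header-plus-rows table into an array of objects,
// assuming the first row lists the variable names.
function rowsToBindings(table) {
    var header = table[0];
    var bindings = [];
    for (var r = 1; r < table.length; r++) {
        var binding = {};
        for (var c = 0; c < header.length; c++) {
            binding[header[c]] = table[r][c];
        }
        bindings.push(binding);
    }
    return bindings;
}

// Shortened copy of the output above.
var table = [
    ["p"],
    ["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"],
    ["http://www.w3.org/2000/01/rdf-schema#label"]
];
var bindings = rowsToBindings(table);
// bindings[0].p is "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
```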

That's it! Oh, well, I demonstrated the same thing with a screencast as well:

Yay! :) While getting this to work, we ran into a number of bugs in RDFIO, which also resulted in a new release:

RDFIO 0.5.0 released

Version 0.5.0 of RDFIO, the MediaWiki extension providing a PHP-based SPARQL endpoint and RDF import capabilities to Semantic MediaWiki (and previously developed as part of my GSoC 2010 project), is now released.

The 0.5.0 release

The 0.5.0 release fixes numerous bugs that Egon Willighagen and I encountered while working to hook up SMW with Bioclipse, as I blogged about earlier. We now have this connection up and running, so hopefully we have tracked down most of the relevant bugs. A short list of changes can be found in the changelog. Download links and install instructions can be found on the extension page.

I hope to blog/screencast about the new Bioclipse->SMW editing functionality shortly.

Autumn is here - new projects starting

Now that autumn is here, I have some new projects starting, running for approximately this month (before jumping out into the real world and trying to get a real job :) ). The good thing is that it all builds upon previous work.

First thing is, I'll work with Egon Willighagen to hook up Bioclipse with Semantic MediaWiki, via the RDFIO extension, which I developed as part of GSoC this year, mentored by Denny Vrandečić. I'm excited about putting RDFIO to some real-world use!

Second thing is, I'll work part time at the Bioclipse group to improve the ways user documentation for plugins is authored and published, enabling some automation of publishing content to the Bioclipse website as well as the wiki etc. This will hopefully make it a lot easier for end users to find the extra functionality they want, and make more users see the value of a research platform with an open and modular structure, such as Bioclipse!

RDFIO 0.4.0 released - GSoC Finished!

With the release of RDFIO 0.4.0, my GSoC 2010 project is now over!

I especially want to thank my mentor Denny Vrandečić, and also the SMW community at large, for a great time! I also want to sincerely thank my master's project mentor Egon Willighagen, who mentioned the program and encouraged me to apply. Without this encouragement, I'd never have taken the step. It has been a good and rewarding time, and I've much enjoyed having time to get a bit into MediaWiki/SMW extension development, as well as providing some new functionality to these great bits of software.

The main GSoC coding is now over, and I will need to take a little break for an exam this Friday, but I'll surely continue to refine the RDFIO extension later, especially as Egon and I are looking into using it to integrate Bioclipse with SMW in order to make RDF data in Bioclipse "community editable" (could turn out to be some really useful stuff!).

The 0.4.0 release

This new release brings a lot of refactoring and reworking under the hood, as well as quite a few minor bugfixes and improvements here and there, so upgrading is recommended. The issues in SMWWriter have also been fixed now, so patching it is no longer needed, which will hopefully make installing easier!

Among the more notable changes is the improved selection of wiki titles on import, as described in this blog post. Another important fix is that the default output format (if none is specified) is now "SPARQL Resultset XML", which makes the SPARQL endpoint fully "SPARQL compliant" and queryable from typical SPARQL tools like Jena. It remains an open question, though, how to allow update operations without leaving the endpoint wide open ... i.e. how to implement some form of user rights checking when it is used as a web service.

A little technical note: RDFIO now takes the $wgDBprefix parameter into account, so if you are using RDFIO with table prefixes in the database, you will need to regenerate the tables and the triples in the store (this can be done at the Special:ARC2Admin and Special:SMWAdmin pages respectively).

I should not end without a note about some great bits of existing code that I've had the pleasure to make use of:

  • ARC, the PHP RDF library I've been using, and on which RDFIO is very heavily dependent.
    I found it very powerful and super convenient to work with!
  • I enjoyed making use of the SMWWriter and PageObjectModel extensions, which also definitely made life easier for me and saved me tons of work.

Sensible wiki titles on RDF import with "pseudo RDF namespaces"

This week I finished the last remaining items on my todo list for my Google Summer of Code project (which is available in the form of the RDFIO MediaWiki extension). Those items, which I also mentioned in my last blog post, were to:

  • Add ability to use ("pseudo") namespaces for general RDF entities (non-properties) in order to choose wiki titles for them on RDF import.
  • Add a screen that shows URIs lacking a namespace prefix to abbreviate them with.

Regarding the first point, it might not be overly easy to see the usefulness of it at once, so I just created a screencast to show the difference between using it and not:

It demonstrates the problem of choosing sensible wiki titles for general RDF entities when no good property for naming (such as rdfs:label) is available, since "entity" URIs often consist of just meaningless IDs, and often no namespace prefixes are defined for them. RDFIO lets you add "pseudo" namespaces (using a simplified splitting pattern, not necessarily consistent with the XML namespace specs) in order to get around this problem.
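
For those who prefer code to screencasts, the basic idea can be sketched in a few lines of JavaScript. This is not RDFIO's actual code (RDFIO is written in PHP), just an illustration of the principle: the abbreviate function, the "cmpd" prefix, the example.org base URI and the "prefix:rest" title format are all assumptions made up for this example:

```javascript
// Sketch: abbreviate a URI using a table of pseudo-namespaces.
// The longest matching base URI wins; unmatched URIs are returned as-is.
function abbreviate(uri, pseudoNamespaces) {
    var bestBase = "";
    var bestPrefix = null;
    for (var prefix in pseudoNamespaces) {
        if (!pseudoNamespaces.hasOwnProperty(prefix)) continue;
        var base = pseudoNamespaces[prefix];
        if (uri.indexOf(base) === 0 && base.length > bestBase.length) {
            bestBase = base;
            bestPrefix = prefix;
        }
    }
    if (bestPrefix === null) return uri;
    return bestPrefix + ":" + uri.substring(bestBase.length);
}

// Hypothetical base URI and prefix, for illustration only.
var pseudoNamespaces = { "cmpd": "http://example.org/data/compound/" };
var title = abbreviate("http://example.org/data/compound/2519", pseudoNamespaces);
// title is "cmpd:2519"
```

The point is only that a long, unreadable URI becomes something short enough to serve as a wiki title, while the mapping back to the full URI is kept in the prefix table.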

  • The new functionality is so far only available in the svn trunk
  • More info, install instructions etc on the extension page

Hopefully I'll find time to also demonstrate the second point above, as well as the "filter by ontology" feature for the SPARQL endpoint, with screencasts early next week.

Otherwise, I'll use the coming week for refactoring the currently quite unmanageable code and adding comments, and hopefully also for adding the feature to filter RDF export by an [[Export RDF::false]] SMW property (which was the "if time permits" item on my TODO list).

RDFIO 0.3.0 released

I just created a new release of the RDFIO MediaWiki extension. A somewhat detailed list of the changes can be found in the change log. The relevant links:

The filter by ontology / vocabulary feature


New in this release is an "export by ontology" feature that - when possible - restricts the URIs used for a wiki page to only those that appear in an ontology that the user points to. To give an idea of this feature, see the following screenshot:

On the page "Samuel", I have one fact:

[[has blog::http://saml.rilspace.org]]

... and on the page "Property:has_blog", there are a number of facts, including:

[[Equivalent URI::http://xmlns.com/foaf/0.1/weblog]]
[[Equivalent URI::http://example.org/ExampleOntology/weblog]]

What happens when submitting the form in the screenshot is this: if I select only "Output Equivalent URIs", both of the above facts will be exported, but by enabling "Filter by Vocabulary" and setting the URL to FOAF's definition file, the export will be filtered to contain only the first fact, which is included in the FOAF definition.
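
The filtering step itself can be illustrated with a small JavaScript sketch. Again, this is not how RDFIO implements it (RDFIO is PHP, and reads the terms from the vocabulary file the user points to); the filterByVocabulary function and the short foafTerms list standing in for the parsed FOAF definition are assumptions for the sake of the example:

```javascript
// Sketch: keep only the equivalent URIs that are defined in a vocabulary.
function filterByVocabulary(equivalentURIs, vocabularyURIs) {
    var known = {};
    for (var i = 0; i < vocabularyURIs.length; i++) {
        known[vocabularyURIs[i]] = true;
    }
    var kept = [];
    for (var j = 0; j < equivalentURIs.length; j++) {
        if (known[equivalentURIs[j]] === true) kept.push(equivalentURIs[j]);
    }
    return kept;
}

var equivalentURIs = [
    "http://xmlns.com/foaf/0.1/weblog",
    "http://example.org/ExampleOntology/weblog"
];
// Tiny stand-in for the terms found in the FOAF definition file.
var foafTerms = [
    "http://xmlns.com/foaf/0.1/weblog",
    "http://xmlns.com/foaf/0.1/name"
];
var exported = filterByVocabulary(equivalentURIs, foafTerms);
// exported contains only "http://xmlns.com/foaf/0.1/weblog"
```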

Current GSoC status

From the "remaining TODO list" from my last blog post, the following are finished with this release:

  1. In the SPARQL endpoint, enable querying by any URI specified as equivalent URI
  2. For RDF export, implement an "export by ontology" option that - when possible - restricts the URIs used for a wiki page to only those that appear in an ontology that the user points to.

The remaining items are now:

  1. Create an HTML interface for interactively configuring how wiki titles should be chosen for RDF entities for which no preferred "wiki title property" (such as rdfs:label, dc:title etc.) was found.
  2. Add "pseudo namespaces" as an option for choosing wiki titles from general RDF URIs (not only properties!). I.e., the possibility to abbreviate a part of a URI into a pseudo-namespace, making the URI more fit for use as a wiki title. (For properties, there is often a well-known abbreviation for the corresponding vocabulary/ontology's base URI, but this is often not the case for general RDF entities, which can often come from some user-defined data etc.)
  3. If time permits:
  • Implement filter by "export rdf" property.

New release of RDFIO (0.2.0) with security fixes

Just a note that I have created a new release of the RDFIO MediaWiki extension. It contains important security fixes, adding at least some basic checking of user rights and protection against CSRF (cross-site request forgery) to the SPARQL endpoint, the RDF import form etc. Thus, it's highly recommended to upgrade if you are using the extension on a public wiki!

Also, you might already have seen:

Current GSoC status

Otherwise, Denny and I just confirmed the remaining TODO list for my GSoC project, which is what I am starting to work on now:

  1. In the SPARQL endpoint, enable querying by any URI specified as equivalent URI
  2. For RDF export, implement an "export by ontology" option that - when possible - restricts the URIs used for a wiki page to only those that appear in an ontology that the user points to.
  3. Create an HTML interface for interactively configuring how wiki titles should be chosen for RDF entities for which no preferred "wiki title property" (such as rdfs:label, dc:title etc.) was found.
  4. Add "pseudo namespaces" as an option for choosing wiki titles from general RDF URIs (not only properties!). I.e., the possibility to abbreviate a part of a URI into a pseudo-namespace, making the URI more fit for use as a wiki title. (For properties, there is often a well-known abbreviation for the corresponding vocabulary/ontology's base URI, but this is often not the case for general RDF entities, which can often come from some user-defined data etc.)

There are also some extra additions that I'll look into if time permits, like adding support for filtering the output on export with a property such as "RDF export::False", as suggested by Daniel Herzig.

Screencast: Installing Semantic MediaWiki and RDFIO from scratch on Ubuntu

In a previous blog post I demonstrated the RDFIO extension for Semantic MediaWiki with a screencast, but said nothing about installation.

By testing, I realized that the install procedure was VERY painful. I have now (with much valuable help from Oleg Simakoff) corrected a number of errors in the instructions and the code, and added command-line snippets for Linux/Ubuntu to the install instructions. I also created a screencast which goes through the steps from scratch (except Apache/MySQL/PHP setup) in a little more than 5 minutes. Hope this makes things easier for you testers! (And if you try it out, please report any bugs or issues in the issue tracker!)

Sorry for the low volume level! Didn't realize that while recording ... :/

Screencast: RDF Import and SPARQL "Update" in Semantic MediaWiki

So, for those of you who think the install instructions for the RDFIO Semantic MediaWiki extension I'm working on are a bit daunting, but who would still like a glimpse of what my GSoC project is up to, I created a short (3:20) screencast demonstrating (ARC2-based) RDF import and SPARQL "Update" functionality for some example data. (Sorry for the lame speaking ... :P ... didn't sleep for a looong time )

The screencast shows how you can import RDF/XML into Semantic MediaWiki and then use the SPARQL endpoint to insert or remove data to/from articles, even using the original format of the RDF that you imported earlier.

(For those of you who decide to try installing it, please have a look at the error fixing happening in this thread.)

Moved to new SVN repository (please update links)

I just moved to a new Google Code repository, reflecting the name change of the MediaWiki extension from "SMW RDF Connector" (it's awfully long, isn't it) to "SMW RDFIO", or just "RDFIO", so please update your links!

See also the newly created extension page, which will be the hub for information about the extension in the future.