I'm an M.Sc. student (graduating in Aug 2010) in Biotechnology at Uppsala University, focusing on Semantic Web, Bioinformatics and Systems Biology. I'm also a MediaWiki and Drupal geek, more or less =D.
This blog is currently used for documentation of the following projects:
-- Samuel Lampa - firstname.lastname@gmail.com
With the release of RDFIO 0.4.0, my GSoC 2010 project is now over!
I especially want to thank my mentor Denny Vrandečić, and also the SMW community at large, for a great time! I also want to sincerely thank my master's project mentor Egon Willighagen, who mentioned the program and encouraged me to apply. Without this encouragement, I'd never have taken the step. It has been a good and rewarding time, and I've much enjoyed having time to get a bit into MediaWiki/SMW extension development, as well as to provide some new functionality to these great bits of software.
The main GSoC coding is now over, and I will need to take a little break for an exam this Friday, but I'll surely continue to refine the RDFIO extension later, especially as Egon and I are looking into using it to integrate Bioclipse with SMW in order to make RDF data in Bioclipse "community editable" (could turn out to be some really useful stuff!).
This new release brings a lot of refactoring and reworkings under the hood, as well as quite a few minor bugfixes and improvements here and there, so upgrading is recommended. We also got the issues in SMWWriter fixed now, so patching it is no longer needed, which hopefully will make installing easier!
Among the more notable changes is the improved selection of wiki titles on import, as described in this blog post. Another important fix is that the default output format (if none is specified) is now "SPARQL Resultset XML", which makes the SPARQL endpoint fully "SPARQL compliant" and queryable from typical SPARQL tools like Jena. It remains an open question, though, how to allow update operations without leaving the endpoint wide open, i.e. how to implement some form of user rights checking when it is used as a web service.
A small technical note: RDFIO now takes the $wgDBprefix parameter into account, so if you are using RDFIO with table prefixes in the database, you will need to regenerate the tables and the triples in the store (this can be done at the Special:ARC2Admin and Special:SMWAdmin pages, respectively).
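To give a rough idea of what a client sees when the endpoint answers in the SPARQL result set XML format, here is a minimal Python sketch that parses such a document into plain dictionaries. The sample XML is hand-written for illustration (it is not actual RDFIO output), and the function name parse_sparql_results is made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format
SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hand-written sample (an assumption for illustration, not real endpoint output)
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="s"/>
    <variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="s"><uri>http://example.org/Samuel</uri></binding>
      <binding name="o"><literal>hello</literal></binding>
    </result>
  </results>
</sparql>
"""


def parse_sparql_results(xml_text):
    """Return one dict per result row, mapping variable name to its value."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        row = {}
        for binding in result.findall("sr:binding", SPARQL_NS):
            # The first child is the value element: <uri>, <literal> or <bnode>
            row[binding.get("name")] = binding[0].text
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in parse_sparql_results(SAMPLE):
        print(row)
```

The sketch only reads the text of each binding and ignores datatypes and language tags, which is enough to show the shape of the format.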
I should not end without a note about some great bits of existing code that I've had the pleasure to make use of:
This week I finished the last remaining items on my todo list for my Google Summer of Code project (which is available in the form of the RDFIO MediaWiki extension). Those items, which I also mentioned in my last blog post, were to:
Regarding the first point, it might not be overly easy to see the usefulness of it at once, so I just created a screencast to show the difference between using it and not:
It demonstrates the problem of choosing sensible wiki titles for general RDF entities when no good property for naming is available (such as rdfs:label etc.), since "entity" URIs often consist of just meaningless IDs, and often no namespace prefixes are defined for them. RDFIO lets you add "pseudo" namespaces (using a simplified splitting pattern, not necessarily consistent with the XML namespace specs) to get around this problem.
Hopefully I'll find time to also demonstrate the second point above, as well as the "filter by ontology" feature for the SPARQL endpoint, with screencasts early next week.
Otherwise, I'll use the coming week for refactoring the currently quite unmanageable code and adding comments, and hopefully also for adding the feature to filter RDF export by an [[Export RDF::false]] SMW property (which was the "if time permits" item on my TODO list).
I just created a new release of the RDFIO MediaWiki extension. A somewhat detailed list of the changes can be found in the change log. The relevant links:
New for this release is an "export by ontology" feature that, when possible, restricts the URIs used for a wiki page to only those that appear in an ontology that the user points to. To give an idea of this feature, see the following screenshot:

On the page "Samuel", I have one fact:
[[has blog::http://saml.rilspace.org]]
... and on the page "Property:has_blog", there are a number of facts, including:
[[Equivalent URI::http://xmlns.com/foaf/0.1/weblog]]
[[Equivalent URI::http://example.org/ExampleOntology/weblog]]
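The selection idea behind "export by ontology" can be sketched in a few lines of Python. This is only an illustration of the principle, not the extension's actual PHP code: the function name select_export_uris, the extra FOAF URI in the example ontology, and the fallback of keeping all equivalent URIs when none match are assumptions made for this example.

```python
def select_export_uris(equivalent_uris, ontology_uris):
    """Keep only the equivalent URIs that appear in the chosen ontology.

    Assumption: if none of them appear in the ontology, fall back to all
    equivalent URIs (one reading of restricting "when possible").
    """
    matching = [uri for uri in equivalent_uris if uri in ontology_uris]
    return matching if matching else list(equivalent_uris)


# The two Equivalent URI facts from the "Property:has_blog" page above
has_blog_uris = [
    "http://xmlns.com/foaf/0.1/weblog",
    "http://example.org/ExampleOntology/weblog",
]

# Hypothetical set of URIs found in the ontology the user points to
foaf_ontology = {
    "http://xmlns.com/foaf/0.1/weblog",
    "http://xmlns.com/foaf/0.1/name",
}

if __name__ == "__main__":
    print(select_export_uris(has_blog_uris, foaf_ontology))
```

With a FOAF-like ontology selected, only the FOAF URI would be used for the "has blog" property on export.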
From the "remaining TODO list" from my last blog post, the following are finished with this release:
The remaining items are now:
Just to inform you that I created a new release of the RDFIO MediaWiki extension. It contains important security fixes, adding at least some basic checking for user rights and CSRF (cross-site request forgery) to the SPARQL endpoint, RDF import form etc. Thus, it's highly recommended to upgrade if you are using the extension on a public wiki!
Also, you might already have seen:
Otherwise, Denny and I just confirmed the remaining TODO list for my GSoC project, which is what I'm starting to work on now:
There are also some extra additions that I'll look into if time permits, like adding support for filtering the output on export with a property such as "RDF export::False", as suggested by Daniel Herzig.
In a previous blog post I demonstrated the RDFIO extension for Semantic MediaWiki with a screencast, but showed nothing about installation.
When testing, I realized that the install procedure was VERY painful. I have now (with much valuable help from Oleg Simakoff) corrected a number of errors in the instructions and the code, and added command-line snippets for Linux/Ubuntu to the install instructions. I also created a screencast which goes through the steps from scratch (except Apache/MySQL/PHP setup) in a little more than 5 minutes. Hope this makes things easier for you testers! (And if you try it out, please report any bugs or issues in the issue tracker!)
Sorry for the low volume level! Didn't realize that while recording ... :/
So, for those of you who might think the Install instructions for the RDFIO Semantic MediaWiki extension I'm working on are a bit daunting but would like a glimpse of what my GSoC project is up to anyway, I created a short (3:20) screencast demonstrating (ARC2 based) RDF Import and SPARQL "Update" functionality for some example data. (Sorry for the lame speaking ... :P ... didn't sleep for a looong time )
The screencast shows how you can import RDF/XML into Semantic MediaWiki and then use the SPARQL endpoint to insert or remove data to/from articles, even using the original format of the RDF that you imported earlier.
(For those of you who decide to try to install, please have a look at the error fixing happening in this thread.)
I just moved to a new Google Code repository, reflecting the name change of the MediaWiki extension from "SMW RDF Connector" (it's awfully long, isn't it?) to "SMW RDFIO", or just "RDFIO", so please update your links!
See also the newly created extension page, which will be the hub for information about the extension in the future.
(Figured I better do regular status updates, as a lot of small things tend to get missed if blogging only when there is something to show off)
As you might have seen among the GSoC2010 tagged posts, I've had a rudimentary RDF/XML import and a SPARQL endpoint (only for querying so far!) up and running for a while. You should be able to set these up yourself by following one or more of the instructions in the Google Code repo:
I have since worked a bit on some use cases, which revealed a lot of intricacies to take into account on RDF import. One of them was a spinoff discussion, from a blog post by Egon Willighagen, which quite nicely outlines one of the motivations for having general RDF import in MediaWiki (read post, read discussion).
The last few days I've been working on heavily refactoring the import code, so that it is more general and easier to modify in new ways. There is still a lot to be improved in the code, like error handling, documentation, adding more options etc., so feel free to give feedback on the code! (Especially RDFImporter.php and EquivalendURIHandler.php, and preferably use the mailing lists: semediawiki-devel, semediawiki-user or mediawiki-l)
The RDF import seems to be the most challenging part of my project (and the export feature depends heavily on it), since it is the part where I'm breaking a bit of new ground, so feedback here is very welcome.
The single most challenging issue is how to select reasonable wiki titles for RDF entities on RDF import, based on the RDF data (one relevant blog post here). The need to export the page with its original URI should not limit the choice directly, since this is already solved by storing the original URI as a property on each page.
The thoughts we have had so far, in short, are:
Using only the label of course carries the risk that multiple RDF entities convert to the same wiki title, which is not acceptable, for example, when using the wiki as a "one time RDF editor", which is one of the motivations for this project.
To solve this, one alternative (as a configurable option) could be to use a pseudo-namespace in the wiki title (e.g. "go" in the above example, which would result in "go:0032283" as the wiki title). This could be configured by creating a mapping between base URIs and pseudo-namespaces (i.e. "http://bio2rdf.org/go:" and "go", in this case).
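As a rough Python sketch of how such a mapping could be applied (not the extension's actual PHP implementation): the names PSEUDO_NAMESPACES and uri_to_wiki_title are made up for this example, and returning the URI unchanged when no base URI matches is an assumption.

```python
# Mapping from base URIs to pseudo-namespaces, using the example from the text
PSEUDO_NAMESPACES = {
    "http://bio2rdf.org/go:": "go",
}


def uri_to_wiki_title(uri, mapping=PSEUDO_NAMESPACES):
    """Turn an entity URI into a wiki title such as "go:0032283".

    The longest matching base URI wins, so more specific mappings take
    precedence. Assumption: with no match, the URI is returned unchanged.
    """
    for base in sorted(mapping, key=len, reverse=True):
        if uri.startswith(base):
            local_part = uri[len(base):]
            return f"{mapping[base]}:{local_part}"
    return uri


if __name__ == "__main__":
    print(uri_to_wiki_title("http://bio2rdf.org/go:0032283"))
```

Because the pseudo-namespace is part of the title, two entities with the same local ID but different base URIs would end up on different wiki pages.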
But then there is the question of how to configure this mapping. We've been thinking of a few options:
I will keep working on this and try to figure out the most reasonable strategy together with Denny (who is my GSoC mentor), but feedback and comments are always welcome! (As said, preferably send feedback on the mailing lists: semediawiki-devel, semediawiki-user or mediawiki-l!)
If you want to follow the project progress, see the status page for options.
I ran into some trouble debugging with XDebug in Eclipse for PHP Developers / PDT (breakpoints stopped being caught after I changed the location of my www folder, which I figured out later), so I wanted to document the full setup procedure here. I mainly followed this blog post. (Assuming you have Apache and PHP set up!)
apt-get install php5-xdebug
zend_extension=/usr/lib/php5/20090626+lfs/xdebug.so
xdebug.remote_enable=On
xdebug.remote_host="localhost"
xdebug.remote_port=9000
xdebug.remote_handler="dbgp"
# xdebug.remote_log="/tmp/xdebug.log"
/etc/php5/conf.d/xdebug.ini.ucf-dist
sudo apache2ctl restart

Then an important thing: if you have changed the location of your www folder from /var/www, don't miss this! This was what caused the problem for me, with breakpoints not being caught:

Most probably, the extended RDF import/export in Semantic MediaWiki, which I'm working on, will be made an extension, to start with (at least that's my idea).
At first I was calling it "RDFIO", but fearing that the name was too general or not descriptive enough, I switched to "SMW RDF Connector", and the project currently lives in a repo called SMW RDF Connector.
Now, realizing that this is awfully long, I'm thinking of going back to "RDFIO", or rather "SMW RDF IO", before it's "too late" (I haven't even created a MediaWiki extension page yet). It seems like the most clever, short choice after all.
Any comments?