Last week was G8 week in Northern Ireland, so hey! Let's shout our indignation! And our 99% hatred of the 1% bastards! And hasta la victoria! And… not really… It was also a significant week for open data efforts in the western world, and in the EU in particular.
Open Data, a term most often applied to data held by governments and public administrations, is nowadays an international force, spreading among governments, organisations, and very active communities of 'civic hackers'. As Wikipedia says, at the most basic level it's about releasing data without technical or copyright restrictions. At a more technically advanced level, it also implies releasing data in formats that make it highly reusable.
Open Good Propositions
The G8 has released an Open Data Charter, in which the member countries pledge to support more open data policies and technical solutions in future, by means of five principles, which are more about policy than about technical aspects. While this is probably a significant step forward, it's still just a sort of memorandum (at least this is what it looks like to me) and, as such, it doesn't say much about a number of issues. The political ones have been described elsewhere. For those with their hands on IT and the like, the technical issues the document leaves out are quite evident. For instance, it mentions the need to release data in open formats (with brief use of the word 'standard'), but it says nothing about any mechanism to promote a high level of data structuring and integration, such as the 5-star approach and, worse, it cites CSV as an example of the minimum that would qualify. I say worse because, although this is better than nothing, it's certainly not much better. Moreover, in defining open data, I think it isn't clear how the pledges made at the top will cope with the variety of national situations and departments, including technical backwardness and cultural resistance from bureaucrats.
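To make the CSV complaint a bit more concrete, here is a minimal sketch of the gap between a bare CSV export and the next step up in the 5-star scheme, where things are named with URIs and expressed as RDF. Everything in it is an assumption made for illustration: the `http://example.org/` base URI, the column names, the sample row, and the helper functions `row_to_ntriples` and `csv_to_ntriples` are all invented, and a real project would reuse an established vocabulary and an RDF library rather than build strings by hand.

```python
import csv
import io

# Hypothetical base URI, invented for this illustration only.
BASE = "http://example.org/"

# A tiny made-up CSV export; the columns and values are not real data.
CSV_TEXT = "id,label,category\nA1,Example item,demo\n"


def row_to_ntriples(row, key_column="id"):
    """Turn one CSV row into N-Triples lines.

    The value in key_column names the subject; every other column
    becomes a predicate with a plain string literal as its object.
    """
    subject = f"<{BASE}resource/{row[key_column]}>"
    triples = []
    for column, value in row.items():
        if column == key_column:
            continue
        predicate = f"<{BASE}property/{column}>"
        # Escape backslashes and double quotes inside the literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} {predicate} "{escaped}" .')
    return triples


def csv_to_ntriples(text, key_column="id"):
    """Convert a whole CSV document (given as a string) to N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.extend(row_to_ntriples(row, key_column))
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(CSV_TEXT):
        print(line)
```

The point is not the conversion itself, which is trivial, but what it forces you to decide: what each row is about, and what each column means. A CSV file leaves both to the reader's guesswork, while URIs give other datasets something to link to.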
Bringing Semantics to Open Data
At European Union level, these issues were tackled in detail at the 2013 edition of the Semantic Interoperability Community conference (SEMIC 2013), which I attended. The meeting is organised by the ISA programme, which, in turn, is part of the EU's efforts to promote integration at various levels, such as IT services, digital data and legal matters. ISA works on semantic integration and open data, promoting standardisation, including the use of common vocabularies, work practices and concrete projects (pilots). One of the most prominent is Joinup, a catalog of integration-related descriptions, such as open source software, policies and data sets.
In a field where a CSV export is way too often the best you can find in open data, it was nice to hear about so many projects based on RDF and on SKOS vocabularies or the like. That's pretty good and there's room for further improvement. For instance:
- Marco Pellegrino raised questions about the best relationship that EUROSTAT should have with its users, questions that apply equally well to virtually any organisation trying to give information to the public.
- James Hendler, one of the fathers of the Semantic Web, highlighted how we still need to bring ontology people and database people together. As he put it: "I spend much time trying to convince database people to use more semantics and to convince ontology people to stop with all that semantics."
- Giorgia Lodi presented a piece of government-related work that is really impressive, considering the limited number of people who did it. Her talk confirmed what is already known about countries like Italy: a lot of good will from individuals and groups, who deliver brilliant results despite the lack of coordination and money. If you understand Italian, you can read about similar concerns here. It's also clear that such situations affect the success of programmes like ISA, and it's not clear (to me) how their proposals are received in member states.
- Vassilios Peristeras stressed the importance of a bottom-up approach and of leveraging existing solutions. This is certainly positive, considering how many passionate civic hackers are currently working on these themes. However, I don't know how realistic it is, and it doesn't work well for things like standards and ontologies. In such cases a bi-directional approach, top-down plus bottom-up, works better, as is the case, for instance, for the OBO ontologies.
The SEMIC program and most presentations are available here. I've also put together a personal (i.e., subjective) summary.