More than six months ago the Romanian government launched the data.gov.ro website, a portal for open public data. Given the low opinion I have of Romanian politicians, I was rather skeptical about the future of this project. Then I visited the site again last week and realised that the project has started to develop rather nicely and that it has some potential (I have been told that positive reinforcement has been proven effective in some cases).
So, in my attempt to focus more on “positive developments” in Romania, I will look at the half-full part of the glass and present what data.gov.ro has to offer its visitors. For a review of the site as it was at launch (in Romanian), see Ovidiu Voicu’s post on La colțu’ străzii.
Before getting to the data, some thoughts on the site itself. The portal is still in the “beta” stage, and sometimes it shows. For instance, users can change the language of the site using a menu in the footer. While the list of available languages is impressive (they probably used the widget with its default settings), the site is still not fully translated even when English is selected. On a more positive note, the portal is built using CKAN, the data management system that seems to be the standard for open data portals (see examples here). So even if the Romanian portal does not yet look like the British portal, in theory such a thing could happen (fingers crossed!).
Users can create an account on the website. This offers additional functionality, such as adding datasets (I did not try it; I suppose there are some restrictions on who can post datasets on a governmental portal), saving searches and so on. The beta status of the portal was visible here as well: I asked for a password reset and all I got was a server error message. So make sure you remember the password you use when you create your account.
Users can reach the data in three different ways: by dataset, by organisation, or by group. At the moment the “groups” feature is the least developed: there are only four defined groups (public acquisitions, European funds, classifications, and reports), and only six datasets are assigned to them. I think what they had in mind was some sort of tag- or keyword-based grouping, but it was badly implemented. Notice again the incomplete translation of the content into English.
Trying to access datasets by “organisations” leads to a list of 52 organisations, presented on three pages (21 institutions per page, in a 3-by-7 grid), that seem to be sorted alphabetically. I say “seem to be” because there are some exceptions to this ordering (Ministerul Muncii, for instance, comes before Ministerul Mediului). Only 27 of the 52 organisations have datasets assigned to them, so I choose to hope that the remaining 25 will soon “free” their own datasets. Clicking on the name of an organisation opens its profile page, which lists all the datasets that belong to the organisation as well as some additional information about it (description, groups, tags, dataset formats, licenses). This idea also lacks a proper implementation: most of the organisations have not added any information to their profiles. It is the usual problem with basic profiles created automatically by the administrator of the portal: profile owners are very slow when it comes to adding content to their profiles.
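Since the portal runs on CKAN, the split between organisations with and without datasets could in principle be counted through CKAN's Action API instead of by clicking through the pages. What follows is only a minimal Python sketch. It assumes the portal exposes the standard `/api/3/action/` endpoints under `https://data.gov.ro`, which is an assumption and not something confirmed here, and the helper names (`action_url`, `fetch_result`, `split_by_datasets`, `count_publishers`) are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the portal serves CKAN's standard Action API under this base URL.
BASE_URL = "https://data.gov.ro"


def action_url(base_url, action, **params):
    """Build the URL of a CKAN Action API call such as organization_list."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_result(url, timeout=30):
    """Call the API and return the 'result' field of CKAN's JSON envelope."""
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN call failed: {payload.get('error')}")
    return payload["result"]


def split_by_datasets(organisations):
    """Separate organisations that publish datasets from those that do not.

    Expects the dictionaries returned by organization_list with all_fields,
    where 'package_count' holds the number of datasets.
    """
    with_data = [o["name"] for o in organisations if o.get("package_count", 0) > 0]
    without_data = [o["name"] for o in organisations if o.get("package_count", 0) == 0]
    return with_data, without_data


def count_publishers(base_url=BASE_URL):
    """Fetch the organisation list and return (with datasets, without datasets)."""
    url = action_url(base_url, "organization_list", all_fields="true")
    with_data, without_data = split_by_datasets(fetch_result(url))
    return len(with_data), len(without_data)
```

Calling `count_publishers()` would perform the actual network request. Some CKAN versions cap the number of organisations returned when `all_fields` is set, so on a real portal the call may need to be repeated with the `limit` and `offset` parameters.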
And now, let’s take a look at the datasets. The first thing anyone will notice is that there are not many datasets available to potential data users. Regardless of how you try to spin the facts, in the end the portal offers only 129 datasets. Below you will find the corresponding numbers from the similar portals for the EU, USA, and UK (the last two are explicitly named by the Romanian portal as examples it is trying to emulate). This is one of those cases where there is really nothing left to add to the data.
OK, at least we do have some data that the government is willing to share with us. The next graph looks at the distribution of formats used for the 129 datasets distributed through the open data portal. Since some datasets are offered in multiple formats, the percentages add up to more than 100%. Almost three quarters of the datasets are offered as Excel spreadsheets, while only 13% are available in the truly open CSV (comma-separated values) format. So there is some room for improvement.
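To make the "more than 100%" point concrete, here is a small Python sketch of how such per-format shares can be computed: each dataset is counted once for every format it offers, and each count is divided by the number of datasets. The `format_shares` helper and the four-dataset sample are invented for illustration and are not the portal's real figures.

```python
from collections import Counter


def format_shares(datasets):
    """Return {format: percentage of datasets offered in that format}.

    `datasets` is a list in which each item lists the formats of one dataset.
    """
    total = len(datasets)
    if total == 0:
        return {}
    counts = Counter()
    for formats in datasets:
        # Normalise and de-duplicate, so a dataset with two XLS files counts once.
        counts.update({f.strip().upper() for f in formats})
    return {fmt: 100 * n / total for fmt, n in counts.items()}


# Made-up sample of four datasets, NOT the real content of the portal.
sample = [["xls"], ["xls", "csv"], ["csv", "pdf", "ods"], ["kmz"]]
shares = format_shares(sample)

for fmt, share in sorted(shares.items()):
    print(f"{fmt}: {share:.0f}%")
print(f"Sum of shares: {sum(shares.values()):.0f}%")
```

In this toy sample the shares sum to 175%, because two of the four datasets appear under more than one format.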
The two datasets that are available as PDF and as ODS can be ignored: since they are also available as CSV files, the other formats are useless. The two KMZ files contain the boundaries of the Romanian administrative units for NUTS 1, NUTS 2, and NUTS 3. I have yet to load the data into QGIS, but having NUTS 3 boundaries available for free is always nice for students and under-funded academics.
Most of the data available on the portal are related to budgets: the national budget, budgets of several public institutions, budgets of some cities and so on. I have never been interested in budgets, but I think I should congratulate all the institutions that offered these datasets for being more transparent than they used to be about how they spend public funds. In addition to the budget data, browsing through the list of datasets available on the site suggests one can also find data on:
- military participation in international missions
- climate, air quality, pollution
- elections, referenda
- boundaries of administrative units
- lists of churches, schools, hospitals, monuments, museums
- lists of translators, public notaries, bailiffs
There are three more things worth mentioning. First, do not expect good data documentation. Most datasets do not have even the most basic codebook, so you will have to do some additional research to find out definitions, units of measurement and other data characteristics. Second, you have to check and clean the data. Data quality varies across datasets, so even though you got the data from the government’s open data portal, you cannot assume that the data are clean. Finally, because I wanted to finish the review on a positive note, almost all datasets are offered under the Open Government License.
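As an illustration of the "check and clean" step, a first pass over a downloaded CSV file could look like the Python sketch below. It only flags rows whose number of cells differs from the header and counts empty cells per column, which is a starting point and not a full quality check. The `profile_csv` helper and the sample text are hypothetical and do not come from any dataset on the portal.

```python
import csv
import io


def profile_csv(text, delimiter=","):
    """Return basic quality indicators for CSV content given as a string."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return {"columns": [], "rows": 0, "ragged_lines": [], "empty_cells": {}}
    header, body = rows[0], rows[1:]
    # Line numbers (1-based, header is line 1) of rows with a wrong cell count.
    ragged = [n for n, row in enumerate(body, start=2) if len(row) != len(header)]
    # Number of blank cells found under each column name.
    empty = {name: 0 for name in header}
    for row in body:
        for name, cell in zip(header, row):
            if not cell.strip():
                empty[name] += 1
    return {
        "columns": header,
        "rows": len(body),
        "ragged_lines": ragged,
        "empty_cells": empty,
    }


# Made-up sample: one missing value and one row with an extra cell.
sample = "unit,amount\nA,100\nB,\nC,5,extra\n"
report = profile_csv(sample)
print(report)
```

For a real file, the text could be read with `open(path, encoding="utf-8", newline="").read()`; both the encoding and the delimiter are assumptions that would need checking for each dataset, precisely because the documentation is so thin.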
I will let you discover the rest of what the Romanian open data portal has to offer. I know I said I would look at the half-full part of the glass but, after really looking in detail at what the portal is offering, I do not think there has been enough progress since it was launched. The only thing that has changed in six months is the number of available datasets, which went from 27 to 129. Nothing has been done, however, to improve the functionality of the portal.
Design: 7 / 10
Usability: 7 / 10
Content: 4 / 10
Average: 6 / 10