The buzz around open data has begun to subside. Those voicing caution about both the immediate impact and the long-term success and utility of open data initiatives are being heard and (hopefully) heeded. Even as everything from multilateral data dumps to grassroots data initiatives takes off, open data evangelists are discovering that creating a database does not automatically lead to accessibility, participation or transparency. In the case of international aid data, as Stephen Davenport pointed out in the Guardian blog, key questions like "who updates the data?" and "what can they really use it for?" are often left unanswered.
Even with a well-designed database, as James Ball noted at the 1st International Open Data Dialogue: "Open data is often what the government is happy for you to know. Freedom of Information Acts provide what the public actually wants to know." Any open data initiative must be accompanied by a progressive Freedom of Information Act to achieve its goal. A quick visit to Kenya Open Data, for example, will turn up poverty rates by county, but not over time, and without data sets like disease rates or food aid by county for comparison. Linet Kwamboka, who leads the Kenya Open Data Initiative and the Open Government Partnership at the Kenya ICT Board, admits that the absence of an FOI act is a primary reason why Kenyan open data "is taking so damn long."
The challenges of participatory governance are not new. Many of Philipp Mueller's Four Dimensions of Open Statecraft — processes, community management, platform selection and security/trust — also apply to open data initiatives, where the relationship between data provider and data consumer is crucial to success. Organizations like the World Bank are attempting to tackle these open government data issues with a toolkit.
Following basic supply-and-demand principles, the process of developing an open data initiative should involve everyone expected to either provide or consume the data. In some places, open data initiatives launch without the buy-in of all government ministries, leading to a sparse and nearly unusable database despite the good intentions of some departments (in the case of Kenya, the national initiative was led by the Ministry of Communications). The European Union has been unable to make its data available due to the reluctance of several member states.
However, from a user's point of view, the information request process on Ask the EU prioritizes user access and is hopefully a precursor to an open data initiative. Its interface enables all citizens to submit inquiries without having to explain why they are seeking the information or where it might be found. The onus is on the source of that data to identify and provide the information. PublicData.eu, funded by the EU, seeks to circumvent the formal EU structure by enabling access to data linked from different databases. To be successful, the database development process should foster participation across all parts of the coordinating body, ensuring that the database meets the demonstrated needs of the entire community.
For users, the user interface, data formats and visualization tools are almost as important as the quality of the data itself. The platform should be developed and piloted to ensure that even novices in data analysis are able to find and download data in usable formats and explore various visualization strategies. All data should be cleaned and offered in several machine-readable formats, not trapped in PDFs that require scraping or manual re-entry. A forum where users can post anything from general questions about identifying relevant datasets for their research to specific requests for guidance on downloading and analyzing the data will increase the chances of success for target populations unaccustomed to working with data. In some FOI and open data systems, data requests, responses and the resulting visualizations are publicly available and searchable on the same site for greater transparency. This additional information provides users with leads for good data as well as ideas for displaying their analysis. Tools should be designed to lower the data literacy bar for users and democratize data.
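The point about offering the same data in several machine-readable formats rather than locking it in PDFs can be made concrete with a minimal sketch. The records and field names below are hypothetical, invented purely for illustration; the idea is simply that one cleaned dataset should be serialized once per format (here CSV for spreadsheet users and JSON for programmatic consumers) using only standard-library tools.

```python
import csv
import io
import json

# Hypothetical example records (illustrative values only, not real statistics).
records = [
    {"county": "Example County A", "poverty_rate": 43.8},
    {"county": "Example County B", "poverty_rate": 22.0},
]

def to_csv(rows):
    """Serialize rows as CSV text, the format most spreadsheet users expect."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["county", "poverty_rate"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def to_json(rows):
    """Serialize the same rows as JSON for developers and visualization tools."""
    return json.dumps(rows, indent=2)

csv_text = to_csv(records)
json_text = to_json(records)
```

Publishing both outputs side by side means no user has to scrape or re-key the data before analyzing it.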
As the World Bank discovered, making data available online is only one step toward community involvement and a culture of participation. Common strategies for encouraging interaction with data include hackathons to develop apps that explore data, data visualization competitions, and training for journalists and civil society groups on how to extract important information and then tell stories that go beyond charts and tables. Broad participation remains a major challenge for such competitions, as reflected by a Guardian competition on development data that was open only to participants from developed countries. This highlights that opening data is only part of the stimulus needed to provoke behavior change among the journalists and civil society members expected to convert data into useful information.
Data providers should also meet users on their home turf. By lending their data and expertise to data journalism training programs, or by uploading their data sets to local data hubs such as Africa Open Data and LandPortal.info, providers can integrate their data into the regular workflow of target audiences. That way, a journalist can access data from a non-profit on local land ownership and compare it to international investment in land restitution in the same place. Eliminating the barrier separating Big Data from local data gathered by grassroots NGOs can make the data more relevant to communities and also inform data providers about the data literacy and data needs of their target audience. UNDP, which has just opened its data, and the Government of Argentina, whose Supreme Court just issued a landmark decision ordering the government to provide access to information, can learn from other development data and open data initiatives about how to engage their constituencies.
As open data initiatives spread beyond giant multilateral institutions, trust and security become vital to success, and a loss of trust can derail the entire process. National and local open data projects endorsed by, or developed in collaboration with, respected institutions will have a far greater impact if the data is kept up to date and used regularly.
Small-scale data collectors should find a centralized hub for their data and leave the work of promoting it to the hub and the Google search engine. A million micro-data hubs with no purpose other than making data public will just add to the data chaos. Users need to be able to trust the credibility of the data. Data providers need to be truthful about their data collection methodology, and credibility increases when methods are standardized and data is linked to a central hub. Low levels of data literacy naturally provoke distrust and skepticism, which must be counteracted by consistent, complete and reliable data endorsed and vetted by reputable institutions that provide not just what the institution wants to give, but what citizens need to know.
Open data will only be open to a few until the amount, quality and accessibility of data catch up with demand. Today, if I wanted to research mismanagement of aid data, I might have to check the donor's open data site (if it has one), transparency sites like OpenSpending.org or OpenContracts.org, uncover private sector involvement through OpenCorporates.org, and then take my chances with the national FOIA or open data initiative for the rest. As data becomes more available from the hyper-local level to the global, it should also become less daunting. Even if these initiatives never work like magic, the gap between open data and transparency will narrow as open data initiatives are built smarter, with users in mind.