Steve Bennett blogs

…about maps, open data, Git, and other tech.


7 reasons to release that government dataset

As a data guru in residence, I’m helping government bodies prioritise which datasets to release as open data. Sometimes people say “No one would ever find this data interesting, so why bother releasing it?” I think there are several distinct reasons why a given dataset might be worth releasing. Some datasets are valuable for several reasons simultaneously. Some aren’t valuable at all.

When a public servant comments that a potential dataset isn’t interesting or useful, ask: “Are there other reasons to release it?”

But if a dataset fails to meet any of these criteria? You have my permission not to release it.

#1 Build an app around it

Census Explorer, by Yuri Feldman, allows easy exploration of part of the 2011 Australian Census.

Datasets like public transport timetables, public bike share station status, or parking space availability are obvious candidates for third-party developers to build an app around. Unfortunately, these examples also require near-realtime feeds in order to be useful.

#2 Support other apps

Even if a dataset isn’t interesting or useful enough to warrant an app in its own right, it could add value to another website or app if it’s easy to use. I’ve come across many of these:

  • Average traffic volume on roads maintained by VicRoads, used to help cyclists decide which roads to avoid.
  • The slope of footpaths around Melbourne can help wheelchair users navigate the city.
  • The location and species of every tree in Melbourne can add colour and interest to a map of the city.
  • Locations of drinking fountains could be useful for cycling, jogging, or dog walking apps or websites.
VicRoads traffic volume

Which way would you cycle to Port Albert?

#3 Interesting for research

If a dataset is big, rich, detailed and high quality, then there’s a pretty good chance it’s worthy of some kind of analysis. If it’s unique enough, it might even prompt a researcher to start a research project just to study it.

Examples: building permits database, public transport timetables (for urban planning).

#4 Supporting other research

Much more common than such rich datasets are the small datasets that researchers find useful to solve particular problems, add context, or strengthen an analysis. Local Government Area boundaries aren’t inherently interesting, but they’re among the geospatial datasets that researchers most often request. The ATO’s Standard Business Reporting taxonomy sounds incredibly dry to me, but it’s of potential use to lots of people trying to glue different kinds of data and applications together.

#5 Policy and analysis

Lots of organisations need government data to develop internal strategies or policies to be shared with the public – or even to influence government. Typically they get the data either by transcribing tables from official reports, or by developing direct relationships with the government body in question. Publishing data directly to an open data portal allows a wider range of groups to make use of it, without the overhead of having to ask whether the data is available. Data that is collected regularly, in a consistent format, is particularly likely to be useful.

#6 Transparency

If the data relates to how government decisions are made, it may be worth releasing to demonstrate transparency – regardless of how much the dataset is actually used. For example, releasing annual budget data as an easy-to-use spreadsheet makes a big political statement about willingness to be scrutinised. Even if no citizen takes up the opportunity to crunch the numbers, they may still appreciate having that option.

Examples: annual budgets, revenue sources (parking meters, speeding fines), parliamentary voting records.

#7 Insights for government

If you’re really lucky, the dataset you publish may help another part of government do something useful. I think good things happen when people can access data without having to ask anyone for it, and the same goes for government itself. You can’t really expect insights, but if it happens – great.


The Data Guru in Residence

Cross-posted at Code for Australia.

Last week, Code for Australia launched its first fellowship program, a four-month project in which a civic-minded developer will try a new approach to helping government solve problems with their data. For the next few months, I’ll be the Data Guru in Residence. The program got a brief mention in The Age. My goals are to find interesting and useful datasets, help make them public, and do fun stuff with them. It’s a kind of test run for the Code for Australia hacker-in-residence program currently being developed. Since I work for the University of Melbourne, I’ll be targeting datasets that are useful for researchers, and using VicNode to store data wherever it’s needed.

To start with, I’m spending some time with the CityLab team at City of Melbourne. They’re very progressive on the open data front, and their Open Data Platform has some really high quality datasets, like the 70,000-tree Urban Forest or the Development Activity Monitor, which contains detailed information on property developments.
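As a rough illustration (a sketch, not the portal’s documented workflow): City of Melbourne’s Open Data Platform runs on Socrata, so any of its datasets can be pulled as JSON through the SODA API. The dataset ID below is a placeholder – you’d look up the real ID on the portal’s dataset page.

```python
import urllib.parse

# Base SODA endpoint for City of Melbourne's Socrata-hosted portal.
BASE = "https://data.melbourne.vic.gov.au/resource"

def soda_url(dataset_id: str, limit: int = 100) -> str:
    """Build a SODA API query URL for a given Socrata dataset ID.

    `dataset_id` is the short code shown on the portal (e.g. "abcd-1234",
    a placeholder here, not a real Melbourne dataset ID).
    """
    params = urllib.parse.urlencode({"$limit": limit})
    return f"{BASE}/{dataset_id}.json?{params}"

# To actually fetch rows (network required):
# import urllib.request, json
# rows = json.load(urllib.request.urlopen(soda_url("abcd-1234", 50)))
```

Because every dataset on the portal shares this one endpoint shape, a third-party app can swap datasets in and out just by changing the ID – which is exactly the low-friction reuse the posts above argue for.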

“Living, Breathing Melbourne”, our GovHack Project, would be so much better with live data feeds.

First on the radar: finding live feeds for the city’s pedestrian sensors and bike share stations. I’d love to incorporate these into the successful GovHack project, Living Breathing Melbourne, built with Yuri Feldman and Andrew Chin. There’s also lots of interesting data from the Census of Land Use and Employment, with immense detail on how floorspace is divided up between residential, retail, commercial and so on. There are Mahlstedt fire plans, LIDAR data, and a really detailed, textured 3D model of the CBD. And of course there’s other data that’s already public, but whose full potential hasn’t yet been realised.

If you’re from a government body (Federal, State, Council, or agency), based in or around Melbourne and you could use the services of a Data Guru, please get in touch!