Steve Bennett blogs

…about maps, open data, Git, and other tech.

OpenStreetMap vector tiles: mixing and matching engines, schemas and styles

For my next web mapping project, we’ll use vector tiles. Great. And the data will come from OpenStreetMap. Excellent. Now you only have five more questions to answer.

For the front-end web application developer who wants to stick a map in their site, vector tiles open up lots of options and flexibility, but also lots of choices.

  • Display engine: which JavaScript library is going to actually draw stuff in the browser?
  • Style: how will you tell the display engine what colour to draw each thing in the schema?
  • Data schema: what kinds of data are contained in the tiles, what are the layers called, and what are the attributes available?
  • Tile transport: how will the engine know where to get each tile from?
  • File format: how is the data translated into bytes to store within a tile file?

We’re very lucky that in a couple of these areas, a single standard dominates:

Which leaves three decisions to make.

Display engine

There are several viable options for displaying your vector tiles, depending on whether you also want to display raster tiles or need creative styling, if WebGL (IE11+ only) is ok, and what else you need to integrate with.

Screenshot 2017-08-22 23.26.11.png

Mapbox Terrain, a style rendered with Mapbox-GL-JS.

  • Mapbox-GL-JS: the industry leader, made by Mapbox, uses webGL, focused on the needs of mass-market maps for large web companies. It has excellent documentation, great examples and very active development.
  • Tangram: made by Mapzen, also uses WebGL, has more experimental and creative features like custom shaders.
  • OpenLayers: a fully-featured, truly open source mapping library primarily built for raster tiles, but with vector tile support. (Disclaimer: I’ve never used OpenLayers, I’m just reading docs here.)

There are other combinations as well, such as Leaflet.VectorGrid.


The style mechanism tends to be closely tied to the display engine. (That was also true of CartoCSS, which was a pre-processor for Mapnik. RIP).

  • Mapbox Style Specification is a single JSON file which defines sources (vector tiles, GeoJSON files, raster tiles etc) and their display as layers (circles, fills, lines, text, icons etc), including properties that depend on zoom and/or data values. It also has some fiddly details for displaying custom fonts and symbols. Supported by Mapbox-GL-JS and Mapbox.js, but no third-party front-end libraries that I’m aware of. (Geoserver, a Java-based web application seems to have support.) Styles can be created with Mapbox Studio, Maputnik (free, open source) or by hand.
  • OpenLayers style is a JSON object for OpenLayers. It doesn’t seem to exist as a file format per se. (I’m not sure why the demo above just uses a ton of JavaScript statements rather than this style object.)
  • Tangram scene file, a YAML format which covers a bit more than just styling data, such as cameras and lighting.


Finally, there are three distinct, well-defined schemas for packaging OpenStreetMap data into vector tiles. There doesn’t seem to be a formal specification for how you define a schema, so each is presented as documentation: a list of layers, each with a list of attributes (and their possible values), and at which zoom levels they appear.

  • Mapbox Streets v7 (22 layers): a highly processed version of OpenStreetMap data optimised for simplicity and performance, geared towards general mapping applications. Layer and attribute names often reflect original OSM tag names (“waterway, class=stream”) but not slavishly (“road, class=link”, “road,
    Dark Matter (OpenMapTiles)

    Dark Matter, a Mapbox Style for OpenMapTiles.


  • OpenMapTiles (15 layers): an open schema developed by Klokan (a Swiss company) “in cooperation with the Wikimedia foundation”. It is a bit looser with layer names (“transportation, class=minor”) and occasionally quirky (“transportation, brunnel=tunnel”)
  • Mapzen (9 layers): includes both simplified “kind=” and original OSM “kind_detail=” tags on almost every object,  making them heavier than the alternatives. Somewhat confusingly, all waterway/water features are combined into a single layer and distinguished only by geometry (line or polygon). At lower zooms, data is sourced from Natural Earth, instead of OSM – I don’t know why. (A lot of work goes into these decisions!)

Matching schemas and styles

Now, the style needs to be designed for the schema: if the schema contains a layer called “roads”, your style can’t be expecting a layer called “transportation”. But it also needs to be expressed in the right format supported by the engine: don’t go feeding no YAML to Mapbox-GL-JS.

For instance:

  • Mapbox Basic uses the Mapbox Streets schema, and is expressed in the Mapbox

    Tron, a highly stylised style from Mapzen for Tangram.

    Style Specification. And hence can be rendered by Mapbox-GL-JS, or OpenLayers. (Other standard Mapbox styles include Mapbox Streets, Mapbox Terrain and Mapbox Dark)

  • Cinnabar uses the Mapzen schema, and is expressed as a Tangram scene file. (Other Mapzen styles include Bubble Wrap, Tron, Zinc, Walkabout and Refill).
  • Klokantech Basic uses the OpenMapTiles schema, and is expressed in Mapbox-GL-JS. (Other OpenMapTiles styles include Positron, Dark Matter, OSM Bright, Toner and Fjord Color).

These styles kind of live within their company affiliations, however. How about styles rendered by one company’s engine, using data from a different schema:

  • Tilezen uses Mapzen’s schema, but is rendered with Mapbox-GL-JS. Demo. (There are also Mapzen examples for OpenLayers and D3). This token effort by Mapbox achieves the same thing.
  • This example uses OpenMapTiles, rendered using Tangram.

Mixing and matching

Which brings us to the point of this post. How do you mix schemas and styles? That is, how do you take a style you designed for Mapbox Streets, and make it work on OpenMapTiles? Or port one of Mapzen’s kooky open-licensed styles so it works with Mapbox Streets? Well, you can’t – yet.

(Adapting a style from one engine to another, like what ol-mapbox-style does, is a tough ask, because engines’ capabilities differ.)

But adapting a Mapbox Style file from one OpenStreetMap schema to another? That seems totally doable – even if there isn’t yet a tool to make that happen.

My quick little proof of concept in NodeJS converted OpenMapTiles’ “OSM Bright” style (left) to versions for Mapbox Streets (centre) and Mapzen (right).

Screenshot 2017-08-17 21.52.54

Want to give me a hand? Get in touch!



2015’s proudest moments

Self-doubt is awful, so this is for Future Steve: here are lots of things you did in 2015 that you can be proud of!

  1. Created Swagger API documentation for PTV’s real-time API, as part of the VicTripathon hackfest. PTV’s documentation is a developer’s nightmare. My version has live testing of the API built in and links to various wrappers in different languages.
  2. Designed a process for judging mobile apps (way outside my area of expertise!) for VicTripathon, across 4 different platforms, with help from Installr. Measure of success: the devs said it was ok.
  3. Created a fun way to explore the combined tree inventories of a dozen councils across Australia.
  4. Created uses open data published by councils in standard formats to tell you if it’s bin night where you live. Provides a nice incentive for councils publishing their first datasets.
  5. Created lightweight, open standards so that councils publishing the same data can do so in an interoperable way. And some of them are actually using it!
  6. The OpenCouncilData toolkit (co-created with Ellen Bicknell), provides a lot of guidance and useful background for councils embarking on the open data journey.
  7. A constantly updated map showing exactly how many datasets each council is currently publishing.
  8. csv-geo-au: A standard for publishing spatial data that refers to regions such as ABS statistical divisions or Australian commonwealth electoral divisions.
  9. Helping get VicRoads and Geelong’s street-level imagery on Mapillary.
  10. Building an automatic feed of real-time environmental data into City of Melbourne’s open data portal.
  11. A proof of concept for a super trail finder, which won a GovHack prize with help from Matt Cengia.
  12. Built 5 Terria-based maps: NEII Viewer (BOM), City of Sydney Environment and Sustainability Portal, a proof of concept for the Greater Sydney Commission, Northern Australia Investment Map, and ParlMap. The last is not public, but a really nice clone of NationalMap providing access to historical election and electoral boundary data for parliamentarians and the Parliamentary Library staff.
  13. Helped organise HealthHack: we had 10/10 happy problem owners in Melbourne, 60 participants, and a really awesome atmosphere.
  14. Wrote the definitive guide to YAML multi-line strings. (Ok, I started it last year, but most of the grunt work was this year).
  15. Participated in developing the next version of Fiscal Data Package, including a major change in presentation.
  16. Got a photo I took on the front cover of a book, Exploring the Jagungal Wilderness. (It’s also a physical book).
  17. Spoke about open data and National Map at lots of events: Local Government Spatial Reference Group, forum, MAV Technology forum (local government),  WebNetwork (local government), NewTech (state government), Open Data Government Community Forum (federal)
  18. Organised a Spatial Analytics for Policy Makers Workshop in the Victorian Government, with speakers from four different mapping platforms represented.
  19. Taught mapping workshops at La Trobe and Victoria University.
  20. Refactored and seriously improved part of the TerriaJS code that that deals with region mapping CSV files, plus much better choroplething.
  21. Designed and launched the GeoNext hackfest.
  22. Co-organised 5 cycle tours: Moe-Inverloch-Port Albert-Rosedale, Nelson-Warrnambool, Geelong-Rye-Melbourne, Tallarook-Jamieson-Benalla, Mt Beauty-Hotham-Falls Creek-Wangaratta.

Your own personal National Map with TerriaJS: no coding and nothing to deploy

National Map is a pretty awesome place to find geospatial open data from all levels of Australian government.  (Disclaimer: I work on it at NICTA). But thanks to some not-so-obvious features in TerriaJS, the software that drives it, you can actually create and share your own private version with your own map layers – without programming, and without deploying any code.

What you get:

  • A 3D, rotateable, zoomable globe, thanks the awesome Cesium library. (It seamlessly falls back to Leaflet if 3D isn’t available.)
  • Selectable layers, grouped into an organised hierarchy of your devising
  • Support for a wide range of spatial services: WMS, WFS, ESRI (both catalogs and individual layers for all of these), CKAN, individual files like GeoJSON and KML, and even CSV files representing regions like LGAs, Postcodes, States…
  • Choose your own basemap, initial camera position, styling for some spatial types, etc.

1. Make your own content with online tools

Want to create your own spatial layer – polygons, lines and points? Use and choose Save > Gist to save the result to Github Gist. (Gist is just a convenient service that stores text on the web for free).

Screenshot 2015-07-02 07.56.52

How about a layer of data about suburbs by postcode? Create a Google Sheet that follows the csv-au-geo specification (it’s easy!), download as CSV, paste it into a fresh Gist.

Screenshot 2015-07-02 08.02.28

2. Create a catalog with the Data Source Editor

Using the new TerriaJS Data Source Editor (I made this!), create your new catalog. You’re basically writing a JSON file but using a web form (thanks json-editor!) to do it.

To add one of your datasources on Gist, make sure you link to the Raw view of the page:

Screenshot 2015-07-02 08.07.02

Screenshot 2015-07-02 08.08.16

Don’t forget to select the type for each file: GeoJSON, CSV, etc.

3. Add more data

You might want to bring in some other data sources that you found on National Map. This can be a little tricky – there’s a lot of complexity in accessing data sources that National Map hides for you.

But here’s roughly how to go about it for a WMS (web map service) data source.

In the layer’s info window, grab the WMS URL

Screenshot 2015-07-02 10.53.13

You’ll need to put “” in front of a some layers, because their WMS servers don’t support CORS.

You’ll also need the value of the “Layer name” field. (For Esri layers you need to dig a bit further.)

Screenshot 2015-11-13 11.10.55

(Yes, this layer is called “2”)

Add a WMS layer, and add “Layers Names” as an additional property. So it looks like this:

Screenshot 2015-07-02 10.55.54

4. Tweak your presentation

You can add extra properties to layers to fine tune their appearance. For example, for our CSV dataset:

Screenshot 2015-07-02 10.57.49

You might want to set “Is Enabled” and “Is Shown” on every layer so they display automatically.

And finally, you might want to set an initial camera and base map, so the view doesn’t start off the west coast of Africa with a satellite view.

Screenshot 2015-07-02 10.39.55

5. Save and preview

Screenshot 2015-07-02 10.46.41

As you make changes, click “Save to Gist” to save your configuration file to a secret location on Gist. You can then click “Preview your changes in National Map”.

Screenshot 2015-07-02 11.01.19

Make a note of the Gist link so you can keep working on it in the future. You can’t modify an existing configuration, but you can load from there and save a new copy.

6. Share!

Now you have a long URL like this:

So, use or another URL shortening service to get something more useful:

After the hackathon: 4 classic recipes


Everyone loves hackathons. And almost as much, everyone loves asking “but what happens to the projects afterwards?” There’s more than one route to follow. I’d like to propose four standard recipes we can use to describe the prospects of each project.

#1: Start-up

The creators of the hack could form a business. The developers work very hard to polish up what they’ve written until it’s a viable product ready for the marketplace, and then try to build a start-up around it while probably looking for external funding.

Snap Send Solve - hackathon to start-up success story

Snap Send Solve – hackathon to start-up success story

This kind of result is very desirable for hackathon organisers because there is such a clear story of benefits and outcomes: “a few thousand dollars of sponsorship paid for a weekend hackathon which led to this $50 million start-up which makes the app your grandma uses, which is great for the economy”.

Ingredients required: Start-up mentors, entrepreneurs, a business focus from the get-go

#2: Government app - a government app in waiting? – a government app in waiting?

If you make an interesting and useful app with a government body’s data, then maybe they’d like to take it on board. They might use the code base, but it’s probably better to use the concept and vision and write the code from scratch. Imagination isn’t a government strong suit, but once they see something they like, they’re pretty good at saying “we need one of those”.

This also doesn’t seem to happen very often, but can we try harder? We should follow GovHack up with serious discussions between hack developers and the government bodies that sponsored them. Following my cheeky “” category winner last year, I did meet with Transport Safety Victoria, but didn’t really have the time or motivation to pursue it. But they were very keen, so why couldn’t we have made it work? Similarly, there was potentially money available from the Victorian Technology Innovation Fund to support GovHack projects, but no clear process meant that months of fumbling through paperwork might eventually lead to nothing. Not so appealing to developers.

Ingredients needed: A solid process, government/developer wranglers, pre-commitment to funding.

#3: Community project

Eventable would make a great community project.

Eventable would make a great community project.

If a hack is interesting and important enough to other developers, could it become a self-sustaining open source project? The idea seems plausible, but I don’t know if I’ve seen it happen. The major blockers are the hackish quality of the code itself which typically would require a major rewrite, and the sense that the weekend was fun, and this would be a lot of work. Hacks are a kind of showy facade. Once developers sit down to talk seriously about onward development, all kinds of serious difficulties start to emerge. And between the end of the weekend and the announcement of prizes a lot of momentum gets lost which can be hard to start up again.

Ingredients needed: Post-hackathon events to explore projects and establish communities.

#4: Story

Living, Breathing Melbourne - still just a story.

Living, Breathing Melbourne – still just a story.

And finally, let’s acknowledge that the most important part of many hacks is their potential as an interesting story in their own right. Anthony Mockler’s GovHack 2012 entry “Is your Pollie Smarter than a Fith Grader” isn’t a failure because it didn’t lead to a start up – it was a great story that captured a lot of attention. My team’s 2014 entry “Living, Breathing Melbourne” has been frequently referred to as a model for actual open data dashboards, even though we didn’t develop it further. We should try to extract as much value as possible from these stories, and preserve their essence, even if only in screenshots and blog posts.

Ingredients needed: Story tellers, blog posts, active engagement with journalists

In summary

Let’s think of these different paths early on when discussing projects: “This would make a great community project“, “I don’t see this going anywhere, but let’s get the story out”, “It would be a shame if the department doesn’t take this on as a government app“. And don’t write off a hack just because it didn’t fit into the mould you were thinking of. how to aggregate 373,000 trees from 9 open data sources

I try to convince government bodies, especially local councils, to publish more open data. It’s much easier when there is a concrete benefit to point to: if you publish your tree inventory, it could be joined up with all the other councils’ tree inventories, to make some kind of big tree-explorey interface thing.

Introducing: It’s fun! Click on “interesting trees”, hover over a few, and click on the ones that take your fancy. You can play for ages.

Here’s how I made it.

First you get the data

Through a bit of searching on, I found tree inventories (normally called “Geelong street trees” or similar) for: Geelong, Ballarat (both participating in OpenCouncilData), Corangamite (I visited last year), Colac-Otways (friends of Corangamite), Wyndham (a surprise!), Manningham (total surprise). It showed two results from Adelaide, and the Waite Arboretum (in Adelaide). Plus the City of Melbourne’s (open data pioneers) “Urban Forest” dataset on

Every dataset is different. For instance:

  • GeoJSON’s for Corangamite, Colac-Otways, Ballarat, Manningham
  • CSV for Melbourne and Adelaide. Socrata has a “JSON” export, but it’s not GeoJSON.
  • Wyndham has a GeoJSON, but for some reason the data is represented as “MultiPoint”, rather than “Point”, which GDAL couldn’t handle. They also have a CSV, which are also very weird, with an embedded WKT geometry (also MULTIPOINT), in a projected (probably UTM) format. There are also several blank columns.
  • Waite Arboretum’s data is in zipped Shapefile and KML. KML is the worst, because it seems to have attributes encoded as HTML, so I used the Shapefile.

Source code for

Tip for data providers #1: Choose CSV files for all point data, with columns “lat” and “lon”. (They’re much easier to manipulate than other formats, it’s easy to strip fields you don’t need, and they’re useful for doing non-spatial things with.)

Then you load the data

Next we load all the data files, as they are, into separate tables in PostGIS. GDAL is the magic tool here. Its conversion tool, ogr2ogr, has a slightly weird command line but works very well. A few tips:

  • Set the target table geometry type to be “GEOMETRY”, rather than letting it choose a more specific type like POINT or MULTIPOINT. This makes it easier to combine layers later.
    -nlt GEOMETRY
  • Re-project all geometry to Web Mercator (EPSG:3857) when you load. Save yourself pain.
    -t_srs EPSG:3857
  • Load data faster by using Postgres “copy” mode:
    –config PG_USE_COPY YES
  • Specify your own table name:
    -nln adelaide

Tip for data providers #2: Provide all data in unprojected (latitude/longitude) coordinates by preference, or Web Mercator (EPSG:3857).

CSV files unfortunately require creating a companion ‘.vrt’ file for non-trivial cases (eg, weird projections, weird column names). For example:
<OGRVRTLayer name="melbourne">
<GeometryField encoding="PointFromColumns" x="Longitude" y="Latitude"/>

The command to load a dataset looks like:
ogr2ogr --config PG_USE_COPY YES -overwrite -f "PostgreSQL" PG:"dbname=trees" -t_srs EPSG:3857 melbourne.vrt -nln melbourne -lco GEOMETRY_NAME=the_geom -lco FID=gid -nlt GEOMETRY
Source code for

Merge the data

Unfortunately most councils do not yet publish data in the (very easy to follow!) standards. So we have to investigate the data and try to match the fields into the scheme. Basically, it’s a bunch of hand-crafted SQL INSERT statements like:
INSERT INTO alltrees (the_geom, ref, genus, species, scientific, common, location, height, crown, dbh, planted, maturity, source)
SELECT the_geom,
tree_id AS ref,
genus_desc AS genus,
spec_desc AS species,
trim(concat(genus_desc, ' ', spec_desc)) AS scientific,
common_nam AS common,
split_part(location_t, ' ', 1) AS location,
height_m AS height,
canopy_wid AS crown,
diam_breas AS dbh,
CASE WHEN length(year_plant::varchar) = 4 THEN to_date(year_plant::varchar, 'YYYY') END AS planted,
life_stage AS maturity,
'colac_otways' AS source
FROM colac_otways;

Notice that we have to convert the year (“year_plant”) into an actual date. I haven’t yet fully handled complicated fields like health, structure, height and dbh, so there’s a mish-mash of non-numeric values, different units (Adelaide records the circumference of trees rather than diameter!)

Tip for data providers #3: Follow the standards, and participate in the process.

Source code for mergetrees.sql

Clean the data

We now have 370,000 trees but it’s of very variable quality. For instance, in some datasets, values like “Stump”, “Unknown” or “Fan Palm” appear in the “scientific name” column. We need to clean them out:
UPDATE alltrees
SET scientific='', genus='', species='', description=scientific
WHERE scientific='Vacant Planting'
OR scientific ILIKE 'Native%'
OR scientific ILIKE 'Ornamental%'
OR scientific ILIKE 'Rose %'
OR scientific ILIKE 'Fan Palm%'
OR scientific ILIKE 'Unidentified%'
OR scientific ILIKE 'Unknown%'
OR scientific ILIKE 'Stump';

We also want to split scientific names into individual genus and species fields, handle varieties, sub-species and so on. Then there are the typos which, due to some quirk in tree management software, become faithfully and consistently retained across a whole dataset. This results in hundreds of Angpohoras, Qurecuses, Botlebrushes etc. We also need to turn non-values (“Not assessed”, “Unknown”, “Unidentified”) into actual NULL values.
UPDATE alltrees
SET crown=NULL
WHERE crown ILIKE 'Not Assessed';

Source code for cleantrees.sql

Tip for data providers #4: The cleaner your data, the more interesting things people can do with it. (But we’d rather see dirty data than nothing.)

Make a map

I use TileMill to make web maps. For this project it has a killer feature: the ability to pre-render a map of hundreds of thousands of points, and allow the user to interact with those points, without exploding the browser. That’s incredibly clever. Having complete control of the cartography is also great, and looks much better than, say, dumping a bunch of points on a Google Map.

As far as TileMill maps goes, it’s very conventional. I add a PostGIS layer for the tree points, plus layers for other features such as roads, rivers and parks, pointing to an OpenStreetMap database I already had loaded. Also show the names of the local government areas with their boundaries, which fade out and disappear as you zoom in.

My style is intentionally all about the trees. There are some very discreet roads and footpaths to serve as landmarks, but they’re very subdued. I use colour (from green to grey) to indicate when species and/or genus information is missing. The Waite Arboretum data has polygons for (I presume) crown coverage, which I show as a semi-opaque dark green. TileMill screenshot


Source code for the TileMill CartoCSS style.

There’s also an interactive layer, so the user can hover over a tree to see more information. It looks like this:
<b>{{{common}}} <i>{{{scientific}}}</i></b>
{{#genus}}<tr><th>Genus </th><td>{{{genus}}}</td></tr>{{/genus}}

I also whipped up two more layers:

  1. OpenStreetMap trees, showing “natural=tree” objects in OpenStreetMap. The data is very sketchy. This kind of data is something that councils collect much better than OpenStreetMap.
  2. Interesting trees. I compute the “interestingness” of a tree by calculating the number of other trees in the total database of the same species. A tree in a set of 5 or less is very interesting (red), 25 or less is somewhat interesting (yellow).

Source code for makespecies.sql.

Build a website

It’s very easy to display a tiled, interactive map in a browser, using Leaflet.JS and Mapbox’s extensions. It’s a lot more work to turn that into an interesting website. A couple of the main features:

  • The base CSS is Twitter Bootstrap, mostly because I don’t know any better.
  • Mapbox.js handles the interactivity, but I intercept clicks (map.gridLayer.on) to look up the species and genus on Wikipedia. It’s straightforward using JQuery but I found it fiddly due to unfamiliarity. The Wikipedia API is surprisingly rough, and doesn’t have a proper page of its own – there’s the MediaWiki API page, the Wikipedia API Sandbox, and this useful StackOverflow question which that community helpfully shut down as a service to humanity.
  • To make embedding the page in other sites (such as Open Council Data trees) work better, the “?embed” URL parameter hides the titlebar.
  • You can go straight to certain councils with bookmarks:
  • I found the fonts (the title font is “Lancelot“) on Adobe Edge.
  • The header background combines the forces of and

Source code for treesmap.html, treesmap.js, treesmap.css.

And of course there’s a server component as well. The lightweight tilelive_server, written mostly by Yuri Feldman, glues together the necessary server-side bits of MapBox’s technology. I pre-generate a large-ish chunk of map tiles, then the rest are computed on demand. This bit of nginx code makes that work (well, after tilelive_server generated 404s appropriately):
location /treetiles/ {
# Redirect to TileLive. If tile not found, redirect to TileMill.
rewrite_log on;
rewrite ^.*/(\d+)/(\d+)/(\d+.*)$ /supertrees_c8887d/$1/$2/$3 break;

proxy_intercept_errors on;
error_page 404 = @dynamictiles;
proxy_set_header Host $http_host;

proxy_cache my-cache;

location @dynamictiles {
rewrite_log on;
rewrite ^.*/(\d+)/(\d+)/(\d+.*)$ /tile/supertrees/$1/$2/$3 break;
proxy_cache my-cache;

Too hard basket

A really obvious feature would be to show native and introduced species in different colours. Try as I might, I could not find any database with this information. There are numerous online plant databases, but none seemed to have this information in a way I could access. If you have ideas, I’d love to hear from you.

It would also be great to make a great mobile app, so you can easily answer the question “what is this tree in front of me”, and who knows what else.

In conclusion

Dear councils,

  Please release datasets such as tree inventories, garbage collection locations and times, and customer service centres, following the open standards at We’ll do our best to make fun, interesting and useful things with them.


The open data community a better map for Australian cycle tours is a tool for planning cycle tours in Australia, and particularly Victoria. I made it because Google Maps is virtually useless for this: poor coverage in the bush and inappropriate map styling make cycle tour planning a very frustrating experience.

Let’s say we want to plan a trip from Warburton to Stratford, through the hills. This is what Google Maps with “bicycling directions” offers:

Google Maps - useless for planning cycle tours.

Google Maps – useless for planning cycle tours.

Very few roads are shown at this scale. Unlike motorists, we cyclists want to travel long distances on small roads. A 500 kilometre journey on narrow backstreets would be heaven on a bike, and a nightmare in a car. So you need to see all those roads when zoomed out.

Worse, small towns such as Noojee, Walhalla and Woods point are completely missing!


Screenshot 2015-01-09 18.03.59

You can plan a route by clicking a start and end, then dragging the route around:

Screenshot 2015-01-13 23.43.12

It doesn’t offer safe or scenic route selection. The routing engine (OSRM) just picks the fastest route, and doesn’t take hills into account. You can download your route as a GPX file, or copy a link to a permanent URL.


The other major features of’s map style are:

Screenshot 2015-01-09 18.12.04Bike paths are shown prominently. Rail trails (old train lines converted into bike paths) are given a special yellow highlighting as they tend to be tourist attractions in their own right.

Train lines (in green) are given prominence, as they provide transport to and from trips.



Screenshot 2015-01-09 18.20.23Towns are only shown if there is at least one food-related amenity within a certain distance. This is by far the most important information about a town. Places that are simply “localities” with no amenities are relegated to a microscopic label.



Screenshot 2015-01-09 18.27.40Major roads are dark gray, progressing to lighter colours for minor roads. Unsealed roads are dashed. Off-road tracks are dashed red lines. Tracks that are tagged “four-wheel drive only” have a subtle cross-hashing.

And of course amenities Screenshot 2015-01-09 18.57.40useful to cyclists are shown: supermarkets, campgrounds, mountain huts, bike shops, breweries, wineries, bakeries, pubs etc etc. Yes, well-supplied towns look messy, but as a user, I still prefer having more information in front of me.


Screenshot 2015-01-09 19.11.23The terrain data is a 20 metre-resolution digital elevation model from DEPI, within Victoria, trickily combined with a 90m DEM elsewhere, sourced from SRTM (NASA). I use TileMill‘s elevation shading feature, scaled so that sea level is a browny-green, and the highest Australian mountains (around 2200m) are white, with green between. 20-metre contours are shown, labelled at 100m intervals.

I’m really happy with how it looks. Many other comparable maps have either excessively dark hill shading, or heavy contours – or both.

Screenshot 2015-01-09 19.21.02


Screenshot 2015-01-09 19.20.35


Screenshot 2015-01-09 19.20.24


Screenshot 2015-01-09 19.20.15


Mapbox Outdoors

Google Maps (terrain mode)

Screenshot 2015-01-09 19.35.37

MapBox Outdoors


Other basemaps

Screenshot 2015-01-09 19.39.25


I’ve included an assortment of common basemaps, including most of the above. But the most useful is perhaps VicMap, because it represents a completely different data source: the government’s official maps.




There are also optional overlays. Find a good spot to stealth camp with the vegetation layer.

Or avoid busy roads with the truck volume layer. This data comes from VicRoads.Screenshot 2015-01-09 19.47.43

The bike shops layer makes contingency planning a bit easier, by making bike shops visible even when zoomed way out. The data is OpenStreetMap, so if you know of a bike shop that’s missing (or one that has since closed down), please update it so everyone can benefit.

Screenshot 2015-01-14 00.06.20


Unfortunately, the site is pretty broken on mobile. But you can download the tiles for offline use on your Android phone using the freemium app Maverick. It works really well.

Other countries

Screenshot 2015-01-14 00.46.05 for Iceland. Yes, it’s real – but I don’t know how long I will maintain it.

It’s a pretty major technical undertaking to run a map for the whole world. I’ve automated the process for setting up as much as possible, and created my own version for Iceland and England when I travelled there in mid 2014. If you’re interested in running your own, get in touch and I’ll try to help out.





I’d love to hear from anyone that uses to plan a trip. Ideas? Thoughts? Bugs? Suggestions? Send ’em to, or on Twitter at @Stevage1.

Normalize cross-tabs for Tableau: a free Google Sheets tool


You want to do some visualisation magic in Tableau, but your spreadsheet looks like this:

All those green columns are dependent variables: independent observations about one location defined by the white columns.

Tableau would be so much happier if your spreadsheet looked like this:

This is called “normalizing” the “cross-tab” format, or converting from “wide format” to “long format”, or “UNPIVOT“. Tableau provides an Excel plugin for reshaping data. Unfortunately, if you don’t use Excel, you’re stuck. It’s kind of weird.


Anyway, I’ve made a Google Sheets script “Normalize cross-tab” that will do it for you.

As the instructions say, to use it, you:

  1. Reorganise your data so that all the independent variable columns are to the right of all the dependent ones; then
  2. Place the cursor somewhere in the first (leftmost) independent variable column.

It then creates a new sheet, “NormalizedResult”, and puts the result there.

How to use

It’s surprisingly clumsy to share Google Scripts, at least until the new “Add-ons” feature is mature. Here’s the best I can do for you:

1. Copy the script to the clipboard

Go to, select all the text, and copy to the clipboard.

2. Upload your spreadsheet to Google Sheets

Upload your Excel spreadsheet into Google Sheets, if it’s not there already.

3. Tools > Script Editor…

4. Click “Spreadsheet”








5. Paste

In the window labelled “”, select all the text and paste over it the script from the clipboard.

6. Save.

You need to give this script “project” a name. It doesn’t matter.

7. Select the “start” function.

8. Click Run

Click Continue and accept the authorisation request.

9. Follow the instructions of the script

Now, switch windows to your Google Sheet, and you’ll see the sidebar.

10. Download your normalised spreadsheet

On the NormalizedResult page, choose File > Download as…

Screenshot 2015-01-06 20.53.36





If you want to convert several spreadsheets, you can save yourself pain by loading them all into the same workbook. Just remember that the script will always save its output to NormalizedOutput.

7 reasons to release that government dataset

As a data guru in residence, I’m helping government bodies prioritise which datasets to release as open data. Sometimes people say “No one would ever find this data interesting, so why bother releasing it?” I think there are several distinct reasons why a given dataset might be worth releasing. Some datasets are valuable for several reasons simultaneously. Some aren’t valuable at all.

When a public servant comments that a potential dataset isn’t interesting or useful, ask: “are there other reasons to release it”?

But if a dataset fails to meet any of these criteria? You have my permission not to release it.

#1 Build an app around it

Census Explorer, by Yuri Feldman, allows easy exploration of part of the 2011 Australian Census.

Datasets like public transport timetables, public bike share station status, or parking space availability are obvious candidates for third party developers to use to build an app. Unfortunately, these examples also require near-realtime feeds in order to be useful.

#2 Support other apps

Even if a dataset isn’t interesting or useful enough to warrant an app in its own right, it could add value to another website or app if it’s easy to use. I’ve come across many of these:

  • Average traffic volume on roads maintained by VicRoads, used to help cyclists decide which roads to avoid, on
  • The slope of footpaths around Melbourne can help wheelchair users navigate the city.
  • The location and species of every tree in Melbourne can add colour and interest to a map of the city.
  • Locations of drinking fountains could be useful for cycling, jogging, or dog walking apps or websites.
Vicroads traffic volume

Which way would you cycle to Port Albert?

#3 Interesting for research

If a dataset is big, rich, detailed and high quality, then there’s a pretty good chance it’s worth of some kind of analysis. If it’s unique enough, then it might even interest a researcher in starting a research project just to look at this dataset.

Examples: building permits database, public transport timetables (for urban planning).

#4 Supporting other research

Much more common than such a rich dataset is small datasets that researchers find useful to solve particular problems, add context, or strengthen an analysis. Local Government Area boundaries aren’t inherently interesting, but they’re one of the geospatial datasets that researchers request the most often. The ATO’s Standard Business Rules taxonomy sounds incredibly dry to me, but is of potential use to lots of people trying to glue different kinds of data and applications together.

#5 Policy and analysis

Lots of organisations need government data to develop internal strategies or policies to be shared with the public – or even to influence government. Typically they get the data either by transcribing tables from official reports, or by developing direct relationships with the government body in question. Publishing data directly to an open data portal allows a wider range of groups to make use of it, without the overhead of having to ask whether the data is available. Data that is collected regularly, in the same format is a particularly likely to be useful.

#6 Transparency

If the data relates to how government decisions are made, it may be worth releasing to demonstrate transparency – regardless of how much the dataset is even used. For example, releasing annual budget data as an easy to use spreadsheet makes a big political statement about willingness to be scrutinised. Even if no citizen takes up the opportunity to crunch the numbers, they may still appreciate having that option.

Examples: annual budgets, revenue sources (parking meters, speeding fines), parliamentary voting records.

#7 Insights for government

If you’re really lucky, the dataset you publish may help another part of government do something useful. I think good things happen when people can access data without having to ask anyone for it, and the some goes for governments themselves. You can’t really expect insights, but if it happens – great.

The Data Guru in Residence

Cross-posted at Code for Australia.

Last week, Code for Australia launched its first fellowship program, a four-month project where a civic-minded developer will try a new approach to helping government solve problems with their data. For the next few months, I’ll be the Data Guru in Residence, blogging mostly to The program got a brief mention in The Age.My goals are to find interesting and useful datasets, help make them public, and do fun stuff with them. It’s a kind of test run for the Code for Australia hacker in residence program currently being developed. Since I work for the University of Melbourne, I’ll be targeting datasets that are useful for researchers, and using VicNode to store data wherever it’s needed.

To start with, I’m spending some time with the CityLab team at City of Melbourne. They’re very progressive on the open data front, and their Open Data Platform has some really high quality datasets, like the 70,000-tree Urban Forest or the Development Activity Monitor which contains detailed information on property developments.

“Living, Breathing Melbourne”, our GovHack Project, would be so much better with live data feeds.

Some of the immediate datasets on the radar are finding live feeds from the city’s pedestrian sensors and bike share stations. I’d love to incorporate these into the successful Govhack project, Living Breathing Melbourne, built with Yuri Feldman and Andrew Chin. There’s also lots of interesting data from the Census of Land Use and Employment with immense detail on how floorspace is divided up between residential, retail, commercial and so on. There areMahlstedt fire plans, LIDAR data, and a really detailed, textured 3D model of the CBD. And of course other data that’s already public, but whose full potential hasn’t yet been realised.

If you’re from a government body (Federal, State, Council, or agency), based in or around Melbourne and you could use the services of a Data Guru, please get in touch!

Chromecast in the real world: six casting workflows

For such a simple device, Google’s Chromecast has created a surprisingly complex network of technology at my place.

Google says that Chromecast works roughly like this:

Chromecast Google view

Actually it’s more complicated than that. My setup is about as simple as you can get (no NAS, no existing media servers, no Netflix or Hulu or Foxtel or anything), and it looks like this:


Chromecast in the real world

Chromecast in the real world. Not so simple, really.


Six casting workflows

That is, depending on what exactly I want to watch and how, I have to choose between 6 different workflows:

  1. YouTube? Just go to the YouTube website in Chrome, and click the Chromecast button in the video window. This works really well. Great for music playlists, too.
  2. iView or SBS? Go to the site, and use the Google Cast extension to “TabCast”. This works so-so. It’s great for randomly showing something funny you found on the web, though.
  3. Movies you’ve downloaded? Use the VideoStream Chrome app to load it directly off disk. This works perfectly.
  4. Movies in your Plex library? Use the Plex web interface. For some reason you have to go through, and the whole experience is a bit complicated. There are some issues with transcoding that I don’t really understand.
  5. Vimeo? Plex to the rescue. Add Vimeo as a channel (a slightly complicated procedure to view your own uploads).
  6. Want to watch something without using your computer? There’s only a couple of “Google Cast ready Android apps” (YouTube is the only one that works well for me), or use BubbleUPnP to access your Plex library.

And I haven’t even mentioned a couple more complications:

My advice? Figure out the smallest number of workflows to do everything you want to do, and get rid of any extraneous apps, servers, websites etc.

Google’s world

Google’s world centres around casting stuff from your phone. If that was all you could do, the Chromecast would suck. There are few apps, a lot of them are very niche (eg, anime or baseball), junk (like this) or just don’t really work (like the Red Bull app, which drops out every few minutes).

Fortunately, third party tools like VideoStream and Plex fill in a lot of the gaps.

But does it work?

The end result is actually great. Compared to having to plug my laptop into the TV, these things are now easy and fun:

  • Put on some background music: go to YouTube, Pandora or GrooveShark, and cast. No more hooking up audio cables.
  • Show a silly video to my partner. Even from the other room. Stuff I previously wouldn’t have bothered with, but it’s so easy – the TV even turns on by itself.
  • Keep watching a video while doing something else. Easy to leave my study, keep watching the same thing while making coffee or something.
  • Show photos: Just go to Google Plus or Flickr, and cast.