Steve Bennett blogs

…about maps, open data, Git, and other tech.

One week of Salt: frustrations and reflections.

24 Comments Posted by steveko on February 17, 2014

Reflecting on Salt. (Credit: Psyberartist, CC-BY 2.0)

Salt, or SaltStack, is an automatic deployment tool for clusters of machines, in the same category as Chef, Puppet and Ansible. I’ve previously used Chef, found the experience frustrating, and thought I’d give Salt a go. The task: converting my tilemill-server bash scripts, which set up a complete TileMill server stack, with OpenStreetMap data, Nginx and a couple of other goodies.

The goals in this kind of automation for me are threefold:

So I can quickly build new servers for Mapmaking for Academics workshops.
So I can manage other, similar (but slightly different) servers, such as cycletour.org and gis.researchmaps.net. “Manage” here means documenting and controlling the slight configuration differences, being able to roll out further changes (such as web passwords) without having to log in.
So other people can do this too. (That is, the solution has to be simple enough to explain that other people can do it).

The product of my labours: https://github.com/stevage/saltymill

Disclaimers and apologies

A week is too soon to feel the real benefits of automatic deployment and configuration management, but plenty long enough for frustrations and annoyances. There are lots of good things about Salt, but those will go in another post.

I’ve intentionally written this post as I go, to document things that are surprising, counter-intuitive or bothersome to the newcomer. Once you become familiar with a system, it’s much harder to remember what it was like as a newbie, and easier to make allowances for it. So experienced Salt users will probably rankle at the wilful ignorance displayed herein.

A few things I quickly appreciated, compared to Chef:

Two parties, not three. Chef has the client (your laptop), the Chef Server (which I never used because it was too complicated) and the target machine itself. Salt doesn’t have the client, so you do everything either on the master (“salt ‘*’ state.highstate) or on the target machine in masterless mode (“salt-call –local state.highstate)
No need for “knife upload”: the Salt Master just looks for your files in /srv/salt, so you can use any method you like for getting them there. I always hated having to upload recipes, environments etc individually – you always forgot something.
YAML rather than Ruby.
A nifty command line tool that lets you try states out on the fly, or run commands remotely on several servers at once. (Chef’s shell, formerly ‘shef’, never really worked out for me.)

Mediocre documentation

Automatic deployment systems are complicated, and you’d love some clear mental concepts to hang on to. Unfortunately, Salt’s chosen metaphor is pretty clumsy (salt grains, pillars, mines…meh), and they’re not well explained. Apparently the concept of a “state” is pretty important to Salt, but even after reading much documentation and quite a reasonable amount of hacking away, I still have only the vaguest notion of what a Salt state is. What’s a “high state”? A “low state”? How can I query the “state” of a minion? No idea. It’s hard to tell if “states” are an implementation magic that makes the whole thing work, or if I, the user am supposed to know about them.

(After more digging around, I think “state” has several meanings, of which one is just one of the actions defined in a SLS file.)

And then, strangely, how do you refer to the most important objects that you work with, the actual code that specifies what you want done? “SLS files”. Or “state files”. Or “SLS[es]”. Or “Salt States”. Or “Salt state files”. Those terms are all used in the docs. (I thought they were also called “formulas“, but those are apparently only pre-canned state files.)

YAML headaches

It turns out YAML has some weird indenting and formatting rules that occasionally ruin your day. Initially it’s especially tedious having to learn this extra language and its subtleties (> vs |, multi-line hyphenated lists versus single line square-bracketed lists) when you just want to get started – it steepens the learning curve.

IDs, names

The behaviour of statement IDs is cute at first, but not well explained, and becomes pretty tiresome. The idea is you can write this:

httpd:
  pkg.installed

Succinct, huh? That’s actually shorthand for this:

install_apache_please:
  pkg:
    - name: httpd
    - installed

That is, the “id” (httpd) is also the default value for “name”. Secondly, you can compress one item of the following list (“installed”) onto the function name (“pkg”) with a dot.

So far, so good. But it quickly leads to this kind of silliness:

"{{ pillar.installdir}}/myfile.txt":
  file.managed:
    - unless: ...
another_command:  
  cmd.wait:
    - watch: [ "{{ pillar.installdir}}/myfile.txt" ]

Free-text strings don’t make great IDs. On the whole, I’d prefer a syntax which is robustly readable and predictable, rather than one which is very succinct in special cases, but mostly less so.

The next problem is that all those IDs have to be unique across all your .sls files. That’s tedious when you have repetitious little actions that you can’t be bothered naming properly.

Another annoyance is that Salt’s data structure sometimes requires lists, and sometimes requires dicts, and it’s hard to remember which is which. For example:

finishlog:
  cmd.run:
    - name: |
        echo "All done! Enjoy your new server.<br/>" >> /var/log/salt/buildlog.html
  file.managed:
    - name: /var/log/salt/index.html
    - source: salt://initindex.html
    - template: jinja
    - context:
       buildtitle: "Your server is ready!"
       buildsubtitle: "Get in there and make something."
       buildtitlecolor: "hsl(130,70%,70%)"

Notice how the first level of indentation doesn’t have hyphens, the second level does, then the third level doesn’t? I’m not sure what the philosophy is here – it looks to me like each level is just a list of key/value pairs.

Jinja templating logic

Jinja is a fantastic templating language. But Salt uses this template language as its core programming logic. That’s like writing a C program using macros all over the place. It’s sort of ok as a way to access data values:

update_data:
 cmd.run:
 - cwd: {{ pillar['tm_dir'] }}

Here, the {{ … }} bit is a Jinja substitution, so the actual YAML code that gets parsed and executed looks like this:

update_data:
 cmd.run:
 - cwd: /mnt/saltymill

As most people who have dealt with macro preprocessors know, they rapidly get complex and very hard to debug. The error you see is a YAML parse error, but is the problem in your YAML code, or in what got substituted by Jinja?

It’s certainly possible to use the full range of Jinja functions (that is, {% … %} territory) inside Salt formulas, but for sanity I’m trying to avoid it. On the flip side, certain things that I expected to be easy, like defining a variable in a state {% set foo = blah %} and then being able to access it from inside any other template, turn out to not work. You need to explicitly pass context variables around, or import state files, and it becomes quite cumbersome.

Help may be at hand. Evan Borgstrom also noticed that “simple readability of YAML starts to get lost in the noise of the Jinja syntax” and is working on a new, pure Python syntax called NaCl.

Many ordering options

There are about 4 competing systems simultaneously operating to determine which order actions get executed in, ranked here from most important to least important.

Requisite order: The “preferred”, declarative system, where if X requires Y and Z requires X, then Salt just figures out that Y has to happen first.
Explicit order: you can set an “order flag” on an action, like “order: 1” or “order: last”. Basically, if you’re tearing your hair out with desperation, I think. (I tried it once, and it seemed to have no effect.)
Definition order: things happen in the order they’re declared in. (By far my preferred option). This doesn’t always seem to work, in the top.sls, for reasons I don’t understand.
Lexicographical order (yes, really): actions are executed in alphabetical order of their state name. (Apparently this was a good thing because at least it was consistent across platforms.)

The requisite ordering system sounds good, but in practice it has two shortcomings:

Repetitiveness: let’s say you have psql 10 commands that all strictly depend on Postgresql being installed. It’s very repetitive to state that dependency explicitly, and much more efficient to simply install Postgresql at the top of the formula, and assume its existence thereafter.
Scope: Requiring actions in other files seems to prevent the running of just this file. That is, if in load_db.sls you do “require: [ cmd: install_postgres ]”, then this command line no longer works: salt ‘*’ state.sls load_db

I’m yet to see any advantages to a declarative style. I like the certainty of knowing that a given sequence of steps works. The declarative model implies that one day the steps might be rearranged based only on my statement of what the dependencies are.

Arbitrary rules

There’s a rule that says you can’t have the same kind of state multiple times in a statement. This is invalid:

/var/log/salt/buildlog.html:
 file.managed:
 - source: salt://initlog.html 
 file.append:
 - text: Commencing build...

Why? I don’t know. The rules dictate that you have to do this:

/var/log/salt/buildlog.html:
 file.managed:
 - source: salt://initlog.html 

append_to_that_file:
 file.append:
 - name: /var/log/salt/buildlog.html
 - text: Commencing build...

Notice the cascade of consequences as this fragile “readability” tower crashes down:

You can’t have two states of the same type in a statement, so we move the second state to a new statement; but:
You can’t have two statements with the same ID, so we give the second statement a new ID; meaning:
We now have to explicitly define the “name” of the file.append function.

Clumsy bootstrapping

I miss Chef’s elegant “knife bootstrap” though. In one command line command, Chef:

Defines properties about the node, like its environment and role.
Connects to the node
Installs Chef
Registers the node with the Chef server, or your client, or both.
Starts a deployment, as required.

With Salt, these all seem to be manual steps:

I SSH to the node
I download and install Salt (using one-liner bootstrap script)
I write ‘grains’ to the newly created /etc/salt/grains file (yes, I think I’m doing this wrong – roles should be defined through pillars maybe)
The node then attempts to register itself with the Salt Master, but because the connection is insecure:
I have to go to the Salt Master and accept the pending key: yes | salt-key -A [and the docs suggest you’re supposed to manually verify the keys!]
Now I can launch a deployment: salt-master <nodename> state.highstate

When I rebuild a server, the above are preceded by these steps:

Log in to OpenStack Dashboard, click ‘rebuild’ on the instance.
Edit my ~/.ssh/known_hosts to remove the old SSH key.
On the Salt Master, delete the old key: yes | salt-key <node> -d

There are probably ways of streamlining this process. I hope so, anyway, because it’s pretty clumsy. I don’t know yet whether the SaltMaster can automatically launch deployments when new nodes register.

e-Research, System administration chef, configuration management, salt, saltstack, tilemill

Super lightweight map websites with Github

Leave a comment Posted by steveko on February 13, 2014

Github, the online version control repository host for Git, recently added support for GeoJSON files. Sounds boring, right? It actually lets you do something very cool: build your own “dots on a map” website with virtually zero code.

An example of GeoJSON on Github I whipped up. Click it.

Here’s what you need to do.

Get a Github repository if you don’t have one already. They’re free.
Create a GeoJSON file. You can export to this format from various tools. One easy way to get started would be to upload a CSV file with locations to dotspotting.org then download the GeoJSON from there. Or even easier, use geojson.io to place dots, lines and polygons with a graphical tool. It can save directly to your GitHub.

Here’s what my test file looks like:

{
 "type": "FeatureCollection",
 "features": [{ "type": "Feature",
 "geometry": {
 "type": "Point",
 "coordinates": [144.9,-37.8]
 },
 "properties": {
 "title": "Scooter",
 "description": "Here's a dot",
 "marker-size": "medium",
 "marker-symbol": "scooter",
 "marker-color": "#a59",
 "stroke": "#555555",
 "stroke-opacity": 1.0,
 "stroke-width": 2,
 "fill": "#555555",
 "fill-opacity": 0.5
 }
 },
 { "type": "Feature",
 "geometry": {
 "type": "Point",
 "coordinates": [144.4,-37.5]
 },
 "properties": {
 "title": "Cafe",
 "description": "Coffee and stuff",
 "marker-size": "medium",
 "marker-symbol": "cafe",
 "marker-color": "#f99",
 "stroke": "#555555",
 "stroke-opacity": 1.0,
 "stroke-width": 2,
 "fill": "#555555",
 "fill-opacity": 0.5
 }
 }]
}

It’s worth validating with GeoJSONLint.

Commit this file, say test.geojson, to your Github repository. You can get a preview of it in Github:
The test GeoJSON file, as seen on GitHub.
Now the really cool part. Embed the map into your own website. This is stupidly easy:
```
<!DOCTYPE html>
<html>
<body>
<script src="https://embed.github.com/view/geojson/stevage/georly/master/test1.geojson?height=500&width=1000"></script>
</body>
</html>
```
If you don’t have a website, site44 is an extremely easy way to get started. You place HTML files into your DropBox, and they get automagicked onto the web, with a subdomain: something.site44.com.

Now what?

That’s it! What’s especially interesting about the hosting on GitHub is it’s a very easy way to have a lightweight shared geospatial database of points, lines or polygons. Here’s how you could add dots to my map:

Fork my repository
Add a few points, by modifying the GeoJSON file
Commit your changes to your repository.
Send me a pull request
I accept the changes, and voila – now your points are shown with mine.

Using this method, we have a “review before publish” workflow, and a full version history of every change.

Caveats

This is a nifty tool for prototyping social mapping applications, but it obviously won’t cut it for production purposes:

No support for different layers: all the dots are always shown
No support for different basemaps: always the same OpenStreetMap style
No authoring tools: you must use something else to generate the GeoJSON
Obligatory “rendered with (heart) by GitHub” footer.

Soon you’ll want to build a proper application, using tools like MapBox, CloudMade, CartoDB, Leaflet etc.

mapmaking geojson, gis, github, leafletjs, maps, site44, social mapping

Digital humanities for beginners: get started with the Trove API

2 Comments Posted by steveko on February 5, 2014

Trove is the National Library of Australia’s “discovery interface” – an amazing catalogue of books, newspapers, maps, music, journal articles etc. The Trove API is a special website that programs can talk to run queries, retrieve individual records etc. With some creative ideas and a bit of coding skill, you could make a pretty nifty tool and win a digital humanities award.

The documentation is pretty thorough, but doesn’t tell you how to get started. These days, web development is mostly about assembling the right tools and putting them together, but it can be hard for a novice to know what they need.

This guide is for you if:

You want to try exploring Trove programmatically; and
You’ve done some coding before; but
You’re not very familiar with coding against an API, or writing web applications.

In short, if you think the digital humanities might be for you.

What we’ll need

A server on which to run your web application. You can use your laptop directly, but better to use a virtual machine inside your laptop. Even better, a server running on the internet, like a VM on the NeCTAR Research Cloud.
A web application framework: Flask and Python
A library that makes it easy to make requests to other web servers (we’ll use Requests)
A JavaScript framework which makes dynamically modifying web pages much easier (we’ll use jQuery)
A CSS framework, which makes your page look attractive with no effort (we’ll use Twitter Bootstrap)
A templating engine, to combine HTML and the results of our queries into a web page (we’ll use Jinja2)
Our server logic, expressed as code in files.

A server

As an Australian academic, you can create your own server on the NeCTAR Research Cloud. That would be the best option, but can be a bit fiddly for a novice so we won’t explain that here.

The next best option is to create a server on a virtual machine inside your laptop, using VirtualBox and Vagrant. That keeps everything related to this one project self contained, which is a big bonus when you start working on other projects. It also creates a pristine, controlled Linux environment which means you’re less likely to run into issues cause by the peculiarities of your own system. Try the Vagrant Getting Started guide.

As a last resort, the simplest approach is to just install stuff directly on your computer. This may work ok to begin with.

Building a web application

We’re going to build a web application that will talk to Trove. This isn’t the only approach. You could:

Build a command line tool that pulls stuff out of Trove and saves it to disk
Make a desktop application that a user would have to download and install
Make a static website (not a web application) that does everything with JavaScript.

But a web application is probably what you’re going to want eventually and is the easiest to share with other people.

We’ll use Python, the programming language, and Flask, a web application framework for Python.

Why Flask? It’s easy to install, lightweight, and is well suited to experimental programming like this. (An alternative would be Django, which would be much better if you wanted to store stuff in a database and write something big and scalable.)

You can install these on the command line like this:

sudo apt-get install -y python-pip
sudo pip install flask

That is, first you install Python’s “pip” installer. Pip then knows how to install Python packages like Flask.

Whatever server you’re using, create a directory somewhere (perhaps under your home directory), called ‘trovetest’.

cd ~
mkdir trovetest
cd trovetest

Making requests to the Trove API

The Trove documentation says that to perform a query, we need to access a URL like this:

http://api.trove.nla.gov.au/result?key=<INSERT KEY>&zone=book&q=%22piers%20anthony%22

You might think you need to construct that string yourself, complete with question marks, equals signs and %22’s. You could do that using Python’s built-in urllib2 library:

import urllib2
apikey='1b2c3d4f'
troveURL = 'http://api.trove.nla.gov.au/'
zone = 'book'
search = '%22' + 'piers' + '%20' + 'anthony' + '%22'
r = urllib2.urlopen(troveURL + 'result' + '?' + 'key=' + apikey + '&' + 'zone=' + zone + '&' + 'q=' + search).read()

But it’s easier than that, if we use the Requests library.

import requests 
apikey='1b2c3d4f'
troveURL = 'http://api.trove.nla.gov.au/'
searchquery = 'piers'
r = requests.get(troveURL + 'result/', params = { 
 'key': apikey, 
 'zone': 'book', 
 'q': search
 } )

So, install the Requests library as well:

sudo pip install requests

jQuery

These days, virtually all web applications use a JavaScript framework, to make manipulating the web page more practical. We’ll use jQuery, which is also required by Twitter BootStrap.

Create a directory under ‘trovetest’ called ‘static’, and download and unzip the latest jQuery.

CSS Framework

Making a web page look good using straight CSS (cascading style sheets) is really hard. Making it look good in every browser and device is a nightmare. Save yourself the pain, and start from somewhere sensible like Bootstrap, made by Twitter. (By default, it looks a bit like Twitter’s website, but you can change that.)

Without Twitter Bootstrap

Adding Bootstrap and minimal changes.

Although it’s best to download both jQuery and Boostrap to your server and link to them there, you can get started quickly by linking to them on the web. Just include this text at the top of your HTML files.

<head>
http://code.jquery.com/jquery-1.10.1.min.js
<link rel="stylesheet" href="http://netdna.bootstrapcdn.com/bootstrap/3.1.0/css/bootstrap.min.css">
http://netdna.bootstrapcdn.com/bootstrap/3.1.0/js/bootstrap.min.js
</head>

Templating engine

The most common way to build a web page dynamically is by writing a “template” of the HTML page. It can be as simple as this:

<body>
<h1>Request complete</h1>
You requested this article: {{ article.name }}
</body>

When you tell the web application framework to render the page, you pass in the list of variables – in this case, an object called ‘article’ with an attribute ‘name’.

We’ll use the Jinja2 templating engine, because that’s what Flask uses.

Our server logic

Now that everything is in place, let’s write a simple application that will simply make one request. We’ll ask the Trove API for a list of newspaper articles mentioning Piers Anthony (to follow their documentation’s example).

Create this file as ~/trovetest/simple.py

from flask import Flask, render_template, request # load Flask itself
import requests, json # load Requests library, and JSON library for interpreting responses
app = Flask(__name__) # create our web application object, named after this file
apikey = '1b2c3d...' # insert your API key, following instructions here: http://trove.nla.gov.au/general/api
troveURL = 'http://api.trove.nla.gov.au/' # all Trove API URLs start with this

@app.route("/list") # this is what we do when someone goes to "http://localhost:5000/list"
def list(): # the function that defines the behaviour
query = { # this dict structure contains the parameters that make up a URL
'key': apikey, # following the Trove API documentation.
'zone': 'newspaper',
'q': 'piers anthony',
'encoding': 'json', # we specify 'json' as the format because json is easy to parse.
'include': 'tags' }
r = requests.get(troveURL + 'result/', params=query)

# Requests will transform the params property into a string like
# ?key=...&zone=newspaper&q=%22piers%20anthony%22&include=tags&encoding=json

# After calling requests.get(), r is now an object containing lots of information about the response - errors codes, content etc.

r.encoding = 'ISO-8859-1' # We avoid some unicode conversion errors by adding this step.
results=r.json() # Convert the text we receive (in JSON format) into a Python dict
return render_template('list.html', results=results)# Render the template, passing through that dict

app.run(host='0.0.0.0', debug=True) # Run the defined web server, allowing anyone to connect, in debug mode. (This is not safe for a public web server.)

And save this as ~/trovetest/templates/list.html:

<!doctype html>
<head>
    <script src="http://code.jquery.com/jquery-1.10.1.min.js"></script>
    <linkrel="stylesheet"href="http://netdna.bootstrapcdn.com/bootstrap/3.1.0/css/bootstrap.min.css">
    <script src="http://netdna.bootstrapcdn.com/bootstrap/3.1.0/js/bootstrap.min.js"></script>
</head>
<body>
    <title>Trove test</title>
    <h1>Results</h1>

    <ul>
    {% for article in results.response.zone.0.records.article %}
      <li>
        <ahref="{{ article.troveUrl }}">{{ article.title.value }}</a> - <ahref="item/{{ article.id}}">More info</a>
        <br/>
        <i><small>{{ article.snippet | safe}}</small></i>
      </li>
    {% endfor %}
    </ul>
</body>

Run your server

On the command line, tell Python (and hence Flask) to run your application:

/home/ubuntu/trove$ python simple.py
 * Running on http://0.0.0.0:5000/
 * Restarting with reloader

Flask will sit there waiting for someone to connect to your webserver in a browser.

In your browser, go to http://localhost/5000/list

The results of our little Trove query.

What next?

That was a very quick, high level view of all the pieces you need. If you made it through the example, you’re be in an excellent position to start making interesting web applications building on the Trove API. There’s no shortage of information on the web about all these technologies – the hard bit is knowing what you need to know.

All sorts of fun things can be done:

Cool exploration interfaces
Bots
Visualisations like graphs, charts etc
Text mining and analysis

Tim Sherratt (aka @wragge) has made many fun creations with the Trove API, documented on his discontents.com.au blog. Tim works for the Trove team, who are generally pretty happy to help .

e-Research api, application programming interface, css, digital humanities, flask, jinja2, jquery, python, trove, trove api, twitter bootstrap

Trello Tennis

Leave a comment Posted by steveko on January 28, 2014

(Credit: Flickr’s Carine06, CC-BY-SA)

Here’s my method of keeping track of stuff to do. Other tools just didn’t work for me. Most to-do list tools make two assumptions, I think:

Work arrives from somewhere (eg, a manager), and is readily broken up into pieces roughly 0.5-4 hours in size.
Your primary responsibility is processing these tasks without gaps, maintaining a high level of output.

My work, technology for academics, tends to involve coordinating lots of little projects with lots of different people. At any given moment, there are few tasks “in my court”, and they tend to be small. My primary responsibility is to keep all projects moving, and ensure that projects aren’t held up by a task being with me too long.

So, I ran with the ball in my court / ball in your court metaphor. It’s useful when:

Your work is primarily about projects, and particularly a relationship between you and the main stakeholder in that project.
You anxiously wonder whether you’re supposed to be doing something on a project.
You have trouble remembering all the relevant bits of information for each project.

“Projects” here can be very small, and often include very early stage proposals that haven’t formed into anything very concrete.

Presenting: Trello Tennis.

Trello Tennis in (fictious) action

To start with, the two most important columns:

Ball in my court: projects whose current state requires me to do something (as small as writing an email or checking something, or as big as preparing a whole workshop structure)
Ball in their court: projects whose current state is waiting for someone else to do something, such as providing feedback on a document, organising an event, or replying to an email.

The goal each day is to move projects from the “my court” column to the “their court” column. I periodically review “my court”, placing the most urgent items at the top. I also review “their court”, checking whether I should follow up or offer assistance.

More columns, that turned out to be useful for me:

All good/meeting scheduled: projects whose current state is that both parties are waiting for an event, usually a meeting. Neither of us has to do anything until then.
Wut?: the most dangerous column, projects where I’m not sure what the current state is after all. Am I waiting on them? Did I promise to research something and get back to them?
Somebody else’s problem (SEP): where responsibility for the project has been shifted to someone else. I don’t need to do anything, and I don’t even need to monitor it at the moment. Someone else may eventually hand it back to me, but until then I don’t worry.
Archive: the project is actually finished.
Blocked: there is a technical reason why the project can’t advance, and no one is currently doing anything about it.

Using Trello

Trello works really well for this.

The name of each card consists of both the project name, and a quick summary of my current task (if in my court), or what I’m waiting on if in theirs. Examples:

WidgetUpgrade – check if compatible with Ubuntu Precise
FoobarSystem – Anita fixing DNS problem
Maps for Psych – meeting Thurs 3rd 2pm

What’s particularly fantastic about using Trello is that each card can also track all the work on that project. Changing the title automatically creates a history trace, or I can add comments to record more information. Need somewhere to record a server name and login details? Use the card’s description field.

You can also use coloured labels to group projects, such as tagging by university in my case.

Of course, each individual project frequently has its own task tracking system: perhaps another Trello board, Github issues, Pivotal Tracker, Google spreadsheets etc. I link to them from the card’s description, ultimately tying together all information systems for all projects under a single roof.

The rules

Each day (or whenever):

If there is anything in “Wut?”, attempt to clarify its state, then move it to the right column.
Check each item in “Ball in their court” to see if it needs attention. Set alarms if you need. If it’s time to follow up, just move it back to “Ball in my court” and adjust the title.
Review the “Ball in my court” list, ordering them appropriately – most urgent at the top.
Work through items in “Ball in my court” if you have any free time. :)

To make this natural, I order the columns from left to right as follows: Wut, My Court, Blocked, Their Court, All Good, SEP, Archive. Stress is on the left, calm is on the right.

The goal is to get your board to look like this blissful fantasy:

A totally successful day in Trello Tennis land.

These examples are actually tiny compared to (my) reality. By the end of 2013, I had 21 active cards, plus another 10 inactive (SEP and Archive). If you find this technique at all useful, or have any improvements, I’d love to hear about it.

Filtering for sanity

(Added 4 Feb 2014)
Sometimes it gets daunting looking at the huge mounds of work in front of you. Filtering can help cut through the distraction. Two types have worked for me:

Filter by organisation

Filter by client

Are you working for a particular client today, and don’t want to see anything related to any other clients? Assign one label per client, then turn on the filter!

Filter by … you

To focus on just one or two tasks that you need to finish, first assign yourself to those cards. Then, filter to only show cards that you’re assigned to. This feature is intended for collaboration, but it works nicely.

Tools featured, gtd, organisation, productivity, trello

Terrain in TileMill: a walkthrough for non-GIS types

10 Comments Posted by steveko on September 11, 2013

I created a basemap for http://cycletour.org with TileMill and OpenStreetMap. It looked…ok.

But it felt like there was something missing. Terrain. Elevation. Hills. Mountains. I’d put all that in the too hard basket, until a quick google turned up two blog posts from MapBox, the wonderful people who make TileMill. “Working with terrain data” and “Using TileMill’s raster-colorizer“. Putting these two together, plus a little OCD, led me to this:

Slight improvement! Now, I felt that the two blog posts skipped a couple of little steps, and were slightly confusing on the difference between the two approaches, so here’s my step by step. (MapBox, please feel free to reuse any of this content.)

My setup is an 8 core Ubuntu VM with 32GB RAM and lots of disk space. I have OpenStreetMap loaded into PostGIS.

1. Install the development version of TileMill

You need to do this if you want to use the raster-colorizer. You want this while developing your terrain style, if you want the “snow up high, green valleys below” look. Without it, you have to pre-render the elevation color effect, which is time consuming. If you want to tweak anything (say, to move the snow line slightly), you need to re-render all the tiles.

Fortunately, it’s pretty easy.

Get the “install-tilemill” gist (my version works slightly better)
Probably uncomment the mapnik-from-source line (and comment out the other one). I don’t know whether you need the latest mapnik.
Run it. Oh – it will uninstall your existing TileMill. Watch out for that.
Reassemble stuff. The dev version puts projects in ~/Documents/<username/MapBox/project which is weird.

2. Get some terrain data.

The easiest place is the ubiquitous NASA SRTM DEM (digital elevation model) data. You get it from CGIAR. The user interface is awful. You can only download pieces that are 5 degrees by 5 degrees, so Victoria is 4 pieces.

If you’re downloading more than about that many, you’ll probably want to automate the process. I wrote this quick script to get all the bits for Australia:
for x in {59..67}; do for y in {14..21}; do echo $x,$y if [ ! -f srtm_${x}_${y}.zip ]; then wget http://droppr.org/srtm/v4.1/6_5x5_TIFs/srtm_${x}_${y}.zip else echo "Already got it." fi done done unzip '*.zip'

3. Process SRTM .tifs with GDAL.

To have any fun with terrain mapping in TileMill, you need to produce separate layers from the terrain data:

The heightmap itself, so you can colour high elevations differently from low ones.
A “hillshading” layer, where southeast facing slopes are dark, and northwest ones are light. This is what produces the “terrain” illusion.
A “slopeshading” layer, where steep slopes (regardless of aspect) are dark. I’m ambivalent about how useful this is, but you’ll want to play with it.
Contours. These can make your map look AMAZING.

Contours – they’re the best.

In addition, you’ll need some extra processing:

Merge all the .tif’s into one. (I made the mistake of keeping them separate, which makes a lot of extra layers in TileMill). Because they’re GeoTiffs, GDAL can magically merge them without further instruction.
Re-project it (converting it from some random ESPG to Google Web Mercator – can you tell I’m not a real GIS person?)
A bit of scaling here and there.
Generating .tif “overviews”, which are basically smaller versions of the tifs stored inside the same file, so that TileMill doesn’t explode.

Hopefully you already have GDAL installed. It probably came with the development version of TileMill.

Here’s my script for doing all the processing:
#!/bin/bash echo -n "Merging files: " gdal_merge.py srtm_*.tif -o srtm.tif f=srtm echo -n "Re-projecting: " gdalwarp -s_srs EPSG:4326 -t_srs EPSG:3785 -r bilinear $f.tif $f-3785.tif

echo -n "Generating hill shading: "gdaldem hillshade -z 5 $f-3785.tif $f-3785-hs.tifecho and overviews:

gdaladdo -r average $f-3785-hs.tif 2 4 8 16 32 echo -n "Generating slope files: " gdaldem slope $f-3785.tif $f-3785-slope.tif echo -n "Translating to 0-90..." gdal_translate -ot Byte -scale 0 90 $f-3785-slope.tif $f-3785-slope-scale.tif echo "and overviews." gdaladdo -r average $f-3785-slope-scale.tif 2 4 8 16 32 echo -n Translating DEM... gdal_translate -ot Byte -scale -10 2000 $f-3785.tif $f-3785-scale.tif echo and overviews. gdaladdo -r average $f-3785-scale.tif 2 4 8 16 32 #echo Creating contours gdal_contour -a elev -i 20 $f-3785.tif $f-3785-contour.shp

Take my word for it that the above script does everything I promise. The options I’ve chosen are all pretty standard, except that:

I’m exaggerating the hillshading by a factor of 5 vertically (“-z 5”). For no particularly good reason.
Contours are generated at 20m intervals (“-i 20”).
Terrain is scaled in the range -10 to 2000m. Probably an even lower lower bound would be better (you’d be surprised how much terrain is below sea levels – especially coal mines.) Excessively low terrain results in holes that can’t be styled and turn up white.

4. Load terrain layers into TileMill

Now you have four useful files, so create layers for them. I’d suggest creating them as layers in this order (as seen in TileMill):

srtm-3785-contour.shp – the shapefile containing all the contours.
srtm-3785-hs.tif – the hillshading file.
srtm-3785-slope-scale.tif – the scaled slope shading file.
srtm-3785.tif – the height map itself. (I also generate srtm-3785-scale.tif. The latter is scaled to a 0-255 range, while the former is in metres. I find metres makes more sense.)

For each of these, you must set the projection to 900913 (that’s how you spell GOOGLE in numbers). For the three ‘tifs’, set band=1 in the “advanced” box. I gather that GeoTiffs can have multiple bands of data, and this is the band where TileMill expects to find numeric data to apply colour to.

5. Style the layers

Mapbox’s blog posts go into detail about how to do this, so I’ll just copy/paste my styles. The key lessons here are:

Very slight differences in opacity when stacking terrain layers make a huge impact on the appearance of your map. Changing the colour of a road doesn’t make that much difference, but with raster data, a slight change can affect every single pixel.
There are lots of different raster-comp-ops to try out, but ‘multiply’ is a good default. (Remember, order matters).
Carefully work out each individual zoom level. It seems to work best to have hillshading transition to contours around zoom 12-13. The SRTM data isn’t detailed enough to really allow hillshading above zoom 13

My styles:

.hs[zoom <= 15] {
[zoom>=15] { raster-opacity: 0.1; }
[zoom>=13] { raster-opacity: 0.125; }
[zoom=12] { raster-opacity:0.15;}
[zoom<=11] { raster-opacity: 0.12; }
[zoom<=8] { raster-opacity: 0.3; }

raster-scaling:bilinear;
raster-comp-op:multiply;
}

// not really convinced about the value of slope shading
.slope[zoom <= 14][zoom >= 10] {
raster-opacity:0.1;
[zoom=14] { raster-opacity:0.05; }
[zoom=13] { raster-opacity:0.05; }
raster-scaling:lanczos;
raster-colorizer-default-mode: linear;
raster-colorizer-default-color: transparent;

// this combo is ok
raster-colorizer-stops:
stop(0, white)
stop(5, white)
stop(80, black);
raster-comp-op:color-burn;

}

// colour-graded elevation model
.dem {
[zoom >= 10] { raster-opacity: 0.2; }
[zoom = 9] { raster-opacity: 0.225; }
[zoom = 8] { raster-opacity: 0.25; }
[zoom <= 7] { raster-opacity: 0.3; }
raster-scaling:bilinear;
raster-colorizer-default-mode: linear;
raster-colorizer-default-color: hsl(60,50%,80%);
// hay, forest, rocks, snow

// if using the srtm-3785-scale.tif file, these stops should be in the range 0-255.
raster-colorizer-stops:
stop(0,hsl(60,50%,80%))
stop(392,hsl(110,80%,20%))
stop(785,hsl(120,70%,20%))
stop(1100,hsl(100,0%,50%))
stop(1370,white);
}
.contour[zoom >=13] {
line-smooth:1.0;
line-width:0.75;
line-color:hsla(100,30%,50%,20%);
[zoom = 13] {
line-width:0.5;
line-color:hsla(100,30%,50%,15%);
}

[zoom >= 16],
[elev =~ “.*00”] {
l/text-face-name:’Roboto Condensed Light’;
l/text-size:8;
l/text-name:'[elev]’;
[elev =~ “.*00”] { line-color:hsla(100,30%,50%,40%); }
[zoom >= 16] { l/text-size: 10; }
l/text-fill:gray;
l/text-placement:line;
}
}

And finally a gratuitous shot of Mt Feathertop, showing the major approaches and the two huts: MUMC Hut to the north and Federation Hut further south. Terrain is awesome!

mapmaking cartography, contours, featured, gdal, gis, mapbox, openstreetmap, postgis, srtm, terrain, tilemill

A TileMill server with all the trimmings

2 Comments Posted by steveko on May 8, 2013

Recently, I set up a server for a series of #datahack workshops. We used TileMill to make creative maps with OpenStreetMap and other available data.

The major pieces required are:

TileMill, which comes with its own installer, and is totally self-sufficient: web application server, Mapnik, etc.
Postgres, the database which will hold the OSM data
PostGIS, the extension which allow Postgres to do that
Nginx, a reverse proxy, so we can have some basic security (TileMill comes with none)
OSM2PGSQL, a tool for loading OSM data into PostGIS

I’ve captured all those bits, and their configuration in this script. You’ll probably want to change the password – search for “htpasswd”.

The script below is out of date, contains errors, and is not maintained. Go to:

http://github.com/stevage/saltymill

# This script installs TileMill, PostGIS, nginx, and does some basic configuration.
# The set up it creates has basic security: port 20009 can only be accessed through port 80, which has password auth.

# The Postgres database tuning assumes 32 Gb RAM.

# Author: Steve Bennett

wget https://github.com/downloads/mapbox/tilemill/install-tilemill.tar.gz
tar -xzvf install-tilemill.tar.gz

sudo apt-get install -y policykit-1

#As per https://github.com/gravitystorm/openstreetmap-carto

sudo bash install-tilemill.sh

#And hence here: http://www.postgis.org/documentation/manual-2.0/postgis_installation.html
#? 
sudo apt-get install -y postgresql libpq-dev postgis

# Install OSM2pgsql

sudo apt-get install -y software-properties-common git unzip
sudo add-apt-repository ppa:kakrueger/openstreetmap
sudo apt-get update
sudo apt-get install -y osm2pgsql

#(leave all defaults)

#Install TileMill

sudo add-apt-repository ppa:developmentseed/mapbox
sudo apt-get update

sudo apt-get install -y tilemill

# less /etc/tilemill/tilemill.config
# Verify that server: true

sudo start tilemill

# To tunnel to the machine, if needed:
# ssh -CA nectar-maps -L 21009:localhost:20009 -L 21008:localhost:20008
# Then access it at localhost:21009

# Configure Postgres

echo "CREATE ROLE ubuntu WITH LOGIN CREATEDB UNENCRYPTED PASSWORD 'ubuntu'" | sudo -su postgres psql
# sudo -su postgres bash -c 'createuser -d -a -P ubuntu'

#(password 'ubuntu') (blank doesn't work well...)

# === Unsecuring TileMill

export IP=`curl http://ifconfig.me`

cat > tilemill.config <<FOF
{
  "files": "/usr/share/mapbox",
  "coreUrl": "$IP:20009",
  "tileUrl": "$IP:20008",
  "listenHost": "0.0.0.0",
  "server": true
}
FOF
sudo cp tilemill.config /etc/tilemill/tilemill.config

# ======== Postgres performance tuning
sudo bash
cat >> /etc/postgresql/9.1/main/postgresql.conf <<FOF
# Steve's settings
shared_buffers = 8GB
autovaccuum = on
effective_cache_size = 8GB
work_mem = 128MB
maintenance_work_mem = 64MB
wal_buffers = 1MB

FOF
exit

# ==== Automatic start 
cat > rc.local <<FOF
#!/bin/sh -e
sysctl -w kernel.shmmax=8000000000
service postgresql start
start tilemill
service nginx start
exit 0
FOF

sudo cp rc.local /etc/rc.local

# === Securing with nginx
sudo apt-get -y install nginx

cd /etc/nginx
sudo bash
printf "maps:$(openssl passwd -crypt 'incorrect cow cell pin')\n" >> htpasswd
chown root:www-data htpasswd
chmod 640 htpasswd
exit

cat > sites-enabled-default <<FOF

server {
   listen 80;
   server_name localhost;
   location / {
        proxy_set_header Host \$http_host;
        proxy_pass http://127.0.0.1:20009;
        auth_basic "Restricted";
        auth_basic_user_file htpasswd;
    }
}

server {
   listen $IP:20008;
   server_name localhost;
   location / {
        proxy_set_header Host $http_host;
        proxy_pass http://127.0.0.1:20008;
        auth_basic "Restricted";
        auth_basic_user_file htpasswd;
    }
}

FOF

sudo cp sites-enabled-default /etc/nginx/sites-enabled/default
sudo service nginx restart

echo "Australia/Melbourne" | sudo tee /etc/timezone
sudo dpkg-reconfigure --frontend noninteractive tzdata

mapmaking, System administration nginx, openstreetmap, postgis, script, sysadmin, system administration, tilemill

Forget trying to remember your servers’ names!

1 Comment Posted by steveko on May 3, 2013

I run lots of Linux servers. I create them, install some stuff, mess around with them, forget them, come back to them…and forget my credentials. My life used to look like this:

$ ssh -i ~/steveko.pem steveb@115.146.83.268
Permission denied (publickey).
$ ssh -i ~/steveko.pem sbennett@115.146.83.268
Permission denied (publickey).
$ ssh -i ~/stevebennett.pem steveb@115.146.83.268
Permission denied (publickey).
$ ssh -i ~/steveko.pem ubuntu@115.146.83.268
Welcome to Ubuntu 12.10 (GNU/Linux 3.5.0-26-generic x86_64)
...

This got really tedious. You can’t memorise many IP addresses, so you’re constantly referring to emails, post-its or even SMSes. Then you rebuild your server, the IP address changes, and you’re lost again.

And because some of the servers are administered by other people, I can’t always choose my own user name, so more faffing around.

So, here’s my solution:

A naming convention for servers

Give each server a name. Ignore the actual hostname of the server, or what everyone else calls it. My convention goes like this:

<infrastructure>–<project>[–<purpose>]

Examples:

nectar-tugg-dev: A development server for the TUGG project, running on NeCTAR Research Cloud infrastructure.
rmit-microtardis-prod: A development server for MicroTardis, running on RMIT infrastructure.
nectar-tunnelator: A side project “tunnelator” running on NeCTAR Research Cloud infrastructure. Small projects only have one server.

The key here is minimising what you need to remember. If I’m doing some work on a project, I’ll always know the project name and whether I want to work on the prod or dev server. Indeed, it’s an advantage to have to specifically type “-prod” when working on a production machine.

Put all IP addresses in /etc/hosts.

When I create a server, or someone tells me an IP, I immediately give it a name, and store it in /etc/hosts. The file looks like this:

##
# Host Database
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost 
fe80::1%lo0 localhost
115.146.46.269 nectar-tunnelator
136.186.3.299 swin-bpsyc-dev
...

This has the huge advantage that you can also put those names in the browser address bar: http://nectar-tunnelator

If a server moves location, just update the entry in /etc/hosts, and forget about it again.

Put all access information in ~/.ssh/config

The SSH configuration file can radically simplify your life. You have one entry per server, like this:

Host latrobe-vesiclepedia-dev
 User steveb
 IdentityFile /Users/stevebennett/Dropbox/VeRSI/NeCTAR/nectar.pem
 Port 9022

Notice how we don’t need to spell out the IP address again. And by storing the access details here, we can connect like this:

$ ssh latrobe-vesiclepedia-dev

So much less to remember. And because it’s so easy to connect, suddenly tools like SCP, and SSH tunnelling become much more attractive.

$ scp myfile.txt latrobe-vesiclepedia-dev
$ ssh latrobe-vesiclepedia-dev sudo cp myfile.txt /var/www

In reality, it gets even simpler. Most of my NeCTAR boxes are Ubuntu, with a login name of “ubuntu”. The “nectar-” naming convention proves valuable:

Host nectar-*
 IdentityFile /Users/stevebennett/Dropbox/VeRSI/NeCTAR/nectar.pem
 User ubuntu

That means that any NeCTAR box using that key and username doesn’t even need its own entry in .ssh/config:

$ ssh nectar-someserver

System administration authentication, efficiency, hacking, hosts, networking, organisation, ssh, tips

Anonymous longitudinal surveys with LimeSurvey

3 Comments Posted by steveko on January 21, 2013

A researcher at the Australian Research Centre in Sex, Health and Society (ARCSHS, pronounced “archers”) carries out surveys that are both:

anonymous: personal details are not collected, and steps are taken to avoid being able to identify participants; and
longitudinal: participants are contacted at intervals to answer similar questions. Questions often refer to previous answers given by the participant.

It’s a tricky combination, not well supported by most survey software. Here’s a solution, using LimeSurvey, that doesn’t require modifying any code. These instructions are aimed at people with no familiarity with LimeSurvey.

Quick summary:

The survey is carried out in not anonymous mode. Anonymity is maintained through an external email redirection service.
Additional attributes are added on to token tables for follow-up rounds.
Answers from each survey are transferred to the additional attributes through Excel manipulation

Survey hosting

LimeService is a very cheap way to host LimeSurvey, and removes the burden of server administration.

Anonymous email addresses

Participants must not sign up for the survey using their actual email address. Instead, they should go to a third-party email forwarding site like http://notsharingmy.info. For the participant:

Click on a link
Enter their email address, click “Get an obscure email”
Copy the generated email address into the survey registration form.

(Note: as of January 2013, notsharingmy.info is unreliable. I’m working on another solution.)

About tokens

This solution relies heavily on LimeSurvey’s “tokens” which are explained badly in the interface. Enabling tokens just means you want to track information about individual participants: individual adding them to the list, inviting them, keeping track of who completed the survey or who needs another reminder. We also add “additional attributes”: the participants’ previous answers.

1. Set up the initial survey

Create a new survey:
Fill out the Description, Welcome Message, End Message.
Create the first question group.
Create some questions.
Set these survey “general settings”:
1. Tokens > Allow public registration? Yes
2. (Optional) Modify the registration text as described below in “Avoiding personal information”
Activate the survey. You’ll be asked if you want to initialise tokens:

Yes, you do.
Publicise the URL, and get lots of responses. Great.

Avoiding personal information

By default, LimeSurvey collects from each self-registering participant their first name, last name and email address. For a truly anonymous survey, you may wish to avoid collecting their name.

Under Global Settings, choose Template Editor:
As the warning points out, editing the default template isn’t a great idea. Let’s make a copy called “no_name_collecting”:

Now, edit the copy. Select “startpage.pstpl”, then add this code just before the “</head”> line:

...
<script>
$(function(){
$("[name=register_firstname]").parent().parent().hide()
$("[name=register_lastname]").parent().parent().hide()
});
</script>
</head>

Next, you might want to change the registration message, to direct people to create an anonymous email address. On the “Register” screen,
select “register.pstpl”.
In the code, replace {REGISTERMESSAGE1} and {REGISTERMESSAGE2} with any text or HTML you like.

2. Export responses and tokens

Export the responses as CSV
Export the tokens as CSV:

3. Create follow-up survey

The first survey was a success, and there are now lots of entries in the tokens table and the responses table. More in the former than the latter, because some people will sign up and not answer any questions.

Copy the first survey. (Click the + button, like starting a new survey first.)
You’ll probably want to remove some irrelevant questions and groups, so do that.
Now add one “additional attribute” for every response that you want to refer to from the initial survey. In Token Management:
The other fields don’t matter. Leave them blank.
Next, you may want to use these attributes in some questions. Let’s say you want to ask “Last time you did this survey you were 42. How old are you now?”
In the question text, click ‘LimeSurvey replacement field properties’ then select the extended attribute:The text then looks like this:
You can get fancy and set that value as the default for the question. Instructions on how to do this.
Repeat this for each question, wherever those extended attributes are needed.

4. Transfer previous responses to the follow-up survey

Download the token table for this survey. It will have no data, but the headers will be useful.So far, you have created a way to hold answers from previous surveys for each participant. But you haven’t actually put those answers in there. Time for some Excel magic. You want to create a token table for the follow-up survey, using bits from the three CSV files you’ve dowloaded so far.:
1. Tokens table for initial survey. (Containing the email addresses people signed up with.)
2. Responses for initial survey.
3. Tokens table for follow-up survey (containing the additional attributes).
Copy all the rows of tokens from the initial survey to the follow-up survey. Don’t copy the header row:Make sure you retain the additional attribute fields (attribute_1 etc.) They should be blank.
Notice that these tokens have already been “completed”. Clear out the invited and completed fields. Set the remindercount field to 0 and usesleft field to 1:
Now, copy columns from the participant responses table into attribute fields, one by one, as appropriate. Make sure the IDs line up correctly.
Save the file (“followup-tokens.csv” if you like). Now import it into your follow-up survey.

5. Activate the follow-up survey

Whew. You’re now ready to launch.

Activate the survey.
Send out invitations. Under “Token control”:
The default email template is pretty gross, so clean it up a bit before you send:
Each participant receives a custom URL which logs them in without requiring any username and password.

You can then repeat this process for each subsequent iteration.

e-Research, Tools anonymous, iteration, iterative, limesurvey, longitudinal, research, surveys

Windows red cross errors scam

2 Comments Posted by steveko on December 12, 2012

We have noticed many red cross errors!

I just received an interesting phone call, apparently from a group of Indian scammers. It went roughly like this. (Phrases in bold are things I jotted down during the call)

Hello, I’m Caroline from the Computer Technical Department at Windows Best Help [or Windows Based Help, perhaps]. We’re calling to alert you that for the past four weeks you’ve been receiving red cross errors, which mean you’re subject to internet viruses, and hackers that are trying to break into your computer. Your address is [my address], correct?

Throughout this, I give non-committal “mmm, yes” responses.

We’re connected through the Global IT Server. This just an awareness call, nothing to do with telemarketing.
Now, go to the home page of your browser.
Now press Windows-R, and type “cookies“. [Actually, long description of how to find the Windows key, and spelling out “cookies” in radio code. The first “o” was orange, the second was Oscar.]

Now, do you see all those files and folders? All the work you’re doing is stored in those files as a double coded check up.

I’m calling from the Computer Technical Department at Windows Best Help. This has nothing to do with telemarketing!

At this point, she transferred me to her “technical supervisor”. He gave me his name, but I didn’t quite catch it – something like Armin. I asked where they’re based – Kolkata.

Are you in front of your computer now? [I admitted that, no, the phone was in a different room.]
But I believe that you told Caroline that you were typing the commands and could see the results? [Interesting…I had led her to believe that. Is their operation so small that he listens to the whole conversation?]

Some confusion followed, where I offered to go and run the command for real. I told him to hold the line for a minute, while I went and did it. When he came back, the line was dead. Oops.

I’ve heard of this scam before, but it was entertaining to see it in operation. Too bad I didn’t get to see where it led.

Uncategorized scams

What I learned at e-Research Australasia 2012

Leave a comment Posted by steveko on November 6, 2012

Waterfall development still doesn’t work.
Filling an institutional research data registry is still a hard slog.
Omero is not just for optical microscopy.
Spending money so universities can build tools that the private sector will soon provide for free ends badly.
Research data tools that mimicking the design of laymen’s tools (“realestate.com.au for ecologists“) work well
AURIN is brilliant. AuScope is even more brilliant than before.
Touchscreen whiteboards are here, are extremely cool, and cost less than $5000.
AAF still doesn’t work, but will soon. Please.
NeuroHub. If neuroscience is your thing.
Boring, mildly offensive after-dinner entertainment works well as a team-building exercise.
All the state-based e-Research organisations (VeRSI, IVEC, QCIF, Intersect, eRSA, TPAC etc.) are working on a joint framework for e-research support.
Cytoscape: An Open Source Platform for Complex Network Analysis and Visualization
Staff at e-Research organisations have much more informed view of future e-Research projects, having worked on so many of them.
If you tick the wrong checkbox, your paper turns into a poster.
People find the word “productionising” offensive, but don’t mind “productifying”.
CloudStor’s FileSender is the right way for people in the university sector to send big files to each other.

Build it? Wait for someone else to build it?

And a thought that hit me after the conference: although a dominant message over the last few years has been “the data deluge is coming! start building things!”, there are sometimes significant advantages in waiting. e-Research organisations overseas are also building data repositories, registries, tools etc. In many cases, it would pay to wait for a big project overseas to deliver a reusable product, rather than going it alone on a much smaller scale. So, since we (at e-Research organisations) are trying to help many researchers, perhaps we should consider the prospect of some other tool arriving on the scene, when assessing the merit of any individual project.

research data conference, e-research, e-Research Australasia, research data

← Older posts

Newer posts →

	arnebab on 10 things I hate about Gi…
	arnebab on 10 things I hate about Gi…
	arnebab on 10 things I hate about Gi…
	Smitty on 10 things I hate about Gi…
	Igor on 10 things I hate about Gi…