21 Feb 2016

Mapping with CartoDB: Solutions to problems faced by the journalist user

CartoDB is a great mapping tool for journalists. I've used it personally and professionally, to help tell the data-driven stories I produce.

It can be used to visually improve the stories you seek to tell, using interactive maps to help readers engage with your stories. You can produce these maps with no coding knowledge, all with the simple upload of an Excel file to the website, which will do all the hard work for you.

As soon as you upload your data, CartoDB will often show you a map of it immediately, which you can then easily customise. This customisation can vary from simply changing the map type or information window, to editing the CSS to play with how the map shows your data.

But CartoDB is not perfect. There are drawbacks to using this tool, which I'll try to explain below, along with some ways to get around them.

Area names not matching

In the cleaning process, data journalists know that they have to look at their data's consistency - and this applies to area names.

Unless you're mapping large, major areas, such as countries, the chances are that you'll have to merge datasets - matching up area names in your dataset with shape files you've had to download yourself (often in the form of .kml files).

Often, when mapping UK constituencies or local authorities (shape file here), area names can prove an issue, as there are different ways to spell or present them. The text strings that you have in your data (even if it's downloaded from government websites) may not match up with the text strings in the shape file you downloaded - and so you won't be able to merge them.

This can be the difference between having "York Central" and "Central York"; "Wyre and Preston North" and "Wyre & Preston North"; "Weston-Super-Mare" and "Weston Super Mare". If any of these inconsistencies are present between your two spreadsheets when you come to merge them in CartoDB, you will have gaps in your map.

This means you have to be careful when you are preparing your dataset, before you upload it to CartoDB. Look at your shape file and your dataset, and see whether the two match up for area names. Checking will save you time later. Excel's IF function could be useful here, to ask whether your two columns are the same (once you have brought the two datasets together for your test).
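
For larger datasets, the same check can be scripted. The sketch below is one possible approach in Python, not something CartoDB provides: the function names and the normalisation rules (lower-casing, treating "&" as "and", treating hyphens as spaces) are assumptions for illustration. It lists the names that appear in one file but have no counterpart in the other, so they can be fixed before upload.

```python
import re


def normalise(name):
    """Reduce an area name to a comparable form (assumed rules, adjust as needed)."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[-_]", " ", name)    # hyphens and underscores become spaces
    name = re.sub(r"[^\w ]", "", name)   # drop remaining punctuation
    return " ".join(name.split())        # collapse repeated whitespace


def unmatched_names(data_names, shape_names):
    """Return the names from each list with no normalised match in the other."""
    data_keys = {normalise(n) for n in data_names}
    shape_keys = {normalise(n) for n in shape_names}
    only_in_data = sorted(n for n in data_names if normalise(n) not in shape_keys)
    only_in_shape = sorted(n for n in shape_names if normalise(n) not in data_keys)
    return only_in_data, only_in_shape


data = ["York Central", "Wyre & Preston North", "Weston-Super-Mare"]
shapes = ["Central York", "Wyre and Preston North", "Weston Super Mare"]
only_in_data, only_in_shape = unmatched_names(data, shapes)
print(only_in_data)
print(only_in_shape)
```

Word-order differences such as "York Central" and "Central York" are deliberately left for a human to resolve, since reordering words automatically could pair up the wrong areas.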

Alternatively, using area codes sidesteps this problem altogether - avoiding the possibility of variations in human input and making your work more reliable.

Area codes are unique and less susceptible to human error, and are therefore the best option to use when geolocating.
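
As a rough illustration of matching on codes rather than names, the Python sketch below joins data rows to shape-file records and reports any codes left without a partner. The codes, field names and figures are made-up placeholders, not real identifiers or statistics, and the function name is hypothetical.

```python
def join_on_code(data_rows, shape_rows, key="area_code"):
    """Attach each data row to the shape record sharing its code.

    Returns (joined, missing): the joined rows, plus the codes found in the
    shape file that have no data and would show as gaps on the map.
    """
    # Trim whitespace and upper-case codes so trivial differences don't matter.
    data_by_code = {row[key].strip().upper(): row for row in data_rows}
    joined, missing = [], []
    for shape in shape_rows:
        code = shape[key].strip().upper()
        if code in data_by_code:
            joined.append({**shape, **data_by_code[code], key: code})
        else:
            missing.append(code)
    return joined, missing


# Placeholder records: codes and values are invented for illustration.
data_rows = [
    {"area_code": "x001", "turnout": 61.2},
    {"area_code": "X002 ", "turnout": 58.9},
]
shape_rows = [
    {"area_code": "X001", "name": "Central York"},
    {"area_code": "X002", "name": "Weston Super Mare"},
    {"area_code": "X003", "name": "Wyre & Preston North"},
]
joined, missing = join_on_code(data_rows, shape_rows)
print(len(joined), missing)
```

Because the join never looks at the name column, the spelling of each area no longer matters; only a missing code leaves a gap.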

Regional differences

Living in the UK? In England or Wales? Scotland? Northern Ireland? Thanks to devolution, each of these regions has a different statistical agency, and so maps comparing a variable across the whole of the UK are rare.

This can be an issue if you're writing for a British newspaper, wanting to show differences across the whole country. You may get shape files and data for English regions, and perhaps Scotland and Wales - but often Northern Ireland may be missing.

Feel free to email me if you can't get a shape file for Northern Ireland - I have one somewhere. You may have to add this as a different layer on your map, if you don't wish to merge the files yourself (this can have implications for the map's size).

Even if you manage to get the shape files, consistent data across the whole of the UK can be rare. Be wary of comparing regional differences from different spreadsheets. These datasets may not be comparable, and so it's always best to try to get all the data you want from one source for reliability.

As Simon Rogers says in his book Facts are Sacred:
"It is often easier to get across statistics from European countries via Eurostat than it is to get figures for the whole of the UK at a local level. That is because Eurostat has a single operation to combine data from across the European Union into single accessible datasets by coordinating all the national statistics agencies.
"The UK, with increasingly disparate data sources, needs that now. And it's kind of what we expect from the Office for National Statistics. The title says it all."
A choropleth map of the UK without Northern Ireland is an all too common sight

Mobile responsiveness

CartoDB maps are easy to embed in frames on your website, but they can be tricky to view on mobile. A map can often take over the whole screen, making it hard to scroll past to get to the rest of your story.

The map itself can also show too much information, or have the wrong focus, which means it's actually just confusing for your mobile audience, instead of helping them understand your story.

Advice to combat this would include:
  • Check the dimensions that are described in your iframe code. If the height is too much, the map can be hard to scroll past on mobile.
  • Keep the map free of clutter, such as lots of shapes, lines or dots. On mobile, too many of these can make the map unusable. 
  • The same goes for CartoDB's optional add-ons, such as sharing options and a search map. Unless they're important for your story, cut them.
  • If your information windows have lots of information, they can dominate the screen when the reader clicks on them. This can crowd out what is underneath the window, which may be important context, and can hinder the reader. Omit all but essential information, and reserve what else you wish to tell for elsewhere in your story. 
CartoDB has now teamed up with Nutiteq, described as "pioneers in native mobile mapping", which could see developments in how their maps are viewed and engaged with on our smaller handheld screens.

Other geocoding issues

Other visual journalists I have spoken to have mentioned that it can sometimes be tricky to geocode data in CartoDB. There are options available for this in the tool, but they can be difficult.

I would suggest you geocode your data before entering it into the CartoDB platform. Either line up your shape file and check its suitability with your data, or create longitude and latitude columns before uploading your dataset.

This geocoding resource is invaluable: you can input a list of addresses into it, and it will automatically return the longitude and latitude of each point for you. It will also present them on a map for you, so you can quickly check the geographical distribution (and have a quick, first check to make sure it's accurate).
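
A quick automated version of that first check is sketched below in Python. It assumes the data is meant to fall within the UK and uses a rough bounding box chosen for illustration rather than an official boundary; the function name and sample points are likewise hypothetical.

```python
# Rough UK bounding box in decimal degrees (approximate, for illustration only).
UK_LAT = (49.8, 60.9)
UK_LON = (-8.7, 1.8)


def check_point(lat, lon):
    """Classify a coordinate pair as 'ok', 'swapped' or 'outside'."""
    def in_box(la, lo):
        return UK_LAT[0] <= la <= UK_LAT[1] and UK_LON[0] <= lo <= UK_LON[1]

    if in_box(lat, lon):
        return "ok"
    if in_box(lon, lat):
        return "swapped"  # latitude and longitude columns look reversed
    return "outside"


points = [
    ("York", 53.96, -1.08),
    ("Reversed columns", -1.08, 53.96),
    ("Zero coordinates", 0.0, 0.0),
]
for label, lat, lon in points:
    print(label, check_point(lat, lon))
```

Anything flagged as "swapped" or "outside" is worth looking at by hand before the dataset goes anywhere near CartoDB.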

When merging spreadsheets by a column in order to geolocate your data against a shape file, make sure that the area names match up exactly

The restrictions of a third-party tool

Using a tool you haven't created yourself has obvious restrictions, in that you haven't planned and developed it with your specific needs in mind. You won't be able to do everything you want with its browser tool.

There are ways to get around this, however. There are several ways to improve your maps in the tool, such as using HTML to make the data inside the information windows flow better, as well as filter and SQL query options.

CartoDB is also much more than just a browser tool, and is available as open source, so you can tailor it to your own ends. There is the CartoDB.js library, which has several uses, such as wrapping its APIs into complete visualisations.
