While you can do a lot with the functions in Excel or Google Sheets, as well as the plethora of data-related tools available, there comes a time when programming can help take your work to the next level.
So, as I said, I've done a little research to figure out what I should focus on. These are three programming languages I'm learning to aid my journalism.
A preview of the RStudio dashboard (Credit: Mararie, Flickr)
R
R has been increasing in popularity among data journalists over the last few years. It's easy to see why: whether you're collecting, cleaning or analysing data, R's a powerful tool that goes beyond what you can do in a spreadsheet.
It's a constantly developing platform, supported by an active community of people willing to help you. RStudio gives you a decent GUI in which to issue the commands that tell R what to do with the data frame you load. Swirl is a good place to begin learning.
As the Data Journalism Handbook says:
It is hard to find any visualisation method or data wrangling technique that is not already built into R. R is a universe in its own, the mecca of visual data analysis. Trained data journalists can use R to analyze huge dataset which extends the limits of Excel.
SQL
Ever had two or more spreadsheets that you wanted to join together? Or a large dataset that you wanted to query? You quickly begin to see the restrictions with what you can do in a spreadsheet - which is where SQL comes in.
The programming language allows you to do several things, including: selecting specific subsets of data that you want to extract; describing the exact changes you want to make; and performing queries across related datasets. You can save your commands as a script, meaning you can document your progress and repeat your steps in the future. Try the w3schools tutorial as a place to start.
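To make that concrete, here is a minimal sketch of joining two "spreadsheets" and querying the result. It runs SQL through Python's built-in sqlite3 module so it works without installing a database. The table names, column names and figures are all made up for illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical "spreadsheets": councils and their spending records.
# All names and numbers here are invented for the example.
conn.executescript("""
    CREATE TABLE councils (council_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE spending (council_id INTEGER, year INTEGER, amount REAL);

    INSERT INTO councils VALUES (1, 'Northtown'), (2, 'Southville');
    INSERT INTO spending VALUES
        (1, 2013, 120.0), (1, 2014, 150.0),
        (2, 2013, 80.0),  (2, 2014, 60.0);
""")

# Join the two tables on their shared ID, keep only the 2014 rows,
# and sort from the biggest spender down.
query = """
    SELECT c.name, s.amount
    FROM councils AS c
    JOIN spending AS s ON s.council_id = c.council_id
    WHERE s.year = 2014
    ORDER BY s.amount DESC;
"""
rows = conn.execute(query).fetchall()
for name, amount in rows:
    print(name, amount)

conn.close()
```

Because the query is saved as text in a script, you can rerun it whenever the data changes, which is the "document your progress and repeat your steps" point in practice.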
Python
Among its many uses, Python is a language that you can use to scrape websites - more powerful than tools such as OutWit Hub. Python scripts can extract data from web pages and documents, allowing you to build datasets in which you can find strong stories.
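As a rough illustration of what scraping looks like, here is a small sketch that pulls the rows out of an HTML table using only Python's standard library. The HTML is a made-up sample pasted into the script; a real scraper would download the page first, and the class name TableScraper is my own, not part of any library.

```python
from html.parser import HTMLParser

# Made-up page content for the example. A real script would fetch this,
# for instance with urllib.request.urlopen(url).read().decode().
SAMPLE_HTML = """
<table>
  <tr><th>Council</th><th>Spend</th></tr>
  <tr><td>Northtown</td><td>150</td></tr>
  <tr><td>Southville</td><td>60</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # the row currently being read, if any
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


scraper = TableScraper()
scraper.feed(SAMPLE_HTML)

# Treat the first row as the header and turn the rest into records.
header, *records = scraper.rows
dataset = [dict(zip(header, record)) for record in records]
print(dataset)
```

Most people doing this seriously reach for third-party packages rather than the bare standard library, but the idea is the same: read the page, pick out the bits you want, and save them as structured data.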
Gregor Aisch, of Open Knowledge Foundation, has said:
Python is a wonderful open source programming language which is easy to read and write (e.g. you don’t have to type a semi-colon after each line). More importantly, Python has a tremendous user base and therefore has plugins (called packages) for literally everything you need.
Once again, there are several tutorials out there for learning this language.
So there are my three: R, SQL and Python. Together they give you the skills to conduct great data journalism, from sourcing to scraping, from cleaning to visualising.
Got any more that you think I should include? Tweet me.