I approached my first data set like I imagine one would approach a herd of rabid bison – feeling horribly afraid and hopelessly unprepared.

There was, however, some comfort in “knowing” the bison I was dealing with. I had received an initial data set on refugee migration in 2014 from data wrangler/ninja Daniela Lepiz from Code for South Africa. It was familiar to me because I have an interest in the topic and had done “traditional” journalism stories about it in the past.

But I hadn’t analysed the information in this way before.

Before I got started, I looked to see if there was a more recent data set, which was easy enough because I already knew where to look. A quick search of the United Nations High Commissioner for Refugees’ website brought up data for the first half of 2015, in XLS format.


Knowing that I now had the most recent data available, and in a usable format, I took my first steps by cleaning and then analysing it.

After briefly reading through both data sets, I selected from their 20 tables only those I thought would help answer the questions I had started to ask of the data.

I wanted to know how many refugee applications South Africa received between 2014 and the first half of 2015, how many of those were rejected and how that compared to the rest of the world.

With the relevant tables now in a new worksheet, I removed the columns and rows that weren’t necessary and renamed certain columns so that they were more concise. The data was already quite clean and I didn’t need to use the IF, AND or VLOOKUP functions I was dreading.

I joined tables with the same information from both years and added a “year” column to differentiate them. Using pivot tables, I could analyse the data and find the answers to my questions. They allowed me to filter, sort and visualise the information more easily.
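The join-and-pivot step above was done in a spreadsheet. For anyone curious how the same idea looks in code, here is a rough sketch in Python. It is only an illustration: the column names (`country`, `applied`, `rejected`), the helper functions and every figure are invented placeholders, not the UNHCR data or the workflow actually used for the story.

```python
from collections import defaultdict

# Hypothetical rows standing in for one table from each period.
# Column names and all figures are invented placeholders, not UNHCR data.
table_2014 = [
    {"country": "Country A", "applied": 100, "rejected": 40},
    {"country": "Country B", "applied": 50, "rejected": 10},
]
table_2015_h1 = [
    {"country": "Country A", "applied": 60, "rejected": 30},
    {"country": "Country B", "applied": 20, "rejected": 5},
]


def add_year(rows, year):
    """Return copies of the rows with a 'year' column added."""
    return [{**row, "year": year} for row in rows]


# Stack both tables into one, using "year" to tell the rows apart.
combined = add_year(table_2014, "2014") + add_year(table_2015_h1, "2015 H1")


def pivot(rows, index, column, value):
    """Sum `value` for each (index, column) pair, like a basic pivot table."""
    result = defaultdict(lambda: defaultdict(int))
    for row in rows:
        result[row[index]][row[column]] += row[value]
    return {key: dict(cols) for key, cols in result.items()}


# Rejections per country, split by period.
rejected_by_country = pivot(combined, "country", "year", "rejected")
for country, by_year in rejected_by_country.items():
    print(country, by_year)
```

Swapping `"rejected"` for `"applied"` in the `pivot` call would give the matching table of applications, which is roughly what dragging a different field into a spreadsheet pivot table does.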

Feeling like I wasn’t completely useless, I began to think of what kind of data visualisation I was going to use to illustrate the new numbers-heavy information I had uncovered.

Since my story was about people’s movements between countries, I decided to create a map. A discussion with a visiting editor from Costa Rican newspaper La Nacion, Hassel Fallas, led to a decision to add graphs to the viz.

Together we (mostly her) built an interactive map and graphs showing some of the interesting stats from the data in the Tableau data viz builder. We also made an additional graph for mobile after we realised the map would be too big for anything other than desktop. It taught me to think mobile first when building visualisations.

To make the data more relatable, I decided to find human voices to tell the real story behind the data, and did this in the form of a video I made of two asylum seekers in SA. These voices not only allowed for a multimedia element to the story, but showed the real people behind the figures in the data.

The final challenge with the story was getting the embed codes for the visualisations to work on our publication’s website. The problem was fixed after the developers from Code for South Africa helped our own developers with the issues they were having with the CMS.

And so came the end to my first published data story, and I could breathe again, breathe again…


Matthew Wiebe via Unsplash (Creative Commons Zero)