Twenty-eight data sets: that is how much data we had to look through to get an interesting slice of insight into South African professionals leaving the country to work abroad.

To say that data journalism is the next critical form of storytelling in reporting and that it will eventually transform the way all newsrooms operate would be an understatement. It is, in fact, happening now.

My first piece of data journalism started with data on global migration from the United Nations and the Organisation for Economic Co-operation and Development (OECD).

Daniela Quiros Lepiz, a Costa Rican data journalist at Code for South Africa, had previously written about migration patterns in postcolonial Africa, noting the impact of democracy in third world and developing countries on their growing pool of skills.

My task was to find stories within a collection of data sets from the two main sources on migration and how it is affecting South Africa. Lepiz sent me 42 data sets which held the key to this insight.

Slightly intimidated by the sheer mass of the data, I nervously asked her how many of the data sets I would have to use in the formulation of the story itself.

“All of them,” she replied casually.

“FML,” I sighed under my breath. “FML”.


What I needed, first and foremost, was the foresight to know what I was looking for in the migration data sets before I started looking at them. That was easy. I wanted to know where South Africans were going and how many were going to each country.

The United Nations’ dataset on migrant stock gave us insight into South Africa’s migration trends. We filtered the multiple-sheet dataset down to South Africa’s migration numbers and the countries South Africans migrated to.

Working in Excel, we used a “sum” formula to find the countries receiving the most South Africans. We then used an “if” statement and conditional formatting to flag the countries whose migration numbers from South Africa increased between 2010 and 2013.
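For readers who prefer code to spreadsheets, the same filter, sum and if steps can be sketched in Python. This is a minimal illustration rather than the workbook we actually built: the row layout, the function names and every figure below are invented placeholders, and summing the stock across years is only one reading of what the “sum” formula covered.

```python
from collections import defaultdict

# Placeholder rows: (origin, destination, year, migrant stock).
# The figures are invented for illustration, not taken from the UN dataset.
ROWS = [
    ("South Africa", "Country A", 2010, 100),
    ("South Africa", "Country A", 2013, 150),
    ("South Africa", "Country B", 2010, 80),
    ("South Africa", "Country B", 2013, 60),
    ("Elsewhere", "Country A", 2010, 999),
    ("Elsewhere", "Country A", 2013, 999),
]


def stock_by_destination(rows, origin):
    """Keep one origin's rows and index the stock by destination and year."""
    table = defaultdict(dict)
    for row_origin, destination, year, stock in rows:
        if row_origin == origin:  # the spreadsheet filter step
            table[destination][year] = stock
    return dict(table)


def summarise(table, start=2010, end=2013):
    """Return (destination, total, increased) tuples, largest total first."""
    summary = []
    for destination, by_year in table.items():
        total = sum(by_year.values())  # plays the role of Excel's SUM
        increased = by_year.get(end, 0) > by_year.get(start, 0)  # Excel's IF
        summary.append((destination, total, increased))
    return sorted(summary, key=lambda item: item[1], reverse=True)


SUMMARY = summarise(stock_by_destination(ROWS, "South Africa"))
for destination, total, increased in SUMMARY:
    print(destination, total, "increased" if increased else "did not increase")
```

The `increased` flag stands in for the conditional formatting: it is the value a map or chart could use to colour each country.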

The result allowed us to create this heatmap:

Map: Migration Stock Heatmap

Finding out the migration patterns of South Africa’s highly qualified and skilled professionals involved lifting OECD data from fourteen datasets and adding them to an Excel document with multiple sheets.

The data covered migration from various regions of the world to all OECD nations. We filtered out the data on migration from other countries, leaving only South Africa’s migration patterns.

We used Excel’s “sum” formula to measure overall emigration by South Africans to all countries. This reduced the chance of anomalies falling through the cracks in the OECD data on highly qualified and skilled migrants.

Map: South Africa’s Skills Leak

We found that close to 50,000 highly skilled and highly qualified South African professionals left South Africa to work elsewhere in the period captured by the OECD data as of September 2015. The lion’s share of the migrants went to the United Kingdom and included teachers, legislators and health specialists.

It took three separate datasets to create the Excel spreadsheet containing the total migration outlook from 2010 to 2013. Once that was done, a pivot table was created to illustrate the largest remittance source for South Africa.
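For anyone who has not met a pivot table, here is roughly what one does, sketched in Python. The records, the country names and the `pivot` function are hypothetical stand-ins, not our actual spreadsheet: the idea is simply to group rows by country, spread the years across columns, and sum the amounts in each cell.

```python
from collections import defaultdict

# Placeholder records: (source country, year, remittance amount).
# All values are invented for illustration.
RECORDS = [
    ("Country A", 2010, 10.0),
    ("Country A", 2013, 30.0),
    ("Country B", 2010, 25.0),
    ("Country B", 2013, 5.0),
    ("Country A", 2013, 20.0),
]


def pivot(records):
    """Rows are countries, columns are years, cells are summed amounts."""
    table = defaultdict(lambda: defaultdict(float))
    for country, year, amount in records:
        table[country][year] += amount
    return {country: dict(by_year) for country, by_year in table.items()}


PIVOT = pivot(RECORDS)

# Row totals show which country is the largest source overall.
ROW_TOTALS = {country: sum(by_year.values()) for country, by_year in PIVOT.items()}
LARGEST_SOURCE = max(ROW_TOTALS, key=ROW_TOTALS.get)
print(LARGEST_SOURCE, ROW_TOTALS[LARGEST_SOURCE])
```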

Using a scraping tool called Kimono, we scraped remittances data from an OECD and UN collaborative project, converting it from a PDF document into an Excel document. (Scraping means extracting data from one format into another that is easier to analyse.) This allowed us to take the remittance amount from a country and divide it by the number of South African migrants there.

While the United Kingdom had the highest total remittances to SA, countries like the United States had the highest annual remittances per migrant.
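The difference between the biggest total and the biggest per-migrant figure is easy to see in a small Python sketch. The numbers and country names here are made up purely to show the arithmetic, and `remittance_per_migrant` is a hypothetical helper, not part of any tool we used.

```python
# Placeholder figures, invented for illustration only.
REMITTANCES = {"Country A": 1_000_000.0, "Country B": 600_000.0}
MIGRANTS = {"Country A": 5_000, "Country B": 1_000}


def remittance_per_migrant(remittances, migrants):
    """Divide each country's remittance total by its migrant count."""
    per_migrant = {}
    for country, amount in remittances.items():
        count = migrants.get(country, 0)
        if count > 0:  # skip countries with no migrant count (no division by zero)
            per_migrant[country] = amount / count
    return per_migrant


PER_MIGRANT = remittance_per_migrant(REMITTANCES, MIGRANTS)
LARGEST_TOTAL = max(REMITTANCES, key=REMITTANCES.get)
LARGEST_PER_MIGRANT = max(PER_MIGRANT, key=PER_MIGRANT.get)
print(LARGEST_TOTAL, LARGEST_PER_MIGRANT)
```

In this toy example Country A sends the most money overall, but each migrant in Country B sends more on average, which is the same kind of split we saw in the real data.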

The data did have its limits. The OECD data only measured migration into OECD nations and the UN data did not have clearly defined periods for the migration which it measured.

But through combining the information from both, we were able to come to some fascinating conclusions on South Africa’s migration patterns and trends.

The result is an article and a tool which could potentially be used in policy formulation and research relating to migration, skills and the global work market. And there you have it.

I learned a lot from my first data adventure. I learned that you cannot put a spreadsheet on a sandwich and that pizza toppings don’t go on a database. But what data can give you is food for thought, and more.


Tim Gouw via Unsplash (Creative Commons Zero)
Migration Stock Heatmap – CartoDB
South Africa’s Skills Leak – CartoDB