June was a wonderful month of failing at entirely new things, like scraping and cleaning data and trying to make a heatmap.
And, in keeping with the Disruption Age mantra of ‘Fail Fast’, we got through it all at quite a snappy pace.
The first two (scraping, cleaning) are modern endeavours that proved to be exactly as tedious as they sound, but they are priceless skills for consolidating information from the various messy incarnations in which it lands on your desk or in your inbox into a format that allows systematic analysis and, in turn, forms the foundation for visualisations.
The heatmap was, in this case, the final step, serving as both trend visualisation and engagement tool that lets readers explore the data analysis for themselves. Using CartoDB, the heatmapping process is largely automated once you’re past a certain point, which to me sums up what I’ve learnt about data-driven storytelling in my first month here at Code for South Africa, where I am doing a three-month course in data journalism: it takes a lot of time and work to reach what, a month ago, I might have been tempted to call the beginning of a story.
Example: On landbou.com, I read about a guy who had used information from the Deeds Office to create a database of all “agricultural land” traded in SA over a period of 12 months, and wondered if there would be any big surprises if you looked at that information per province. I added data about average price per hectare as a trusty potential news angle.
My job was to import those tables into Excel and combine the data from all provinces into one clean dataset, deleting categories that I wouldn’t be looking at to de-clutter the whole business. Then, using pivot tables in Excel, I could put a few basic questions to the data: Which district saw the most trades? Which province saw the most trades? Where was the highest average price per hectare, and how much was it? I picked an angle (that Marikana is abuzz with land deals) and pitched it to Netwerk24’s news desk, who used the story in the business section.
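For anyone who would rather script those pivot-table questions than click through Excel, here is a rough sketch in Python. It is not what I actually did (I used Excel), and every province, district and figure in it is an invented placeholder, not the Deeds Office data. The field names (`district`, `trades`, `avg_price_per_ha`) are assumptions too.

```python
from collections import defaultdict

# Hypothetical per-province tables. All names and numbers are invented
# placeholders for illustration, NOT the real land-trade data.
provincial_tables = {
    "Province X": [
        {"district": "District A", "trades": 12, "avg_price_per_ha": 8000.0},
        {"district": "District B", "trades": 30, "avg_price_per_ha": 5500.0},
    ],
    "Province Y": [
        {"district": "District C", "trades": 7, "avg_price_per_ha": 21000.0},
    ],
}


def combine(tables):
    """Flatten the per-province tables into one list of rows, tagging each row with its province."""
    rows = []
    for province, table in tables.items():
        for row in table:
            rows.append({"province": province, **row})
    return rows


def trades_per_province(rows):
    """Sum the number of trades per province (the pivot-table step)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["province"]] += row["trades"]
    return dict(totals)


rows = combine(provincial_tables)

# Which district saw the most trades?
busiest_district = max(rows, key=lambda r: r["trades"])

# Which province saw the most trades?
province_totals = trades_per_province(rows)
busiest_province = max(province_totals, key=province_totals.get)

# Where was the highest average price per hectare?
priciest_district = max(rows, key=lambda r: r["avg_price_per_ha"])
```

Each of the three questions becomes a one-line lookup once the provinces sit in a single list, which is really the whole point of the clean-up step.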
Although I used the written story and infographics for most of the heavy lifting about the findings in the data, the appeal of adding heatmaps was really to allow readers to explore in much more detail, as I realised the Marikana angle was only so interesting. A lot of farmers would be saying: “Hang on, let me see what happened in Underberg in KZN (and let me know if you know how prices got driven that high)?”
In order to plot the stats in the relevant districts, I needed the x and y co-ordinates of those towns. You can get these with a search formula in Refine that picks up the coordinates from Google Maps; the expression that reads the result starts like this:
value.parseJson().results[0].
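To show what that expression is doing, here is a small Python sketch of the same idea: parse the JSON text that comes back and take the first result. It assumes the response has the shape Google’s Geocoding API documents (a `results` list whose entries hold `geometry.location.lat` and `geometry.location.lng`); the function name and the sample response with its coordinates are made up for illustration, and this is not the Refine formula itself.

```python
import json


def extract_lat_lng(response_text):
    """Return (lat, lng) from the first geocoding result, or None if nothing was found.

    Assumes a response shaped like Google's Geocoding API output:
    {"results": [{"geometry": {"location": {"lat": ..., "lng": ...}}}], ...}
    """
    data = json.loads(response_text)
    results = data.get("results", [])
    if not results:
        # The town could not be geocoded; leave it for manual checking.
        return None
    location = results[0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Invented sample response with placeholder coordinates, for illustration only.
sample_response = json.dumps({
    "results": [{"geometry": {"location": {"lat": -25.0, "lng": 27.0}}}],
    "status": "OK",
})

coordinates = extract_lat_lng(sample_response)
```

Returning None for an empty result matters more than it looks: town names that are misspelt or ambiguous come back with no match, and you want those flagged, not silently plotted in the wrong place.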
I’ve made a demo video of how I made the map which you can watch here.
Once you’ve geo-located the information, you can export that new dataset from Google Refine (a tool used for cleaning data) to CartoDB or Tableau, where you can play around with colours and templates for visualisations. You should also check that the embedding of the map works on the content management system of the site where you will publish the story. In our case the user experience was ruined a bit by the fact that on mobile it’s hard to navigate past the CartoDB map: you end up scrolling down deeper and deeper into the Atlantic Ocean inside the map.
And so July came along and the plan was to start telling the shocking story of the extent to which South African mothers drink during pregnancy, but then I got stomach pain…
Appendix: Small data
After a strong start followed by an indifferent 37-year innings, my appendix, formerly associated with Media24’s Afrikaans news division, on Monday the 4th of July became the first member of the Code for South Africa Data Journalism Academy’s second cohort to drop out and down web tools indefinitely.
It was last seen around 14.00 in the Cape Town Mediclinic.
Now, three small scars on my lower abdomen speak to the wonders of science and progress, while the itemised bill speaks to the horror of life without a comprehensive medical fund.
The surgeon who was tasked with removing the infected organ was singing the praises of the News24 app as I was being prepared for the procedure by a nurse named Hugo. The doc spoke with great urgency and surprisingly deep knowledge of monetising digital content in the mobile age, when the drugs knocked me out and I lost consciousness at hearing the word ‘paywall’, allowing me rare insight into what it must feel like to be an elderly board member of a modern media company in the teeth of disruption.
Reading up on the appendix, one discovers a larger, cautionary tale of not over-estimating one’s importance to those around you in this life: “Although the appendix is a part of your gastrointestinal tract, it’s a vestigial organ. This means that it provides no vital function and that you may live a normal, healthy life without it,” reports healthline.com.
“Indeed, we’re all an appendix to someone,” I share with nurse Hugo, who fails to respond.
The data angle on this story is thin, other than to say: one down, none to go.