Course Description
It’s not enough to have access to information; it is also necessary to understand how to organise, re-order and shape a dataset depending on the need or the angle of a story. Data cleaning tools allow journalists to correct typos and other errors that can creep into datasets through different data capture methods.
Learning Outcomes
- Identify Data Cleaning as a distinct stage in the Data Pipeline
- Understand how to organise and manage datasets depending on requirements
- Format information according to its purpose
- Use open-source tools to clean data for the purposes of analysis
- Understand structured formats
- Clean typos or data capture errors in an automated environment
- Find and fix inconsistencies in data
- Understand the full functionality of open-source data cleaning software
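
As a rough illustration of what automated cleaning of typos and inconsistencies can look like, here is a minimal Python sketch. It is not the tool used in the course; the city names, the `CANONICAL_CITIES` list, the `clean_city` helper and the 0.8 similarity cutoff are all assumptions made up for this example.

```python
import difflib

# Hypothetical list of correct spellings; in practice this would come
# from a trusted reference source for the dataset being cleaned.
CANONICAL_CITIES = ["Johannesburg", "Cape Town", "Durban"]


def normalise(value):
    """Collapse repeated whitespace, trim the ends and apply title case."""
    return " ".join(value.split()).title()


def clean_city(value, canonical=CANONICAL_CITIES, cutoff=0.8):
    """Map a raw entry to its closest canonical spelling, if one is close enough.

    Entries with no sufficiently similar canonical spelling are only
    normalised, so unknown but valid values are not overwritten.
    """
    tidy = normalise(value)
    matches = difflib.get_close_matches(tidy, canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else tidy


# Made-up raw entries showing typical data capture errors.
raw = ["johannesburg ", "Johanesburg", "cape  town", "Durbn", "Pretoria"]
cleaned = [clean_city(v) for v in raw]
print(cleaned)
```

The cutoff is a judgment call: a lower value corrects more typos but risks merging entries that are genuinely different, so the results should always be reviewed before analysis.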
Intended Audience
Journalists, storytellers, and data wranglers
Course Level
Introductory
Course Length
2 days
Additional Comments
Participants must have access to their own laptop with Excel or OpenOffice/LibreOffice and the Google Chrome browser installed, plus an activated Google account.