How to clean up your data?
Raw data is generally messy. Most projects require a substantial investment in ‘cleaning’ empty records, duplicates, unidentifiable inputs, and other anomalies that make it harder to discern patterns in the data.
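As a rough illustration of what this kind of cleaning can look like in practice, here is a minimal Python sketch. It assumes the records are already loaded as a list of dictionaries, and the field names (`name`, `country`) and the `clean_records` helper are hypothetical, chosen only for this example. It drops empty records, removes duplicates that differ only in case or surrounding whitespace, and sets aside records that lack an identifying field so they can be reviewed by hand.

```python
def clean_records(records, required=("name",)):
    """Return (cleaned, flagged) lists built from a list of dict records.

    `required` lists the fields a record needs in order to be identifiable;
    the default field name is an assumption made for this example.
    """
    cleaned, flagged, seen = [], [], set()
    for record in records:
        # Normalize values: treat None as empty and strip stray whitespace.
        row = {k: ("" if v is None else str(v).strip()) for k, v in record.items()}

        # Skip records in which every field is empty.
        if not any(row.values()):
            continue

        # Skip duplicates, comparing values case-insensitively.
        key = tuple(sorted((k, v.casefold()) for k, v in row.items()))
        if key in seen:
            continue
        seen.add(key)

        # Set aside records that are missing a required field.
        if any(not row.get(field) for field in required):
            flagged.append(row)
        else:
            cleaned.append(row)
    return cleaned, flagged


rows = [
    {"name": "Acme Corp ", "country": "US"},
    {"name": "acme corp", "country": "us"},   # duplicate of the first row
    {"name": "", "country": ""},              # empty record
    {"name": None, "country": "FR"},          # no identifying name
    {"name": "Globex", "country": "DE"},
]
cleaned, flagged = clean_records(rows)
print(cleaned)
print(flagged)
```

Flagged records are kept separate rather than deleted, because an unidentifiable input can sometimes be recovered by checking it against the original source.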
A number of resources can help you understand this process and make problems easier to spot.
To learn more about data cleaning techniques and tips, see:
- The Data Journalism Handbook has a section on cleaning messy data.
- The Online Journalism Blog also has blog posts on various aspects of data cleaning.
Have you applied or developed a practice that you would like to share with the influence mapping community? Edit this post on GitHub!