The growing volume of data, the democratization of data, and the proliferation of data tools have increased the need for data literacy, both among the general population and within organizations. Data literacy is the ability to find data, evaluate its source and quality, understand it, manipulate it, ask questions of it, make an argument from it, and assess the arguments of others.
Rahul Bhargava of MIT and Catherine D'Ignazio of Emerson College (Knight, 2017) divide data literacy into four components:
Reading the data: Comprehending data in various forms and being able to read the language of data.
Working with the data: How people work with data depends on their role. A student, a statistician, and a data visualization expert each work with data in a different way.
Analyzing the data: Applying the skills suited to the goal of the analysis, which might range from computing basic summary statistics to building machine learning models.
Arguing with the data: Using data to support your idea or research.
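As a small illustration of the "basic summary statistics" end of the analysis range, the sketch below uses only Python's standard library. The `daily_views` numbers and the `summarize` helper are made up for this example and do not come from Bhargava and D'Ignazio's work.

```python
import statistics

# Hypothetical sample: daily page views for one week (made-up numbers).
daily_views = [120, 135, 128, 160, 142, 95, 110]

def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

summary = summarize(daily_views)
print(summary)
```

Even a summary this small touches several of the components above: reading what each number represents, working with it in a form that suits the task, and producing figures that can support an argument.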
Examples of data literacy skills
Stage 1: Pre-project
Idea - You have generated an idea, found collaborators, and begun thinking about what to do.
Planning - This essential stage impacts every other stage.

Stage 2: Active
Collection - You gather data and other materials key to the project.

Stage 3: Explore
Wrangle - You prepare the data for analysis.

Stage 4: Results
Visualize - You create graphical representations of numbers and examine how to communicate the structures in the data.
Interpret - You articulate your preferences and what the relationships in the data tell us.

Stage 5: Post-project
Sharing - You prepare data for long-term access and preservation.
Reuse - Your well-documented data can be used in further research.
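
To make the Wrangle step in Stage 3 more concrete, here is a minimal sketch of preparing a small table for analysis, using only Python's standard library. The `raw` data and the `wrangle` helper are hypothetical, and dropping incomplete rows is just one possible cleaning choice; a real project might instead flag or correct such rows.

```python
import csv
import io

# Hypothetical raw survey export (made-up values): stray whitespace,
# a missing age, and a non-numeric age.
raw = """name,age
 Ana ,34
Ben,
Chen,forty
Dara, 29
"""

def wrangle(text):
    """Return cleaned rows, skipping any row without a usable age."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        name = row["name"].strip()
        age = (row["age"] or "").strip()
        if not age.isdigit():
            continue  # drop rows with missing or non-numeric ages
        cleaned.append({"name": name, "age": int(age)})
    return cleaned

rows = wrangle(raw)
print(rows)
```

Recording which rows were dropped and why is part of the documentation that makes the later Sharing and Reuse stages possible.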