What is curation:
The active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation enables data discovery and retrieval, maintains data quality, adds value, and provides for re-use over time through activities including authentication, archiving, management, preservation, and representation.
Johnston, Lisa R. Curating Research Data, Volume One: Practical Strategies for Your Digital Repository. 1 edition. Chicago, Illinois: Amer Library Assn, 2017.
What WashU curators do:
Advise on repository selection
Review data and documentation
Provide suggestions to improve the FAIRness (findability, accessibility, interoperability, and reusability) of your data
Help you prepare a readme file to accompany your data
WashU Curation Workflow:
You can get started with WashU curators in either of two ways:
1. By requesting a consultation. In a consultation we can:
2. By submitting data to our repository. This automatically initiates curation:
A readme file provides structured information about the data, project, and creator
It helps re-users understand the limitations of the dataset
It helps re-users understand how to use the data appropriately
It’s a place to collect all the relevant identifiers to help connect the data to authors, published results, institutions and funders
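As a rough illustration of the points above, a readme can be treated as a small structured record that is rendered to plain text. The sketch below is only one possible layout: the `Readme` class, its field names, and every placeholder value are assumptions made for illustration, not a WashU template.

```python
from dataclasses import dataclass, field


@dataclass
class Readme:
    """Hypothetical readme layout; every field name here is illustrative."""
    title: str
    creators: list[str]
    project: str
    limitations: str
    usage_notes: str
    # Identifiers that tie the data to authors, results, institutions, funders.
    identifiers: dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        """Return the readme as plain text, one labeled section per field."""
        lines = [
            f"Title: {self.title}",
            f"Creators: {'; '.join(self.creators)}",
            f"Project: {self.project}",
            "",
            "Limitations:",
            self.limitations,
            "",
            "How to use these data:",
            self.usage_notes,
            "",
            "Related identifiers:",
        ]
        lines += [f"- {label}: {value}" for label, value in self.identifiers.items()]
        return "\n".join(lines) + "\n"


# All values below are placeholders, not real identifiers.
example = Readme(
    title="Example dataset title",
    creators=["Example Author"],
    project="Example project name",
    limitations="Placeholder: describe gaps, known errors, and sampling limits.",
    usage_notes="Placeholder: describe units, file layout, and required software.",
    identifiers={
        "Author ORCID": "(placeholder)",
        "Related publication DOI": "(placeholder)",
        "Funder award number": "(placeholder)",
    },
)
readme_text = example.render()
print(readme_text)
```

Keeping the identifiers in one mapping makes it easy to see at a glance whether the links to authors, publications, and funders have all been filled in.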
The Office of Science and Technology Policy charged a subgroup with writing a report on the Desirable Characteristics of Data Repositories for Federally Funded Research. Below is an overview of the recommendations. For the WashU institutional repository, we can address these characteristics directly; for general and domain repositories there are many options to choose from, so the answers are less specific. Here is a desirable characteristics checklist, which you can copy to help you evaluate whether a data repository meets these characteristics. Check out https://www.re3data.org/ to explore data repositories.
Category | Guidance | Institutional (WashU) | General | Domain |
Organizational Infrastructure | Free and Easy Access | yes | varies | varies |
Organizational Infrastructure | Clear Use Guidance | yes | varies | varies |
Organizational Infrastructure | Risk Management | Yes and.... | varies | varies |
Organizational Infrastructure | Retention Policy | yes | varies | varies |
Organizational Infrastructure | Long-term Organizational Sustainability | yes | varies | varies |
Technology | Authentication | yes | yes | yes |
Technology | Long-term Technical Sustainability | yes | varies | varies |
Technology | Security and Integrity | yes | varies | varies |
Digital Object Management | Unique Persistent Identifiers | yes | varies | varies |
Digital Object Management | Metadata | yes | yes | yes |
Digital Object Management | Curation/Quality Assurance | yes | usually not | varies |
Digital Object Management | Broad and Measured Reuse | yes | varies | varies |
Digital Object Management | Common Format | yes | varies | varies |
Digital Object Management | Provenance | yes | varies | varies |
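If you prefer to track the checklist in a script rather than a copied document, the evaluation can be sketched as a simple lookup. The characteristic names below come from the table above; the function name `unmet_characteristics` and the answers for the candidate repository are hypothetical and only show the mechanics.

```python
# Characteristic names taken from the table above, in table order.
CHARACTERISTICS = [
    "Free and Easy Access",
    "Clear Use Guidance",
    "Risk Management",
    "Retention Policy",
    "Long-term Organizational Sustainability",
    "Authentication",
    "Long-term Technical Sustainability",
    "Security and Integrity",
    "Unique Persistent Identifiers",
    "Metadata",
    "Curation/Quality Assurance",
    "Broad and Measured Reuse",
    "Common Format",
    "Provenance",
]


def unmet_characteristics(answers: dict[str, bool]) -> list[str]:
    """List characteristics not confirmed as met.

    A characteristic with no recorded answer counts as unmet, so the result
    doubles as a to-do list of items still to verify.
    """
    return [name for name in CHARACTERISTICS if not answers.get(name, False)]


# Hypothetical answers for an imaginary candidate repository.
candidate = {name: True for name in CHARACTERISTICS}
candidate["Retention Policy"] = False   # assumed: no published retention policy
del candidate["Provenance"]             # assumed: not yet checked

gaps = unmet_characteristics(candidate)
print(gaps)
```

Treating unanswered items as unmet is a deliberate choice here: a repository only passes once every characteristic has been checked explicitly.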
The NIH Data Management and Sharing (DMS) policy took effect on January 25, 2023.
Where does NIH want data shared?
Under the 2023 Data Management and Sharing (DMS) policy, NIH encourages investigators to use an established repository.
When selecting a repository, investigators should choose based on factors such as the sensitivity of the data, the size and complexity of the dataset, and the volume of requests anticipated.
What does not meet data sharing requirements:
A personal website
A repository you built
“Available on request”
Social networking (e.g., academia.edu)
Choosing a Repository: Some programs mandate a specific repository. For those that don't:
Make a copy of this document, which is based on the NIH checklist, and make sure you can check off all of the listed characteristics.
Consider:
What should I include in my NIH DMS plan:
Element 1: Data Type (describe the data collected, the data to share, the documentation of methods, and metadata)
Element 2: Related Tools, Software and/or Code (share any specialized tools or software, if possible)
Element 3: Standards (use community standards for data and metadata, e.g., DataCite)
Element 4: Data Preservation, Access, and Associated Timelines (where it will be shared, how it will be found, when it will be shared, how long it will be retained)
Element 5: Access, Distribution, or Reuse Considerations (licenses and documentation such as codebooks, data dictionaries, etc.)
Element 6: Oversight of Data Management and Sharing (roles and responsibilities)
How FAIR is your data?