1. Submit to the WURD Repository
2. The WURD director assigns the submission to a local or Data Curation Network (DCN) curator
3. Curator reviews the data and documentation following the DCN CURATED steps
4. Curator sends recommendations to increase FAIRness
5. You send back revisions
6. Curator marks curation complete
7. WURD director evaluates FAIRness and publishes the dataset
8. You receive notice of published dataset with
9. WURD staff take preservation actions
What is curation:
The active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation enables data discovery and retrieval, maintains data quality, adds value, and provides for re-use over time through activities including authentication, archiving, management, preservation, and representation.
Johnston, Lisa R. Curating Research Data, Volume One: Practical Strategies for Your Digital Repository. 1 edition. Chicago, Illinois: Amer Library Assn, 2017.
What WashU curators do:
Advise on repository selection
Review data and documentation
Provide suggestions to improve the FAIRness of your data
Help you prepare a readme file to accompany your data
The Office of Science and Technology Policy charged a subgroup to explore and write a report on the Desirable Characteristics of Data Repositories for Federally Funded Research. Below is an overview of the recommendations. For WashU, we can address these characteristics directly; for the other categories, there are many repositories to choose from, so it is harder to be specific. Here is a desirable characteristics checklist, which you can copy to help you evaluate whether a data repository meets these characteristics. Check out https://www.re3data.org/ to explore data repositories.
|Free and Easy Access||yes||varies||varies|
|Clear Use Guidance||yes||varies||varies|
|Risk Management||Yes and....||varies||varies|
|Long-term Organizational Sustainability||yes||varies||varies|
|Long-term Technical Sustainability||yes||varies||varies|
|Security and Integrity||yes||varies||varies|
|Unique Persistent Identifiers||yes||varies||varies|
|Curation/Quality Assurance||yes||usually not||varies|
|Broad and Measured Reuse||yes||varies||varies|
|Digital Object Management|
Under the Data Management and Sharing (DMS) policy, effective January 25, 2023, NIH encourages investigators to use an established repository.
Where does NIH want data shared?
When selecting a repository, investigators should choose based on factors such as the sensitivity of the data, the size and complexity of the dataset, and the volume of requests anticipated.
What does not meet data sharing requirements
A personal website
A repository you built
“Available on request”
Social networking (e.g., academia.edu)
Choosing a Repository: Some programs mandate a specific repository. For those that don't:
Make a copy of this document, which is based on the NIH checklist, and make sure you can check off all of the listed characteristics.
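Working through a checklist like this can also be sketched as a short script. The example below is illustrative only: the characteristic names are taken from the table on this page, while the function name and the sample answers are hypothetical and do not describe any real repository.

```python
# Characteristic names come from the table on this page.
# The function name and the sample answers are hypothetical.
DESIRABLE_CHARACTERISTICS = [
    "Free and Easy Access",
    "Clear Use Guidance",
    "Risk Management",
    "Long-term Organizational Sustainability",
    "Long-term Technical Sustainability",
    "Security and Integrity",
    "Unique Persistent Identifiers",
    "Curation/Quality Assurance",
    "Broad and Measured Reuse",
    "Digital Object Management",
]


def unmet_characteristics(answers):
    """Return the characteristics that are not confirmed as met.

    `answers` maps a characteristic name to True (met) or False (not met).
    A characteristic missing from `answers` counts as unmet.
    """
    return [name for name in DESIRABLE_CHARACTERISTICS if not answers.get(name, False)]


# Hypothetical answers for a repository under evaluation.
example_answers = {name: True for name in DESIRABLE_CHARACTERISTICS}
example_answers["Curation/Quality Assurance"] = False
del example_answers["Digital Object Management"]

print(unmet_characteristics(example_answers))
# prints ['Curation/Quality Assurance', 'Digital Object Management']
```

An empty result means every characteristic was checked off; anything left in the list is worth raising with the repository or a curator before you commit to it.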
Element 1: Data Type (describe the data collected, the data to share, the documentation of methods, and metadata)
Element 2: Related Tools, Software and/or Code (any specialized tools or software should be shared, if possible)
Element 3: Standards (use community standards for data and metadata, e.g., DataCite)
Element 4: Data Preservation, Access, and Associated Timelines (where it will be shared, how it will be found, when will it be shared, how long will it be retained)
Element 5: Access, Distribution, or Reuse Considerations (licenses and documentation such as codebooks, data dictionaries, etc.)
Element 6: Oversight of Data Management and Sharing (roles and responsibilities)
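When drafting a plan, it can help to track which of the six elements still need text. The sketch below is one possible way to do that, not an NIH tool or template: the class name, field names, and sample values are all hypothetical, and each field simply stands for one element listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class DMSPlanDraft:
    """Hypothetical container with one free-text field per plan element."""

    data_type: str = ""                      # Element 1
    tools_software_code: str = ""            # Element 2
    standards: str = ""                      # Element 3
    preservation_access_timelines: str = ""  # Element 4
    access_distribution_reuse: str = ""      # Element 5
    oversight: str = ""                      # Element 6

    def missing_elements(self):
        """Return the field names of elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical partial draft: only Elements 1 and 3 are filled in.
draft = DMSPlanDraft(
    data_type="Survey responses (CSV) with a codebook",
    standards="DataCite metadata",
)

print(draft.missing_elements())
# prints ['tools_software_code', 'preservation_access_timelines',
#         'access_distribution_reuse', 'oversight'] (on one line)
```

The check only confirms that something was written for each element; whether the text is adequate still needs a human reviewer.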
How FAIR is your data?
What is FAIR
In 2016, the ‘FAIR Guiding Principles for scientific data management and stewardship’ were published in Scientific Data. The authors intended to provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets. The principles emphasise machine-actionability (i.e., the capacity of computational systems to find, access, interoperate, and reuse data with none or minimal human intervention) because humans increasingly rely on computational support to deal with data as a result of the increase in volume, complexity, and creation speed of data. (GO FAIR)
Findable: The data are in an indexed repository, with a unique, persistent ID and rich metadata
Accessible: The repository uses open, standard protocols so the metadata and data can be accessed
Interoperable: The data are in formal, standard, open application languages
Reusable: The data are well documented, with explicit provenance and open licenses, and follow community standards
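The four points above can be turned into a rough self-check on a dataset's metadata record. The sketch below is a simplified illustration and not an official FAIR assessment: the record keys (`identifier`, `access_url`, `file_format`, `license`), the short list of open formats, and the loose DOI pattern are all assumptions made for this example.

```python
import re

# Loose pattern for DOIs, one common kind of persistent identifier (assumption).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Illustrative, deliberately short list of open formats (assumption).
OPEN_FORMATS = {"csv", "json", "txt", "xml"}


def fair_gaps(record):
    """Return a list of readable gaps found in a metadata record (a dict)."""
    gaps = []
    if not DOI_PATTERN.match(record.get("identifier", "")):
        gaps.append("Findable: no DOI-style persistent identifier")
    if not record.get("access_url", "").startswith(("http://", "https://")):
        gaps.append("Accessible: no HTTP(S) access URL")
    if record.get("file_format", "").lower() not in OPEN_FORMATS:
        gaps.append("Interoperable: file format not in the open-format list")
    if not record.get("license"):
        gaps.append("Reusable: no license stated")
    return gaps


# Hypothetical record for a dataset that has not yet been curated.
record = {
    "identifier": "10.1234/example",
    "access_url": "https://example.org/dataset",
    "file_format": "xlsx",
    "license": "",
}

for gap in fair_gaps(record):
    print(gap)
# prints:
# Interoperable: file format not in the open-format list
# Reusable: no license stated
```

A check like this only catches missing fields. Judging whether metadata are rich enough or documentation is adequate is the kind of review a curator provides.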