
Research Data Management

This guide is designed to help you navigate research data management, tools, planning, and sharing.

File Organization

Help out future you by getting organized.

Simply put, organizing your files will ultimately save you time and headaches.

To do this properly, WashU Libraries strongly urges faculty members, postdoctoral researchers, and graduate students to follow the recommendations found at the links in this guide. These recommendations will help keep your files easy to access and usable in years to come. If you have any questions about organizing your files or need additional assistance, please feel free to contact Data Services.

image: https://datacarpentry.org/rr-organization1/02-file-organization/index.html 

Naming your files or data so that they are easily recognizable without being opened is critical for properly maintaining them.

Good Practice | Rationale
Create meaningful file names that are simple and relevant to the file. | Obscure file names may be impossible to locate in the future.
Use lowercase letters and numbers to name files. | Using the Shift key can pose accessibility problems for individuals with disabilities.
Do not include spaces. | Spaces can cause problems with some operating systems and when the file name is used in a hyperlink.
File names should not exceed 31 characters. | More than 31 characters may cause a problem for certain operating systems.
Do NOT include any special characters such as: & , . ( ) % # $ ¢ / \ - { } [ ] < > : ; @ | In certain operating systems these are treated as wildcard operators and may cause a problem.
Include dates and format them consistently with ISO 8601: YYYYMMDD | This format makes it easy to sort and compare files by date and prevents confusion with other date formats.
Use a version number to manage drafts and revisions (v01, v02, v03). | This is much more effective than other common additions like "update", "new", and "old".

 

 (adapted from the Digital Asset Management Blog Archive and from NCDCR’s Best Practices for File-Naming)
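
The naming practices in the table can be checked automatically. The following is a minimal Python sketch, not an official WashU tool. The function names `build_name` and `check_name` are hypothetical, and two choices are assumptions that go beyond the table: an underscore is used as the word separator (it is not on the list of special characters), and a single dot is allowed before the file extension.

```python
import re
from datetime import date

MAX_LENGTH = 31  # character limit taken from the table

# Lowercase letters, digits, and underscores, then one dot and an extension.
# The underscore separator and the extension dot are assumptions of this sketch.
NAME_PATTERN = re.compile(r"[a-z0-9_]+\.[a-z0-9]+")


def build_name(description: str, day: date, version: int, extension: str) -> str:
    """Build a name such as 'survey_data_20240315_v02.csv'."""
    # Replace every run of characters that is not a lowercase letter or digit
    # with a single underscore, then trim underscores from both ends.
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    # ISO 8601 basic date (YYYYMMDD) and a zero-padded version number.
    return f"{slug}_{day:%Y%m%d}_v{version:02d}.{extension.lower()}"


def check_name(name: str) -> list:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("uses characters other than lowercase letters, digits, and underscores")
    return problems


if __name__ == "__main__":
    name = build_name("Survey Data", date(2024, 3, 15), 2, "CSV")
    print(name, check_name(name))
    print(check_name("Final Report (new).docx"))
```

Because the date is written as YYYYMMDD, sorting the generated names alphabetically also sorts them chronologically within each description.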

Do you remember Lotus 1-2-3? Have any files saved in WPS or MacWrite? Have you ever gone back to a file that is a few years old and received a "file not found" or other error message? If so, proprietary file formats are often to blame.


The format you keep your data in today is the primary factor in its usability in the future. As new versions of software are released or different software becomes commonly accepted, older versions become obsolete.

To avoid this, it is recommended that researchers use file formats that are:

  • non-proprietary (use .txt or .rtf versus .doc)
  • open, documented standard (see the Metadata section for more information)
  • standard representation (ASCII, Unicode)
  • uncompressed
  • unencrypted
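
One way to act on this list is to scan a project for files saved in proprietary formats. The Python sketch below is illustrative only: the function name `flag_proprietary` is hypothetical, and the extension table is an assumption that would need to be extended for a real project. Only the .doc entry comes directly from the recommendation above; the .csv suggestion for spreadsheets is a common convention rather than something this guide states.

```python
from pathlib import Path

# Illustrative mapping from proprietary extensions to more open alternatives.
# Only the .doc entry is taken from the guide; the rest are assumptions.
SUGGESTED_ALTERNATIVES = {
    ".doc": ".txt or .rtf",
    ".wps": ".txt or .rtf",
    ".xls": ".csv",
}


def flag_proprietary(paths):
    """Return (file_name, suggestion) pairs for files in a listed format."""
    flagged = []
    for path in map(Path, paths):
        # Compare extensions case-insensitively, so NOTES.DOC is also caught.
        suggestion = SUGGESTED_ALTERNATIVES.get(path.suffix.lower())
        if suggestion:
            flagged.append((path.name, suggestion))
    return flagged


if __name__ == "__main__":
    for name, suggestion in flag_proprietary(["notes.DOC", "results.csv"]):
        print(f"{name}: consider saving a copy as {suggestion}")
```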

When establishing your directory's organizational structure or hierarchy, aim for a consistent approach that reflects how the data is collected and accessed.

Consider Organizing by:

1.     Object type – e.g. Interview Data, Survey Data

2.     Organization structure – e.g. Location, Department, Individual, Project

3.     Combined – structure directory by organization then object type

 

In most cases, the combined structure is the preferred method to organize files, but the choice ultimately depends on the needs of the project.
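
As an illustration of the combined approach, the Python sketch below builds an organization-then-object-type layout with the standard `pathlib` module. The function names and the example folder names (`site_a`, `interview_data`, and so on) are hypothetical; substitute the units that match your own project.

```python
from pathlib import Path


def combined_paths(root, organization_units, object_types):
    """List one folder per (organization unit, object type) pair."""
    root = Path(root)
    # Organization comes first, object type second, as in the combined approach.
    return [root / unit / kind for unit in organization_units for kind in object_types]


def create_structure(root, organization_units, object_types):
    """Create the folders on disk and return their paths."""
    paths = combined_paths(root, organization_units, object_types)
    for path in paths:
        # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
    return paths


if __name__ == "__main__":
    for path in combined_paths("project", ["site_a", "site_b"],
                               ["interview_data", "survey_data"]):
        print(path.as_posix())
```

Generating the folders from two short lists keeps every organization unit on the same layout, which supports the consistency this section asks for.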

Other tips:

  • Consistency is important in your directory structure
  • Avoid going too deep. If you have to click through 10 folders to find your data, that is generally unnecessarily deep
  • Do not rely on nested folders to supply context. A folder or file named "data" without any other context is a candidate for loss or obscurity.
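
The last two tips can be turned into a simple audit. This Python sketch is an assumption-laden example rather than a rule: the depth threshold of 4 and the set of generic names are hypothetical values chosen for illustration, since the guide gives no exact limits.

```python
from pathlib import Path

MAX_FOLDER_DEPTH = 4  # hypothetical threshold; the guide sets no exact number
GENERIC_NAMES = {"data", "files", "misc", "new", "stuff"}  # illustrative only


def audit_paths(paths, root, max_depth=MAX_FOLDER_DEPTH):
    """Return (relative_path, warning) pairs for paths that go against the tips."""
    warnings = []
    for path in map(Path, paths):
        relative = path.relative_to(root)
        # Number of folders between the root and the final file or folder.
        depth = len(relative.parts) - 1
        if depth > max_depth:
            warnings.append((relative.as_posix(), f"nested {depth} folders deep"))
        # A bare name such as "data" or "data.csv" carries no context on its own.
        if relative.stem.lower() in GENERIC_NAMES:
            warnings.append((relative.as_posix(), "generic name with no context"))
    return warnings


if __name__ == "__main__":
    example = ["proj/a/b/c/d/e/f.csv", "proj/data"]
    for relative, warning in audit_paths(example, "proj"):
        print(f"{relative}: {warning}")
```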

image: Jennifer Moore