Research Data Management: Describe

What is Data Documentation?

Data documentation is the recorded information that allows others to use and understand your data. Documentation might include lab notebooks, notes on methodology, data dictionaries, codebooks, README.txt files, and metadata.

A summary of how you will document your work should be included in your Data Management Plan.

What is Metadata?

Metadata is information that describes your data in a standardized structure so that the data can be properly understood, indexed in a repository, reused, and cited. Create metadata for every experiment or study that you run, and save your metadata documentation in a .txt or .csv file alongside your data.
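As a minimal sketch of keeping a metadata file alongside the data, the Python below writes a plain-text file next to a data file. The "_metadata.txt" naming scheme, the function names, and the example field names are assumptions made for illustration; they are not a required convention.

```python
from pathlib import Path


def sidecar_path(data_file: str, suffix: str = "_metadata.txt") -> Path:
    """Return a metadata file path in the same folder as the data file.

    The "<name>_metadata.txt" pattern is a hypothetical naming scheme.
    """
    path = Path(data_file)
    return path.with_name(path.stem + suffix)


def write_sidecar(data_file: str, fields: dict) -> Path:
    """Write one "key: value" line per metadata field and return the path."""
    out = sidecar_path(data_file)
    lines = [f"{key}: {value}" for key, value in fields.items()]
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

For example, write_sidecar("trial_01.csv", {"title": "Example study"}) would create trial_01_metadata.txt in the same folder as trial_01.csv.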

Standardizing your Metadata

A metadata standard specifies which elements, or pieces of information, are required and which are optional in your metadata. There are general metadata standards, like Dublin Core, but most metadata standards are for specific subject areas.

  • If you are putting your data in a repository, there might be a required metadata standard.
  • If there are no specific requirements, adopt the metadata standard common to your field.
  • If you have multiple options, consider the specific needs of your project and match your metadata standard accordingly.
  • Use standards for recording information in your metadata. For example, there are standard ways of recording dates, locations, subjects, etc. Look for the controlled vocabularies, ontologies, or standards commonly accepted in your discipline.
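To illustrate recording dates in a standard way, the sketch below uses ISO 8601 (YYYY-MM-DD), one widely used date standard, through Python's built-in datetime module. The dates are invented examples, and your discipline or repository may call for a different convention.

```python
from datetime import date, datetime, timezone

# A collection date written in ISO 8601 form (example value only).
collected_on = date(2023, 3, 7)
print(collected_on.isoformat())  # 2023-03-07

# A timestamp with an explicit time zone, so the time is unambiguous.
measured_at = datetime(2023, 3, 7, 14, 30, tzinfo=timezone.utc)
print(measured_at.isoformat())  # 2023-03-07T14:30:00+00:00

# Reading an ISO 8601 date string back into a date object.
parsed = date.fromisoformat("2023-03-07")
```

Writing dates this way avoids the ambiguity of forms such as 03/07/23, which can be read as either March 7 or July 3.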

Common Elements for Project-level Metadata

Project-level metadata describes the “who, what, where, when, how and why” of the project, giving context for understanding why the data were collected and how they were used.

  • Name of project
  • Principal investigator and collaborators
  • Context of data collection (geographic location, date of collection, etc.)
  • Data collection methods
  • Structure and organization of data files
  • Data sources used
  • Data validation and quality assurance
  • Transformations of data from the raw data through analysis
  • Information on confidentiality, access and use conditions
  • Project sponsor (if any)
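The elements above can be turned into a simple README template. The following is a minimal sketch, not a formal standard: the field labels follow the list above, while the function names and the "TODO" placeholder are assumptions chosen for illustration.

```python
# Field labels adapted from the project-level list; not a formal standard.
PROJECT_FIELDS = [
    "Name of project",
    "Principal investigator and collaborators",
    "Context of data collection",
    "Data collection methods",
    "Structure and organization of data files",
    "Data sources used",
    "Data validation and quality assurance",
    "Transformations of data",
    "Confidentiality, access and use conditions",
    "Project sponsor",
]


def render_readme(values: dict) -> str:
    """Return README text with one line per field; unfilled fields show TODO."""
    lines = [f"{field}: {values.get(field, 'TODO')}" for field in PROJECT_FIELDS]
    return "\n".join(lines) + "\n"


def missing_fields(values: dict) -> list:
    """List the fields that have not been filled in yet."""
    return [field for field in PROJECT_FIELDS if not values.get(field)]
```

Calling render_readme({"Name of project": "Example study"}) produces a ten-line template with the project name filled in and the remaining fields marked TODO. A field that does not apply, such as the project sponsor, can be filled in as "none".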

Common Elements for Dataset Metadata

Dataset metadata gives more detail about the data themselves.

  • Variable names and descriptions
  • Explanation of codes and classification schemes used
  • Algorithms used to transform data
  • Data acquisition details
  • File format and software (including version) used
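Variable names, descriptions, and code explanations are often collected in a data dictionary. The sketch below builds one as CSV text using Python's built-in csv module. The column names and the three example variables are hypothetical and stand in for whatever your dataset actually contains.

```python
import csv
import io

# Hypothetical column layout for a data dictionary.
COLUMNS = ["variable", "description", "units", "codes"]

# Invented example variables, for illustration only.
variables = [
    {"variable": "site_id", "description": "Identifier of the sampling site",
     "units": "", "codes": ""},
    {"variable": "temp_c", "description": "Water temperature",
     "units": "degrees Celsius", "codes": "-999 = missing"},
    {"variable": "sex", "description": "Sex of specimen",
     "units": "", "codes": "1 = female; 2 = male; 9 = unknown"},
]


def data_dictionary_csv(rows: list) -> str:
    """Return the data dictionary as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(data_dictionary_csv(variables))
```

Saving this text as a .csv file next to the dataset gives readers one place to look up what each variable and code means.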

Resources