Data preservation should be considered early on when planning a new research project.
Storage simply refers to placing data somewhere that it can be accessed when needed. Data can be stored on local internal and external hard drives, cloud-based systems (Box@SMU, Dropbox, Google Drive, Amazon Web Services, etc.), or servers.
Storing data does not safeguard against degrading media or obsolescence of the data formats. When you leave digital data on servers or hard drives without performing the proper preservation maintenance, your data eventually becomes obsolete.
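One common piece of preservation maintenance is a periodic fixity check, which compares each file against a checksum recorded earlier to detect silent corruption on degrading media. The following is a minimal sketch in Python, not a procedure prescribed by this guide. The function names (sha256_of, build_manifest, changed_files) and the choice of SHA-256 are illustrative assumptions. It detects changed or missing files only and does nothing about format obsolescence.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or no longer match."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

In practice the manifest would be saved alongside the data (for example as a text file) when the data is first stored, and changed_files would be run against it on a regular schedule.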
Preservation is the process by which research data is maintained so that it remains usable for the long term. The preservation process has three general steps: appraising the data, selecting a repository, and documenting and depositing the data.
Benefits of preservation include:
A data repository (or archive) preserves data for the long term. Data repositories are often web-accessible and user-friendly to allow for easy discovery and re-use. They also provide persistent identifiers that facilitate proper citation.
There are two types of data repositories: domain and institutional.
Domain Repositories: These are discipline-specific repositories. They offer benefits such as specialized metadata, review and validation by experts in the field, and specialized search and discovery tools, which make them the preferred option when one is available for a particular discipline.
Examples of domain repositories include: eCrystals, PubChem, the National Oceanographic Data Center (NODC), the Protein Data Bank, GenBank, and the Inter-university Consortium for Political and Social Research (ICPSR). Not all disciplines have a dedicated repository.
Institutional Repositories: These repositories collect and maintain the research outputs of a particular institution or group of institutions.
SMU Scholar is the repository we use. It houses research articles, theses and dissertations, data sets, and other digital assets. It is open access, is free for faculty and students to publish in, and offers unlimited storage and secure, perpetual links.
Important factors to consider when selecting a repository include: the type of data, its importance to the field, potential future uses, and privacy.
CRL TRAC Metrics - This tool is used by the Center for Research Libraries (CRL) to audit and certify digital repositories. Its three main categories are Organizational Infrastructure; Digital Object Management; and Technologies, Technical Infrastructure, & Security. The checklist specifies the requirements for certification as a trusted archive.
CRL Ten Principles - The ten basic characteristics of quality digital preservation repositories, created by the CRL.
Data Seal of Approval - An international organization that certifies repositories based on a set of 16 requirements. These requirements offer a good basis on which to evaluate the repositories you are considering.