A dataset citation has five core elements, with additional elements added as needed. Arrange all elements according to the citation style you are using (MLA, APA, etc.).
- Creator(s) – individuals or organizations
- Title – the name of the dataset
- Publication year – the year the dataset was released (may differ from the access date)
- Publisher – the data center, archive, or repository
- Identifier – a unique public identifier (e.g., an ARK or DOI)
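As a rough illustration, the core elements can be held in a small record and arranged into a citation string. The sketch below is in Python and uses one plausible author-date ordering, not the rule of any particular style guide. The class name, the function name, and every value in the example record (including the DOI) are hypothetical placeholders. The dataset title is included because a citation normally names the dataset.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    creators: list[str]  # individuals or organizations
    year: int            # year the dataset was released
    title: str           # name of the dataset
    publisher: str       # data center, archive, or repository
    identifier: str      # unique public identifier, e.g., a DOI or ARK


def format_citation(c: DatasetCitation) -> str:
    """Arrange the elements in one plausible author-date order (an assumption)."""
    authors = "; ".join(c.creators)
    return f"{authors} ({c.year}). {c.title}. {c.publisher}. {c.identifier}"


# Hypothetical record: every value below is a placeholder, not a real dataset.
example = DatasetCitation(
    creators=["Doe, J.", "Example Research Group"],
    year=2020,
    title="Example Survey Data",
    publisher="Example Data Repository",
    identifier="https://doi.org/10.5555/example",
)
print(format_citation(example))
```

To follow a specific style such as MLA or APA, only `format_citation` would need to change; the stored elements stay the same.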
Some additional elements may be needed if you are citing a dynamic/evolving dataset or a subset of a larger dataset:
- Version of the dataset analyzed in the citing paper
- Access date when the data was accessed for analysis in the citing paper
- Subset of the dataset analyzed (e.g., a range of dates or record numbers, a list of variables)
- Verifier that the dataset or subset accessed by a reader is identical to the one analyzed by the author (e.g., a checksum)
- Location of the dataset on the internet, needed if the identifier is not "actionable" (convertible to a web address)
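Two of these additional elements lend themselves to a short sketch: the verifier and the location. The Python below assumes SHA-256 as the checksum algorithm (a repository may publish a different one) and assumes the identifier is a DOI, which can be made actionable by prefixing the doi.org resolver address. The function names, the sample file contents, and the DOI are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a web address using the doi.org resolver."""
    return "https://doi.org/" + doi.removeprefix("doi:")


# Throwaway file standing in for a downloaded dataset (placeholder contents).
data = b"id,value\n1,10\n2,20\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    tmp_path = tmp.name

# Stand-in for the checksum an author would publish alongside the citation.
published_checksum = hashlib.sha256(data).hexdigest()

# The reader recomputes the checksum of their copy and compares.
matches = sha256_of_file(tmp_path) == published_checksum
os.remove(tmp_path)

print(matches)
print(doi_to_url("doi:10.5555/example"))
```

If the two checksums differ, the reader's copy is not byte-for-byte identical to the one the author analyzed, which usually means a different version or an incomplete download.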
If you are creating a citation for your own data, check the repository or publisher first: many provide specific instructions for how to cite their data. If no citation information is provided, you can use generally agreed-upon guidelines, such as the DataCite Metadata Schema.