Using Data in Research: Define & Gather

Stage 1: Define a research question

Start with an original research question or topic. A clearly stated question keeps the scope of your project in view as you assess what data is available.

Stage 2: Gather data

Data may already exist that you can use to answer your research question, or you may need to collect or create that data. Always start by searching for existing data because creating data is time-intensive and expensive. Most students will use datasets that have already been created.

Best practices and methods for collecting or creating data vary by discipline. Consult with your professors.

Search for Datasets

You need to be able to actually use the data that you find, so check whether the data is open or proprietary. Proprietary data requires permission or payment, whereas open data can be freely used and distributed by anyone.

Evaluating Data

Generating Datasets

If you are doing original research, or no existing dataset covers your topic, you may need to collect or create the data yourself. As noted under Stage 2, search for existing data first, because creating data is time-intensive and expensive.

Collecting Datasets

You may need to gather and combine multiple datasets for your research. In that case you will need to identify sources of data, acquire the data, and organize the disparate data into a structure suitable for analysis.
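As a rough illustration of the "organize" step, the sketch below combines two small datasets that share a common column. It uses Python's standard csv module. The sample data, the column names (county, population, median_income), and the helper functions read_rows and join_on_key are all hypothetical, made up for this example. Your own datasets will likely need extra cleanup before the keys line up.

```python
import csv
import io

# Hypothetical sample data standing in for two separately acquired files.
POPULATION_CSV = """county,population
Adams,1000
Baker,2500
Clark,400
"""

INCOME_CSV = """county,median_income
Adams,52000
Baker,61000
Dixon,48000
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_key(left_rows, right_rows, key):
    """Combine rows whose key value appears in both datasets (an inner join)."""
    right_by_key = {row[key]: row for row in right_rows}
    combined = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the columns from both sources into one record.
            combined.append({**row, **match})
    return combined


merged = join_on_key(read_rows(POPULATION_CSV), read_rows(INCOME_CSV), "county")
for row in merged:
    print(row)
```

Only counties present in both sources are kept here. Whether to drop or keep unmatched rows is a research decision, so record which rows were excluded and why.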

Ethics & Compliance

Spreadsheets

When working with data, you will often start with spreadsheets or CSV (comma-separated values) files. Excel is a good starting tool for working with spreadsheets and CSV files.
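If you later move beyond a spreadsheet program, a CSV file can also be read with a short script. The sketch below is one possible approach using Python's standard csv module. The sample data, its column names (name, year, score), and the load_scores helper are invented for illustration and do not come from any real dataset.

```python
import csv
import io

# Hypothetical CSV content; in practice this would be read from a file.
SAMPLE_CSV = """name,year,score
Alice,2021,88
Bob,2021,92
Cara,2022,79
"""


def load_scores(text):
    """Read CSV text and convert the numeric columns from strings."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"name": r["name"], "year": int(r["year"]), "score": float(r["score"])}
        for r in reader
    ]


rows = load_scores(SAMPLE_CSV)
average = sum(r["score"] for r in rows) / len(rows)
print(f"{len(rows)} rows, average score {average:.1f}")
```

CSV files store every value as text, so the script converts the year and score columns to numbers before doing arithmetic on them.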