Starting with a clear research question or topic helps you keep the scope of your project in mind as you assess what data is available.
Data may already exist that you can use to answer your research question, or you may need to collect or create that data. Always start by searching for existing data because creating data is time-intensive and expensive. Most students will use datasets that have already been created.
Best practices and methods for collecting or creating data vary by discipline. Consult with your professors.
You need to be able to actually use the data you find, so check whether it is open or proprietary. Proprietary data requires permission or payment, whereas open data can be freely used and distributed by anyone.
If you are doing original research, or no existing dataset covers your topic, you may need to collect or create the data yourself.
You may need to gather and combine multiple datasets for your research. In this case, you will need to identify sources of data, acquire that data, and organize the disparate data into a structure suited to analysis.
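As a rough illustration of the "organize" step, the sketch below combines two small datasets on a shared column using Python's standard csv module. The county names, the figures, and the helper names read_rows and combine_on_key are all made up for this example; real datasets will usually need extra cleanup (for example, inconsistent spellings of the key) before they line up this neatly.

```python
import csv
import io

# Hypothetical sample data standing in for two separately acquired files.
POPULATION_CSV = """county,population
Adams,1000
Baker,2500
Clark,400
"""

INCOME_CSV = """county,median_income
Adams,52000
Baker,61000
Dixon,48000
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def combine_on_key(left_rows, right_rows, key):
    """Keep only records whose key appears in both datasets (an inner join)."""
    right_by_key = {row[key]: row for row in right_rows}
    combined = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the columns from both sources into one record.
            combined.append({**row, **match})
    return combined


combined = combine_on_key(
    read_rows(POPULATION_CSV), read_rows(INCOME_CSV), "county"
)
for row in combined:
    print(row)
```

Only counties present in both sources survive the merge here. Whether to drop or keep unmatched records is a research decision, so treat this choice as one option rather than a rule.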
When working with data, you will often start with spreadsheets or CSV (comma-separated values) files. Excel is a good starting tool for working with spreadsheets and CSV files.
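If you later move beyond a spreadsheet program, a CSV file can also be read with a few lines of code. The sketch below uses Python's standard csv module; the column names, the sample values, and the load_scores helper are hypothetical and exist only to show the general shape of the task.

```python
import csv
import io

# Hypothetical sample data; a real project would read this from a file.
SAMPLE_CSV = """name,year,score
Ana,2021,88.5
Ben,2021,92.0
Cai,2022,79.5
"""


def load_scores(csv_file):
    """Read rows from an open CSV file and convert numeric columns."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append(
            {
                "name": row["name"],
                "year": int(row["year"]),      # CSV values arrive as text
                "score": float(row["score"]),
            }
        )
    return rows


rows = load_scores(io.StringIO(SAMPLE_CSV))
average = sum(r["score"] for r in rows) / len(rows)
print(f"{len(rows)} rows, average score {average:.2f}")
```

To read an actual file instead of the inline sample, open it with open(path, newline="") and pass the resulting file object to load_scores.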