Generative AI (GenAI) is a type of artificial intelligence that produces content such as text, images, or music. GenAI does not actually understand the content it produces. Instead, it makes predictions about the relationships between words, images, and sounds.
Text-based GenAI tools are built on large language models (LLMs). An LLM is not a dataset itself; it is a model trained on very large datasets of text, from which it learns how grammar, vocabulary, and style contribute to writing. It mimics the language structures learned from that data to create coherent sentences.
Machine learning makes it possible for computers to learn patterns from large datasets without being explicitly programmed for each task. In general, performance improves as a model is trained on more, and better, data.
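
To make the idea of "predicting relationships between words" concrete, here is a toy sketch in Python. It is far simpler than the models behind real GenAI tools and is not how they are actually built: it only counts which word tends to follow which in a tiny sample text. The function names `train_bigrams` and `predict_next` and the sample sentence are made up for this illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical training text, chosen only for illustration.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # cat
print(predict_next(model, "on"))   # the
print(predict_next(model, "dog"))  # None (the word never appeared in the data)
```

The program has no notion of what a cat is. It predicts "cat" after "the" only because that pairing was most common in its data, and it has nothing to say about a word it never saw.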
Uses | Tool |
---|---|
Translation | Useful for primary and secondary sources; translate into or out of English for best results<br>ChatPDF: upload PDF; limited to 2 PDFs a day at 120 pages each<br>ChatGPT: copy and paste text; limited to 4,000 tokens |
Searching for scholarly articles | Semantic Scholar: use author, title, or DOI for searching; English-language focused; narrow book coverage; patents not included<br>Consensus: use question form for searching; only includes open access empirical/peer-reviewed research; science and social science focused<br>ChatGPT: refines a research question, determines subject terms, and suggests related terms; not suitable for finding actual publications<br>Elicit: organizes sources by selected variable, summarizes findings, and provides key findings; the depth of the search suggests it might be helpful for systematic reviews/meta-analyses<br>Undermind: refines a research question and matches that question to sources, breaks sources up into categories, and provides a timeline and citation network; the depth of the search suggests it might be helpful for systematic reviews/meta-analyses |
Citation tracing for literature reviews | Connected Papers: only includes articles<br>Research Rabbit: only includes articles; sources mostly from academic journals |
Ideation & Keywords | ChatGPT: create prompts for subtopics, organize/outline a paper, and brainstorm open data sources; request keywords and Boolean search strings related to a specific research question; not suitable for producing citations; all content input into ChatGPT becomes usable by OpenAI<br>Elicit: suggests key concepts for topics based on scholarship found in search |
Comprehension | ChatPDF: only a paragraph or two is referred to for the answer<br>Consensus: provides study snapshots with population, sample size, methods, and outcomes; synthesize feature provides summary of all results and offers consensus graph |
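
For readers unfamiliar with the Boolean search strings mentioned under Ideation & Keywords, the sketch below shows how one is typically assembled: synonyms for the same concept are joined with OR, and the separate concepts are joined with AND. The function name `boolean_query` and the example keywords are assumptions made for illustration. Databases differ in the exact syntax they accept, so check the help pages of the one you are searching.

```python
def boolean_query(concept_groups):
    """OR the synonyms within each concept, then AND the concepts together."""
    clauses = []
    for synonyms in concept_groups:
        # Quote multi-word phrases so they are searched as a unit.
        terms = [f'"{t}"' if " " in t else t for t in synonyms]
        clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)

# Hypothetical keyword groups for a research question.
groups = [
    ["generative AI", "large language model"],
    ["plagiarism", "academic integrity"],
]
print(boolean_query(groups))
# ("generative AI" OR "large language model") AND (plagiarism OR "academic integrity")
```
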
Lack of transparency and bias with datasets
Large language models (LLMs) are trained on a wide variety of datasets, and their developers aren't always transparent about which datasets are included and excluded. Any bias or gap in that data can carry over into the output. As a researcher, it is important to continuously critique the quality of generated content for bias and inclusivity.
Fake citations
GenAI can combine fragments from its training data into plausible-looking citations for sources that don't actually exist. This is called a hallucination. As a researcher, verify every generated citation against the original source to ensure credibility and correctness before using or sharing it.
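
One small, partial aid to that checking is sketched below: it pulls DOI-like strings out of generated citations and turns them into https://doi.org/ links to open by hand. The function name `extract_dois`, the regular expression (a commonly used approximation of the DOI format, not an official definition), and the sample citation (a made-up placeholder, not a real source) are assumptions for illustration. A well-formed DOI can still be invented or point to a different article, so always compare the resolved record with the citation.

```python
import re

# Approximate pattern for modern DOIs; an assumption, not an official specification.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

def extract_dois(citation_text):
    """Return DOI-like strings found in the text, minus trailing punctuation."""
    return [match.rstrip(".,;") for match in DOI_PATTERN.findall(citation_text)]

# Placeholder citation invented for this example; it does not refer to a real work.
generated = (
    "Placeholder, A. (2020). Placeholder title. "
    "https://doi.org/10.1234/placeholder.5678."
)

for doi in extract_dois(generated):
    # Open each link manually and compare author, title, and year with the citation.
    print(f"https://doi.org/{doi}")
```

Citations without a DOI yield nothing here and need to be looked up by title and author in a library catalog or database.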
Plagiarism
Using content generated by LLMs without acknowledging it is plagiarism. Because LLMs do not always disclose the sources behind their output, researchers must also be careful not to take someone else's work without providing proper credit and acknowledgment.
Privacy
Information you input into generative AI tools may be retained by the platform and used to train LLMs or for other purposes, depending on the platform's terms. Never input personally identifiable information or original research ideas into any AI platform.