
Searching for Research Data: Glossary

A libguide on how to find existing data to reuse for research.

Glossary

API: An Application Programming Interface (API) is a technical interface directly connected to a computer’s software program. It defines the types of calls or requests that can be made, how to make them, the data formats to be used, which data are to be returned, etc. APIs can be used to move information in or out of a system (DeiC (2021): National strategy for data management based on the FAIR principles. https://doi.org/10.48715/ea59-tp35).
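As a rough illustration of "calls, data formats and returned data", the sketch below prepares a request that asks the DOI resolver for machine-readable citation metadata instead of a web page. It assumes the DOI's registration agency supports content negotiation for the CSL JSON format; the function names are hypothetical and chosen only for this example.

```python
import json
import urllib.request

# Media type for CSL JSON citation metadata (assumed to be supported
# by the registration agency behind the DOI).
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for a DOI's metadata."""
    url = "https://doi.org/" + doi
    # The Accept header states which data format the caller wants back.
    return urllib.request.Request(url, headers={"Accept": CSL_JSON})


def fetch_metadata(doi: str) -> dict:
    """Send the request and parse the JSON response (needs network access)."""
    with urllib.request.urlopen(build_metadata_request(doi), timeout=10) as response:
        return json.load(response)


# Building the request works offline; fetch_metadata() would send it.
request = build_metadata_request("10.48715/ea59-tp35")
print(request.get_method(), request.full_url)
```

The request and the response are split into two functions so that the "how to make the call" part can be inspected without contacting any server.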

Creative Commons (CC) license: A CC license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted "work" (See more here).

Dataset: A file or a collection of files, including any supporting information such as README files, scripts, images and metadata.

Documentation: Contextual information about the data which enables users to make sense of it and to interpret it properly. Documentation may relate to a whole dataset (e.g. a README file that accompanies the data files, or a detailed description of data gathering methods), or to specific aspects of it (e.g. labelling of columns in a spreadsheet, or annotation of apparent anomalies in the data).

FAIR data: Refers to the FAIR principles for research data, i.e. data which are Findable, Accessible, Interoperable and Reusable. See also the introduction page of this guide or here for an explanation of all the FAIR principles.

Grey literature: Documentary material which is not commercially published or publicly available, such as technical reports or internal business documents (Oxford English Dictionary, s.v. “grey literature | gray literature (n.),” March 2024, https://doi.org/10.1093/OED/5690716467).

License: A license is a statement by which creators grant users permission to use their works (literature, data, software) in the ways the license permits, without having to ask for permission first.

Metadata: Data describing other data: identification, description/documentation, history of creation, license, etc. (DeiC (2021): National strategy for data management based on the FAIR principles. https://doi.org/10.48715/ea59-tp35).

Persistent Identifier (PID): Unique identification of a digital resource. Must normally be translated into a specific website address via a PID service (e.g. DOI, ORCID) (DeiC (2021): National strategy for data management based on the FAIR principles. https://doi.org/10.48715/ea59-tp35). 
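The translation step from a PID to a website address can be sketched as follows. This is a minimal example, assuming the public resolver addresses https://doi.org/ and https://orcid.org/; the helper name `pid_to_url` and the lookup table are hypothetical, and the ORCID iD is included only as a sample value.

```python
from urllib.parse import quote

# Resolver base addresses for two common PID schemes (assumption: these
# public resolvers are the ones the reader wants to use).
RESOLVERS = {
    "doi": "https://doi.org/",
    "orcid": "https://orcid.org/",
}


def pid_to_url(scheme: str, identifier: str) -> str:
    """Turn a bare identifier into a resolvable web address."""
    base = RESOLVERS[scheme.lower()]
    # Percent-encode unusual characters but keep the "/" inside DOIs.
    return base + quote(identifier, safe="/")


print(pid_to_url("doi", "10.48715/ea59-tp35"))
print(pid_to_url("orcid", "0000-0002-1825-0097"))
```

Opening the resulting address in a browser lets the PID service redirect to wherever the resource currently lives, which is why the identifier stays stable even if the underlying web address changes.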

Repository: Database of digital objects/research output, comprising data, metadata, and PIDs, often with a searchable user interface and computer interface (API). Repositories are most often organized by academic field (disciplinary repository, preferably international) or by institution (institutional repository, typically run by a university) (DeiC (2021): National strategy for data management based on the FAIR principles. https://doi.org/10.48715/ea59-tp35).

Research data: Documents in a digital form, other than scientific publications, which are collected or produced in the course of scientific research activities and are used as evidence in the research process, or are commonly accepted in the research community as necessary to validate research findings and results (Directive (EU) 2019/1024, art. 2(9)).