Explanation of Terms

Data and Research Data: for the purposes of this project, we interpret ‘data’ very broadly. We consider virtually any electronic output, input, or discourse that researchers use or create in the course of research to be ‘research data’. For example, Excel spreadsheets and SPSS data sets with statistics or measurements are ‘research data,’ but so are the lines of computer code used to transform the data, researcher field notes and transcripts, digital images used in research or presentations, e-mail correspondence containing academic discourse, and versions of academic papers. While most research data will be in a digital format, some physical objects or paper records, such as lab notebooks, may justifiably be termed ‘research data’ too, since they are integral to understanding the data.

Data Management: the processes involved in creating, obtaining, transforming, sharing, protecting, documenting and preserving data (see the explanation of research data above). Data Management includes everything from file naming conventions to policies and practices on creating metadata and documentation for the long term. Professional data managers often refer to the process as ‘data curation’. A chart and explanation of the data curation lifecycle is available as a PDF from the Digital Curation Centre.
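As an illustration of what a file naming convention might look like in practice, here is a minimal Python sketch. The convention itself (project, description, ISO date, and two-digit version, joined by underscores) and the function name build_file_name are assumptions made for this example only; they are not a standard prescribed by this project.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, created: date,
                    version: int, extension: str) -> str:
    """Build a file name following a hypothetical convention:
    <project>_<description>_<YYYY-MM-DD>_v<NN>.<extension>
    """

    def slug(text: str) -> str:
        # Lowercase the text and collapse anything that is not a letter
        # or digit into a single hyphen, trimming hyphens at the ends.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{slug(project)}_{slug(description)}_"
        f"{created.isoformat()}_v{version:02d}.{extension.lstrip('.')}"
    )


example_name = build_file_name(
    "Soil Survey", "field notes", date(2011, 3, 14), 2, "csv"
)
print(example_name)  # soil-survey_field-notes_2011-03-14_v02.csv
```

Putting the date in ISO order and zero-padding the version number means that files sort chronologically and by version in an ordinary directory listing.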

Digital Preservation or Digital Curation: these terms are used somewhat interchangeably. They refer to all activities aimed at ensuring that digital information is available, interpretable, and usable over the long term. This includes storing files in a digital repository or digital archive, transforming files to make them compatible with newer versions of software or more common platforms, and backing up files. It also includes activities such as cataloguing/archiving or creating metadata and documentation so that internal and external users can interpret and use data in the future.

Digital Repository: a digital database and server system where academics/researchers can deposit their articles, books, and data. These include ‘institutional repositories’ (which contain the output of a single institution, often a university) and subject-specific repositories (which contain materials related to a particular subject or academic discipline). Digital repositories often take on the role of making electronic information publicly available, preserving it in the medium and long term, and migrating information to be compatible with new software (or new versions of software) where possible.

Institutional Repository: a type of digital repository, housed within a single institution. Institutional Repositories are prevalent in universities and often collect published research papers.

Documentation: broad contextual information that helps outside users to understand and interpret data. This could include descriptions of files, file structure and relationships between files, as well as more general details on research methods and the context of creation (e.g. project documentation). Metadata typically provides detailed information about a specific data object, while documentation provides the wider context needed for future users to be able to understand and trust the data collection.

Metadata: often explained as ‘data about data,’ metadata is contextual information about an object, often included within the data file itself. This could include information about data attributes, the creator, the source of the data, variable names, the date and time of data collection, or information about usage rights (e.g. confidentiality or intellectual property rights).
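To make the idea of ‘data about data’ concrete, the following Python sketch records the kinds of metadata listed above for a single data file and serialises them to JSON. The class name DatasetMetadata, its field names, and all of the sample values are hypothetical and chosen only for illustration; real projects would normally follow an established metadata schema for their discipline.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for one data object."""
    title: str
    creator: str
    source: str
    collected_at: str  # date and time of collection, ISO 8601 text
    variable_names: List[str] = field(default_factory=list)
    usage_rights: str = ""


# Sample values are invented for the example.
record = DatasetMetadata(
    title="Household survey responses",
    creator="A. Researcher",
    source="Paper questionnaires, transcribed by hand",
    collected_at="2011-03-14T09:30:00",
    variable_names=["respondent_id", "age", "household_size"],
    usage_rights="Confidential; not for redistribution",
)

# Serialise the record so it can be stored alongside the data file.
metadata_json = json.dumps(asdict(record), indent=2)
print(metadata_json)
```

Storing a record like this next to the data file (or embedding it in the file, where the format allows) lets a future user see who created the data, when, and under what usage conditions without opening the data itself.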