Glossary

Overview

AI

Short for »artificial intelligence.« The term should be viewed critically, as it represents a humanisation (Anthropomorphism). In actuality, it currently refers exclusively to »Machine Learning.«

Algorithm

A sequence of instructions for solving a problem. Algorithms follow defined individual steps, which are executed in their specified order (input, processing, output).
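
The three steps named above can be sketched in a few lines of Python. The task (sorting artwork titles) is a hypothetical example chosen for illustration:

```python
# A minimal illustration of the steps input, processing, output,
# using an invented task: sorting a list of artwork titles.

def sort_titles(titles):
    """Input: a list of strings. Processing: sort them.
    Output: the sorted list."""
    return sorted(titles)  # each step runs in its specified order

print(sort_titles(["Zeus", "Apollo", "Hera"]))
# → ['Apollo', 'Hera', 'Zeus']
```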

ANN

Short for »artificial neural network.« Like AI, the term is modelled on the biological example of the human brain. Artificial neural networks consist of a model of interconnected neurons designed to process information. The designation invites humanisation (Anthropomorphism).

Annotation

To annotate means to provide with a note. For example, knowledge in the form of Metadata and Tags can be attached to specific images or digital objects in order to classify or filter them more effectively.

Anthropomorphism

Humanisation, i.e. human characteristics are attributed to something non-human. The machine is supposed to be as intelligent as humans (or surpass them) and, in the process, to be given circuitry resembling the human brain.

API

Short for »Application Programming Interface.« A programming interface that allows one piece of software to connect to another program, e.g. for Scraping the data sets of museum collections.
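
In practice, such an interface typically returns structured data such as JSON, which a program then decodes. The response excerpt and its field names below are invented for illustration, not taken from a real museum API:

```python
import json

# Hypothetical excerpt of a JSON response, shaped the way a museum
# collection API might return it (field names are invented).
response_text = '''
{
  "objects": [
    {"title": "Composition", "year": 1921},
    {"title": "Still Life", "year": 1889}
  ]
}
'''

data = json.loads(response_text)  # decode the JSON payload
titles = [obj["title"] for obj in data["objects"]]
print(titles)  # → ['Composition', 'Still Life']
```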

Bias

Describes a disproportionate weighting, e.g. in the training of AI, in favour of or against information contained in the data. This can create disadvantages or perpetuate unfair prejudices, which is particularly critical when the data serves as the basis for decisions with an (in)direct impact on daily life.

Clustering

Refers to the division of the objects in a data set into groups. The grouping is carried out automatically on the basis of detected similarities, e.g. sorting an image corpus into groups such as »dogs« and »cats.«
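
One common clustering approach is k-means: points are assigned to their nearest group centre, and each centre is then moved to the mean of its group. The following is a deliberately small sketch in pure Python, with invented coordinates standing in for image features:

```python
import math

def kmeans(points, centroids, iterations=10):
    """A minimal k-means sketch: assign each point to its nearest
    centroid, then move each centroid to the mean of its group."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            distances = [math.dist(p, c) for c in centroids]
            groups[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return groups

# Two visibly separated point clouds, standing in for feature vectors
# of, say, »dog« and »cat« images:
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
groups = kmeans(points, centroids=[(0, 0), (10, 10)])
print([len(g) for g in groups])  # → [3, 3]
```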

Digital Humanities

An interdisciplinary field that examines the use of computer-based methods and digital resources and reflects on their application and impact in the humanities and cultural studies.

GitHub

An American online service for managing software development projects, part of Microsoft since 2018. The service is based on Git, a system for version control. »Training the Archive« also maintains a so-called Repository there.

IIIF

Short for »International Image Interoperability Framework.« A standardised interface, e.g. for the inter-institutional exchange of image data and other digital objects.
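The IIIF Image API defines a fixed URL pattern ({base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}) through which an image can be requested. A small sketch of assembling such a request follows; the server address is a placeholder, not a real endpoint:

```python
def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an image request following the IIIF Image API pattern
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

# »example.org« is a placeholder; the identifier is invented.
print(iiif_image_url("https://example.org/iiif", "painting-42"))
# → https://example.org/iiif/painting-42/full/max/0/default.jpg
```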

ImageNet

An image database composed of up to 14 million images collected from the Internet. Various ANN whose mathematical weights were trained through the complex process of learning on ImageNet can be used via public libraries such as Keras or TensorFlow. The collection process has, however, led to questionable or even biased categorisations.

Keras

A deep-learning library, similar to TensorFlow, written in Python and published as Open Source. The library is particularly useful when certain pre-trained ANN are applied to one's own tasks by means of Transfer Learning. This, however, creates a dependence on the external training.

Machine Learning

The term describes the development of a model using special learning algorithms that draw on a large amount of training data. The ‘knowledge’ generated can be used for predictions or recommendations.

Metadata

Also called metainformation, describes structured data containing information on characteristics of other data to define properties of objects (e.g. medium of an artwork).
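
As a small sketch, such structured property information can be represented as key–value pairs; the fields shown are illustrative, not a fixed standard:

```python
# Metadata as structured information about a digital object
# (the record and its fields are invented for illustration).
artwork = {
    "title": "Composition",
    "medium": "oil on canvas",  # e.g. the medium of an artwork
    "year": 1921,
}
print(artwork["medium"])  # → oil on canvas
```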

Open Source

Applies as soon as the source code of a piece of software is publicly available and can thus be viewed, changed and used (free of charge) by the public. In some cases, licences of use must be observed. »Training the Archive« aims to publish as much code as possible, e.g. on GitHub.

Pattern Recognition

Describes the recognition of regularities, repetitions and similarities in a large amount of data to facilitate facial, speech or text recognition, for instance.
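
A very small example of the idea: a regularity (here, four-digit years) is detected within a larger body of text. The sentence is invented for illustration:

```python
import re

# Detecting a simple pattern (four-digit years) in text.
text = "Acquired in 1921, restored in 1987, digitised in 2021."
years = re.findall(r"\b\d{4}\b", text)
print(years)  # → ['1921', '1987', '2021']
```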

Proof of Concept

In short: PoC. A term from project management. A PoC demonstrates that a project is feasible in principle, e.g. by means of a Prototype. From this milestone onwards, further work on the project can proceed.

Prototype

Describes a sample design of the end product to be developed. In software development, a prototype is adapted to the needs of users and thus developed further in iterative cycles.

Python

A high-level programming language with which, among other things, Machine Learning can be programmed. It is characterised by an easy-to-read, concise programming style. It is often used in science because it is comparatively easy to learn and offers good integration of scientific libraries. The name is derived from the British comedy group Monty Python.

Repository

Describes a digital archive by means of a directory for storing and describing digital objects. For example, »Training the Archive« manages a Repository on GitHub as a freely accessible source of code for the first Prototype.

Robotics

Robotics as a field, and the robot as an entity, combine interaction with the physical world through sensors and actuators with information processing and technically feasible motion. In this context, AI is often mistakenly illustrated or symbolised by means of humanoid robots.

Scraping

Targeted extraction of information from the source code of websites to make the desired content available locally for further use. To scrape the image files of a museum using an API is one example of its application.
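
The extraction step can be sketched with Python's standard library: a parser walks through a page's source code and collects the addresses of image files. The HTML snippet is invented for illustration:

```python
from html.parser import HTMLParser

class ImageExtractor(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.sources.extend(v for k, v in attrs if k == "src")

html = '<div><img src="/img/a.jpg"><p>text</p><img src="/img/b.jpg"></div>'
parser = ImageExtractor()
parser.feed(html)
print(parser.sources)  # → ['/img/a.jpg', '/img/b.jpg']
```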

Tags

A tag labels a data set with additional information.
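
For illustration, tagged records can be filtered by such labels; the image records and tags below are invented:

```python
# Filtering a small, invented data set by tag.
images = [
    {"file": "a.jpg", "tags": {"portrait", "19th century"}},
    {"file": "b.jpg", "tags": {"landscape"}},
]
portraits = [img["file"] for img in images if "portrait" in img["tags"]]
print(portraits)  # → ['a.jpg']
```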

TensorFlow

»import tensorflow as tf.« A framework for Machine Learning used to perform the computational operations of ANN. Keras, for example, is an integral part of the TF API.

Transfer Learning

The procedure of instantiating a fully trained ANN, e.g. from Keras or TensorFlow, and passing compiled image data through it as input. Characteristics learned on one problem are applied to a new, similar problem. This benefits research because the models have already acquired a fundamental ‘understanding’ of the general structure and content of images, so this knowledge does not have to be taught from scratch.

Working Paper

The publication format reflects the current state of work and discussion within the research group, makes new knowledge available and also transfers it to the outside world.

»Training the Archive« (2020–2023) is a research project that explores the possibilities and risks of AI in relation to the automated structuring of museum collection data to support curatorial practice and artistic production.

Collaborative Partner:
Digital Partner:
Funded by the Programme:
Funded by: