Glossary

Overview

AI

Short for »artificial intelligence.« The term should be viewed critically, as it represents a humanisation (Anthropomorphism). In actuality, it currently refers exclusively to »Machine Learning.«

Algorithm

A sequence of instructions for solving a problem. Algorithms follow defined individual steps, which are executed in their specified order (input, processing, output).
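
The three steps named above can be sketched in a few lines of Python. The task (sorting artwork titles) is a hypothetical example chosen for illustration:

```python
# A minimal illustration of the steps input, processing, output,
# using an invented task: sorting a list of artwork titles.

def sort_titles(titles):
    """Input: a list of strings. Processing: sort them.
    Output: the sorted list."""
    return sorted(titles)  # each step runs in its specified order

print(sort_titles(["Zeus", "Apollo", "Hera"]))
# → ['Apollo', 'Hera', 'Zeus']
```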

ANN

Short for »artificial neural network.« Like AI, the term is modelled on the biological example of the human brain. Artificial neural networks consist of a model of interconnected neurons designed to process information. The designation invites humanisation (Anthropomorphism).

Annotation

To annotate means to provide with a note. For example, knowledge in the form of Metadata and Tags can be attached to specific images or digital objects in order to classify or filter them more effectively.

Anthropomorphism

Humanisation, i.e. human characteristics are attributed to something non-human. The machine is supposed to be as intelligent as humans (or surpass them) and, in the process, to be given circuitry resembling the human brain.

API

Short for »Application Programming Interface.« A programming interface that allows one piece of software to connect to another program, e.g. for Scraping the data sets of museum collections.
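
In practice, such an interface typically returns structured data such as JSON, which a program then decodes. The response excerpt and its field names below are invented for illustration, not taken from a real museum API:

```python
import json

# Hypothetical excerpt of a JSON response, shaped the way a museum
# collection API might return it (field names are invented).
response_text = '''
{
  "objects": [
    {"title": "Composition", "year": 1921},
    {"title": "Still Life", "year": 1889}
  ]
}
'''

data = json.loads(response_text)  # decode the JSON payload
titles = [obj["title"] for obj in data["objects"]]
print(titles)  # → ['Composition', 'Still Life']
```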

Bias

Describes a disproportionate weighting, e.g. in the training of AI, in favour of or against information contained in the data. This can create disadvantages or perpetuate unfair prejudices, which is particularly critical when the data serves as the basis for decisions with an (in)direct impact on daily life.

Clustering

Refers to the division of the objects in a data set into groups. The grouping is carried out automatically on the basis of detected similarities, e.g. sorting an image corpus into groups such as »dogs« and »cats.«
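
One common clustering approach is k-means: points are assigned to their nearest group centre, and each centre is then moved to the mean of its group. The following is a deliberately small sketch in pure Python, with invented coordinates standing in for image features:

```python
import math

def kmeans(points, centroids, iterations=10):
    """A minimal k-means sketch: assign each point to its nearest
    centroid, then move each centroid to the mean of its group."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            distances = [math.dist(p, c) for c in centroids]
            groups[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return groups

# Two visibly separated point clouds, standing in for feature vectors
# of, say, »dog« and »cat« images:
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
groups = kmeans(points, centroids=[(0, 0), (10, 10)])
print([len(g) for g in groups])  # → [3, 3]
```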

Digital Humanities

An interdisciplinary field that examines the use of computer-based methods and digital resources and reflects on their application and impact in the humanities and cultural studies.

GitHub

An American online service for managing software development projects, part of Microsoft since 2018. The service is based on Git, a system for version control. »Training the Archive« also maintains a so-called Repository there.

IIIF

Short for »International Image Interoperability Framework.« A standardised interface, e.g. for the inter-institutional exchange of image data and other digital objects.
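The IIIF Image API defines a fixed URL pattern ({base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}) through which an image can be requested. A small sketch of assembling such a request follows; the server address is a placeholder, not a real endpoint:

```python
def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an image request following the IIIF Image API pattern
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

# »example.org« is a placeholder; the identifier is invented.
print(iiif_image_url("https://example.org/iiif", "painting-42"))
# → https://example.org/iiif/painting-42/full/max/0/default.jpg
```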

ImageNet

An image database composed of up to 14 million images collected from the Internet. Various ANN whose mathematical weights were trained through the complex process of learning on ImageNet can be used via public libraries such as Keras or TensorFlow. The collection process has, however, led to questionable or even biased categorisations.

Keras

A deep-learning library, similar to TensorFlow, written in Python and published as Open Source. The library is particularly useful when certain pre-trained ANN are applied to one's own tasks by means of Transfer Learning. This, however, creates a dependence on the external training.

Machine Learning

The term describes the development of a model using special learning algorithms that draw on a large amount of training data. The ‘knowledge’ generated can be used for predictions or recommendations.

Metadata

Also called metainformation, describes structured data containing information on characteristics of other data to define properties of objects (e.g. medium of an artwork).
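
As a small sketch, such structured property information can be represented as key–value pairs; the fields shown are illustrative, not a fixed standard:

```python
# Metadata as structured information about a digital object
# (the record and its fields are invented for illustration).
artwork = {
    "title": "Composition",
    "medium": "oil on canvas",  # e.g. the medium of an artwork
    "year": 1921,
}
print(artwork["medium"])  # → oil on canvas
```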

Open Source

Applies as soon as the source code of a piece of software is publicly available and can thus be viewed, changed and used (free of charge) by the public. In some cases, licences of use must be observed. »Training the Archive« aims to publish as much code as possible, e.g. on GitHub.

Pattern Recognition

Describes the recognition of regularities, repetitions and similarities in a large amount of data to facilitate facial, speech or text recognition, for instance.
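
A very small example of the idea: a regularity (here, four-digit years) is detected within a larger body of text. The sentence is invented for illustration:

```python
import re

# Detecting a simple pattern (four-digit years) in text.
text = "Acquired in 1921, restored in 1987, digitised in 2021."
years = re.findall(r"\b\d{4}\b", text)
print(years)  # → ['1921', '1987', '2021']
```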

Proof of Concept

In short: PoC. A term from project management. A PoC demonstrates that a project is feasible in principle, e.g. by means of a Prototype. From this milestone onwards, further work on the project can proceed.

Prototype

Describes a sample design of the end product to be developed. In software development, a prototype is adapted to the needs of users and thus developed further in iterative cycles.

Python

A high-level programming language with which, among other things, Machine Learning can be programmed. It is characterised by an easy-to-read, concise programming style. It is often used in science because it is comparatively easy to learn and offers good integration of scientific libraries. The name is derived from the British comedy group Monty Python.

Repository

Describes a digital archive by means of a directory for storing and describing digital objects. For example, »Training the Archive« manages a Repository on GitHub as a freely accessible source of code for the first Prototype.

Robotics

Robotics as a field, and the robot as an entity, combine interaction with the physical world through sensors and actuators with information processing and technically feasible motion. In this context, AI is often mistakenly illustrated or symbolised by means of humanoid robots.

Scraping

Targeted extraction of information from the source code of websites to make the desired content available locally for further use. To scrape the image files of a museum using an API is one example of its application.
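
The extraction step can be sketched with Python's standard library: a parser walks through a page's source code and collects the addresses of image files. The HTML snippet is invented for illustration:

```python
from html.parser import HTMLParser

class ImageExtractor(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.sources.extend(v for k, v in attrs if k == "src")

html = '<div><img src="/img/a.jpg"><p>text</p><img src="/img/b.jpg"></div>'
parser = ImageExtractor()
parser.feed(html)
print(parser.sources)  # → ['/img/a.jpg', '/img/b.jpg']
```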

Tags

A tag labels a data set with additional information.
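
For illustration, tagged records can be filtered by such labels; the image records and tags below are invented:

```python
# Filtering a small, invented data set by tag.
images = [
    {"file": "a.jpg", "tags": {"portrait", "19th century"}},
    {"file": "b.jpg", "tags": {"landscape"}},
]
portraits = [img["file"] for img in images if "portrait" in img["tags"]]
print(portraits)  # → ['a.jpg']
```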

TensorFlow

»import tensorflow as tf.« A framework for Machine Learning used to perform the computational operations of ANN. Keras, for example, is an integral part of the TF API.

Transfer Learning

The procedure of instantiating a fully trained ANN, e.g. from Keras or TensorFlow, and passing compiled image data through it as input. Characteristics learned on one problem are applied to a new, similar problem. This benefits research because the models have already acquired a fundamental ‘understanding’ of the general structure and content of images, so this knowledge does not have to be taught from scratch.

Working Paper

The publication format reflects the current state of work and discussion within the research group, makes new knowledge available and also transfers it to the outside world.

»Training the Archive« (2020–2023) is a research project that explores the possibilities and risks of AI in relation to the automated structuring of museum collection data to support curatorial practice and artistic production.

Collaborative Partner:
Digital Partner:
Funded by the Programme:
Funded by: