Artificial intelligence systems are often considered enigmatic or unknowable – complex neural networks whose parameters can number in the trillions and whose range of possible outputs is greater still. Yet no matter how convoluted these systems become, the data used to train them remains one of the most important sources of evidence for tracing the histories, practices, and politics of how they interpret the world.
To deepen understanding of training data, the Knowing Machines Project developed see:set, an investigative tool for examining AI training datasets. Here you will find nine essays from individual members of our team. Each one uses see:set to explore a key AI dataset and its role in the construction of “ground truth.” We invite you to use these essays to further interrogate the ways these systems structure knowledge, make predictions, represent reality, and intervene in the world.