I’m writing up notes to present the project I sent in the last email. I’m having a hard time linking them together into something cohesive, so I’m going to dump an attempt below. Honestly, we’re scraping the barrel of whatever energy I have left to mentally engage with anything before performing some kind of festive mental disengagement, so please bear with the fact that this frequently descends into nonsense. Also in this email: some music, film, shows.
Sparring Partners (a timeline)
So let’s run through a timeline of how the piece came to be. I reckon it would be helpful to talk through where we began, where we’re at now, and where we want to be in the future.
Prior to starting this work I’d spent a few years working in news media, an arena in which trust and authenticity are crucial values, and I’ve felt a growing awareness of how difficult it is becoming to assure them for an audience. In his (extremely) recent book The Eye of the Master, Matteo Pasquinelli talks of a “dimensionality explosion” of data: data are not just becoming more numerous, but increasingly complex, consisting of multiple dimensions.
Artificial Intelligence is often touted as a response to the crisis this causes for human interaction with data, because AI is capable of parsing complex data and generating something that appears novel from it. It is simultaneously cast as saviour or satan, depending on who you ask: either AI creates trust by successfully and accurately concatenating the dimensionality explosion into legible value, or it destroys trust by flooding the dataset with flawed, plagiarised, or purely hallucinated information.
So my question has been: assuming that neither pole of that dichotomy is wholly true yet both are valid; that AI cannot be trusted to ascertain objective truths yet still holds some potential for utility, what engagements with it might we explore that rely on neither? More succinctly:
What engagement can we have with AI that is neither a naive faith in the tech nor just a dogmatic dismissal of any potential?
So, rationalisation established, I designed a speculative object: a wearable device powered by AI, which would alert its user any time they said or heard information that was false. Speculative because it’s faulty in its basic premise: the AI models that power it are flawed and biased, generated from flawed and biased human datasets, and it can’t reliably assess reality.
I know a bit of JavaScript, HTML and CSS, the basics, but I’m no developer, so the first iterations were disgustingly crude. Feel free to skip all the technical descriptions that follow; if you’re a developer, however, I hope you read them and melt.
I built three modules: a speech-to-text interpreter, an app which fed that to a large language model to evaluate it, and a bridge between the AI and the smartwatch.

I knew that Mozilla's DeepSpeech speech-to-text model existed, so without much thought I jumped right in, grabbing some demo code from its GitHub repo which included voice activity detection. It was written in Nim, which I had no familiarity with, but iirc the Node.js version's dependencies were broken and I was feeling unnervingly optimistic at this point. I found a command-line tool, Ollama, which could pull various open-source LLM models and deliver responses. I bought a Bangle.js smartwatch, which I figured I could probably send some data to, though I didn't really know how.

I needed to bridge these three modules. WebSockets sounded complicated, so I decided to just shuffle text files between everything (no, really). I used ChatGPT to write some Nim code to export voice transcripts as .txt files and compiled it. Ollama had an API, but that also sounded complicated, so I wrote bash scripts to watch the folders DeepSpeech dumped .txt files into, read the text and run it through Ollama on the command line, telling the LLM to evaluate the text as either FALSE or TRUE, and then write another text file to another folder. I found I could use a JavaScript library to transmit lines of code to the smartwatch, so I used ChatGPT again to write a JavaScript app which (you guessed it) watched a folder for .txt files and bounced the TRUE or FALSE message through to the watch via Web Bluetooth.

There were a lot of hiccups on this journey. Ollama only ran on Linux, so I was running it on Windows Subsystem for Linux; but WSL can't access microphones or Bluetooth, so the DeepSpeech transcription had to be done in Windows and sent through to the VM, then back to Windows to go to the watch. Also, DeepSpeech is basically dogshit now and I didn't realise it had been abandoned by Mozilla a few years ago, superseded by various other AI STT and TTS models. Lastly, I didn't want to move my PC all the way across London to the studio to demonstrate this setup, so at some point I symlinked Dropbox folders (actually, this isn't possible, but I did something equivalent yet more obscene with PowerShell scripts to achieve the same result) so that the speech transcription could be done on a little laptop remotely, processed at home, and sent back to the laptop and to the smartwatch.
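If you want a feel for what the middle of that chain amounted to, here is a minimal sketch of the folder-watching bridge, rewritten in Python rather than the bash scripts I actually used. The folder names and the model name are placeholders, and it assumes Ollama is installed locally with a model already pulled; treat it as a rough illustration of the idea, not the real code.

```python
# Rough sketch: wait for transcript .txt files, ask a local LLM (via the
# Ollama CLI) whether the statement is true or false, and drop the verdict
# into another folder for the smartwatch bridge to pick up.
# Folder names and the model name below are placeholders.
import subprocess
import time
from pathlib import Path

TRANSCRIPTS = Path("transcripts")   # where the speech-to-text module dumps .txt files
VERDICTS = Path("verdicts")         # where the smartwatch bridge looks for results
MODEL = "mistral"                   # any model already pulled with `ollama pull`

PROMPT = "Reply with only TRUE or FALSE. Is the following statement factually accurate? "

TRANSCRIPTS.mkdir(exist_ok=True)
VERDICTS.mkdir(exist_ok=True)

seen = set()
while True:
    for txt in TRANSCRIPTS.glob("*.txt"):
        if txt.name in seen:
            continue
        seen.add(txt.name)
        statement = txt.read_text().strip()
        # Shell out to Ollama exactly as you would on the command line
        result = subprocess.run(
            ["ollama", "run", MODEL, PROMPT + statement],
            capture_output=True, text=True,
        )
        verdict = result.stdout.strip() or "UNKNOWN"
        (VERDICTS / txt.name).write_text(verdict)
    time.sleep(1)
```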
This is the worst thing I have ever done. In retrospect I am honestly astonished that it worked at all, but it did. I built a cursed object, working in a cursed fashion. Yet this technical drama wasn’t really the source of the dread I felt through this entire process; it was the knowledge that what I was making was fundamentally built on a flawed thesis, and that there was a real possibility I was putting a lot of time and effort into what might be received only as an earnest attempt to create a theoretically indefensible AI-powered lie detector, rather than something critical.
However, this is where the speculation could begin: with a working, designed provocation. With the device in hand I could tear it to pieces, experience its flaws and engage with its tangible effects.
I refined it further over the coming weeks: everything moved to Linux; the modules shared information over WebSockets; the ageing and ineffectual DeepSpeech model was swapped for OpenAI’s Whisper; much of the code was rewritten in Python. The device itself shifted focus from true/false declarations to delivering scored estimations of accuracy, as well as contextual explanations.
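To give a sense of how the refined version hangs together, here is an illustrative Python sketch of the transcription side: Whisper transcribes an utterance, and the text is pushed over a WebSocket to the evaluator module. The port, message shape and file name are all assumptions for the sake of the example, not the actual code.

```python
# Illustrative sketch of the refined pipeline: transcribe a chunk of audio
# with Whisper, then send the text to the evaluator module over a WebSocket
# and print its scored verdict. Port, message shape and file name are placeholders.
import asyncio
import json

import websockets
import whisper  # the openai-whisper package

model = whisper.load_model("base")

async def send_transcript(wav_path: str):
    result = model.transcribe(wav_path)
    async with websockets.connect("ws://localhost:8765") as ws:
        await ws.send(json.dumps({"text": result["text"]}))
        # The evaluator replies with a scored estimate plus a short explanation
        reply = json.loads(await ws.recv())
        print(reply.get("score"), reply.get("explanation"))

asyncio.run(send_transcript("utterance.wav"))
```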
It was always a conflicted development process: pretending, working facetiously to make something wrong work better without trying to make it work right, because the only way to make it right would be not to make it at all.
So this became the most crucial element of the piece. Not to get someone to use the device, but to engage an audience in the long and critical process of speculation. As such, whoever engages with the work is not a user so much as a performer or collaborator, involved in the process, playing with it and examining it. The device won’t and can’t ever be completed because its basic premise (that AI can establish objective truth) is flawed, and it exists only to exhibit its basic dysfunction.
It’s for this reason that Sparring Partners is published as an open-source, shareable and modifiable exercise: the only way to engage with the work is to speculate upon it; and the landscape against which it is evaluated is changing rapidly as new AI technologies are launched, new models are trained, new and earnest interactions with human users are invented.
The work decays, constantly and violently. If it is to have any future beyond an immediate existence in a narrow contemporaneity, the only way it can remain active is by regularly altering its provocations to take aim at itself once again.