Ross Goodwin is part writer, part engineering whiz. Working for Google as a creative technologist, he employs machine learning, natural language processing, and other computational tools to realize new forms and interfaces for written language.
This work has led him to an ongoing collaboration with artist and designer Es Devlin, which began with PoemPortraits in collaboration with Google Arts & Culture at the Serpentine Gallery in 2017. PoemPortraits invited visitors to submit random words that would be generated into poetry by an algorithm (developed by Goodwin) and projected across their faces.
Fast forward to 2018 and Goodwin is working once again with Devlin, this time on Please Feed The Lions in partnership with Google Arts & Culture as part of London Design Festival. The project will see a new lion added to the four-strong pride at Trafalgar Square, London and invite visitors to ‘feed the lion’ a word that will become part of an ever-evolving collective poem shown on LEDs embedded in the lion’s mouth by day and then projected onto Nelson’s Column by night.
Here Goodwin explains more about the algorithm he trained to turn words into collective poetry, and the benefits of using this kind of technology in public installations.
Could you tell us a little bit about what you do?
My practice centres around computational creative writing, or generative text, whatever you wanna call it. But basically I like anything that's at the intersection of text and computation.
What is the algorithm that features within Please Feed The Lions?
The specific algorithm I'm using in Please Feed The Lions is one I use a lot. It's called a long short-term memory (LSTM) recurrent neural network (RNN). How it works is it’s basically a massive statistical model that predicts linear sequences, so in this case the linear sequence that it's predicting is letters and text characters.
The algorithm is essentially predicting the next text character over and over again, and always taking into account what came before to generate text. The statistical model used in this algorithm is trained on thousands of poetry books from all over the world.
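The "predict the next character over and over" loop can be sketched in a few lines. This is a minimal illustration, not Goodwin's model: it swaps the LSTM for a simple character-count table that only looks at the last few characters, and the names `train_char_model`, `generate`, and the tiny corpus are all hypothetical.

```python
from collections import Counter, defaultdict

def train_char_model(text, order=3):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        counts[context][text[i + order]] += 1
    return counts

def generate(counts, seed, length, order=3):
    """Append the most frequent next character, one character at a time."""
    out = seed
    for _ in range(length):
        context = out[-order:]          # what came before
        if context not in counts:       # unseen context: stop early
            break
        next_char = counts[context].most_common(1)[0][0]
        out += next_char
    return out

# Toy corpus standing in for the thousands of poetry books (assumption).
corpus = "the lion roars and the lion sleeps and the lion dreams"
model = train_char_model(corpus)
print(generate(model, "the", 20))
```

An LSTM differs in that it carries a learned internal state rather than a fixed window of characters, but the outer loop of generating text one character at a time has the same shape.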
Why did you focus on poetry?
When I started using computers in my writing practice and in particular when I started training deep learning models to write, it struck me that I could take prose and poetry, and even non-fiction and mash them together in interesting ways. The material that would come out, regardless of the combination of things used, would always feel very poetic or have those characteristics.
The other reason why I like working with poetry as the output is because of a constraint of the machine used in Please Feed The Lions. The LSTM has a somewhat limited memory window. That window can be expanded, but one of the biggest problems in the field of natural language generation is still long-term coherence and topical consistency. We're at the point now where natural language generative systems can write a paragraph of text that makes sense, flows, and is coherent, human-level writing. But if you try to make one write more than a paragraph, it will rarely produce a page of text that coheres in the way human writing is expected to.
What role do machine learning and this algorithm you developed play within Please Feed The Lions?
I would say it plays a central role. The way Es Devlin likes to put it is that the computer-generated poetry serves as an impartial mediator, so to speak, for all of the submissions that are coming from the public visiting the installation. So in Please Feed The Lions, it will serve this role in that it’s not biased by the same people who are submitting the words. In a certain way it creates an impartial place for the submissions from visitors to exist side by side but still to grow into something that's greater than the sum of its parts.
So how does the algorithm actually work with the installation?
So someone submits a word, and that word essentially becomes the beginning of a line. The algorithm takes that word, figures out what part of speech it is and then gives it a determiner if necessary, like ‘the’, ‘a’, or ‘my’, but only if it detects it as a noun. We kind of expect most people to submit a noun though, which is why we've structured it that way.
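The step described above, turning a submitted word into the start of a line, might look roughly like this. It is a sketch under stated assumptions: the interview does not say how the part of speech is detected or how the leading word is chosen, so `is_noun` here is a toy lookup table, the choice is random, and the "a"/"an" adjustment is an addition of this example.

```python
import random

DETERMINERS = ["the", "a", "my"]                  # the three words named in the interview
TOY_NOUNS = {"lion", "ocean", "love", "death"}    # hypothetical stand-in for a real tagger

def is_noun(word):
    """Toy part-of-speech check; a real system would use a tagger."""
    return word.lower() in TOY_NOUNS

def start_line(word, rng=random):
    """Return the opening of a line built from a submitted word."""
    word = word.strip()
    if not is_noun(word):
        return word                               # non-nouns are used as-is
    determiner = rng.choice(DETERMINERS)          # assumption: picked at random
    if determiner == "a" and word[0].lower() in "aeiou":
        determiner = "an"                         # assumption: not mentioned in the interview
    return f"{determiner} {word}"

print(start_line("lion", random.Random(0)))
print(start_line("run"))
```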
After that, the machine, like I said, is predicting the next letter or space or punctuation mark over and over again to generate text, but really this process starts before that first word has been inserted. What I mean is it starts with a ‘pre-seed’, which is a sequence of text the model itself produced at some point in the past but that I curated because I thought it was really good. The reason I do that is because an LSTM is a state-based machine: when it's making a prediction, the state that it's in is based on the characters that came before. So if you give an LSTM a good piece of its own output as the pre-seed, you'll be putting the model back into the state it was in when it produced that high-quality output. It's a sort of hack to get the machine to produce better output consistently.
Anyway, back to the process: the pre-seed exists, it’s fed a word, and that evolves into the desired seed. From the desired seed it keeps writing, and that's where the line of poetry comes in: another 90 or so characters past the pre-seed are produced. Then it stops and you find a good place to cut it off, so it's not in the middle of a word. By using the same pre-seed over and over again for every line that gets produced, it creates thematic and tonal consistency between the generated lines because they all secretly had the same line before them, if that makes sense!
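The pre-seed process can be sketched as follows. This is an illustration only: `next_char` stands for whatever model predicts the next character, `toy_next_char` is a hypothetical placeholder that just cycles through a fixed phrase, and the way the pre-seed and seed are joined is an assumption.

```python
def trim_to_word_boundary(text):
    """Drop the final fragment after the last space, which may be a partial word."""
    cut = text.rfind(" ")
    return text if cut == -1 else text[:cut]

def write_line(next_char, pre_seed, seed, n_chars=90):
    """Generate one line: prime with the pre-seed, continue from the seed."""
    context = pre_seed + "\n" + seed    # the model "sees" the curated pre-seed first
    line = seed                         # only the seed and its continuation are shown
    for _ in range(n_chars):
        ch = next_char(context)
        context += ch
        line += ch
    return trim_to_word_boundary(line).rstrip()

def toy_next_char(context):
    """Placeholder predictor (not an LSTM): cycles through a fixed phrase."""
    phrase = " sleeps beneath the column"
    return phrase[len(context) % len(phrase)]

print(write_line(toy_next_char, "An older, curated line.", "the lion"))
```

Because every call reuses the same `pre_seed`, each generated line starts from the same preceding context, which is the source of the tonal consistency described above.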
How has your collaboration with Es developed since the first project you worked on together?
I would say it's been my role in the collaboration that’s developed, which has essentially been to show Es the possibilities of these machines and the types of things they can produce given different types of inputs and different types of training materials. It’s evolved from PoemPortraits in that Please Feed The Lions is much more sculptural, and it's in a public space, so it's going to be very interesting to see the way the public reacts to this language and this poetry.
Es’ vision for these collaborations is to open it up as much as possible as she sees the generated poetry as an impartial conduit for the words people are submitting. For me, the more we can open up those submissions to larger and larger groups, the more things will be possible. The material gets richer, and there are things you can do in machine intervention with 1,000 people submitting words that you can’t do with 50 people.
What are the benefits when it's on this larger scale?
I think one of the values of technology is its scalability. For PoemPortraits we could've pulled it off with me writing 50 individual poems, but for this project it would be impossible to keep up with the real time crowd in Trafalgar Square. The machine can write a poem in less than a second and that scalability is an affordance of a system like this.
I don't think it's the most important value though. There’s also the way technology can shift creative roles. I used to be a traditional writer but technology has shifted my own personal role from writer to creator and operator of a machine that writes, which is maybe the future for everyone. It might not be as bleak a future as we imagine it could be because as this technology gets better, as it becomes more human level, and as we explore new interfaces for it and understand the best augmentations for this type of technology, we may end up with very natural feeling interfaces and experiences where people are able to write beyond their native capacities.
Do you find yourself ever having to convince people of the advantages of this technology?
Less and less actually. I like it when the work speaks for itself and people arrive at that positive impression without me even having to say it, which is sort of what people have gotten from my film Sunspring with Oscar Sharp. They see it and they aren't threatened by it because it's funny and it shows what machines can do.
I think understanding the difference between human thinking and machine thinking is one of the biggest challenges but we can confront it head on in order to have better experiences in the future.
What have been the main challenges for Please Feed The Lions?
The main challenges of this project have been the scaling and some speed improvements since PoemPortraits to accommodate a larger crowd. The biggest issue, though, has been finding a whitelist/blacklist approach that I'm confident enough about.
With PoemPortraits, we didn't really worry as much about bad words. In this case, it's a different context because people are submitting words that will be projected on a very public surface and at a very large scale. In that context, I think it’s important that people aren't able to sabotage the aesthetic by inserting words that would offend or degrade the experience.
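A whitelist/blacklist check of the kind mentioned above could be as simple as the sketch below. The interview does not describe the actual approach, so everything here is an assumption: the function name `is_allowed`, the single-alphabetic-word rule, and the example lists are all hypothetical.

```python
def is_allowed(word, allowlist=None, blocklist=frozenset()):
    """Decide whether a submitted word may be shown publicly."""
    w = word.strip().lower()
    if not w.isalpha():                 # assumption: one alphabetic word per submission
        return False
    if w in blocklist:                  # blacklist: reject known bad words
        return False
    if allowlist is not None and w not in allowlist:
        return False                    # whitelist: accept only known good words
    return True

blocklist = {"badword"}                 # placeholder entry
allowlist = {"lion", "love", "death"}   # placeholder entries

print(is_allowed("Lion", blocklist=blocklist))
print(is_allowed("ocean", allowlist=allowlist, blocklist=blocklist))
```

The two lists trade off differently: a blacklist lets through anything it has not seen, while a whitelist rejects anything it has not seen, which is safer on a public surface but limits what visitors can submit.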
What do you want people to go away with, after taking part in the project?
I want people to think about the ways that we use language – we don't just use language, language is also something that uses us. We need to be conscious of that, so when you submit a word and see that word blown up enormously and then placed into a context that you might not have expected, you're going to realise hopefully that the word has a life of its own and it always has.
What has been the best word you've submitted to the machine?
I have this other bot called Lexiconjure, which is a bot that defines invented words using an LSTM. It's a Twitter bot and you can tweet a word at it and it'll write a definition for your fake word based on what it's learned from reading the Oxford English Dictionary over and over again. A friend of mine submitted the word ‘love’ and the response, I'll never forget, was: "Result of a persons or animals response to a problem or difficulty; she loved the music of the new employee". Ever since then I've always submitted the word ‘love’ to any machine like this. The other word I like to submit is ‘death’, because, while they're both pretty cliché, I think these words conjure strong connotations with a number of topics and a number of symbols and images. Using a word that has broad connotations, you can produce a variety of output with a limited amount of input.