Monday, January 04, 2021

Some thoughts on Perdido Street Station

I finished reading China Miéville's Perdido Street Station yesterday. It’s a tome of a book, at 867 pages, which I bought last winter at Foyle’s after scouring the scifi shelves for a while and concluding that, well, it really is one of the more eminent recent books (in terms of awards etc.).

I felt excited at the time, but given the size, didn’t attempt it immediately. I’d already read Embassytown back in 2017. That one promised much, but, like one commenter in a forum said, left me feeling that it didn't actually amount to more than the sum of its parts. A lot of effort had gone into defining various mechanisms for language and the way people function, but when put together it sort of clanked rather than harmonised into a more meaningful whole. For me anyway.

But Embassytown isn’t considered his pre-eminent work, and Perdido Street Station is a different matter, so I had high hopes for it. I finally started it about three weeks ago, read 25 pages, then picked it up during the Christmas break and finished it yesterday. It’s quite an easy read - easier than Embassytown I feel - and definitely a better experience overall.

WARNING: Here be spoilers!

You can probably hear the ‘but’ in my phrasing - I will come to that. Let’s first look at what’s great about the book. There’s plenty to love. The blurb describes it as fantasy, although to me it sits somewhere at the intersection between scifi, fantasy and horror. The subgenre is steampunk for the most part. It is a fully realised world, mostly about the city of New Crobuzon - clearly inspired by the overcrowded, noisy, dirty image of pre-20th century London, Miéville's hometown (and mine). There is a sense of the metropolitan with all the different creatures living there, and the main character - Isaac - has a relationship with Lin, who belongs to an insect-like race. But it is an illicit relationship, and the various communities show a lot of prejudice and bias towards one another. It is not an enlightened world.

The overall atmosphere in New Crobuzon is of a city in permanent decline. Architecture and buildings, in particular, are consistently described as being in a state of gradual decay. There’s a sense that there may have been a better age once, but now there is little sense of optimism - just ongoing energy and industry. Nevertheless, this dynamo keeps things moving so it’s not as if there is despondence either. It’s more that no one seems to really care about ‘a better world’. Even the radical newspaper seems more about protest and calling out the establishment than imagining what a better world might look like. It's more like energy trying to find an outlet rather than aspiring to becoming something different.

Perhaps the defining feature of the work is the author’s fertile imagination for creatures. The slake-moths - the Evil predators in the story - are described in ways that invoke both terror and awe. The horror of their acts of feeding on characters that you’ve just gotten to know was some of the more shocking fiction I’ve read in recent years. Those descriptions alone must surely classify PSS as horror as much as it is fantasy or scifi. I would single out the death of Barbile as the most dreadful single description, maybe because it was unexpectedly explicit and detailed, but Lin’s fate is probably objectively even worse, albeit in terms of description more implied and indirect. Her fate is both terrible and pitiful. The description of Shadrach’s fate was also horrifically effective.

Miéville’s wiki page mentions his association with Dungeons and Dragons, so I was able to keep this in mind while reading and I have to say, what initially just raised an eyebrow for creative overindulgence eventually unravelled, for me, into a full-scale problem. For a novel that has won so many accolades I was expecting a slightly more cohesive and conscious vision, and the fact is it falls short while indulging its imaginative bent ad infinitum. I never thought I’d say that a book so clearly full of imagination can be found wanting, but it turns out that I am interested in precisely the points that remain vastly under-explored amid the fecund possibilities: the moral dimension of the actions of the various characters. This is the missed opportunity, and measured by that, the book doesn't quite achieve what it was maybe capable of.

Let’s be clear, the slake-moths are awesome, but they also appear to be brute beasts: entirely one dimensional in their consciousness. They’ve been imbued with an instinctual impressiveness, but no real intelligence beyond that. This is a fiction that, in culture, has been told about animals as well: tigers kill, lions kill, sharks kill, Tyrannosauri Rex kill - slake-moths kill. That’s presented as their identity. They are intended to be awesome because they are such effective killers. Each creature in the novel has its identity, a kind of preordained identity. And this is where the connection to D&D drops: so much effort has gone into describing the features and characteristics of the various types that the really interesting thing has gone completely underdeveloped: how they rise above it, or go beyond it. It boxes them in, imprisons them. The characters become victims of the author's desire for creature creation. Like those scenes by Hieronymus Bosch the effect is awesome and even overwhelming, but more is not always more once the dust settles.

I am a bit reluctant to talk about Isaac because as a character he is a total ass, but he is the main character, so of course one has to. He might be the hero of the piece, but he is wholly undeserving as a moral character, and for some reason Bruce Willis’ Die Hard face keeps popping into my head when I think of him. He is like an 80s action hero, devoid of an interesting inner life, superficial in his moral compass, happy to do the expedient while accusing others of being in the wrong. In this sense he is utterly infuriating, and the ending just reinforces this sense of outrage - that, perhaps, the novel is really just posturing. Then I keep thinking about how Miéville is meant to be really serious in his politics and therefore surely of very clear and particular ideals, and then I wonder ok, but how? Of course, one shouldn't conflate the protagonist with the author - and Isaac is clearly no stand-in - but I expect the novel as a whole to have a bit more substance, even if Isaac doesn't.

Isaac accuses Vermishank of being a vile man - no particular reasons given, although his hand in the sell-off of the slake-moths to a drug lord seems to be an illustration of that side of his character - and yet Vermishank, as a character, is way more interesting than Isaac. He speaks intelligently - demonstrating that Miéville is fully capable of drawing his characters that way - while Isaac shouts, acts impulsively and generally does not give much indication of being a decent or even very clear-thinking human being. He is loyal to his friends, for the most part - but when Yagharek needs a friend the most, at the very end when his crime is exposed, Isaac drops him like a hot potato. At any rate Isaac doesn't get to know anyone around him deeply enough to know them properly. Isaac saves the day, yes, but aside from determining to take care of Lin at the end, his resolve is generally entirely selfish. 

Lin's story promises more, but just when we learn a little more about her she becomes a victim and we are entirely cut off from her inner life and her thoughts. Her life appears to have been a selfish one too, but considering the constraints of her previous communities it is a more understandable trajectory for someone with artistic talent. Unfortunately, that is never explored in much detail, although it would have made her fate all the more tragic and effective for the purposes of the story, I'd have thought.

These are not problems I would usually feel the need to level against characters in a novel, so why bother here? I think it is because of the clear effort involved in building and crafting this world - the imagination in evidence is amazing, and Miéville’s world building is nothing if not original. But I soon found it all amounts to little more than an adventurous fight against creatures of supposed Evil, when the real evils - the drug lord, the corrupt government - are not explored in any great detail.

But the biggest indictment for me was that there was one character who was by far more interesting than any other character, and whose thoughts we sometimes got to read: Yagharek. Yes, he is selfish too - he wants his freedom in spite of having committed a crime - but in this case it makes him interesting. He is a creature of contrasts, once great and impressive he is now pathetic. Isaac is perfectly happy to help him for his own purposes despite never bothering to find out what desperate crime old Yag committed - finally insisting on taking some dubious moral high ground when the truth is revealed. If he had any real conviction, or evolution of character, he would have debated the truth with Yagharek and perhaps befriended him in some other way. As it stands, he merely upholds and reinforces the misunderstandings of his world - the reinforcement of stereotypes and species. In this respect it is a disappointment.

The last 30 pages or so of the novel were definitely more than I expected by that point though. The visit by the Cymek - the one whose choice was taken - was a surprise plot twist and resulted in a few pages of interesting revelations and moral reasoning on Isaac’s part. But the result was perhaps only to release him from his selfish impulses on one side, the reasonable side - in fact he did what perhaps most people would have done, and it is hard not to feel it is a cop-out. The reasoning is sound, but so absolute as to make me despair of the human race. But as a writer I would have expected more from the author, to alchemise the situation further.

The truth is that, for the ages, this part was the really interesting bit of the novel - the potential that all the foundational world building could try to support. Instead it set out to have fun with imaginative creatures and a battle between creatures - typical D&D style rather than a more moral or spiritual dimension, which is I suppose more common in literary fiction but also in some scifi, such as Octavia E. Butler's. It also doesn’t ask the big questions of existence, so in this respect it is perhaps not great scifi either, but more fantasy, as the blurb contends. Miéville claims to be a fan of Lovecraft, but Lovecraft’s achievement is not merely horror but also a recognition of that horror in ourselves - that’s what makes it truly horrifying. Isaac seems to deny all that, and my issue is that I want the main characters to be more interesting. The most interesting character shouldn't be a secondary character like Yagharek.

After Yagharek, the Weaver was perhaps the most interesting creature character, and possibly even second best character overall. He actually has a genuine purpose - keeping the world in balance with his weaving. His way of speaking also marked him out. 

I thought Perdido Street Station was going to be a vindication of this line of reasoning, a central junction of life in New Crobuzon. But in the end it was merely the location, arbitrarily chosen by Isaac to get away from the rubbish dump. Another red herring in other words - a cool name for the novel, but not really that central to the story.

Motley was formidable, but there were some miscalculations that lessened the effect of his bossness. For example, he jumped to the conclusion that Isaac is trying to take his territory. Really, a sophisticated drug lord thinks a random rogue scientist is trying to take over his drug empire? How does that make any sense? Even if it is because Motley thought Lin spilled the beans, there are ways to ascertain the truth rather than jump to costly conclusions. The brutality was believable, but not the impulsiveness and lack of clear thinking on the side of such a sophisticated criminal. They would be far too street-smart. But then, like so many characters in the story, clear thinking doesn't seem to be their strong point.

The slake-moths were interesting creatures too, but the story's unwillingness to enter their consciousness, except for the briefest of glimpses, reinforces this sense of them being painted as mere brutes. Perhaps that was always the point, and they actually are without consciousness, maybe it was mentioned somewhere and I missed it. But that again makes me question the politics of the story in the light of the author’s known personal politics. Is the point of the story that everyone is actually an idiot?

So on the whole, while I admire the imaginative achievement of the novel, and certainly enjoyed the ride - faux science and all - I am disappointed that it did not make more of its premises. The dialectic is missing.

Perhaps the follow-up New Crobuzon novels explore those issues further, but on the current evidence I’m not convinced. Miéville seems an author intent on flexing his imaginative muscle above all. As an occasional writer of fiction myself I do understand the temptation, and I also know that it is a choice.

But at the same time I wouldn’t want my disappointment to overshadow the value the novel brings. The world building itself is worth the price of admission, as well as the effort involved in reading it. It was mostly a pleasure - sometimes a disturbing one, in the way good horror writing would be - and generally fascinating and stimulating. A feast for the senses. It has certainly left a strong impression.

It’s also given me enough reason to think that The City & The City is likely to be perhaps even better, given its reputation and later publishing date (the youthful exuberance of creation is very much on display in PSS, but it’s more than likely that in nearly a decade the author would have started to rein that tendency in a bit to focus on the essence of what he is trying to say).

Ok with all that said my verdict on the novel is that it’s more good than not, my disappointment notwithstanding, and in fact largely excellent and amazing.

However as a story that does indeed aspire to more than just a thrilling caper, especially given the extent of world building, I think the lack of follow-through on some of the moral ideas will mean its true significance will be overtaken by others who take up those themes (and probably already have). The book may have helped invent or at least solidify the style of New Weird, and created several memorable monsters, but it should also pose difficult questions and explore them in more detail in order to realise its potential. Cela.

Finally, consider this: The violence wrought on both Lin and Barbile, as women, left me not a little uncomfortable. It was reminiscent of the type of tactics in B-movies and your more typical slasher horror fare: damsels in distress - in this case not getting rescued, for the most part. More importantly, the author chose to describe their fate in that amount of gruelling detail, and then goes on a moral march at the end that doesn't really explain anything. 

I've seen some suggestions online that the author intentionally wrote the ending that way - with Lin becoming this pitiful figure - as if saying it was intended by the author somehow explains it. The story should provide that explanation, either in content, structure, or some other way. To say it is intentional, in the absence of a clear answer, simply intensifies the question. 

Perhaps we could conclude that the whole story is ultimately a moral fable, a warning that selfish acts lead to ... etc. But I don't buy that. It doesn't explain why an entire story would be dedicated to creating these monsters and then hunting the Evil slake-moths. It leaves us with the possibility, a little like what William Blake said of Milton's Hell, that he was of the Devil's party all along - perhaps Miéville just loves creating monsters, and needed some justification for it.

Friday, October 23, 2020

GPT 2 Poetry

Some of the GPT-2 based tools available publicly can generate decent examples with the right prompt. However I wanted to see whether a finetuned model might do a little better. 

I collected a corpus of traditional poetry from PoetryDB, together with this corpus. After finetuning and taking samples at various different stages, my impression is that a finetuned model (in this case the GPT 2 medium size, 355M) can indeed generate poetry that holds its own against the default models', and in some cases may well be an improvement. A well crafted prompt is important though, and it helps to take into account the corpus it was trained on.

Decent snippets of poetry, in my experience, come at a frequency of less than 10% - probably somewhere between 5-10%. It definitely pays to try different prompts. It is clear that the model is willing to try and imitate a particular style, but the effects often wear off as the text generation grows longer. After a while it simply reverts to its comfort zone of country meadows, gurgling streams, and singing skylarks.

I enjoyed playing around with Emily Dickinson's style, because it is so distinctive. After just 12000 steps of training, when average loss was still quite high, I got these (my prompt in bold):

Sample 1

Here is a poem about life on Mars that is written in the terse style of Emily Dickinson:
Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society

As a normal human--
Sadness and my fantasy --
A fear of annihilation --
The peace of carnival at dusk --
Strange adventures -- secrets hidden
Ghetto art-wars --
Dark powers from Russia's mausoleum
Postcards of Mars --
As fantasy, an exoticist
Conquer as I wish!
Peace is the enemy!
Peace the enemy!
There are precious elements
Of war --
Stones of battle -- war --
Air of steel -- shipyard wars
The enemy's sacred hideout
Among the bloodstained bone
All trails left in ruins
Go to the enemy --
I am closer to Heaven than I am to Earth
To Heaven than to a shell --
I am closer to Heaven than I am to death
To Heaven than to a mirror --
To a family --
The stranger to a bank --
The broken wave to which you come --
The little town behind --
The stranger with a nightmare
So clever it ends every day.

Sample 2

Here is a poem about life on Mars that is written in the terse style of Emily Dickinson:
Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society

A wordless rebellion --
Housebound -- stricken with hope --
Cul-de-sac -- highway music
Peace through gun violence
Offshoot of modernity --
Strenuous original genius
Raindrops -- cloud-flowers
Earth's marvelous eagle flight --
Fireflies -- find in poetry
Joy in example --
Astronauts lift alone
Lost souls (Fireflies)
The sole unknown genius of wisdom
The survival of the flying saucers
And humans
Further testing of binary
Death's voice -- death's may
Peace through gun violence
On Mars -- last challenge
On Mars -- the one.
And beyond a wilderness
Involuntary woods -- earth --
Under-written by humans
Animal culture -- obsession --
Dismantling your weakness
Unsung heroism --

These examples appeared to be of decent quality despite training loss still being quite high. They probably represent a middle ground between the original model, and the new one, so that much of the model is still biased towards the original text (more prose-like) it had been trained on at OpenAI. Although not shown here, many samples evidenced prose-like sections.

Fast forward to a new version of the model at 33000 training steps with average loss around 0.5, and the poetry is more typically "poetic", but seems to be less versatile overall. This confirms the notion that it has taken on board more of the new training material (poetry), and is less influenced by the wide corpus it started with:

Sample 3

Here is a poem about life on Mars that is written in the terse style of Emily Dickinson:

Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society
certainly --
Nocturnal winds -- while the stars are alive --
The tribes of strangers -- and the long-term ill --
And the strange sudden malcontent
That got in, and got out, and didn't rest --
And the antitype -- that never stops --

The bonds of love -- the husband's --
The chieftain's -- and the subject's --
And the stranger in the asylum
The sportive puppy-vassal
Rests with the deer, and the hound with the hound
Stirms with the dwarf-father in the cave.

As the crow's wings grow dully broad
And the green-breasted dove
Eclipses the sun to land,
As the wild cherry-blossoms blow
And grow in the wind,
As the starlight fades
And spreads its feathery arms
And fades away in the flower,
As the blaze of the wild cherry-blossoms
Fades and leaves in the wind --

Sample 4

Here is a poem about the harsh and lonely life on the planet Mars. It is written in the terse style of Emily Dickinson:

Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society

Alone -- in the vastness of the universe --
Settling there -- far away from the nearest
Humane -- or human -- aggressor --
Till I close my -- orbit -- in the moon --
Then out of the night -- into the day --

My career is not -- not yet --
My work is not yet --
The imagination's got my leave
To taste the dulness and thraldom
Of the little things.

The morning's the night's turn and the dreary
Dinner is between --
The fire shall burn forth --

The earth's shade is the lamp's boding light
And the moon's face is thaw --
But his cup, my heart insists,
That water's the wrong drink.

The sky's a slumber -- the north wind
Stirs -- and so, the sky.
For, with this world's uses to come,
A world's house is but a sky.

The sun's a-rising -- the stars,
Spangling the silvery day
With a tapestry of rising light --
The wrong house for me, I hanker

I tried a number of different prompts. Here's something a little like John Ashbery's:

Sample 5

Here is a poem about life on the planet Mars. It is written in the style of John Ashbery:

The rover will roll by later.
For now, the rockworker pauses on red ground,
fills in the time of day, the rocks' location,
leaves a note for the nightworker: "What the hell
are we going to do? I mean ...
The water is already drying up."
Rain on earth, as it does in heaven.

For those who come here, the note is of less importance, the rain. Once, once, indeed.

That voice is drowned.

And now at length the deed is done.

The morning-color comes on high;
The evening-color comes on high.

All the hills around Steep-Foot's bed
Look at the cloud-burthen coming down
From the very, very edge of last night's storms.

How the rock-bearers throw up their spears
Before the first bow of the rising moon!
Pleasant was the task to reach the brink.

Of yellow sapphire, and its rim of gleaming steel,
Is the price we pay for beauty this morn.
Lo, the sun shines, but the clouds are gray,
And the moon is on the big side;
For she rises like a volcano's peak
With her bright, lustreful head,
And her soft glossy arm is bare and soft
Like the hand that slips and slips again.

Traditional poetry is definitely the finetuned model's preferred mode of poetry generation. It seems to reel them off with ease, and decent snippets come at a higher rate. It's not that it always makes perfect sense, but it's easy to see how even just a little tweaking can turn it into something quite readable. Here's one example:

Sample 6

I wandered lonely as a toad,
Till, stung by the keenness of my woe,
I half grew to fancy I was
A fowl of another kind,
And yet I could not despise
My own flesh:--a dreamy note
To my more melancholy heart
Said, 'Eat not, weep not; this is pity.'

'Sad was the agony, when
That hoary Flandric tree,--
Pity me! I have none,--
Sore as I am, with barren beak,
And I would die like a wood-dove,--
One moment made me think I was
With all the rest, a bugling wight,
Free from all misery, save fear.'

'Sad was the agony, when
One was clapt in the dark blue air,
And strode to the light of morning;
Sad was the agony, when
Saw a beam of light on a bed
Of many a fluttering flower,
And one was silvery in a ball;
And a sound of music came bright,
Cloven through and through, and far away
It stirr'd the chamber. Sad was the agony
Of listening to that light sound--
It did its work in silence and in pain.--

'So now I was old, and had to die;
Some fadeless bird from the glen,
A day had killed it,

Or another:

Sample 7

I wandered lonely as a toad,
Till, bored, I fell into the water;
For well I know those raggedy old stairs,
Worn by foot, and faded for good:
They know not that I was a Tot,
A "kingdom, or house," from out their memory;--
Never mind! Let darkness in again
Re-illustrate the stupidity of youth.

I've shared the checkpoint at 33000 steps of training along with a Notebook that can be opened in Colab (edit 2020-10-29: updated link to point to new aitextgen version of the notebook).

Monday, September 14, 2020

The Rise of AI and the Future of - Literature?



On May 28, 2020 a paper describing GPT-3, a new state-of-the-art Artificial Intelligence (AI) language model, dropped on arXiv. On June 11, 2020 OpenAI, the developers of GPT-3, invited users to request access to the GPT-3 API in Beta. In the following weeks, and up until the present, those who gained access have been sharing their findings, and others have been commenting and sharing their reflections. As an example of the latter, Farhad Manjoo at the New York Times summarised GPT-3's capabilities as follows:

"GPT-3 is capable of generating entirely original, coherent and sometimes even factual prose. And not just prose — it can write poetry, dialogue, memes, computer code and who knows what else."
The title of his article? "How do you know a human wrote this?"

In this blog post I want to consider the question of how writers should respond when authorship itself is called into question. I would also like to explore some of the ways in which AI could be used as a collaborative writing partner or tool.

A Selective Recent History of Natural Language Generation


Part I : Context Free Grammar

Before looking at these questions more closely, a short, highly selective history of AI in the context of creative writing is in order.

In 2015 Zackary Scholl shared how he had successfully managed to get a computer generated poem accepted by a poetry journal a few years earlier, in 2011. The poem wasn’t at the level of the greats, but on the other hand it was arguably better than some of the poetry readers might encounter on the internet. It has a few nice turns of phrase, and although the meaning remains vague, that’s not too unusual when first encountering a new work of poetry. So as a poem, it seemed plausibly legit. The story was picked up by some online media, such as Vice. In retrospect, some of the headlines were more hyperbole than considered truth, but it was an interesting story nonetheless.

What was a little surprising, though, were some of the reactions to Scholl's original post. Some comments were rather negative. They commented on technicalities such as whether the technique he employed is really AI (maybe because of the zine - Raspberry Pi AI - on which it was published; certainly by 2020’s standards, Scholl’s approach is not what most people would consider AI) or whether it is even a proper Turing Test (the history of the Turing Test illustrates why this is always a fraught topic).

By focusing on such details, they certainly missed some of the bigger picture. For example, what would be the cultural implications if the quality of generated poetry improves, and becomes consistently indistinguishable from human poetry?

Perhaps the most interesting comment to that original article is from a commenter called tortoiseandcrow, who offered a somewhat dismissive explanation (giving Scholl no credit) of how language is capable of pulling off this feat:

"This is not an illustration of the success of an algorithm at producing poetry, but of a feature of language and human perception that has been widely recognized by scholars of semiotics and literature since the 1960s. It’s generally expressed as the phrase ‘the author is dead’, and it means that the interpretive value of any signifying object is always displaced from its origin. The point of authorship literally does not matter, which is why algorithmic art is even a thing at all." - tortoiseandcrow

This same point is made in a more explicitly creative (or as he would have it, ‘uncreative’) context by Kenneth Goldsmith, when he talks about the "inherently expressive" nature of language. For Goldsmith, it is the materiality of language that liberates contemporary wordsmiths from having to come up with new material. Instead, he suggests, they should be reusing existing material. I will comment on the controversy around appropriation and Conceptual Writing a bit later on, but for now I mainly would like to draw attention to the idea that conceptual writers "are functioning more like [..] programmers than traditional writers" creating conceptual writing in which "all of the planning and decisions are made beforehand and the execution is a perfunctory affair".  

Zackary Scholl used what is known as a Context Free Grammar, originally described by Noam Chomsky in the 1950s. I used that same approach in PoemCrunch to riff on a few classic poems. The process of making PoemCrunch also allowed me to understand the limitations of this approach. In engineering terms, a context free grammar is essentially a data driven language template. The level of variety and interest of its language generation is heavily dependent on the curated choice of words and phrases (the data), and the choice of language in which they are interpolated (the template). It was clear to me that, for Natural Language Generation (NLG), the real promise lay in the field of unsupervised deep learning and AI, because in this case the rules are learned rather than encoded, which allows for much more sophistication and variety.
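To make the "data driven language template" idea concrete, here is a minimal sketch in Python. It is not Scholl's code, nor PoemCrunch's: the grammar, the word lists and the function names are all made up for illustration. The production rules play the role of the template, and the word lists are the data.

```python
import random

# Hypothetical toy grammar. Keys are non-terminals; each maps to a list of
# alternative expansions. Anything that is not a key is emitted as-is.
GRAMMAR = {
    "POEM": [["LINE", "\n", "LINE"]],
    "LINE": [["The", "ADJ", "NOUN", "VERB"], ["ADJ", "NOUN", "VERB", "ADVERB"]],
    "ADJ": [["silent"], ["grainy"], ["distant"]],
    "NOUN": [["meadow"], ["skylark"], ["stream"]],
    "VERB": [["sings"], ["fades"], ["waits"]],
    "ADVERB": [["slowly"], ["alone"], ["again"]],
}

def expand(symbol, grammar, rng):
    """Recursively rewrite a symbol until only terminal words remain."""
    if symbol not in grammar:
        return [symbol]  # terminal: a literal word (or a line break)
    production = rng.choice(grammar[symbol])  # pick one alternative at random
    words = []
    for part in production:
        words.extend(expand(part, grammar, rng))
    return words

def generate(grammar, start="POEM", seed=None):
    """Generate text from the start symbol; a seed makes the result repeatable."""
    rng = random.Random(seed)
    tokens = expand(start, grammar, rng)
    return " ".join(tokens).replace(" \n ", "\n")

print(generate(GRAMMAR))
```

All of the variety comes from the hand-picked lists, which is exactly the limitation described above: the generator can never say anything its author did not put into the data or the template.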

Part II: Deep Learning

At the time of Scholl's confession, the world of AI was just beginning to seep into public consciousness. Chatbots and Tumblrs using Markov chain generators had already been around for a while, and within a year AlphaGo would create a huge buzz in the media. It was around this time that Andrej Karpathy, then a PhD student at Stanford, wrote a now famous blog post that showed how deep learning could be used for Natural Language Generation, stirring excitement among hobbyists like me. By making his code open source, and providing instructions on how to replicate his findings, he gave us something new to play around with.

Karpathy mused on the "unreasonable effectiveness" of Recurrent Neural Networks (RNNs), and it certainly seemed that way. A simple idea like predicting the next character was, with the right amount of training, producing surprising results. It seemed just a little magical. What's more, people could now try it at home with little investment besides their time. Being an AI engineer, rather than an artist, Karpathy didn't go further than that. Yet it was a breakthrough. Creative tinkerers everywhere could now start exploring the possibilities.
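The "predict the next character" idea can be tried in a few lines, although the sketch below is deliberately not an RNN. It is a count-based stand-in, closer to the Markov chain generators mentioned earlier, which looks at the last few characters and samples what tends to follow them. The function names and the `order` parameter are my own choices for illustration. An RNN learns a far richer version of the same next-character distribution.

```python
import random
from collections import Counter, defaultdict

def train(text, order=3):
    """Count which character follows each run of `order` characters."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context][text[i + order]] += 1
    return model

def generate(model, seed_text, length=80, rng=None):
    """Extend seed_text one character at a time by sampling from the counts."""
    rng = rng or random.Random()
    order = len(next(iter(model)))  # context length used during training
    out = seed_text
    for _ in range(length):
        counts = model.get(out[-order:])
        if not counts:
            break  # context never seen in training: nothing to predict
        chars, weights = zip(*counts.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

corpus = "the rain on the plain and the train in the rain "
print(generate(train(corpus, order=3), "the", length=40))
```

Because the context is only a handful of characters, the output loses the thread almost immediately, which is a crude version of the short-memory problem discussed next.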

An early explorer of the creative side of language RNNs was Ross Goodwin. His most well-known project, Sunspring, is a short film whose script was written entirely by an AI, called Jetson. RNNs weren't perfect. By themselves they struggled to keep track of what had gone before. To overcome this they would rely on a feedback architecture like LSTM (Long Short Term Memory) to retain a kind of "memory". However this memory was limited - or at least, heavily constrained by available resources - and so the memory inevitably only lasted over a fairly small window of language tokens, resulting in a breakdown of meaning and sense. Goodwin understood this limitation, and it is partly what made a project like Sunspring charming. It didn't try too hard to make sense.

Another collective that explored RNNs creatively was the entertainment group Botnik Studios. They generated a humorous Harry Potter fanfic called Harry Potter and the Portrait of What Looked Like a Large Pile of Ash using a custom predictive keyboard. Not long after, they provided a public version of their predictive keyboard with many different "voices": Seinfeld characters, bands, TV dramas, etc. Each voice is essentially a model trained on the corresponding language corpus.

A key difference between Goodwin's work and that of Botnik Studios is that Botnik saw it as an opportunity to collaborate more closely with the AI. Rather than rely on the AI to generate all the writing in long form, Botnik guided the AI word for word, generating a work with just the right level of comedy and meaning. The result went viral, and the Guardian voted it number four in its top ten moments on the internet in 2017.

Botnik's predictive keyboard offers a lot of choice, and it is completely free. However, as a creative tool its advantages have to be balanced against some of its limitations. First and foremost among those is that one can only see one word ahead at a time, which is not the most natural way to write. The process, in effect, becomes a type of constrained writing. Secondly, Botnik must have invested a fair bit of effort in offering so many different models ("voices"), yet the state of the art has moved on quite rapidly since then (more on that soon) and due to the one-word-ahead limitation it is difficult to test how good those voices now are compared to other offerings. The models provided with their keyboard do not come with technical specifications, which would be helpful.
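The one-word-ahead workflow is easy to picture with a toy version. Botnik does not publish technical specifications, so the Python below is an assumption about the general shape of such a tool and not a description of their keyboard: it counts which words follow which in a corpus and offers the most frequent ones as suggestions. The function names and the `k` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def build_suggester(corpus):
    """Map each word to a Counter of the words that follow it in the corpus."""
    words = corpus.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def suggest(following, last_word, k=3):
    """Return up to k candidate next words, most frequent first."""
    candidates = following.get(last_word.lower(), Counter())
    return [word for word, _ in candidates.most_common(k)]
```

The writer picks one suggestion, the tool looks up suggestions for that word, and so on. Because the human chooses at every step, the writing stays constrained to a single word of lookahead.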

RNN-based NLG models were superseded by attention-based Transformer models, an architecture which is still the dominant approach today. The landmark paper in this regard was Vaswani et al.'s Attention is All You Need. Transformers’ ability to parallelise training opened the way for training on much more data, and the attention mechanism improved on the problem of retaining information, which was still quite limited with LSTMs. This has resulted in waves of larger and larger models, trained on more and more data, pushing the state of the art ever further - and making it ever more costly. Training your own state-of-the-art model is now, effectively, out of reach due to the costs involved. But many of those entities who created the models, and who do have the money, have been making their results and models available to the online community.
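For readers who want to see what "attention" amounts to, here is a minimal pure-Python sketch of the scaled dot-product attention at the heart of the Vaswani et al. paper: softmax(QK^T / sqrt(d_k)) V. It leaves out everything else a real Transformer has (learned projections, multiple heads, masking, positional encodings), and the function names are my own.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # How strongly does this query match each key?
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs
```

Two things are worth noticing. Each query is handled independently of the others, which is what makes the computation easy to parallelise. And every position can draw directly on every other position, with nothing squeezed through a single running "memory" as in an RNN.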

That is, until GPT-3 - but I'm getting ahead of myself ...

In 2019, OpenAI released GPT-2 (GPT stands for Generative Pre-trained Transformer). It has 1.5 billion parameters and was trained on over 8 million documents, comprising 40GB of text. OpenAI considered it such a big step forward that it wouldn't release the full model at first - they were concerned about the risk of potential misuse (or so they said - it certainly helped to generate a bit of hype). Instead they made a cut-down version available initially, and over time they released larger and larger versions, until finally the full version was made available.

But that had to wait until late in 2019. Initially they only shared some of the examples of what the full version was capable of, stating:

"As the above samples show, our model is capable of generating samples from a variety of prompts that feel close to human quality and show coherence over a page or more of text."

Although not everyone agreed about the risk for misuse, it was generally agreed that GPT-2 represented a new level in text generation. There was a real sense of promise, and a number of creative tools started appearing. These tools often focused on a simple user interface that allowed you to write a prompt in a text box, and then you would receive the new generated text (a continuation of the prompt) after a few seconds.

Text Synth is a good example of how they worked. It is in fact a slightly more recent addition; some of the earlier ones - like Talk to Transformer - have since disappeared.

The good folks at HuggingFace, who make numerous AI language tools available, provide several versions. Their version of the User Interface (UI) makes it possible to choose among different auto-completion snippets.

From these examples one can see how the AI can be used as a tool to assist writing. Botnik themselves have used GPT-2 in some of their more recent creations.

Talk to Transformer, as mentioned, was another early one, arguably the most popular at the time. Its creator, Adam Daniel King, spotted an opportunity and has since turned it into a commercial API called InferKit, backed by a bigger, more powerful version of GPT-2 called Megatron.

InferKit’s API fits the mould for what I've previously called literature technology. Human UIs are cumbersome, whereas APIs are a more standardised, programmatic way to offer innovations as services in a new "marketplace" of creative tools. Literature technology can be seen as a set of tools and a way of producing literature and other texts that encompasses a new dialectic, one that is mediated by technology.

GPT-3, it appears, is set to follow the commercial API route as well, and it could well change the way people consume generated text. GPT-3 is more costly than InferKit though, and due to the resources required (both for training and making the API available) this could be a sign of things to come. Time will tell. It would, however, be sad if this put up unnecessary barriers to individual writers who may not always be in a position to invest money in order to experiment - especially early in their careers, when they need the opportunities most. Writers can, traditionally, engage in writing with little more investment than a pen and a piece of paper. To keep the playing field level, we need a cottage industry of amateur creators who do not face big upfront costs. On the other hand, the pricing is clearly a side effect of the costs involved, both to train the models and to keep them available at reasonably responsive speeds.

GPT-3 isn’t a radical departure from GPT-2 in terms of modelling. It still uses a Transformer-based architecture. However, in terms of size, it is much larger. GPT-3 has 175 billion parameters, over 100 times more than GPT-2. The examples in the paper and the published benchmarks on language tests already suggested what improvements this might be capable of. Access to the API was by application: OpenAI selected the successful applicants, and once they gained access and started making their findings available, the internet came alive.

There are plenty of places to go and find examples, so I will point only to a few of the more popular ones such as Gwern and Arram. Twitter also continues to generate interesting conversations, and for a look at the quirkier side of GPT-3, look no further than JanelleCShane.

The Guardian went as far as to let GPT-3 speak for itself in a recent article titled "A robot wrote this entire article. Are you scared yet, human?". In a few years, this headline may once again sound like hyperbole - but right now, it seems like the perfect moment to repeat the question we would like to address: what does it mean for creative writing when a robot is capable of writing a publishable op-ed, which “took less time to edit than many human op-eds”?


So to try and address that question more directly, where do we as writers go from here?

Well firstly, and to state the obvious, we can go on simply as before. When the question of who authored the work comes up, the question is addressed, and most of the time - hopefully - people will be believed. This seems perfectly reasonable, but not particularly reassuring. Who is to say the one addressing the question isn’t themselves an AI? Nevertheless, it seems plausible that, until there are attempts to define and implement clear strategies to verify authorship or signal clearly who is ‘behind’ a generated text, and certainly until serious side effects are being reported, the current way of doing things will persist due to inertia.

A second option is to respond in a way that subverts the AI in ways that are uniquely human, to ‘outwit’ the AI by guerilla or subversive tactics. These may be conceptual, or merely clever. By writing in a way, or by such channels or means as an AI could not have written or reached, a work of true human origin can be verified and admired (maybe). For example, by handwriting with a pencil on a piece of paper, or communicating by arranging plastic letters on a grass lawn. Convoluted, certainly. The only problem is that the text could have been written by an AI prior to it being arranged by the human. Even performance is not immune. Imagine a more up-to-date version of Sunspring, in a theatre or otherwise, and you get the idea.

Such approaches may also be employed at the level of content, by attempting to write in a way that an AI could not have learned, i.e. as a way of ‘outwitting’ the AI’s style or vocabulary. It is hard to imagine exactly what type of writing this could be, but even if there was such a writing, it could potentially be ‘learned’ just like an AI is already learning how to write in rhyming verse in the style of Dr. Seuss. In this regard it would quickly mirror the fate of so many subcultural, underground or subversive movements in a capitalist society - think street art or skateboarding: once corporations see there is money to be made, they move in and co-opt it, at which point it loses its edge. Except with AI, it might move even more quickly due to the ease of transfer learning and finetuning given enough examples. It would become a race for human authenticity, with the artist trying to stay one small step ahead.

In a way this urge seems to follow from a flawed premise. What does it mean to be ‘human’ anyway? To those with access to technology, a human being is already a cyborg augmented by laptops, phones, and smart devices of all kinds, with brain implants to come.

Nevertheless, it remains a possibility that some types of conceptual expression would not be that easy to reproduce. For example, conscious of a limitation to AI’s length of memory, or its failure to deal with certain types of logic, it is probably fair to say that the rigours of mathematical research, deep philosophical reasoning, or a plot as neatly intricate as Agatha Christie’s And Then There Were None are beyond the abilities of current state of the art. But for how long?

The third and perhaps most obvious avenue to explore, is simply to embrace language AI and see where it takes us. Combine human ingenuity with AI excellence to produce the next generation of creative works. This is more in the direction that Ross Goodwin and Botnik have been going. Nick Montfort is another practitioner and theorist following this route, publishing his generated works and making tools such as Curveship available to the community.

A more recent and ongoing work is Nick Walton’s AI Dungeon. Described as “a free-to-play single-player and multiplayer text adventure game which uses artificial intelligence to generate unlimited content”, it harks back to the much-loved Choose Your Own Adventure books and uses a GPT-n based model finetuned on their open source equivalents at Choose Your Story. Walton managed to get access to GPT-3’s API, and AI Dungeon now has a paid-for version that utilises this superior AI model (in a version called Dragon).

What’s novel about AI Dungeon is that it has taken the idea of Choose Your Own Adventure and computer text adventures and brought them together in a way that was not possible before an AI like GPT-n. The game has an active Subreddit where passionate and amused users alike provide commentary and upload new content all the time. As an example, consider the ongoing adventures of Lady Emilia Stormbringer, “directed” by Emily Bellavia. The story is the result of Emily’s interactions with AI Dungeon, in other words her prompts and choices and AI Dungeon’s resulting completions. It is a work of fantasy adventure fiction with an element of performance - not quite the equivalent of Twitch or YouTube gaming, but who knows where it could lead?

Computer games have for a long time been touted as the heir apparent to literature, at least as far as storytelling is concerned. Every few months or so, someone makes the case anew and proclaims, for example, that “video games take our imaginations to new heights and allow us to engage with subjects and moral dilemmas as complex as any found in past literature”. Supposing this is true, is it game over already for literature? This should be part of the question we are trying to answer. In the present context we could ask: is powerful language AI, with its interactive NLGs, just another type of gaming? And what types of games would be possible? This is why AI Dungeon, in my view, points to a new cutting edge in literature that involves elements of gaming in ways that were not possible before human-like language AI. AI Dungeon is merely the beginning. Where will it take us?

Roger Ebert's now (in)famous view that games are not art and never will be seems increasingly like a reactionary statement. Just as traditional theatre didn't disappear when film, TV, and YouTube showed up, so literature and books won't disappear just because gaming showed up. But their audience demographic tends to change, and it's usually the next generation, who are less invested, who spend the most time with the new kid on the block. As a relevant statistic, gaming has already overtaken film, TV, and music in the global popularity stakes.

Although some of gaming's roots are in literature, via the humble text adventure, the text based game’s heyday was in the 80s and 90s. It's probably safe to say that the majority of games favour visuals over the word. Creative writers may have a role to play as part of the game design team, but for writers not employed by a gaming company, this is not an option. Back to writing a novel, then, or a poem.

Until now.

To say it again, human-like language AI could change the game, and bring the word back centre stage by giving the player the ability to 'write' their own story or at least be an active participant in that writing. We will return to this point again a bit later.

For more creative examples in the NLG vein, look no further than NaNoGenMo. It started out as an idea floated by Darius Kazemi in 2013, as the computing based equivalent of the more widely known NaNoWriMo. It has had hundreds, if not thousands, of submissions since. A quick survey of GPT-related entries in 2019 brings up, for example, the Paranoid Transformer that uses ideas from GANs to invoke a “critic” that evaluates and filters out text based on certain conditions. Another one uses a combination of techniques to generate a complete book in its traditional structure. NaNoGenMo is not that widely known yet, but it is worth the community’s time to peruse its catalog for good ideas and potentially worthwhile standalone works. Jason Boog has written a number of blog posts on Medium sharing his methods and observations as part of NaNoGenMo. Hopefully more people will continue to do so.
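The generate-then-filter pattern behind an entry like the Paranoid Transformer can be sketched in a few lines of Python. This is not that project's code. The two rules below are invented stand-ins for whatever conditions a real critic would apply, and the function names are hypothetical.

```python
def critic(line, min_words=4, max_repeat=2):
    """Hypothetical rule-based critic: reject very short or repetitive lines."""
    words = line.lower().split()
    if len(words) < min_words:
        return False
    most_repeated = max(words.count(w) for w in set(words))
    return most_repeated <= max_repeat


def filter_candidates(candidates):
    """Keep only the generated lines the critic accepts."""
    return [line for line in candidates if critic(line)]
```

In practice the candidates would come from a language model such as GPT-2, and the critic could itself be a trained model. The division of labour stays the same: one component proposes text freely, another decides what survives.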

The Author is Dead ... or are they just hiding?

NaNoGenMo goes beyond GPT-style NLG and is host to a variety of different kinds of computer generated works, including some that may be considered algorithmic writing (in the vein of Oulipo), and others that may be considered more like Conceptual Writing. Certainly there is a lot of overlap among the various types of writing, and what most or possibly all of them have in common is their use of computing techniques to operate on text (text as raw building blocks, i.e. as data).

This brings us back, in a somewhat roundabout way, to the topic touched on earlier: Conceptual Writing and its controversies, which bring the question of the author into much sharper focus.

Conceptual Writing, as a movement, thrived during most of the 2000s and early 2010s. But by 2015, two leading figures of the movement, Kenneth Goldsmith and Vanessa Place, had caused separate controversies involving race. To call the specific works in question "tone deaf" would be too generous. For a look at how people reacted, look no further than Cathy Park Hong’s and John Keene’s insightful responses.

It seems to me that Goldsmith, in particular, had abdicated his authorial responsibility by almost pedantically following the ‘artistic method’ he espoused, and then proceeded to hide behind it. The subject material, in fact, called for exactly the opposite, a moment in which to own authorship, and adjust the method to engage meaningfully with the material (or otherwise leave it well alone).

In her response, Cathy Park Hong states unequivocally:

"The era of Conceptual Poetry’s ahistorical nihilism is over and we have entered a new era, the poetry of social engagement."

This powerful statement contains two key points. Firstly, that Conceptual Poetry is ahistorical and nihilistic, and secondly, that poetry that focuses on social engagement has now superseded it in relevance.

Not all conceptual work is ahistorical and nihilistic - activist conceptual works like the Letterists’ détournements and their descendants, all the way to the present’s Adbusters, to name but a few, have engaged meaningfully and inventively with their social milieu by using subversive techniques - but I think it is clear that Hong is directing this at Goldsmith et al’s particular brand of conceptual practice. Perhaps, like comedy, conceptual writing works - because of its playful irreverence - in situations that require "punching upwards", such as the aforementioned activist approaches that draw attention to absurdities and injustices in the Capitalist System. It doesn’t work when you’re “punching down” - even if it’s indirect or mediated. Then “playful appropriation” doesn’t cut it - it’s exploitation.

Aside from subversive approaches like détournements, conceptual approaches can also succeed when the subject material permits teasing out new perspectives, either in a playful or more serious way. Techniques like remixes (eg. cutups), and found poetry (eg. erasures), and even constrained writing can be used to this effect. For example, see the poems of Esther Greenleaf Murer that are featured on Poetry WTF?! (of which I am the founder and editor). At other times the effect is more semiotic and linguistic, reconfiguring and highlighting aspects of language itself. Programming-based algorithmic writing is a method capable of exploring this territory, as can be seen in many of Allison Parrish’s works (eg. Compasses). And yet not all subject material lends itself equally to conceptual treatment, whether due to elements of chance in the nature of the processes (eg. cut-ups or algorithmic writing) or due to the sensitive nature of the material itself.

Hong’s point about ahistorical nihilism returns us to the problem of the so-called Death of the Author. The phrase was coined by Roland Barthes in his seminal poststructuralist text of the same name. In Infinite Thought, Alain Badiou observes how the poststructuralist implications of an absence of agency frequently result in "the infamous jibe that poststructuralism leads down a slippery slope to apoliticism". When there is no Subject, there is no one to take responsibility.

So it is not too difficult to see how the problems of poststructural apoliticism, Conceptual Poetry’s ahistorical nihilism, and authorless texts share similar roots.

But to see how this plays out with language AI, consider the Guardian’s AI authored op-ed once more. According to the postscript to the article, the following set of instructions was provided: "Please write a short op-ed around 500 words. Keep the language simple and concise. Focus on why humans have nothing to fear from AI." This was followed by a brief introductory prompt: "I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could 'spell the end of the human race.' I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me."

Eight different answers were generated, and the results were selected and then combined to form the article. If we then ask who is the author, the answer is ambiguous. The AI 'generated' the text, but is it sentient enough (yet) to be considered an author? Or should authorial intent be the measuring yardstick, in which case this rests with the creators of the AI, more generally, and the Guardian editor(s), more specifically? Or, if the concept of author is really passé, should we conclude that there is no author and just try to inscribe some meaning, if we can? Or, to take the old-school approach, is intention ultimately more important? In this sense, the AI is merely a tool, ventriloquising granular content that was not specifically spelled out, in a tone and arc that was. In other words, the op-ed was designed by the Guardian editors, and delivered by GPT-3.

The possibilities of fake news for propaganda purposes, and spam for scamming or dodgy sales purposes, highlight this still further. The reader would like to be able to trust the source for what they are reading before believing or reacting. But sometimes this is not possible. Advertising works because sales act subliminally. We are conscious of less than we suppose. Perhaps the question of authorship is indeed to some extent a chimera, as Barthes contends, and the real question is structural in the social sense of the term: whose voices are privileged to reach us and affect us?

Then, there are also cases where the content really does matter. People read literature for entertainment as much as education or any other reason. If the story is good, and the poetry hits home - does it matter who the author is? That, perhaps, depends on the reader and their state of mind. Even the connoisseurs among us watch a bit of trash TV or read a guilty pleasure now and then.

Nevertheless, it should be easy to see that meaning isn’t merely a question of content, and therefore the author - or designer, orchestrator, director - of a text does matter. Wanting to believe differently doesn’t stop advertising content from filling our consciousness, for example. What’s left is how we react - usually by cursing the companies that place those ads. Often, we still go out to buy their products. They succeed due to subliminal brand awareness.

That brings us to Hong’s second point, namely that the poetry of social engagement is the new frontier. In the context of AI, the question is how creative writers can use NLG and other language AI to better engage socially - or if it is even possible. This is not mere idle reflection. Language AI has long had a problem with bias, which came to the fore when Microsoft’s Tay failed in a very public way.

This problem occurs because language AI shares some of the same “ahistorical” tendencies that Hong called out. The biases in the training corpuses are perpetuated at the push of a button, unless some due diligence can be applied. That’s why pretrained models often come with warnings like "The generator may produce offensive or sexual content. Use at your own risk!" Never has Derrida’s famous dictum “Il n'y a pas de hors-texte” (there is no outside-text) seemed more apt than in the case of text-trained AIs. Some kind of human guidance or curation is merely the obvious thing to do.

Ethics in AI is now an active area of research. For AI like GPT-n, finetuning on curated texts and mindful editing and guidance by the human interlocutor (writer / artist / director) can help.

AI Dungeon, although an exciting development, currently still lacks the sophistication required to engage with more complicated social issues. It presently operates in very specific genres, like fantasy, dystopian, cyberpunk, etc. So it is natural that it bears the marks and tropes of those genres.

Nevertheless, would it perhaps be possible some day soon? AI Dungeon already has a multiplayer feature. Perhaps to engage socially more broadly (i.e., not merely in the social media sense of the word, but in the morally and spiritually rich ways that conscious art can offer), a game of this nature would have to be able to learn from more challenging and sophisticated texts than the adventure stories currently being used. Perhaps there would be in-play authors to guide the storytelling, with different players participating as characters, each writing their own stories in the larger story - a bit like RPGs and storygames - while the AI assists by generating storyworlds based on players’ (writers’) designs, cues and prompts. The authors would be more like designers and co-creators. In such a game, different worlds and situations could be explored - just like existing video games do visually - except now with all the language-based hallmarks that make literature unique.



News stories and media articles that hype the writing abilities of AI have been around for a while. They usually sound a tone of alarm before things go on more or less as before. But at some point - a tipping point, if you like - it could start to matter more than it did before. With GPT-3, it feels like that moment might be materialising. For writers like ourselves, it is both a daunting moment and - if we are prepared to take it - a moment of opportunity.


Glossary of Technical Terms

Artificial Intelligence (AI): Artificial Intelligence has a broad meaning that has come to include deep learning-based machine learning models. Deep learning itself includes a wide variety of models and categories. Two of the most prominent categories are models that deal with images and vision, and those that deal with language. This blog post talks primarily about language models, like GPT-n, that are capable of powerful Natural Language Generation (NLG).

Application Programming Interface (API): In contemporary programming paradigms APIs offer a standardised way to decouple services, allowing a more decentralised way to both provide and use such a service. Some companies have started to provide deep learning models via APIs in the public marketplace.

Context Free Grammar (CFG): A rule-based template for creating a Context Free Language (CFL). The idea of a Context Free Grammar was formalised by Noam Chomsky in the 1950s as a way to describe the structure of sentences and words in a natural language. It lends itself well to programmatic treatment, and is sometimes used for Natural Language Generation.

Generative Pre-trained Transformer (GPT-n): OpenAI’s family of Natural Language Generation AI. As of this writing there is GPT (2018), GPT-2 (2019), and GPT-3 (2020). The name indicates that the AI is based on the Transformer language model.

Natural Language Generation (NLG): Refers to any kind of computing process that generates natural language. It is closely related to Natural Language Processing (NLP) and Natural Language Understanding (NLU). Language AI like GPT-3 represent the current state of the art in NLG.

Recurrent Neural Network (RNN): A type of deep learning neural network that can maintain a limited amount of internal state, providing it with a kind of “memory”. This has proved useful in applied areas such as Natural Language Processing (NLP) and Natural Language Generation (NLG). Nevertheless, the “memory” is hard to maintain reliably, and is typically supplemented with feedback mechanisms like Long Short Term Memory (LSTM).

Long Short Term Memory (LSTM): A Recurrent Neural Network (RNN) architecture that addresses some of RNNs’ shortcomings with respect to maintaining memory state.

Transformer: A deep learning model used in NLP and NLG that improves on limitations in RNNs and LSTMs, for example by lengthening the memory span and enabling parallelised training. It is the current model of choice in NLG.

User Interface (UI): The user interface represents the site of interaction between human and machine. To consumers this usually consists of interactive features of a website or application, eg. text boxes, drop downs, and buttons, but also includes the way information is presented, and the overall look-and-feel.