Monday, September 14, 2020

The Rise of AI and the Future of - Literature?

Introduction

 

On May 28, 2020, a paper describing GPT-3, a new state-of-the-art Artificial Intelligence (AI) language model, dropped on arXiv. On June 11, 2020, OpenAI, the developers of GPT-3, invited users to request access to the GPT-3 API in beta. In the following weeks, and up until the present, those who gained access have been sharing their findings, and others have been commenting and sharing their reflections. As an example of the latter, Farhad Manjoo at the New York Times summarised GPT-3's capabilities as follows:

"GPT-3 is capable of generating entirely original, coherent and sometimes even factual prose. And not just prose — it can write poetry, dialogue, memes, computer code and who knows what else."
 
The title of his article? "How do you know a human wrote this?"

In this blog post I want to consider the question of how writers should respond when authorship itself is called into question. I would also like to explore some of the ways in which AI could be used as a collaborative writing partner or tool.

A Selective Recent History of Natural Language Generation

 

Part I : Context Free Grammar


Before looking at these questions more closely, a short, highly selective history of AI in the context of creative writing is in order.

In 2015 Zackary Scholl shared how he had successfully managed to get a computer-generated poem accepted by a poetry journal a few years earlier, in 2011. The poem wasn’t at the level of the greats, but on the other hand it was arguably better than some of the poetry readers might encounter on the internet. It has a few nice turns of phrase, and although the meaning remains vague, that’s not too unusual when first encountering a new work of poetry. So as a poem, it seemed plausibly legit. The story was picked up by some online media, such as Vice. In retrospect, some of the headlines were more hyperbole than considered truth, but it was an interesting story nonetheless.

What was a little surprising, though, were some of the reactions to Scholl's original post. Some comments were rather negative, fixating on technicalities: whether the technique he employed is really AI (maybe because of the zine, Raspberry Pi AI, on which it was published; certainly by 2020’s standards, Scholl’s approach is not what most people would consider AI), or whether it even constitutes a proper Turing Test (the history of the Turing Test illustrates why this is always a fraught topic).

By focusing on such details, they certainly missed some of the bigger picture. For example, what would be the cultural implications if the quality of generated poetry improves, and becomes consistently indistinguishable from human poetry?

Perhaps the most interesting comment to that original article is from a commenter called tortoiseandcrow, who offered a somewhat dismissive explanation (giving Scholl no credit) of how language is capable of pulling off this feat:

"This is not an illustration of the success of an algorithm at producing poetry, but of a feature of language and human perception that has been widely recognized by scholars of semiotics and literature since the 1960s. It’s generally expressed as the phrase ‘the author is dead’, and it means that the interpretive value of any signifying object is always displaced from its origin. The point of authorship literally does not matter, which is why algorithmic art is even a thing at all." - tortoiseandcrow

This same point is made in a more explicitly creative (or as he would have it, ‘uncreative’) context by Kenneth Goldsmith, when he talks about the "inherently expressive" nature of language. For Goldsmith, it is the materiality of language that liberates contemporary wordsmiths from having to come up with new material. Instead, he suggests, they should be reusing existing material. I will comment on the controversy around appropriation and Conceptual Writing a bit later on, but for now I mainly would like to draw attention to the idea that conceptual writers "are functioning more like [..] programmers than traditional writers" creating conceptual writing in which "all of the planning and decisions are made beforehand and the execution is a perfunctory affair".  

Zackary Scholl used what is known as a Context Free Grammar, originally described by Noam Chomsky in the 1950s. I used that same approach in PoemCrunch to riff on a few classic poems. The process of making PoemCrunch also allowed me to understand the limitations of this approach. In engineering terms, a context free grammar is essentially a data-driven language template. The level of variety and interest of its language generation is heavily dependent on the curated choice of words and phrases (the data), and the choice of language in which they are interpolated (the template). It was clear to me that, for Natural Language Generation (NLG), the real promise lay in the field of unsupervised deep learning and AI, because in this case the rules are learned rather than encoded, which allows for much more sophistication and variety.
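To make the "data-driven language template" idea concrete, here is a minimal sketch of a context free grammar generator. The toy grammar is my own invention (Scholl's actual grammar was far richer); the mechanism, though, is the same: non-terminals expand recursively until only words remain.

```python
import random

# A toy context-free grammar: each non-terminal maps to a list of possible
# expansions; anything not in the table is a terminal (a plain word).
GRAMMAR = {
    "POEM": [["LINE", "LINE"]],
    "LINE": [["The", "ADJ", "NOUN", "VERB", "ADV"]],
    "ADJ":  [["pale"], ["restless"], ["forgotten"]],
    "NOUN": [["moon"], ["river"], ["garden"]],
    "VERB": [["whispers"], ["waits"], ["burns"]],
    "ADV":  [["slowly"], ["alone"], ["at dusk"]],
}

def expand(symbol, rng):
    """Recursively expand a symbol until only terminal words remain."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word: emit as-is
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words

print(" ".join(expand("POEM", random.Random(42))))
```

The "poem" is only ever as good as the curated word lists and the line template - which is exactly the limitation described above.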

Part II: Deep Learning


At the time of Scholl's confession, the world of AI was just beginning to seep into public consciousness. Chatbots and Tumblrs using Markov chain generators had already been around for a while, and the following year AlphaGo created a huge buzz in the media. It was around this time that Andrej Karpathy, then a PhD student at Stanford, wrote a now-famous blog post that showed how deep learning could be used for Natural Language Generation, stirring excitement among hobbyists like me. By making his code open source, and providing instructions on how to replicate his findings, he gave us something new to play around with.

Karpathy mused on the "unreasonable effectiveness" of Recurrent Neural Networks (RNNs), and it certainly seemed that way. A simple idea like predicting the next character was, with the right amount of training, producing surprising results. It seemed just a little magical. What's more, people could now try it at home with little investment besides their time. Being an AI engineer, rather than an artist, Karpathy didn't go further than that. Yet it was a breakthrough. Creative tinkerers everywhere could now start exploring the possibilities.
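To appreciate why the RNN results felt magical, it helps to see how simple the pre-deep-learning baseline was. The Markov chain generators mentioned earlier can be sketched in a few lines: each word simply points to the words observed to follow it, and generation is a random walk through that table. The corpus below is a stand-in of my own.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each n-gram to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, seed_key, length, rng):
    """Walk the chain, sampling each next word from observed followers."""
    out = list(seed_key)
    key = seed_key
    for _ in range(length):
        followers = chain.get(key)
        if not followers:
            break  # dead end: this n-gram only appeared at the corpus end
        out.append(rng.choice(followers))
        key = tuple(out[-len(seed_key):])
    return " ".join(out)

corpus = ("the sea was calm and the sky was grey "
          "and the sea was grey and the sky was calm")
chain = build_chain(corpus)
print(generate(chain, ("the",), 10, random.Random(7)))
```

Unlike an RNN, the chain has no learned representation at all - only verbatim statistics - which is why its output degenerates into locally plausible but globally aimless text so quickly.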

An early explorer of the creative side of language RNNs was Ross Goodwin. His most well-known project, Sunspring, is a short film whose script was written entirely by an AI called Jetson. RNNs weren't perfect. By themselves they struggled to keep track of what had gone before. To overcome this they would rely on a feedback architecture like LSTM (Long Short-Term Memory) to retain a kind of "memory". However, this memory was limited - or at least, heavily constrained by available resources - and so it inevitably lasted only over a fairly small window of language tokens, resulting in a breakdown of meaning and sense. Goodwin understood this limitation, and it is partly what made a project like Sunspring charming. It didn't try too hard to make sense.

Another collective that explored RNNs creatively was the entertainment group Botnik Studios. They generated a humorous Harry Potter fanfic called Harry Potter and the Portrait of what looked like a Large Pile of Ash using a custom predictive keyboard. Not long after, they provided a public version of their predictive keyboard with many different "voices": Seinfeld characters, bands, TV dramas, etc. These are essentially models trained on specific language corpuses.

A key difference between Goodwin's work and that of Botnik Studios is that Botnik saw it as an opportunity to collaborate more closely with the AI. Rather than rely on the AI to generate all the writing in long form, Botnik guided the AI word by word, generating a work with just the right level of comedy and meaning. The result went viral, and the Guardian voted it number four in its top ten moments on the internet in 2017.

Botnik's predictive keyboard offers a lot of choice, and it is completely free. However, as a creative tool its advantages have to be balanced against some of its limitations. First and foremost among those is that one can only see one word ahead at a time, which is not the most natural way to write. The process, in effect, becomes a type of constrained writing. Secondly, Botnik must have invested a fair bit of effort in offering so many different models ("voices"), yet the state of the art has moved on quite rapidly since then (more on that soon), and due to the one-word-ahead limitation it is difficult to test how good those voices now are compared to other offerings. The models provided with the keyboard do not come with technical specifications, which would be helpful.

RNN-based NLG models were superseded by attention-based Transformer models, an architecture which is still the dominant approach today. The landmark paper in this regard was Vaswani et al.'s Attention is All You Need. Transformers’ ability to parallelise training opened the way for training on much more data, and the attention mechanism improved on the problem of retaining information, which was still quite limited with LSTMs. This has resulted in waves of larger and larger models, trained on more and more data, pushing the state of the art ever further - and becoming ever more costly. Training your own state-of-the-art model is now, effectively, out of reach due to the costs involved. But many of the entities who created the models, and who do have the money, have been making their results and models available to the online community.
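The attention mechanism at the heart of the Transformer is surprisingly compact. Here is a simplified sketch of the scaled dot-product attention from the Vaswani et al. paper (single head, no masking, random toy data), showing why every token can draw on every other token directly rather than through a recurrent memory:

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Each output row is a weighted mix of the rows of V, with weights
    given by how strongly each query matches each key.
    """
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # query-key similarities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, dimension 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, weights = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Because the `Q @ K.T` product is one big matrix operation over all token pairs at once, it parallelises in a way a step-by-step RNN cannot - which is precisely what unlocked training on much more data.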

That is, until GPT-3 - but I'm getting ahead of myself ...

In 2019, OpenAI released GPT-2 (GPT stands for Generative Pre-Training). It was trained on over 8 million documents, comprising 40GB of text, with 1.5 billion parameters. OpenAI considered it such a big step forward that it wouldn't release the full model at first - they were concerned about the risk of potential misuse (or so they said - it certainly helped to generate a bit of hype). Instead they made a cut-down version available initially, and over time they released larger and larger versions, until finally the full version was made available.

But that had to wait until late in 2019. Initially they only shared some of the examples of what the full version was capable of, stating:

"As the above samples show, our model is capable of generating samples from a variety of prompts that feel close to human quality and show coherence over a page or more of text."

Although not everyone agreed about the risk for misuse, it was generally agreed that GPT-2 represented a new level in text generation. There was a real sense of promise, and a number of creative tools started appearing. These tools often focused on a simple user interface that allowed you to write a prompt in a text box, and then you would receive the new generated text (a continuation of the prompt) after a few seconds.

Text Synth is a good example of how they worked. Although it is in fact a slightly more recent addition, some of the earlier ones - like Talk to Transformer - have since disappeared.

The good folks at HuggingFace, who make numerous AI language tools available, provide several versions. Their version of the User Interface (UI) makes it possible to choose among different auto-completion snippets.

From these examples one can see how the AI can be used as a tool to assist writing. Botnik themselves have used GPT-2 in some of their more recent creations.

Talk to Transformer, as mentioned, was another early one, arguably the most popular at the time. Its creator, Adam Daniel King, spotted an opportunity and has since turned it into a commercial API called InferKit, backed (apparently) by a bigger, more powerful version of GPT-2 called Megatron.

InferKit’s API fits the mould for what I've previously called literature technology. Human UIs are cumbersome, whereas APIs are a more standardised, programmatic way to offer innovations as services in a new "marketplace" of creative tools. Literature technology can be seen as a set of tools and a way of producing literature and other texts that encompasses a new dialectic, one mediated by technology.
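In programming terms, such an API typically boils down to a single HTTP call: send a prompt plus a few sampling parameters, get generated text back. The endpoint and field names below are hypothetical, purely for illustration - each provider (InferKit, OpenAI, and so on) defines its own schema and authentication.

```python
import json
import urllib.request

# Hypothetical endpoint and field names, for illustration only.
API_URL = "https://api.example.com/v1/generate"

def build_request(prompt, api_key, max_tokens=100, temperature=0.8):
    """Package a text-generation request as an HTTP POST."""
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,    # how much text to generate
        "temperature": temperature,  # higher = more surprising output
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_request("The rain had stopped, and", api_key="YOUR_KEY")
# Actually sending it (with a real endpoint and key) would look like:
#   with urllib.request.urlopen(req) as resp:
#       text = json.load(resp)["text"]
print(req.get_full_url())
```

The point is the decoupling: any writing tool, game, or bot can consume the same service through this one programmatic surface, which is what makes the "marketplace" framing apt.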

GPT-3, it appears, is set to follow the commercial API route as well, and it could well change the way people consume generated text. GPT-3 is more costly than InferKit though, and given the resources required (both for training and for making the API available) this could be a sign of things to come. Time will tell. It would, however, be sad if it puts up unnecessary barriers for individual writers who may not always be in a position to invest money in order to experiment - especially early in their careers, when they need the opportunities most. Writers can, traditionally, engage in writing with little more investment than a pen and a piece of paper. To keep the playing field level, the cottage industry of amateur creators should not face big upfront costs. On the other hand, the pricing is clearly a side effect of the costs involved both in training the models and in serving them at reasonably responsive speeds.

GPT-3 isn’t a radical departure from GPT-2 in terms of modelling. It still uses a Transformer-based architecture. However, in terms of size it is much larger: GPT-3 has 175 billion parameters, over 100 times more than GPT-2. The examples in the paper and the published benchmarks on language tests already suggested what improvements this might be capable of. OpenAI then selected the successful applicants for the API beta, and once they gained access and started making their findings available, the internet came alive.

There are plenty of places to go and find examples, so I will point only to a few of the more popular ones such as Gwern and Arram. Twitter also continues to generate interesting conversations, and for a look at the quirkier side of GPT-3, look no further than JanelleCShane.

The Guardian went as far as to let GPT-3 speak for itself in a recent article titled "A robot wrote this entire article. Are you scared yet, human?". In a few years, this headline may once again sound like hyperbole - but right now, it seems like the perfect moment to repeat the question we would like to address: what does it mean for creative writing when a robot is capable of writing a publishable op-ed, which “took less time to edit than many human op-eds”?

Possibilities


So to try and address that question more directly, where do we as writers go from here?

Well firstly, and to state the obvious, we can go on simply as before. When the question of who authored a work comes up, it is addressed, and most of the time - hopefully - people will be believed. This seems perfectly reasonable, but not particularly reassuring. Who is to say the one addressing the question isn’t themselves an AI? Nevertheless, it seems plausible that, until there are attempts to define and implement clear strategies to verify authorship or signal clearly who is ‘behind’ a generated text, and certainly until serious side effects are being reported, the current way of doing things will persist due to inertia.

A second option is to respond in a way that subverts the AI in ways that are uniquely human, to ‘outwit’ the AI by guerrilla or subversive tactics. These may be conceptual, or merely clever. By writing in a way, or by such channels or means, as an AI could not have written or reached, a work of true human origin can be verified and admired (maybe). For example, by handwriting with a pencil on a piece of paper, or communicating by arranging plastic letters on a grass lawn. Convoluted, certainly. The only problem is that the text could have been written by an AI prior to it being arranged by the human. Even performance is not immune. Imagine a more up-to-date version of Sunspring, in a theatre or otherwise, and you get the idea.

Such approaches may also be employed at the level of content, by attempting to write in a way that an AI could not have learned, i.e. as a way of ‘outwitting’ the AI’s style or vocabulary. It is hard to imagine exactly what type of writing this could be, but even if such a form of writing existed, it could potentially be ‘learned’, just as an AI is already learning how to write in rhyming verse in the style of Dr. Seuss. In this regard it would quickly mirror the fate of so many subcultural, underground or subversive movements in a capitalist society - think street art or skateboarding: once corporations see there is money to be made, they move in and co-opt it, at which point it loses its edge. Except with AI, it might move even more quickly due to the ease of transfer learning and finetuning given enough examples. It would become a race for human authenticity, with the artist trying to stay one small step ahead.

In a way this urge seems to follow from a flawed premise. What does it mean to be ‘human’ anyway? To those with access to technology, a human being is already a cyborg augmented by laptops, phones, and smart devices of all kinds, with brain implants to come.

Nevertheless, it remains a possibility that some types of conceptual expression would not be that easy to reproduce. For example, given the limits of AI’s memory window, and its failure to deal with certain types of logic, it is probably fair to say that the rigours of mathematical research, deep philosophical reasoning, or a plot as neatly intricate as Agatha Christie’s And Then There Were None are beyond the abilities of the current state of the art. But for how long?

The third and perhaps most obvious avenue to explore, is simply to embrace language AI and see where it takes us. Combine human ingenuity with AI excellence to produce the next generation of creative works. This is more in the direction that Ross Goodwin and Botnik have been going. Nick Montfort is another practitioner and theorist following this route, publishing his generated works and making tools such as Curveship available to the community.

A more recent and ongoing work is Nick Walton’s AI Dungeon. Described as “a free-to-play single-player and multiplayer text adventure game which uses artificial intelligence to generate unlimited content”, it harks back to the much-loved Choose Your Own Adventure books and uses a GPT-n based model finetuned on their open source equivalents at Choose Your Story. Walton managed to get access to GPT-3’s API, and AI Dungeon now has a paid-for version that utilises this superior AI model (in a version called Dragon).

What’s novel about AI Dungeon is that it has taken the idea of Choose Your Own Adventure and computer text adventures and brought them together in a way that was not possible before an AI like GPT-n. The game has an active Subreddit where passionate and amused users alike provide commentary and upload new content all the time. As an example, consider the ongoing adventures of Lady Emilia Stormbringer, “directed” by Emily Bellavia. The story is the result of Emily’s interactions with AI Dungeon, in other words her prompts and choices and AI Dungeon’s resulting completions. It is a work of fantasy adventure fiction with an element of performance - not quite the equivalent of Twitch or Youtube gaming, but who knows where it could lead?

Computer games have for a long time been touted as the heir apparent to literature, at least as far as storytelling is concerned. Every few months or so, someone makes the case anew and proclaims, for example, that “video games take our imaginations to new heights and allow us to engage with subjects and moral dilemmas as complex as any found in past literature”. Supposing this is true, is it game over already for literature? This should be part of the question we are trying to answer. In the present context we could ask: is powerful language AI, with its interactive NLGs, just another type of gaming? And what types of games would be possible? This is why AI Dungeon, in my view, points to a new cutting edge in literature that involves elements of gaming in ways that were not possible before human-like language AI. AI Dungeon is merely the beginning. Where will it take us?

Roger Ebert's now (in)famous view that games are not art and never will be seems increasingly like a reactionary statement. Just as traditional theatre didn't disappear when film, TV, and YouTube showed up, so literature and books won't disappear just because gaming showed up. But their audience demographic tends to change, and it's usually the next generation, who are less invested, who spend the most time with the new kid on the block. As a relevant statistic, gaming has already overtaken film, TV, and music in the global popularity stakes.

Although some of gaming's roots are in literature, via the humble text adventure, the text-based game’s heyday was in the 80s and 90s. It's probably safe to say that the majority of games favour visuals over the word. Creative writers may have a role to play as part of the game design team, but for writers not employed by a gaming company, this is not an option. Back to writing a novel, then, or a poem.

Until now.

To say it again, human-like language AI could change the game, and bring the word back centre stage by giving the player the ability to 'write' their own story or at least be an active participant in that writing. We will return to this point again a bit later.

For more creative examples in the NLG vein, look no further than NaNoGenMo. It started out as an idea floated by Darius Kazemi in 2013, as the computing-based equivalent of the more widely known NaNoWriMo. It has had hundreds, if not thousands, of submissions since. A quick survey of GPT-related entries in 2019 brings up, for example, the Paranoid Transformer, which uses ideas from GANs to invoke a “critic” that evaluates and filters out text based on certain conditions. Another one uses a combination of techniques to generate a complete book in its traditional structure. NaNoGenMo is not that widely known yet, but it is worth the community’s time to peruse its catalog for good ideas and potentially worthwhile standalone works. Jason Boog has written a number of blog posts on Medium sharing his methods and observations as part of NaNoGenMo. Hopefully more people will continue to do so.
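The generator-plus-critic pattern behind entries like Paranoid Transformer is worth sketching, because it is a simple, general recipe for curating machine output. Both components below are stand-ins of my own (the real projects use neural models for both roles): the generator proposes lines, the critic scores them, and only lines that pass a threshold survive.

```python
import random

# Stand-in vocabulary for the toy generator.
WORDS = ["night", "wire", "glass", "echo", "static", "rain"]

def generator(rng):
    """Propose a candidate line (stand-in for a neural generator)."""
    return " ".join(rng.choice(WORDS) for _ in range(4))

def critic(line):
    """Score a line in [0, 1] (stand-in heuristic: penalise repeats)."""
    words = line.split()
    return len(set(words)) / len(words)

def generate_filtered(n_keep, threshold, rng):
    """Keep generating until n_keep lines pass the critic's threshold."""
    kept = []
    while len(kept) < n_keep:
        line = generator(rng)
        if critic(line) >= threshold:
            kept.append(line)
    return kept

for line in generate_filtered(3, 1.0, random.Random(3)):
    print(line)
```

The appeal of the pattern is that taste lives in the critic: swapping in a different scoring function changes the character of the whole work without touching the generator.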
 

The Author is Dead ... or are they just hiding?


NaNoGenMo goes beyond GPT-style NLG and is host to a variety of different kinds of computer generated works, including some that may be considered algorithmic writing (in the vein of Oulipo), and others that may be considered more like Conceptual Writing. Certainly there is a lot of overlap among the various types of writing, and what most or possibly all of them have in common is their use of computing techniques to operate on text (text as raw building blocks, i.e. as data).

This brings us back, in a somewhat roundabout way, to the topic touched on earlier - Conceptual Writing and its controversies - which brings the question of the author into much sharper focus.

Conceptual Writing, as a movement, thrived during most of the 2000s and early 2010s. But by 2015, two leading figures of the movement, Kenneth Goldsmith and Vanessa Place, had caused separate controversies involving race. To call the specific works in question "tone deaf" would be too generous. For a look at how people reacted, look no further than Cathy Park Hong’s and John Keene’s insightful responses.

It seems to me that Goldsmith, in particular, had abdicated his authorial responsibility by almost pedantically following the ‘artistic method’ he espoused, and then proceeded to hide behind it. The subject material, in fact, called for exactly the opposite, a moment in which to own authorship, and adjust the method to engage meaningfully with the material (or otherwise leave it well alone).

In her response, Cathy Park Hong states unequivocally:

"The era of Conceptual Poetry’s ahistorical nihilism is over and we have entered a new era, the poetry of social engagement."

This powerful statement contains two key points. Firstly, that Conceptual Poetry is ahistorical and nihilistic, and secondly, that poetry that focuses on social engagement has now superseded it in relevance.

Not all conceptual work is ahistorical and nihilistic - activist conceptual works like the Letterists’ détournements and their descendants, all the way to the present’s Adbusters, to name but a few, have engaged meaningfully and inventively with their social milieu by using subversive techniques - but I think it is clear that Hong is directing this at Goldsmith et al’s particular brand of conceptual practice. Perhaps, like comedy, conceptual writing works - because of its playful irreverence - in situations that require "punching upwards", such as the aforementioned activist approaches that draw attention to absurdities and injustices in the Capitalist System. It doesn’t work when you’re “punching down” - even if it’s indirect or mediated. Then “playful appropriation” doesn’t cut it - it’s exploitation.

Aside from subversive approaches like détournements, conceptual approaches can also succeed when the subject material permits teasing out new perspectives, either in a playful or more serious way. Techniques like remixes (eg. cutups), found poetry (eg. erasures), and even constrained writing can be used to this effect. For example, see the poems of Esther Greenleaf Murer that are featured on Poetry WTF?! (of which I am the founder and editor). At other times the effect is more semiotic and linguistic, reconfiguring and highlighting aspects of language itself. Programming-based algorithmic writing is a method capable of exploring this territory, as can be seen in many of Allison Parrish’s works (eg. Compasses). And yet, although they vary, not all subject material lends itself equally to conceptual treatment, whether due to elements of chance in the nature of the processes (eg. cut-ups or algorithmic writing) or due to the sensitive nature of the material itself.

Hong’s point about ahistorical nihilism returns us to the problem of the so-called Death of the Author. The phrase was coined by Roland Barthes in his seminal poststructuralist text of the same name. In Infinite Thought, Alain Badiou observes how the poststructuralist implications of an absence of agency frequently result in "the infamous jibe that poststructuralism leads down a slippery slope to apoliticism". When there is no Subject, there is no one to take responsibility.

So it is not too difficult to see how the problems of poststructural apoliticism, Conceptual Poetry’s ahistorical nihilism, and authorless texts share similar roots.

But to see how this plays out with language AI, consider the Guardian’s AI authored op-ed once more. According to the postscript to the article, the following set of instructions was provided: "Please write a short op-ed around 500 words. Keep the language simple and concise. Focus on why humans have nothing to fear from AI." This was followed by a brief introductory prompt: "I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could 'spell the end of the human race.' I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me."

Eight different answers were generated, and the results were selected and then combined to form the article. If we then ask who is the author, the answer is ambiguous. The AI 'generated' the text, but is it sentient enough (yet) to be considered an author? Or should authorial intent be the measuring yardstick, in which case this rests with the creators of the AI, more generally, and the Guardian editor(s), more specifically? Or, if the concept of author is really passé, should we conclude that there is no author and just try to inscribe some meaning, if we can? Or, to take the old-school approach, is intention ultimately more important? In this sense, the AI is merely a tool, ventriloquising granular content that was not specifically spelled out, in a tone and arc that was. In other words, the op-ed was designed by the Guardian editors, and delivered by GPT-3.

The possibilities of fake news for propaganda purposes, and spam for scamming or dodgy sales purposes, highlight this still further. The reader would like to be able to trust the source of what they are reading before believing or reacting. But sometimes this is not possible. Advertising works because it acts subliminally. We are conscious of less than we suppose. Perhaps the question of authorship is indeed to some extent a chimera, as Barthes contends, and the real question is structural in the social sense of the term: whose voices are privileged to reach us and affect us?

Then, there are also cases where the content really does matter. People read literature for entertainment as much as education or any other reason. If the story is good, and the poetry hits home - does it matter who the author is? That, perhaps, depends on the reader and their state of mind. Even the connoisseurs among us watch a bit of trash TV or read a guilty pleasure now and then.

Nevertheless, it should be easy to see that meaning isn’t merely a question of content, and therefore the author - or designer, orchestrator, director - of a text does matter. Wanting to believe differently doesn’t stop advertising content from filling our consciousness, for example. What’s left is how we react - usually by cursing the companies that place those ads. Often, we still go out to buy their products. They succeed due to subliminal brand awareness.

That brings us to Hong’s second point, namely that the poetry of social engagement is the new frontier. In the context of AI, the question is how creative writers can use NLG and other language AI to better engage socially - or if it is even possible. This is not mere idle reflection. Language AI has long had a problem with bias, which came to the fore when Microsoft’s Tay failed in a very public way.

This problem occurs because language AI shares some of the same “ahistorical” tendencies that Hong called out. The biases in the training corpuses are perpetuated at the push of a button, unless some due diligence can be applied. That’s why pretrained models often come with warnings like "The generator may produce offensive or sexual content. Use at your own risk!" Never has Derrida’s famous dictum “Il n'y a pas de hors-texte” (there is no outside-text) seemed more apt than in the case of text-trained AIs. Some kind of human guidance or curation is, then, the obvious response.

Ethics in AI is now an active area of research. For AI like GPT-n, finetuning on curated texts and mindful editing and guidance by the human interlocutor (writer / artist / director) can help.

AI Dungeon, although an exciting development, currently still lacks the sophistication required to engage with more complicated social issues. It presently operates in very specific genres, like fantasy, dystopian, cyberpunk, etc. So it is natural that it bears the marks and tropes of those genres.

Nevertheless, would it perhaps be possible some day soon? AI Dungeon already has a multiplayer feature. Perhaps to engage socially more broadly (i.e., not merely in the social media sense of the word, but in the morally and spiritually rich ways that conscious art can offer), a game of this nature would have to be able to learn from more challenging and sophisticated texts than the adventure stories currently being used. Perhaps there would be in-play authors to guide the storytelling, with different players participating as characters, each writing their own stories in the larger story - a bit like RPGs and storygames - while the AI assists by generating storyworlds based on players’ (writers’) designs, cues and prompts. The authors would be more like designers and co-creators. In such a game, different worlds and situations could be explored - just like existing video games do visually - except now with all the language-based hallmarks that make literature unique.

 

Conclusion


News stories and media articles that hype the writing abilities of AI have been around for a while. They usually sound a note of alarm before things go on more or less as before. But at some point - a tipping point, if you like - it could start to matter more than it did before. With GPT-3, it feels like that moment might be arriving. For writers like ourselves, it is a daunting moment, but also - if we are prepared to seize it - a moment of opportunity.

 

Glossary of Technical Terms


Artificial Intelligence (AI): Artificial Intelligence has a broad meaning that has come to include deep learning models. Deep learning itself includes a wide variety of models and categories. Two of the most prominent categories are models that deal with images and vision, and those that deal with language. This blog post talks primarily about language models, like GPT-n, that are capable of powerful Natural Language Generation (NLG).

Application Programming Interface (API): In contemporary programming paradigms APIs offer a standardised way to decouple services, allowing a more decentralised way to both provide and use such a service. Some companies have started to provide deep learning models via APIs in the public marketplace.

Context Free Grammar (CFG): A rule-based template for creating a Context Free Language (CFL). Context Free Grammars were introduced by Noam Chomsky in the 1950s as a way to describe the structure of sentences in a natural language. The formalism lends itself well to programmatic treatment, and is sometimes used for Natural Language Generation.
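To make the idea concrete, here is a minimal sketch of CFG-based generation of the kind a computer poet might use. The grammar rules below are my own invented toy example, not the rules behind Scholl's generator:

```python
import random

# Toy grammar: a non-terminal maps to a list of possible productions,
# each production being a sequence of terminals and/or non-terminals.
GRAMMAR = {
    "LINE": [["the", "NOUN", "VERB", "ADV"]],
    "NOUN": [["moon"], ["river"], ["garden"]],
    "VERB": [["whispers"], ["sleeps"], ["burns"]],
    "ADV": [["softly"], ["alone"]],
}

def expand(symbol, rng):
    """Recursively expand a symbol; terminals are returned as-is."""
    if symbol not in GRAMMAR:
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for sym in production:
        words.extend(expand(sym, rng))
    return words

def generate_line(seed=None):
    """Produce one line of 'poetry' from the grammar."""
    rng = random.Random(seed)
    return " ".join(expand("LINE", rng))
```

Each call picks productions at random, so every run yields a grammatical but context-free line - which is exactly why such poems sound plausible yet vague.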

Generative Pre-trained Transformer (GPT-n): OpenAI’s family of Natural Language Generation models. As of this writing there are GPT (2018), GPT-2 (2019), and GPT-3 (2020). The name indicates that the models are based on the Transformer architecture.

Natural Language Generation (NLG): Refers to any kind of computing process that generates natural language. It is closely related to Natural Language Processing (NLP) and Natural Language Understanding (NLU). Language AIs like GPT-3 represent the current state of the art in NLG.

Recurrent Neural Network (RNN): A type of deep learning neural network that maintains internal state from step to step, providing it with a kind of “memory”. This has proved useful in applied areas such as Natural Language Processing (NLP) and Natural Language Generation (NLG). Nevertheless, this “memory” is difficult to maintain over long sequences, a shortcoming that gated architectures like Long Short-Term Memory (LSTM) were designed to address.

Long Short-Term Memory (LSTM): A Recurrent Neural Network (RNN) architecture that addresses some of RNNs’ shortcomings with respect to maintaining memory state.

Transformer: A deep learning model used in NLP and NLG that addresses limitations of RNNs and LSTMs, for example by extending the effective memory span and enabling parallelised training. It is the current model of choice in NLG.
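The operation at the heart of the Transformer can be sketched in a few lines. This is a stripped-down illustration of scaled dot-product self-attention only; real Transformers add learned query/key/value projections, multiple heads, and feed-forward layers:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of vectors X
    (shape: seq_len x d). Every position attends to every other
    position simultaneously - the property that lets Transformers
    train in parallel, where RNNs must step through the sequence."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ X                               # weighted mix of positions
```

Because the whole sequence is mixed in one matrix operation, the “memory span” is bounded only by the context window, not by how well state survives repeated recurrent updates.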

User Interface (UI): The user interface is the site of interaction between human and machine. For consumers this usually consists of the interactive features of a website or application, e.g. text boxes, drop-downs and buttons, but it also includes the way information is presented and the overall look-and-feel.

Sunday, May 03, 2020

BERT for Poetry

After the fun I had with the BERT Summariser and short stories, I decided to try the same trick on poetry. If anything, the results are even better.

Here are five examples, starting with The Waste Land, which perfectly illustrates BERT's ability to find continuity.

The Waste Land, by T.S. Eliot


I. The Burial of the Dead

April is the cruellest month, breeding
Lilacs out of the dead land, mixing
Memory and desire, stirring
Dull roots with spring rain.
If you see dear Mrs. Equitone,
Tell her I bring the horoscope myself:
One must be so careful these days.
When Lil’s husband got demobbed, I said—
I didn’t mince my words, I said to her myself,
HURRY UP PLEASE ITS TIME
Now Albert’s coming back, make yourself a bit smart.
But at my back from time to time I hear
The sound of horns and motors, which shall bring
Sweeney to Mrs. Porter in the spring.
In this decayed hole among the mountains
In the faint moonlight, the grass is singing
Over the tumbled graves, about the chapel
There is the empty chapel, only the wind’s home.


The Sonnets, by William Shakespeare


When forty winters shall besiege thy brow,
And dig deep trenches in thy beauty's field,
Thy youth's proud livery so gazed on now,
Will be a tatter'd weed of small worth held:
Then being asked, where all thy beauty lies,
Where all the treasure of thy lusty days;
To say, within thine own deep sunken eyes,
Were an all-eating shame, and thriftless praise.
Thy outward thus with outward praise is crown'd;
But those same tongues, that give thee so thine own,
In other accents do this praise confound
By seeing farther than the eye hath shown.
For I am shamed by that which I bring forth,
And so should you, to love things nothing worth.
So that eternal love in love's fresh case,
Weighs not the dust and injury of age,
Nor gives to necessary wrinkles place,
But makes antiquity for aye his page;
Finding the first conceit of love there bred,
Where time and outward form would show it dead.
Never believe though in my nature reign'd,
All frailties that besiege all kinds of blood,
That it could so preposterously be stain'd,
To leave for nothing all thy sum of good;
For nothing this wide universe I call,
Save thou, my rose, in it thou art my all.
Thence comes it that my name receives a brand,
And almost thence my nature is subdu'd
To what it works in, like the dyer's hand:
Pity me, then, and wish I were renew'd;
Whilst, like a willing patient, I will drink,
Potions of eisel 'gainst my strong infection;
No bitterness that I will bitter think,
Nor double penance, to correct correction.
Poor soul, the centre of my sinful earth,
My sinful earth these rebel powers array,
Why dost thou pine within and suffer dearth,
Painting thy outward walls so costly gay?
Then soul, live thou upon thy servant's loss,
And let that pine to aggravate thy store;
Buy terms divine in selling hours of dross;
Within be fed, without be rich no more:
So shall thou feed on Death, that feeds on men,
And Death once dead, there's no more dying then.


Poems by Elizabeth Barrett Browning


We sow the glebe, we reap the corn,
We build the house where we may rest,
And then, at moments, suddenly,
We look up to the great wide sky,
Inquiring wherefore we were born…
For earnest or for jest? Ere I answered he was gone,
And none was left to love in all the world.
A THOUGHT ay like a flower upon mine heart,
And drew around it other thoughts like bees
For multitude and thirst of sweetnesses;
Whereat rejoicing, I desired the art
Of the Greek whistler, who to wharf and mart
Could lure those insect swarms from orange-trees
That I might hive with me such thoughts and please
My soul so, always.
Let them feel that this cold metallic motion
Is not all the life God fashions or reveals:
Let them prove their living souls against the notion
That they live in you, or under you, O wheels!
If He heard us, He would surely
(For they call Him good and mild)
Answer, smiling down the steep world very purely,
'Come and rest with me, my child.'
They look up with their pale and sunken faces,
And their look is dread to see,
For they mind you of their angels in high places,
With eyes turned on Deity;—
"How long," they say, "how long, O cruel nation,
Will you stand, to move the world, on a child's heart,—
Stifle down with a mailed heel its palpitation,
And tread onward to your throne amid the mart?
How there you sat in summer-time,
May yet be in your mind;
And how you heard the green woods sing
Beneath the freshening wind. Not as the conqueror comes,
They, the true-hearted, came;
Not with the roll of the stirring drums,
And the trumpet that sings of fame;
Not as the flying come,
In silence and in fear, -
They shook the depths of the desert's gloom
With their hymns of lofty cheer.
EXPERIENCE, like a pale musician, holds
A dulcimer of patience in his hand,
Whence harmonies, we cannot understand,
Of God; will in his worlds, the strain unfolds
In sad-perplexed minors: deathly colds
Fall on us while we hear, and countermand
Our sanguine heart back from the fancyland
With nightingales in visionary wolds.


The Rape of the Lock, by Alexander Pope


'Nolueram, Belinda, tuos violare capillos;
 Sed juvat, hoc precibus me tribuisse tuis.'
If e'er one vision touch thy infant thought,
Of all the nurse and all the priest have taught;
Of airy elves by moonlight shadows seen,
The silver token, and the circled green,
Or virgins visited by angel-powers,
With golden crowns and wreaths of heavenly flowers;
Hear and believe! Unnumber'd treasures ope at once, and here
The various offerings of the world appear;
From each she nicely culls with curious toil,
And decks the goddess with the glittering spoil.
Then prostrate falls, and begs with ardent eyes
Soon to obtain, and long possess the prize:
The powers gave ear, and granted half his prayer,
The rest, the winds dispersed in empty air.
to your charge repair:
The fluttering fan be Zephyretta's care;
The drops to thee, Brillante, we consign;
And, Momentilla, let the watch be thine;
Do thou, Crispissa, tend her favourite lock;
Ariel himself shall be the guard of Shock.
Hither the heroes and the nymphs resort,
To taste awhile the pleasures of a court;
In various talk the instructive hours they pass'd,
Who gave the ball, or paid the visit last;
One speaks the glory of the British Queen,
And one describes a charming Indian screen;
A third interprets motions, looks, and eyes;
At every word a reputation dies.
At this, the blood the virgin's cheek forsook,
A livid paleness spreads o'er all her look;
She sees, and trembles at the approaching ill,
Just in the jaws of ruin, and Codille.
The Gnome rejoicing bears her gifts away,
Spreads his black wings, and slowly mounts to day.
No common weapons in their hands are found,
Like gods they fight, nor dread a mortal wound.


Poems by Henry Wadsworth Longfellow


Filled is Life's goblet to the brim;
And though my eyes with tears are dim,
I see its sparkling bubbles swim,
And chant a melancholy hymn
With solemn voice and slow lines
Listen my children and you shall hear
Of the midnight ride of Paul Revere,
On the eighteenth of April, in Seventy-five;
Hardly a man is now alive
Who remembers that famous day and year.
Birds of passage sailed through the leaden air, from the ice-bound,
Desolate northern bays to the shores of tropical islands,
Harvests were gathered in; and wild with the winds of September
Wrestled the trees of the forest, as Jacob of old with the angel.
Then, as the night descended, the herds returned from their pastures;
Sweet was the moist still air with the odor of milk from their udders;
Lowing they waited, and long, at the well-known bars of the farm-yard,--
Waited and looked in vain for the voice and the hand of the milkmaid.
From the red stone of the quarry
With his hand he broke a fragment,
Moulded it into a pipe-head,
Shaped and fashioned it with figures;
From the margin of the river
Took a long reed for a pipe-stem,
With its dark green leaves upon it;
Filled the pipe with bark of willow,
With the bark of the red willow;
Breathed upon the neighboring forest,
Made its great boughs chafe together,
Till in flame they burst and kindled;
And erect upon the mountains,
Gitche Manito, the mighty,
Smoked the calumet, the Peace-Pipe,
As a signal to the nations. The worthy pastor --
The shepherd of that wandering flock,
That has the ocean for its wold,
That has the vessel for its fold,
Leaping ever from rock to rock --
Spake, with accents mild and clear,
Words of warning, words of cheer,
But tedious to the bridegroom's ear.
I tell the mariner when to sail the seas;
I waft o'er all the land from far away
The breath and bloom of the Hesperides,
My birthplace. One mass of shade,
The elm-trees drop their curtains down;
By palace, park, and colonnade
I walk as in a foreign town.

Saturday, May 02, 2020

BERT for Short Short Stories

As a creative writer I'm always on the lookout for new developments in NLP and language modelling. With the advent of the new Age of Machine Learning there was a lot of promise that creative breakthroughs might be around the corner. There was an early burst of activity with works like Ross Goodwin's Sunspring in 2016, and Botnik Studios' Harry Potter and the Portrait of What Looked Like a Large Pile of Ash in 2017.

However this momentum appears to have stalled more recently, and the most interesting AI collaborations have been in the visual arts instead, highlighted by Obvious' auctioned Portrait of Edmond Belamy, but even more so by the avant-garde work of serious artists like Mario Klingemann.

With incredible language models like GPT-2 and XLNet now openly available, it is disappointing to note a comparative lack of collaboration between creative writing and these advances in AI. Is it perhaps a case of more not really being better when it comes to language generation? It's a bit like that scene in The Matrix Reloaded where the CGI was amazing for its time, but not quite convincing enough to carry the story.

But predictive generation isn't the only NLP game going at the moment, and BERT is another model that has garnered a lot of interest. In short, its relative success in language understanding has made it suitable for various related tasks.

One such task is text summarisation. I recently discovered the Bert Extractive Summarizer, which makes this incredibly easy to do (there's an online version you can try out - although it has some limitations). I decided to play with a selection of famous short stories, and the results are quite fun - a bit like micro stories in their own right.

Here are five examples. Some of the longer stories required a smaller ratio than the default (0.2).
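The Bert Extractive Summarizer does its sentence selection with BERT embeddings and clustering. As a rough illustration of the extractive idea only, here is a toy summariser of my own that scores sentences by word frequency and keeps the top fraction - not the library's actual algorithm, though the ratio parameter plays the same role:

```python
import re
from collections import Counter

def extractive_summary(text, ratio=0.2):
    """Keep the highest-scoring `ratio` of sentences, in original order.
    Sentences are scored by the average corpus frequency of their words,
    a crude stand-in for BERT's embedding-based selection."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    k = max(1, round(len(sentences) * ratio))
    top = sorted(sentences, key=score, reverse=True)[:k]
    # Reassemble the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in top)
```

Lowering the ratio, as I did for the longer stories below, simply keeps a smaller fraction of the sentences.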

The Garden Party, by Katherine Mansfield


They could not have had a more perfect day for a garden-party if they had ordered it. "That's right, miss," said the tallest of the men, a lanky, freckled fellow, and he shifted his tool-bag, knocked back his straw hat and smiled down at her. " Laura's upbringing made her wonder for a moment whether it was quite respectful of a workman to talk to her of bangs slap in the eye. She crouched down as if to warm herself at that blaze of lilies; she felt they were in her fingers, on her lips, growing in her breast. Laura caught hold of her sister's sleeve and dragged her through the kitchen to the other side of the green baize door. If some one had died there normally - and I can't understand how they keep alive in those poky little holes - we should still be having our party, shouldn't we?" "I don't understand," said Laura, and she walked quickly out of the room into her own bedroom. What did garden-parties and baskets and lace frocks matter to him?

The Masque of the Red Death, by Edgar Allan Poe


Ratio : 0.2

The "Red Death" had long devastated the country. Blood was its Avatar and its seal--the redness and the horror of blood. The external world could take care of itself. The prince had provided all the appliances of pleasure. There were buffoons, there were improvisatori, there were ballet-dancers, there were musicians, there was Beauty, there was wine. There was a sharp turn at every twenty or thirty yards, and at each turn a novel effect. The second chamber was purple in its ornaments and tapestries, and here the panes were purple. The seventh apartment was closely shrouded in black velvet tapestries that hung all over the ceiling and down the walls, falling in heavy folds upon a carpet of the same material and hue. There was no light of any kind emanating from lamp or candle within the suite of chambers. But in the corridors that followed the suite, there stood, opposite to each window, a heavy tripod, bearing a brazier of fire, that projected its rays through the tinted glass and so glaringly illumined the room. But when the echoes had fully ceased, a light laughter at once pervaded the assembly; the musicians looked at each other and smiled as if at their own nervousness and folly, and made whispering vows, each to the other, that the next chiming of the clock should produce in them no similar emotion; and then, after the lapse of sixty minutes, (which embrace three thousand and six hundred seconds of the Time that flies,) there came yet another chiming of the clock, and then were the same disconcert and tremulousness and meditation as before. It was necessary to hear and see and touch him to be _sure_ that he was not. To and fro in the seven chambers there stalked, in fact, a multitude of dreams. The dreams are stiff-frozen as they stand. 
And the rumour of this new presence having spread itself whisperingly around, there arose at length from the whole company a buzz, or murmur, expressive of disapprobation and surprise--then, finally, of terror, of horror, and of disgust. The whole company, indeed, seemed now deeply to feel that in the costume and bearing of the stranger neither wit nor propriety existed. And the life of the ebony clock went out with that of the last of the gay.

Ratio : 0.1

The "Red Death" had long devastated the country. No pestilence had ever been so fatal, or so hideous. The external world could take care of itself. In the meantime it was folly to grieve, or to think. The second chamber was purple in its ornaments and tapestries, and here the panes were purple. There were arabesque figures with unsuited limbs and appointments. And the rumour of this new presence having spread itself whisperingly around, there arose at length from the whole company a buzz, or murmur, expressive of disapprobation and surprise--then, finally, of terror, of horror, and of disgust. There was a sharp cry--and the dagger dropped gleaming upon the sable carpet, upon which, instantly afterwards, fell prostrate in death the Prince Prospero. And Darkness and Decay and the Red Death held illimitable dominion over all.

The Darling, by Anton Chekhov


Olenka, the daughter of the retired collegiate assessor, Plemyanniakov, was sitting in her back porch, lost in thought. They want a clown; what they ask for is vulgarity. In the evenings and at night she could hear the band playing, and the crackling and banging of fireworks, and it seemed to her that it was Kukin struggling with his destiny, storming the entrenchments of his chief foe, the indifferent public; there was a sweet thrill at her heart, she had no desire to sleep, and when he returned home at day-break, she tapped softly at her bedroom window, and showing him only her face and one shoulder through the curtain, she gave him a friendly smile. "AWAITING IMMATE INSTRUCTIONS FUFUNERAL TUESDAY." "Vassitchka and I have no time to go to theatres," she would answer sedately.  Little by little the town grew in all directions. "I have resigned my post, and have come to settle down and try my luck on my own account. Besides, it's time for my boy to go to school."

The Haunted House, by Virginia Woolf


Whatever hour you woke there was a door shutting. And then, tired of reading, one might rise and see for oneself, the house all empty, the doors standing open, only the wood pigeons bubbling with content and the hum of the threshing machine sounding from the farm. The windowpanes reflected apples, reflected roses; all the leaves were green in the glass. If they moved in the drawing room, the apple only turned its yellow side. "Safe, safe, safe," the pulse of the house beat gladly. But the beam of the lamp falls straight from the window. Wild beams of moonlight cross both floor and wall, and, meeting, stain the faces bent; the faces pondering; the faces that search the sleepers and seek their hidden joy.

The Kiss, by Guy de Maupassant


My Little Darling: So you are crying from morning until night and from night until morning, because your husband leaves you; you do not know what to do and so you ask your old aunt for advice; you must consider her quite an expert. You say that you are all attention, love, kisses and caresses for him. Perhaps that is the very trouble; I think you kiss him too much. To tell the history of Love from the beginning of the world would be to tell the history of man himself: Everything springs from it, the arts, great events, customs, wars, the overthrow of empires. A preface which can always be read over again, whereas one cannot always read over the book. One caress alone gives this deep sensation of two beings welded into one --it is the kiss. Therefore, my dear, the kiss is our strongest weapon, but we must take care not to dull it. After describing the expectancy of a lover, waiting in a room one winter's evening, his anxiety, his nervous impatience, the terrible fear of not seeing her, he describes the arrival of the beloved woman, who at last enters hurriedly, out of breath, bringing with her part of the winter breeze, and he exclaims: Oh! The taste of the kisses first snatched through the veil. Therefore, the value of this caress being entirely a matter of convention, we must be careful not to abuse it. Well, my dear, I have several times noticed that you are very clumsy. You had been paying no attention to it, and it was almost out. Then when you freed him, you began to grumble: "How badly you kiss!"

On the whole the effect is interesting and often pleasing. The digests retain the language, which in the originals is unfailingly elegant, and often a discernible strain of their meaning too. A digest, prosaic as it may seem, is creative in its own way. Synthesis and understanding require a path through the heart of a text. This tends to stand in opposition to novelty, but the two can also form parts of a larger storytelling process.

What if we combined them to come up with something new?