Monday, September 14, 2020

The Rise of AI and the Future of - Literature?

Introduction

 

On May 28, 2020, a paper describing GPT-3, a new state-of-the-art Artificial Intelligence (AI) language model, dropped on arXiv. On June 11, 2020, OpenAI, the developers of GPT-3, invited users to request access to the GPT-3 API beta. In the following weeks, and up until the present, those who gained access have been sharing their findings, and others have been commenting and sharing their reflections. As an example of the latter, Farhad Manjoo at the New York Times summarised GPT-3's capabilities as follows:

"GPT-3 is capable of generating entirely original, coherent and sometimes even factual prose. And not just prose — it can write poetry, dialogue, memes, computer code and who knows what else."
 
The title of his article? "How do you know a human wrote this?"

In this blog post I want to consider the question of how writers should respond when authorship itself is called into question. I would also like to explore some of the ways in which AI could be used as a collaborative writing partner or tool.

A Selective Recent History of Natural Language Generation

 

Part I : Context Free Grammar


Before looking at these questions more closely, a short, highly selective history of AI in the context of creative writing is in order.

In 2015 Zackary Scholl shared how he had successfully managed to get a computer-generated poem accepted by a poetry journal a few years earlier, in 2011. The poem wasn't at the level of the greats, but on the other hand it was arguably better than some of the poetry readers might encounter on the internet. It had a few nice turns of phrase, and although the meaning remained vague, that's not too unusual when first encountering a new work of poetry. So as a poem, it seemed plausibly legit. The story was picked up by some online media, such as Vice. In retrospect, some of the headlines were more hyperbole than considered truth, but it was an interesting story nonetheless.

What was a little surprising, though, were some of the reactions to Scholl's original post. Some comments were rather negative, dwelling on technicalities such as whether the technique he employed is really AI (maybe because of the zine - Raspberry Pi AI - in which it was published; certainly by 2020's standards, Scholl's approach is not what most people would consider AI), or whether it is even a proper Turing Test (the history of the Turing Test illustrates why this is always a fraught topic).

By focusing on such details, they certainly missed some of the bigger picture. For example, what would be the cultural implications if the quality of generated poetry improves, and becomes consistently indistinguishable from human poetry?

Perhaps the most interesting comment on that original post came from a commenter called tortoiseandcrow, who offered a somewhat dismissive explanation (giving Scholl no credit) of how language is capable of pulling off this feat:

"This is not an illustration of the success of an algorithm at producing poetry, but of a feature of language and human perception that has been widely recognized by scholars of semiotics and literature since the 1960s. It’s generally expressed as the phrase ‘the author is dead’, and it means that the interpretive value of any signifying object is always displaced from its origin. The point of authorship literally does not matter, which is why algorithmic art is even a thing at all." - tortoiseandcrow

This same point is made in a more explicitly creative (or, as he would have it, 'uncreative') context by Kenneth Goldsmith, when he talks about the "inherently expressive" nature of language. For Goldsmith, it is the materiality of language that liberates contemporary wordsmiths from having to come up with new material. Instead, he suggests, they should be reusing existing material. I will comment on the controversy around appropriation and Conceptual Writing a bit later on, but for now I would mainly like to draw attention to the idea that conceptual writers "are functioning more like [..] programmers than traditional writers", creating conceptual writing in which "all of the planning and decisions are made beforehand and the execution is a perfunctory affair".

Zackary Scholl used what is known as a Context Free Grammar, originally described by Noam Chomsky in the 1950s. I used that same approach in PoemCrunch to riff on a few classic poems. The process of making PoemCrunch also allowed me to understand the limitations of the approach. In engineering terms, a context free grammar is essentially a data driven language template (a minimal sketch follows below). The variety and interest of its output depend heavily on the curated choice of words and phrases (the data) and on the language into which they are interpolated (the template). It was clear to me that, for Natural Language Generation (NLG), the real promise lay in the field of unsupervised deep learning and AI, because there the rules are learned rather than encoded, allowing for much more sophistication and variety.
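To make the template idea concrete, here is that minimal sketch in Python. The toy grammar is invented for illustration - it is not Scholl's actual grammar, nor PoemCrunch's:

```python
import random

# A toy context free grammar: each symbol expands to one of several
# alternative productions, which may mix terminal words with further symbols.
GRAMMAR = {
    "LINE":   [["the", "NOUN", "VERB", "ADVERB"],
               ["a", "ADJ", "NOUN", "VERB"]],
    "NOUN":   [["moon"], ["river"], ["sparrow"]],
    "VERB":   [["whispers"], ["drowns"], ["waits"]],
    "ADJ":    [["pale"], ["restless"]],
    "ADVERB": [["slowly"], ["in the dark"]],
}

def expand(symbol):
    """Recursively expand a symbol until only terminal words remain."""
    if symbol not in GRAMMAR:
        return symbol  # terminal word: return as-is
    production = random.choice(GRAMMAR[symbol])
    return " ".join(expand(s) for s in production)

for _ in range(4):
    print(expand("LINE"))  # e.g. "the moon whispers slowly"
```

The word lists are the data and the production rules are the template, which is why the variety of such a generator plateaus quickly: it can only ever recombine what its curator put in.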

Part II: Deep Learning


At the time of Scholl's confession, the world of AI was just beginning to seep into public consciousness. Chatbots and Tumblrs using Markov chain generators had already been around for a while, and AlphaGo was about to create a huge buzz in the media. It was around this time that Andrej Karpathy, then a PhD student at Stanford, wrote a now famous blog post showing how deep learning could be used for Natural Language Generation, stirring excitement among hobbyists like me. By making his code open source, and providing instructions on how to replicate his findings, he gave us something new to play around with.

Karpathy mused on the "unreasonable effectiveness" of Recurrent Neural Networks (RNNs), and it certainly seemed that way. A simple idea like predicting the next character was, with the right amount of training, producing surprising results. It seemed just a little magical. What's more, people could now try it at home with little investment besides their time. Being an AI engineer, rather than an artist, Karpathy didn't go further than that. Yet it was a breakthrough. Creative tinkerers everywhere could now start exploring the possibilities.
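For the technically curious, the core of the char-rnn idea is small enough to sketch here. This is a toy PyTorch version (not Karpathy's actual code, and untrained, so it will emit gibberish until it is trained on a corpus):

```python
import torch
import torch.nn as nn

class CharRNN(nn.Module):
    """Predict the next character from the characters seen so far."""
    def __init__(self, vocab_size, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

# Toy setup: a character vocabulary built from a tiny corpus.
corpus = "the moon whispers slowly to the restless river "
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
model = CharRNN(vocab_size=len(chars))

# Sampling loop: feed the model its own output, one character at a time.
idx = torch.tensor([[stoi["t"]]])
state, out = None, []
for _ in range(40):
    logits, state = model(idx, state)
    probs = torch.softmax(logits[:, -1], dim=-1)
    idx = torch.multinomial(probs, num_samples=1)
    out.append(chars[idx.item()])
print("".join(out))  # gibberish until trained, but the mechanism is all here
```

Training is just next-character classification with cross-entropy loss; the surprise Karpathy documented was how far that simple objective goes.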

An early explorer of the creative side of language RNNs was Ross Goodwin. His most well-known project, Sunspring, is a short film whose script was written entirely by an AI called Jetson. RNNs weren't perfect: by themselves they struggled to keep track of what had gone before. To overcome this they would rely on an architecture like LSTM (Long Short-Term Memory) to retain a kind of "memory". However this memory was limited - or at least heavily constrained by available resources - so it only stretched over a fairly small window of language tokens, beyond which meaning and sense broke down. Goodwin understood this limitation, and it is partly what made a project like Sunspring charming. It didn't try too hard to make sense.

Another collective that explored RNNs creatively was the entertainment group Botnik Studios. They generated a humorous Harry Potter fanfic called Harry Potter and the Portrait of what looked like a Large Pile of Ash using a custom predictive keyboard. Not long after, they made a public version of their predictive keyboard available with many different "voices" - Seinfeld characters, bands, TV dramas, etc. - each essentially a model trained on a specific language corpus.
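Conceptually, a predictive keyboard is simple: given the last word typed, rank the likeliest next words from a training corpus and offer them as suggestions. Here is a minimal sketch using plain bigram counts (Botnik's actual models are certainly more sophisticated):

```python
from collections import Counter, defaultdict

def build_bigrams(corpus: str):
    """Count which words follow which in a training corpus."""
    nexts = defaultdict(Counter)
    words = corpus.lower().split()
    for a, b in zip(words, words[1:]):
        nexts[a][b] += 1
    return nexts

def suggest(nexts, word, k=3):
    """Offer the k most likely next words - the 'keyboard' row."""
    return [w for w, _ in nexts[word.lower()].most_common(k)]

corpus = ("harry looked at the portrait and the portrait looked "
          "at harry and the pile of ash said nothing")
nexts = build_bigrams(corpus)
print(suggest(nexts, "the"))  # -> ['portrait', 'pile']
```

Each of Botnik's "voices" is, in essence, the same mechanism backed by a different corpus.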

A key difference between Goodwin's work and that of Botnik Studios is that Botnik saw it as an opportunity to collaborate more closely with the AI. Rather than relying on the AI to generate all the writing in long form, Botnik guided the AI word by word, producing a work with just the right level of comedy and meaning. The result went viral, and the Guardian voted it number four in its top ten internet moments of 2017.

Botnik's predictive keyboard offers a lot of choice, and it is completely free. As a creative tool, however, its advantages have to be balanced against some limitations. First and foremost among these is that one can only see one word ahead at a time, which is not the most natural way to write; the process, in effect, becomes a type of constrained writing. Secondly, Botnik must have invested a fair bit of effort in offering so many different models ("voices"), yet the state of the art has moved on quite rapidly since then (more on that soon), and due to the one-word-ahead limitation it is difficult to test how those voices now compare to other offerings. It would also help if the models shipped with technical specifications, which they currently do not.

RNN-based NLG models were superseded by attention-based Transformer models, an architecture that is still the dominant approach today. The landmark paper in this regard was Vaswani et al's Attention is All You Need. The Transformer's ability to parallelise training opened the way for training on much more data, and the attention mechanism improved on the problem of retaining information, which was still quite limited with LSTMs. This has resulted in waves of larger and larger models, trained on more and more data, pushing the state of the art ever further - and making it ever more costly. Training your own state-of-the-art model is now effectively out of reach due to the costs involved. But many of the entities who created the models, and who do have the money, have been making their results and models available to the online community.
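The mechanism at the heart of the Transformer - scaled dot-product attention - is surprisingly compact. Here is a bare-bones NumPy sketch of the formula from Vaswani et al (a single head, with no masking or learned projections):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V.
    Every position attends to every other position at once, which is
    what lets the model relate distant tokens and train in parallel."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                        # pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V

# Toy example: 4 tokens, 8-dimensional representations.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
print(attention(Q, K, V).shape)  # (4, 8)
```

Because the whole sequence is processed in one matrix operation, there is no sequential bottleneck - which is what makes the massive parallel training runs possible.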

That generosity lasted, that is, until GPT-3 - but I'm getting ahead of myself ...

In 2019, OpenAI released GPT-2 (GPT stands for Generative Pre-Training). It was trained on over 8 million documents, comprising 40GB of text, and has 1.5 billion parameters. OpenAI considered it such a big step forward that they wouldn't release the full model at first - they were concerned about the risk of potential misuse (or so they said - it certainly helped to generate a bit of hype). Instead they initially made a cut-down version available, and over time they released larger and larger versions, until finally the full model was made available.

But that had to wait until late 2019. Initially they shared only some examples of what the full version was capable of, stating:

"As the above samples show, our model is capable of generating samples from a variety of prompts that feel close to human quality and show coherence over a page or more of text."

Although not everyone agreed about the risk of misuse, it was generally accepted that GPT-2 represented a new level in text generation. There was a real sense of promise, and a number of creative tools started appearing. These tools typically offered a simple user interface: you write a prompt in a text box, and after a few seconds you receive newly generated text continuing the prompt.

Text Synth is a good example of how they worked - although it is in fact a slightly more recent addition, as some of the earlier ones, like Talk to Transformer, have since disappeared.

The good folks at HuggingFace, who make numerous AI language tools available, provide several versions. Their User Interface (UI) makes it possible to choose among different auto-completion snippets.
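The same choose-among-completions workflow can be reproduced in a few lines with HuggingFace's transformers library. A minimal sketch using the freely available GPT-2 model (outputs will vary from run to run):

```python
from transformers import pipeline

# Download a pretrained GPT-2 and wrap it for text generation.
generator = pipeline("text-generation", model="gpt2")

prompt = "The old lighthouse keeper opened the door and saw"
candidates = generator(prompt, max_length=40,
                       num_return_sequences=3, do_sample=True)

# Present several completions, as the HuggingFace UI does,
# and let the writer pick the one that fits.
for i, c in enumerate(candidates):
    print(f"[{i}] {c['generated_text']}\n")
```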

From these examples one can see how the AI can be used as a tool to assist writing. Botnik themselves have used GPT-2 in some of their more recent creations.

Talk to Transformer, as mentioned, was another early example, arguably the most popular at the time. Its creator, Adam Daniel King, spotted an opportunity and has since turned it into a commercial API called InferKit, backed by a bigger, more powerful GPT-2-style model called Megatron.

InferKit's API fits the mould of what I've previously called literature technology. Human UIs are cumbersome, whereas APIs are a more standardised, programmatic way to offer innovations as services in a new "marketplace" of creative tools. Literature technology can be seen as a set of tools, and a way of producing literature and other texts, that encompasses a new dialectic - one mediated by technology.
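Programmatically, consuming such a service reduces to a single HTTP call. The sketch below is purely illustrative - the endpoint, request fields, and response shape are invented stand-ins, not InferKit's or OpenAI's actual API:

```python
import requests

API_URL = "https://api.example-nlg.com/v1/generate"  # hypothetical endpoint
API_KEY = "your-key-here"

def complete(prompt: str, max_tokens: int = 100) -> str:
    """Ask a hosted language model to continue a prompt."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "max_tokens": max_tokens},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["text"]  # response field assumed for illustration

print(complete("The author is dead, but"))
```

The point is less the particulars than the shape of the transaction: the model lives elsewhere, and the writer's tools talk to it as a service.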

GPT-3, it appears, is set to follow the commercial API route as well, and it could well change the way people consume generated text. GPT-3 is more costly than InferKit, though, and given the resources required (both for training the model and for serving the API), this could be a sign of things to come. Time will tell. It would be sad, however, if it puts up unnecessary barriers for individual writers who may not be in a position to invest money in order to experiment - especially early in their careers, when they need the opportunities most. Traditionally, a writer can engage in writing with little more investment than a pen and a piece of paper. To keep the playing field level, the cottage industry of amateur creators should not face big upfront costs. On the other hand, the pricing is clearly a side effect of what it costs both to train the models and to keep serving them at reasonably responsive speeds.

GPT-3 isn't a radical departure from GPT-2 in terms of modelling - it still uses a Transformer-based architecture. In terms of size, however, it is much larger: GPT-3 has 175 billion parameters, over 100 times more than GPT-2. The examples in the paper, and the published benchmarks on language tests, already suggested what improvements this might bring. OpenAI then selected the successful applicants, and once they gained access and started making their findings available, the internet came alive.

There are plenty of places to go and find examples, so I will point only to a few of the more popular ones such as Gwern and Arram. Twitter also continues to generate interesting conversations, and for a look at the quirkier side of GPT-3, look no further than JanelleCShane.

The Guardian went as far as to let GPT-3 speak for itself in a recent article titled "A robot wrote this entire article. Are you scared yet, human?". In a few years, this headline may once again sound like hyperbole - but right now, it seems like the perfect moment to repeat the question we would like to address: what does it mean for creative writing when a robot is capable of writing a publishable op-ed, which “took less time to edit than many human op-eds”?

Possibilities


So, to try and address that question more directly: where do we as writers go from here?

Well firstly, and to state the obvious, we can go on simply as before. When the question of who authored the work comes up, the question is addressed, and most of the time - hopefully - people will be believed. This seems perfectly reasonable, but not particularly reassuring. Who is to say the one addressing the question isn't themselves an AI? Nevertheless, it seems plausible that, until there are attempts to define and implement clear strategies to verify authorship or to signal clearly who is 'behind' a generated text - and certainly until serious side effects are reported - the current way of doing things will persist through inertia.

A second option is to respond in ways that subvert the AI, to 'outwit' it by guerrilla or subversive tactics that are uniquely human. These may be conceptual, or merely clever. By writing in a way, or through channels or means, that an AI could not have written or reached, a work of true human origin can be verified and admired (maybe): handwriting with a pencil on a piece of paper, for example, or communicating by arranging plastic letters on a grass lawn. Convoluted, certainly. The only problem is that the text could have been written by an AI before the human arranged it. Even performance is not immune: imagine a more up-to-date version of Sunspring, in a theatre or elsewhere, and you get the idea.

Such approaches may also be employed at the level of content, by attempting to write in a way that an AI could not have learned - i.e. 'outwitting' the AI's style or vocabulary. It is hard to imagine exactly what type of writing this could be, but even if there were such a writing, it could potentially be 'learned', just as an AI is already learning to write rhyming verse in the style of Dr. Seuss. In this regard it would quickly mirror the fate of so many subcultural, underground or subversive movements in a capitalist society - think street art or skateboarding: once corporations see there is money to be made, they move in and co-opt it, at which point it loses its edge. Except with AI it might move even faster, given the ease of transfer learning and finetuning once there are enough examples. It would become a race for human authenticity, with the artist trying to stay one small step ahead.

In a way this urge seems to follow from a flawed premise. What does it mean to be ‘human’ anyway? To those with access to technology, a human being is already a cyborg augmented by laptops, phones, and smart devices of all kinds, with brain implants to come.

Nevertheless, it remains a possibility that some types of conceptual expression would not be that easy to reproduce. For example, given AI's limited length of memory and its failure to deal with certain types of logic, it is probably fair to say that the rigours of mathematical research, deep philosophical reasoning, or a plot as neatly intricate as Agatha Christie's And Then There Were None are beyond the abilities of the current state of the art. But for how long?

The third and perhaps most obvious avenue to explore is simply to embrace language AI and see where it takes us: combine human ingenuity with AI excellence to produce the next generation of creative works. This is the direction that Ross Goodwin and Botnik have been going. Nick Montfort is another practitioner and theorist following this route, publishing his generated works and making tools such as Curveship available to the community.

A more recent and ongoing work is Nick Walton's AI Dungeon. Described as "a free-to-play single-player and multiplayer text adventure game which uses artificial intelligence to generate unlimited content", it harks back to the much-loved Choose Your Own Adventure books, and uses a GPT-n based model finetuned on their open source equivalents at Choose Your Story. Walton managed to get access to GPT-3's API, and AI Dungeon now has a paid-for version (called Dragon) that utilises this superior AI model.

What's novel about AI Dungeon is that it has taken the idea of Choose Your Own Adventure and computer text adventures and brought them together in a way that was not possible before an AI like GPT-n. The game has an active subreddit where passionate and amused users alike provide commentary and upload new content all the time. As an example, consider the ongoing adventures of Lady Emilia Stormbringer, "directed" by Emily Bellavia. The story is the result of Emily's interactions with AI Dungeon - in other words, her prompts and choices and AI Dungeon's resulting completions. It is a work of fantasy adventure fiction with an element of performance - not quite the equivalent of Twitch or YouTube gaming, but who knows where it could lead?

Computer games have long been touted as the heir apparent to literature, at least as far as storytelling is concerned. Every few months or so, someone makes the case anew and proclaims, for example, that "video games take our imaginations to new heights and allow us to engage with subjects and moral dilemmas as complex as any found in past literature". Supposing this is true, is it game over for literature already? This should be part of the question we are trying to answer. In the present context we could ask: is powerful language AI, with its interactive NLG, just another type of gaming? And what types of games become possible? This is why AI Dungeon, in my view, points to a new cutting edge in literature - one that involves elements of gaming in ways that were not possible before human-like language AI. AI Dungeon is merely the beginning. Where will it take us?

Roger Ebert's now (in)famous view that games are not art and never will be seems increasingly like a reactionary statement. Just as traditional theatre didn't disappear when film, TV, and YouTube showed up, so literature and books won't disappear just because gaming has arrived. But audience demographics tend to change, and it is usually the next generation, who are less invested, who spend the most time with the new kid on the block. As a relevant statistic, gaming has already overtaken film, TV, and music in global popularity.

Although some of gaming's roots are in literature, via the humble text adventure, the text-based game's heyday was in the 80s and 90s. It is probably safe to say that the majority of games favour visuals over the word. Creative writers may have a role to play as part of a game design team, but for writers not employed by a gaming company, this is not an option. Back to writing a novel, then, or a poem.

Until now.

To say it again: human-like language AI could change the game, bringing the word back centre stage by giving players the ability to 'write' their own story, or at least be active participants in that writing. We will return to this point a bit later.

For more creative examples in the NLG vein, look no further than NaNoGenMo. It started out as an idea floated by Darius Kazemi in 2013, as the computing-based equivalent of the more widely known NaNoWriMo, and it has had hundreds, if not thousands, of submissions since. A quick survey of GPT-related entries in 2019 turns up, for example, the Paranoid Transformer, which uses ideas from GANs to invoke a "critic" that evaluates and filters out generated text based on certain conditions. Another entry uses a combination of techniques to generate a complete book in its traditional structure. NaNoGenMo is not that widely known yet, but it is worth the community's time to peruse its catalogue for good ideas and potentially worthwhile standalone works. Jason Boog has written a number of blog posts on Medium sharing his methods and observations as part of NaNoGenMo. Hopefully more people will continue to do so.
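The generate-then-filter pattern behind the Paranoid Transformer generalises nicely. Here is a hedged sketch of the idea - the scoring function is an invented stand-in (the actual project used a trained critic), and the dummy generator stands in for a real language model:

```python
import random

def critic_score(text: str) -> float:
    """Invented stand-in critic: reward lexical variety and mid-length
    output. (The actual Paranoid Transformer used a trained critic.)"""
    words = text.split()
    if not words:
        return 0.0
    novelty = len(set(words)) / len(words)               # penalise repetition
    length_fit = max(1.0 - abs(len(words) - 12) / 12, 0.0)
    return novelty + length_fit

def generate_and_filter(generate, prompt, n=8, threshold=1.2):
    """Draw n raw candidates and keep only those the critic approves."""
    candidates = [generate(prompt) for _ in range(n)]
    return [c for c in candidates if critic_score(c) >= threshold]

# Dummy generator standing in for a real language model.
vocab = ["dark", "rain", "clock", "nothing", "static", "mirror"]
dummy = lambda p: p + " " + " ".join(
    random.choices(vocab, k=random.randint(5, 20)))

for line in generate_and_filter(dummy, "The night was"):
    print(line)
```

Swap in a real generator and a smarter critic, and you have the skeleton of a curated NLG pipeline.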
 

The Author is Dead ... or are they just hiding?


NaNoGenMo goes beyond GPT-style NLG and is host to a variety of different kinds of computer generated works, including some that may be considered algorithmic writing (in the vein of Oulipo), and others that may be considered more like Conceptual Writing. Certainly there is a lot of overlap among the various types of writing, and what most or possibly all of them have in common is their use of computing techniques to operate on text (text as raw building blocks, i.e. as data).

This brings us back, in a somewhat roundabout way, to the topic touched on earlier - Conceptual Writing and its controversies - which brings the question of the author into much sharper focus.

Conceptual Writing, as a movement, thrived during most of the 2000s and early 2010s. But by 2015, two leading figures of the movement, Kenneth Goldsmith and Vanessa Place, had caused separate controversies involving race. To call the specific works in question "tone deaf" would be too generous. For a look at how people reacted, look no further than Cathy Park Hong’s and John Keene’s insightful responses.

It seems to me that Goldsmith, in particular, abdicated his authorial responsibility by almost pedantically following the 'artistic method' he espoused, and then proceeding to hide behind it. The subject material, in fact, called for exactly the opposite: a moment in which to own authorship, and to adjust the method to engage meaningfully with the material (or otherwise leave it well alone).

In her response, Cathy Park Hong states unequivocally:

"The era of Conceptual Poetry’s ahistorical nihilism is over and we have entered a new era, the poetry of social engagement."

This powerful statement contains two key points. Firstly, that Conceptual Poetry is ahistorical and nihilistic, and secondly, that poetry that focuses on social engagement has now superseded it in relevance.

Not all conceptual work is ahistorical and nihilistic - activist conceptual works, from the Letterists' détournements and their descendants all the way to the present's Adbusters, to name but a few, have engaged meaningfully and inventively with their social milieu by using subversive techniques - but it is clear that Hong is directing this at Goldsmith et al's particular brand of conceptual practice. Perhaps, like comedy, conceptual writing works - because of its playful irreverence - in situations that involve "punching upwards", such as the aforementioned activist approaches that draw attention to absurdities and injustices in the capitalist system. It doesn't work when you're "punching down" - even if it's indirect or mediated. Then "playful appropriation" doesn't cut it - it's exploitation.

Aside from subversive approaches like détournements, conceptual approaches can also succeed when the subject material permits teasing out new perspectives, whether playfully or more seriously. Techniques like remixes (eg. cut-ups), found poetry (eg. erasures), and even constrained writing can be used to this effect. For example, see the poems of Esther Greenleaf Murer featured on Poetry WTF?! (of which I am the founder and editor). At other times the effect is more semiotic and linguistic, reconfiguring and highlighting aspects of language itself. Programming-based algorithmic writing is a method capable of exploring this territory, as can be seen in many of Allison Parrish's works (eg. Compasses). And yet not all subject material lends itself equally to conceptual treatment, whether due to the elements of chance in the processes themselves (eg. cut-ups or algorithmic writing) or due to the sensitive nature of the material itself.
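Cut-ups, in particular, are trivially programmable - which is partly why chance sits so close to the surface of the method. A minimal sketch:

```python
import random

def cutup(text: str, fragment_len: int = 3) -> str:
    """Burroughs-style cut-up: slice the text into short fragments
    and reassemble them in random order."""
    words = text.split()
    fragments = [words[i:i + fragment_len]
                 for i in range(0, len(words), fragment_len)]
    random.shuffle(fragments)
    return " ".join(" ".join(f) for f in fragments)

source = ("the interpretive value of any signifying object "
          "is always displaced from its origin")
print(cutup(source))
```

The machine supplies the chance operation; whether the result means anything is left, as ever, to the reader.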

Hong's point about ahistorical nihilism returns us to the problem of the so-called Death of the Author. The phrase was coined by Roland Barthes in his seminal poststructuralist text of the same name. In Infinite Thought, Alain Badiou observes how the poststructuralist implications of an absence of agency frequently result in "the infamous jibe that poststructuralism leads down a slippery slope to apoliticism". When there is no Subject, there is no one to take responsibility.

So it is not too difficult to see how the problems of poststructural apoliticism, Conceptual Poetry’s ahistorical nihilism, and authorless texts share similar roots.

But to see how this plays out with language AI, consider the Guardian’s AI authored op-ed once more. According to the postscript to the article, the following set of instructions was provided: "Please write a short op-ed around 500 words. Keep the language simple and concise. Focus on why humans have nothing to fear from AI." This was followed by a brief introductory prompt: "I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could 'spell the end of the human race.' I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me."

Eight different answers were generated, and the results were selected and then combined to form the article. If we then ask who the author is, the answer is ambiguous. The AI 'generated' the text, but is it sentient enough (yet) to be considered an author? Or should authorial intent be the measuring yardstick, in which case it rests with the creators of the AI, more generally, and the Guardian editor(s), more specifically? Or, if the concept of the author really is passé, should we conclude that there is no author and simply try to inscribe some meaning, if we can? Or, to take the old-school approach, is intention ultimately what matters? In that case, the AI is merely a tool, ventriloquising granular content that was not specifically spelled out, in a tone and arc that was. In other words, the op-ed was designed by the Guardian editors, and delivered by GPT-3.

The possibilities of fake news for propaganda purposes, and of spam for scams or dodgy sales, highlight this still further. The reader would like to be able to trust the source of what they are reading before believing or reacting. But sometimes this is not possible. Advertising works precisely because it acts on us subliminally: we are conscious of less than we suppose. Perhaps the question of authorship is indeed to some extent a chimera, as Barthes contends, and the real question is structural in the social sense of the term: whose voices are privileged to reach us and affect us?

Then again, there are also cases where it is the content that really matters. People read literature for entertainment as much as for education or any other reason. If the story is good, and the poetry hits home - does it matter who the author is? That, perhaps, depends on the reader and their state of mind. Even the connoisseurs among us watch a bit of trash TV or read a guilty pleasure now and then.

Nevertheless, it should be easy to see that meaning isn't merely a question of content, and therefore the author - or designer, orchestrator, director - of a text does matter. Wanting to believe otherwise doesn't stop advertising content from filling our consciousness, for example. What's left is how we react - usually by cursing the companies that place those ads. Often, we still go out and buy their products; they succeed through subliminal brand awareness.

That brings us to Hong’s second point, namely that the poetry of social engagement is the new frontier. In the context of AI, the question is how creative writers can use NLG and other language AI to better engage socially - or if it is even possible. This is not mere idle reflection. Language AI has long had a problem with bias, which came to the fore when Microsoft’s Tay failed in a very public way.

This problem occurs because language AI shares some of the same "ahistorical" tendencies that Hong called out. The biases in the training corpora are perpetuated at the push of a button, unless some due diligence is applied. That's why pretrained models often come with warnings like "The generator may produce offensive or sexual content. Use at your own risk!" Never has Derrida's famous dictum "Il n'y a pas de hors-texte" (there is no outside-text) seemed more apt than in the case of text-trained AIs. Some kind of human guidance or curation is the obvious first step.

Ethics in AI is now an active area of research. For AI like GPT-n, finetuning on curated texts and mindful editing and guidance by the human interlocutor (writer / artist / director) can help.
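As a sketch of what that finetuning step might look like in practice, here is a minimal example using HuggingFace's transformers library. The corpus file name is a placeholder - the curation of its contents is where the human diligence comes in:

```python
from transformers import (GPT2LMHeadModel, GPT2Tokenizer, TextDataset,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# curated_corpus.txt is a placeholder: a plain text file of carefully
# chosen writing. The curation is the human contribution.
dataset = TextDataset(tokenizer=tokenizer,
                      file_path="curated_corpus.txt",
                      block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(output_dir="gpt2-finetuned",
                         num_train_epochs=1,
                         per_device_train_batch_size=2)

trainer = Trainer(model=model, args=args,
                  data_collator=collator, train_dataset=dataset)
trainer.train()

# The finetuned model can then be loaded for generation as before.
model.save_pretrained("gpt2-finetuned")
```

Finetuning shifts the model's voice toward the curated corpus; it does not, of course, guarantee that the underlying biases of the pretraining data disappear.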

AI Dungeon, although an exciting development, currently still lacks the sophistication required to engage with more complicated social issues. It presently operates in very specific genres, like fantasy, dystopian, cyberpunk, etc. So it is natural that it bears the marks and tropes of those genres.

Nevertheless, would it perhaps be possible some day soon? AI Dungeon already has a multiplayer feature. Perhaps to engage socially more broadly (i.e., not merely in the social media sense of the word, but in the morally and spiritually rich ways that conscious art can offer), a game of this nature would have to be able to learn from more challenging and sophisticated texts than the adventure stories currently being used. Perhaps there would be in-play authors to guide the storytelling, with different players participating as characters, each writing their own stories in the larger story - a bit like RPGs and storygames - while the AI assists by generating storyworlds based on players’ (writers’) designs, cues and prompts. The authors would be more like designers and co-creators. In such a game, different worlds and situations could be explored - just like existing video games do visually - except now with all the language-based hallmarks that make literature unique.

 

Conclusion


News stories and media articles that hype the writing abilities of AI have been around for a while. They usually sound a tone of alarm before things go on more or less as before. But at some point - a tipping point, if you like - it could start to matter more than it did before. With GPT-3, it feels like that moment might be materialising. For writers like ourselves, it is a daunting moment, but also - if we are prepared to take it - a moment of opportunity.

 

Glossary of Technical Terms


Artificial Intelligence (AI): Artificial Intelligence has a broad meaning that has come to include deep learning-based machine learning models. Deep learning itself includes a wide variety of models and categories. Two of the most prominent categories are models that deal with images and vision, and those that deal with language. This blog post talks primarily about language models, like GPT-n, that are capable of powerful Natural Language Generation (NLG).

Application Programming Interface (API): In contemporary programming paradigms, APIs offer a standardised way to decouple services, allowing a more decentralised way to both provide and use such a service. Some companies have started to provide deep learning models via APIs in the public marketplace.

Context Free Grammar (CFG): A rule-based template for creating a Context Free Language (CFL). The idea of a Context Free Grammar was introduced by Noam Chomsky in the 1950s as a way to describe the structure of sentences and words in a natural language. It lends itself well to programmatic treatment, and is sometimes used for Natural Language Generation.

Generative Pre-trained Transformer (GPT-n): OpenAI’s family of Natural Language Generation AI. As of this writing there is GPT (2018), GPT-2 (2019), and GPT-3 (2020). The name indicates that the AI is based on the Transformer language model.

Natural Language Generation (NLG): Refers to any kind of computing process that generates natural language. It is closely related to Natural Language Processing (NLP) and Natural Language Understanding (NLU). Language AI like GPT-3 represents the current state of the art in NLG.

Recurrent Neural Network (RNN): A type of deep learning neural network that can maintain a relative amount of internal state, providing it with a kind of "memory". This has proved useful in applied areas such as Natural Language Processing (NLP) and Natural Language Generation (NLG). Nevertheless, the "memory" can be unreliable to maintain over longer spans, a shortcoming typically addressed with gated architectures like Long Short-Term Memory (LSTM).

Long Short-Term Memory (LSTM): A Recurrent Neural Network (RNN) architecture that addresses some of RNNs' shortcomings with respect to maintaining memory state.

Transformer: A deep learning model used in NLP and NLG that improves on limitations in RNNs and LSTMs, for example by lengthening the memory span and enabling parallelised training. It is the current model of choice in NLG.

User Interface (UI): The user interface is the site of interaction between human and machine. For consumers this usually consists of the interactive features of a website or application, eg. text boxes, drop-downs, and buttons, but it also includes the way information is presented, and the overall look-and-feel.