Monday, March 02, 2015

Why We Need Poetry Technology

Sometimes you’re doing something that is so new that it has no name yet.

Over the last few years, on and off, I’ve been experimenting with programmatic approaches to new ways of writing. An obvious starting point was cut-ups. Cut-ups are cool and can be useful as a writing aid. William Burroughs made them sing in striking disharmony. He also revoked the traditional author’s monopoly on textual narrative.

I wrote several variations on a primitive cut-up generator program. It was meant to automate the job that Burroughs achieved with paper and scissors. The outcome illustrates some of the possibilities (as well as limitations) of a basic application of the idea.
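To give a flavour of what such a program involves, here is a minimal sketch in Python. It is not the original generator: the function name `cut_up`, the fixed fragment length and the `seed` parameter are all assumptions made for illustration. It simply slices a text into short runs of words and shuffles them, much as scissors would.

```python
import random


def cut_up(text, fragment_len=4, seed=None):
    """Slice `text` into fragments of `fragment_len` words and shuffle them.

    Hypothetical helper for illustration only; a real cut-up tool might cut
    at random lengths or interleave several source texts.
    """
    words = text.split()
    # Cut the word list into consecutive fragments (the last may be shorter).
    fragments = [
        words[i:i + fragment_len] for i in range(0, len(words), fragment_len)
    ]
    rng = random.Random(seed)  # a seed makes the shuffle repeatable
    rng.shuffle(fragments)
    return " ".join(" ".join(fragment) for fragment in fragments)


if __name__ == "__main__":
    source = (
        "It was meant to automate the job that Burroughs "
        "achieved with paper and scissors"
    )
    print(cut_up(source, fragment_len=3))
```

Every word of the source survives; only the order changes, which is both the charm and the limitation of the basic idea.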

This was 2007. A few isolated voices aside, it was hard to find anyone who was experimenting in this area. There was hype about e-books and Amazon's soon-to-be-released e-book reader, but if literature was about to experience a revolution, I wasn't in on the secret.

And so the first Kindle arrived and all the talk was of e-books, as if that was a milestone in literary innovation. To be sure, it wasn’t even close. The first e-books were simply content transposed into portable digital formats, and the Kindle extended that to the physical appliance. In other words, what you held in your hand and how you paged it was new. The reading content remained just the same.

The years passed and it kept nagging at me that in the age of the internet, literature was missing out on a massive opportunity. What opportunity exactly? The opportunity to use information technologies and the internet for the purposes of literary creativity. To bend it to our will. What the first Great Big Work of the internet age was going to look like exactly I wasn’t too sure, but one thing was crystal clear: it hadn’t yet been written.

Fast forward seven years and the major literary awards are still going to traditional forms of literature. But on the fringes and beneath the surface, the beginnings of a new way of writing literature are brewing. When I first started reading about Andy Warhol's literary works, about Flarf poetry, text remixing and the use of texts as material, the scales fell from my eyes.

Kenneth Goldsmith chronicles the background and rise of this subculture in his excellent book Uncreative Writing. This loosely distributed, internet-engaged community has been producing interesting and provocative literature for the last decade or so. Its influences, too, are discussed in stimulating detail and reach back via Andy Warhol, Oulipo, the Situationists and Walter Benjamin all the way to Gertrude Stein.

However, as Goldsmith himself observes, he is but a bridge between the old world of literature and the new world of an as yet undefined, anonymously authored Uncreative era:

“The future really belongs to anonymous writers writing for anonymous readers: people who are writing programmes for machines to read, for other machines to read; I think this whole thing is going to be pushed much further. I’m just a bridge between the old and the new.”

Goldsmith’s work has provided me with a guiding principle while I explored the implications of uncreative writing. It inspired my venture into experimental poetry curation, an online zine called Poetry WTF?!. The work that is being published at Poetry WTF?! operates on the principle that existing texts are material to be used for new poetry. The resulting artefacts are often hand-crafted to reveal stimulating, ironic or conceptual new poems - and sometimes a creation exhibits all of these qualities at once.

By decoupling human agency from the immediacy of expression and instead reintroducing it at the stage of creative composition we are preparing the stage for a new type of writing. Nevertheless, when we do so we are still inhabiting the world that the Oulipoets from the 1960s would recognise. It is a phase in a literary evolution that has not been taken to its limits, that hasn’t transformed into a radically new literary being just yet. We could say that the ontology of literature is still authored, analog, and un-automated.

As Goldsmith speculates, the next stage of this evolution will be evidenced by greater anonymity of authorship as well as readership. In fact this is already happening. This anonymous dialectic between creator and consumer is being played out, at the very moment that I’m authoring this, by a variety of Twitter bots. Pentametron, the brainchild of NY-based conceptual artist Ranjit Bhatnagar, is one of the best known poetry Twitter bots. It has been around since 2012.

Pentametron employs an automated program (the bot) that searches Twitter feeds for tweets written in iambic pentameter, matches two that rhyme, and writes them out as Pentametron tweets. Pretty simple, but the results are remarkably readable and rather moreish. They’re also a vindication of Goldsmith’s controversial observation that language transforms rather than loses its expressive capacity when viewed as material - which is precisely what Twitter bots do par excellence. The success of the work now depends on the repeatable realisation of a concept rather than on novelty of expression.

This technological mediation is clearly a step in the right direction. It is easy to see that Pentametron’s automated method of operation and Kindle’s mere transplant of content from a physical book to an e-book are poles apart. In the former, technology is playing a significant role in the creative process. The medium itself is now coming into play both in creation as well as consumption.

While Twitter hosts some of the most famous literary bots, it isn’t the only platform where automated, anonymous literature can be read. Tumblr has its own share of autoposted literary mash-ups. A typical case in point is King James Programming, which employs an algorithm known as Markov chains to combine phrases from the King James Bible with a couple of programming guides. The results are generally seamless and frequently funny, as this example illustrates:

“And since programming languages are largely written in English, who would suspect a language to come from Japan? And yet, here is this great and wide sea, wherein are things creeping innumerable, both small and great”

The snowball poem generator uses Markov chains to create a totally different type of poem called a snowball (a.k.a. a chaterism). It is a type of constrained writing (because it is based on a set of rules) and concrete poetry (since its typography is important). This particular snowball generator even got a mention on Boing Boing.

The use of Markov chains has become a favoured approach in automated poetry and literature generation. Markov chains are used so widely now, from genetics to physics, that few people are aware that the Russian mathematician Andrei Markov in fact developed his now-famous concept by studying consonant and vowel patterns in poetry. Poetry generation and Markov chains go together like strawberries and cream.
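For readers who want to see the mechanics, here is a minimal word-level sketch in Python. It is not the code behind King James Programming or the snowball generator; the function names `build_chain` and `generate` and the tiny corpus are assumptions for illustration. The idea is to record which words follow each word in a source text, then walk those records at random.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words seen directly after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=12, seed=None):
    """Walk the chain from a random starting state for up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a seed gives repeatable runs
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by anything
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


if __name__ == "__main__":
    # Illustrative corpus only; mixing two very different sources is what
    # produces the mash-up effect.
    corpus = "the sea is great and wide and the code is great and small"
    print(generate(build_chain(corpus), length=10))
```

Raising `order` makes the output hew closer to the source; lowering it makes the text stranger. Much of the craft lies in choosing the corpus.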

Yet, as can be expected in such a young and burgeoning field, Markov chains are not the only game in town. Various other approaches to generating literature have been attempted. Just recently it came to light that Zackary Scholl submitted several poems to a literary journal back in 2011, one of which was subsequently accepted and published. The twist in the tale is that the poem was not written by him directly, but generated by a program he developed. Scholl's program employs a context-free grammar, a formalism introduced by Noam Chomsky, expressed in a notation called Backus-Naur Form.
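As a rough illustration of the grammar-based approach, here is a toy sketch in Python. It is not Scholl's code: the grammar, the vocabulary and the names `GRAMMAR`, `expand` and `generate_poem` are all invented for this example. Each rule says what a symbol may be replaced with, and the generator keeps replacing symbols until only words are left.

```python
import random

# A toy grammar (an assumption for illustration). In Backus-Naur Form:
#   <poem>      ::= <line> "\n" <line>
#   <line>      ::= "the" <adjective> <noun> <verb>
#   <adjective> ::= "silent" | "electric" | "borrowed"
#   <noun>      ::= "sea" | "machine" | "sonnet"
#   <verb>      ::= "sleeps" | "sings" | "waits"
GRAMMAR = {
    "<poem>": [["<line>", "\n", "<line>"]],
    "<line>": [["the", "<adjective>", "<noun>", "<verb>"]],
    "<adjective>": [["silent"], ["electric"], ["borrowed"]],
    "<noun>": [["sea"], ["machine"], ["sonnet"]],
    "<verb>": [["sleeps"], ["sings"], ["waits"]],
}


def expand(symbol, rng):
    """Recursively replace a symbol with one of its productions."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: an actual word (or the line break)
    production = rng.choice(GRAMMAR[symbol])
    tokens = []
    for part in production:
        tokens.extend(expand(part, rng))
    return tokens


def generate_poem(seed=None):
    """Expand <poem> into text, tidying the spaces around line breaks."""
    rng = random.Random(seed)
    return " ".join(expand("<poem>", rng)).replace(" \n ", "\n")


if __name__ == "__main__":
    print(generate_poem())
```

Unlike a Markov chain, the grammar guarantees that every line is well-formed; the price is that someone has to write the rules.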

Scholl has made his code available on GitHub, and the poetry generator can be seen in action on his website, where you can generate new poems at the click of a button. Some are pretty good, too.

This is definitely a trend, and the methods will only get more complex. What machine learning can do for Watson of Jeopardy fame, it can surely do for poetry and literature in general. But who will take up the challenge?

This is part of the question that has been bouncing around my head during the past year. I looked in vain around the internet for evidence that literature or poetry is evolving along the lines of, say, finance or marketing, which both enjoy tremendous technological innovation to create more intelligent platforms and, of course, generate more money. Surprise surprise, I couldn’t find even a single website dedicated to literary texts that made its content available via an API. How on earth are we going to get literature into the information age if Shakespeare is still stuck in a book (e-books included)?!

That’s when I decided to create Poetry DB, the world’s first poetry database that has an easy-to-use API ready and available for both human and automated machine consumption. As of this writing Poetry DB contains a selection of poetry by most of the well known pre-20th century poets in the English language (from Chaucer to Dickinson and beyond), as well as the complete works of a subset of those (such as Shelley, Keats, Clare, Byron and Blake).

Yet whenever dinner party talk turned to my hobbies, I got the same slightly anxious reaction about my hopes for programmatically generated poetry. I would talk about APIs and the ability to grab lines from different poets at Poetry DB and pass them through an algorithm that splices and dices and produces something both modern and ancient and beautiful. Then I'd hear a response along the lines of “but that’s not really poetry … (!!!)” or “but that’s just …. wrong”. The fact that I couldn’t point to any concrete example of Greatness in this brave new world didn't exactly help my cause.
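For the curious, the splicing I would describe at those dinner parties can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes the poems have already been fetched and parsed into records, the field names `author`, `title` and `lines` are assumed rather than taken from any documentation, and the function name `splice` is hypothetical.

```python
import random


def splice(poems, n_lines=4, seed=None):
    """Build a new poem by drawing `n_lines` lines at random from `poems`.

    `poems` is a list of records with a "lines" list. The record shape is
    an assumption made for this sketch.
    """
    rng = random.Random(seed)
    result = []
    for _ in range(n_lines):
        poem = rng.choice(poems)  # pick a source poem
        result.append(rng.choice(poem["lines"]))  # then one of its lines
    return result


if __name__ == "__main__":
    # Two short public-domain excerpts standing in for fetched records.
    poems = [
        {
            "author": "William Blake",
            "title": "The Tyger",
            "lines": ["Tyger Tyger, burning bright,",
                      "In the forests of the night;"],
        },
        {
            "author": "John Keats",
            "title": "Endymion",
            "lines": ["A thing of beauty is a joy for ever:",
                      "Its loveliness increases; it will never"],
        },
    ]
    print("\n".join(splice(poems, n_lines=4)))
```

Whether the result is poetry is, of course, exactly what the dinner party argument was about.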

This leads me back to the start of this discussion. Sometimes you’re doing something that is so new that it has no name yet. It finally dawned on me that the activity I'm engaged in is not simply creating poetry. I'm not just writing poetry. I am also trying to define the process and tools that are required for its new form. In short I'm entering a radically new space, helping to midwife a new type of literary ontology.

It is the literary equivalent of music pioneers like The Beatles (not that I'm comparing myself with them of course) playing with tape loops, creating noise that didn’t always sound like music to anyone else - maybe not even to themselves. Yet today their innovations are accepted as groundbreaking music. Today we also have sophisticated music production technologies with which to create and control sound and music. In other words, what we are doing with data-driven and algorithmic poetry is perhaps best described as poetry made with poetry technology, via the application of poetry science. The end goal is still "poetry", but it's a new kind of poetry, in a new medium, and for a new type of audience.

What do we mean by poetry science and poetry technology? Does this playful activity really warrant such formal terms? I think it does, because I think the process is being misunderstood as just a different type of traditional poetry, and development of the field is languishing as a result.

From Homer to Elena Ferrante, from Aristotle through to the present day, literature and literary appraisal are bound in a dialectic that permeates culture and occasionally beyond, even into the very fabric of politics and society. A body of knowledge has evolved that has theoretical as well as practical implications. This knowledge includes a more or less formal understanding of poetry (metre, rhythm, rhyme in traditional forms), drama, prose and various other forms of literature. It also concerns detailed and analytical appraisals of which works exemplify good literature and why, ranging from close readings to serious, serious literary criticism. This body of knowledge is enormously rich.

Whenever a writer attempts to innovate, he or she is applying part of this inherited knowledge in new contexts or to new purposes. The outcome may be more or less successful, but part of this learning process is what we may consider the science of literature or poetry as the case may be. In other words, poetry science is both (1) the body of knowledge and (2) the application of (a subset and a particular interpretation of) that knowledge. Such a body of knowledge will no doubt in time come to include more formal interfaces to information technology, which then become part of that science, just as a pipette, a petri dish, a telescope and data science are all inextricably part of physical science. Above all, science is a learning process to discover what succeeds and what doesn’t. Poetry technology (and literature technology more broadly) is the development of methods, supporting tools, and processes for the purpose of generating new poetry.

So for instance, Poetry DB and my forked development of Scholl’s original Poetry Generator are both poetry technologies aimed at the creation of poetry. They are also experiments that enable me to learn what works and how these technologies could be improved. As the field evolves, and machine learning techniques are developed that are capable of absorbing the existing body of poetry knowledge (not only an understanding of its formal properties as poetry differs from, say, prose, but especially the qualitative understanding of what distinguishes Great poetry from mediocre poetry), we may gradually come to see genuine novelty.

Just as it took a few decades for music technologies such as sophisticated post-production software to really mature and come into their own, so it will take a while for poetry science and technology to evolve a robust set of concepts and solutions that writers will want to use on a regular basis. But given a bit of time, a new breed of writers, with the aid of poetry technology, will plant their flags firmly in the technological infosphere.

In the meantime, if we continue to associate poetry technology with poetry's traditional context, its growth will be stunted. That's the alternative of inertia. "Yes, so what about traditional poetry?", I hear you say. They will co-exist, the old and the new. They have to. But it's time that we acknowledge poetry technology for what it is, and welcome the new.