So, what is PoemCrunch? Is it a text-bending exercise involving the classics? Is it an exploration of poetics with the aid of computing? And, does it have a point? Yes, Yes, and - good question: Yes!
In short, PoemCrunch showcases a particular type of computer-aided poetry: poems templated from various classics using a formal structure called a context-free grammar (a linguistic concept pioneered by Noam Chomsky), and subsequently automated using the modern art of computing. That’s PoemCrunch in a nutshell. How it ended up that way, however, requires a bit more explanation.
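To make that idea a little more concrete, here is a toy sketch of how a context-free grammar expands a template into new lines. The grammar, symbols and vocabulary below are purely illustrative, not PoemCrunch's actual data:

```ruby
# A toy context-free grammar: each non-terminal symbol expands into one of
# several alternatives until only terminal words remain.
GRAMMAR = {
  line:    [[:subject, "doth", :verb, "the", :object]],
  subject: [["the tyger"], ["my love"], ["the traveller"]],
  verb:    [["survey"], ["outshine"], ["frame"]],
  object:  [["forests of the night"], ["darling buds of May"], ["lone and level sands"]]
}

def expand(symbol)
  return symbol.to_s unless GRAMMAR.key?(symbol)   # terminals pass straight through
  GRAMMAR[symbol].sample.map { |part| expand(part) }.join(" ")
end

puts expand(:line)   # => e.g. "my love doth frame the lone and level sands"
```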
Some time ago I talked about the need for Poetry Technology and argued for its wider adoption. At the time I'd had PoetryDB up for a while and was looking at new ways to create poems from existing material, with the aid of - as I called it - Poetry Technology.
I looked at the available tools and got the impression that creating Poetry Technology that produces good poetry would be a Hard problem to solve. There are plenty of NLP (not Anthony Robbins' stomping ground, but Natural Language Processing) tools available, ranging from the low-level and specific (lemmatisers, tokenisers, parsers, stemmers) to the higher-level and use-case oriented (text summarisers, article writers, etc.). I could see they'd be useful tools to play with, especially the low-level ones, but where do I go from there?
Before getting ahead of myself, I identified three broad areas to explore, which can be categorised loosely as Easy, Harder, and Really Hard. Easy represents the lowest barrier to entry, the proverbial "low-hanging fruit". Techniques in this space involve various venerable home hacks such as cut-ups, chained n-grams, and the like. They are fairly easy to get to grips with, and yet capable of surprisingly interesting results. Unfortunately, the drawbacks are considerable too: they provide little or no control over semantics and structure, and the results require a healthy dose of editing and curation to be of interest to a real reader.
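For a taste of the Easy end, here is a bare-bones sketch of a chained bigram generator. It is my own toy example; the seed text and the twelve-word length are arbitrary:

```ruby
# Chained n-grams (here, bigrams): record which words follow which,
# then walk the chain from a random starting word.
source = "tyger tyger burning bright in the forests of the night " \
         "what immortal hand or eye could frame thy fearful symmetry"

words = source.split
chain = Hash.new { |hash, key| hash[key] = [] }
words.each_cons(2) { |current, following| chain[current] << following }

word = words.sample
line = [word]
11.times do
  word = chain[word].sample || words.sample   # dead end? jump somewhere random
  line << word
end

puts line.join(" ")
```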
The Really Hard group is a space that I associate with IBM’s Watson, deep machine learning and Huge gulps of Language Data. Whatever the holy grail of language processing looks like, it lies somewhere on the road to that misty, mythical place where fabulous new poems and major works of literature will one day be spawned in the depths of a data lake. It is also a world I don’t know very well yet and, if I ever want to reach it, I have much - much - to learn.
Luckily for me, a journey is not made up of giant leaps. In the meantime, there is plenty to explore beyond the green fields of Easy, in the wet marshes of Harder. Our sustenance here, so I figured, would be a selection of the NLP tools already on offer, a willingness to dabble in linguistics, and ultimately, a desire to see the literary emerge from the primordial soup of language, life ... and data.
Back to my story. As may be imagined, I started out looking for easy pickings. I was still in the Easy phase. I experimented with the data available to me on PoetryDB, which is a wealth of some of the best poetry in the English language up to about a century ago, easily consumable via an API. I tried various angles. A simple, and not uninteresting, approach was to use, say, all the sonnets of Shakespeare and create new sonnets at random. With fairly simple code I could combine the first line of Sonnet X with the second line of Sonnet Y with the third line of Sonnet Z and so forth until voilà! a new sonnet. There are 154 sonnets by Shakespeare on the site, so plenty of permutations are possible.
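As a rough illustration, the whole experiment fits in a few lines of Ruby, assuming PoetryDB's /author,title search and its JSON fields ("title", "lines") behave as documented on the site:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Fetch Shakespeare's sonnets from PoetryDB, then build a "new" sonnet by
# taking line 1 from a random sonnet, line 2 from another, and so on.
uri = URI('https://poetrydb.org/author,title/Shakespeare;Sonnet')
sonnets = JSON.parse(Net::HTTP.get(uri))
              .select { |poem| poem['lines'].length == 14 }   # strict 14-liners only

new_sonnet = (0...14).map { |i| sonnets.sample['lines'][i] }
puts new_sonnet
```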
In other cases I took lines from different poets and mixed them together. This was even more interesting. The results can be both quirky and cohesive, as can be seen in the following example, which mixes William Shakespeare with Alexander Pope. A sonnet rhyme scheme was used to guide the selection of lines, and the result actually hangs together quite well:
And agents from each foreign state
And birds sit brooding in the snow,
Merchants unloaded here their freight,
When all aloud the wind doth blow,
Since no reprisals can be made on thee.
In so profound abysm I throw all care
They parch'd with heat, and I inflamed by thee.
As any she belied with false compare.
Without a pain, a trouble, or a fear;
Shall I compare thee to a summer's day?
See what delights in sylvan scenes appear!
From his low tract, and look another way:
The sun, next those the fairest light,
Do paint the meadows with delight,

This was good fun - up to a point. For every case like this there were also cases that didn’t work very well. How do we let the tool decide when a poem is decent and when it is not? We have to teach it the skill of judgment, of poetry aesthetics - or at the very, very least, the skill of deciding when a sentence or series of sentences is grammatically correct. I began wondering about machine learning's capabilities at this point, but it was linguistic NLP that drew me in first.
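Before moving on to the NLP tools themselves, here is roughly how the rhyme-guided mixing above can be sketched. The "last three letters" rhyme key is a deliberately naive stand-in (a real implementation would want a pronunciation dictionary), and the function names are my own:

```ruby
# Bucket candidate lines by a crude "rhyme key" (the last three letters of
# the final word), then fill an ABAB quatrain from two rhyming buckets.
def rhyme_key(line)
  last_word = line.downcase.scan(/[a-z']+/).last.to_s
  last_word.length >= 3 ? last_word[-3..] : last_word
end

def mixed_quatrain(candidate_lines)
  buckets = candidate_lines.group_by { |line| rhyme_key(line) }
                           .select { |_, lines| lines.size >= 2 }
  return [] if buckets.size < 2
  a_lines, b_lines = buckets.values.sample(2)
  [a_lines[0], b_lines[0], a_lines[1], b_lines[1]]   # A B A B
end

# Usage (hypothetical input arrays):
# puts mixed_quatrain(shakespeare_lines + pope_lines)
```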
All roads eventually lead to NLP tools, and the prince of NLP tools is the Python NLTK (Natural Language Toolkit). Be that as it may, it was not the first one I became aware of. Being more of a Rubyist myself, I stumbled upon diasks2's excellent page of links to Ruby NLP tools. This was both fortuitous and ultimately sobering, but more about that in a minute.
At this point I was still on the first leg of my journey, the “looking for quick wins” phase, but I was on the verge of embarking on the second. This transition occurred after my initial foray into NLP automation, which ended up being a bit of a disaster...
Now, linguists and NLP'ers commonly talk about something called Part-of-Speech tagging (POS tagging for short), which is a way of saying that sentences are composed of linguistic elements like nouns, verbs, adjectives, adverbs and so on, and that we can "tag" (or create metadata for) all those elements in a sentence, in a paragraph, or in a full text. It's something I'd played with before, based on Zachary Scholl's work, without quite knowing that it was a widely used technique or that numerous NLP tools cater for it.
There were several Ruby tools in diasks2's list, and I approached them with gusto. After playing with a few, the possibilities of automated tagging looked rather fun. In fact, I got so excited that I decided to brute-force tag most of the poetry available to me via PoetryDB (i.e., quite a lot of poetry), build a kind of “poetic dictionary” of nouns, verbs, and so on, and finally build new poems by interpolating this vocabulary with the templates of the original poems. What a great idea. What could possibly go wrong?!
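On paper, the plan looked simple enough - something like the sketch below, where `pos_tag` is a placeholder for whichever tagging gem gets plugged in, and the whole thing is a simplification of the real pipeline:

```ruby
# The brute-force plan in outline: tag every word, pool words by tag into a
# "poetic dictionary", then rebuild a poem by swapping each word in the
# original template for a random word carrying the same tag.

# Placeholder tagger: a real version would call one of the NLP gems instead.
def pos_tag(word)
  case word
  when /ing\z/ then :verb
  when /ly\z/  then :adverb
  else              :noun
  end
end

def build_dictionary(poems)
  dictionary = Hash.new { |hash, tag| hash[tag] = [] }
  poems.each do |poem|
    poem.split.each { |word| dictionary[pos_tag(word)] << word }
  end
  dictionary
end

def crunch(template_poem, dictionary)
  template_poem.split("\n").map do |line|
    line.split.map { |word| dictionary[pos_tag(word)].sample || word }.join(" ")
  end.join("\n")
end
```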
In case you hadn't guessed: just about everything!
To begin with, poetic language is well ahead of prose when it comes to bending the rules of language, and it laps journalism and scientific writing comfortably on that same account without even breaking a sweat. In short, it beats all comers hands down. Yes, sometimes even healthy folk, otherwise at ease with urbanspeak and text talk, break out in a rash when confronted with Shakespeare, or the evergreen Chaucer. Evergreen he may be, but we nolite talken like that no-more…
You get my point. The phrase “poetic license” was coined for a reason, and here I was, trying to run Part-of-Speech tagging over the works of Shakespeare, Chaucer, Byron, Dickinson and Browning, and the results were OMG?! This initial mistake obscured an even deeper problem, which I’ll come to in a second, and which is when I fully and truly had to leave those halcyon days of the low and ripe fruit behind.
Once I’d narrowed down the types of poetry I used to the more understandable language of poets like Henry Wadsworth Longfellow, I began to see slightly better results - but not much. It was depressing. "Where am I going wrong?", I wondered in frustration.
All this time I was resisting the nagging feeling that I might have to give up on a fully automated process, from vocabulary through templating to new poem generation. I really wanted to have my cake and eat it too: I wanted to use all the best poetry available as my input, I wanted to use available NLP tools to extract all the linguistic nutrients, and I wanted to do all this in an automated fashion, without human intervention, to see where the best results would come from. If only. Nearly all the results were dismal. Here and there a little glimmer of brilliance appeared, but they were few and far between.
I was effectively not in control of the process, trusting that the tools were as good as I wanted them to be. I had delegated my trust. Sometimes it's hard to distinguish between excitement, overconfidence, and foolishness. I was learning the hard way that not only was the poetry too varied and “poetic” to be reduced to one-dimensional categories of speech, but the tools were also far too blunt.
At this point I really want to single out the idea of Part-of-Speech tagging. It is a particularly misleading concept in the context of poetry. A part of speech is really part of nothing here, because in poetry, where the language is so interlinked, if you poke in one place, you poke everywhere - it is a ball of nerves. The concept is misleading because it is on the one hand not granular or nuanced enough, and on the other not holistic or interdependent enough. It gives the illusion of grasping the language, of having grasped elements of the poem. In reality it is more like grabbing a delicate butterfly by the leg, and finding you’ve torn it off.
This realisation shouldn't obscure the fact that the idea was useful. What it meant in practice, though, was that I'd have to invent my own categories over and above the existing ones. More than that, I'd have to start looking at the actual structure of the poems and describe, at a meta level, how the linguistic parts interlink with each other. So it is not just a case of seeing a "verb" here and an "adjective" there; it is also about understanding that the speaker is trying to describe, say, a feeling. What other ways are there to describe that feeling? And how does it relate to other parts of the poem? Easier said than done, and really, this is a challenge that still remains. Today's PoemCrunch release is just the first milestone.
I started with Yeats' “An Irish Airman Foresees His Death”. The first challenge, which was a step beyond POS tagging, was to interlink the various pronouns: I -> my -> me, their -> them, etc., and find ways to exchange these for alternatives: he -> his, we -> our, etc. (You can still see this process happening on PoemCrunch, where the latest incarnation has found its home.) I refined the parts of speech and tinkered with the vocabulary. It was interesting, and the results were improving, but I still felt slightly underwhelmed. I knew I had to refine the process, but I was still surprised at how slowly the poem was giving up its secrets.
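The gist of that pronoun step can be sketched as follows; the tables are illustrative rather than PoemCrunch's actual data:

```ruby
# Interlinked pronoun sets: each "person" carries its subject, object,
# possessive and reflexive forms, so that swapping "I" for "he" also turns
# "my" into "his" and "me" into "him" consistently across the poem.
PRONOUN_SETS = {
  first_singular:  { subject: 'I',  object: 'me',  possessive: 'my',  reflexive: 'myself' },
  third_masculine: { subject: 'he', object: 'him', possessive: 'his', reflexive: 'himself' },
  first_plural:    { subject: 'we', object: 'us',  possessive: 'our', reflexive: 'ourselves' }
}

def swap_person(line, from, to)
  source, target = PRONOUN_SETS[from], PRONOUN_SETS[to]
  source.inject(line) do |text, (role, word)|
    text.gsub(/\b#{word}\b/i, target[role])
  end
end

puts swap_person('I know that I shall meet my fate', :first_singular, :third_masculine)
# => "he know that he shall meet his fate"  (verb agreement is a separate step)
```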
I decided to try a different tack. Perhaps the poems were too complex, and I was being too ambitious. I tried a couple of simple poems, and even wrote a few simple rhymes myself. No luck. The results were noticeably worse. What the hey?! Now at this point I might be forgiven for thinking that it could be time to move on and try a totally different angle. Were it not for that little voice in the back of my mind saying "but it has got to work!!", despite evidence to the contrary, that's probably what I would have done. But instead I decided to try once more and apply the same process to Shelley's Ozymandias. Ozymandias is in some ways a more complex bit of clockwork, and I really didn't expect much at all. Imagine my surprise, then, when I immediately saw several signs of improvement. How was it possible?
I refined the template further, and for the first time I was beginning to see genuine progress. It was also becoming clearer that some poems were more suited to the treatment than others. (I returned to Yeats’ poem in the end, but only after I’d learned a few more lessons that I could apply.) It’s too early to say, even now, but there is some indication that “incontrovertibly great” traditional poems like "Ozymandias" and "Sonnet 18" actually work really well, and that it might be because their inner workings are so precise that the individual parts can be swapped out.
At this stage I was still searching for a tool that could at least correct simple grammatical errors, and I found one in Gingerice. Its services came with a considerable performance penalty, but it sometimes (not always) rescued sentence segments that were otherwise problematic, and I was willing to live with the hit. That I’ve now been able to phase it out almost completely is a testament to how much the overall process has improved.
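Wiring it in was straightforward; this is a minimal sketch assuming the Gingerice gem's documented interface (Parser#parse returning a hash with a "result" key), so treat the details as indicative rather than gospel:

```ruby
require 'gingerice'

# Send a suspect generated line through Gingerice and take its suggestion.
parser  = Gingerice::Parser.new
outcome = parser.parse("Thou are more lovely and more temperate")
puts outcome["result"]   # Gingerice's corrected version of the line
```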
One of the biggest tangible lessons I learned during this process was not to rely on the existing NLP way of doing things like POS tagging. I repeat this point because in literature it is all about how you say things, not just what you say, and I realised I would have to start creating my own tags - which I soon did. I now work with a vocabulary set that has over forty non-standard tags, including different categories of nouns: nouns of people (policemen, teachers, actors, etc.), nouns of animals (bears, elephants, birds, etc.), and the list goes on. (A special shout-out must go to Enchanted Learning, whose lists of words proved invaluable.) This is still just scratching the surface. What if I wanted to write something with a specifically steampunk flavour? I would need to provide a dictionary that caters to that genre, with adjectives, nouns, phrases and idioms that evoke those steampunk elements. This is very exciting.
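Structurally there is nothing exotic about these custom tags; a hypothetical slice of such a vocabulary (the tag names and word lists here are illustrative, not the actual data) looks something like this:

```ruby
# Non-standard, finer-grained tags, extendable per genre.
VOCABULARY = {
  noun_person:    %w[policeman teacher actor airman shepherd],
  noun_animal:    %w[bear elephant bird tyger lamb],
  adjective_mood: %w[fearful gentle lonely burning],
  steampunk_noun: %w[airship cog gaslight automaton]
}

def word_for(tag)
  VOCABULARY.fetch(tag).sample
end

puts "What #{word_for(:adjective_mood)} #{word_for(:noun_animal)} rides the #{word_for(:steampunk_noun)}?"
```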
The results were getting better and better as I refined the process, and this strongly pointed to having more - rather than less - control over poem segments. This was the total opposite of where I'd started, when I basically wanted an unsupervised process to take care of things. In fact, it is a lot more like writing "normal" poetry (which is to say, without programmatic intervention), and I would say the lines will blur more and more in future. The naysayers will probably go quiet at that point, but hey, let the poetry do the talking. There is still a good way to go before that will happen. PoemCrunch has reached its first milestone, yet there are many ways in which the process can still be enhanced. Language errors do creep in, sometimes because the underlying NLP tools do not return correct results, sometimes because the structural interaction of the various poem components is still at an early stage. Then there is the data. Building a good vocabulary, for one, is time-consuming, and it is becoming clear to me that vocabularies have to be tailored to work well. The extent of that tailoring depends on many factors, and one always wants to leave enough unpredictability to keep things interesting.
Another interesting outcome of the whole process was that, although I hadn't set out to cover the same ground as Zachary Scholl, I was nevertheless coming to similar conclusions. For instance, I found that, given the heuristics our brains apply, it makes sense to group words according to emotional tone (e.g. words that are positive or negative in tone). Likewise, I had begun to create custom tags in response to the limitations of "standard" tags. It probably suggests that there is often convergence along the path of evolution given certain starting conditions - in this case the parameters of a context-free grammar in relation to English poetry.
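A minimal sketch of that tone grouping, with illustrative word lists of my own:

```ruby
# Group vocabulary by emotional tone, so a template slot can ask for, say,
# a negative noun rather than just any noun.
TONE_GROUPS = {
  positive: %w[delight summer bloom grace],
  negative: %w[despair decay wreck ruin]
}

def toned_word(tone)
  TONE_GROUPS.fetch(tone).sample
end
```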
Although there are still exciting improvements to be made, the consistency with which I was achieving interesting results suggested I was reaching a milestone. It was time to share those results. That's when I started work on setting up PoemCrunch. It was conceived as a showcase of “new poems” generated from the templates of those old classics, via the process described above. To this end I have selected five poems: “The Tyger” (by William Blake), “Sonnet 18” (by William Shakespeare), “Ozymandias” (by Percy Bysshe Shelley), “Do not go Gentle into that Good Night” (by Dylan Thomas), and “An Irish Airman Foresees his Death” (by William Butler Yeats).
The result is a series of poems that have much in common with their illustrious parents - not least the rhythm and rhyme scheme - but that also reveal to what extent the strength of those classics lies in their magnificent scaffolding, against which new bricks can be laid, new windows can be installed, and a new facade can be erected.
Enjoy!