Showing posts with label poetry. Show all posts

Friday, October 23, 2020

GPT 2 Poetry

Some of the publicly available GPT-2 based tools can generate decent examples with the right prompt. However, I wanted to see whether a finetuned model might do a little better.

I collected a corpus of traditional poetry from PoetryDB, together with this corpus. After finetuning and taking samples at various different stages, my impression is that a finetuned model (in this case the GPT 2 medium size, 355M) can indeed generate poetry that holds its own against the default models', and in some cases may well be an improvement. A well crafted prompt is important though, and it helps to take into account the corpus it was trained on.

Decent snippets of poetry, in my experience, come at a frequency of less than 10%, probably somewhere between 5% and 10%. It definitely pays to try different prompts. It is clear that the model is willing to try to imitate a particular style, but the effect often wears off as the text generation grows longer. After a while it simply reverts to its comfort zone of country meadows, gurgling streams, and singing skylarks.

I enjoyed playing around with Emily Dickinson's style, because it is so distinctive. After just 12000 steps of training, when average loss was still quite high, I got these (my prompt in bold):

Sample 1

Here is a poem about life on Mars that is written in the terse style of Emily Dickinson:
Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society

As a normal human--
Sadness and my fantasy --
A fear of annihilation --
The peace of carnival at dusk --
Strange adventures -- secrets hidden
Ghetto art-wars --
Dark powers from Russia's mausoleum
Postcards of Mars --
As fantasy, an exoticist
Conquer as I wish!
Peace is the enemy!
Peace the enemy!
There are precious elements
Of war --
Stones of battle -- war --
Air of steel -- shipyard wars
The enemy's sacred hideout
Among the bloodstained bone
All trails left in ruins
Go to the enemy --
I am closer to Heaven than I am to Earth
To Heaven than to a shell --
I am closer to Heaven than I am to death
To Heaven than to a mirror --
To a family --
The stranger to a bank --
The broken wave to which you come --
The little town behind --
The stranger with a nightmare
So clever it ends every day.

Sample 2

Here is a poem about life on Mars that is written in the terse style of Emily Dickinson:
Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society

A wordless rebellion --
Housebound -- stricken with hope --
Cul-de-sac -- highway music
Psychoanalysis
Peace through gun violence
Offshoot of modernity --
Strenuous original genius
Raindrops -- cloud-flowers
Earth's marvelous eagle flight --
Fireflies -- find in poetry
Joy in example --
Astronauts lift alone
Lost souls (Fireflies)
The sole unknown genius of wisdom
The survival of the flying saucers
And humans
Further testing of binary
Death's voice -- death's may
Peace through gun violence
On Mars -- last challenge
On Mars -- the one.
And beyond a wilderness
Involuntary woods -- earth --
Under-written by humans
Animal culture -- obsession --
Dismantling your weakness
Unsung heroism --
Immortality

These examples appeared to be of decent quality despite training loss still being quite high. They probably represent a middle ground between the original model, and the new one, so that much of the model is still biased towards the original text (more prose-like) it had been trained on at OpenAI. Although not shown here, many samples evidenced prose-like sections.

Fast forward to a new version of the model at 33000 training steps with average loss around 0.5, and the poetry is more typically "poetic", but seems to be less versatile overall. This confirms the notion that it has taken on board more of the new training material (poetry), and is less influenced by the wide corpus it started with:

Sample 3

Here is a poem about life on Mars that is written in the terse style of Emily Dickinson:

Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society
certainly --
Nocturnal winds -- while the stars are alive --
The tribes of strangers -- and the long-term ill --
And the strange sudden malcontent
That got in, and got out, and didn't rest --
And the antitype -- that never stops --

The bonds of love -- the husband's --
The chieftain's -- and the subject's --
And the stranger in the asylum
The sportive puppy-vassal
Rests with the deer, and the hound with the hound
Stirms with the dwarf-father in the cave.

As the crow's wings grow dully broad
And the green-breasted dove
Eclipses the sun to land,
As the wild cherry-blossoms blow
And grow in the wind,
As the starlight fades
And spreads its feathery arms
And fades away in the flower,
As the blaze of the wild cherry-blossoms
Fades and leaves in the wind --

Sample 4

Here is a poem about the harsh and lonely life on the planet Mars. It is written in the terse style of Emily Dickinson:

Bereft I -- astronaut
Idle eyes -- earth
The alien things of Ancestors
Grainy melodies --
Silent majorities -- distant society

Alone -- in the vastness of the universe --
Settling there -- far away from the nearest
Humane -- or human -- aggressor --
Till I close my -- orbit -- in the moon --
Then out of the night -- into the day --

My career is not -- not yet --
My work is not yet --
The imagination's got my leave
To taste the dulness and thraldom
Of the little things.

The morning's the night's turn and the dreary
Dinner is between --
The fire shall burn forth --

The earth's shade is the lamp's boding light
And the moon's face is thaw --
But his cup, my heart insists,
That water's the wrong drink.

The sky's a slumber -- the north wind
Stirs -- and so, the sky.
For, with this world's uses to come,
A world's house is but a sky.

The sun's a-rising -- the stars,
Spangling the silvery day
With a tapestry of rising light --
The wrong house for me, I hanker


I tried a number of different prompts. Here's something a little like John Ashbery's:

Sample 5

Here is a poem about life on the planet Mars. It is written in the style of John Ashbery:

The rover will roll by later.
For now, the rockworker pauses on red ground,
fills in the time of day, the rocks' location,
leaves a note for the nightworker: "What the hell
are we going to do? I mean ...
The water is already drying up."
Rain on earth, as it does in heaven.


For those who come here, the note is of less importance, the rain. Once, once, indeed.

That voice is drowned.

And now at length the deed is done.

The morning-color comes on high;
The evening-color comes on high.

All the hills around Steep-Foot's bed
Look at the cloud-burthen coming down
From the very, very edge of last night's storms.

How the rock-bearers throw up their spears
Before the first bow of the rising moon!
Pleasant was the task to reach the brink.

Of yellow sapphire, and its rim of gleaming steel,
Is the price we pay for beauty this morn.
Lo, the sun shines, but the clouds are gray,
And the moon is on the big side;
For she rises like a volcano's peak
With her bright, lustreful head,
And her soft glossy arm is bare and soft
Like the hand that slips and slips again.


Traditional poetry is definitely the finetuned model's preferred mode of poetry generation. It seems to reel such poems off with ease, and decent snippets come at a higher rate. The output doesn't always make perfect sense, but it's easy to see how even a little tweaking can turn it into something quite readable. Here's one example:

Sample 6

I wandered lonely as a toad,
Till, stung by the keenness of my woe,
I half grew to fancy I was
A fowl of another kind,
And yet I could not despise
My own flesh:--a dreamy note
To my more melancholy heart
Said, 'Eat not, weep not; this is pity.'

'Sad was the agony, when
That hoary Flandric tree,--
Pity me! I have none,--
Sore as I am, with barren beak,
And I would die like a wood-dove,--
One moment made me think I was
With all the rest, a bugling wight,
Free from all misery, save fear.'

'Sad was the agony, when
One was clapt in the dark blue air,
And strode to the light of morning;
Sad was the agony, when
Saw a beam of light on a bed
Of many a fluttering flower,
And one was silvery in a ball;
And a sound of music came bright,
Cloven through and through, and far away
It stirr'd the chamber. Sad was the agony
Of listening to that light sound--
It did its work in silence and in pain.--

'So now I was old, and had to die;
Some fadeless bird from the glen,
A day had killed it,

Or another:

Sample 7

I wandered lonely as a toad,
Till, bored, I fell into the water;
For well I know those raggedy old stairs,
Worn by foot, and faded for good:
They know not that I was a Tot,
A "kingdom, or house," from out their memory;--
Never mind! Let darkness in again
Re-illustrate the stupidity of youth.


I've shared the checkpoint at 33000 steps of training along with a Notebook that can be opened in Colab (edit 2020-10-29: updated link to point to new aitextgen version of the notebook).

Sunday, May 03, 2020

BERT for Poetry

After the fun I had with the BERT Summariser and short stories, I decided to turn the trick to poetry. If anything the results are even better.

Here are five examples, starting with The Waste Land, which perfectly illustrates BERT's ability to find continuity.

The Waste Land, by T.S. Eliot


I. The Burial of the Dead

April is the cruellest month, breeding
Lilacs out of the dead land, mixing
Memory and desire, stirring
Dull roots with spring rain.
If you see dear Mrs. Equitone,
Tell her I bring the horoscope myself:
One must be so careful these days.
When Lil’s husband got demobbed, I said—
I didn’t mince my words, I said to her myself,
HURRY UP PLEASE ITS TIME
Now Albert’s coming back, make yourself a bit smart.
But at my back from time to time I hear
The sound of horns and motors, which shall bring
Sweeney to Mrs. Porter in the spring.
In this decayed hole among the mountains
In the faint moonlight, the grass is singing
Over the tumbled graves, about the chapel
There is the empty chapel, only the wind’s home.


The Sonnets, by William Shakespeare


When forty winters shall besiege thy brow,
And dig deep trenches in thy beauty's field,
Thy youth's proud livery so gazed on now,
Will be a tatter'd weed of small worth held:
Then being asked, where all thy beauty lies,
Where all the treasure of thy lusty days;
To say, within thine own deep sunken eyes,
Were an all-eating shame, and thriftless praise.
Thy outward thus with outward praise is crown'd;
But those same tongues, that give thee so thine own,
In other accents do this praise confound
By seeing farther than the eye hath shown.
For I am shamed by that which I bring forth,
And so should you, to love things nothing worth.
So that eternal love in love's fresh case,
Weighs not the dust and injury of age,
Nor gives to necessary wrinkles place,
But makes antiquity for aye his page;
Finding the first conceit of love there bred,
Where time and outward form would show it dead.
Never believe though in my nature reign'd,
All frailties that besiege all kinds of blood,
That it could so preposterously be stain'd,
To leave for nothing all thy sum of good;
For nothing this wide universe I call,
Save thou, my rose, in it thou art my all.
Thence comes it that my name receives a brand,
And almost thence my nature is subdu'd
To what it works in, like the dyer's hand:
Pity me, then, and wish I were renew'd;
Whilst, like a willing patient, I will drink,
Potions of eisel 'gainst my strong infection;
No bitterness that I will bitter think,
Nor double penance, to correct correction.
Poor soul, the centre of my sinful earth,
My sinful earth these rebel powers array,
Why dost thou pine within and suffer dearth,
Painting thy outward walls so costly gay?
Then soul, live thou upon thy servant's loss,
And let that pine to aggravate thy store;
Buy terms divine in selling hours of dross;
Within be fed, without be rich no more:
So shall thou feed on Death, that feeds on men,
And Death once dead, there's no more dying then.


Poems by Elizabeth Barrett Browning


We sow the glebe, we reap the corn,
We build the house where we may rest,
And then, at moments, suddenly,
We look up to the great wide sky,
Inquiring wherefore we were born…
For earnest or for jest? Ere I answered he was gone,
And none was left to love in all the world.
A THOUGHT lay like a flower upon mine heart,
And drew around it other thoughts like bees
For multitude and thirst of sweetnesses;
Whereat rejoicing, I desired the art
Of the Greek whistler, who to wharf and mart
Could lure those insect swarms from orange-trees
That I might hive with me such thoughts and please
My soul so, always.
Let them feel that this cold metallic motion
Is not all the life God fashions or reveals:
Let them prove their living souls against the notion
That they live in you, or under you, O wheels!
If He heard us, He would surely
(For they call Him good and mild)
Answer, smiling down the steep world very purely,
'Come and rest with me, my child.'
They look up with their pale and sunken faces,
And their look is dread to see,
For they mind you of their angels in high places,
With eyes turned on Deity;—
"How long," they say, "how long, O cruel nation,
Will you stand, to move the world, on a child's heart,—
Stifle down with a mailed heel its palpitation,
And tread onward to your throne amid the mart?
How there you sat in summer-time,
May yet be in your mind;
And how you heard the green woods sing
Beneath the freshening wind. Not as the conqueror comes,
They, the true-hearted, came;
Not with the roll of the stirring drums,
And the trumpet that sings of fame;
Not as the flying come,
In silence and in fear, -
They shook the depths of the desert's gloom
With their hymns of lofty cheer.
EXPERIENCE, like a pale musician, holds
A dulcimer of patience in his hand,
Whence harmonies, we cannot understand,
Of God's will in his worlds, the strain unfolds
In sad-perplexed minors: deathly colds
Fall on us while we hear, and countermand
Our sanguine heart back from the fancyland
With nightingales in visionary wolds.


The Rape of the Lock, by Alexander Pope


'Nolueram, Belinda, tuos violare capillos;
 Sed juvat, hoc precibus me tribuisse tuis.'
If e'er one vision touch thy infant thought,
Of all the nurse and all the priest have taught;
Of airy elves by moonlight shadows seen,
The silver token, and the circled green,
Or virgins visited by angel-powers,
With golden crowns and wreaths of heavenly flowers;
Hear and believe! Unnumber'd treasures ope at once, and here
The various offerings of the world appear;
From each she nicely culls with curious toil,
And decks the goddess with the glittering spoil.
Then prostrate falls, and begs with ardent eyes
Soon to obtain, and long possess the prize:
The powers gave ear, and granted half his prayer,
The rest, the winds dispersed in empty air.
to your charge repair:
The fluttering fan be Zephyretta's care;
The drops to thee, Brillante, we consign;
And, Momentilla, let the watch be thine;
Do thou, Crispissa, tend her favourite lock;
Ariel himself shall be the guard of Shock.
Hither the heroes and the nymphs resort,
To taste awhile the pleasures of a court;
In various talk the instructive hours they pass'd,
Who gave the ball, or paid the visit last;
One speaks the glory of the British Queen,
And one describes a charming Indian screen;
A third interprets motions, looks, and eyes;
At every word a reputation dies.
At this, the blood the virgin's cheek forsook,
A livid paleness spreads o'er all her look;
She sees, and trembles at the approaching ill,
Just in the jaws of ruin, and Codille.
The Gnome rejoicing bears her gifts away,
Spreads his black wings, and slowly mounts to day.
No common weapons in their hands are found,
Like gods they fight, nor dread a mortal wound.


Poems by Henry Wadsworth Longfellow


Filled is Life's goblet to the brim;
And though my eyes with tears are dim,
I see its sparkling bubbles swim,
And chant a melancholy hymn
With solemn voice and slow lines
Listen my children and you shall hear
Of the midnight ride of Paul Revere,
On the eighteenth of April, in Seventy-five;
Hardly a man is now alive
Who remembers that famous day and year.
Birds of passage sailed through the leaden air, from the ice-bound,
Desolate northern bays to the shores of tropical islands,
Harvests were gathered in; and wild with the winds of September
Wrestled the trees of the forest, as Jacob of old with the angel.
Then, as the night descended, the herds returned from their pastures;
Sweet was the moist still air with the odor of milk from their udders;
Lowing they waited, and long, at the well-known bars of the farm-yard,--
Waited and looked in vain for the voice and the hand of the milkmaid.
From the red stone of the quarry
With his hand he broke a fragment,
Moulded it into a pipe-head,
Shaped and fashioned it with figures;
From the margin of the river
Took a long reed for a pipe-stem,
With its dark green leaves upon it;
Filled the pipe with bark of willow,
With the bark of the red willow;
Breathed upon the neighboring forest,
Made its great boughs chafe together,
Till in flame they burst and kindled;
And erect upon the mountains,
Gitche Manito, the mighty,
Smoked the calumet, the Peace-Pipe,
As a signal to the nations. The worthy pastor --
The shepherd of that wandering flock,
That has the ocean for its wold,
That has the vessel for its fold,
Leaping ever from rock to rock --
Spake, with accents mild and clear,
Words of warning, words of cheer,
But tedious to the bridegroom's ear.
I tell the mariner when to sail the seas;
I waft o'er all the land from far away
The breath and bloom of the Hesperides,
My birthplace. One mass of shade,
The elm-trees drop their curtains down;
By palace, park, and colonnade
I walk as in a foreign town.

Tuesday, July 28, 2015

Poetry Technology Goes Liftoff!

PoemCrunch has gone live and I can finally explain what it’s all about.

So, what is PoemCrunch? Is it a text-bending exercise involving the classics? Is it an exploration of poetics with the aid of computing? And, does it have a point? Yes, Yes, and - good question: Yes!

In short, PoemCrunch showcases a particular type of computer aided poetry, namely those templated from various classic poems with a formal structure called Context Free Grammar (a linguistic concept pioneered by Noam Chomsky), and subsequently automated using the modern art of computing. That’s PoemCrunch in a nutshell. How it ended up that way, however, requires a bit more explanation.
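To make the idea of a context free grammar a little more concrete, here is a minimal Python sketch. It is not PoemCrunch's actual code: the rules, the symbol names and the vocabulary are all invented for illustration. Each symbol is rewritten by one of its rules until only plain words are left.

```python
import random

# A toy context-free grammar. Every key is a non-terminal that maps to a list
# of possible expansions; anything that is not a key is a terminal word.
# The rules and the vocabulary are invented for this illustration.
GRAMMAR = {
    "LINE": [["NP", "VP"]],
    "NP": [["DET", "ADJ", "NOUN"], ["DET", "NOUN"]],
    "VP": [["VERB", "NP"], ["VERB"]],
    "DET": [["the"], ["a"]],
    "ADJ": [["burning"], ["silent"], ["gentle"]],
    "NOUN": [["tyger"], ["forest"], ["night"]],
    "VERB": [["frames"], ["fades"], ["rages"]],
}


def expand(symbol, rng):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: nothing left to rewrite
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words


def generate_line(seed=None):
    """Generate one line of text, reproducibly if a seed is given."""
    rng = random.Random(seed)
    return " ".join(expand("LINE", rng))


print(generate_line(seed=1))
```

A poem template in this spirit would be a much more specific grammar, with rules written to follow one particular classic line by line.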

Some time ago I talked about the need for Poetry Technology and argued for its wider adoption. At the time I'd had PoetryDB up for a while and was looking at new ways to create poems from existing material, with the aid of - as I called it - Poetry Technology.

I looked at the available tools and got the impression that creating Poetry Technology that produces good poetry would be a Hard problem to solve. There are plenty of NLP (not Anthony Robbins' stomping ground, but Natural Language Processing) tools available, ranging from low level and specific (lemmatisers, tokenisers, parsers, stemmers) to higher level and use case oriented (text summarisers, article writers, etc.). I could see they'd be useful tools to play with, especially the low level tools, but where would I go from there?

Before getting ahead of myself, I identified three broad areas to explore, which can be categorised loosely as Easy, Harder, and Really Hard. Easy represents the lowest barrier to entry, the proverbial "lowest hanging fruit". Techniques in this space involve various venerable home hacks such as cut-ups, chained n-grams, and the like. They are fairly easy to get to grips with, and yet capable of surprisingly interesting results. Unfortunately, the drawbacks are considerable too. They provide little or no control over semantics and structure, and the results require a healthy dose of editing and curation to be of interest to a real consumer.

The Really Hard group is a space that I associate with IBM’s Watson, deep machine learning and Huge gulps of Language Data. Whatever the holy grail of language processing looks like, it lies somewhere on the road to that misty, mythical place where fabulous new poems and major works of literature will one day be spawned in the depths of a data lake. It is also a world I don’t know very well yet and, if I ever want to reach it, I have much - much - to learn.

Luckily for me, a journey is not made up of giant leaps. In the meantime, there is plenty to explore beyond the green fields of Easy, in the wet marshes of Harder.  Our sustenance here, so I figured, would be a selection of the NLP tools already on offer, a willingness to dabble in linguistics, and ultimately, a desire to see the literary emerge from the primordial soup of language, life ... and data.

Back to my story. As may be imagined, I started out looking for easy pickings. I was still in the Easy phase. I experimented with the data available to me on PoetryDB, which is a wealth of some of the best poetry in the English language up to about a century ago, easily consumable via an API. I tried various angles. A simple, and not uninteresting, approach was to use, say, all the sonnets of Shakespeare and create new sonnets at random. With fairly simple code I could combine the first line of Sonnet X with the second line of Sonnet Y with the third line of Sonnet Z and so forth until voilà! a new sonnet. There are 154 sonnets by Shakespeare on the site, so plenty of permutations are possible.
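The line-shuffling idea can be sketched in a few lines of Python. This is only a rough illustration, not the code I used at the time: the function name is made up, and three opening quatrains stand in for the full sonnets, which in practice would be fetched from PoetryDB (the fetching is not shown).

```python
import random

# Opening quatrains of Sonnets 18, 130 and 2, standing in for full sonnets.
QUATRAINS = [
    [
        "Shall I compare thee to a summer's day?",
        "Thou art more lovely and more temperate:",
        "Rough winds do shake the darling buds of May,",
        "And summer's lease hath all too short a date:",
    ],
    [
        "My mistress' eyes are nothing like the sun;",
        "Coral is far more red than her lips' red;",
        "If snow be white, why then her breasts are dun;",
        "If hairs be wires, black wires grow on her head.",
    ],
    [
        "When forty winters shall besiege thy brow,",
        "And dig deep trenches in thy beauty's field,",
        "Thy youth's proud livery so gazed on now,",
        "Will be a tatter'd weed of small worth held:",
    ],
]


def mash_up(poems, seed=None):
    """Build a poem whose i-th line is the i-th line of a randomly chosen source."""
    rng = random.Random(seed)
    length = min(len(poem) for poem in poems)
    return [rng.choice(poems)[i] for i in range(length)]


for line in mash_up(QUATRAINS, seed=7):
    print(line)

# With 154 sonnets and 14 lines each, every line position has 154 candidates.
print(154 ** 14, "possible line combinations")
```

Because each position is drawn independently, nothing here guarantees that the result rhymes or makes sense; that is exactly the curation problem described above.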

In other cases I took lines from different poets and mixed them together. This was even more interesting. The results can be both quirky and cohesive, as can be seen in the following example, which mixes William Shakespeare with Alexander Pope. A sonnet rhyme scheme was used to guide the selection of lines, and the result actually hangs together quite well:

   And agents from each foreign state
And birds sit brooding in the snow,
  Merchants unloaded here their freight,
When all aloud the wind doth blow,
Since no reprisals can be made on thee.
In so profound abysm I throw all care
They parch'd with heat, and I inflamed by thee.
  As any she belied with false compare.
Without a pain, a trouble, or a fear;
Shall I compare thee to a summer's day?
See what delights in sylvan scenes appear!
From his low tract, and look another way:
  The sun, next those the fairest light,
Do paint the meadows with delight,

This was good fun - up to a point. For every case like this there were also cases that didn’t work very well. How do we let the tool decide when a poem is decent and when it is not? We have to teach it the skill of judgment, of poetry aesthetics - or at the very, very least, the skill of deciding when a sentence or series of sentences is grammatically correct. I began wondering about machine learning's capabilities at this point, but it is linguistic NLP that drew me in first.

All roads eventually lead to NLP tools, and the prince of NLP tools is the Python NLTK (Natural Language Toolkit). Be that as it may, it is not the first one I became aware of. Being more of a Rubyist myself, I stumbled upon diasks2’s excellent page of links to Ruby NLP tools. This was both fortuitous and ultimately sobering, but more about that in a minute.

At this point I was still on the first leg of my journey, the “looking for quick wins” phase, but I was on the verge of embarking on the second. This transition occurred after my initial foray into NLP automation, which ended up being a bit of a disaster...

Now, linguists and NLP'ers commonly talk about something called Part-of-Speech tagging (POS tagging for short), which is a way of saying sentences are composed of linguistic elements like nouns, verbs, adjectives, adverbs and so on, and we can "tag" (or create metadata for) all those elements in a sentence, in a paragraph or in a full text. It's something I'd played with before, based on Zachary Scholl's work, without quite knowing that it was a widely used technique or that numerous NLP tools cater for it.

There were several Ruby tools in diasks2's list, and I approached them with gusto. After playing with various tools, the possibilities of automated tagging looked rather fun. In fact, I got so excited that I decided to brute force tag most of the poetry available to me via PoetryDB (i.e., quite a lot of poetry), and build a kind of “poetic dictionary” of nouns, verbs, etc., and finally build new poems by interpolating this vocabulary with the templates of the original poems. What a great idea. What could possibly go wrong?!
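Roughly, the brute force plan looked like the Python sketch below. Everything in it is an assumption made for illustration: a tiny hand-written lexicon stands in for a real POS tagger (the actual work used Ruby tools), and the tag names and function names are invented. The point is only to show the three steps: tag, collect a vocabulary per tag, then refill the original line as a template.

```python
import random
from collections import defaultdict

# Toy lexicon standing in for a real POS tagger (an assumption for this sketch).
LEXICON = {
    "sun": "NOUN", "meadows": "NOUN", "wind": "NOUN",
    "birds": "NOUN", "snow": "NOUN", "delight": "NOUN",
    "paint": "VERB", "blow": "VERB", "sit": "VERB",
    "fairest": "ADJ",
}

CORPUS = [
    "And birds sit brooding in the snow,",
    "When all aloud the wind doth blow,",
    "The sun, next those the fairest light,",
    "Do paint the meadows with delight,",
]


def split_punct(token):
    """Split a token into its word and any trailing punctuation."""
    core = token.rstrip(",.;:!?")
    return core, token[len(core):]


def tag_line(line):
    """Return (word, tag, trailing punctuation) triples; unknown words get None."""
    tagged = []
    for token in line.split():
        core, trail = split_punct(token)
        tagged.append((core, LEXICON.get(core.lower()), trail))
    return tagged


def build_vocabulary(lines):
    """Collect every tagged word into a per-tag 'poetic dictionary'."""
    vocab = defaultdict(set)
    for line in lines:
        for core, pos, _ in tag_line(line):
            if pos is not None:
                vocab[pos].add(core.lower())
    return {pos: sorted(words) for pos, words in vocab.items()}


def refill(line, vocab, rng):
    """Use the line as a template: swap each tagged word for another of the same tag."""
    out = []
    for core, pos, trail in tag_line(line):
        word = rng.choice(vocab[pos]) if pos in vocab else core
        out.append(word + trail)
    return " ".join(out)


vocab = build_vocabulary(CORPUS)
print(refill("Do paint the meadows with delight,", vocab, random.Random(4)))
```

Even on this tiny scale the weakness shows: a slot tagged NOUN may be filled with a singular or a plural, and a VERB slot takes no account of what it governs, so many outputs are ungrammatical.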

In case you hadn't guessed: just about everything!

To begin with, poetic language is well ahead of prose when it comes to bending the rules of language, and it laps journalism and scientific writing comfortably on that same account without even breaking a sweat. In short, it beats all comers hands down. Yes, sometimes even healthy folk, otherwise at ease with urbanspeak and text talk, break out in a rash when confronted with Shakespeare, or the evergreen Chaucer. Evergreen he may be, but we nolite talken like that no-more…

You get my point. The phrase “poetic license” was coined for a reason, and here I was, trying to run Part-Of-Speech tagging over the works of Shakespeare, Chaucer, Byron, Dickinson, Browning and the results were OMG?! This initial mistake obscured an even deeper problem, which I’ll come to in a second, and which is when I fully and truly had to leave those Halcyon days of the low and ripe fruit behind.

Once I’d narrowed down the types of poetry I used to the more understandable language of poets like Henry Wadsworth Longfellow, I began to see slightly better results - but not much. It was depressing. "Where am I going wrong?", I wondered in frustration.

All this time I was resisting the nagging feeling that I might have to give up on a fully automated process, from vocabulary through to templating to new poem generation. I really wanted to have my cake and eat it too: I wanted to use all the best poetry available as my input, I wanted to use available NLP tools to extract all the linguistic nutrients, and I wanted to do all this in an automated fashion without human intervention to see where the best results would come from. If only. Without fail, nearly all the results were dismal. Here and there was a little glimmer of brilliance, but mostly they were few and far between.

I was effectively not really in control of the process, trusting that the tools were as good as I wanted them to be. I had delegated my trust. Sometimes it's hard to distinguish between excitement, overconfidence, and foolishness. I was learning the hard way that not only was the poetry too varied and “poetic” to be reduced to one dimensional categories of speech, but likewise the tools were far too blunt.

At this point I really want to single out the idea of Part-Of-Speech tagging. It is a particularly misleading concept in the context of poetry. Part of speech really is part of nothing here, because in poetry, where the language is often interlinked, if you poke in one place, you poke everywhere - it is a ball of nerves. It is misleading because it is on the one hand not granular or nuanced enough, and on the other hand not holistic or interdependent enough. It gives the illusion of grasping language, of having grasped elements of the poem. In reality it is more like grabbing a delicate butterfly by the leg, and finding you’ve torn it off.

This realisation shouldn't obscure the fact that the idea was useful. What it meant in practice though was that I'd have to invent my own categories over and above the existing ones. More than that, I'd have to start looking at the actual structure of the poems and describe, on a metalevel, how linguistic parts interlink with each other. So it is not just a case of seeing a "verb" here and an "adjective" here, it is also about understanding that the speaker is trying to describe, say, a feeling. What other ways are there to describe that feeling? And how does it relate to other parts of the poem? Easier said than done, and really, this is a challenge that still remains. Today's PoemCrunch release is still just the first milestone.

I started with Yeats' “An Irish Airman Foresees His Death”. The first challenge, which was a step beyond POS tagging, was to interlink the various pronouns: I -> my -> me, their -> them, etc. and find ways to exchange this for alternatives: he -> his, we -> our etc. (You can still see this process happening on PoemCrunch, where the latest incarnation has found its home). I refined the parts of speech and tinkered with the vocabulary. It was interesting, and the results were improving, but I still felt slightly underwhelmed. I knew I had to refine the process, but I was still surprised at how slowly the poem was giving up its secrets.
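The pronoun interlinking can be pictured as swapping whole sets of forms at once rather than individual words. The Python sketch below is a simplified illustration under my own assumptions (the table layout and the function name are invented, and PoemCrunch's real implementation may differ): every form of the source pronoun is replaced by the matching form of the target pronoun, so that I, me and my always move together.

```python
import re

# Forms that must change together. Only three roles are modelled here.
PRONOUN_SETS = {
    "I": {"subject": "I", "object": "me", "possessive": "my"},
    "he": {"subject": "he", "object": "him", "possessive": "his"},
    "we": {"subject": "we", "object": "us", "possessive": "our"},
    "they": {"subject": "they", "object": "them", "possessive": "their"},
}


def swap_pronouns(text, source, target):
    """Replace every form of one pronoun set with the matching form of another."""
    src, dst = PRONOUN_SETS[source], PRONOUN_SETS[target]
    mapping = {src[role].lower(): dst[role] for role in src}
    pattern = re.compile(r"\b(" + "|".join(mapping) + r")\b", re.IGNORECASE)

    def replace(match):
        new = mapping[match.group(1).lower()]
        # Keep the capital letter that verse puts at the start of each line.
        at_line_start = match.start() == 0 or text[match.start() - 1] == "\n"
        return new[0].upper() + new[1:] if at_line_start else new

    return pattern.sub(replace, text)


lines = "I know that I shall meet my fate\nSomewhere among the clouds above;"
print(swap_pronouns(lines, "I", "we"))
# First line becomes: We know that we shall meet our fate
```

Swapping to "he" instead would also require fixing verb agreement ("he knows"), which this sketch deliberately ignores; that kind of knock-on effect is what made the process slow to refine.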

I decided to try a different tack. Perhaps the poems were too complex, and I was being too ambitious. I tried a couple of simple poems, and even wrote a few simple rhymes myself. No luck. The results were noticeably worse. What the hey?! Now at this point I might be forgiven for thinking that it could be time to move on and try a totally different angle. Were it not for that little voice in the back of my mind saying "but it has got to work!!", despite evidence to the contrary, that's probably what I would have done. But instead I decided to try once more and apply the same process to Shelley's Ozymandias. Ozymandias is in some ways a more complex bit of clockwork, and I really didn't expect much at all. Imagine my surprise, then, when I immediately saw several signs of improvement. How was it possible?

I refined the template further, and for the first time I was beginning to see genuine progress. It was also becoming clearer that some poems were more suitable to the treatment than others. (I returned to Yeats’ poem in the end, but only after I’d learned a few more lessons that I could apply.) It’s too early to say, even now, but there is some indication that some of the “incontrovertibly great” traditional poems like "Ozymandias", and "Sonnet 18" actually work really well, and that it might be because their inner workings are so precise that they are amenable to having the individual parts exchanged out.

At this stage I was still searching for a tool that could at least correct simple grammatical errors, and I found one in Gingerice. Its services came at a considerable performance penalty, but it sometimes (not always) rescued sentence segments that were otherwise problematic, and I was willing to live with the performance hit. The fact that I’ve now been able to phase it out almost completely is a tribute to the fact that the overall process has improved dramatically.

One of the biggest tangible lessons I learned during this process was not to rely on the existing NLP way of doing things like POS tagging. I repeat this point because in literature it is all about how you say things, not just what you say, and I realised I will have to start creating my own tags - which I soon did. I now work with a vocabulary set that has over forty non-standard tags, including different categories of nouns:  nouns of people (policemen, teachers, actors, etc.), nouns of animals (bears, elephants, birds etc.), the list goes on. (A special shout out must go to Enchanted Learning, whose lists of words proved invaluable). This is still just scratching the surface. What if I wanted to write something with a specifically steampunk flavour? I will need to provide a dictionary that caters to that genre, with adjectives, nouns, phrases, idioms that evoke those steam punk elements. This is very exciting.

The results were getting better and better as I refined the process, and it strongly pointed to having more - rather than less - control over poem segments. This was the total opposite of where I'd started, when I basically wanted an unsupervised process to take care of things. In fact, it is a lot more like writing "normal" poetry (which is to say, without programmatic intervention), and I would say the lines will blur more and more in future. The naysayers will probably go quiet at that point, but hey, let the poetry do the talking. There is still a good way to go before that will happen. PoemCrunch has reached its first milestone, yet there are many ways in which the process can still be enhanced. Language errors do creep in, sometimes because the underlying NLP tools do not provide correct results, sometimes because the structural interaction of various poem components is still at an early stage. Then there is the data. Building a good vocabulary, for one, is time-consuming, and it is becoming clear to me that vocabularies have to be tailored to work well. The extent of that tailoring depends on many factors, and one always wants to leave enough unpredictability to keep things interesting.

Another interesting outcome of the whole process was that, although I hadn't set out to cover the same ground as Zackary Scholl, I was nevertheless coming to similar conclusions. For instance, I found that due to the heuristics our brains apply, it makes sense to group words according to emotional tone (e.g. words that are positive or negative in tone). Likewise, I had begun to create custom tags in response to the limitations of "standard" tags. It probably suggests that there is often convergence along the path of evolution given certain starting conditions - in this case the parameters of a context-free grammar in relation to English poetry.

Although there are still exciting improvements to be made, the consistency with which I was achieving interesting results suggested I was reaching a milestone. It was time to share those results. That's when I started work on setting up PoemCrunch. It was conceived as a showcase of “new poems” generated from the templates of those old classics, via the process described above. To this end I have selected 5 poems: “The Tyger” (by William Blake), “Sonnet 18” (by William Shakespeare), “Ozymandias” (by Percy Bysshe Shelley), “Do not go Gentle into that Good Night” (by Dylan Thomas), and “An Irish Airman Foresees his Death” (by William Butler Yeats).

The result is a series of poems that have much in common with their illustrious parents - not least the rhythm and rhyme scheme - but that also reveal to what extent the strength of those classics lies in their magnificent scaffolding, against which new bricks can be laid, new windows can be installed, and a new facade can be erected.

Enjoy!

Saturday, July 11, 2015

The Road to PoemCrunch

PoemCrunch has gone live and I can finally explain what it’s all about.

So, what is PoemCrunch? Is it a text-bending exercise involving the classics? Is it an exploration of poetics with the aid of computing? And, does it have a point? Yes, Yes, and - good question: Yes!

In short, PoemCrunch showcases a particular type of computer aided poetry: poems templated from various classics using a formal structure called a context-free grammar (a linguistic concept pioneered by Noam Chomsky), and subsequently automated using the modern art of computing. That’s PoemCrunch in a nutshell. How it ended up that way, however, requires a bit more explanation.

Some time ago I talked about the need for Poetry Technology and argued for its wider adoption. At the time I'd had PoetryDB up for a while and was looking at new ways to create poems from existing material, with the aid of - as I called it - Poetry Technology.

I looked at what tools were freely available, and my early impression was that good Poetry Technology will likely be a Hard problem to solve. For one thing, there were plenty of NLP (not Anthony Robbins' stomping ground, but Natural Language Processing) tools on offer. Some were quite "low level" in the sense that they attempted to solve particular language and linguistic problems (with varying degrees of success). Others attempted higher level functionality, for instance by creating text summaries. This is not the right blog post in which to survey them all properly, but suffice it to say that they ranged from the fairly simple to the really complex.

What's more interesting, perhaps, was seeing what others had already done. Since I've talked about some of that in the earlier article I won't rehash the results, but on the whole it became clear that people were starting to explore this, especially for producing website content. Article writing software such as Article Builder and summary writing software like Smmry are more or less par for the course.

Back to Poetry Technology. I saw three broad areas to explore, which can be categorised loosely as Easy, Harder, and Hard. Easy represented the lowest barrier to entry, the proverbial "lowest hanging fruit". Techniques in this space involve various venerable home hacks such as cut-ups, chained n-grams, and the like.

The Easy methods are fairly easy to get to grips with, and yet capable of surprisingly interesting results. The drawbacks, however, are considerable too. There is little or no control over semantics and structure, and the results often work best with a healthy dose of curation. I'm also conscious that they can suffer from YAB (Yet Another Bot) saturation once the novelty wears off. That is not a reflection on quality as such. It is more of an image problem. We don't want to tarnish Poetry Technology prematurely. We need quality.

The Hard part is a space that I associate with IBM’s Watson, deep machine learning and Huge gulps of Language Data. Whatever the holy grail of language processing looks like, it lies somewhere on the road to that misty, mythical place where fabulous new poems and major works of literature will one day be spawned in the depths of a data lake. It is also a world I don’t know very well yet and, if I ever want to reach it, I have much - much - to learn.

Luckily for me, a journey is not made up of giant leaps. In the meantime, there is plenty to explore beyond the green fields of Easy, in the wet marshes of Harder.  Our sustenance here, so I figured, would be a selection of the NLP tools already on offer, a willingness to dabble in linguistics, and ultimately, a desire to see the literary emerge from the primordial soup of language, life ... and data.

Back to my story. As may be imagined, I started out looking for easy pickings. I was still in the Easy phase. I experimented with the data available to me on PoetryDB, which is a wealth of some of the best poetry in the English language up to about a century ago, easily consumable via an API. I tried various angles. A simple, and not uninteresting, approach was to use, say, all the sonnets of Shakespeare and create new sonnets at random. With fairly simple code I could combine the first line of Sonnet X with the second line of Sonnet Y with the third line of Sonnet Z and so forth until voilà! a new sonnet. There are 154 sonnets by Shakespeare on the site, so plenty of permutations are possible.
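To give a rough idea of the line-mixing approach, here is a minimal Python sketch. It is not the code I originally used; the function name `mix_sonnets` and the placeholder "sonnets" are invented for illustration, and fetching the real text from PoetryDB is left out. It assumes each sonnet is already available as a list of lines.

```python
import random

def mix_sonnets(sonnets, rng=None):
    """Build a new poem by taking line i from a randomly chosen source sonnet."""
    rng = rng or random.Random()
    # Stop at the shortest source so that every sonnet has a line at position i.
    length = min(len(sonnet) for sonnet in sonnets)
    return [rng.choice(sonnets)[i] for i in range(length)]

# Three stand-in "sonnets" of 14 lines each (placeholders, not real Shakespeare).
sonnets = [
    [f"Sonnet A, line {i}" for i in range(1, 15)],
    [f"Sonnet B, line {i}" for i in range(1, 15)],
    [f"Sonnet C, line {i}" for i in range(1, 15)],
]

new_sonnet = mix_sonnets(sonnets, random.Random(18))
print("\n".join(new_sonnet))
```

Because every line keeps its original position, the new poem always has a first line that was written as a first line, a closing line that was written as a closing line, and so on. With 154 sources and 14 positions there are 154 choices for each line.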

In other cases I took lines from different poets and mixed them together. This was even more interesting. The results can be both quirky and cohesive, as can be seen in the following example, which mixes William Shakespeare with Alexander Pope. A sonnet rhyme scheme was used to guide the selection of lines, and the result actually hangs together quite well:

And agents from each foreign state
And birds sit brooding in the snow,
Merchants unloaded here their freight,
When all aloud the wind doth blow,
Since no reprisals can be made on thee.
In so profound abysm I throw all care
They parch'd with heat, and I inflamed by thee.
As any she belied with false compare.
Without a pain, a trouble, or a fear;
Shall I compare thee to a summer's day?
See what delights in sylvan scenes appear!
From his low tract, and look another way:
The sun, next those the fairest light,
Do paint the meadows with delight,

This was good fun - up to a point. For every case like this there were also cases that didn’t work very well. How do we let the tool decide when a poem is decent and when it is not? We have to teach it the skill of judgment, of poetry aesthetics - or at the very, very least, the skill of deciding when a sentence or series of sentences is grammatically correct. I began wondering about machine learning's capabilities at this point, but it is linguistic NLP that drew me in first.

All roads eventually lead to NLP tools, and the prince of NLP tools is the Python NLTK (Natural Language Toolkit). Be that as it may, it is not the first one I became aware of. Being more of a Rubyist myself, I stumbled upon diasks2’s excellent page of links to Ruby NLP tools. This was both fortuitous and ultimately sobering, but more about that in a minute.

At this point I was still on the first leg of my journey, the “looking for quick wins” phase, but I was on the verge of embarking on the second. This transition occurred after my initial foray into automating various NLP tools ended up being a bit of a disaster...

Now, linguists and NLP'ers commonly talk about something called Part-of-Speech tagging (POS tagging for short), which is a way of saying sentences are composed of linguistic elements like nouns, verbs, adjectives, adverbs and so on, and we can "tag" (or create metadata for) all those elements in a sentence, in a paragraph or in a full text. It's something I'd played with before, based on Zackary Scholl's work, without quite knowing that it was a widely used technique or that numerous NLP tools cater for it.

There were several Ruby tools in diasks2's list, and I approached them with gusto. After playing with various tools, the possibilities of automated tagging looked rather fun. In fact, I got so excited that I decided to brute force tag most of the poetry available to me via PoetryDB (i.e., quite a lot of poetry), and build a kind of “poetic dictionary” of nouns, verbs, etc., and finally build new poems by interpolating this vocabulary with the templates of the original poems. What a great idea. What could possibly go wrong?!

In case you hadn't guessed: just about everything!

To begin with, poetic language is well ahead of prose when it comes to bending the rules of language, and it laps journalism and scientific writing comfortably on that same account without even breaking a sweat. In short, it beats all comers hands down. Yes, sometimes even healthy folk, otherwise at ease with urbanspeak and text talk, break out in a rash when confronted with Shakespeare, or the evergreen Chaucer. Evergreen he may be, but we nolite talken like that no-more…

You get my point. The phrase “poetic license” was coined for a reason, and here I was, trying to run Part-Of-Speech tagging over the works of Shakespeare, Chaucer, Byron, Dickinson, Browning and the results were OMG?! This initial mistake obscured an even deeper problem, which I’ll come to in a second, and which is how I fully and truly had to leave those Halcyon days of the low and ripe fruit behind.

Once I’d narrowed down the types of poetry I used to the more understandable language of poets like Henry Wadsworth Longfellow, I began to see slightly better results - but not much. It was depressing. "Where am I going wrong?", I wondered in frustration.

All this time I was resisting the nagging feeling that I might have to give up on a fully automated process, from vocabulary through to templating to new poem generation. I really wanted to have my cake and eat it too: I wanted to use all the best poetry available as my input, I wanted to use available NLP tools to extract all the linguistic nutrients, and I wanted to do all this in an automated fashion without human intervention to see where the best results would come from. If only. Nearly all the results were dismal. Here and there was a little glimmer of brilliance, but these were few and far between.

I was effectively not really in control of the process, trusting that the tools were as good as I wanted them to be. I had delegated my trust. Sometimes it's hard to distinguish between excitement, overconfidence, and foolishness. I was learning the hard way that not only was the poetry too varied and “poetic” to be reduced to one dimensional categories of speech, but likewise the tools were far too blunt.

At this point I really want to single out the idea of Part-Of-Speech tagging. It is a particularly misleading concept in the context of poetry. Part of speech really is part of nothing here, because in poetry, where the language is often interlinked, if you poke in one place, you poke everywhere - it is a ball of nerves. It is misleading because it is on the one hand not granular or nuanced enough, and on the other hand not holistic or interdependent enough. It gives the illusion of grasping language, of having grasped elements of the poem. In reality it is more like grabbing a delicate butterfly by the leg, and finding you’ve torn it off.

This realisation shouldn't obscure the fact that the idea was useful. What it meant in practice though was that I'd have to invent my own categories over and above the existing ones. More than that, I'd have to start looking at the actual structure of the poems and describe, on a metalevel, how linguistic parts interlink with each other. So it is not just a case of seeing a "verb" here and an "adjective" there, it is also about understanding that the speaker is trying to describe, say, a feeling. What other ways are there to describe that feeling? And how does it relate to other parts of the poem? Easier said than done, and really, this is a challenge that still remains. Today's PoemCrunch release is still just the first milestone.

I started with Yeats' “An Irish Airman Foresees His Death”. The first challenge, which was a step beyond POS tagging, was to interlink the various pronouns: I -> my -> me, their -> them, etc. and find ways to exchange these for alternatives: he -> his, we -> our etc. (You can still see this process happening on PoemCrunch, where the latest incarnation has found its home). I refined the parts of speech and tinkered with the vocabulary. It was interesting, and the results were improving, but I still felt slightly underwhelmed. I knew I had to refine the process, but I was still surprised at how slowly the poem was giving up its secrets.
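The pronoun interlinking can be sketched roughly as follows. This is an illustrative Python toy rather than the PoemCrunch code: the table `PRONOUN_SETS` and the function `swap_pronouns` are names invented here, and the sketch deliberately ignores verb agreement (swapping "I" for "he" would still leave "he know").

```python
import re

# Each entry keeps the forms of one "person" together so they are swapped as a set.
PRONOUN_SETS = {
    "I":    {"subject": "I",    "possessive": "my",    "object": "me"},
    "he":   {"subject": "he",   "possessive": "his",   "object": "him"},
    "we":   {"subject": "we",   "possessive": "our",   "object": "us"},
    "they": {"subject": "they", "possessive": "their", "object": "them"},
}

def swap_pronouns(text, source, target):
    """Replace every form of the source pronoun with the matching target form."""
    src, dst = PRONOUN_SETS[source], PRONOUN_SETS[target]
    mapping = {src[role].lower(): dst[role] for role in src}
    pattern = re.compile(r"\b(" + "|".join(mapping) + r")\b", re.IGNORECASE)

    def replace(match):
        found = match.group(0)
        new = mapping[found.lower()]
        # Keep a leading capital, except for "I", which is always capitalised.
        if found != "I" and found[0].isupper():
            new = new.capitalize()
        return new

    return pattern.sub(replace, text)

print(swap_pronouns("I know that I shall meet my fate", "I", "we"))
print(swap_pronouns("My country is Kiltartan Cross", "I", "they"))
```

Treating the pronoun forms as one linked set is the point: changing "I" without also changing "my" and "me" is exactly the kind of error that plain word-by-word tagging invites.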

I decided to try a different tack. Perhaps the poems were too complex, and I was being too ambitious. I tried a couple of simple poems, and even wrote a few simple rhymes myself. No luck. The results were noticeably worse. What the hey?! Now at this point I might be forgiven for thinking that it could be time to move on and try a totally different angle. Were it not for that little voice in the back of my mind saying "but it has got to work!!", despite evidence to the contrary, that's probably what I would have done. But instead I decided to try once more and apply the same process to Shelley's Ozymandias. Ozymandias is in some ways a more complex bit of clockwork, and I really didn't expect much at all. Imagine my surprise, then, when I immediately saw several signs of improvement. How was it possible?

I refined the template further, and for the first time I was beginning to see genuine progress. It was also becoming clearer that some poems were more suitable to the treatment than others. (I returned to Yeats’ poem in the end, but only after I’d learned a few more lessons that I could apply.) It’s too early to say, even now, but there is some indication that some of the “incontrovertibly great” traditional poems like "Ozymandias" and "Sonnet 18" actually work really well, and that it might be because their inner workings are so precise that they are amenable to having the individual parts swapped out.

At this stage I was still searching for a tool that could at least correct simple grammatical errors, and I found one in Gingerice. Its services came at a considerable performance penalty, but it sometimes (not always) rescued sentence segments that were otherwise problematic, and I was willing to live with the performance hit. The fact that I’ve now been able to phase it out almost completely is a testament to how dramatically the overall process has improved.

One of the biggest tangible lessons I learned during this process was not to rely on the existing NLP way of doing things like POS tagging. I repeat this point because in literature it is all about how you say things, not just what you say, and I realised I would have to start creating my own tags - which I soon did. I now work with a vocabulary set that has over forty non-standard tags, including different categories of nouns: nouns of people (policemen, teachers, actors, etc.), nouns of animals (bears, elephants, birds, etc.), and so on. (A special shout out must go to Enchanted Learning, whose lists of words proved invaluable.) This is still just scratching the surface. What if I wanted to write something with a specifically steampunk flavour? I would need to provide a dictionary that caters to that genre, with adjectives, nouns, phrases and idioms that evoke those steampunk elements. This is very exciting.
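A vocabulary keyed by custom tags, combined with a template, might look something like the Python sketch below. The tag names (`NOUN_PERSON`, `NOUN_ANIMAL`, `ADJ_POSITIVE`), the adjectives and the `<TAG>` slot syntax are all invented for this illustration; only the example nouns come from the lists mentioned above.

```python
import random

# Hypothetical custom tags mapped to word lists (tag names are assumptions).
VOCABULARY = {
    "NOUN_PERSON": ["policemen", "teachers", "actors"],
    "NOUN_ANIMAL": ["bears", "elephants", "birds"],
    "ADJ_POSITIVE": ["bright", "gentle"],
}

def fill_template(template, vocabulary, rng):
    """Replace each <TAG> slot with a random word filed under that tag."""
    filled = []
    for token in template.split():
        if token.startswith("<") and token.endswith(">"):
            filled.append(rng.choice(vocabulary[token[1:-1]]))
        else:
            filled.append(token)  # ordinary words pass through unchanged
    return " ".join(filled)

template = "the <ADJ_POSITIVE> <NOUN_PERSON> feed the <NOUN_ANIMAL>"
line = fill_template(template, VOCABULARY, random.Random(40))
print(line)
```

A steampunk flavour would then be a matter of supplying a different dictionary under the same tags, which is where the tailoring effort goes.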

The results were getting better and better as I refined the process, and it strongly pointed to having more - rather than less - control over poem segments. This was the total opposite of where I'd started, when I basically wanted an unsupervised process to take care of things. In fact, it is a lot more like writing "normal" poetry (which is to say, without programmatic intervention), and I would say the lines will blur more and more in future. The naysayers will probably go quiet at that point, but hey, let the poetry do the talking. There is still a good way to go before that will happen. PoemCrunch has reached its first milestone, yet there are many ways in which the process can still be enhanced. Language errors do creep in, sometimes because the underlying NLP tools do not provide correct results, sometimes because the structural interaction of various poem components is still at an early stage. Then there is the data. Building a good vocabulary, for one, is time-consuming, and it is becoming clear to me that vocabularies have to be tailored to work well. The extent of that tailoring depends on many factors, and one always wants to leave enough unpredictability to keep things interesting.

Another interesting outcome of the whole process was that, although I hadn't set out to cover the same ground as Zackary Scholl, I was nevertheless coming to similar conclusions. For instance, I found that due to the heuristics our brains apply, it makes sense to group words according to emotional tone (e.g. words that are positive or negative in tone). Likewise, I had begun to create custom tags in response to the limitations of "standard" tags. It probably suggests that there is often convergence along the path of evolution given certain starting conditions - in this case the parameters of a context-free grammar in relation to English poetry.

Although there are still exciting improvements to be made, the consistency with which I was achieving interesting results suggested I was reaching a milestone. It was time to share those results. That's when I started work on setting up PoemCrunch. It was conceived as a showcase of “new poems” generated from the templates of those old classics, via the process described above. To this end I have selected 5 poems: “The Tyger” (by William Blake), “Sonnet 18” (by William Shakespeare), “Ozymandias” (by Percy Bysshe Shelley), “Do not go Gentle into that Good Night” (by Dylan Thomas), and “An Irish Airman Foresees his Death” (by William Butler Yeats).

The result is a series of poems that have much in common with their illustrious parents - not least the rhythm and rhyme scheme - but that also reveal to what extent the strength of those classics lies in their magnificent scaffolding, against which new bricks can be laid, new windows can be installed, and a new facade can be erected.

Enjoy!

Monday, March 02, 2015

Why We Need Poetry Technology

Sometimes you’re doing something that is so new that it has no name yet.

Over the last few years, on and off, I’ve been experimenting with programmatic approaches to new ways of writing. An obvious starting point was cut-ups. Cut-ups are cool and can be useful as a writing aid. William Burroughs made them sing in striking disharmony. He also revoked the traditional author’s monopoly on textual narrative.

I wrote several variations on a primitive cut-up generator program. It was meant to automate the job that Burroughs achieved with paper and scissors. The outcome illustrates some of the possibilities (as well as limitations) of a basic application of the idea.
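For readers who have not met the technique, a bare-bones cut-up can be sketched in a few lines of Python. This is a reconstruction of the general idea, not one of my original programs; the function name `cut_up` and the fixed fragment size are assumptions made for the example.

```python
import random

def cut_up(text, fragment_size=3, rng=None):
    """Cut a text into fragments of a few words each and shuffle them."""
    rng = rng or random.Random()
    words = text.split()
    # The "scissors": slice the word list into consecutive fragments.
    fragments = [
        words[i:i + fragment_size]
        for i in range(0, len(words), fragment_size)
    ]
    # The "hat": draw the fragments back out in random order.
    rng.shuffle(fragments)
    return " ".join(" ".join(fragment) for fragment in fragments)

source = "the quick brown fox jumps over the lazy dog while the moon looks on"
result = cut_up(source, fragment_size=3, rng=random.Random(2007))
print(result)
```

Nothing is added and nothing is lost: the output contains exactly the words of the source, only in a new order, which is both the charm and the limitation of the method.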

This was 2007. A few isolated voices aside it was hard to find anyone who was experimenting in this area. There was hype about e-books and Amazon's soon-to-be-released e-book reader, but if literature was about to experience a revolution I wasn't in on the secret.

And so the first Kindle arrived and all the talk was of e-books, as if that was a milestone in literary innovation. To be sure, it wasn’t even close. The first e-books were simply content transposed into portable digital formats, and the Kindle extended that to the physical appliance. In other words, what you held in your hand and how you paged it was new. The reading content remained just the same.

The years passed and it kept nagging at me that in the age of the internet, literature was missing out on a massive opportunity. What opportunity exactly? The opportunity to use information technologies and the internet for the purposes of literary creativity. To bend it to our will. What the first Great Big Work of the internet age was going to look like exactly I wasn’t too sure, but one thing was crystal clear: it hadn’t yet been written.

Fast forward seven years and the major literary awards are still going to traditional forms of literature. But on the fringes and beneath the surface, the beginnings of a new way of writing literature are brewing. When I first started reading about Andy Warhol's literary works, about Flarf poetry, text remixing and the use of texts as material, the scales fell from my eyes.

Kenneth Goldsmith chronicles the background and rise of this subculture in his excellent book Uncreative Writing. This loosely distributed internet engaged community has been producing interesting and provocative literature for the last decade or so. Its influences, too, are discussed in stimulating detail and reach back via Andy Warhol, Oulipo, the Situationists, Walter Benjamin and all the way to Gertrude Stein.

However, as Goldsmith himself observes, he is but a bridge between the old world of literature and the new world of an as yet undefined, anonymously authored Uncreative era:

“The future really belongs to anonymous writers writing for anonymous readers: people who are writing programmes for machines to read, for other machines to read; I think this whole thing is going to be pushed much further. I’m just a bridge between the old and the new.”

Goldsmith’s work has provided me with a guiding principle while I explored the implications of uncreative writing. It inspired my venture into experimental poetry curation, an online zine called Poetry WTF?!. The work that is being published at Poetry WTF?! operates on the principle that existing texts are material to be used for new poetry. The resulting artefacts are often hand crafted to reveal stimulating, ironic or conceptual new poems - and sometimes a creation exhibits all of these qualities at once.

By decoupling human agency from the immediacy of expression and instead reintroducing it at the stage of creative composition we are preparing the stage for a new type of writing. Nevertheless, when we do so we are still inhabiting the world that the Oulipoets from the 1960s would recognise. It is a phase in a literary evolution that has not been taken to its limits, that hasn’t transformed into a radically new literary being just yet. We could say that the ontology of literature is still authored, analog, and un-automated.

As Goldsmith speculates, the next stage of this evolution will be evidenced by greater anonymity of authorship as well as readership. In fact this is already happening. This anonymous dialectic between creator and consumer is being played out, at the very moment that I’m authoring this, by a variety of Twitter bots. Pentametron, the brainchild of NY based conceptual artist Ranjit Bhatnagar, is one of the best known poetry Twitter bots. It has been around since 2012.

Pentametron employs an automated program (the bot) that searches Twitter feeds for tweets written in iambic pentameter, matches two that rhyme, and writes them out as Pentametron tweets. Pretty simple, but the results are remarkably readable and rather moreish. They’re also a vindication of Goldsmith’s controversial observation that language transforms rather than loses its expressive capacity when viewed as material - which is precisely what Twitter bots do par excellence. The success of the work now depends on the repeatable realisation of a concept rather than on novelty of expression.

This technological mediation is clearly a step in the right direction. It is easy to see that Pentametron’s automated method of operation and Kindle’s mere transplant of content from a physical book to an e-book are poles apart. In the former, technology is playing a significant role in the creative process. The medium itself is now coming into play both in creation as well as consumption.

While Twitter hosts some of the most famous literary bots, it isn’t the only platform where automated, anonymous literature can be read. Tumblr has its own share of autoposted literary mash-ups. A typical case in point is King James Programming, which employs a technique known as Markov chains to combine phrases from the King James Bible with a couple of programming guides. The results are generally seamless and frequently funny, as this example illustrates:

“And since programming languages are largely written in English, who would suspect a language to come from Japan? And yet, here is this great and wide sea, wherein are things creeping innumerable, both small and great”

The snowball poem generator uses Markov chains to create a totally different type of poem called a snowball (a.k.a. a chaterism). It is a type of constrained writing (because it is based on a set of rules) and concrete poetry (since its typography is important). This particular snowball generator even got a mention on Boing Boing.

The use of Markov chains has become a favoured approach in automated poetry and literature generation. Markov chains are used so widely now, from genetics to physics, that few people are aware that the Russian mathematician Andrei Markov in fact first illustrated his now-famous concept by analysing consonant and vowel patterns in Pushkin's verse novel Eugene Onegin. Poetry generation and Markov chains go together like strawberries and cream.
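For the curious, a word-level Markov chain generator fits in a few lines of Python. This is a generic sketch of the technique, not the code behind King James Programming or the snowball generator; the function names and the tiny corpus are made up for the example.

```python
import random
from collections import defaultdict

def build_chain(words):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, max_words, rng):
    """Walk the chain from a start word, picking each next word at random."""
    output = [start]
    while len(output) < max_words:
        followers = chain.get(output[-1])
        if not followers:  # dead end: this word never had a successor
            break
        output.append(rng.choice(followers))
    return output

corpus = "the sea is great and the sea is wide and the code is great".split()
chain = build_chain(corpus)
generated = generate(chain, "the", 8, random.Random(3))
print(" ".join(generated))
```

Every adjacent pair of words in the output already occurs somewhere in the source, which is why the results read as locally seamless even when the sentence as a whole wanders off. Mixing two very different sources into one corpus is what produces the comic effect.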

Yet, as can be expected in such a young and burgeoning field, Markov chains are not the only game in town. Various other approaches to generating literature have been attempted. Just recently it came to light that Zackary Scholl submitted several poems to a literary journal back in 2011, one of which was subsequently accepted and published. The twist in the tale is that the poem was not written by him directly, but generated by a program he developed. Scholl's program employs a context-free grammar, a formalism introduced by Noam Chomsky, written in a notation called Backus-Naur Form.

Scholl has made his code available on Github, and the poetry generator can be seen in action on his website, where you can generate new poems at the click of a button. Some are pretty good, too.
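To show what a context-free grammar looks like in practice, here is a toy example in Python. It is emphatically not Scholl's grammar or code: the rules, the word lists and the function `expand` are invented for illustration. In Backus-Naur Form the main rule would read roughly `<line> ::= "the" <adj> <noun> <verb> | <noun> <verb> <adv>`.

```python
import random

# Toy grammar (an assumption for this sketch). Keys are non-terminals;
# each maps to a list of alternative productions.
GRAMMAR = {
    "LINE": [["the", "ADJ", "NOUN", "VERB"], ["NOUN", "VERB", "ADV"]],
    "ADJ": [["silent"], ["burning"]],
    "NOUN": [["river"], ["sky"], ["heart"]],
    "VERB": [["sleeps"], ["sings"]],
    "ADV": [["softly"], ["alone"]],
}

def expand(symbol, rng):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # a terminal: an actual word of the poem
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words

rng = random.Random(2011)
poem = [" ".join(expand("LINE", rng)) for _ in range(4)]
print("\n".join(poem))
```

The grammar guarantees that every line is well formed according to its own rules; whether the line is any good is a separate matter, and depends almost entirely on the quality of the rules and the vocabulary.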

This is definitely a trend, and the methods will only get more complex. What machine learning can do for Watson of Jeopardy fame, it can surely do for poetry and literature in general. But who will take up the challenge?

This is part of the question that has been bouncing around my head during the past year. I looked in vain around the internet for evidence that literature or poetry is evolving along the lines of, say, finance or marketing, which both enjoy tremendous technological innovation to create more intelligent platforms and, of course, generate more money. Surprise surprise, I couldn’t find even a single website dedicated to literary texts that made its content available via an API. How on earth are we going to get literature into the information age if Shakespeare is still stuck in a book (e-books included)?!

That’s when I decided to create Poetry DB, the world’s first poetry database that has an easy-to-use API ready and available for both human and automated machine consumption. As of this writing Poetry DB contains a selection of poetry by most of the well known pre-20th century poets in the English language (from Chaucer to Dickinson and beyond), as well as the complete works of a subset of those (such as Shelley, Keats, Clare, Byron and Blake).

Yet whenever dinner party talk turned to my hobbies, I got the same slightly anxious reaction about my hopes for programmatically generated poetry. I would talk about APIs and the ability to grab lines from different poets at  Poetry DB and pass them through an algorithm that splices and dices and produces something both modern and ancient and beautiful. Then I'd hear a response along the lines of “but that’s not really poetry … (!!!)” or “but that’s just …. wrong”. The fact that I couldn’t point to any concrete example of Greatness in this brave new world didn't exactly help my cause.

This leads me back to the start of this discussion. Sometimes you’re doing something that is so new that it has no name yet. It finally dawned on me that the activity I'm engaged in is not simply creating poetry. I'm not just writing poetry. I am also trying to define the process and tools that are required for its new form. In short I'm entering a radically new space, helping to midwife a new type of literary ontology.

It is the literary equivalent of music pioneers like The Beatles (not that I'm comparing myself with them of course) playing with tape loops, creating noise that didn’t always sound like music to anyone else - maybe not even to themselves. Yet today their innovations are accepted as groundbreaking music. Today we also have sophisticated music production technologies with which to create and control sound and music. In other words, what we are doing with data driven and algorithmic poetry is perhaps best described as poetry made with poetry technology, via the application of poetry science. The end goal is still "poetry", but it's a new kind of poetry, in a new medium, and for a new type of audience.

What do we mean by poetry science and poetry technology? Does this playful activity really warrant such formal terms? I think it does, because I think the process is being misunderstood as just a different type of traditional poetry, and development of the field is languishing as a result.

From Homer to Elena Ferrante, from Aristotle through to the present day, literature and literary appraisal are bound in a dialectic that permeates culture and occasionally beyond, even into the very fabric of politics and society. A body of knowledge has evolved that has theoretical as well as practical implications. This knowledge includes a more or less formal understanding of poetry (metre, rhythm, rhyme in traditional forms), drama, prose and various other forms of literature. It also concerns detailed and analytical appraisals, such as what are examples of good literature and why, which range from close readings to serious, serious literary criticism. This body of knowledge is enormously rich.

Whenever a writer attempts to innovate, he or she is applying part of this inherited knowledge in new contexts or to new purposes. The outcome may be more or less successful, but part of this learning process is what we may consider the science of literature or poetry as the case may be. In other words, poetry science is both (1) the body of knowledge and (2) the application of (a subset and a particular interpretation of) that knowledge. Such a body of knowledge will no doubt in time come to include more formal interfaces to information technology, which then become part of that science, just as a pipette, a petri dish, a telescope and data science are all inextricably part of physical science. Above all, science is a learning process to discover what succeeds and what doesn’t. Poetry technology (and literature technology more broadly) is the development of methods, supporting tools, and processes for the purpose of generating new poetry.

So for instance, Poetry DB and my forked development of Scholl’s original Poetry Generator are both poetry technologies aimed at the creation of poetry. They are also experiments that enable me to learn what works and how these technologies could be improved. As the field evolves, and machine learning techniques are developed that are capable of absorbing the existing body of poetry knowledge (not only an understanding of its formal properties as poetry differs from, say, prose, but especially the qualitative understanding of what distinguishes Great poetry from mediocre poetry), we may gradually come to see genuine novelty.

Just as it took a few decades for music technologies such as sophisticated post-production software to really mature and come into their own, so it will take a while for poetry science and technology to evolve a robust set of concepts and solutions that writers will want to use on a regular basis. But given a bit of time, a new breed of writers, with the aid of poetry technology, will plant their flags firmly in the technological infosphere.

In the meantime, if we continue to associate poetry technology with poetry's traditional context, its growth will be stunted. That's the alternative of inertia. "Yes, so what about traditional poetry?", I hear you say. They will co-exist, the old and the new. They have to. But it's time that we acknowledge poetry technology for what it is, and welcome the new.

Monday, January 26, 2015

Emily Dickinson and the Art of Dwelling in Possibility

In I dwell in possibility Emily Dickinson compares Possibility - the world of the imagination: not simply the frivolous imagination, but the very occupation of writing poetry - to a house, a place to live in and feel at home.

What she makes clear to us, her readers, is that this is no ordinary dwelling. Compared to prose, it is superior. It is “More numerous of Windows”, and paradoxically has a roof that is not your usual roof. It does not hem in or coddle the inhabitant. It is the "everlasting sky". Those who appreciate this dwelling and are attracted to visit are simply “the fairest”. Her imagination draws progressively closer to Nature - “the Cedars”, “the Sky” - until “Paradise” itself is within her grasp.

In Dickinson's depiction of Poetry we can also detect that living there has been a choice. The activity of writing poetry has been chosen for its advantages and superior qualities. By comparing this state of dynamic being to a dwelling, Dickinson is also subverting the traditional place of the woman, namely the home. She is reclaiming the home as her home, a dwelling of her own making, untouched and unspoilt by others. In short, a Paradise.

Rebecca Resinski takes Emily Dickinson’s poem and creates a new poem. Yet, like Dickinson’s, Resinski’s poem is no ordinary poem. She pushes the possibility of Possibility to the very limit. In Dwelling: An Inventory she breaks the poem into its constituent parts and takes Kenneth Goldsmith’s project, of literature as material, to one of its logical conclusions: exhibiting it as the catalog of its lexical units.

As Goldsmith observes in Uncreative Writing, his treatise on the subject, words can never be other than expressive. Even in their newly sundered form their ordered groupings produce nearly-phrases that are on the brink of meaning: "a an and as", "more my / narrow numerous".

Resinski’s version, however, is wryly witty and ironic. A poem is itself a dwelling, and these are the words that live in Dickinson's. Yet while the poem appears at first inert and a(na)tomised in its dissected state, closer inspection reveals the promise of life brimming within words. These words are ready for new compositional possibilities.
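For readers who want to try the idea of a lexical inventory themselves, here is a minimal Python sketch. It is not Resinski's method, which I can only guess at from the finished piece; it simply assumes that an inventory means every distinct word, lower-cased, with punctuation dropped, in alphabetical order. The function name `inventory` and the tokenising rule are my own choices for illustration.

```python
import re

# Opening stanza of Dickinson's poem, with her dashes removed for simplicity.
STANZA = """I dwell in Possibility
A fairer House than Prose
More numerous of Windows
Superior for Doors"""


def inventory(poem: str) -> list[str]:
    """Return the distinct words of a poem in alphabetical order.

    Assumption: a word is any run of letters or apostrophes,
    and case is ignored.
    """
    words = re.findall(r"[a-z']+", poem.lower())
    return sorted(set(words))


if __name__ == "__main__":
    # Print the catalogue as a single line of words.
    print(" ".join(inventory(STANZA)))
```

Whether duplicates should be kept, and whether capitals should be preserved, are editorial decisions rather than technical ones; changing `sorted(set(words))` to `sorted(words)` keeps the repeats.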

Sunday, November 09, 2014

Dana Gioia and the State of Poetry

While researching poetry on the internet last night it was my good fortune to come across a fascinating essay that was published 23 years ago. A lot of water has flowed into the sea since then, but "Can Poetry Matter?", written by eminent man-of-letters Dana Gioia, has a lot to say that we can still learn from today.

He discusses the erosion of poetry's influence over the last century, in American poetry specifically, though his argument applies more broadly. His key observation is that poetry has receded from public life to an insular poetry subculture. In particular, this subculture has been fed by the influx of creative writing programs in academic institutions. At that time this had a number of consequences.

To begin with, quantity of poetic output had become more important than quality. For a career to progress it must be seen to have produced. The importance of being published in journals and of being cited by others takes precedence over actual quality. Gioia notes that the  

"proliferation of literary journals and presses over the past thirty years has been a response less to an increased appetite for poetry among the public than to the desperate need of writing teachers for professional validation."

These are the demands of the job.

A second consequence was that the majority of readers of new poetry were either poets themselves or the students of poets (in their capacity as creative writing teachers). In short most readers were from academic institutions. It goes without saying that the imagination and response of a reader from an academic institution is very different from the imagination of a social worker, a banker, a lawyer, a member of parliament, a doctor, a homeless bohemian. Yet in olden times, many layers of society took note of poetry. It was a natural vehicle for thought.

A third consequence was that, if most of the readers had become other poets from academic institutions, the concomitant reality was that most poets were those very same readers and teachers from academic institutions. As Gioia observes

"The problem is not that poets teach. The campus is not a bad place for a poet to work. It's just a bad place for all poets to work." (my emphasis)

If The Waste Land is the most important and influential poem of the 20th century, it is perhaps also a marker of the coming academification of poetry. Its complex cultural references show poetry turning inward to a vast territory of intertextuality, and it takes a certain level of erudition to absorb, let alone imitate or take further in its implications. Has culture become so complex that an academic approach is inevitable, or have poets lost the appetite - and the confidence - to address topical matters in society?

Gioia's essay leaves the reader with much food for thought. Although his focus is on the academic environment of which he was a part, it can equally apply to the online and printed journals in circulation today. It just so happens that the internet has widened the gyre of poets and readers - a good thing - but it often still feels fairly insular.

When Gioia says that "the poetry subculture no longer assumes that all published poems will be read", I am less surprised at the reality, than at the implication that it was ever different. Yet there was a time when newspapers published new poetry and, even more importantly, discussed and critiqued what was published. New poetry now rarely appears outside the abovementioned insular subcultural journals and magazines, and its appeal to a wider audience is almost non-existent: "over the past half century, as American poetry's specialist audience has steadily expanded, its general readership has declined."

There is no doubt that a lot of energy is still being spent in the act of writing poetry, as well as in the corresponding editing and curation of journals, the hosting of competitions, and the creation of chapbooks and zines, but unfortunately the overall quality is very uneven. The damaging outcome is that even those who are interested in discovering good new poetry don't believe anything truly great is getting written. Lots of average, a fair amount of good, occasionally very good, but nothing truly great. They simply don't know where to find it.

"The divorce of poetry from the educated reader has had another, more pernicious result. Seeing so much mediocre verse not only published but praised, slogging through so many dull anthologies and small magazines, most readers--even sophisticated ones like Joseph Epstein--now assume that no significant new poetry is being written. This public skepticism represents the final isolation of verse as an art form in contemporary society."

The problem is not that there is a lack of talent. Not at all. It's that talented poets' influence and - even more importantly - their potential influence is so limited that it dents their confidence to "speak up" and try to "make a difference" via their poetry. Some of the more successful poets of recent decades - Gioia mentions Adrienne Rich (feminism) and Robert Bly (anti-war) - used political agendas to inform their work and raise their profile.

Poetry seems to suffer, to some extent, from the same problem as philosophy, namely that practitioners have been driven into the confining demands of professional academic activity. In poetry, however, the situation is worse. The low barrier to entry and the myriad of online publishing channels ensure that the subculture at large is highly fractured and drenched with mediocre poetry. Individuals themselves have to become the primary arbiters of taste, which is no bad thing in itself, but there are increasingly few benchmarks other than the acknowledged masterpieces of the past.

Then there is the disconnect between writers and the publications themselves. Do poets actually read the publications where their poem is going to get published? I confess that on many, many occasions I have taken the time to read through a competition or publication's submissions criteria, only to balk and abort at the proposed turnaround time for feedback (2 months, 4 months, etc.) and huge submission fees (anything from $5 to $50). I struggle to imagine how a bohemian poet could afford to submit and wait that long, but I digress. If, on occasion, my confidence is high enough not to stop here, it almost always drops at the final hurdle: "please look through our current issue to familiarise yourself with the kind of work we are looking for".

Bang! It's all backwards. Is there a struggling author who would spend $25 to read the most recent issue of a publication he or she wants to submit to, but would not read otherwise? What are the chances, therefore, of attracting an authentic poet from the fringes of society?

It is difficult to blame the competition hosts or magazine editors. These demands are necessary to limit time-wasting and cover overheads and perhaps provide a minor compensation. Poetry circulation has dwindled and fractured, and so has its remaining readership. If it is so hard for me to even get to know the publication in question, what are the chances that a general reader - who I hope will be the reader of my poem - will ever read my poem? The answer, I believe, hovers just above zero.

Yet there is no clear alternative for exposure and prestige. This goes back to poetry's loss of the general reader, those people "who support the arts--who buy classical and jazz records; who attend foreign films and serious theater, opera, symphony, and dance; who read quality fiction and biographies; who listen to public radio and subscribe to the best journals."

In my role as editor and founder of Poetry WTF?!, I've had the privilege of corresponding with a variety of poets, many of whom do not fit Gioia's description of the academic poet. But Poetry WTF?! does not publish typical poetry, and therefore does not attract the typical poet. That is by design. There are other publications in the field, too, like Found Poetry Review, which appeal to more adventurous poets who are less tied to tradition.

On the other hand, my experience so far is still that the readership is mainly other poets and a few observers of the avant-garde rather than the informed general reader Gioia wants to attract.

Poetry needs to continue to reinvent itself to reach that wider audience, and to put the fun and excitement back into creation as well as reading. The internet has made it possible to share and enjoy poetry on a scale that would make Alexander Pope's eyes pop. We can't return to the past, but we can reclaim our future.

There is certainly more cause for optimism now than back in 1991. Austin Kleon's blackout poetry is a versatile technique with the time-boxed appeal of a crossword puzzle: try a new one every day. It now has a large following on the internet. Purists may object that blackout poetry's barrier to entry is too low and that it is limited to fairly short pieces of work, but it has certainly broadened the appeal of poetry while staying true to its spirit.

A more intellectual approach is exemplified by Christian Bök's Oulipo-inspired Eunoia, which sold well in many countries and was listed in The Times top 10 in 2002. Constraints poetry in general seems to be coming into its own in the internet era.

Attempts to frame search engine results, popular comments, and tweets into poetic structures, for instance Darren Wershler and Bill Kennedy's Apostrophe Engine and Andrei Gheorge's The Longest Poem in the World, are more conceptual but do make for amusing reading. Purists could object that theirs is no longer a craft of the text, but of the rules and processes that create the text. But a counter-argument runs that the poet's toolbox needs expanding, and that there is every reason to leverage the insights that data processing science affords the craftsperson.

Between these poles a website like The Poetry Foundation strikes a balance between wide coverage, intellectual stimulation and popular poetry. It publishes the works of past masters as well as contemporary pieces on topics and issues that we recognise. It provides discussion and translations into English from a variety of languages, recognising the universal language of poetry across borders. The translation of a poem by Liu Xia and the corresponding background notes are typical. With nearly a hundred thousand Twitter followers it appeals to the general reader that Gioia envisaged.

In conclusion, Dana Gioia suggests six ways in which the influence of poetry can be expanded, and I want to mention two. The first is,

"When poets give public readings, they should spend part of every program reciting other people's work--preferably poems they admire by writers they do not know personally. Readings should be celebrations of poetry in general, not merely of the featured author's work."

We need to be reminded of the highest exemplars, even on a night of new poetry. There is no doubt that self-promotion is necessary, but we all stand on the shoulders of giants. A bit of humility and homage is in order. If the poet's work is good, it will hopefully compare not unfavourably with the work of other poets whom he or she admires. Either way, there will be more for the audience to enjoy, and they'll be more likely to come again.

The second take-away is that "poets who compile anthologies--or even reading lists--should be scrupulously honest in including only poems they genuinely admire. Anthologies are poetry's gateway to the general culture. They should not be used as pork barrels for the creative-writing trade."

I would like to extend this appeal to the editors of zines, online journals and websites. There is nothing more disheartening than discovering a new website of poetry, only to find that it lacks any standards. Another symptom, even more corrosive, is the conscious decision to praise or include poetry because of personal or professional ties rather than merit: whether it be that of a colleague, a friend or someone who could procure a favour. The temptation is totally understandable, but it should be remembered that in government or business such behaviour would be labelled favouritism, or worse, cronyism or corruption. Readers lose faith and poetry as a whole suffers.

Conversely, there can never be enough closely knit groups to help new types of poetry to gain a foothold and flourish. Every artistic industry relies on talented communities and friends who set trends where others follow. They attract, focus, and direct energy. Even the great romantics, individualists par excellence, are famed for the sets in which they moved: Wordsworth and Coleridge, the Shelleys and Byron, Goethe and Schiller.

In the final instance it is worth remembering that, as with wine, the best exemplars take time. When we try to turn poetry writing into a profession or a conveyer belt for consumption, we deprive it of the conditions it needs to be truly great. Gioia observes how "Wallace Stevens was forty-three when his first book appeared. Robert Frost was thirty-nine. Today these sluggards would be unemployable." To imagine a careful craftsman like Leonard Cohen working in a modern office environment is to imagine a tragedy.

I always liked it slow:
I never liked it fast
With you it's got to go:
With me it's got to last


- Slow, by Leonard Cohen
& Patrick Leonard

I'm always interested in hearing about people's ideas for the future of literature and poetry. So please ping me on @thundercomb or leave a comment. I look forward to hearing from you.

Monday, September 22, 2014

What is Mutated Poetry?

Poetry exists because of, and not in spite of, its cultural contexts. But those contexts are fragile. What happens when they change?

Mutated Poetry is an attempt to find out by transforming poetry in an imaginative context. It is neither science fiction, nor science fact, but a speculative linguistic reality. Just as biological mutation has no inherent purpose - not even survival - so a mutated poem may only incidentally have aesthetic value. But if it has, it may survive.

In the context of plays, West Side Story is a mutation of Romeo and Juliet. We can trace its lineage because it retains the central myth of Shakespeare's original.

But what happens when even that myth is removed? Is the DNA of the original destroyed?

Can Shakespeare's Shall I Compare Thee to a Summer's Day survive without the subject of his affection - or are their destinies bound together, like star-crossed lovers?

These and other not so serious questions are being asked, and occasionally answered, at Poetry WTF?! So if you think you know such stuff as dreams are made of, why not grow a little culture in the petri dish?

Tuesday, June 10, 2014

Shakespeare Sonnet 18 Google Translated

Following a roundtrip Google Translation of the Fresh Prince, I decided to give a certain famous Shakespeare sonnet the same treatment. The results, I am happy to say, sound almost nothing like the original.

The following is "Shall I Compare Thee to a Summer's Day" translated from English into Finnish, Yoruba, Arabic, Spanish, Traditional Chinese, Latin, Greek, Korean, and then back to English.

Compare Rates Summer
Thou art more lovely and more temperate.
May the wind to shake the bud of love
And the temperature is less than the first one.
Heat in the southern sky;
Pale skin, golden,
And the fair and equitable
Perhaps the essence of the song changes
Eternal Summer, do not fade,
Announced on ow'st O
Are you proud of your field is killed by Wand'rest
The edge of the immortal five Xiangrong. 
One person can breathe or eyes can see, as though 
Maybe you offer.
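For anyone curious to repeat the experiment, the round trip can be sketched in a few lines of Python. I did mine by hand in the Google Translate web page, so this is only an illustration under assumptions: `translate` stands for whatever translation function or service you plug in (no real API is called here), and the language codes are the usual ISO-style codes, which a given service may or may not accept as written.

```python
from typing import Callable, Sequence

# A translator takes (text, source_language, target_language) and returns text.
Translator = Callable[[str, str, str], str]

# The chain used for the sonnet: Finnish, Yoruba, Arabic, Spanish,
# Traditional Chinese, Latin, Greek, Korean. The codes are assumptions.
CHAIN = ["fi", "yo", "ar", "es", "zh-TW", "la", "el", "ko"]


def round_trip(text: str, languages: Sequence[str],
               translate: Translator, home: str = "en") -> str:
    """Pass text through each language in turn, then back to the home language."""
    hops = [home, *languages, home]
    for source, target in zip(hops, hops[1:]):
        text = translate(text, source, target)
    return text


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in translator that only records each hop; swap in a real one."""
    return f"{text} [{source}->{target}]"


if __name__ == "__main__":
    print(round_trip("Shall I compare thee to a summer's day?", CHAIN, fake_translate))
```

Each hop translates the previous hop's output rather than the original, which is why errors compound and the final English drifts so far from Shakespeare.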

Tuesday, June 03, 2014

"Mississippi by Mark Twain" as Situationist Dérive

The sculpture poem "Mississippi by Mark Twain" by Mark Staniforth explores new territory in what is still a fledgling genre. Whereas it recalls Lewis Carroll's "The Mouse's Tale" and Apollinaire's "Calligrammes" in its literal composition, its heart lies with the Situationists and their notions of drift (dérive) and rerouting (détournement).

The original text is still detectable as an eroded background. It tells of the narrator and his friends' boyhood ambitions while living in a village on the west bank of the Mississippi river. Their enduring ambition was to grow up to become steamboatmen:

"When I was a boy, there was but one permanent ambition among my comrades in our village on the west bank of the Mississippi River. That was, to be a steamboatman."

In Staniforth's version, the original narrator and his boyhood ambitions are not mentioned. Instead, primacy is given to the circus clowns and their hopes. Yet in the original, they are but an aside:

"We had transient ambitions of other sorts, but they were only transient. When a circus came and went, it left us all burning to become clowns; the first negro minstrel show that came to our section left us all suffering to try that kind of life; now and then we had a hope that if we lived and were good, God would permit us to be pirates. These ambitions faded out, each in its turn; but the ambition to be a steamboatman always remained." (my italics)

What we are given in Staniforth's version therefore is a hidden possibility of the text, a kind of alternative reality. We may think of it as a dérive through the text, tracing a branch of its psychogeography.

This drift also hints that things do not necessarily proceed in the same order as before: "of" follows "wavelets" as the Mississippi meanders backwards before slinking forward once more, whereas in the original it is the other way around.

"Mississippi by Mark Twain" demonstrates the versatility of existing text as raw material, and the unique potential of poetry to engage with the many mysteries of meaning.

Wednesday, May 14, 2014

What is Poetry WTF?!

Poetry WTF?! is a website dedicated to the kind of poetry I've been discussing recently here at The Combed Thunderclap. It views existing texts as material to be used for new literary works.

Take Remixed Poetry for instance. Existing poems are remixed with the language of other cultural artefacts, whether they be poems, famous speeches, adverts, or indeed anything apprehendable through text.

Then there is Sculpture Poetry. These poems' essence is distilled through a process of subtraction from the originals. However, the process does not need to stop there. Subtraction could mean altering the original beyond recognition, as it becomes a different text with a different meaning altogether. The sculpture emerges by chipping away.
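One strict reading of subtraction can be checked mechanically: every word of the new poem must appear in the original, in the same order, with nothing added. The Python sketch below tests exactly that and nothing more. It is an assumption of mine rather than a rule of the site, and poems that reorder their material would fail it; the function names are invented for illustration.

```python
import re


def words(text: str) -> list[str]:
    """Lower-case a text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def is_subtraction(original: str, candidate: str) -> bool:
    """True if the candidate's words occur in the original in the same order."""
    remaining = iter(words(original))
    # Each membership test consumes the iterator up to the matched word,
    # so later words can only match further along in the original.
    return all(word in remaining for word in words(candidate))


ORIGINAL = "When a circus came and went, it left us all burning to become clowns"

if __name__ == "__main__":
    print(is_subtraction(ORIGINAL, "a circus left us clowns"))
    print(is_subtraction(ORIGINAL, "clowns left a circus"))
```

The first call keeps the original order and passes; the second uses the same words in a different order and fails, which is where the looser kinds of sculpture begin.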

Poetry WTF?! moves away from traditional poetry by viewing language and cultural artefacts as raw material, and the novelty of expression is realised through the different methods applied.

 

Friday, May 02, 2014

Winterslaap

For a bit of fun I blended two iconic Afrikaans poems into something lesser. The rule was to take a little something from each poem, line by line, put the pieces together, and add nothing (except punctuation).

Here are Winternag (Eugene Marais) and Slaap (D.F. Malherbe) as never seen before ...

Winterslaap


O koud is die slaap 'n wondersoete ding!
En sag blink haar bloue oë dof-lig en vaak,
so wyd as die Heer diep waterkuile raak
lê die droom in sterlig en skemering.
Vir laas beef haar lippe hoog in die rande.
Ek merk hoe langsaam versprei in die brande
die grassaad aan roere, my lieweling
en sluit so my oë soos winkende hande.

O treurig van wilde woeling die wysie
my stille drome op die ooswind se maat,
en deur duisternisse die lied van ‘n meisie
sluit so my oë as haar liefde verlaat.
In die gras slaap 'n wondersoete ding!
Blink op haar bloue oë ‘n druppel van dou,
en soos maneskyn vinnig verbleek dit
tot ryp in die silwer skemering.