Sunday, October 06, 2013

Luciano Floridi and a Short Introduction to Information: Part 1

I have on at least two past occasions found titles in the Very Short Introductions series to be most useful guides. Information Technology being a familiar subject to me, I was curious how Luciano Floridi, one of the foremost thinkers in the Philosophy of Information, would introduce it to the reader.

As it turns out, the introduction is very clear, on the one hand with regard to information as a field of study, and on the other in relation to its claims as a philosophical field of study. It is at the same time evident that the Philosophy of Information is still a young field, which makes this an exciting time to get on board.
 
In this blog post and those that follow I will survey the topics discussed in "Information: A Very Short Introduction" by Luciano Floridi.

***

As I write this in October 2013 information technology has come to pervade our lives to such an extent that many of us find it hard to imagine a life without it. In just a few decades it has reprogrammed the way we relate to others and how we see the world. Luciano Floridi calls this transformation the Fourth Revolution.

Floridi suggests that three prior revolutions have changed the way we look at ourselves: the Copernican, the Darwinian, and the Freudian revolutions. Whether we agree with the significance of these three milestones in Western thinking (the last in particular is perhaps questionable), we cannot but agree with the thrust of his suggestion. In each case, what it means to be human was radically problematised. Indeed, each revolution irreversibly altered our self-perception and identity.


That humanity is greatly influenced by technological advances in the Information Age is hardly news. Anyone born before the 1990s remembers how different the analog age was. What Floridi proposes is more radical. Technology does not merely affect us externally, via our bodies and senses, but effects "a radical transformation of our understanding of reality and ourselves" (p. 10). 

To guide our understanding he references two well-known films, The Matrix and Ghost in the Shell. The matrix of The Matrix has a material basis in the "real" world. In other words, the correct way of looking at the world is from the material and biological view represented by Zion. In Ghost in the Shell, on the other hand, information is primary. Information becomes the lens through which the world is perceived. Floridi suggests that it is information in the latter sense that will come to be the default, rather than The Matrix's version of information, which is ultimately based on an analog view.

"the infosphere will not be a virtual environment supported by a genuinely 'material' world behind; rather, it will be the world itself that will be increasingly interpreted and understood informationally, as part of the infosphere" (p. 17).

This turns out to be one of the most interesting of the early ideas explored in the book. The guide then detours through territory that will sound all too familiar to students of Computer Science: data vs. information, analog vs. digital data, binary data, types of data.

But what is information? It is common to distinguish between data and information, and according to the data-based General Definition of Information (GDI), information, or semantic content:
  1. consists of data (one, or more than one datum); 
  2. which is well formed;
  3. and is meaningful.
This definition helps us to understand the distinction between the quantitative and the semantic view of information. The implications of this distinction are clear once we understand that, for Floridi, semantic content is but one step away from the crowning glory of information theory: by adding truth to it, semantic content turns into semantic information.

"When semantic content is also true, it qualifies as semantic information. This is the queen of all concepts discussed in this book." (p. 47)

But before we reach this premium destination we must grapple with the quantitative view of information. Although there are several models that attempt to define such a quantitative view, Claude Shannon's Mathematical Theory of Communication (MTC) is the stalwart horse in the stable. Shannon is often referred to as the "father of information theory". While his influence is indisputable, referring to MTC as information theory has created much misunderstanding, and Shannon himself came to regret it. The reason is that MTC is indifferent to meaning and deals mainly with data communication, including encoding and transmission. Floridi suggests that "mathematical theory of data communication" would have been a more meaningful title.

To begin with we need to understand the concept of data deficit (Shannon used the more psychologistic term “uncertainty”). If we have a coin with two sides, heads and tails, and we are about to throw it, we are in a state of data deficit about the outcome. The deficit is two units, because there are two possible outcomes: heads or tails, i.e. {h}, {t}. A coin is therefore a binary device, producing one bit of information. If we had two coins to throw, the size of the data deficit would be four (i.e. there are four possible outcomes: {h, t}, {t, h}, {h, h}, {t, t}).
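
To see how the count of possible outcomes relates to the number of bits, here is a minimal Python sketch of my own (not from the book): for equally likely outcomes, the data deficit in bits is simply the base-2 logarithm of the number of outcomes.

    import math

    def data_deficit_bits(possible_outcomes: int) -> float:
        """Bits needed to resolve a choice among equally likely outcomes."""
        return math.log2(possible_outcomes)

    for coins in (1, 2, 3):
        outcomes = 2 ** coins  # each additional coin doubles the number of outcomes
        print(coins, "coin(s):", outcomes, "outcomes,",
              data_deficit_bits(outcomes), "bit(s)")

So one coin gives 2 outcomes and 1 bit, two coins give 4 outcomes and 2 bits, and so on.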

For those who study or practise computing, the notion of a binary digit, or bit, will be more than familiar. However, there is another sense in which MTC is highly amenable to computing: as it deals with uninterpreted symbols, MTC can be described as “a study of information at the syntactic level” (p. 45).

John von Neumann, one of the colossi of 20th-century mathematics, also had an influence on MTC. He suggested that Shannon call information (in the MTC sense) entropy. This was already a well-known (albeit less widely understood) concept in the natural sciences. Information entropy, like entropy in the natural sciences, is a measure. In particular it measures any of three equivalent quantities:
  1. the quantity of data on the side of the informer;
  2. the quantity of data deficit (prior to inspection) on the side of the informee;
  3. the informational potentiality of the source.
It is easy to see how 1 and 2 are two sides of the same coin (so to speak). If I have one bit of data and send it to you, your data deficit is one bit (prior to you receiving the data). The third point is a little harder to grasp. The best way is to think of it as the amount of randomness in a message. For instance, a phrase such as “The dog is black” has very little entropy because it is highly structured and organised. On the other hand, T. S. Eliot's The Waste Land has high entropy because it is open to various and indeterminate interpretations, and can be said to contain a much greater data deficit.
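
The Eliot comparison is really about interpretation, which MTC cannot capture; but at the purely syntactic level one can get a feel for entropy as average surprise with a toy calculation over character frequencies. The sketch below is my own illustration, not Floridi's, and it treats every character as independent, which real English text is not.

    from collections import Counter
    import math

    def char_entropy(text: str) -> float:
        """Shannon entropy, in bits per symbol, of a string's character distribution."""
        counts = Counter(text)
        total = len(text)
        return -sum((n / total) * math.log2(n / total) for n in counts.values())

    print(char_entropy("aaaaaaaaaaaaaaab"))  # low: almost every symbol is the same
    print(char_entropy("q7#Lz!Xw9@Rt2$Kv"))  # maximal for 16 distinct symbols: log2(16) = 4 bits

A highly repetitive string carries little surprise per symbol, while a string in which every symbol is different reaches the maximum for its alphabet.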
 
The phrase “the dog is black” can also be viewed as semantic content, because it is meaningful. If we change it into a query, e.g. “Is the dog black?”, and provide the answer “yes”, then we have semantic information.
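
As a toy way of keeping these distinctions apart, the sketch below models the GDI conditions together with Floridi's truth condition. The class and field names are my own invention, purely for illustration.

    from dataclasses import dataclass

    @dataclass
    class Content:
        data: str          # GDI condition 1: consists of data
        well_formed: bool  # GDI condition 2: well formed
        meaningful: bool   # GDI condition 3: meaningful
        true: bool         # Floridi's further condition for semantic information

        def is_semantic_content(self) -> bool:
            return bool(self.data) and self.well_formed and self.meaningful

        def is_semantic_information(self) -> bool:
            # Semantic content that is also true qualifies as semantic information.
            return self.is_semantic_content() and self.true

    answer = Content(data="yes", well_formed=True, meaningful=True, true=True)
    print(answer.is_semantic_information())  # True: well formed, meaningful, and true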

This important part of Floridi's discussion of information will be dealt with in part 2.
