# Information

*First published Fri Oct 26, 2012*

Philosophy of Information deals with the philosophical analysis of the notion of information from both a historical and a systematic perspective. With the emergence of the empiricist theory of knowledge in early modern philosophy, the development of various mathematical theories of information in the 20th century and the rise of information technology, the concept of ‘information’ has conquered a central place in the sciences and in society. This interest also led to the emergence of a separate branch of philosophy that analyzes information in all its guises (Adriaans and van Benthem 2008a,b; Lenski 2010; Floridi 2002, 2011). Information has become a central category in both the sciences and the humanities and the reflection on information influences a broad range of philosophical disciplines varying from logic (Dretske 1981; van Benthem and van Rooij 2003; van Benthem 2006) to ethics (Floridi 1999) and esthetics (Schmidhuber 1997a; Adriaans 2008) to ontology (Zuse 1969; Wheeler 1990; Schmidhuber 1997b; Wolfram 2002; Hutter 2010).

The term ‘information’ in colloquial speech is currently used predominantly as an abstract mass-noun denoting any amount of data, code or text that is stored, sent, received or manipulated in any medium. The detailed history of both the term ‘information’ and the various concepts that come with it is complex and for the larger part still has to be written (Seiffert 1968; Schnelle 1976; Capurro 1978, 2009; Capurro and Hjørland 2003). The exact meaning of the term ‘information’ varies in different philosophical traditions and its colloquial use varies geographically and over different pragmatic contexts. Although an analysis of the notion of information has been a theme in Western philosophy from its early inception, the explicit analysis of information as a philosophical concept is recent, and dates back to the second half of the 20th century. Historically the study of the concept of information can be understood as an effort to make the extensive properties of human knowledge measurable. In the 20th century various proposals for formalization of concepts of information were made:

- **Fisher information**: the amount of information that an observable random variable *X* carries about an unknown parameter *θ* upon which the probability of *X* depends (Fisher 1925).
- **Shannon information**: the entropy, *H*, of a discrete random variable *X* is a measure of the amount of uncertainty associated with the value of *X* (Shannon 1948; Shannon & Weaver 1949).
- **Kolmogorov complexity**: the information in a binary string *x* is the length of the shortest program *p* that produces *x* on a reference universal Turing machine *U* (Solomonoff 1960, 1964a,b, 1997; Kolmogorov 1965; Chaitin 1969, 1987).
- **Quantum Information**: the qubit is a generalization of the classical bit and is described by a quantum state in a two-state quantum-mechanical system, which is formally equivalent to a two-dimensional vector space over the complex numbers (Von Neumann 1955; Redei & Stoeltzner 2001).
- **Information as a state of an agent**: the formal logical treatment of notions like knowledge and belief was initiated by Hintikka (1962, 1973). Dretske (1981) and van Benthem & van Rooij (2003) studied these notions in the context of information theory, cf. van Rooij (2004) on questions and answers, or Parikh & Ramanujam (2003) on general messaging. Dunn also seems to have this notion in mind when he defines information as “what is left of knowledge when one takes away belief, justification and truth” (Dunn 2001 pg. 423, 2008).
- **Semantic Information**: Bar-Hillel and Carnap developed a theory of semantic information (1953). Floridi (2002, 2003, 2011) defines semantic information as well-formed, meaningful and truthful data. Formal entropy-based definitions of information (Fisher, Shannon, Quantum, Kolmogorov) do not imply well-formedness or truthfulness.
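Two of the quantitative definitions above can be made concrete with a small sketch. The Python below computes Shannon entropy from the standard formula H(X) = -Σ p log2 p, and uses the length of a zlib-compressed string as a crude, computable stand-in for Kolmogorov complexity, which is itself uncomputable. The function names and the example strings are illustrative assumptions, not taken from the literature cited here.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(probabilities):
    """Return H(X) = -sum(p * log2(p)) in bits for a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def empirical_entropy(message):
    """Entropy in bits per symbol of the symbol frequencies observed in `message`."""
    counts = Counter(message)
    total = len(message)
    return shannon_entropy(count / total for count in counts.values())


def compressed_length(data):
    """Length in bytes of a zlib encoding of `data`.

    Assumption: this is only a rough proxy for Kolmogorov complexity,
    which is defined via a universal Turing machine and is uncomputable.
    """
    return len(zlib.compress(data, 9))


fair_coin = shannon_entropy([0.5, 0.5])    # maximal uncertainty for two outcomes
biased_coin = shannon_entropy([0.9, 0.1])  # less uncertainty than a fair coin

rng = random.Random(0)                     # fixed seed, for illustration only
regular = b"ab" * 500                      # highly regular 1000-byte string
irregular = bytes(rng.getrandbits(8) for _ in range(1000))
```

On this sketch a fair coin carries one bit of entropy, a biased coin carries less, and the regular string compresses to far fewer bytes than the pseudo-random one, which mirrors the idea that a short program suffices to produce a highly patterned string.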

The first four concepts are quantitative, the last two qualitative. These proposals can roughly be classified in terms of the nature of the definiens: probability in the case of Fisher and Shannon Information, computation in the case of Kolmogorov complexity, quantum mechanics in the case of quantum information, true beliefs as the core concept of Semantic Information, whereas information states of agents seem to correlate with the formal notion of propositions that do not necessarily have to be true. The philosophical interpretation of the definiendum ‘Information’ naturally depends on the views one holds about the definiens. Until recently the possibility of a unification of these theories was generally doubted (Adriaans and van Benthem 2008a) but in the past decade conversions and reductions between various formal models have been studied (Cover and Thomas 2006; Grünwald and Vitányi 2008; Bais and Farmer 2008). The situation that seems to emerge is not unlike the concept of energy: there are various formal sub-theories about energy (kinetic, potential, electrical, chemical, nuclear) with well-defined transformations between them. Apart from that, the term ‘energy’ is used loosely in colloquial speech.

There is no consensus about the exact nature of the field of philosophy of information. Some authors like Floridi (2002, 2003, 2011) present ‘Philosophy of Information’ as a completely new development with a capacity to revolutionize philosophy per se. Others (Adriaans and van Benthem 2008a; Lenski 2010) see it more as a technical discipline with deep roots in the history of philosophy and consequences for various disciplines like methodology, epistemology and ethics.

- 1. Information in colloquial speech
- 2. History of the term and the concept of information
- 3. Building blocks of modern theories of information
- 4. Developments in philosophy of Information
- 5. Conclusion
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries

## 1. Information in colloquial speech

The lack of preciseness and the universal usefulness of the term ‘information’ go hand in hand. In our society, in which we explore reality by means of instruments and installations of ever increasing complexity (telescopes, cyclotrons) and communicate via more advanced media (newspapers, radio, television, SMS, the Internet), it is useful to have an abstract mass-noun for the ‘stuff’ that is created by the instruments and that ‘flows’ through these media. Historically this general meaning emerged rather late and seems to be associated with the rise of mass media and intelligence agencies (Devlin & Rosenberg 2008; Adriaans and van Benthem 2008b).

In present colloquial speech the term information is used in various
loosely defined and often even conflicting ways. Most people, for
instance, would consider the following inference *prima facie* to be
valid:

“If I get the information that *p* then I know that *p*.”

The same people would probably have no problems with the statement that “Secret services sometimes distribute false information”, or with the sentence “The information provided by the witnesses of the accident was vague and conflicting”. The first statement implies that information necessarily is true, while the other statements allow for the possibility that information is false, conflicting and vague. In everyday communication these inconsistencies do not seem to create great trouble and in general it is clear from the pragmatic context what type of information is designated. These examples suffice to argue that references to our intuitions as speakers of the English language are of little help in the development of a rigorous philosophical theory of information. There seems to be no pragmatic pressure in everyday communication to converge to a more exact definition of the notion of information.

## 2. History of the term and the concept of information

Until the second half of the 20th century almost no modern
philosopher considered ‘information’ to be an important philosophical
concept. The term has no lemma in the well-known encyclopedia of
Edwards (1967) and is not mentioned in Windelband (1921). In this
context the interest in ‘Philosophy of Information’ is a recent
development. Yet, with hindsight from the perspective of a history of
ideas, reflection on the notion of ‘information’ has been a predominant
theme in the history of philosophy. The reconstruction of this history
is relevant for the study of information.

A problem with any ‘history of ideas’ approach is the validation of the underlying assumption that the concept one is studying has indeed continuity over the history of philosophy. In the case of the historical analysis of information one might ask whether the concept of ‘informatio’ discussed by Augustine has any connection to Shannon information, other than a resemblance of the terms. At the same time one might ask whether Locke's ‘plain historical method’ is an important contribution to the emergence of the modern concept of information although in his writings Locke hardly uses the term ‘information’ in a technical sense. As is shown below, there is a conglomerate of ideas involving a notion of information that has developed from antiquity till recent times, but further study of the history of the concept of information is necessary.

An important recurring theme in the early philosophical analysis of knowledge is the paradigm of manipulating a piece of wax: either by simply deforming it, by imprinting a signet ring in it or by writing characters on it. The fact that wax can take different shapes and secondary qualities (temperature, smell, touch) while the volume (extension) stays the same makes it a rich source of analogies, natural to Greek, Roman and medieval culture, where wax was used for sculpture, writing (wax tablets) and encaustic painting. One finds this topic in writings of such diverse authors as Democritus, Plato, Aristotle, Theophrastus, Cicero, Augustine, Avicenna, Duns Scotus, Aquinas, Descartes and Locke.

### 2.1 Classical philosophy

In classical philosophy ‘information’ was a technical
notion associated with a theory of knowledge and ontology that
originated in Plato's (427–347 BCE) theory of forms, developed in
a number of his dialogues (*Phaedo, Phaedrus, Symposium, Timaeus,
Republic*). Various imperfect individual horses in the physical
world could be identified as horses, because they participated in the
static atemporal and aspatial idea of ‘horseness’ in the
world of ideas or forms. When later authors like Cicero (106–43
BCE) and Augustine (354–430 CE) discussed Platonic concepts in
Latin they used the terms
*informare* and *informatio* as a translation for
technical Greek terms like *eidos* (essence), *idea*
(idea), *typos* (type), *morphe* (form) and
*prolepsis* (representation). The root ‘form’ still
is recognizable in the word *in-form-ation* (Capurro and
Hjørland 2003). Plato's theory of forms was an attempt to
formulate a solution for various philosophical problems: the theory of
forms mediates between a static (Parmenides, ca. 450 BCE) and a dynamic
(Heraclitus, ca. 535–475 BCE) ontological conception of reality
and it offers a model for the study of the theory of human
knowledge. According to Theophrastus (371–287 BCE) the analogy of
the wax tablet goes back to Democritus (ca. 460–380/370 BCE)
(*De Sensibus* 50). In the *Theaetetus* (191c,d) Plato
compares the function of our memory with a wax tablet in which our
perceptions and thoughts are imprinted the way a signet ring stamps
impressions in wax. Note that the metaphor of imprinting symbols in
wax is essentially spatial (extensive) and cannot easily be
reconciled with the aspatial interpretation of ideas supported by
Plato.

One gets a picture of the role the notion of ‘form’ plays in classical methodology if one considers Aristotle's (384–322 BCE) doctrine of the four causes. In Aristotelian methodology understanding an object implied understanding four different aspects of it:

- Material Cause:
- that as the result of whose presence something comes into being—e.g., the bronze of a statue and the silver of a cup, and the classes which contain these
- Formal Cause:
- the form or pattern; that is, the essential formula and the classes which contain it—e.g., the ratio 2:1 and number in general is the cause of the octave-and the parts of the formula.
- Efficient Cause:
- The source of the first beginning of change or rest; e.g., the man who plans is a cause, and the father is the cause of the child, and in general that which produces is the cause of that which is produced, and that which changes of that which is changed.
- Final Cause:
- The same as “end”; i.e., the
final cause; e.g., as the “end” of walking is health. For
why does a man walk? “To be healthy,” we say, and by
saying this we consider that we have supplied the
cause. (Aristotle,
*Metaphysics* 1013a)

Note that Aristotle, who rejects Plato's theory of forms as atemporal
aspatial entities, still uses ‘form’ as a technical
concept. This passage states that knowing the form or structure of an
object, i.e., the *information*, is a necessary condition for
understanding it. In this sense information is a crucial aspect of
classical epistemology.

The fact that the ratio 2:1 is cited as an example also illustrates
the deep connection between the notion of forms and the idea that the
world was governed by mathematical principles. Under the influence of
an older Pythagorean tradition (Pythagoras, 572–ca. 500 BCE), Plato
believed that ‘everything that emerges and happens in the
world’ could be measured by means of numbers (*Politicus*
285a). On various occasions Aristotle mentions the fact that Plato
associated ideas with numbers (Vogel 1974, pg. 139). Although formal
mathematical theories about information only emerged in the
20th century, and one has to be careful not to interpret
the Greek notion of a number in any modern sense, the idea that
information was essentially a mathematical notion dates back to
classical philosophy: the form of an entity was conceived as a
structure or pattern that could be described in terms of numbers. Such
a form had both an ontological and an epistemological aspect: it
explains the essence as well as the understandability of the object.
The concept of information thus from the very start of philosophical
reflection was already associated with epistemology, ontology and
mathematics.

Two fundamental problems that are not explained by the classical
theory of ideas or forms are 1) the actual act of knowing an object
(i.e., if I see a horse in what way is the idea of a horse activated in
my mind) and 2) the process of thinking as manipulation of ideas.
Aristotle treats these issues in *De Anima*, invoking the
signet-ring-impression-in-wax analogy:

By a ‘sense’ is meant what has the power of receiving into itself the sensible forms of things without the matter. This must be conceived of as taking place in the way in which a piece of wax takes on the impress of a signet-ring without the iron or gold; we say that what produces the impression is a signet of bronze or gold, but its particular metallic constitution makes no difference: in a similar way the sense is affected by what is coloured or flavoured or sounding, but it is indifferent what in each case the substance is; what alone matters is what quality it has, i.e., in what ratio its constituents are combined. (*De Anima*, Book II, Chp. 12)

Have not we already disposed of the difficulty about interaction involving a common element, when we said that mind is in a sense potentially whatever is thinkable, though actually it is nothing until it has thought? What it thinks must be in it just as characters may be said to be on a writing-tablet on which as yet nothing actually stands written: this is exactly what happens with mind. (*De Anima*, Book III, Chp. 4)

These passages are rich in influential ideas and can with hindsight
be read as programmatic for a philosophy of information: the process of
*informatio* can be conceived as the imprint of characters on a wax
tablet (*tabula rasa*), and thinking can be analyzed in terms of manipulation
of symbols.

### 2.2 Medieval philosophy

Throughout the Middle Ages the reflection on the concept of
*informatio* is taken up by successive thinkers. Illustrative
of the Aristotelian influence is the passage of Augustine in *De
Trinitate* book XI. Here he analyzes vision as an analogy for the
understanding of the Trinity. There are three aspects: the corporeal
form in the outside world, the *informatio* by the sense of
vision, and the resulting form in the mind. For this process of
information Augustine uses the image of a signet ring making an
impression in wax (*De Trinitate*, XI Cap 2 par 3). Capurro
(2009) observes that this analysis can be interpreted as an early
version of the technical concept of ‘sending a message’ in
modern information theory, but the idea is older and is a common topic
in Greek thought (Plato *Theaetetus* 191c,d; Aristotle *De
Anima*, Book II, Chp. 12, Book III, Chp. 4; Theophrastus *De
Sensibus* 50).

The *tabula rasa* notion was later further developed in the theory
of knowledge of Avicenna (c.980–1037 CE):

The human intellect at birth is rather like a *tabula rasa*, a pure potentiality that is actualized through education and comes to know. Knowledge is attained through empirical familiarity with objects in this world from which one abstracts universal concepts. (Sajjad 2006, Other Internet Resources)

The idea of a *tabula rasa* development of the human mind was
the topic of a novel, *Hayy ibn Yaqdhan*, by the Arabic Andalusian
philosopher Ibn Tufail (1105–1185 CE, known as
“Abubacer” or “Ebn Tophail” in the West). This
novel describes the development of an isolated child on a deserted
island. A later translation in Latin under the title *Philosophus
Autodidactus* (1671) influenced the empiricist John Locke in the
formulation of his
*tabula rasa* doctrine.

Apart from the permanent creative tension between theology and
philosophy, medieval thought, after the rediscovery of Aristotle's
*Metaphysics* in the 12th century inspired by Arabic scholars, can be
characterized as an elaborate and subtle interpretation and development
of, mainly Aristotelian, classical theory. Reflection on the notion of
*informatio* is taken up, under the influence of Avicenna, by
thinkers like Aquinas (1225–1274 CE) and Duns Scotus
(1265/66–1308 CE). When Aquinas discusses the question whether
angels can interact with matter he refers to the Aristotelian doctrine
of hylomorphism (i.e., the theory that substance consists of matter
(*hyle*: wood, matter) and form (*morphe*)). Here Aquinas translates
this as the in-formation of matter (*informatio materiae*)
(*Summa Theologiae,* 1a 110 2, Capurro 2009). Duns Scotus
refers to *informatio* in the technical sense when he discusses
Augustine's theory of vision in *De Trinitate*, XI Cap 2 par 3
(Duns Scotus, 1639, De imagine, *Ordinatio*, I, d.3, p.3).

The tension that already existed in classical philosophy between
Platonic idealism (*universalia ante res*) and Aristotelian
realism (*universalia in rebus*) is recaptured as the problem
of universals: do universal qualities like ‘humanity’ or
the idea of a horse exist apart from the individual entities that
instantiate them? It is in the context of his rejection of universals
that Ockham (c. 1287–1347 CE) introduces his well-known razor:
entities should not be multiplied beyond necessity. Throughout their
writings Aquinas and Scotus use the Latin terms *informatio*
and *informare* in a technical sense, although this terminology
is not used by Ockham.

### 2.3 Modern philosophy

The history of the concept of information in modern philosophy is
complicated. Probably starting in the 14th century the term
‘information’ emerged in various developing European
languages in the general meaning of ‘education’ and
‘inquiry’. The French historical dictionary by Godefroy
(1881) gives *action de former, instruction, enquête,
science, talent* as early meanings of ‘information’.
The term was also used explicitly for legal inquiries
(*Dictionnaire du Moyen Français* (1330–1500)
2010). Because of this colloquial use the term
‘information’ loses its association with the concept of
‘form’ gradually and appears less and less in a formal
sense in philosophical texts.

At the end of the Middle Ages society and science are changing fundamentally (Hazard 1935; Ong 1958; Dijksterhuis 1986). In a long complex process the Aristotelian methodology of the four causes was transformed to serve the needs of experimental science:

- The Material Cause developed into the modern notion of matter.
- The Formal Cause was reinterpreted as geometric form in space.
- The Efficient Cause was redefined as direct mechanical interaction between material bodies.
- The Final Cause was dismissed as unscientific. Because of this, Newton's contemporaries had difficulty with the concept of the force of gravity in his theory. Gravity as action at a distance seemed to be a reintroduction of final causes.

In this changing context the analogy of the wax-impression is reinterpreted. A proto-version of the modern concept of information as the structure of a set or sequence of simple ideas is developed by the empiricists, but since the technical meaning of the term ‘information’ is lost, this theory of knowledge is never identified as a new ‘theory of information’.

The consequence of this shift in methodology is that only phenomena that can be explained in terms of mechanical interaction between material bodies can be studied scientifically. This implies in a modern sense: the reduction of intensive properties to measurable extensive properties. For Galileo this insight is programmatic:

To excite in us tastes, odors, and sounds I believe that nothing is required in external bodies except shapes, numbers, and slow or rapid movements. (Galileo 1623)

These insights later led to the doctrine of the difference between
primary qualities (space, shape, velocity) and secondary qualities
(heat, taste, color etc.). In the context of philosophy of information
Galileo's observations on the secondary quality of ‘heat’
are of particular importance since they lay the foundations for the
study of thermodynamics in the 19th century:

Having shown that many sensations which are supposed to be qualities residing in external objects have no real existence save in us, and outside ourselves are mere names, I now say that I am inclined to believe heat to be of this character. Those materials which produce heat in us and make us feel warmth, which are known by the general name of “fire,” would then be a multitude of minute particles having certain shapes and moving with certain velocities. (Galileo 1623)

A pivotal thinker in this transformation is René Descartes
(1596–1650 CE). In his *Meditationes*, after
‘proving’ that matter (*res extensa*) and mind
(*res cogitans*) are different substances (i.e., forms of being
existing independently), the question of the interaction between these
substances becomes an issue. The malleability of wax is for Descartes
an explicit argument against influence of the *res extensa* on
the *res cogitans* (*Meditationes* II, 15). The fact
that a piece of wax loses its form and other qualities easily when
heated, implies that the senses are not adequate for the
identification of objects in the world. True knowledge thus can only
be reached via ‘inspection of the mind’. Here the wax
metaphor that for more than 1500 years was used to
*explain* sensory impression is used to argue *against*
the possibility of reaching knowledge via the senses. Since the essence of
the *res extensa* is extension, thinking fundamentally cannot be
understood as a spatial process. Descartes still uses the terms ‘form’
and ‘idea’ in the original scholastic non-geometric (atemporal,
aspatial) sense. An example is the short formal proof of God's existence
in the second answer to Mersenne in the *Meditationes de Prima
Philosophia*:

I use the term idea to refer to the *form* of any given thought, immediate perception of which makes me aware of the thought.

(Idea nomine intelligo cujuslibet cogitationis *formam* illam, per cujus immediatam perceptionem ipsius ejusdem cogitationis conscius sum)

I call them ‘ideas’, says Descartes,

only in so far as they make a difference to the mind itself when they *inform* that part of the brain.

(sed tantum quatenus mentem ipsam in illam cerebri partem conversam *informant*). (Descartes, 1641, *Ad Secundas Objectiones, Rationes, Dei existentiam & animae distinctionem probantes, more Geometrico dispositae*.)

Because the *res extensa* and the *res cogitans* are different
substances, the act of thinking can never be emulated in space: machines
cannot have the universal faculty of reason. Descartes gives two
separate motivations:

Of these the first is that they could never use words or other signs arranged in such a manner as is competent to us in order to declare our thoughts to others: (…) The second test is, that although such machines might execute many things with equal or perhaps greater perfection than any of us, they would, without doubt, fail in certain others from which it could be discovered that they did not act from knowledge, but solely from the disposition of their organs: for while reason is an universal instrument that is alike available on every occasion, these organs, on the contrary, need a particular arrangement for each particular action; whence it must be morally impossible that there should exist in any machine a diversity of organs sufficient to enable it to act in all the occurrences of life, in the way in which our reason enables us to act. (*Discours de la méthode*, 1647)

The passage is relevant since it directly argues against the possibility of artificial intelligence and it even might be interpreted as arguing against the possibility of a universal Turing machine: reason as a universal instrument can never be emulated in space. This conception is in opposition to the modern concept of information which as a measurable quantity is essentially spatial, i.e., extensive (but in a sense different from that of Descartes).

Descartes does not present a new interpretation of the notions of form and idea, but he sets the stage for a debate about the nature of ideas that revolves around two opposite positions:

**Rationalism**: The Cartesian notion that ideas are innate and thus a priori. This form of rationalism implies an interpretation of the notion of ideas and forms as atemporal, aspatial, but complex structures, e.g., the idea of ‘a horse’ (i.e., with a head, body and legs). It also matches well with the interpretation of the knowing subject as a created being (*ens creatum*). God created man after his own image and thus provided the human mind with an adequate set of ideas to understand his creation. In this theory, growth of knowledge is a priori limited. Creation of new ideas ex nihilo is impossible. This view is difficult to reconcile with the concept of experimental science.

**Empiricism**: Concepts are constructed in the mind a
posteriori on the basis of ideas associated with sensory impressions.
This doctrine implies a new interpretation of the concept of idea as:

whatsoever is the object of understanding when a man thinks … whatever is meant by phantasm, notion, species, or whatever it is which the mind can be employed about when thinking. (Locke 1691, Essay, I,i,8)

Here ideas are conceived as elementary building blocks of human knowledge and reflection. This fits well with the demands of experimental science. The downside is that the mind can never formulate apodeictic truths about cause and effect and the essence of observed entities, including its own identity. Human knowledge becomes essentially probabilistic (Locke 1691, Essay, IV 25).

Locke's reinterpretation of the notion of idea as a ‘structural
placeholder’ for any entity present in the mind is an essential step in
the emergence of the modern concept of information. Since these ideas
are not involved in the justification of apodeictic knowledge, the
necessity to stress the atemporal and aspatial nature of ideas
vanishes. The construction of concepts on the basis of a *collection
of elementary ideas* based in sensorial experience opens the gate
to a reconstruction of *knowledge as an extensive property of an
agent*: more ideas imply more probable knowledge.

In the second half of the 17th century formal theory of probability is
developed by researchers like Pascal (1623–1662), Fermat (1601
or 1606–1665) and Christiaan Huygens (1629–1695). The
work *De ratiociniis in ludo aleae* of Huygens was translated
into English by John Arbuthnot (1692). For these authors, the world
was essentially mechanistic and thus deterministic; probability was a
quality of human knowledge caused by its imperfection:

It is impossible for a Die, with such determin'd force and direction, not to fall on such determin'd side, only I don't know the force and direction which makes it fall on such determin'd side, and therefore I call it Chance, which is nothing but the want of art;… (John Arbuthnot, *Of the Laws of Chance* (1692), preface)

This text probably influenced Hume, who was the first to marry formal probability theory with theory of knowledge:

Though there be no such thing as Chance in the world; our ignorance of the real cause of any event has the same influence on the understanding, and begets a like species of belief or opinion. (…) If a dye were marked with one figure or number of spots on four sides, and with another figure or number of spots on the two remaining sides, it would be more probable, that the former would turn up than the latter; though, if it had a thousand sides marked in the same manner, and only one side different, the probability would be much higher, and our belief or expectation of the event more steady and secure. This process of the thought or reasoning may seem trivial and obvious; but to those who consider it more narrowly, it may, perhaps, afford matter for curious speculation. (Hume 1748, Section VI, “On probability” 1)

Here knowledge about the future as a degree of belief is measured in terms of probability, which in its turn is explained in terms of the number of configurations a deterministic system in the world can have. The basic building blocks of a modern theory of information are in place. With this new concept of knowledge empiricists laid the foundation for the later development of thermodynamics as a reduction of the secondary quality of heat to the primary qualities of bodies.
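Hume's die can be worked through as a small calculation. The sketch below counts configurations to obtain the probabilities he describes and, as an anachronistic addition, attaches the later Shannon-style measure of surprise (-log2 p) to each outcome. The function names are illustrative, and the reading of the thousand-sided case as 1000 like sides plus one different side is an assumption about the wording of the passage.

```python
import math
from fractions import Fraction


def probability(favourable, total):
    """Classical probability: favourable configurations over all configurations."""
    return Fraction(favourable, total)


def surprisal_bits(p):
    """Information in bits gained when an outcome of probability p occurs."""
    return -math.log2(float(p))


# Hume's six-sided die: one figure on four sides, another on the remaining two.
p_former = probability(4, 6)
p_latter = probability(2, 6)

# Assumption: "a thousand sides marked in the same manner, and only one side
# different" is read here as 1000 like sides plus 1 different side (1001 total).
p_common = probability(1000, 1001)
p_rare = probability(1, 1001)
```

The more configurations favour an outcome, the higher its probability and the lower its surprisal: the side marked four times is expected more firmly than the side marked twice, and the expectation becomes nearly certain in the thousand-sided case.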

At the same time the term ‘information’ seems to have lost much of its technical meaning in the writings of the empiricists so this new development is not designated as a new interpretation of the notion of ‘information’. Locke sometimes uses the phrase that our senses ‘inform’ us about the world and occasionally uses the word ‘information’.

For what information, what knowledge, carries this proposition in it, viz. ‘Lead is a metal’ to a man who knows the complex idea the name lead stands for? (Locke 1691, VIII, 4)

Hume seems to use information in the same casual way when he observes:

Two objects, though perfectly resembling each other, and even appearing in the same place at different times, may be numerically different: And as the power, by which one object produces another, is never discoverable merely from their idea, it is evident cause and effect are relations, of which we receive information from experience, and not from any abstract reasoning or reflection. (Hume 1739, Part III, section 1)

The empiricist methodology is not without problems. The biggest
issue is that all knowledge becomes probabilistic and a posteriori.
Immanuel Kant (1724–1804) was one of the first to point out that the
human mind has a grasp of the meta-concepts of space, time and
causality that itself can never be understood as the result of a mere
combination of ‘ideas’. What is more, these intuitions allow us to
formulate scientific insights with certainty: i.e., the fact that the
sum of the angles of a triangle in Euclidean space is 180 degrees. This
issue cannot be explained in the empirical framework. If knowledge is
created by means of combination of ideas then there must exist an a
priori synthesis of ideas in the human mind. According to Kant, this
implies that the human mind can evaluate its own capability to
formulate scientific judgements. In his *Kritik der reinen Vernunft*
(1781) Kant developed transcendental philosophy as an investigation of
the necessary conditions of human knowlevdge. Although Kant's
transcendental program did not contribute directly to the development
of the concept of information, he did influence research in to the
foundations of mathematics and knowledge relevant for this subject in
the 19^{th} and 20^{th} century: e.g., the work of Frege, Husserl, Russell,
Brouwer, L. Wittgenstein, Gödel, Carnap, Popper and Quine.

### 2.4 Historical development of the meaning of the term ‘information’

The history of the term ‘information’ is intricately
related to the study of central problems in epistemology and ontology
in Western philosophy. After a start as a technical term in classical
and medieval texts the term ‘information’ almost vanished
from the philosophical discourse in modern philosophy, but gained
popularity in colloquial speech. Gradually the term obtained the
status of an abstract mass-noun, a meaning that is orthogonal to the
classical process-oriented meaning. In this form it was picked up by
several researchers (Fisher 1925; Shannon 1948) in the 20^{th}
century who introduced formal methods to measure
‘information’. This, in its turn, led to a revival of the
philosophical interest in the concept of information. This complex
history seems to be one of the main reasons for the difficulties in
formulating a definition of a unified concept of information that
satisfies all our intuitions. At least three different meanings of the
word ‘information’ are historically relevant:

- **‘Information’ as the process of being informed.** This is the oldest meaning one finds in the writings of authors like Cicero (106–43 BCE) and Augustine (354–430 CE) and it is lost in the modern discourse, although the association of information with processes (e.g., computing, flowing or sending a message) still exists. In classical philosophy one could say that when I recognize a horse as such, then the ‘form’ of a horse is planted in my mind. This process is my ‘information’ of the nature of the horse. Also the act of teaching could be referred to as the ‘information’ of a pupil. In the same sense one could say that a sculptor creates a sculpture by ‘informing’ a piece of marble. The task of the sculptor is the ‘information’ of the statue (Capurro & Hjørland 2003). This process-oriented meaning survived quite long in western European discourse: even in the 18^{th} century Robinson Crusoe could refer to the education of his servant Friday as his ‘information’ (Defoe 1719). It is also used in this sense by Berkeley: “I love information upon all subjects that come in my way, and especially upon those that are most important” (*Alciphron* Dialogue 1, Section 5, Paragraph 6/10, see Berkeley 1732).
- **‘Information’ as a state of an agent,** i.e., as the result of the process of being informed. If one teaches a pupil the theorem of Pythagoras then, after this process is completed, the student can be said to ‘have the information about the theorem of Pythagoras’. In this sense the term ‘information’ is the result of the same suspect form of substantivation of a verb (*informare* > *informatio*) as many other technical terms in philosophy (substance, consciousness, subject, object). This sort of term-formation is notorious for the conceptual difficulties it generates. Can one derive the fact that I ‘have’ consciousness from the fact that I am conscious? Can one derive the fact that I ‘have’ information from the fact that I have been informed? The transformation to this modern substantivized meaning seems to have been gradual and seems to have been general in Western Europe at least from the middle of the fifteenth century. In the renaissance a scholar could be referred to as ‘a man of information’, much in the same way as we now could say that someone received an education (Adriaans and van Benthem 2008b; Capurro & Hjørland 2003). In ‘Emma’ by Jane Austen one can read: “Mr. Martin, I suppose, is not a man of information beyond the line of his own business. He does not read” (Austen 1815, pg 21).
- **‘Information’ as the disposition to inform,** i.e., as a capacity of an object to inform an agent. When the act of teaching me Pythagoras' theorem leaves me with information about this theorem, it is only natural to assume that a text in which the theorem is explained actually ‘contains’ this information. The text has the capacity to inform me when I read it. In the same sense, when I have received information from a teacher, I am capable of transmitting this information to another student. Thus information becomes something that can be stored and measured. This last concept of information as an abstract mass-noun has gathered wide acceptance in modern society and found its definitive form in the 19^{th} century, allowing Sherlock Holmes to make the following observation: “… friend Lestrade held information in his hands the value of which he did not himself know.” (“The Adventure of the Noble Bachelor,” Conan Doyle 1892). The association with technical philosophical notions like ‘form’ and ‘informing’ has vanished from the general consciousness, although the association between information and processes like storing, gathering, computing and teaching still exists.

## 3. Building blocks of modern theories of information

Leaving aside for a moment the exact nature of information bearers (an
‘idea’, a text, number, message, physical object, system,
proposition or structure) there are various ways to measure the amount
of information stored in an information bearer *x*. Let
*I*(*x*) be an indeterminate information function that assigns a
scalar value to *x* measuring its ‘information’. There are two
basic intuitions or maxims that any such measurement proposal should
observe:

- *Information is extensive*. Our intuition is that a longer text potentially contains more information. Thus when we have two structures *A* and *B* that are mutually independent, then the total information in the combination should be the sum of the information in *A* and the information in *B*: *I*(*A* and *B*) = *I*(*A*) + *I*(*B*).
- *Information reduces uncertainty*. Information grows with the reduction of uncertainty it creates. When we are absolutely certain about a state of affairs we cannot receive new information about it. This suggests an association between information and probability. Improbable structures contain more information. If we measure the probability of an event in terms of a real number between 0 and 1, then when *P*(*A*) = 1, i.e., it is absolutely certain that *A* will occur, we should have that *I*(*A*) = 0, i.e., the occurrence of *A* contains no information.

Both intuitions are related to the methodology of empiricism (Locke
1691; Hume 1748) and its underlying theory of knowledge. The simplest
mathematical function that unifies these two intuitions is the one
that defines the information in terms of the negative log of the
probability: *I*(*A*)= −log *P*(*A*)
(Shannon 1948; Shannon & Weaver 1949). The elegance of this
formula however does not shield us from the conceptual problems it
harbors and the history of its genesis is involved. In the following
paragraphs we discuss some developments that contributed to the
emergence of modern theories of information.
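
The two maxims can be checked numerically for the negative logarithm. The following Python sketch is only an illustration: the function name `information` and the choice of base 2 (so that values come out in bits) are assumptions made here, not part of the original proposals.

```python
import math


def information(p: float) -> float:
    """Information content, in bits, of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    # log2(1/p) equals -log2(p); this form avoids returning -0.0 for p = 1.
    return math.log2(1.0 / p)


# Two independent events: the probability of the conjunction is the product.
p_a, p_b = 0.5, 0.25

print(information(1.0))                     # a certain event
print(information(p_a * p_b))               # information in 'A and B'
print(information(p_a) + information(p_b))  # sum of the separate amounts
```

Because the logarithm turns the product of independent probabilities into a sum, the second and third printed values agree, while a certain event yields zero.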

With hindsight many notions that have to do with optimal code systems, ideal languages and the association between computing and processing language have been recurrent themes in the philosophical reflection since the seventeenth century.

### 3.1 Languages

One of the most elaborate proposals for a universal
‘philosophical’ language was made by bishop John Wilkins:
“An Essay towards a Real Character, and a Philosophical
Language” (London 1668). Wilkins' project consisted of an
elaborate system of symbols that supposedly were associated with
unambiguous concepts in reality. Proposals such as these made
philosophers sensitive to the deep connections between language and
thought. The empiricist methodology made it possible to conceive the
development of language as a system of conventional signs in terms of
associations between ideas in the human mind. The issue that currently
is known as the *symbol grounding problem* (how do arbitrary
signs acquire their inter-subjective meaning) was one of the most
heavily debated questions in the 18^{th} century in the
context of the problem of the origin of languages. Thinkers as diverse as
Vico, Condillac, Rousseau, Diderot, Herder and Hamann made
contributions. The central question was whether language was given a
priori (by God) or whether it was constructed and hence an invention
of man himself. Typical was the contest issued by the Berlin Academy
in 1769:

En supposant les hommes abandonnés à leurs facultés naturelles, sont-ils en état d'inventer le langage, et par quels moyens parviendront-ils à cette invention?

Assuming men abandoned to their natural faculties, are they able to invent language and by what means will they come to this invention?

The controversy raged on for over a century without any conclusion and in 1866 the Société de Linguistique de Paris banned the issue from its proceedings.

Philosophically more relevant is the work of Leibniz (1646–1716)
on a so-called *characteristica universalis*: the notion of a
universal logical calculus that would be the perfect vehicle for
scientific reasoning. A central presupposition in Leibniz' philosophy
is that such a perfect language of science is in principle possible
because of the perfect nature of the world as God's creation
(*ratio essendi* = *ratio cognoscendi*, the origin of
being is the origin of knowing). This principle was rejected by Wolff
(1679–1754) who suggested a more heuristically oriented
*characteristica combinatoria* (van Peursen 1987). These ideas had to
wait for thinkers like Boole (1854, *An Investigation of the Laws
of Thought*), Frege (1879, *Begriffsschrift*), Peirce (who
in 1886 already suggested that electrical circuits could be used to
process logical operations) and Russell and Whitehead
(1910–1913, *Principia Mathematica*) to find a more fruitful treatment.

### 3.2 Optimal codes

The fact that frequencies of letters vary in a language has been known since the invention of book printing. Printers needed many more ‘e’s and ‘t’s than ‘x’s or ‘q’s to typeset an English text. This knowledge has been used extensively to decode ciphers since the 17th century (Kahn 1967; Singh 1999). In 1844 an assistant of Samuel Morse, Alfred Vail, determined the frequency of letters used in a local newspaper in Morristown and used them to optimize Morse code. Thus the core of the theory of optimal codes was already established long before Shannon developed its mathematical foundation (Shannon 1948; Shannon & Weaver 1949). Historically important but philosophically less relevant are the efforts of Charles Babbage to construct computing machines (Difference Engine in 1821, and the Analytical Engine 1834–1871) and the attempt of Ada Lovelace (1815–1852) to design what is considered to be the first programming language for the Analytical Engine.
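
The intuition that frequent letters deserve shorter codes can be made concrete with a small Python sketch. It uses Huffman's algorithm, a later technique that is not mentioned in the text above and is swapped in here purely for illustration; the letter frequencies are rough illustrative values, and the names `huffman_code_lengths` and `letter_freqs` are hypothetical.

```python
import heapq
from typing import Dict


def huffman_code_lengths(freqs: Dict[str, float]) -> Dict[str, int]:
    """Return the code length in bits that Huffman's algorithm gives each symbol."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries: (total weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them moves one level down.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


# Illustrative (assumed) relative frequencies, in percent, for five English letters.
letter_freqs = {"e": 12.7, "t": 9.1, "a": 8.2, "x": 0.15, "q": 0.1}
lengths = huffman_code_lengths(letter_freqs)

for letter in sorted(lengths, key=lengths.get):
    print(letter, lengths[letter])
```

In this toy alphabet the common letters end up with short code words and the rare ones with long code words, which is the same trade-off Vail exploited when assigning dots and dashes.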

### 3.3 Numbers

The simplest way of representing numbers is via a *unary
system*. Here the length of the representation of a number is
equal to the size of the number itself, i.e., the number
‘ten’ is represented as ‘\\\\\\\\\\’. The
classical Roman number system is an improvement since it contains
different symbols for different orders of magnitude (one = I, ten = X,
hundred = C, thousand = M). This system has enormous drawbacks since
in principle one needs an infinite number of symbols to code the
natural numbers and because of this the same mathematical operations
(adding, multiplication etc.) take different forms at different orders
of magnitude. Around 500 CE the number zero was invented in
India. Using zero as a placeholder we can code an infinity of numbers
with a finite set of symbols (one = 1, ten = 10, hundred = 100,
thousand = 1000 etc.). From a modern perspective an infinite number of
position systems is possible as long as we have 0 as a placeholder and
a finite number of other symbols. Our normal decimal number system has
ten digits ‘0, 1, 2, 3, 4, 5, 6, 7, 8, 9’ and represents
the number two-hundred-and-fifty-five as ‘255’. In a
binary number system we only have the symbols ‘0’ and
‘1’. Here two-hundred-and-fifty-five is represented as
‘11111111’. In a hexadecimal system with 16 symbols (0, 1,
2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f) the same number can be
written as ‘ff’. Note that the length of these
representations differs considerably. Using this representation,
mathematical operations can be standardized irrespective of the order
of magnitude of numbers we are dealing with, i.e., the possibility of
a uniform algorithmic treatment of mathematical operations (addition,
subtraction, multiplication and division etc.) is associated with such
a position system.
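
The difference in length between these representations, and the uniform algorithm behind positional notation, can be illustrated with a short Python sketch. The helper names `to_base` and `unary` are hypothetical and chosen only for this example.

```python
DIGITS = "0123456789abcdef"


def to_base(n: int, base: int) -> str:
    """Write a non-negative integer in a positional system with the given base."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 16")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        # The same division step works at every order of magnitude.
        n, remainder = divmod(n, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def unary(n: int) -> str:
    """Unary representation: one stroke per unit."""
    return "|" * n


number = 255
for base in (2, 10, 16):
    text = to_base(number, base)
    print(base, text, len(text))
print("unary length:", len(unary(number)))
```

The same loop handles every base, whereas the unary representation grows linearly with the number itself.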

The concept of a positional number system was brought to Europe by
the Persian mathematician al-Khwarizmi (ca. 780–ca. 850
CE). His main work on numbers (ca. 820 CE) was translated into Latin as
*Liber Algebrae et Almucabola* in the 12th century, which gave
us amongst other things the term ‘algebra’. Our word
‘algorithm’ is derived from *Algoritmi*, the Latin
form of his name. Positional number systems simplified commercial and
scientific calculations.

In 1544 Michael Stifel introduced the concept of the exponent of a
number in *Arithmetica integra* (Stifel 1544). Thus 8 can be
written as 2^{3} and 25 as
5^{2}. The notion of an exponent immediately
suggests the notion of a logarithm as its inverse function:
log_{b}(*b*^{a}) = *a*. Stifel compared the arithmetic sequence:

−3, −2, −1, 0, 1, 2, 3

in which the terms have a difference of 1 with the geometric sequence:

⅛, ¼, ½, 1, 2, 4, 8

in which the terms have a ratio of 2. The exponent notation allowed him to rewrite the values of the second sequence as:

2^{−3}, 2^{−2}, 2^{−1}, 2^{0}, 2^{1}, 2^{2}, 2^{3}

which combines the two sequences. This arguably was the first logarithmic table. A more definitive and practical theory of logarithms was developed by John Napier (1550–1617) in his main work (Napier 1614). He coined the term logarithm (from *logos* and *arithmos*: ratio of numbers). As is clear from the match between arithmetic and geometric progressions, logarithms reduce products to sums:

log_{b}(*xy*) = log_{b}(*x*) + log_{b}(*y*)

They also reduce divisions to differences:

log_{b}(*x*/*y*) = log_{b}(*x*) − log_{b}(*y*)

and powers to products:

log_{b}(*x*^{p}) = *p* log_{b}(*x*)

After publication of the logarithmic tables by Briggs (1624) this new technique of facilitating complex calculations rapidly gained popularity.
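
How a table of logarithms replaces multiplication by addition can be sketched in a few lines of Python. The sketch below only mimics the idea with floating-point logarithms; the function names are hypothetical, and base 10 is used as an assumption in the spirit of Briggs's tables.

```python
import math

# Stifel's pairing of an arithmetic sequence with a geometric one.
exponents = list(range(-3, 4))
powers_of_two = [2.0 ** a for a in exponents]


def multiply_via_logs(x: float, y: float) -> float:
    """Multiply two positive numbers by adding their base-10 logarithms."""
    return 10.0 ** (math.log10(x) + math.log10(y))


def divide_via_logs(x: float, y: float) -> float:
    """Divide two positive numbers by subtracting their base-10 logarithms."""
    return 10.0 ** (math.log10(x) - math.log10(y))


def power_via_logs(x: float, p: float) -> float:
    """Raise a positive number to a power by multiplying its logarithm."""
    return 10.0 ** (p * math.log10(x))


for a, value in zip(exponents, powers_of_two):
    print(a, value)
print(multiply_via_logs(123.0, 456.0))
print(divide_via_logs(56088.0, 456.0))
print(power_via_logs(2.0, 10))
```

The printed results agree with direct multiplication, division and exponentiation only up to floating-point rounding, much as a printed table is limited by its number of decimal places.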

### 3.4 Physics

Galileo (1623) already had suggested that the analysis of phenomena
like heat and pressure could be reduced to the study of movements of
elementary particles. Within the empirical methodology this could be
conceived as the question how the sensory experience of the secondary
quality of heat of an object or a gas could be reduced to movements of
particles. Bernoulli (*Hydrodynamica* published in 1738) was
the first to develop a kinetic theory of gases in which
macroscopically observable phenomena are described in terms of
microstates of systems of particles that obey the laws of Newtonian
mechanics, but it was quite an intellectual effort to come up with an
adequate mathematical treatment. Clausius (1850) made a conclusive
step when he introduced the notion of the mean free path of a particle
between two collisions. This opened the way for a statistical
treatment by Maxwell who formulated his distribution in 1857, which
was the first statistical law in physics. The definitive formula that
tied all notions together (and that is engraved on his tombstone,
though the actual formula is due to Planck) was developed by
Boltzmann:

*S* = *k* log *W*

It describes the entropy *S* of a system in terms of the
logarithm of the number of possible microstates *W*, consistent
with the observable macroscopic states of the system, where *k*
is the well-known Boltzmann constant. In all its simplicity the value
of this formula for modern science can hardly be overestimated. The
expression ‘log *W*’ can, from the perspective of
information theory, be interpreted in various ways:

- As the amount of *entropy* in the system.
- As the length of the *number* needed to count all possible microstates consistent with macroscopic observations.
- As the length of an optimal *index* we need to identify the specific current unknown microstate of the system, i.e., it is a measure of our ‘lack of information’.
- As a measure for the *probability* of any typical specific microstate of the system consistent with macroscopic observations.
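
These readings of ‘log *W*’ can be compared on a toy system. The Python sketch below replaces particles with coins, which is an assumption made only for illustration: the macrostate is the number of heads, a microstate is a specific pattern of heads and tails, and *k* is set to 1 with base-2 logarithms. The function names are hypothetical.

```python
import math


def microstate_count(n_coins: int, n_heads: int) -> int:
    """Number W of head/tail patterns that show the given number of heads."""
    return math.comb(n_coins, n_heads)


def log_w_bits(w: int) -> float:
    """log2(W): the entropy with k = 1 and base-2 logarithms."""
    return math.log2(w)


w = microstate_count(4, 2)                # macrostate: 2 heads among 4 coins
index_length = math.ceil(log_w_bits(w))   # whole bits needed to index one microstate
p_microstate = 1 / w                      # microstates assumed equally probable

# Two independent copies of the system: counts multiply, logarithms add.
w_combined = w * w

print(w, log_w_bits(w), index_length)
print(-math.log2(p_microstate))
print(log_w_bits(w_combined))
```

Under the equal-probability assumption the negative logarithm of the probability of a single microstate coincides with log *W*, and the value for the combined system is the sum of the values of its parts.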

Thus it connects the additive nature of the logarithm with the extensive qualities of entropy, probability, typicality and information and it is a fundamental step in the use of mathematics to analyze nature. Later Gibbs (1906) refined the formula:

*S* = −Σ_{*i*} *p*_{*i*} ln *p*_{*i*}

where *p*_{*i*} is the probability that the system is in the *i*^{th} microstate. This formula was adopted by Shannon (1948; Shannon & Weaver 1949) to characterize the communication entropy of a system of messages. Although there is a close connection between the mathematical treatment of entropy and information, the exact interpretation of this fact has been a source of controversy ever since (Harremoës & Topsøe 2008; Bais & Farmer 2008).

### 3.5 Logic

Dunn (2001, 2008) has pointed out that the analysis of information
in logic is intricately related to the notions of intension and
extension. The distinction between intension and extension is already
anticipated in the *Port Royal Logic* (1662) and the writings
of Mill (1843), Boole (1847) and Peirce (1868) but was systematically
introduced in logic by Frege (1879, 1892). In a modern sense the
extension of a predicate, say “*X* is a bachelor”,
is simply the set of bachelors in our domain. The intension is
associated with the meaning of the predicate and allows us to derive
from the fact that ‘John is a bachelor’ the facts that
‘John is male’ and ‘John is unmarried’. It is
clear that this phenomenon has a relation with both the possible world
interpretation of modal operators and the notion of information. A
bachelor is by necessity also male, i.e., in every possible world in
which John is a bachelor he is also male, consequently: If someone
gives me the information that John is a bachelor I get the information
that he is male and unmarried for free. The possible world
interpretation of modal operators (Kripke 1959) is related to the
notion of ‘state description’ introduced by Carnap
(1947). A state description is a conjunction that contains exactly one
of each atomic sentence or its negation. The ambition to define a good
probability measure for state descriptions was one of the motivations
for Solomonoff (1960, 1997) to develop algorithmic information theory.
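
The link between state descriptions, possible worlds and ‘free’ information can be sketched in Python. The example is a deliberately small model and rests on assumptions not in the text: three hypothetical atomic sentences about John, and a constraint that rules out state descriptions in which a bachelor is not male or is married.

```python
from itertools import product

# Hypothetical atomic sentences about John.
ATOMS = ("bachelor", "male", "married")


def state_descriptions(atoms):
    """All ways of asserting each atomic sentence or its negation."""
    return [dict(zip(atoms, values))
            for values in product((True, False), repeat=len(atoms))]


def respects_meaning(world):
    """Assumed constraint: a bachelor is male and not married."""
    return (not world["bachelor"]) or (world["male"] and not world["married"])


def entails(premise, conclusion, worlds):
    """The conclusion holds in every world in which the premise holds."""
    return all(conclusion(w) for w in worlds if premise(w))


all_states = state_descriptions(ATOMS)
possible_worlds = [w for w in all_states if respects_meaning(w)]

print(len(all_states), len(possible_worlds))
print(entails(lambda w: w["bachelor"],
              lambda w: w["male"] and not w["married"],
              possible_worlds))
print(entails(lambda w: w["male"],
              lambda w: w["bachelor"],
              possible_worlds))
```

With *n* atomic sentences there are 2^{n} state descriptions. Learning that John is a bachelor removes every admissible world in which he is not male or is married, whereas learning that he is male leaves open whether he is a bachelor.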

## 4. Developments in philosophy of Information

The modern theories of information emerged in the middle of the
20^{th} century in a specific intellectual climate in which
the distance between the sciences and parts of academic philosophy was
quite big. Some philosophers displayed a specific anti-scientific
attitude: Heidegger, “*Die Wissenschaft denkt
nicht.*” On the other hand the philosophers from the Wiener
Kreis overtly discredited traditional philosophy as dealing with
illusory problems (Carnap 1928). The research program of logical
positivism was a rigorous reconstruction of philosophy based on a
combination of empiricism and the recent advances in logic. It is
perhaps because of this intellectual climate that early important
developments in the theory of information took place in isolation from
mainstream philosophical reflection. A landmark is the work of Dretske
in the early eighties (Dretske 1981). Since the turn of the century,
interest in Philosophy of Information has grown considerably, largely
under the influence of the work of Luciano Floridi on
semantic information. Also the rapid theoretical development of quantum
computing and the associated notion of quantum information has had its
repercussions on philosophical reflection.

### 4.1 Popper: Information as degree of falsifiability

The research program of logical positivism of the Wiener Kreis in the
first half of the 20^{th} century revitalized the older
project of empiricism. Its ambition was to reconstruct scientific
knowledge on the basis of direct observations and logical relations
between statements about those observations. Kant's old criticism
of empiricism was revitalized by Quine (1951). Within the framework of
logical positivism induction was invalid and causation could never be
established objectively. In his *Logik der Forschung* (1934)
Popper formulates his well-known demarcation criterion and he
positions this explicitly as a solution to Hume's problem of induction
(Popper 1934 [1977], pg. 42). Scientific theories formulated as
general laws can never be verified definitively, but they can be
falsified by only one observation. This implies that a theory is
‘more’ scientific if it is richer and provides more
opportunity to be falsified:

Thus it can be said that the amount of empirical information conveyed by a theory, or its *empirical content*, increases with its degree of falsifiability. (Popper 1934 [1977], pg. 113, emphasis in original)

This quote, in the context of Popper's research program, shows that
the ambition to measure the *amount of empirical information in
scientific theory conceived as a set of logical statements* was
already recognized as a philosophical problem more than a decade
before Shannon formulated his theory of information. Popper is aware
of the fact that the empirical content of a theory is related to its
falsifiability and that this in its turn has a relation with the
probability of the statements in the theory. Theories with more
empirical information are less probable. Popper distinguishes
*logical probability* from *numerical probability*
(“which is employed in the theory of games and chance, and in
statistics” (Popper 1934 [1977], pg. 119, emphasis in
original)). In a passage that is programmatic for the later
development of the concept of information he defines the notion of
logical probability:

The logical probability of a statement is complementary to its falsifiability: it increases with decreasing degree of falsifiability. The logical probability 1 corresponds to the degree 0 of falsifiability and *vice versa*. (Popper 1934 [1977], pg. 119, emphasis in original)

It is possible to interpret numerical probability as applying to a subsequence (picked out from the logical probability relation) for which a *system of measurement* can be defined, on the basis of frequency estimates. (Popper 1934 [1977], pg. 119, emphasis in original)

Popper never succeeded in formulating a good formal theory to measure this amount of information although in later writings he suggests that Shannon's theory of information might be useful (Popper 1934 [1977], appendix ix (1954), pg. 404). These issues were later developed in philosophy of science. Confirmation theory studies induction and the way in which evidence ‘supports’ a certain theory (Huber 2007, Other Internet Resources). Although the work of Carnap motivated important developments in both philosophy of science and philosophy of information, the connection between the two disciplines seems to have been lost. There is no mention of information theory or any of the more foundational work in philosophy of information in Kuipers (2007a), but the two disciplines certainly have overlapping domains. (See, e.g., the discussion of the so-called Black Ravens Paradox by Kuipers (2007b) and Rathmanner & Hutter (2011).)

### 4.2 Shannon: Information defined in terms of probability

In two landmark papers Shannon (1948; Shannon & Weaver 1949)
characterized the communication entropy of a system of
messages *A*:

*H*(*P*) = −Σ_{*i*∈*A*} *p*_{*i*} log_{2} *p*_{*i*}

Here *p*_{*i*} is the probability of message *i* in *A*. This is exactly the formula for Gibbs's entropy in physics. The use of base-2 logarithms ensures that the code length is measured in bits (binary digits). It is easily seen that the communication entropy of a system is maximal when all the messages have equal probability and thus are typical.
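
The claim about the maximum can be checked on a small example. The following Python sketch computes the communication entropy in bits for two hypothetical four-message systems; the function name `entropy_bits` and the two distributions are assumptions chosen only for illustration.

```python
import math
from typing import Sequence


def entropy_bits(probs: Sequence[float]) -> float:
    """Communication entropy, in bits, of a probability distribution."""
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Messages with probability 0 contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0.0)


uniform = [0.25, 0.25, 0.25, 0.25]   # four equally probable messages
skewed = [0.7, 0.1, 0.1, 0.1]        # one message dominates
certain = [1.0]                      # a single message that always occurs

print(entropy_bits(uniform))
print(entropy_bits(skewed))
print(entropy_bits(certain))
```

For four messages the uniform distribution reaches log_{2} 4 = 2 bits, any skewed distribution stays below that value, and a system with a single certain message has entropy zero.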

The amount of information *I* in an
individual message *x* is given by:

*I*(*x*) = −log *p*_{*x*}

This formula, which can be interpreted as the inverse of the Boltzmann entropy, covers a number of our basic intuitions about information:

- A message *x* has a certain probability *p*_{*x*} between 0 and 1 of occurring.
- If *p*_{*x*} = 1 then *I*(*x*) = 0. If we are certain to get a message it literally contains no ‘news’ at all. The lower the probability of the message is, the more information it contains. A message like “The sun will rise tomorrow” seems to contain less information than the message “Jesus was Caesar” exactly because the second statement is much less likely to be defended by anyone (although it can be found on the web).
- If two messages *x* and *y* are unrelated then *I*(*x* and *y*) = *I*(*x*) + *I*(*y*). Information is *extensive*. The amount of information in two combined messages is equal to the sum of the amount of information in the individual messages.

Information as the negative log of the probability is the only
mathematical function that exactly fulfills these constraints (Cover
& Thomas 2006). Shannon offers a theoretical framework in which
binary strings can be interpreted as words in a (programming) language
containing a certain amount of information (see 3.1 Languages).
The expression −log *p*_{*x*} exactly gives the length of an optimal code
for message *x* and as such formalizes the old intuition that codes are
more efficient when frequent letters get shorter representations (see
3.2 Optimal codes). Logarithms as a reduction of multiplication to
addition (see 3.3 Numbers) are a natural representation of extensive
properties of systems and had already been used as such by physicists in
the 19^{th} century (see 3.4 Physics).

One aspect of information that Shannon's definition explicitly does not cover is the actual content of the messages interpreted as propositions. So the statements “Jesus was Caesar” and “The moon is made of green cheese” may carry the same amount of information while their meaning is totally different. A large part of the effort in philosophy of information has been directed to the formulation of more semantic theories of information (Bar-Hillel and Carnap 1953; Floridi 2002, 2003, 2011). Although Shannon's proposals at first were almost completely ignored by philosophers, it has in recent decades become apparent that their impact on philosophical issues is big. Dretske (1981) was one of the first to analyze the philosophical implications of Shannon's theory, but the exact relation between various systems of logic and the theory of information is still unclear (see 3.5 Logic).

### 4.3 Solomonoff, Kolmogorov, Chaitin: Information as the length of a program

This problem of relating a set of statements to a set of observations
and defining the corresponding probability was taken up by Carnap
(1945, 1950). He distinguished two forms of probability:
Probability_{1} or “degree of confirmation”
*P*_{1}(*h*;*e*) is a *logical*
relation between two sentences, a hypothesis *h* and a
sentence *e* reporting a series of observations. Statements of
this type are either analytical or contradictory. The second form,
Probability_{2} or “relative frequency”, is the
statistical concept. In the words of his student Solomonoff
(1997):

Carnap's model of probability started with a long sequence of symbols that was a description of the entire universe. Through his own formal linguistic analysis, he was able to assign a priori probabilities to any possible string of symbols that might represent the universe.

The method Carnap used for assigning probabilities was not universal
and depended heavily on the code systems used. A general theory of
induction using Bayes' rule can only be developed when we can assign a
universal probability to ‘any possible string’ of
symbols. In a paper in 1960 Solomonoff (1960, 1964a,b) was the first
to sketch an outline of a solution for this problem. He formulated the
notion of a *universal distribution*:

consider the set of all possible finite strings to be programs for a universal Turing machine *U* and define the probability of a string *x* of symbols in terms of the length of the shortest program *p* that outputs *x* on *U*.

This notion of Algorithmic Information Theory was invented
independently somewhat later by Kolmogorov (1965) and
Chaitin (1969). Levin (1974) developed a mathematical expression of
the universal a priori probability as a universal (that is, maximal)
lower semicomputable semimeasure *M*, and showed that the
negative logarithm of
*M*(*x*) coincides with the Kolmogorov complexity of *x* up
to an additive logarithmic term.

Algorithmic Information Theory (a.k.a. Kolmogorov complexity theory) has developed into a rich field of research with a wide range of domains of applications many of which are philosophically relevant (Li and Vitányi 1997):

- It provides us with a general theory of induction. The use of Bayes' rule allows for a modern reformulation of Ockham's razor in terms of Minimum Description Length (Rissanen 1978, 1989; Barron, Rissanen, and Yu 1998; Grünwald 2007) and minimum message length (Wallace 2005). Note that Domingos (1998) has argued against the general validity of these principles.
- It allows us to formulate probabilities and information content for individual objects. Even individual natural numbers.
- It lays the foundation for a theory of learning as data compression (Adriaans 2007).
- It gives a definition of randomness of a string in terms of incompressibility. This in itself has led to a whole new domain of research (Niess 2009; Downey & Hirschfeld 2010).
- It allows us to formulate an objective a priori measure of the predictive value of a theory in terms of its randomness deficiency: i.e., the best theory is the shortest theory that makes the data look random conditional on the theory (Vereshchagin and Vitányi 2004).

There are also down-sides:

- Algorithmic complexity is uncomputable, although it can in a lot of practical cases be approximated and commercial compression programs in some cases come close to the theoretical optimum (Cilibrasi and Vitányi 2005).
- Algorithmic complexity is an asymptotic measure (i.e., it gives a value that is correct up to a constant). In some cases the value of this constant is prohibitive for use in practical purposes.
- Although the shortest theory is always the best one in terms of randomness deficiency, incremental compression of data-sets is in general not a good learning strategy since the randomness deficiency does not decrease monotonically with the compression rate (Adriaans and Vitányi 2009).
- The generality of the definitions provided by Algorithmic Information Theory depends on the generality of the concept of a universal Turing machine and thus ultimately on the interpretation of the Church-Turing thesis.
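
The first point in this list, approximation by practical compressors, can be illustrated with Python's standard `zlib` module. The sketch below rests on stated assumptions: the compressed length is used only as a rough upper bound on algorithmic complexity (up to an additive constant for the decompressor), the two test strings are arbitrary choices, and the helper name `compressed_length` is hypothetical.

```python
import os
import zlib


def compressed_length(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data at the highest level."""
    return len(zlib.compress(data, 9))


regular = b"ab" * 5000            # 10,000 bytes with an obvious short description
random_like = os.urandom(10000)   # 10,000 bytes from the system's random source

print(len(regular), compressed_length(regular))
print(len(random_like), compressed_length(random_like))
```

The regular string shrinks to a small fraction of its length, while the random bytes do not shrink at all. This matches the definition of randomness as incompressibility, although a real compressor can only ever show that a string is compressible, never that it is not.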

Algorithmic Information Theory has gained rapid acceptance as a
fundamental theory of information. The well-known introduction
to information theory by Cover and Thomas (2006) states:
“… we consider Kolmogorov complexity (i.e., AIT) to be
more fundamental than Shannon entropy” (pg 3).

The idea that algorithmic complexity theory is a foundation for a general theory of artificial intelligence (and theory of knowledge) has already been suggested by Solomonoff (1997) and Chaitin (1987). Several authors have defended the view that data compression is a general principle that governs human cognition (Chater & Vitányi 2003; Wolff 2006). Hutter (2005, 2007a,b) argues that Solomonoff's formal and complete theory essentially solves the induction problem. Hutter (2007a) and Rathmanner & Hutter (2011) enumerate a plethora of classical philosophical and statistical problems around induction and claim that Solomonoff's theory solves or avoids all these problems. Probably because of its technical nature, the theory has been largely ignored by the philosophical community. Yet it stands out as one of the most fundamental contributions to information theory in the 20th century, and it is clearly relevant for a number of philosophical issues, such as the problem of induction.

### 4.4 Applications

The first domain that could benefit from philosophy of information is of course philosophy itself. The concept of information potentially has an impact on almost all main philosophical disciplines, ranging from logic and theory of knowledge to ontology and even ethics and esthetics (see introduction above). Philosophy of science and philosophy of information, with their shared interest in the problem of induction and theory formation, could probably both benefit from closer cooperation (see 4.1 Popper: Information as degree of falsifiability). The concept of information plays an important role in the history of philosophy that is not completely understood (see 2. History of the term and the concept of information).

As information has become a central issue in almost all of the sciences and humanities, this development will also impact philosophical reflection in these areas. Archaeologists, linguists, physicists and astronomers all deal with information. The first thing a scientist has to do before formulating a theory is to gather information. The application possibilities are abundant. Data mining and the handling of extremely large data sets seem to be essential for almost every empirical discipline in the 21st century.

In biology we have found out that information is essential for the organization of life itself and for the propagation of complex organisms (see entry on biological information). One of the main problems is that current models do not explain the complexity of life well. Valiant has started a research program that studies evolution as a form of computational learning (Valiant 2007) in order to explain this discrepancy. Aaronson (2011, Other Internet Resources) has argued explicitly for a closer cooperation between complexity theory and philosophy.

Until recently the general opinion was that the various notions of information were more or less isolated, but in recent years considerable progress has been made in the understanding of the relationship between these concepts. Cover and Thomas (2006), for instance, see a perfect match between Kolmogorov complexity and Shannon information. Similar observations have been made by Grünwald and Vitányi (2008). The connections between thermodynamics and information theory have also been studied (Bais and Farmer 2008; Harremoës & Topsøe 2008), and it is clear that the connections between physics and information theory are much more elaborate than a mere ad hoc similarity between the formal treatment of entropy and information suggests (Gell-Mann & Lloyd 2003; Verlinde 2010 (Other Internet Resources)). A unified theory of information, however, seems beyond our reach at this moment.

## 5. Conclusion

The notion of information has become central both in our society and in the sciences. Information technology plays a central role in the way we organize our lives. Information has also become a central category in the sciences and the humanities. Philosophy of information, both as a historical and a systematic discipline, offers a new perspective on old philosophical problems and also suggests some new research domains. A deeper analysis of some of the more technical problems concerning the philosophical analysis of information is given in the supplementary document Open Problems in the Study of Information and Computation.

## Bibliography

- Adriaans, P.W., 2007, “Learning as Data Compression,” in S. B. Cooper, B. Löwe & A. Sorbi, *Computation and Logic in the Real World* (Lecture Notes in Computer Science: Volume 449), Berlin: Springer, pp. 11–24.
- –––, 2008, “Between Order and Chaos: The Quest for Meaningful Information,” *Theory of Computing Systems* (Special Issue: Computation and Logic in the Real World; Guest Editors: S. Barry Cooper, Elvira Mayordomo and Andrea Sorbi), 45 (July): 650–674.
- Adriaans, P.W. and J.F.A.K. van Benthem, 2008a, “Information is what information does,” in Adriaans and van Benthem 2008b.
- ––– (eds.), 2008b, *Handbook of Philosophy of Information*, Elsevier Science Publishers.
- Adriaans, P. and P.M.B. Vitányi, 2009, “Approximation of the Two-Part MDL Code,” *IEEE Transactions on Information Theory*, 55(1): 444–457.
- Aristotle. *Aristotle in 23 Volumes*, Vols. 17, 18, translated by Hugh Tredennick. Cambridge, MA, Harvard University Press; London, William Heinemann Ltd. 1933, 1989.
- Antunes, L. and L. Fortnow, 2003, “Sophistication Revisited,” in *Proceedings of the 30th International Colloquium on Automata, Languages and Programming* (Lecture Notes in Computer Science: Volume 2719), Berlin: Springer, pp. 267–277.
- Antunes, L., L. Fortnow, D. Van Melkebeek and N. V. Vinodchandran, 2006, “Computational depth: Concept and application,” *Theoretical Computer Science*, volume 354.
- Aquinas, St. Thomas, 1265–1274, *Summa Theologiae*.
- Arbuthnot, J., 1692, *Of the Laws of Chance, or, a method of Calculation of the Hazards of Game, Plainly demonstrated, And applied to Games as present most in Use*, translation of Huygens’ *De Ratiociniis in Ludo Aleae*.
- Austen, J., 1815, *Emma*, London: Richard Bentley and Son.
- Bar-Hillel, Y. and R. Carnap, 1953, “Semantic Information,” *The British Journal for the Philosophy of Science*, 4(14): 147–157.
- Bais, F.A. and J.D. Farmer, 2008, “The Physics of Information,” in Adriaans and van Benthem 2008b.
- Barron, A., J. Rissanen, and B. Yu, 1998, “The minimum description length principle in coding and modeling,” *IEEE Transactions on Information Theory*, 44(6): 2743–2760.
- Barwise, J. and J. Perry, 1983, *Situations and Attitudes*, Cambridge, MA: MIT Press.
- Bennett, C. H., 1988, “Logical depth and physical complexity,” in R. Herken (ed.), *The Universal Turing Machine: A Half-Century Survey*, Oxford: Oxford University Press, pp. 227–257.
- van Benthem, J.F.A.K., 1990, “Kunstmatige Intelligentie: Een Voortzetting van de Filosofie met Andere Middelen,” *Algemeen Nederlands Tijdschrift voor Wijsbegeerte*, 82: 83–100.
- –––, 2006, “Epistemic Logic and Epistemology: the state of their affairs,” *Philosophical Studies*, 128: 49–76.
- van Benthem, J.F.A.K. and R. van Rooij, eds., 2003, “Connecting the Different Faces of Information,” *Journal of Logic, Language and Information*, 12(4): 375–379.
- Berkeley, G., 1732, *Alciphron: Or the Minute Philosopher*, Edinburgh: Thomas Nelson, 1948–57.
- Birkhoff, G.D., 1950, *Collected Mathematical Papers*, New York: American Mathematical Society.
- Boole, G., 1847, *Mathematical Analysis of Logic*, Cambridge: Macmillan, Barclay, & Macmillan. [available online].
- Bovens, L. and S. Hartmann, 2003, *Bayesian epistemology*, Oxford: Oxford University Press.
- Briggs, H., 1624, *Arithmetica Logarithmica*, London: Gulielmus Iones.
- Capurro, R., 1978, *Information. Ein Beitrag zur etymologischen und ideengeschichtlichen Begründung des Informationsbegriffs* [Information: A contribution to the foundation of the concept of information based on its etymology and in the history of ideas]. Munich, Germany: Saur. [available online].
- –––, 2009, “Past, present and future of the concept of information,” *tripleC* (*Cognition, Communication, Co-operation*), 7(2): 125–141.
- Capurro, R. & B. Hjørland, 2003, “The Concept of Information,” in Blaise Cronin (ed.), *Annual Review of Information Science and Technology (ARIST)*, 37 (Chapter 8), 343–411.
- Carnap, R., 1928, *Scheinprobleme in der Philosophie* (Pseudoproblems of Philosophy). Berlin: Weltkreis-Verlag.
- –––, 1945, “The Two Concepts of Probability: The Problem of Probability,” *Philosophy and Phenomenological Research*, 5(4): 513–532.
- –––, 1947, *Meaning and Necessity*, Chicago: The University of Chicago Press.
- –––, 1950, *Logical Foundations of Probability*, Chicago: The University of Chicago Press.
- Chaitin, G. J., 1969, “On the length of programs for computing finite binary sequences: statistical considerations,” *J. Assoc. Comput. Mach.*, 16: 145–159.
- –––, 1987, *Algorithmic information theory*, New York: Cambridge University Press.
- Chater, N. and P.M.B. Vitányi, 2003, “Simplicity: a unifying principle in cognitive science,” *Trends in Cognitive Science*, 7(1): 19–22.
- Cilibrasi, R. and P.M.B. Vitányi, 2005, “Clustering by compression,” *IEEE Transactions on Information Theory*, 51(4): 1523–1545.
- Clausius, R., 1850, “Über die bewegende Kraft der Wärme und die Gesetze welche sich daraus für die Wärmelehre selbst ableiten lassen,” *Poggendorffs Annalen der Physik und Chemie*, 79: 368–97.
- Conan Doyle, A., 1892, *The Adventures of Sherlock Holmes*, George Newnes Ltd.
- Cover, T.M. and J.A. Thomas, 2006, *Elements of Information Theory*, 2nd edition, New York: John Wiley & Sons.
- Crawford, J.M. and L.D. Auton, 1993, “Experimental Results on the Cross over Point in Satisfiability Problems,” *Proceedings of the Eleventh National Conference on Artificial Intelligence*, AAAI Press, pp. 21–27.
- Crutchfield, J.P. and K. Young, 1989, “Inferring Statistical Complexity,” *Physical Review Letters*, 63: 105.
- –––, 1990, “Computation at the Onset of Chaos,” in *Entropy, Complexity, and the Physics of Information*, W. Zurek, editor, SFI Studies in the Sciences of Complexity, VIII, Reading, MA: Addison-Wesley, pp. 223–269.
- Defoe, D., 1719, *The Life and Strange Surprising Adventures of Robinson Crusoe of York, Mariner: who lived Eight and Twenty Years, all alone in an uninhabited Island on the coast of America, near the Mouth of the Great River of Oroonoque; Having been cast on Shore by Shipwreck, wherein all the Men perished but himself. With An Account how he was at last as strangely deliver'd by Pirates. Written by Himself*, London: W. Taylor.
- Dershowitz, N. and Y. Gurevich, 2008, “A Natural Axiomatization of Computability and Proof of Church's Thesis,” *Bulletin of Symbolic Logic*, 14(3): 299–350.
- Descartes, R., 1641, *Meditationes de Prima Philosophia* (Meditations on First Philosophy), Paris.
- –––, 1647, *Discours de la Méthode* (Discourse on Method), Leiden.
- Devlin, K. and D. Rosenberg, 2008, “Information in the Study of Human Interaction,” in Adriaans and van Benthem 2008b.
- Dictionnaire du Moyen Français (1330–1500) 2010, [available online]
- Domingos, P., 1998, “Occam's Two Razors: The Sharp and the Blunt,” in *Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining* (KDD–98), New York: AAAI Press, pp. 37–43.
- Downey, R.G. and D.R. Hirschfeldt, 2010, *Algorithmic Randomness and Complexity* (Series: Theory and Applications of Computability), New York: Springer.
- Dretske, F., 1981, *Knowledge and the Flow of Information*, Cambridge, MA: The MIT Press.
- Dufort, P.A. and C.J. Lumsden, 1994, “The Complexity and Entropy of Turing Machines,” Workshop on Physics and Computation. *PhysComp '94 Proceedings*, 227–232.
- Dunn, J.M., 2001, “The Concept of Information and the Development of Modern Logic,” in *Non-classical Approaches in the Transition from Traditional to Modern Logic*, W. Stelzner (ed.), de Gruyter, pp. 423–427.
- –––, 2008, “Information in computer science,” in Adriaans and van Benthem 2008b.
- Dijksterhuis, E. J., 1986, *The Mechanization of the World Picture: Pythagoras to Newton*, Princeton University Press.
- Duns Scotus, *Opera Omnia* (“The Wadding edition”), Lyon, 1639; reprinted Hildesheim: Georg Olms Verlagsbuchhandlung, 1968.
- Edwards, P., 1967, *The Encyclopedia of Philosophy*, Macmillan Publishing Company.
- Fisher, R.A., 1925, “Theory of statistical estimation,” *Proceedings Cambridge Philosophical Society*, 22(5): 700–725.
- Floridi, L., 1999, “Information Ethics: On the Theoretical Foundations of Computer Ethics,” *Ethics and Information Technology*, 1(1): 37–56.
- –––, 2002, “What Is the Philosophy of Information?” *Metaphilosophy*, 33(1–2): 123–145.
- –––, ed., 2003, *The Blackwell Guide to the Philosophy of Computing and Information*, Blackwell, Oxford.
- –––, 2011, *The Philosophy of Information*, Oxford University Press.
- Frege, G., 1879, *Begriffsschrift: eine der arithmetischen nachgebildete Formelsprache des reinen Denkens*, Halle.
- –––, 1892, “Über Sinn und Bedeutung,” *Zeitschrift für Philosophie und philosophische Kritik*, NF 100.
- Galileo Galilei, 1623, *Il Saggiatore* (in Italian) (Rome); *The Assayer*, English trans. Stillman Drake and C. D. O'Malley, in *The Controversy on the Comets of 1618* (University of Pennsylvania Press, 1960).
- Garey, M.R. and D.S. Johnson, 1979, *Computers and Intractability*, W.H. Freeman & Co.
- Gell-Mann, M. and S. Lloyd, 2003, “Effective Complexity,” *Working papers Santa Fe Institute*, 387–398.
- Gibbs, J.W., 1906, *The scientific papers of J. Willard Gibbs in Two Volumes*, 1. Longmans, Green, and Co.
- Godefroy, F.G., 1881, *Dictionnaire de l'ancienne langue française et de tous ses dialectes du 9e au 15e siècle*, Paris: F. Vieweg.
- Grünwald, P.D., 2007, *The Minimum Description Length Principle*, MIT Press.
- Grünwald, P. and P.M.B. Vitányi, 2008, “Algorithmic Information Theory,” in Adriaans and van Benthem 2008b.
- Harremoës, P. and F. Topsøe, 2008, “The quantitative theory of information,” in Adriaans and van Benthem 2008b.
- Hazard, P., 1935, *La Crise de la conscience européenne*, Paris.
- Hintikka, J., 1962, *Knowledge and Belief*, Cornell University Press, Ithaca.
- –––, 1973, *Logic, Language Games, and Information*, Clarendon, Oxford.
- Hume, D., 1739–40, *A Treatise of Human Nature*.
- –––, 1748, *An Enquiry concerning Human Understanding*, P.F. Collier & Son. 1910, ISBN 0198250606. [available online]
- Hutter, M., 2005, *Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability*, EATCS Book, Berlin: Springer.
- –––, 2007a, “On Universal Prediction and Bayesian Confirmation,” *Theoretical Computer Science*, 384(1): 33–48.
- –––, 2007b, “Algorithmic Information Theory: a brief non-technical guide to the field,” *Scholarpedia*, 2(3): 2519.
- –––, 2010, “A Complete Theory of Everything (will be subjective),” *Algorithms*, 3(4): 329–350.
- Ibn Tufail, *Hayy ibn Yaqdhan*, translated as *Philosophus Autodidactus*, published by Edward Pococke the Younger in 1671.
- Kahn, D., 1967, *The Code-Breakers, The Comprehensive History of Secret Communication from Ancient Times to the Internet*, New York: Scribner.
- Kolmogorov, A.N., 1965, “Three Approaches to the Quantitative Definition of Information,” *Problems Inform. Transmission*, 1(1): 1–7.
- Koppel, M., 1987, “Complexity, Depth, and Sophistication,” in *Complex Systems*, 1(6): 1087–1091.
- Kripke, S.A., 1959, “A Completeness Theorem in Modal Logic,” *The Journal of Symbolic Logic*, 24(1): 1–14.
- Kuipers, Th.A.F. (ed.), 2007a, *General Philosophy of Science*, Amsterdam: Elsevier Science Publishers.
- Kuipers, Th.A.F., 2007b, “Explanation in Philosophy of Science,” in Kuipers 2007a.
- Langton, C.G., 1990, “Computation at the edge of chaos: Phase Transitions and Emergent Computation,” *Physica D*, 42(1–3): 12–37.
- Lenski, W., 2010, “Information: a conceptual investigation,” *Information 2010*, 1(2): 74–118.
- Levin, L.A., 1974, “Laws of information conservation (non-growth) and aspects of the foundation of probability theory,” *Problems Information Transmission*, 10(3): 206–210.
- Li, M. and P.M.B. Vitányi, 2008, *An introduction to Kolmogorov complexity and its applications*, Berlin: Springer-Verlag, third edition.
- Lloyd, S. and J. Ng, 2004, “Black Hole Computers,” *Scientific American*, 291(5): 30–39.
- Locke, J., 1689, *An Essay Concerning Human Understanding*, J. W. Yolton (ed.), London: Dent; New York: Dutton, 1961.
- Mill, J.S., 1843, *A System of Logic*, London.
- Napier, J., 1614, *Mirifici Logarithmorum Canonis Descriptio*, Edinburgh: Andre Hart. [translation available online].
- Von Neumann, J., 1955, *Mathematische Grundlagen der Quantenmechanik*, Berlin: Springer.
- Nielsen, M.A. and I.L. Chuang, 2000, *Quantum Computation and Quantum Information*, Cambridge: Cambridge University Press.
- Nies, A., 2009, *Computability and Randomness* (Oxford Logic Guides 51), Oxford: Oxford University Press.
- Ong, W. J., 1958, 2004, *Ramus, Method, and the Decay of Dialogue, From the Art of Discourse to the Art of Reason*, Chicago: University of Chicago Press.
- Parikh, R. and R. Ramanujam, 2003, “A Knowledge Based Semantics of Messages,” *Journal of Logic, Language and Information*, 12: 453–467.
- Peirce, C. S., 1868, “Upon Logical Comprehension and Extension,” *Proceedings of the American Academy of Arts and Sciences*, 7: 416–432.
- –––, 1886, Letter, Peirce to A. Marquand, dated 1886, W 5:541–3, Google Preview. See Burks, Arthur W., Review: Charles S. Peirce, The new elements of mathematics, *Bulletin of the American Mathematical Society*, 84(5) (1978): 913–18.
- van Peursen, C.A., 1987, “Christian Wolff's Philosophy of Contingent Reality,” *Journal of the History of Philosophy*, 25(1): 69–82.
- Popper, K., 1934, *The Logic of Scientific Discovery* (*Logik der Forschung*), English translation 1959, London: Hutchinson, 1977.
- Putnam, H., 1988, *Representation and reality*, Cambridge: The MIT Press.
- Quine, W.V.O., 1951, “Two Dogmas of Empiricism,” *The Philosophical Review*, 60: 20–43. Reprinted in his 1953 *From a Logical Point of View*, Harvard University Press.
- Rathmanner, S. and M. Hutter, 2011, “A Philosophical Treatise on Universal Induction,” *Entropy*, 13(6): 1076–1136.
- Redei, M. and M. Stoeltzner, eds., 2001, *John von Neumann and the Foundations of Quantum Physics*, Dordrecht: Kluwer Academic Publishers.
- Rissanen, J.J., 1978, “Modeling by Shortest Data Description,” *Automatica*, 14(5): 465–471.
- –––, 1989, *Stochastic Complexity in Statistical Inquiry*, World Scientific Series in Computer Science, 15, Singapore: World Scientific.
- van Rooij, R., 2004, “Signalling games select Horn strategies,” *Linguistics and Philosophy*, 27: 493–527.
- Schmidhuber, J., 1997a, “Low-Complexity Art,” *Leonardo, Journal of the International Society for the Arts, Sciences, and Technology*, 30(2): 97–103, MIT Press.
- –––, 1997b, “A Computer Scientist's View of Life, the Universe, and Everything,” *Lecture Notes in Computer Science*, 1337: 201–208.
- Schnelle, H., 1976, “Information,” in J. Ritter (ed.), *Historisches Wörterbuch der Philosophie*, IV [Historical dictionary of philosophy, IV] (pp. 116–117). Stuttgart, Germany: Schwabe.
- Searle, J.R., 1990, “Is the Brain a Digital Computer?” *Proceedings and Addresses of the American Philosophical Association*, 64: 21–37.
- Seiffert, H., 1968, *Information über die Information* [Information about information] Munich: Beck.
- Shannon, C., 1948, “A Mathematical Theory of Communication,” *Bell System Technical Journal*, 27: 379–423, 623–656.
- Shannon, C. E. and W. Weaver, 1949, *The Mathematical Theory of Communication*, Urbana: University of Illinois Press.
- Simon, J.C. and Olivier Dubois, 1989, “Number of Solutions of Satisfiability Instance—Applications to Knowledge Bases,” *International Journal of Pattern Recognition and Artificial Intelligence* (IJPRAI), 3(1): 53–65.
- Singh, S., 1999, *The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography*, New York: Anchor Books.
- Solomonoff, R.J., 1960, “A preliminary report on a general theory of inductive inference,” Technical Report ZTB-138, Zator.
- –––, 1964a, “A Formal Theory of Inductive Inference Part I,” *Information and Control*, 7(1): 1–22.
- –––, 1964b, “A Formal Theory of Inductive Inference Part II,” *Information and Control*, 7(2): 224–254.
- –––, 1997, “The Discovery of Algorithmic Probability,” *Journal of Computer and System Sciences*, 55(1): 73–88.
- Stalnaker, R., 1984, *Inquiry*, Cambridge, MA: MIT Press.
- Stifel, M., 1544, *Arithmetica integra*, Nuremberg: Johan Petreium.
- Tarski, A., 1944, “The Semantic Conception of Truth,” *Philosophy and Phenomenological Research*, 4: 13–47.
- Valiant, L. G., 2007, “Evolvability,” *Journal of the ACM*, 56(1): Article 3.
- Vereshchagin, N.K. and P.M.B. Vitányi, 2004, “Kolmogorov's Structure functions and model selection,” *IEEE Transactions on Information Theory*, 50(12): 3265–3290.
- Vitányi, P.M.B., 2006, “Meaningful information,” *IEEE Transactions on Information Theory*, 52(10): 4617–4626. [available online].
- de Vogel, C.J., 1974, *Plato: De filosoof van het transcendente*, Baarn: Het Wereldvenster, 1968.
- Wallace, C. S., 2005, *Statistical and Inductive Inference by Minimum Message Length*, Springer, Berlin.
- Wheeler, J. A., 1990, “Information, physics, quantum: The search for links,” in W. Zurek (ed.), *Complexity, Entropy, and the Physics of Information*, Redwood City, CA: Addison-Wesley.
- Windelband, W., 1921, *Lehrbuch der Geschichte der Philosophie*, Tübingen.
- Wolff, J.G., 2006, *Unifying Computing and Cognition*, CognitionResearch.org.uk.
- Wolfram, S., 2002, *A New Kind of Science*, Wolfram Media Inc.
- Wolpert, D.H. and W. Macready, 2007, “Using self-dissimilarity to quantify complexity,” *Complexity*, 12(3): 77–85.
- Zuse, K., 1969, *Rechnender Raum*, Friedrich Vieweg & Sohn, Braunschweig. Translated as “Calculating Space,” MIT Technical Translation AZT-70-164-GEMIT, MIT (Proj. MAC), Cambridge, MA, Feb. 1970.

## Other Internet Resources

- Aaronson, S., 2006, Reasons to Believe, *Shtetl-Optimized* blog post.
- Aaronson, S., 2011, Why Philosophers Should Care About Computational Complexity, in PhilSci Archive.
- al-Khwarizmi, ca. 820 CE, *Hisab al-jabr w'al-muqabala, Kitab al-Jabr wa-l-Muqabala* (The Compendious Book on Calculation by Completion and Balancing), see The Algebra of Mohammed Ben Musa.
- Bekenstein, J.D., 1994, “Do We Understand Black Hole Entropy?” Plenary talk at Seventh Marcel Grossmann meeting at Stanford University. arXiv:gr-qc/9409015. [available online].
- Cook, S., 2000, The P versus NP Problem, Clay Mathematics Institute; The Millennium Prize Problem.
- Huber, F., 2007, Confirmation and Induction, entry in the *Internet Encyclopedia of Philosophy*.
- Sajjad, H. Rizvi, 2006, Avicenna/Ibn Sina, entry in the *Internet Encyclopedia of Philosophy*.
- Verlinde, E. P., 2010, “On the Origin of Gravity and the Laws of Newton,” arXiv:1001.0785 [hep-th]. [available online].
- Computability – What would it mean to disprove Church-Turing thesis?, discussion on Theoretical Computer Science StackExchange.
- P versus NP Problem, entry in *Wikipedia*.