Coherentist Theories of Epistemic Justification

First published Tue Nov 11, 2003; substantive revision Tue Mar 9, 2021

According to the coherence theory of justification, also known as coherentism, a belief or set of beliefs is justified, or justifiably held, just in case the belief coheres with a set of beliefs, the set forms a coherent system or some variation on these themes. The coherence theory of justification should be distinguished from the coherence theory of truth. The former is a theory of what it means for a belief or a set of beliefs to be justified, or for a subject to be justified in holding the belief or set of beliefs. The latter is a theory of what it means for a belief or proposition to be true. Modern coherence theorists, in contrast to some earlier writers in the British idealist tradition, typically subscribe to a coherence theory of justification without advocating a coherence theory of truth. Rather, they either favor a correspondence theory of truth or take the notion of truth for granted, at least for the purposes of their epistemological investigations. This does not prevent many authors from claiming that coherence justification is an indication or criterion of truth.

1. Coherentism Versus Foundationalism

A central problem in epistemology is to explain when we are justified in holding a proposition to be true. It is not at all evident what epistemic justification is, and classical accounts of that notion have turned out to be severely problematic. A tradition inspired by Descartes holds that justified beliefs are those that are either self-evidently true or deduced from self-evident truths. But, as is often argued, little of what we take ourselves to justifiably believe satisfies these austere conditions: many of our apparently justified beliefs, it is commonly thought, are neither based on self-evident truths nor derivable in a strict logical sense from other things we believe in. Thus, the Cartesian rationalist picture of justification seems far too restrictive. Similar problems hound empiricist attempts to ground all our knowledge in the allegedly indubitable data of the senses. Depending on how they are understood, sense data are either not indubitable or else not informative enough to justify a sufficient portion of our purported knowledge. The exact characterization of foundationalism is a somewhat contentious issue. There is another form of foundationalism according to which some beliefs have some non-doxastic source of epistemic support that requires no support of its own. This support can be defeasible and it can require supplementation to be strong enough for knowledge. This sort of non-doxastic support would terminate the regress of justification. To do so it may not have to appeal to self-evidence, indubitability or certainty. Such foundationalist views vary on the source of the non-doxastic support, how strong the support is on its own, and what role in justification coherence plays, if any. Some critics of this position have questioned the intelligibility of the non-doxastic support relation. Thus, Davidson (1986) complains that advocates have been unable to explain the relation between experience and belief that allows the first to justify the second.

The difficulties pertaining to both rationalism and empiricism about justification have led many epistemologists to think that there must be something fundamentally wrong with the way in which the debate has been framed, prompting their rejection of the foundationalist justificatory structure underlying rationalism and empiricism alike. Rather than conceiving the structure of our knowledge on the model of Euclidean geometry, with its basic axioms and derived theorems, these epistemologists favor a holistic picture of justification which does not distinguish between basic or foundational and non-basic or derived beliefs, treating rather all our beliefs as equal members of a “web of belief” (Quine and Ullian 1970, cf. Neurath 1983/1932 and Sosa 1980).

The mere rejection of foundationalism is not itself an alternative theory because it leaves us with no positive account of justification, save a suggestive metaphor about webs of belief. A more substantial contrasting proposal is that what justifies our beliefs is ultimately the way in which they hang together or dovetail so as to produce a coherent set. As Davidson puts it, “[w]hat distinguishes a coherence theory is simply the claim that nothing can count as a reason for a belief except another belief” (Davidson, 1986). The fact that our beliefs cohere can establish their truth, even though each individual belief may lack justification entirely if considered in splendid isolation, or so it is thought. Following C. I. Lewis (1946), some proponents think of this situation as analogous to how agreeing testimonies in court can lead to a verdict although each testimony by itself would be insufficient for that purpose.

There is a serious objection that any coherence theory of justification or knowledge must immediately face. It is called the isolation objection: how can the mere fact that a system is coherent, if the latter is understood as a purely system-internal matter, provide any guidance whatsoever to truth and reality? Since a coherence theory, in its basic form, does not assign any essential role to experience, there is little reason to think that a coherent system of belief will accurately reflect the external world. A variation on this theme is presented by the equally notorious alternative systems objection. For each coherent system of beliefs there exist, conceivably, other systems that are equally coherent yet incompatible with the first system. If coherence is sufficient for justification, then all these incompatible systems would be justifiably held. But this observation, of course, thoroughly undermines any claim suggesting that coherence is indicative of truth.

As we shall see, most, if not all, influential coherence theorists try to avoid these traditional objections by assigning some beliefs that are close to experience a special role, whether they are called “supposed facts asserted” (Lewis, 1946), “truth-candidates” (Rescher, 1973), “cognitively spontaneous beliefs” (BonJour, 1985) or something else. Depending on how this special role is construed, these theories are often classified as versions of weak foundationalism. An advocate of weak foundationalism typically holds that while coherence is incapable of justifying beliefs from scratch, it can provide justification for beliefs that already have some initial, perhaps minuscule, degree of warrant, e.g., for observational beliefs.

A fair number of distinguished contemporary philosophers have declared that they advocate a coherence theory of justification. Apart from this superficial fact, these theories often address some rather diverse issues loosely united by the fact that they in one way or the other take a holistic approach to the justification of beliefs. Here are some of the problems and questions that have attracted the attention of coherence theorists (cf. Bender, 1989):

  • How can a regress of justification be avoided?
  • How can we gain knowledge given that our information sources (senses, testimony etc.) are not reliable?
  • How can we know anything at all given that we do not even know whether our own beliefs or memories are reliable?
  • Given a set of beliefs and a new piece of information (typically an observation), when is a person justified in accepting that information?
  • What should a person believe if confronted with a possibly inconsistent set of data?

The fact that these separate, though related, issues are not always clearly distinguished presents a challenge to the reader of the relevant literature.

Although the regress problem is not a central contemporary issue, it is helpful to explain coherence theories as responses to the problem. This will also serve to illustrate some challenges that a coherence theory faces. We will then turn to the concept of coherence itself as that concept is traditionally conceived. Unfortunately, not all prominent authors associated with the coherence theory use the term coherence in this traditional sense, and the section that follows is devoted to such non-standard coherence theories. Arguably the most systematic and prolific discussion of the coherence theory of justification has focused on the relationship between coherence and probability. The rest of this entry will be devoted to this development, which took off in the mid-1990s inspired by seminal work by C. I. Lewis (1946). The development has given us precise and sophisticated definitions of coherence as well as detailed studies of the relationship between coherence and truth (probability), culminating in some potentially disturbing impossibility results that cast doubt on the possibility of defining coherence in a way that makes it indicative of truth. More precise descriptions of key entailments of these results, and ways to address the worries that they raise, will be discussed in later sections of this entry.

2. The Regress Problem

On the traditional justified true belief account of knowledge, a person cannot be said to know that a proposition \(p\) is true without having good reasons for believing that \(p\) is true. If Lucy knows that she will pass tomorrow’s exam, she must have good reasons for thinking that this is so. Consider now Lucy’s reasons. They will presumably consist of other beliefs she has, e.g., beliefs about how well she did earlier, about how well she has prepared, and so on. For Lucy to know that she will pass the exam, these other beliefs, upon which the first belief rests, must also be things that Lucy knows. Knowledge, after all, cannot be based on something less than knowledge, i.e., on ignorance (cf. Rescher 1979, 76). Since the reasons are themselves things that Lucy knows, those reasons must in turn be based on reasons, and so on. Thus, any knowledge claim requires a never-ending chain, or “regress”, of reasons for reasons. This seems strange, or even impossible, because it involves reference to an infinite number of beliefs. But most of us think that knowledge is possible.

What is the coherentist’s response to the regress? The coherentist can be understood as proposing that nothing prevents the regress from proceeding in a circle. Thus, \(A\) can be a reason for \(B\) which is a reason for \(C\) which is a reason for \(A\). If this is acceptable, then what we have is a chain of reasons that is never-ending but which does not involve an infinite number of beliefs. It is never-ending in the sense that for each belief in the chain there is a reason for that belief also in the chain. Yet there is an immediate problem with this response due to the fact that justificatory circles are usually thought to be vicious ones. If someone claims \(C\) and is asked why she believes it, she may reply that her reason is \(B\). If asked why she believes \(B\), she may assert \(A\). But if prompted to justify her belief in \(A\), she is not allowed to refer back to \(C\) which in the present justificatory context is still in doubt. If she did justify \(A\) in terms of \(C\) nonetheless, her move would lack any justificatory force whatsoever.

The coherentist may respond by denying that she ever intended to suggest that circular reasoning is a legitimate dialectical strategy. What she objects to is rather the assumption that justification should at all proceed in a linear fashion whereby reasons are given for reasons, and so on. This assumption of linearity presupposes that what is, in a primary sense, justified are individual beliefs. This, says the coherentist, is simply wrong: it is not individual beliefs that are primarily justified, but entire belief systems. Particular beliefs can also be justified but only in a secondary or derived sense, if they form part of a justified belief system. This is a coherence approach because what makes a belief system justified, on this view, is precisely its coherence. A belief system is justified if it is coherent to a sufficiently high degree. This, in essence, is Laurence BonJour’s 1985 solution to the regress problem.

This looks much more promising than the circularity theory. If epistemic justification is holistic in this sense, then a central assumption behind the regress is indeed false, and so the regress never gets started. Even so, this holistic approach raises many new questions to which the coherentist will need to respond. First of all, we need to get clearer on what the concept of coherence involves as that concept is applied to a belief system. This is the topic of the next section. Second, the proposal that a singular belief is justified merely in virtue of being a member of a justified totality can be questioned because, plausibly, a belief can be a member of a sufficiently coherent system without in any way adding to the coherence of that system, e.g., if the belief is the only member which does not quite fit in an otherwise strikingly coherent system. Surely, a belief will have to contribute to the coherence of the system in order to become justified by that system. A particular belief needs, in other words, to cohere with the system of which it is a member if that belief is to be considered justified. We will turn to this issue in section 4, in connection with Keith Lehrer’s epistemological work. Finally, we have seen that most coherence theories assign a special role to some beliefs that are close to experience in order to avoid the isolation and alternative systems objections. This fact raises the question of what status those special beliefs have. Do they have to have some credibility in themselves or can they be totally lacking therein? A particularly clear debate on this topic is the Lewis-BonJour controversy over the possibility of justification by coherence from scratch, which we will examine more closely in section 5.

3. Traditional Accounts of Coherence

By a traditional account of coherence we will mean one which construes coherence as a relation of mutual support or agreement among given data (propositions, beliefs, memories, testimonies etc.). Early characterizations were given by, among others, Brand Blanshard (1939) and A. C. Ewing (1934). According to Ewing, a coherent set is characterized partly by consistency and partly by the property that every belief in the set follows logically from the others taken together. Thus, a set such as \(\{A_1, A_2, A_1 \land A_2\}\), if consistent, is highly coherent on this view because each element follows by logical deduction from the rest in concert.

While Ewing’s definition is admirably precise, it defines coherence too narrowly. Few belief sets that occur naturally in everyday life satisfy the austere second part of his definition: the requirement that each element follow logically from the rest when combined. Consider, for instance, the set consisting of propositions \(A, B\) and \(C\), where

\(A =\)“John was at the crime scene at the time of the robbery”
\(B =\)“John owns a gun of the type used by the robber”
\(C =\)“John deposited a large sum of money in his bank account the next day”

This set is intuitively coherent, and yet it fails to satisfy Ewing’s second condition. The proposition \(A\), for instance, does not follow logically from \(B\) and \(C\) taken together: that John owns a gun of the relevant type and deposited money in his bank the day after does not logically imply that he was at the crime scene at the time of the crime. Similarly, neither \(B\) nor \(C\) follows from the rest of the propositions in the set by logic alone.
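The contrast between the two sets can be checked mechanically. What follows is only a minimal sketch of Ewing’s two conditions, using a brute-force truth-table test; the function names are hypothetical, and treating \(A\), \(B\) and \(C\) as logically independent atomic propositions is a modeling assumption, not something Ewing's text supplies.

```python
from itertools import product

def valuations(atoms):
    """Yield every assignment of truth values to the atomic propositions."""
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def consistent(props, atoms):
    """True if some valuation makes every proposition in props true."""
    return any(all(p(v) for p in props) for v in valuations(atoms))

def entails(premises, conclusion, atoms):
    """True if every valuation satisfying the premises satisfies the conclusion."""
    return all(conclusion(v) for v in valuations(atoms)
               if all(p(v) for p in premises))

def ewing_coherent(props, atoms):
    """Ewing's conditions: consistency, and each member follows from the rest."""
    if not consistent(props, atoms):
        return False
    return all(entails(props[:i] + props[i + 1:], props[i], atoms)
               for i in range(len(props)))

# The set {A1, A2, A1 and A2}: each element follows from the other two.
a1 = lambda v: v["A1"]
a2 = lambda v: v["A2"]
a1_and_a2 = lambda v: v["A1"] and v["A2"]
conjunction_set_coherent = ewing_coherent([a1, a2, a1_and_a2], ["A1", "A2"])

# The John example, with A, B and C modeled as independent atoms (an assumption).
a = lambda v: v["A"]
b = lambda v: v["B"]
c = lambda v: v["C"]
john_set_coherent = ewing_coherent([a, b, c], ["A", "B", "C"])
```

Under these assumptions the first set passes the test and the John set fails it, which is the sense in which Ewing’s definition is too narrow.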

C. I. Lewis’s definition of coherence, or “congruence” to use his term, can be seen as a refinement and improvement of Ewing’s basic idea. As Lewis defines the term, a set of “supposed facts asserted” is coherent (congruent) just in case every element in the set is supported by all the other elements taken together, whereby “support” is understood not in logical terms but in a probabilistic sense. In other words, \(P\) supports \(Q\) if and only if the probability of \(Q\) is raised on the assumption that \(P\) is true. As is readily appreciated, Lewis’s definition is less restrictive than Ewing’s: more sets will turn out to be coherent on the former than on the latter. (There are some uninteresting limiting cases for which this is not true. For instance, a set of tautologies will be coherent in Ewing’s but not in Lewis’s sense. These cases are not interesting because they are not significant parts of anyone’s actual body of beliefs.)

Let us return to the example with John. The proposition \(A\), while not logically entailed by \(B\) and \(C\), is under normal circumstances nevertheless supported by those propositions taken together. If we assume that John owns the relevant type of gun and deposited a large sum the next day, then this should raise the probability that John did it and thereby also raise the probability that he was at the crime scene when the robbery took place. Similarly, one could hold that each of \(B\) and \(C\) is supported, in the probabilistic sense, by the other elements of the set. If so, this set is not only coherent in an intuitive sense but also coherent according to Lewis’s definition. Against Lewis’s proposal one could hold that it seems arbitrary to focus merely on the support single elements of a set receive from the rest of the set (cf. Bovens and Olsson 2000). Why not consider the support any subset, not just singletons, receives from the rest?
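Lewis’s condition can be illustrated numerically for the John example. The sketch below assumes a simple common-cause model that is not part of Lewis’s text: a hidden proposition \(G\) (“John committed the robbery”) with a prior of 0.1, and \(A\), \(B\), \(C\) conditionally independent given \(G\) and given its negation. All numbers and function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical model: G = "John committed the robbery" is a hidden common cause.
PRIOR_G = 0.1
# For each proposition: (P(true | G), P(true | not G)). Illustrative values only.
LIKELIHOOD = {
    "A": (0.9, 0.1),
    "B": (0.8, 0.1),
    "C": (0.7, 0.1),
}
NAMES = list(LIKELIHOOD)

def joint(world):
    """P(world) for a dict name -> bool, obtained by summing over G."""
    total = 0.0
    for g, p_g in ((True, PRIOR_G), (False, 1 - PRIOR_G)):
        p = p_g
        for name in NAMES:
            p_true = LIKELIHOOD[name][0 if g else 1]
            p *= p_true if world[name] else 1 - p_true
        total += p
    return total

def prob(condition):
    """Probability that every proposition in condition (name -> bool) holds."""
    total = 0.0
    for values in product([False, True], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[n] == v for n, v in condition.items()):
            total += joint(world)
    return total

def supported_by_rest(name):
    """Lewis-style support: the other elements together raise the probability."""
    rest = {n: True for n in NAMES if n != name}
    conditional = prob({**rest, name: True}) / prob(rest)
    return conditional > prob({name: True})

def lewis_congruent():
    """True if every element is supported by all the others taken together."""
    return all(supported_by_rest(n) for n in NAMES)
```

With these particular numbers `lewis_congruent()` holds, since each proposition becomes more probable once the other two are assumed. The check considers only single elements against the rest, which is exactly the feature of Lewis’s definition that the objection about arbitrary subsets targets.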

Another influential proposal concerning how to define coherence originates from Laurence BonJour (1985), whose account is considerably more complex than earlier suggestions. Where Ewing and Lewis proposed to define coherence in terms of one single concept—logical consequence and probability, respectively—BonJour thinks that coherence is a concept with a multitude of different aspects corresponding to the following “coherence criteria” (97–99):

  1. A system of beliefs is coherent only if it is logically consistent.
  2. A system of beliefs is coherent in proportion to its degree of probabilistic consistency.
  3. The coherence of a system of beliefs is increased by the presence of inferential connections between its component beliefs and increased in proportion to the number and strength of such connections.
  4. The coherence of a system of beliefs is diminished to the extent to which it is divided into subsystems of beliefs which are relatively unconnected to each other by inferential connections.
  5. The coherence of a system of beliefs is decreased in proportion to the presence of unexplained anomalies in the believed content of the system.

A difficulty pertaining to theories of coherence that construe coherence as a multidimensional concept is to specify how the different dimensions are to be amalgamated so as to produce an overall coherence judgment. It could well happen that one system \(S\) is more coherent than another system \(T\) in one respect, whereas \(T\) is more coherent than \(S\) in another. Perhaps \(S\) contains more inferential connections than \(T\), but \(T\) is less anomalous than \(S\). If so, which system is more coherent in an overall sense? BonJour’s theory is largely silent on this point.

BonJour’s account also raises another general issue. The third criterion stipulates that the degree of coherence increases with the number of inferential connections between different parts of the system. Now as a system grows larger the probability that there will be relatively many inferentially connected beliefs is increased simply because there are more possible connections to be made. Hence, one could expect there to be a positive correlation between the size of a system and the number of inferential connections between the beliefs contained in the system. BonJour’s third criterion, taken at face value, entails therefore that a bigger system will generally have a higher degree of coherence due to its sheer size. But this is at least not obviously correct. A possible modified coherence criterion could state that what is correlated with higher coherence is not the number of inferential connections but rather the inferential density of the system, where the latter is obtained by dividing the number of inferential connections by the number of beliefs in the system.
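The difference between counting connections and measuring density can be seen in a toy comparison. The two systems below are invented for illustration, and representing an inferential connection as an unordered pair of beliefs is an assumption of this sketch rather than part of BonJour’s account.

```python
def connection_count(connections):
    """Raw number of inferential connections (the face-value reading)."""
    return len(connections)

def inferential_density(beliefs, connections):
    """Connections divided by the number of beliefs (the modified criterion)."""
    return len(connections) / len(beliefs)

# Hypothetical small system: 4 beliefs linked in a ring (4 connections).
small_beliefs = ["s1", "s2", "s3", "s4"]
small_connections = [("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s4", "s1")]

# Hypothetical large system: 20 beliefs, only 10 of them linked in a chain.
large_beliefs = [f"b{i}" for i in range(20)]
large_connections = [(f"b{i}", f"b{i + 1}") for i in range(10)]

larger_by_count = connection_count(large_connections) > connection_count(small_connections)
denser_small = (inferential_density(small_beliefs, small_connections)
                > inferential_density(large_beliefs, large_connections))
```

Here the large system wins on raw count (10 against 4), while the small system wins on density (1.0 against 0.5), so the two criteria can rank the same pair of systems in opposite ways.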

4. Other Accounts of Coherence

We will return, in section 6, to the problem of defining the traditional concept of coherence while addressing some of the concerns that we have raised, e.g., concerning the relationship between coherence and system size. The point of departure for the present discussion, however, is the observation that several prominent self-proclaimed coherentists construe the central concept, and to some extent also its role in philosophical inquiry, in ways that depart somewhat from the traditional view. Among them we find Nicholas Rescher, Keith Lehrer and Paul Thagard.

Central in Rescher’s account, as laid out in Rescher (1973), his most influential book on the subject, is the notion of a truth-candidate. A proposition is a truth-candidate if there is something that speaks in its favor. Rescher’s truth-candidates are related to Lewis’s “supposed facts asserted”. In both cases, the propositions of interest are prima facie rather than bona fide truths. Although Rescher’s 1973 book is entitled A Coherence Theory of Truth, the purpose of Rescher’s investigation is not to investigate the possibility of defining truth in terms of coherence but to find a truth criterion, which he understands to be a systematic procedure for selecting from a set of conflicting and even contradictory truth-candidates those elements which it is rational to accept as bona fide truths. His solution amounts to first identifying the maximal consistent subsets of the original set, i.e., the subsets that are consistent but would become inconsistent if extended by further elements of the original set, and then choosing the most “plausible” among these subsets. Plausibility is characterized in a way that reveals no obvious relation to the traditional concept of coherence. While the traditional concept of coherence plays a role in the philosophical underpinning of Rescher’s theory, it does not figure essentially in the final product. In a later book, Rescher develops a more traditional “system-theoretic” view on coherence (Rescher 1979).
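The first step of Rescher’s procedure, identifying the maximal consistent subsets, can be sketched by brute force for a small propositional example. The three truth-candidates and all names below are hypothetical, and the second step (ranking the subsets by plausibility) is deliberately left out, since the text above does not specify how plausibility is computed.

```python
from itertools import combinations, product

def is_consistent(names, candidates, atoms):
    """True if some valuation makes every named candidate true."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(candidates[n](v) for n in names):
            return True
    return False

def maximal_consistent_subsets(candidates, atoms):
    """Return the consistent subsets that no further candidate can extend."""
    names = sorted(candidates)
    found = []
    # Go from the largest subsets down: a consistent subset that is not
    # strictly contained in an already found one has no consistent superset.
    for size in range(len(names), -1, -1):
        for subset in combinations(names, size):
            s = frozenset(subset)
            if any(s < m for m in found):
                continue
            if is_consistent(s, candidates, atoms):
                found.append(s)
    return found

# Hypothetical truth-candidates that cannot all be accepted together.
candidates = {
    "p": lambda v: v["P"],
    "q": lambda v: v["Q"],
    "not_both": lambda v: not (v["P"] and v["Q"]),
}
mcs = maximal_consistent_subsets(candidates, ["P", "Q"])
```

For this example the three candidates are jointly inconsistent, while every pair is consistent, so the procedure yields three maximal consistent subsets among which a plausibility ordering would then have to choose.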

Keith Lehrer employs the concept of coherence in his definition of justification, which in turn is a chief ingredient in his complex definition of knowledge. According to Lehrer, a person is justified in accepting a proposition just in case that proposition coheres with the relevant part of her cognitive system. This is the relational concept of coherence alluded to earlier. In Lehrer (1990), the relevant part is the “acceptance system” of the person, consisting of propositions to the effect that the subject accepts this and that. Thus, “\(S\) accepts that \(A\)” would initially be in \(S\)’s acceptance system, but not \(A\) itself. In later works, Lehrer has emphasized the importance of coherence with a more complex cognitive entity which he calls the “evaluation system” (e.g., Lehrer 2000 and 2003).

The starting point of Lehrer’s account of coherence is the fact that we can think of all sorts of objections an imaginative critic may raise to what a person accepts. These objections might be directly incompatible with what that person accepts or they might threaten to undermine her reliability in making assessments of the kind in question. For instance, a critic might object to her claim that she sees a tree by suggesting that she is merely hallucinating. That would be an example of the first sort of objection. An example of the second sort would be a case in which the critic replies that the person cannot tell whether she is hallucinating or not. Coherence, and (personal) justification, results when all objections have been met.

Lehrer’s concept of coherence does not seem to have much in common with the traditional concept of mutual support. If one takes it as essential that such a theory make use of a concept of systematic or global coherence, then Lehrer’s theory is not a coherence theory in the traditional sense because, in Lehrer’s view, “[c]oherence … is not a global feature of the system” (1997, 31), nor does it depend on global features of the system (31). A critic may wonder what reasons there are for calling the relation of meeting objections to a given claim relative to an evaluation system a relation of coherence. Lehrer’s answer seems to be that it is a relation of “fitting together with”, rather than, say, a relation of “being inferable from”: “[i]f it is more reasonable for me to accept one of [several] conflicting claims than the other on the basis of my acceptance system, then that claim fits better or coheres better with my acceptance system” (116), and so “[a] belief may be completely justified for a person because of some relation of the belief to a system to which it belongs, the way it coheres with the system, just as a nose may be beautiful because of some relation of the nose to a face, the way it fits with the face” (88). Olsson (1999) has objected to this view by pointing out that it is difficult to understand what it means for a belief to fit into a system unless the former does so in virtue of adding to the global coherence of the latter.

Paul Thagard’s theory is clearly influenced by the traditional concept of coherence but the specific way in which the theory is developed gives it a somewhat non-traditional flavor, in particular considering its strong emphasis on explanatory relations between beliefs. Like Rescher, Thagard takes the fundamental problem to be which elements of a given set of typically conflicting claims that have the status of prima facie truths to single out as acceptable. However, where Rescher proposes to base the choice of acceptable truths on considerations of plausibility, Thagard suggests the use of explanatory coherence for that purpose.

According to Thagard, prima facie truths can cohere (fit together) or “incohere” (resist fitting together). The first type of relation includes relations of explanation and deduction, whereas the second type includes various types of incompatibility, such as logical inconsistency. If two propositions cohere, this gives rise to a positive constraint. If they incohere, the result is a negative constraint. A positive constraint between two propositions can be satisfied either by accepting both or by rejecting both. By contrast, satisfying a negative constraint means accepting one proposition while rejecting the other. A “coherence problem”, as Thagard sees it, is one of dividing the initial set of propositions into those that are accepted and those that are rejected in such a way that most constraints are satisfied. Thagard presents several different computational models for solving coherence problems, including a model based on neural networks.
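A coherence problem in this sense can be sketched as an exhaustive search over accept/reject partitions. The code below is a simplified, unweighted illustration and not one of Thagard’s own computational models; the elements `H1`, `H2`, `E1`, `E2` and the constraints between them are hypothetical.

```python
from itertools import product

def satisfied(accepted, positive, negative):
    """Count the constraints satisfied by a given set of accepted elements."""
    count = 0
    for a, b in positive:
        # A positive constraint holds if both are accepted or both rejected.
        if (a in accepted) == (b in accepted):
            count += 1
    for a, b in negative:
        # A negative constraint holds if exactly one of the two is accepted.
        if (a in accepted) != (b in accepted):
            count += 1
    return count

def solve_coherence_problem(elements, positive, negative):
    """Return the best score and every accepted-set that achieves it."""
    best_score, best = -1, []
    for flags in product([False, True], repeat=len(elements)):
        accepted = frozenset(e for e, f in zip(elements, flags) if f)
        score = satisfied(accepted, positive, negative)
        if score > best_score:
            best_score, best = score, [accepted]
        elif score == best_score:
            best.append(accepted)
    return best_score, best

# Hypothetical case: H1 explains E1 and E2, H2 explains E1, H1 and H2 conflict.
elements = ["H1", "H2", "E1", "E2"]
positive = [("H1", "E1"), ("H1", "E2"), ("H2", "E1")]
negative = [("H1", "H2")]
best_score, best_partitions = solve_coherence_problem(elements, positive, negative)
```

In this example no partition satisfies all four constraints, and several partitions tie at three, including the one that accepts `H1`, `E1` and `E2` while rejecting `H2`. The exhaustive search grows exponentially with the number of elements, so it is suitable only for small illustrations.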

How acceptability depends on coherence, more precisely, is codified in Thagard’s “principles of explanatory coherence” (Thagard, 2000):

Principle E1 (Symmetry)
Explanatory coherence is a symmetric relation. That is, two propositions \(A\) and \(B\) cohere with each other equally.
Principle E2 (Explanation)
  1. A hypothesis coheres with what it explains, which can either be evidence or another hypothesis.
  2. Hypotheses that together explain some other proposition cohere with each other.
  3. The more hypotheses it takes to explain something, the lower the degree of coherence.
Principle E3 (Analogy)
Similar hypotheses that explain similar pieces of evidence cohere.
Principle E4 (Data Priority)
Propositions that describe the results of observation have a degree of acceptability on their own.
Principle E5 (Contradiction)
Contradictory propositions are incoherent with each other.
Principle E6 (Competition)
If \(A\) and \(B\) both explain a proposition, and if \(A\) and \(B\) are not explanatorily connected, then \(A\) and \(B\) are incoherent with each other (\(A\) and \(B\) are explanatorily connected if one explains the other or if together they explain something).
Principle E7 (Acceptance)
The acceptability of a proposition in a system of propositions depends on its coherence with them.

Principle E4 (Data Priority) reveals that Thagard’s theory is not a pure coherence theory, as it gives some epistemic priority to observational beliefs, making it rather a form of weak foundationalism, i.e., the view that some propositions have some initial epistemic support apart from coherence. Moreover, Thagard’s theory is based on binary coherence/incoherence relations, i.e., relations holding between two propositions. His basic theory does not handle incompatibilities that involve, in an essential way, more than two propositions. But incompatibilities of that sort may very well arise, as exemplified by the three propositions “Jane is taller than Martha”, “Martha is taller than Karen” and “Karen is taller than Jane”. Nevertheless, Thagard reports the existence of computational methods for converting constraint satisfaction problems whose constraints involve more than two elements into problems that involve only binary constraints, concluding that his characterization of coherence “suffices in principle for dealing with more complex coherence problems with nonbinary constraints” (Thagard 2000, 19). Thagard (2009) argues that there is a connection between explanatory coherence and (approximate) truth, where explaining consists in describing causal mechanisms. Several other authors have advocated coherence theories that emphasize the importance of explanatory relations. See, for example, Lycan (1988, 2012) and, for a book-length defense of explanatory coherentism, Poston (2014). Also related to Thagard’s work is Susan Haack’s so-called foundherentist theory, which draws on a proposed analogy between coherence justification (with foundationalist ingredients) and crossword puzzle-solving (Haack, 2009).
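The point about incompatibilities involving more than two propositions can be checked directly for the Jane, Martha and Karen example. The sketch below assumes that the three people have distinct heights, so that every possible situation corresponds to a strict ranking; the names of the helper functions are hypothetical.

```python
from itertools import combinations, permutations

PEOPLE = ["Jane", "Martha", "Karen"]
# Each claim is a pair (taller, shorter).
CLAIMS = [("Jane", "Martha"), ("Martha", "Karen"), ("Karen", "Jane")]

def jointly_satisfiable(claims):
    """True if some ranking from tallest to shortest makes every claim true."""
    for ranking in permutations(PEOPLE):
        position = {person: i for i, person in enumerate(ranking)}
        if all(position[taller] < position[shorter] for taller, shorter in claims):
            return True
    return False

every_pair_compatible = all(jointly_satisfiable(pair)
                            for pair in combinations(CLAIMS, 2))
all_three_compatible = jointly_satisfiable(CLAIMS)
```

Every pair of claims is satisfiable while the three together are not, so no set of purely binary negative constraints read off from pairwise inconsistency would register the conflict.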

5. Justification by Coherence from Scratch

Arguably the most significant development of the coherence theory in recent years has been the revival of C. I. Lewis’s work and the research program he inspired by translating parts of the coherence theory into the language of probability. The kind of coherence in question should be distinguished from a probability function being coherent in the sense of conforming to the axioms of the probability calculus. The theory of coherence that we are concerned with here is an application of such coherent probability functions to model coherence as mutual support, agreement etc. Thus “probabilistic coherence” means something different here from what it means in standard Bayesian theories. The probabilistic translation of coherence theory has made it possible to define concepts and prove results with mathematical precision. It has also led to increased transferability of concepts and results across fields, e.g., between coherence theory and confirmation theory as it is studied in philosophy of science. As a result, the study of coherence has developed into an interdisciplinary research program with connections to philosophy of science, cognitive psychology, artificial intelligence and philosophy of law. The rest of this article will be devoted to this recent transformation of the subject.

To introduce Lewis’s view on the role of coherence, consider the following famous passage on “relatively unreliable witnesses who independently tell the same story” from his 1946 book:

For any one of these reports, taken singly, the extent to which it confirms what is reported may be slight. And antecedently, the probability of what is reported may also be small. But congruence of the reports establishes a high probability of what they agree upon, by principles of probability determination which are familiar: on any other hypothesis than that of truth-telling, this agreement is highly unlikely; the story any one false witness might tell being one out of so very large a number of equally possible choices. (It is comparable to the improbability that successive drawings of one marble out of a very large number will each result in the one white marble in the lot.) And the one hypothesis which itself is congruent with this agreement becomes thereby commensurably well established. (346)

While Lewis allows that individual reports need not be very credible considered in isolation for coherence to have a positive effect, he is firmly committed to the view that their credibility must not be nil. He writes, in his discussion of reports from memory, that “[i]f … there were no initial presumption attaching to the mnemically presented … then no extent of congruity with other such items would give rise to any eventual credibility” (357). In other words, if the beliefs in a set have no initial credibility, then no justification will ensue from observing the coherence of that set. Thus, Lewis is advocating weak foundationalism rather than a pure coherence theory.

In apparent agreement with Lewis, Laurence BonJour (1985, 148) writes: “[a]s long as we are confident that the reports of the various witnesses are genuinely independent of each other, a high enough degree of coherence among them will eventually dictate the hypothesis of truth telling as the only available explanation of their agreement.” However, BonJour proceeds to reject Lewis’s point about the need for positive antecedent credibility: “[w]hat Lewis does not see, however, is that his own [witness] example shows quite convincingly that no antecedent degree of warrant or credibility is required” (148). BonJour is here apparently denouncing Lewis’s claim that coherence will not have any confidence boosting power unless the sources are initially somewhat credible. BonJour is proposing that coherence can play this role even if there is no antecedent degree of warrant, so long as the witnesses are delivering their reports independently.

Several authors have objected to this claim of BonJour’s, arguing that coherence does not have any effect on the probability of the report contents if the independent reports lack individual credibility. The first argument to that effect was given by Michael Huemer (1997). A more general proof in the same vein is presented in Olsson (2002). What follows is a sketch of the latter argument for the special case of two testimonies, couched essentially in the terminology of Huemer (2011). In the following, all probabilities are assumed to lie strictly between 0 and 1.

Let \(E_1\) be the proposition that the first witness reports that \(A\), and let \(E_2\) be the proposition that the second witness reports that \(A\). Consider the following conditions:

Conditional Independence
\(P(E_2 \mid E_1, A) = P(E_2 \mid A)\)
\(P(E_2 \mid E_1,\neg A) = P(E_2 \mid \neg A)\)

Nonfoundationalism
\(P(A \mid E_1) = P(A)\)
\(P(A \mid E_2) = P(A)\)

Coherence Justification
\(P(A \mid E_1, E_2) \gt P(A)\)

Conditional independence is intended to capture the idea that the testimonies are independent in the sense that there is no direct influence between the testimonies. The probability of a testimony is influenced only by the fact it reports on, meaning that once that fact is given, this “screens off” any probabilistic influence between the individual testimonies making them irrelevant to each other. Nonfoundationalism states that neither testimony confers any justification upon \(A\) by itself: assuming merely that one single witness has testified that \(A\) has no effect on the probability of \(A\). Finally, Coherence Justification states that testimonies, when combined, do provide justification for \(A\).

The debate between Lewis and BonJour can be reconstructed as a debate over the joint consistency of these three conditions. BonJour is claiming that the conditions are jointly consistent, and that Coherence Justification follows from Conditional Independence even in the context of Nonfoundationalism, whereas Lewis is rejecting these claims. Olsson (2002) established that if the dispute is couched in these terms, then Lewis was provably right. From Conditional Independence and Nonfoundationalism it follows, via Bayes’s theorem, that

\[ P(A \mid E_1, E_2) = P(A) \]

so that combining collectively independent but individually useless testimonies, however coherent, fails to give rise to anything useful. (As noted in Olsson, 2005, section 3.5, the matter is somewhat complicated by the fact that Lewis adopted a notion of independence that is weaker than Conditional Independence. Ironically, Lewis’s weaker notion turns out to be compatible with the combination of Nonfoundationalism and Coherence Justification.)
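The step via Bayes’s theorem can be spelled out as follows. This is a sketch of one standard derivation and is not claimed to reproduce the proof in Olsson (2002). Since all probabilities lie strictly between 0 and 1, Nonfoundationalism is equivalent to \(P(E_i \mid A) = P(E_i \mid \neg A)\) for \(i = 1, 2\). Write \(p_i\) for this common value (the symbol \(p_i\) is introduced here only for convenience). Conditional Independence gives \(P(E_1, E_2 \mid A) = P(E_1 \mid A) \times P(E_2 \mid A) = p_1 p_2\), and likewise for \(\neg A\). Hence

\[ P(A \mid E_1, E_2) = \frac{P(E_1, E_2 \mid A) P(A)}{P(E_1, E_2 \mid A) P(A) + P(E_1, E_2 \mid \neg A) P(\neg A)} = \frac{p_1 p_2 P(A)}{p_1 p_2 P(A) + p_1 p_2 P(\neg A)} = P(A) \]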

Nonfoundationalism should be contrasted with the following condition:

Weak Foundationalism
\(P(A \mid E_1) \gt P(A)\)
\(P(A \mid E_2) \gt P(A)\)

Weak Foundationalism does not by itself entail Coherence Justification: it is common knowledge in probability theory that even if two pieces of evidence each support a given conclusion, that support may disappear, or even turn into disconfirmation, if they are combined. However, in the context of Conditional Independence, Weak Foundationalism does imply Coherence Justification. Indeed, the combined testimonies will, in this case, confer more support upon the conclusion than the testimonies did individually. As confirmed by James Van Cleve (2011), the conclusions supported by these considerations are that coherence can boost justification or credibility that is already there without being able to create such justification or credibility from scratch.
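The contrast between Weak Foundationalism and Nonfoundationalism under Conditional Independence can be checked numerically. The following is a minimal sketch: the function name `posterior`, the prior of 0.1 and the report likelihoods are all illustrative assumptions and are not taken from the works cited.

```python
def posterior(prior, hit_rates, false_alarm_rates):
    """P(A | E_1, ..., E_n), assuming the reports are conditionally
    independent given A and given not-A.

    hit_rates[i] is P(E_i | A); false_alarm_rates[i] is P(E_i | not-A).
    """
    weight_a = prior
    weight_not_a = 1.0 - prior
    for hit, false_alarm in zip(hit_rates, false_alarm_rates):
        weight_a *= hit
        weight_not_a *= false_alarm
    # Bayes's theorem: normalise the weight of A by the total weight.
    return weight_a / (weight_a + weight_not_a)


PRIOR = 0.1  # assumed prior probability of A

# Weak Foundationalism: each report is somewhat more likely if A is true.
one_weak = posterior(PRIOR, [0.6], [0.4])
two_weak = posterior(PRIOR, [0.6, 0.6], [0.4, 0.4])

# Nonfoundationalism: each report is equally likely whether or not A is true.
one_nil = posterior(PRIOR, [0.5], [0.5])
two_nil = posterior(PRIOR, [0.5, 0.5], [0.5, 0.5])

print(f"weak credibility: one report {one_weak:.4f}, two reports {two_weak:.4f}")
print(f"nil credibility:  one report {one_nil:.4f}, two reports {two_nil:.4f}")
```

With these assumed numbers, two weakly credible reports raise the probability of \(A\) above what a single report achieves, whereas reports with no individual credibility leave it at the prior however many are combined.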

There are various ways to save the coherence theory from this probabilistic attack. The most radical strategy would be to dismiss the probabilistic framework as altogether unsuitable for coherentism. Independent reasons for this response can be found in Thagard’s work (e.g., Thagard 2000 and 2005). A less radical approach would be to refrain from any blanket dismissal of probability theory in this context but reject one of the premises of the troublesome proof. This is the strategy recently taken by Huemer, who now considers his 1997 probabilistic refutation of coherentism to be mistaken (Huemer 2011, 39, footnote 6). While he thinks that Coherence Justification correctly captures a minimal sense of coherentism, he reports dissatisfaction with both Conditional Independence and Nonfoundationalism (his term for the latter is “Strong Nonfoundationalism”). Huemer now thinks independence, in the intuitive sense, is better captured by the condition \(P(E_2 \mid E_1, A) \gt P(E_2 \mid E_1, \neg A)\). Moreover, he takes the condition \(P(A \mid E_1, \neg E_2) = P(A)\), or “Weak Nonfoundationalism” in his terminology, to be a more suitable explication of nonfoundationalist intuitions than the condition \(P(A \mid E_1) = P(A)\). He goes on to show that they are jointly consistent with Coherence Justification: there are probability distributions satisfying all three conditions. Thus the immediate threat to coherentism presented by the observed inconsistency of the three original conditions has been neutralized, even though a critic might point out that the defense is weak since it has not been shown that Coherence Justification follows from the two new conditions.

Whatever merits Huemer’s new conditions might have, their standing in the literature is hardly comparable to that of the original conditions. Conditional Independence, for instance, is an extremely powerful and intuitive concept which has been put to fruitful use in many areas in philosophy and computer science, the most spectacular example being the theory of Bayesian networks (Pearl, 1985). Similarly, the Nonfoundationalist condition is still the most widely used, and many would say most natural, way of stating, in the language of probability theory, that a testimony fails to support that which is testified. Thus, it would seem that coherentism is saved at the price of disconnecting it from the way in which probability theory is standardly applied. Roche (2010) criticizes Nonfoundationalism from another perspective. In his view, a close reading of BonJour reveals that the latter requires only that the witness reports lack individual credibility in the sense that \(P(A \mid E_i) = 0.5\) and not in the sense of \(P(A \mid E_i) = P(A)\), which is the condition we called Nonfoundationalism. Since the former does not entail the latter, coherentists, to the extent that they follow BonJour, need not worry about the joint inconsistency of Conditional Independence, Nonfoundationalism and Coherence Justification. Still, this account of what it means to lack initial credibility is non-standard if taken as a general characterization, and it may in the end be more charitable to interpret BonJour as not having subscribed to it. For an elaboration of this point the reader is referred to Olsson (2005, 65), footnote 4. In later works, BonJour has gradually retreated from his original coherentist position (e.g., BonJour 1989 and 1999).

6. Probabilistic Measures of Coherence

We recall that Lewis defined coherence, or congruence, not for any old set of propositions but rather for a set of supposed facts asserted. One way to capture this idea is in terms of the notion of a testimonial system introduced in Olsson (2005). A testimonial system \(S\) is a set \(\{\langle E_1,A_1\rangle ,\ldots ,\langle E_n,A_n\rangle \}\) where \(E_i\) is a report to the effect that \(A_i\) is true. We will say that \(A_i\) is the content of report \(E_i\). The content of a testimonial system \(S = \{\langle E_1,A_1\rangle , \ldots ,\langle E_n,A_n\rangle \}\) is the ordered set of report contents \(\langle A_1 ,\ldots ,A_n\rangle\). By the degree of coherence \(C(S)\) of such a testimonial system we will mean the degree of coherence of its content. Bovens and Hartmann (2003) proposed a similar representation of supposed facts asserted in terms of ordered sets.

To illustrate these concepts, consider a case in which all witnesses report exactly the same thing, e.g., that John was at the crime scene. That would be a paradigm case of a (highly) coherent set of reports. Now contrast this situation with one in which only one witness reports this. That would be a situation which would intuitively not qualify as coherent. Indeed, it does not even seem meaningful to apply the concept of coherence to a case of just one report (except in the trivial sense in which everything coheres with itself). Letting \(A\) be the proposition “John was at the crime scene”, and \(E_1 ,\ldots ,E_n\) the corresponding reports, this intuitive difference can be represented as the difference between two testimonial systems: \(S = \{\langle E_1,A\rangle ,\ldots ,\langle E_n ,A\rangle \}\) and \(S' = \{\langle E_1,A\rangle \}\). If, by contrast, the entities to which coherence applies are represented as simple unstructured sets, the sets of testimonies in question would be given the same formal representation in terms of the set having \(A\) as its sole member.

By a (probabilistic) coherence measure, as defined for ordered sets of propositions, we shall mean any numerical measure \(C(A_1 ,\ldots ,A_n)\) defined solely in terms of the probability of \(A_1 ,\ldots ,A_n\) (and their Boolean combinations) and standard arithmetical operations (Olsson, 2002). This definition makes the degree of coherence of a set of witness reports a function of the probability of the report contents (and their Boolean combinations). Huemer (2011, 45) refers to this consequence as the Content Determination Thesis. We will return to the status of this thesis in section 8, in connection with the recent impossibility results for coherence. A reasonable constraint on any coherence measure is that the degree of coherence of an ordered set should be independent of the particular way in which the content propositions are listed. Thus, \(C(\langle A_1,A_2 , \ldots ,A_n\rangle) = C(\langle B_1,B_2 , \ldots ,B_n\rangle)\) whenever \(\langle B_1,B_2 , \ldots ,B_n\rangle\) is a permutation of \(\langle A_1,A_2 , \ldots ,A_n\rangle\). This is a formal way of stating that all propositions in the relevant set should be treated as epistemic equals. All measures that will be discussed below satisfy this condition.

Our starting point will be an attempt to identify the degree of coherence of a set with its joint probability:

\[ C_0 (A,B) = P(A\wedge B) \]

However, it is easily seen that this is not a plausible proposal. Consider the following two cases. Case 1: Two witnesses point out the same person as the perpetrator, John, say. Case 2: One witness states that John or James did it, and the other witness that John or Mary did it. Since the joint probability is the same in both cases, equaling the probability that John did it, they yield the same degree of coherence as measured by \(C_0\). And yet, the reports in the first case are more coherent from a presystematic standpoint because the witnesses are in complete agreement.

One way of handling this example would be to define coherence as relative overlap, in the following sense (Glass 2002, Olsson 2002):

\[ C_1 (A,B) = \frac{P(A\wedge B)} {P(A\vee B)} \]

\(C_1 (A,B)\), which also takes on values between 0 and 1, measures how much of the total probability mass assigned to either \(A\) or \(B\) falls into their intersection. The degree of coherence is 0 if and only if \(P(A\wedge B) = 0\), i.e., just in case \(A\) and \(B\) do not overlap at all, and it is 1 if and only if \(P(A\wedge B) = P(A\vee B)\), i.e., just in case \(A\) and \(B\) coincide. The measure is straightforwardly generalizable:

\[ C_1 (A_1 ,\ldots ,A_n) = \frac{P(A_1\wedge \ldots \wedge A_n)} {P(A_1\vee \ldots \vee A_n)} \]
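The difference between \(C_0\) and \(C_1\) on the two witness cases can be made concrete with a small sketch. Everything in it is an illustrative assumption: the prior over culprits, the modelling of a proposition as the set of culprits it allows, and the function names `prob`, `c0` and `c1`.

```python
from fractions import Fraction

# Hypothetical prior over who committed the crime (exactly one culprit).
PRIOR = {
    "John": Fraction(1, 5),
    "James": Fraction(1, 5),
    "Mary": Fraction(1, 5),
    "someone else": Fraction(2, 5),
}


def prob(proposition):
    """Probability of a proposition, modelled as the set of culprits it allows."""
    return sum(PRIOR[culprit] for culprit in proposition)


def c0(*props):
    """Joint probability: P(A_1 and ... and A_n)."""
    return prob(set.intersection(*props))


def c1(*props):
    """Relative overlap: P(A_1 and ... and A_n) / P(A_1 or ... or A_n)."""
    return prob(set.intersection(*props)) / prob(set.union(*props))


# Case 1: both witnesses name John.
case1 = ({"John"}, {"John"})
# Case 2: "John or James" versus "John or Mary".
case2 = ({"John", "James"}, {"John", "Mary"})

for label, case in (("Case 1", case1), ("Case 2", case2)):
    print(label, "C0 =", c0(*case), "C1 =", c1(*case))
```

Under the assumed prior, \(C_0\) equals 1/5 in both cases, while \(C_1\) equals 1 in Case 1 and 1/3 in Case 2, so only the relative overlap measure registers the complete agreement in the first case.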

This measure assigns the same coherence value, namely 1, to all cases of total agreement, regardless of the number of witnesses that are involved. Against this it may be objected that agreement among the many is more coherent than agreement among the few, an intuition that can be accounted for by the following alternative measure introduced by Shogenji (1999):

\[ C_2 (A,B) = \frac{P(A \mid B)}{P(A)} = \frac{P(A\wedge B)}{P(A)\times P(B)} \]

or, as Shogenji proposes to generalize it,

\[ C_2 (A_1 ,\ldots ,A_n) = \frac{P(A_1\wedge \ldots \wedge A_n)}{P(A_1)\times \ldots \times P(A_n)} \]

It is easy to see that this measure is sensitive, in the way we suggested, to the number of reports in cases of total agreement: \(n\) agreeing reports correspond to a coherence value of \(\frac{1}{P(A)^{n-1}}\), meaning that as \(n\) approaches infinity, so does the degree of coherence. Like the other measures, \(C_2 (A,B)\) equals 0 if and only if \(A\) and \(B\) do not overlap. An alternative generalization of the Shogenji measure is presented in Schupbach (2011). However, whatever its philosophical merits, Schupbach’s proposal is considerably more complex than Shogenji’s original suggestion. Akiba (2000) and Moretti and Akiba (2007) raise a number of worries for the Shogenji measure and for probabilistic measures of coherence generally but they seem to be predicated on the assumption that the concept of coherence is interestingly applicable to unordered sets of propositions, an assumption that we found reason to question above.
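The coherence value for \(n\) agreeing reports follows directly from the generalized definition. If \(A_1 = \ldots = A_n = A\), the conjunction in the numerator is equivalent to \(A\) itself, so that

\[ C_2 (A ,\ldots ,A) = \frac{P(A)}{P(A)^{n}} = \frac{1}{P(A)^{n-1}} \]

As an illustration with an assumed prior of \(P(A) = 0.1\), two agreeing reports yield a value of 10 and three agreeing reports a value of 100.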

\(C_1\) and \(C_2\) can also be contrasted with regard to their sensitivity to the specificity of the propositions involved. Consider two cases. The first case involves two witnesses both claiming that John committed the crime. The second case involves two witnesses both making the weaker disjunctive claim that John, Paul or Mary committed the crime. Which pair of witnesses are delivering the more coherent set? One way to reason is as follows. Since both cases involve fully agreeing testimonies, the degree of coherence should be the same. This is also the result we get if we apply \(C_1\). But one could maintain instead that since the first two witnesses agree on something more specific—a particular individual’s guilt—the degree of coherence should be higher. This is what we get if we apply \(C_2\). In an attempt at reconciliation, Olsson (2002) suggested that \(C_1\) and \(C_2\) may capture two different concepts of coherence. While \(C_1\) measures the degree of agreement of a set, \(C_2\) is more plausible as a measure of how striking the agreement is.
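The differing verdicts of \(C_1\) and \(C_2\) on the two cases can be reproduced in a short sketch. The prior over culprits and the helper names `prob`, `c1` and `c2` are illustrative assumptions. Propositions are modelled as sets of possible culprits.

```python
from fractions import Fraction
from math import prod

# Hypothetical prior over who committed the crime (exactly one culprit).
PRIOR = {
    "John": Fraction(1, 10),
    "Paul": Fraction(1, 10),
    "Mary": Fraction(1, 10),
    "someone else": Fraction(7, 10),
}


def prob(proposition):
    """Probability of a proposition, modelled as the set of culprits it allows."""
    return sum(PRIOR[culprit] for culprit in proposition)


def c1(*props):
    """Relative overlap measure."""
    return prob(set.intersection(*props)) / prob(set.union(*props))


def c2(*props):
    """Shogenji measure: joint probability over the product of the marginals."""
    return prob(set.intersection(*props)) / prod(prob(p) for p in props)


# Two witnesses both naming John.
specific = ({"John"}, {"John"})
# Two witnesses both claiming "John, Paul or Mary".
disjunctive = ({"John", "Paul", "Mary"}, {"John", "Paul", "Mary"})

for label, case in (("specific", specific), ("disjunctive", disjunctive)):
    print(label, "C1 =", c1(*case), "C2 =", c2(*case))
```

With these assumed numbers \(C_1\) assigns 1 to both pairs, while \(C_2\) assigns 10 to the specific pair and 10/3 to the disjunctive pair, in line with the idea that \(C_2\) tracks how striking the agreement is.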

Recently, however, Koscholke and Schippers (2015) have noted a counterintuitive feature of \(C_1\): that it is impossible to increase a set’s degree of coherence by adding further propositions to the set. They also show that according to a more sophisticated relative overlap measure which avoids this issue no set’s degree of coherence exceeds the degree of coherence of its maximally coherent subset. On the other hand, they also prove in a later article, in collaboration with Stegman, that there is a further relative overlap measure which avoids both these problems and which therefore in their view re-establishes relative overlap as a candidate for a proper formalization of coherence (Koscholke, Schippers and Stegman, 2019).

A further much discussed measure is that proposed in Fitelson (2003). It is based on the intuition that the degree of coherence of a set \(E\) should be “a quantitative, probabilistic generalization of the (deductive) logical coherence of \(E\)” (ibid., 194). Fitelson takes it to be a consequence of this idea that a maximum (constant) degree of coherence is attained if the propositions in \(E\) are all logically equivalent (and consistent). This is in accordance with \(C_1\) but not with \(C_2\), which as we saw is sensitive to the specificity (prior probability) of the propositions involved. Fitelson, who approached the subject from the standpoint of confirmation theory, proposed a complex coherence measure based on Kemeny and Oppenheim’s (1952) measure of factual support. A further innovative idea is that Fitelson extends this measure to take into account support relations holding between all subsets in the set \(E\), whereas Lewis, we recall, only considered the support relation holding between one element and the rest. The degree of coherence of a set, finally, is defined as the mean support among the subsets of \(E\). An alleged counterexample to this measure can be found in Siebel (2004) and criticisms and proposed amendments in Meijs (2006). The reader may wish to consult Bovens and Hartmann (2003), Douven and Meijs (2007), Roche (2013a) and Schippers (2014a) for further coherence measures and how they fare in relation to test cases in the literature, and Koscholke and Jekel (2017) for an empirical study of coherence assessments drawing on similar examples. The latter study indicates that the measures by Douven and Meijs and by Roche are more in line with intuitive judgement than other established measures. Some recent works have focused on applying coherence measures to inconsistent sets, e.g., Schippers (2014b) and Schippers and Siebel (2015).

It is fair to say that coherence theorists have yet to reach anything like consensus on how best to define coherence in probabilistic terms. Nevertheless, the debate so far has given rise to a much more fine-grained understanding of what the options are and what consequences they have. What is more, some quite surprising conclusions can be drawn even with this issue largely unresolved: all we need to assume in order to prove that no coherence measure can be truth conducive, in a sense to be explained, is that those measures respect the Content Determination Thesis.

7. Truth Conduciveness: the Analysis Debate

Peter Klein and Ted Warfield’s 1994 paper in Analysis initiated a lively and instructive debate on the relationship between coherence and probability (e.g., Klein and Warfield 1994 and 1996, Merricks 1995, Shogenji 1999, Cross 1999, Akiba 2000, Olsson 2001, Fitelson 2003 and Siebel 2004). According to Klein and Warfield, just because one set of beliefs is more coherent than another set, this does not mean that the first set is more likely to be true. On the contrary, a higher degree of coherence can, so they claimed, be associated with a lower probability of the whole set. The idea behind their reasoning is simple: We can often raise the coherence of an informational set by adding more information that explains the information already in the set. But as more genuinely new information is added, the probability that all the elements of the set are true is correspondingly diminished. This, Klein and Warfield wrote, follows from the well-known inverse relationship between probability and informational content. They concluded that coherence is not truth conducive.

Much in the spirit of C. I. Lewis, Klein and Warfield illustrated their argument referring to a detective story (the so-called “Dunnit example”). It turns out that this example is unnecessarily complex and that the main point can be illustrated by reference to a simpler case (borrowed from computer science where it is used to exemplify the concept of non-monotonic inference). Suppose that you are told by one source, Jane, that Tweety is a bird and by another source, Carl, that Tweety cannot fly. The resulting information set \(S = \langle\)“Tweety is a bird”, “Tweety cannot fly”\(\rangle\) is not particularly coherent from an intuitive standpoint. Nor is it coherent from the point of view of Lewis’s definition: assuming one of the items true decreases the probability of the other. At this point, it would be reasonable to conjecture that either Jane or Carl is not telling the truth. However, upon consulting a further source, Rick, we receive the information that Tweety is a penguin. The new set \(S' = \langle\)“Tweety is a bird”, “Tweety cannot fly”, “Tweety is a penguin”\(\rangle\) is surely more coherent than \(S\). In explaining the previous anomaly, the information supplied by Rick contributes to the explanatory coherence of the set.

The new enlarged set \(S'\) is more coherent than the original smaller set \(S\). And yet \(S\), being less informative, is more probable than \(S'\): the conjunction of all the propositions in \(S\) is more probable than the conjunction of all the propositions in \(S'\). Hence, more coherence does not necessarily imply higher likelihood of truth in the sense of higher joint probability. Klein and Warfield seem to be right: coherence is not truth conducive.

But, as will soon be clear, this conclusion is premature. As a preliminary, let us state Klein and Warfield’s argument more formally using the following abbreviations:

\(A_1 =\)“Tweety is a bird.”
\(A_2 =\)“Tweety cannot fly.”
\(A_3 =\)“Tweety is a penguin.”

The first information set \(S\) consists of \(A_1\) and \(A_2\). The second, more coherent set \(S'\) contains, in addition, \(A_3\). We let \(C\) denote the degree of coherence, intuitively understood. What we have then is:

\[ C(A_1,A_2) \lt C(A_1,A_2,A_3). \]

As we saw, due to the greater informational content of the larger set, its probability is lower than that of the smaller set:

\[ P(A_1,A_2,A_3) \lt P(A_1,A_2). \]
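The inequality is an instance of the general fact that a conjunction is never more probable than the conjunction of a subset of its conjuncts. By the product rule,

\[ P(A_1,A_2,A_3) = P(A_1,A_2) \times P(A_3 \mid A_1,A_2) \leq P(A_1,A_2) \]

and, provided \(P(A_1,A_2) \gt 0\), the inequality is strict whenever \(P(A_3 \mid A_1,A_2) \lt 1\), i.e., whenever it is not certain that a bird that cannot fly is a penguin.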

Yet behind this seemingly impeccable piece of reasoning lurks a serious difficulty. As we saw, it is part of the example that we are supposed to know also that Jane reports that Tweety is a bird, that Carl reports that Tweety cannot fly and that Rick reports that Tweety is a penguin. Let:

\(E_1 =\)“Jane reports that Tweety is a bird”
\(E_2 =\)“Carl reports that Tweety cannot fly”
\(E_3 =\)“Rick reports that Tweety is a penguin”

The well-known principle of total evidence now dictates that all relevant evidence should be taken into consideration when computing probabilities. Since it cannot be excluded at the outset that the evidence represented by \(E_1\)–\(E_3\) may be relevant to the probability of the information sets \(S\) and \(S'\), the probability of the smaller set is not \(P(A_1,A_2)\) but rather \(P(A_1,A_2 \mid E_1,E_2)\). Similarly, the probability of the larger set is not \(P(A_1,A_2,A_3)\) but rather \(P(A_1,A_2,A_3 \mid E_1, E_2, E_3)\).

Bovens and Olsson (2002) raised the question whether, given this revised understanding of the probability of a set of reported propositions, it would still follow that extended sets are no more probable than the sets they extend. Referring to our Tweety example, would it still hold that

\[ P(A_1,A_2,A_3 \mid E_1,E_2,E_3) \lt P(A_1,A_2 \mid E_1,E_2)? \]

Bovens and Olsson demonstrated that the answer to the general question is in the negative by giving an example of a more coherent extended set that is also more probable, on the revised understanding of what this means, than the original smaller set. Klein and Warfield’s reasoning is based on a problematic understanding of the joint probability of a set of reported propositions. In the end, they have not shown that coherence is not truth conducive.
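How the inequality can reverse once the reports are conditioned on may be illustrated with a toy model. It is not Bovens and Olsson’s own example, and all of its ingredients are assumptions made for illustration: the five possible “worlds” and their priors, the reporting rates `HIT` and `FALSE_ALARM`, and the stipulation that each report depends only on the truth of its own content. The sketch concerns only the probability side of the comparison and does not compute a degree of coherence.

```python
from fractions import Fraction as F

# Hypothetical worlds: prior, then truth values of
# A1 (is a bird), A2 (cannot fly), A3 (is a penguin).
WORLDS = {
    "penguin":               (F(1, 100),  True,  True,  True),
    "flying bird":           (F(40, 100), True,  False, False),
    "other flightless bird": (F(1, 100),  True,  True,  False),
    "flying non-bird":       (F(8, 100),  False, False, False),
    "flightless non-bird":   (F(50, 100), False, True,  False),
}

HIT = F(4, 5)          # assumed P(E_i | A_i)
FALSE_ALARM = F(1, 5)  # assumed P(E_i | not-A_i)


def prior_all_true(n):
    """P(A_1 and ... and A_n), ignoring the reports."""
    return sum(prior for prior, *truth in WORLDS.values() if all(truth[:n]))


def posterior_all_true(n):
    """P(A_1 and ... and A_n | E_1, ..., E_n), with the reports taken to be
    conditionally independent given the world."""
    total = F(0)
    all_true = F(0)
    for prior, *truth in WORLDS.values():
        weight = prior
        for value in truth[:n]:
            weight *= HIT if value else FALSE_ALARM
        total += weight
        if all(truth[:n]):
            all_true += weight
    return all_true / total


print("without reports:", prior_all_true(2), "versus", prior_all_true(3))
print("given reports:  ", posterior_all_true(2), "versus", posterior_all_true(3))
```

In this model the larger set is less probable than the smaller one when the reports are ignored (1/100 against 1/50), but more probable once each set is conditioned on its own reports (1/7 against 2/25).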

Let us say that a measure \(C\) of coherence is propositionally truth conducive if and only if the following holds:

if \(C(A_1 ,\ldots ,A_n) \gt C(B_1 ,\ldots ,B_m)\), then
\(P(A_1\wedge \ldots \wedge A_n) \gt P(B_1\wedge \ldots \wedge B_m)\).

One lesson emerging from the Analysis debate is that this way of construing truth conduciveness should be replaced by a notion of truth conduciveness where the relevant probabilities take all relevant evidence into account, whatever that evidence may be (beliefs, testimonies etc.). For example, a coherence measure \(C\) is doxastically truth conducive (for a subject \(S)\) if and only if:

if \(C(A_1 ,\ldots ,A_n) \gt C(B_1 ,\ldots ,B_m)\), then
\(P(A_1\wedge \ldots \wedge A_n \mid \mathrm{Bel}_S A_1,\ldots,\mathrm{Bel}_{S}A_n) \gt\) \(P(B_1\wedge \ldots \wedge B_m \mid \mathrm{Bel}_S B_1,\ldots,\mathrm{Bel}_{S}B_m)\),

where \(\mathrm{Bel}_S A\) abbreviates “\(S\) believes that \(A\)”. In other words, a measure of coherence is doxastically truth conducive just in case a more coherent set of believed propositions is jointly more probable than a less coherent set of believed propositions. This is how we will understand the probability (likelihood of truth) of a set in the following.

8. Impossibility Results

The impossibility results for coherence draw on all three debates summarized above: the Lewis-BonJour controversy, the debate over probabilistic measures of coherence and also the dispute in Analysis regarding truth conduciveness. Before we can discuss the results we need to make one further observation. Given the conclusion of the Lewis-BonJour dispute, it is a reasonable expectation that no coherence measure is truth conducive, in the relevant conditional sense, unless the reports (beliefs, memories etc.) in question are individually credible and collectively independent. But assuming this is not sufficient for coherence to stand a reasonable chance of being truth conducive. We must also require that when we compare two different sets of reports, we do so while keeping the degree of individual credibility fixed. Otherwise we could have a situation in which one set of report contents is more coherent than another set but still fails to give rise to a higher likelihood of truth simply because the reporters delivering the propositions in the less coherent set are individually more reliable. Thus, truth conduciveness must be understood in a ceteris paribus sense. The question of interest, then, is whether more coherence implies a higher probability (given independence and individual credibility) everything else being equal. We are now finally in a position to state the impossibility theorems. What they show is that no measure of coherence is truth conducive even in a weak ceteris paribus sense, under the favorable conditions of (conditional) independence and individual credibility.

The first result of this nature was presented by Bovens and Hartmann (2003). Their definition of truth conduciveness deviates slightly from the standard account given above. As they define it, a measure \(C\) is truth conducive if and only if, for all sets \(S\) and \(S'\), if \(S\) is at least as coherent as \(S'\) according to \(C\), then \(S\) is at least as likely to be true as \(S'\) ceteris paribus and given independence and individual credibility. Very roughly, their proof has the following structure: They show that there are sets \(S\) and \(S'\), each containing three propositions, such that which set is more likely to be true will depend on the level at which the individual credibility (reliability) is held fixed. Thus for lower degrees of reliability, one set, say \(S\), will be more probable than the other set, \(S'\); for higher degrees of reliability, the situation will be reversed. One can now find a counterexample to the truth conduciveness of any measure \(C\) through a strategic choice of the level at which the reliability is held fixed. Suppose for instance that, according to \(C\), the set \(S\) is more coherent than the set \(S'\). In order to construct a counterexample to \(C\)’s truth conduciveness, we set the reliability to a value for which \(S'\) will be more probable than \(S\). If, on the other hand, \(C\) makes \(S'\) more coherent than \(S\), we fix the reliability to a level at which \(S\) will be the more probable set. For the details, see Bovens and Hartmann (2003, section 1.4).

Olsson defines truth conduciveness in the standard fashion. His impossibility theorem is based on the following alternative proof strategy (Olsson 2005, appendix B): Consider a situation of two witnesses both reporting that \(A\), represented by \(S = \langle A, A\rangle\). Take a measure \(C\) of coherence that is informative with respect to \(S\), in the sense that it does not assign the same degree of coherence to \(S\) regardless of which probability assignment is used. This means that the measure is non-trivial in the situation in question. Take two assignments \(P\) and \(P'\) of probabilities to the propositions in \(S\) that give rise to different coherence values. Olsson shows that a counterexample to the truth conduciveness of \(C\) can be constructed through a strategic choice of the probability of reliability. If \(P\) makes \(S\) more coherent than does \(P'\) according to \(C\), we fix the probability of reliability in such a way that \(S\) comes out as more probable on \(P'\) than on \(P\). If, on the other hand, \(P'\) makes \(S\) more coherent, then we choose a value for the probability of reliability so that \(P\) makes \(S\) more probable. It follows that no coherence measure is both truth conducive and informative.

There are some further subtle differences between the two results. First, Olsson’s theorem is proved against the backdrop of a dynamic (or, in the language of Bovens and Hartmann, 2003, endogenous) model of reliability: the assessment of witness reliability, which in this model is represented as a probability of reliability, may change as we obtain more reports. Bovens and Hartmann’s detailed proof assumes a non-dynamic (exogenous) model of reliability, although they indicate that the result carries over to the dynamic (endogenous) case. Second, there is a difference in the way the ceteris paribus condition is understood. Olsson fixes the initial probability of reliability, but allows the prior probability of the report contents to vary. Bovens and Hartmann fix not only the reliability but also the prior probability of the report contents.

These impossibility results give rise to a thought-provoking paradox. It can hardly be doubted that we trust and rely on coherence reasoning when judging the believability of information, in everyday life and in science (see Harris and Hahn, 2009, for an experimental study in a Bayesian setting). But how can this be when in fact coherence is not truth conducive? Since the impossibility results were published a number of studies have been dedicated to the resolution of this paradox (see Meijs and Douven, 2007, for an overview of some possible moves). These studies can be divided into two camps. Researchers in the first camp accept the conclusion that the impossibility results show that coherence is not truth conducive. They add, however, that this does not prevent coherence from being valuable and important in other ways. Researchers in the other camp do not accept the conclusion that the impossibility results show that coherence is not truth conducive because they think that at least one premise used in proving the results is doubtful.

Let us start with responses from the first camp. Dietrich and Moretti (2005) show that coherence in the sense of the Olsson measure is linked to the practice of indirect confirmation of scientific hypotheses. That measure turns out to be, in the terminology of Moretti (2007), “confirmation conducive”. Glass (2007) argues, similarly, that coherence can provide the key to a precise account of inference to the best explanation, the main idea being to use a coherence measure for ranking competing hypotheses in terms of their coherence with a given piece of evidence. Furthermore, Olsson and Schubert (2007) observe that, while coherence falls short of being truth conducive, it can still be “reliability conducive”, i.e., more coherence, according to some measures, entails a higher probability that the sources are reliable, at least in a paradigmatic case (cf. Schubert 2011, 2012a). Nevertheless, Schubert has proved an impossibility theorem to the effect that no coherence measure is reliability conducive in general (Schubert 2012b). For yet another example, Angere (2007, 2008) argues, based on computer simulations, that the fact that coherence fails to be truth conducive, in the above sense, does not prevent it from being connected with truth in a weaker, defeasible sense. In fact, almost all coherence measures that have an independent standing in the literature satisfy the condition that most cases of higher coherence are also cases of higher probability, although they do so to different degrees. Moreover, Roche (2013b) has demonstrated that assuming a set to be coherent implies an increase in the probability of truth of any of its elements. This is a weak form of truth-conduciveness, and Roche is right to point out that it should not give the coherentist much comfort. It has also been noted that coherence plays an important negative role in our thinking. If our beliefs show signs of incoherence, this is often a good reason for contemplating a revision. See chapter 10 in Olsson (2005) for an elaboration of this point.
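The weaker, statistical connection between coherence and probability can be probed with a toy Monte Carlo experiment. The sketch below is not a reconstruction of Angere's simulations. It assumes a two-proposition set, draws random joint distributions by normalizing four uniform weights (an arbitrary sampling choice), uses the Shogenji ratio measure as the coherence measure, and reads “higher probability” as a higher joint probability of the set. All function names are hypothetical.

```python
import random


def random_joint(rng):
    """Random probabilities for the four truth-value combinations of A1, A2.

    Returned order: [P(A1 & A2), P(A1 & ~A2), P(~A1 & A2), P(~A1 & ~A2)].
    Normalizing uniform weights is an arbitrary sampling assumption.
    """
    weights = [rng.random() for _ in range(4)]
    total = sum(weights)
    return [w / total for w in weights]


def shogenji(joint):
    """Ratio measure P(A1 & A2) / (P(A1) * P(A2))."""
    p_both = joint[0]
    p_a1 = joint[0] + joint[1]
    p_a2 = joint[0] + joint[2]
    return p_both / (p_a1 * p_a2)


def agreement_rate(trials=10000, seed=0):
    """Fraction of random pairs in which the more coherent distribution
    is also the one with the higher joint probability P(A1 & A2)."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(trials):
        p, q = random_joint(rng), random_joint(rng)
        more_coherent = shogenji(p) > shogenji(q)
        more_probable = p[0] > q[0]
        if more_coherent == more_probable:
            agree += 1
    return agree / trials


print(agreement_rate())
```

The printed rate depends heavily on the sampling scheme and on the measure chosen, so it should not be read as replicating any published figure. The point is only to show the kind of comparison that a defeasible, “most cases” connection invites.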

As for the other approach to the impossibility results (questioning the premises used in their derivation), we have already seen that Huemer (2007, 2011), in connection with the Lewis-BonJour dispute, has expressed doubts regarding the standard way of formalizing independence in terms of conditional probability. It should come as no surprise that he objects to the impossibility results (ibid.) on the same grounds. In his 2011 article, Huemer even questions the Content Determination Thesis, which plays a pivotal role in the derivation of the results, for reasons that we have to leave aside here.

All these things can be consistently questioned. But the question is: at what cost? We have already seen that there are strong systematic reasons for explicating independence in terms of conditional independence in the standard way. Furthermore, the Content Determination Thesis is deeply entrenched in just about all work on coherence that takes agreeing witnesses to be the prototypical case. Giving up Content Determination would mean purging the coherence theory of one of its clearest and most distinctive pre-systematic intuitions: that coherence is a property at the level of report contents. The worry is that coherentism is saved at the cost of robbing it of almost all its significance, as Ewing put it almost a century ago in response to a similar worry (Ewing 1934, 246).

These concerns do not obviously carry over to another dialectical move: questioning the ceteris paribus conditions employed in the impossibility results, i.e., the conditions that determine what to hold fixed as the degree of coherence is varied. This line of criticism has been taken up by several authors, including Douven and Meijs (2007), Schupbach (2008) and Huemer (2011), and it may well be the internally least problematic strategy to explore for those who are inclined to challenge the premises upon which the impossibility results are based. It should be borne in mind, though, that the tendency to offer ever stronger ceteris paribus conditions may in the end be self-defeating. As more things are held fixed, it becomes easier for a coherence measure to be truth conducive. Hence, researchers pursuing this line of defense ultimately run the risk of trivializing the debate by making coherence truth conducive by definition (cf. Schubert 2012b).

There are some attempts to explain or come to grips with the impossibility results that do not easily fit into the two camps identified above or represent a combination of ideas from both. For an example of the latter, Wheeler (2012; see also Wheeler and Scheines, 2013) both suggests focusing on reliability conduciveness as opposed to truth conduciveness (camp 1) and questions the assumptions, primarily independence but also the Content Determination Thesis, used in the derivation of the impossibility results (camp 2). Shogenji (2007, 2013) and McGrew (2016) are other complex and insightful attempts to deepen the Bayesian analysis and diagnosis of those results. Specifically, Lydia McGrew argues for a shift in focus from the coherence of the contents of reports to the coherence of the evidence with a hypothesis \(H\) (which need not coincide with the conjunction of the report contents). If the conjunction of the members of a set of evidence supports some hypothesis \(H\), and if all members of that set are playing a role in that epistemic effect, then that set of evidence is, McGrew shows, indeed confirmatory of \(H\) and in that sense truth-conducive for \(H\) (McGrew, 2016). McGrew offers several proposals for how to spell out “coherence with \(H\)”.
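One standard Bayesian reading of “confirmatory of \(H\)” can be spelled out as follows. It is offered as an assumption about the intended notion, not as McGrew's own formalism. Let \(E\) be the conjunction of the members of the evidence set, and assume \(0 < P(H) < 1\) and \(P(E \mid \neg H) > 0\). Bayes' theorem in odds form gives

\[
\frac{P(H \mid E)}{P(\neg H \mid E)} = \frac{P(E \mid H)}{P(E \mid \neg H)} \cdot \frac{P(H)}{P(\neg H)} ,
\]

so that \(P(H \mid E) > P(H)\) holds exactly when the Bayes factor \(P(E \mid H)/P(E \mid \neg H)\) exceeds 1. On this reading, the evidence set is confirmatory of \(H\) just in case its conjunction is more to be expected if \(H\) is true than if \(H\) is false. The further requirement that every member of the set plays a role in the effect is not captured by this inequality alone.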

9. Conclusions

The coherence theory of justification represents an initially suggestive solution to some deeply rooted problems of epistemology. Perhaps most significantly, it suggests a way of thinking about epistemic justification as arising in a “web of belief”. As such, it competes with, and could potentially replace, the historically dominant, but increasingly disreputable, foundationalist picture of knowledge as resting on a secure base of indubitable fact. Coherentism may also be more promising than alternative foundationalist views with their reliance on non-doxastic support. Unfortunately, coherence theorists have generally struggled to provide the details necessary for their theory to advance beyond the metaphorical stage, something which has not gone unnoticed by their critics. Following the seminal work of C. I. Lewis, contemporary scholars have taken on that challenge with considerable success in terms of clarity and established results, although a fair number of the latter are to the coherentist’s disadvantage. Some results support a weak foundationalist theory according to which coherence can boost credibility that is already there, without creating it from scratch. However, on the face of it, the impossibility results negatively affect this less radical form of coherence theory as well. It is often observed that while it is relatively easy to put forward a convincing theory in outline, the ultimate test for any philosophical endeavor is whether the product will survive detailed specification (the devil is in the details, and so on). What the recent developments in this area have shown, if nothing else, is that this is very much true for the coherence theory of epistemic justification.


  • Akiba, K., 2000, “Shogenji’s Probabilistic Measure of Coherence is Incoherent,” Analysis, 60: 356–359.
  • Angere, S., 2007, “The Defeasible Nature of Coherentist Justification,” Synthese, 157 (3): 321–335.
  • –––, 2008, “Coherence as a Heuristic,” Mind, 117 (465): 1–26.
  • Bender, J. W., 1989, “Introduction,” in The Current State of the Coherence Theory: Critical Essays on the Epistemic Theories of Keith Lehrer and Laurence BonJour, with Replies, J. W. Bender (ed.), Dordrecht: Springer.
  • Blanshard, B., 1939, The Nature of Thought, London: Allen & Unwin.
  • BonJour, L., 1985, The Structure of Empirical Knowledge. Cambridge, Mass.: Harvard University Press.
  • –––, 1989, “Replies and Clarifications,” in The Current State of the Coherence Theory: Critical Essays on the Epistemic Theories of Keith Lehrer and Laurence BonJour, with Replies, J. W. Bender (ed.), Dordrecht: Kluwer.
  • –––, 1999, “The Dialectic of Foundationalism and Coherentism,” in The Blackwell Guide to Epistemology, J. Greco and E. Sosa (eds.), Malden, MA: Blackwell.
  • Bovens, L., and Hartmann, S., 2003, Bayesian Epistemology, Oxford: Clarendon Press.
  • Bovens, L. and Olsson, E. J., 2000, “Coherentism, Reliability and Bayesian Networks,” Mind, 109: 685–719.
  • –––, 2002, “Believing More, Risking Less: On Coherence, Truth and Non-trivial Extensions,” Erkenntnis, 57: 137–150.
  • Cleve, J. V., 2011, “Can Coherence Generate Warrant Ex Nihilo? Probability and the Logic of Concurring Witnesses,” Philosophy and Phenomenological Research, 82 (2): 337–380.
  • Cross, C. B., 1999, “Coherence and Truth Conducive Justification,” Analysis, 59: 186–93.
  • Davidson, D., 1986, “A Coherence Theory of Knowledge and Truth,” in Truth and Interpretation, E. LePore (ed.), Oxford: Blackwell, pp. 307–319.
  • Dietrich, F., and Moretti, L., 2005, “On Coherent Sets and the Transmission of Confirmation,” Philosophy of Science, 72 (3): 403–424.
  • Douven, I., and Meijs, W., 2007, “Measuring Coherence,” Synthese, 156 (3): 405–425.
  • Ewing, A. C., 1934, Idealism: A Critical Survey, London: Methuen.
  • Fitelson, B., 2003, “A Probabilistic Measure of Coherence,” Analysis, 63: 194–199.
  • Glass, D. H., 2002, “Coherence, Explanation and Bayesian Networks,” in Artificial Intelligence and Cognitive Science, M. O’Neill and R. F. E. Sutcliffe et al. (eds.) (Lecture Notes in Artificial Intelligence, Volume 2464), Berlin: Springer-Verlag, pp. 177–182.
  • –––, 2007, “Coherence Measures and Inference to the Best Explanation,” Synthese, 157 (3): 257–296.
  • Haack, S., 2009, Evidence and Inquiry: A Pragmatist Reconstruction of Epistemology, Amherst, NY: Prometheus Books.
  • Harris, A. J. L. and Hahn, U., 2009, “Bayesian Rationality in Evaluating Multiple Testimonies: Incorporating the Role of Coherence,” Journal of Experimental Psychology: Learning, Memory, and Cognition, 35: 1366–1372.
  • Huemer, M., 1997, “Probability and Coherence Justification,” Southern Journal of Philosophy, 35: 463–472.
  • –––, 2007, “Weak Bayesian Coherentism,” Synthese, 157 (3): 337–346.
  • –––, 2011, “Does Probability Theory Refute Coherentism?,” Journal of Philosophy, 108 (1): 35–54.
  • Kemeny, J. and Oppenheim, P., 1952, “Degrees of Factual Support,” Philosophy of Science, 19: 307–24.
  • Klein, P., and Warfield, T. A., 1994, “What Price Coherence?,” Analysis, 54: 129–132.
  • –––, 1996, “No Help for the Coherentist”, Analysis, 56: 118–121.
  • Koscholke, J. and Jekel, M., 2017, “Probabilistic Coherence Measures: A Psychological Study of Coherence Assessment,” Synthese, published online 11 January 2016, doi: 10.1007/s11229-015-0996-6
  • Koscholke, J. and Schippers, M., 2015, “Against Relative Overlap Measures of Coherence,” Synthese, first online 15 September 2015, doi:10.1007/s11229-015-0887-x
  • Koscholke, J., Schippers, M., and Stegman, 2019, “New Hope for Relative Overlap Measures of Coherence,” Mind, 128: 1261–1284.
  • Lehrer, K., 1990, Theory of Knowledge, first edition, Boulder: Westview Press.
  • –––, 1997, “Justification, Coherence and Knowledge,” Erkenntnis, 50: 243–257.
  • –––, 2000, Theory of Knowledge, second edition, Boulder: Westview Press.
  • –––, 2003, “Coherence, Circularity and Consistency: Lehrer Replies,” in The Epistemology of Keith Lehrer, E. J. Olsson (ed.), Dordrecht: Kluwer, pp. 309–356.
  • Lewis, C. I., 1946, An Analysis of Knowledge and Valuation, LaSalle: Open Court.
  • Lycan, W. G., 1988, Judgment and Justification, New York: Cambridge University Press.
  • –––, 2012, “Explanationist Rebuttals (Coherentism Defended Again),” The Southern Journal of Philosophy, 50 (1): 5–20.
  • McGrew, L., 2016, “Bayes Factors All the Way: Toward a New View of Coherence and Truth,” Theoria, 82: 329–350.
  • Meijs, W., 2006, “Coherence as Generalized Logical Equivalence,” Erkenntnis, 64: 231–252.
  • Meijs, W., and Douven, I., 2007, “On the alleged impossibility of coherence,” Synthese, 157: 347–360.
  • Merricks, T., 1995, “On Behalf of the Coherentist,” Analysis, 55: 306–309.
  • Moretti, L., 2007, “Ways in which Coherence is Confirmation Conducive,” Synthese, 157 (3): 309–319.
  • Moretti, L., and Akiba, K., 2007, “Probabilistic Measures of Coherence and the Problem of Belief Individuation,” Synthese, 154 (1): 73–95.
  • Neurath, O., 1983/1932, “Protocol Sentences,” in Philosophical Papers 1913–1946, R.S. Cohen and M. Neurath (eds.), Dordrecht: Reidel.
  • Olsson, E. J., 1999, “Cohering with,” Erkenntnis, 50: 273–291.
  • –––, 2001, “Why Coherence is not Truth-Conducive,” Analysis, 61: 236–241.
  • –––, 2002, “What is the Problem of Coherence and Truth?,” The Journal of Philosophy, 99: 246–272.
  • –––, 2005, Against Coherence: Truth, Probability, and Justification, Oxford: Clarendon Press.
  • Olsson, E. J., and Schubert, S., 2007, “Reliability Conducive Measures of Coherence,” Synthese, 157 (3): 297–308.
  • Poston, T., 2014, Reason and Explanation: A Defense of Explanatory Coherentism, New York: Palgrave Macmillan.
  • Quine, W. and Ullian, J., 1970, The Web of Belief, New York: Random House.
  • Rescher, N., 1973, The Coherence Theory of Truth, Oxford: Oxford University Press.
  • –––, 1979, Cognitive Systematization, Oxford: Blackwell.
  • Roche, W., 2010, “Coherentism, Truth, and Witness Agreement”, Acta Analytica, 25 (2): 243–257.
  • –––, 2013a, “A Probabilistic Account of Coherence,” in Coherence: Insights from Philosophy, Jurisprudence and Artificial Intelligence, M. Araszkiewicz and J. Savelka (eds.), Dordrecht: Springer, pp. 59–91.
  • –––, 2013b, “On the Truth-Conduciveness of Coherence,” Erkenntnis, 79 (S3): 1–19.
  • Schippers, M., 2014a, “Probabilistic Measures of Coherence: From Adequacy Constraints Towards Pluralism,” Synthese, 191: 3821–3845.
  • –––, 2014b, “Incoherence and Inconsistency,” Review of Symbolic Logic, 7 (3), 511–528.
  • Schippers, M., and Siebel, M., 2015, “Inconsistency as a Touchstone for Coherence Measures,” Theoria: Revista de Teoria, Historia y Fundamentos de la Ciencia, 30 (1): 11–41.
  • Schubert, S., 2011, “Coherence and Reliability: The Case of Overlapping Testimonies,” Erkenntnis, 74: 263–275.
  • –––, 2012a, “Coherence Reasoning and Reliability: A Defense of the Shogenji Measure,” Synthese, 187 (2): 305–319.
  • –––, 2012b, “Is Coherence Conducive to Reliability?,” Synthese, 187 (2): 607–621.
  • Schupbach, J. N., 2008, “On the Alleged Impossibility of Bayesian Coherentism”, Philosophical Studies, 141 (3): 323–331.
  • –––, 2011, “New Hope for Shogenji’s Coherence Measure”, British Journal for the Philosophy of Science, 62 (1): 125–142.
  • Shogenji, T., 1999, “Is Coherence Truth-conducive?,” Analysis, 59: 338–345.
  • –––, 2007, “Why does Coherence Appear Truth-Conducive,” Synthese, 157 (3): 361–372.
  • –––, 2013, “Coherence of the Contents and the Transmission of Probabilistic Support,” Synthese, 190: 2525–2545.
  • Siebel, M., 2004, “On Fitelson’s Measure of Coherence,” Analysis, 64: 189–190.
  • Sosa, E., 1980, “The Raft and the Pyramid: Coherence Versus Foundations in the Theory of Knowledge,” Midwest Studies in Philosophy, 5(1): 3–26.
  • Thagard, P., 2000, Coherence in Thought and Action, Cambridge, Mass.: The MIT Press.
  • –––, 2005, “Testimony, Credibility, and Explanatory Coherence,” Erkenntnis, 63 (3): 295–316.
  • –––, 2007, “Coherence, Truth, and the Development of Scientific Knowledge,” Philosophy of Science, 74: 28–47.
  • Wheeler, G., 2012, “Explaining the Limits of Olsson’s Impossibility Result,” The Southern Journal of Philosophy, 50: 136–50.
  • Wheeler, G., and Scheines, R., 2013, “Coherence and Confirmation through Causation,” Mind, 122: 135–170.

Many people have commented on earlier versions of this article offering valuable suggestions and criticisms, and I thank them all. They are Staffan Angere, Luc Bovens, Igor Douven, Branden Fitelson, David Glass, Michael Hughes, Lydia McGrew, Luca Moretti, Stefan Schubert, Tomoji Shogenji, Mark Siebel, Wolfgang Spohn, Paul Thagard and Greg Wheeler. Finally, I am indebted to an anonymous referee for a number of stylistic and other improvements.

Copyright © 2021 by Erik Olsson
