Jürgen Habermas

First published Fri Sep 15, 2023

[Editor’s Note: The following new entry by James Gordon Finlayson and Dafydd Huw Rees replaces the former entry on this topic by the previous authors.]

Jürgen Habermas is one of the leading social theorists and philosophers of the post-Second World War period in Germany, Europe, and the US, a prodigiously productive journalist, and a high-profile public intellectual who was at the forefront of the liberalization of German political culture. He is often labelled a second-generation Frankfurt School theorist, though his association with the Frankfurt School is only one of a rather complex set of allegiances and influences, and can be misconstrued. This entry will begin with a summary of Habermas’s background and early and transitional works, including his influential concept of the public sphere, before moving on to discuss in detail his three major philosophical projects: his social theory, discourse theory of morality (or “discourse ethics”), and discourse theory of law and democracy. It will then more briefly address Habermas’s methodology and philosophical framework (rational reconstruction and postmetaphysical thinking), his applied political theory, focusing on issues of national identity and international law, and finally his recent work on religion.

1. Biography

1.1 Biographical Introduction

Habermas was born in June 1929 and brought up in provincial North-Rhine Westphalia, to conservative, educated middle-class parents, who had been neither critical nor strongly supportive of the Nazi regime. In 1944 he was called up to man the defences on the western front. A little over a year later he was shaken to his core by what he learnt of the Nazi atrocities from the Nuremberg Trials, and news coverage of the Holocaust. Thus, although still in his teens, he experienced 1945 as a turning point that would shape his political and cultural outlook. As he put it frankly in an interview in 1979:

I am myself a product of “reeducation” … By this I mean that … we learnt that the bourgeois constitutional state in its French, or American, or English form is an historical achievement. (1992a: interview 3, 79)

Two moments exemplify Habermas’s complex position between the generations of 1945 and 1968. In 1953, when Habermas was a student at Göttingen University, he published a critical essay in the Frankfurter Allgemeine Zeitung concerning Heidegger’s remark about “the inner truth and greatness of the Nazi movement”, which Heidegger had written in his lectures on metaphysics in 1935 and then failed to retract or alter in 1953 when republishing those lectures (1971c [1977]). In 1968, at the height of the student protests in Germany, Habermas, who had been critical of the policing that had resulted in the killing of Benno Ohnesorg at a student demonstration the year before, directly criticized the students for acting out revolutionary fantasies, and for provoking the authorities into violence. He used the phrase “left-wing fascism”, a term he later admitted was too harsh (Müller-Doohm 2016: 141). Instead, he urged them to put the latitude granted to them by liberal democratic institutions to work in the service of a “radical reformism” (Specter 2010: 111–115).

Habermas studied German philosophy and literature at Bonn, and wrote his doctoral dissertation on “The Absolute and History: the Ambivalence of Schelling’s Thought”. He came to Frankfurt in 1956, where he was Theodor Adorno’s Assistent at the Institute for Social Research for three years. In 1959 he left for Marburg, having effectively been shouldered out by Max Horkheimer, who considered him a dangerous Marxist, and who tried to have him dismissed (Müller-Doohm 2016: 84–86; Habermas 1992a: interview 8, 218). In Marburg, he wrote his habilitation dissertation, The Structural Transformation of the Public Sphere, under Wolfgang Abendroth, one of the few Marxist academic philosophers in the post-war Federal Republic. Habermas, though often deemed a member of the Frankfurt School, was, in reality, at the institute for a very brief period. Whilst there, he recalls, “Critical Theory, at Frankfurt School—there was no such thing … no coherent doctrine” (Habermas 1992a: interview 4, 98). So it is misleading to say that he was or became a “member” of the Frankfurt School. In truth, he arrived there as an outsider, and while there, briefly, ploughed his own furrow.

After a short period at the University of Heidelberg, Habermas returned to Frankfurt, where he succeeded Horkheimer, with whom he soon reconciled, as Professor of Philosophy and Sociology. He declined to become director of the institute. In Frankfurt, Habermas spent the latter half of the 1960s teaching in febrile and tumultuous political circumstances not conducive to research. In 1971 he became director of the Max Planck Institute for the Study of Living Conditions in the Scientific and Technical World in Starnberg, Bavaria, where he conducted the research which led to his magnum opus, the two-volume Theory of Communicative Action. The year his magnum opus was published, 1981, Habermas resigned from the Max Planck Institute under unhappy circumstances, and again returned to Frankfurt. There he would remain, but for various visiting professorships in the US, until his retirement in 1994. Landmark publications during these years include many essays on moral philosophy, and Between Facts and Norms in 1992, Habermas’s major work in political and legal philosophy. Throughout his life Habermas has enthusiastically played the role of the public intellectual, taking part in disputes about positivism in the social sciences, the historical uniqueness of the Holocaust, German reunification, genetic engineering, and secularism and religion. He is the recipient of numerous honorary doctorates and prizes, including the Adorno Prize of the city of Frankfurt and the Kyoto Prize of the Inamori Foundation (Müller-Doohm 2016: 340).

1.2 The Public Sphere

The public sphere is one of Habermas’s best-known concepts, introduced in his habilitation thesis, published in 1962 as The Structural Transformation of the Public Sphere: An Inquiry into a Category of Bourgeois Society. Belonging to neither the state, the economy, nor the family, the public sphere is where private individuals come together to communicate about matters of general concern. It is the location of the public use of reason and the place where “public opinion” is formed. Structural Transformation is a reconstructed history of the rise and fall of the public sphere focused on Britain, France, and Germany from the early modern era to the mid-twentieth century. In the Middle Ages, there was a merely “representative” public sphere, in which kings and nobles displayed their status before society (1962 [1989: 7–10]). The bourgeois public sphere begins to emerge in the seventeenth and eighteenth centuries, at first in the guise of a literary public sphere. In coffee houses, salons, and literary societies, the new reading public came together to discuss novels; Habermas cites Samuel Richardson’s Pamela as an example (1962 [1989: 31–6, 49–50, 174]). Skills of critical reasoning first developed in the journals of the literary public sphere were subsequently applied to the political public sphere, where public affairs rather than literary texts are the objects of criticism. In this period the modern state was emerging, as political authority was gradually depersonalized and vested in more-or-less independent bureaucratic institutions, rather than the person of the monarch (1962 [1989: 17–8]). Simultaneously, the development of mercantile capitalism endowed merchants with unprecedented wealth and influence, and an ever-greater need for accurate information about market conditions. This need was met by news-sheets and gazettes, which soon turned their attention to state policy as much as commodity prices (1962 [1989: 20–2]). The bourgeois public sphere thus developed concomitantly with both the capitalist economy and the Westphalian sovereign state, and flourished during the high point of bourgeois-liberal politics in the eighteenth and nineteenth centuries.

The bourgeois public sphere is constituted by an ideological separation between public and private. The state and politics are deemed “public”, whereas civil society, the market economy, and the family are deemed “private”. The public sphere, according to Habermas, mediates between these two realms (1962 [1989: 30]). Participants in the bourgeois public sphere are private individuals, coming together to rationally and critically discuss public affairs, above all the actions of governments; it is

a realm of private individuals assembled into a public body who as citizens transmit the needs of bourgeois society to the state, in order, ideally, to transform political into “rational” authority within the medium of this public sphere. (1964 [1974: 53])

Habermas would later refer to this as the generation of “communicative power”, which can legitimate the political system’s actions if yoked to the latter’s “administrative power” (1992b [1996b: 147–50]). As members of the public, private individuals bring decisions into the public sphere where they are open to rational discussion and criticism. In the process, participants form and articulate the general interest of society, drawing on ideas of truth, justice and human rights.

Needless to say, participants in the bourgeois public sphere were de facto almost all educated male property-owning members of the bourgeoisie, along with some sympathisers from the aristocracy. Habermas has acknowledged the selective membership of the bourgeois public sphere during its heyday (1992c: 425–430), although critics have charged that he does not pay sufficient attention to the way it was constituted by excluding propertyless workers (Negt & Kluge 1972 [2016]) and, above all, women (Landes 1988). Despite its limitations, Habermas argues that the bourgeois public sphere nevertheless embodied certain principles and ideals, never fully realized, that are vital to any flourishing democratic society. It was thus both an “ideal” and an “ideology” (1962 [1989: 112]). Since differences in social status between interlocutors were bracketed as “private” matters (1962 [1989: 36]) the public sphere was in theory universal, open to any literate person (1962 [1989: 37]). This ensured that rational argumentation was calibrated to universal standards of validity, not to the relative status of interlocutors, and could function as a cooperative search for truth and justice (1962 [1989: 54]).

Habermas’s approach in this early work can be described as one of historical reconstruction in the service of internal criticism. He reconstructs an “ideal type” of the bourgeois public sphere, in order to criticize the really existing public spheres of modern democracies.

The structural transformation which marks the beginning of the end of the bourgeois public sphere involves a re-definition of the public/private distinction. Under conditions of mid-twentieth century “welfare-state mass democracy” (1962 [1989: 208]), state and society became ever more entangled as governments pursued interventionist economic policies and expanded welfare provision. At the same time, non-state actors such as pressure groups, corporations, and political parties played an increasing role in governance (1962 [1989: 142]). Habermas refers to this process as “refeudalization” (1962 [1989: 200–1]).

Habermas sees the modern public sphere as, in many ways, the victim of its own success. As it expanded far beyond its original basis of educated male property-owners, material inequalities could no longer be set aside, but rather became the subject of public debate (1962 [1989: 127]). And this debate was no longer a matter of rational-critical analysis of state action by the assembled public, but of negotiation between interest groups which bypass public reason. Instead of the approximation of society to the ideal type, what emerged was an impoverished pseudo-public sphere, lacking its original capacity for rational-critical discourse, easily manipulated by states, corporations, and interest groups using the techniques of “public relations” (1962 [1989: 176, 236]). Its role now, as in the feudal era, is to acclaim decisions which have already been made.

Habermas continues to make use of the concept of the public sphere in his later works (1973a [1975: 37–8, 48]; 1992b [1996b]), developing a detailed account of its place in modern societies (1992b [1996b: 359–87]; 2008b [2009: chapters 8 & 9]). In his original formulation, there was a tendency to assume the existence of a single unified public sphere for a single polity. In response to Nancy Fraser’s discussion of “subaltern counterpublics” (Fraser 1992) and acknowledging his own earlier neglect of “plebeian public spheres”, Habermas now concedes that there may be a multitude of intersecting public spheres within a given society, focusing on different communities and topics but with porous boundaries that allow flows of communication to pass between them (1992c: 424–5; 1992b [1996b: 373–4]). The closest thing to a “universal” public sphere is the political public sphere, focused on the political system. The political public sphere acts as a “sounding board” for problems which affect society as a whole, as well as a “filter-bed”, sifting from public discourse those contributions which represent generalizable interests (2008b chapter 11 [2009: chapter 9, 143]). In this manner, reflexive public opinion is formed and communicative power is generated. One of the most salient features of the contemporary political public sphere is its division into “formal” and “informal” segments, the former denoting the highly regulated discourse of elected politicians, judges, parliaments, and courts, and the latter the “wild”, unregulated flows of communication outside these spaces (2008b chapter 11 [2009: chapter 9, 159–62]). Habermas argues that the right kind of feedback between formal and informal public spheres is vital for legitimating the political system’s actions. At the same time, there is no reason to think that public spheres must stop at national borders. In an increasingly interdependent world, global attention can be focused on single issues (Habermas mentions the wars in Vietnam and Iraq as examples), creating, at least temporarily, a transnational public sphere (2004: chapter 3 [2006c: 39–48]). The political legitimacy of transnational polities like the EU hinges on whether the public spheres of its member states can act together, functioning as a European public sphere (2008b chapter 9, chapter 11 [2009: chapter 6, 87–8, chapter 9, 181–3]).

1.3 Early Works (1964–1971)

Knowledge and Human Interests is a historical reconstruction of the prehistory of positivism and scientism. The history Habermas reconstructs is a decline and fall of self-reflection in Wissenschaft—science in the broadest sense. His particular interest is in Erkenntniskritik, namely the tradition of critical philosophy from Kant, German Idealism, and Hegel through to Marx, which, on his account, is gradually side-lined by ways of thinking that employ the methods of positivistic natural and social science. His thesis is: “That we disavow reflection is positivism” (1968b: 9 [1971a: vii]).

Habermas’s analysis takes its cue from Lukács’ idea of reification, the idea that beings, forms of life, and social relations assume the appearance of nature (Lukács 1971: 83–223). Habermas’s analysis reveals this illusory independence and naturalness to be historical, and in principle reversible. Horkheimer and Adorno’s thesis that “All reification is a forgetting” also animates the analysis (Horkheimer & Adorno 1971 [2002: 191]). Positivism in the empirical and social sciences, and historicism in the cultural sciences, Habermas argues, exhibit a merely contemplative stance toward their respective objects, an attitude that obscures the point that science and knowledge are fundamentally human enterprises that serve human interests, which are rooted in the natural history of the species (1968b [1971a: 301–4]).

According to Habermas, there are three knowledge-constitutive interests. The empirical and natural sciences are governed by the cognitive interest in the technical control of objectified processes. The historical-hermeneutical sciences are shaped by a practical interest in orienting action and reaching understanding, while self-reflection (and Erkenntniskritik) are determined by a cognitive interest in emancipation and in Mündigkeit—autonomy and responsibility (1968b [1971a: 313–314]). Habermas’s overall aim is to explain how Marxism, and social theory more broadly, succumbed to a positivistic self-misconception, while rescuing the animus of Marx’s theory of society for critical social theory, by connecting it with the interest in emancipation and autonomy, and with a method of critical self-reflection.

Habermas’s long essay “Technology and Science as ‘Ideology’” is a festschrift for Herbert Marcuse, and not an uncritical one (1968a [1970]). It deals with topics he broached in the volume of essays Theory and Practice (1971 [1973b]) that broadly argue against the reduction of political theory from a body of thought answering practical questions of how one should live, to “a form of social engineering that dispenses with public discourse” (Celikates & Jaeggi 2009 [2017: 261]).

Habermas develops theses that exhibit the major concerns of his early work, based on the following diagnosis. Capitalist societies institutionalize the aim of economic growth, which has the effect of expanding sub-systems of instrumental reason. In that context, science and technology cease to be connected to a realm of values that help people answer practical questions of how they want to live, and become absorbed instead into the economic and administrative systems as forces of production geared to the aim of continuous economic growth. Politics, instead of residing in a popular practice of democratic self-determination, atrophies to the technocratic administration of public affairs in the hands of small groups of experts, leading to a depoliticization of ordinary citizens (Celikates & Jaeggi 2009 [2017: 261]). The question is who can reconnect technocratic politics with the “good life”. Habermas’s answer is that students may be part of the solution, since their protests do not aim at securing them “a larger share of social rewards”, but are targeted against the reduction of social and political life to economic growth and individual gain (1970: chapter 6, 120–122).

1.4 Transitional Works (1971–1982)

Around the time of his move to Starnberg in 1971 Habermas initiated a complete overhaul of his theoretical framework for social theory, ultimately leading to the development of his mature communicative paradigm (McCarthy 1978). He also produced some seminal transitional works which dealt with the issues raised in the early work. Legitimation Crisis (1975), or to give the full original title, Legitimationsprobleme im Spätkapitalismus (1973), is a sketch of an incipient research programme that sets out from a critique of Marx’s social theory. On Habermas’s account, crises in capitalist societies arise not from the immiseration of the working classes and class struggle, which have largely been pacified by the welfare state and economic growth, but from legitimation deficits due to alterations in the constellation of economics and politics. The problem posed by legitimation crises, which are specific to late capitalism, is how the increasing intervention of the state in economic affairs can be made legitimate to those who are affected by these state interventions, and hold the state responsible for their effects. In conclusion, Habermas develops some hypotheses about how such crises can be resolved. The best-case scenario is that legitimation takes place through discursive justification, on the basis of norms embodying generalizable interests, and, failing that, through various kinds of compromise (1973a [1975: 112–114]). On this model normative structures and ideals of consensus become the key vantage point through which societies can be understood and criticized by the social theorist.

This thesis is further elaborated in the collection of essays entitled Zur Rekonstruktion des historischen Materialismus (1976a). In this volume Habermas challenges the Marxian assumption that developments in the sphere of social integration are determined by developments in the sphere of material production. By contrast he posits a logic of the development of normative structures, which represent the institutional analogues of the stages of cognitive development of moral consciousness in individuals as developed and tested by the cognitive moral psychologist Lawrence Kohlberg (1976a: 9–49 [1979: 95–130]). These normative structures represent a directional sequence of discrete stages that gain in complexity and comprehensiveness as they develop, and make collective learning possible. They allow different kinds of reasons to count as legitimations for the relevant kinds of social structures and levels of social integration, and serve as normative standpoints in the light of which societies can be understood and criticized. Crucially, they exhibit a logic of development altogether independent from that of the forces of production.

At the time, Marxists criticized Habermas’s work both for abandoning central tenets of Marxism and for conceding too much to Niklas Luhmann’s systems theory (Ebbighausen 1976). That said, Habermas denies Luhmann’s central claim that legitimacy is nothing but the motiveless acquiescence of citizens to the binding decisions of an administrative mechanism (Luhmann 1969). As for Habermas’s Marxism, the very focus of his transitional work on the system crises of capitalist society betrays not only the theoretical aspiration to understand capitalism, but also the practical aspiration to overcome it, or at least to understand how it could be overcome. Even so, since the theory of legitimation crisis, as McCarthy noted, is not addressed to any agent of social transformation, it must ultimately remain content with diagnosing crisis tendencies (McCarthy 1978).

2. Habermas’s Mature Social Theory: The Theory of Communicative Action

The idea of reason, which is differentiated in the various claims to validity, is necessarily built into the way in which the species of talking animals reproduces itself. (2001a: chapter 5, 85)

Habermas’s mature work begins with The Theory of Communicative Action (1981 [1984a/1987]), the fruits of a long and difficult decade he spent at the Max Planck Institute in Starnberg (Müller-Doohm 2016: 214). It is an ambitious and wide-ranging work, which provides a general framework within which several related research programs are arranged. It comprises:

  1. A sketch of a unified theory of meaning and action.
  2. A typological theory of social action.
  3. A social ontology.
  4. An outline of a critical social theory tied to (1)–(3).

This section will discuss these topics in turn.

2.1 Habermas’s Pragmatic Theory of Meaning

While the theory of communicative action is designed to answer questions in social theory, Habermas also considers it as a contribution to the theory of meaning (1984b: 604). There are three pillars to this theory:

  1. The first comes from Karl Bühler’s Organon Model of language according to which language is triadic, with three functions corresponding respectively to the objective world, the hearer, and the speaker (or the third, second and first person): a cognitive function, an appeal function, and an expressive function.
  2. The second pillar is speech-act theory, and in particular the idea of illocutionary force or meaning, which was developed by J. L. Austin and John Searle.
  3. The third pillar is “formal semantics”, the truth-conditional theory of meaning, and in particular Michael Dummett’s “verificationist” critique of it.

These three pillars form the basis of what Habermas calls “formal pragmatics” or the pragmatic theory of meaning, the basic idea of which is that “We understand a speech-act when we know what makes it acceptable” (1984a: 297, hereafter TCAI): to understand what a speaker means, the hearer has to have access to the reasons for the speaker’s utterance.

The first pillar, Bühler’s functional schema of language, is important as a guiding assumption of Habermas’s theory. Habermas sets so much store by Bühler’s model because it encompasses the entire field of linguistic meaning, and gives equal weight and priority to its three dimensions: what is intended by the speaker, what is said in the content of the utterance, and what is done with that utterance. All three dimensions are present in what Habermas considers as the original mode of communication whereby a speaker, S, reaches understanding with another person, H, about something (1988b [1998b chapter 6: 279] [1992a: 58]). The triadic architectonic radiates into all aspects of Habermas’s theory: the thesis that there are three validity claims: to truth, rightness and sincerity (TCAI: 307); that speakers can adopt three attitudes: an objectivating, a norm-conformative, and an expressive attitude (TCAI: 309); that speakers through their utterances take up relations to three “worlds”: the objective world of states of affairs, the social or intersubjective world of legitimate social orders, and the subjective internal world (TCAI: 49–52, 60, 236, 308); and finally that there are three basic modes of speech which form the basis of the classification of speech-acts: constatives or assertoric speech-acts, regulative speech-acts (such as imperatives or requests), and expressive speech-acts (TCAI: 309). These triadic distinctions nest within one another.

To the extent that there is an argument in Theory of Communicative Action for the triadic structure itself, it rests on the basic claim that there are three equiprimordial, meaning-critical validity claims. Every speech-act simultaneously makes a claim to truth, to rightness, and to truthfulness. That means a speech-act can be taken up by the hearer, and assessed in the light of its propositional truth, normative rightness, or the sincerity with which it is expressed. If accepted, this means that agreement (Einverständnis) is reached “simultaneously on three levels” (1981 [1984a: 307]). In defence of this view Habermas argues that a speech-act can always be rejected from three perspectives: in the light of its assertibility conditions, its normative justification, or the sincerity of the speaker (1981 [1984a: 306]; 1988b essay 4 [1998b: 231]; 1988b essay 6 [1998b: 296]; 1999a [1998b: 317]). However, the claim that needs to be defended, not assumed, is that the validity claim for every speech-act can be rejected from three and only three perspectives. Yet, as Dorschel claims, an assertion or utterance might also be rejected in virtue of the volume or style with which it is uttered (Dorschel 1988: 8–9). In the final analysis, the “argument” that any utterance can be rejected from three and only three perspectives is question-begging. So the triadic structure, stemming from Bühler’s schema, remains best thought of as a hinge assumption.

The second pillar is speech-act theory. Because speech-act theory construes speech as action, or, to use Austin’s phrase, as “doing things” with words, it is well suited to provide the basis for a unified theory of meaning and action. According to the theory, a speech-act has both propositional content, p, and illocutionary force, M. So the meaning of an utterance Mp, such as “the ice is thin”, can be both a statement about the way the world is, and, say, depending on the context, a warning.

That said, Habermas’s main focus is on illocutionary acts, the aims of which, in contrast to perlocutionary acts, he contends, can always be made manifest (1986b [1998b chapter 3: 202]). When a speaker makes a declaration or a promise they thereby signal to the hearer what they are doing. The key to Habermas’s use of the term “illocutionary” is that he identifies and specifies a putatively universal internal mechanism by which speakers realize their various illocutionary aims: they make validity claims for their utterance in order to reach understanding (Verständigung or Einverständnis). Speakers do this by making an implicit guarantee that they can, if necessary, adduce good reasons for their utterance. Hearers, for their part, are always free to respond with a “yes” or “no” (1981 [1984a: 302]) to this validity claim. When they respond with a “yes”, speaker and hearer reach understanding or agreement. “Reaching understanding is the inherent telos of human speech” (1981 [1984a: 280]).

The third pillar is formal semantics. To explain the notion of a meaning-critical validity claim, Habermas enrols Dummett’s verificationist critique of truth-conditional semantics, and the epistemic turn in formal semantics (1981 [1984a: 316–8]; 1981 [1998b chapter 2: 153]). Dummett argues that justification is an epistemic idea, but truth is not, and that the truth-conditions of many sentences are unknowable, even where their justification conditions are not. He proposes the view that we understand the meaning of a sentence when we know the conditions under which it is assertible, rather than the conditions under which it is true (Dummett 1993: 45; Heath 2001: 120–121; Fultner 2011a: 60–62).

Habermas takes Dummett’s thought and extends it to natural languages, and to the pragmatics of meaning and understanding. This is why he calls his approach “formal pragmatics”, rather than “formal semantics”.

[I]t is possible to generalize Dummett’s explanation. We understand a speech-act when we know the kinds of reasons that a speaker could provide … claim validity for his utterance—in short when we know what makes it acceptable. (1986b [1998b chapter 3: 232])

But, as Heath points out, this is problematic. For there is a semantic dimension to, and motivation for, Dummett’s idea that to know the meaning of a sentence is to know the conditions under which it is assertible. It offers a unified explanation of the compositional structure of language, namely of how one can construct an infinite number of meaningful sentences out of a finite number of semantic units and the rules for their composition. In turn that explains how we can understand the meaning of a sentence we have never encountered before. This may work for assertions, but it is inapplicable to the pragmatic dimension of meaning, to the illocutionary force of utterances, which lacks a compositional structure. So it is also unclear how it would work for the other kinds of speech-acts such as regulatives and expressives.

This strongly suggests, as Heath argues, that there may after all be only one validity claim that is “meaning-critical”, or “internal” in the sense that it is constitutive of the meaning of utterances, namely the validity claim to the truth of assertions or of the propositional components of other kinds of speech-act (Heath 2001: 115–6). It is also potentially damaging for Habermas for another reason. It shows that to understand an utterance that makes a rightness claim one need not know how the utterance or claim is justified. Understanding an utterance need not involve accepting reasons for action. Recall that Habermas insists that the illocutionary aim of the speaker is not only to be understood, in the sense that addressees recognize the sense of their utterance, but also to be agreed with, in the sense that they also accept the relevant reasons that the speaker could adduce in support of their utterance. In the case of utterances that make rightness claims these are reasons to act or behave in certain ways. Habermas must establish the latter, because his whole theory depends on the claim that the normative commitments unavoidably generated in speech reach over into the subsequent action sequence (1981 [1984a: 302–3]).

Habermas later makes a move that appears to address this problem without solving it: he claims truth to be paradigmatic of validity, and rightness to be merely analogous with truth (1999a [2003a: 229]). However, he does not say what the analogues are, nor does he explain what the basis for the analogy is (Finlayson 2005). To claim that validity claims to rightness are analogous to validity claims to truth, in that they determine the meaning of the utterances that make them, and can play the role of a core component of a theory of meaning, is to beg the important questions.

2.2 A Theory of Social Action

The pragmatic theory of meaning is intended to provide the theoretical framework for Habermas’s social theory, which consists of a typological theory of human action. Habermas offers a number of typologies. He distinguishes four different “models” of action: teleological action, subdivided into instrumental and strategic action; normatively regulated action; dramaturgical action; and communicative action (1981 [1984a: 85ff]). He later offers the following more rudimentary typology, which divides action on the horizontal axis into success-oriented and consensus-oriented action, and on the vertical axis into non-social (individual) and social action (1981 [1984a: 285]).

                    Action Orientation
Action Situation    Success                 Consensus
Non-Social          Instrumental Action     * * *
Social              Strategic Action        Communicative Action

At the heart of this typology is the distinction between communicative action (rationality) on the one hand and instrumental and strategic action (rationality) on the other.

Habermas defines communicative action in various ways, but always in terms of agreement on the basis of validity claims. In one place, he claims that it comprises “linguistically mediated actions in which all participants pursue illocutionary aims and only illocutionary aims” (1981 [1984a: 295]). That, however, is to put the point too strongly since agents don’t only aim to understand and to be understood (Steinhoff 2009: 35–6). A better definition is that communicative action is action in which participants pursue illocutionary aims “without reservation” even when they pursue “perlocutionary goals” via “illocutionary goals already achieved” (1986b [1998b chapter 3: 241]). This is important because Habermas correlates instrumental action with perlocutionary aims and communicative action with illocutionary aims, and he maintains that instrumental and strategic action depend on communicative action, but not vice versa. The idea is that agents, through their utterances, make validity claims (to truth, rightness and truthfulness) on which the meaning of their utterances depends, which are then taken up by their interlocutors and form the basis of understanding and any subsequent interactions.

In response to the basic worry that he conflates reaching understanding of an utterance with reaching agreement on a norm of action, Habermas introduces a distinction between weak and strong communicative action. He tends to make the distinction in terms of the difference between utterances for which speakers only make validity claims to truth, and those for which they make validity claims to rightness and hence offer normative reasons that “bind their wills” (1999a [1998b chapter 7: 327]). In weak communicative action the reasons that determine meaning are also supposed to guide action by providing information about the way the world is: in strong communicative action the reasons that are supposed to determine meaning are action guiding because they are practical reasons based on shared intersubjective norms (1999a [1998b chapter 7: 326–7]).

Habermas contrasts communicative action with instrumental action, which he takes to be action whereby agents select the best or only means in order to achieve the agents’ ends (1981 [1984a: 285]). He construes strategic action as a social variant of instrumental action, whereby agents seek to influence other agents, in order to achieve their ends, often (though not always) via the medium of language (1981 [1984a: 285]). Habermas assumes that all action is rational. But different action types involve different kinds of rationality—instrumental and communicative. He also allows that all action is broadly teleological, but claims that while instrumental and strategic actions are oriented towards success, communicative action is oriented towards consensus or reaching understanding/agreement.

Habermas’s conception of instrumental and strategic action has come in for much criticism. He tends to assume that all instrumental and strategic action is egocentric and monological action, in which agents aim to achieve their desired ends, and that they pursue these ends individually, not in concert with others. These assumptions are brought out when he contrasts instrumental and strategic action with communicative action whereby

the actions of the agents involved are coordinated not through egocentric calculations of success but through acts of reaching understanding. (1981 [1984a: 285–6, 288]; 1981 [1998b chapter 2: 118]; Celikates & Jaeggi 2009 [2017: 263–4])

He also assumes that strategically acting agents achieve their ends through influencing others. He allows that strategic actors can cooperate, but claims that strategic cooperation is always subordinated to the satisfaction of the agents’ own individual (egocentric) ends.

Success in action is also dependent on other actors, each of whom is oriented to his own success and behaves cooperatively only to the degree that this fits with his egocentric calculus of utility. (1981 [1984a: 87–88])

Habermas also assumes that in acting strategically an agent adopts an objectivating attitude towards others. Strategic actors coordinate their actions with others by trying to influence them or manipulate them.

However, many critics have pointed out that it is not the case that agents who act instrumentally or strategically must act selfishly, or monologically, such that they are incapable of stable forms of co-operation (Johnson 1991; Heath 2001; Steinhoff 2009; Blau 2022). These assumptions appear to be imported from the “traditional” action theories as Habermas understands them. Not only do these assumptions require independent justification, but they are also incidental to the basic distinction Habermas is attempting to draw.

2.2.1 The Unavoidability Thesis

Having drawn the distinction between communicative and instrumental action, Habermas argues for two basic theses. Recall the starting intuition of Theory of Communicative Action:

The idea of reason, which is differentiated in the various claims to validity, is necessarily built into the way in which the species of talking animals reproduces itself. (1984b [2001d: 85])

Habermas claims that in societies like ours, there is no functional equivalent for language as the medium of action-coordination and social integration. And if his reconstruction of language use is correct, that coordination and integration take place through communicative action:

the symbolic structures of the life-world can be reproduced only through the medium of action orientated to understanding. (1982: 237)

If Habermas’s pragmatic theory of meaning is correct, then a weak transcendental necessity transmits from the premises, that communication and discourse are necessary to social reproduction, to features of his reconstruction thereof, such as validity claims, rules of argumentation, etc. The transcendental necessity in question is weak since the premises are contingent and empirical: they are not themselves logically or even physically necessary. And the reconstruction is itself defeasible. Nonetheless, the reconstructed features, Habermas contends, are socially necessary for agents like us, roughly agents of modern societies. Language exists, and language use is not optional for human beings. Linguistic practice presupposes the structures and rules of communication and discourse that alone enable communication. So there is no feasible alternative to speaking, to making one’s utterances understood by interlocutors, and so to raising validity claims, and also, as we will see below, to invoking the rules of discourse (Heath 2001: 295–98). Validity claims, and the rules of discourse that govern the practice of argumentation, are what Habermas calls “pragmatic preconditions” of communication. This means that

from the performative perspective of the participants in interaction, these presuppositions must be undertaken. (Fultner 2019; Habermas 1999a [2003a: 85–86; 17–18])

This is the “universality” expressed in the label “universal pragmatics” which Habermas originally gave to his research programme (1976b [1979: 1–68]).

2.2.2 The Irreducibility Thesis

Habermas’s theory is that communicative action is the “basic form of action” and that instrumental and strategic forms of action are derived from, and parasitic upon, action oriented toward reaching understanding (1981 [1984a: 228]; 1999a [2003a: 86ff]; 1976b [1979: 1]). This is a bold and controversial claim which overturns the central tenet of “traditional” action theory, namely that the basic form of rationality is means-ends rationality, and that the basic form of action is instrumental and success-oriented.

Habermas’s argument for the irreducibility thesis rests on his speech-act theory. He claims that perlocutionary meaning (roughly, the intended or unintended ends that agents achieve through the use of language) depends essentially on illocutionary meaning, namely on the reason-based consensus arising from the offer and acceptance of validity claims, but not vice versa. The latter, the illocutionary, is the “original mode of language use” upon which the instrumental use of language is “parasitic” (1981 [1984a: 288]). In latently strategic uses of language, perlocutionary effects are achieved only through the unreserved pursuit of illocutionary aims. In other words, strategically acting individuals use language normally, leading their interlocutors to believe mistakenly that they are aiming at reaching understanding, when they are not (1981 [1998b chapter 2: 118]).

One problem with this argument is that Habermas recalibrates Austin’s terms “perlocutionary” and “illocutionary” so that they are inextricably bound up with instrumental/strategic and communicative action respectively from the start. So his argument virtually presupposes what it is supposed to explain. Another difficulty is that it is unclear whether or not the claim is that the normal mode of language use involves the illocutionary goals of reaching agreement on shared norms and binding practical commitments, as in “strong” communicative action.

For these and other reasons most commentators agree that Habermas fails to establish the irreducibility thesis by argument (Baurmann 1985; Steinhoff 2009; Blau 2022). Steinhoff claims that all Habermas’s arguments for the claim that communicative rationality is irreducible to instrumental rationality fail, and that the reverse is in fact true (Steinhoff 2009: 46), so that the traditional theory of action is correct. Others think that the basic distinction is useful and can be justified by argument, even if Habermas’s own argument is not clinching (Blau 2022; Heath 2001). Heath provides an argument in the other direction which shows, contra Steinhoff, that instrumental and strategic action cannot account for language use, and argues that this result can support Habermas’s contention that instrumental and strategic action presuppose communicative action but not vice versa (Heath 2001: 45–48).

2.3 Habermas’s Social Ontology

The third research program, Habermas’s social ontology, rests on the previous two.

This takes the form of a dyadic analytic distinction between “system” and “lifeworld”, which form the ontological counterparts or “complements” of instrumental and strategic action, and communicative action respectively (1981 [1987: 119]). Societies should be conceived “simultaneously as systems and lifeworlds” (1981 [1987: 118]).

They are theoretical concepts, which enable social theorists to explain social order.

They are also ontological concepts that respectively denote two different kinds of existing social order, or to put it another way, two distinct but complementary mechanisms of social integration. In the former case, systems stabilize the

non-intended interconnections of action … by a non-normative regulation of individual decisions that extends beyond the actor’s consciousness.

In the latter, the lifeworld, social integration is brought about by means of “a normatively secured or communicatively achieved consensus” (1981 [1987: 117]).

2.3.1 Lifeworld

On the one hand, Habermas presents the lifeworld phenomenologically as “a background stock of cultural knowledge that is ‘always already’ familiar to agents” and thus makes mutual understanding possible. In this it has what he calls a “peculiar half transcendence” that, unlike the formal notions of the subjective, objective, and intersubjective worlds, cannot be objectified and brought before consciousness (1981 [1987: 154–4]). On the other hand, he associates it with specific domains of social life—such as family life, everyday life, and civil society—in which communicative actions predominate, and agents coordinate their interactions by means of speech-acts and their underlying validity claims (Baxter 2011: 166; Heath 2011: 75). Either way, the lifeworld enjoys a certain primacy over the system, since it “remains the subsystem that defines the pattern of the social system as a whole” and because systems “need to be anchored in the lifeworld” (1981 [1987: 154]).

2.3.2 System

Habermas developed his notion of the system in his writings prior to Theory of Communicative Action, particularly through his engagement with Luhmann (1973a [1975: 1–8]). Systems are macro-level processes that stabilize complexes of actions via steering mechanisms. The two main examples are the economy and bureaucracy, which function respectively via the steering mechanisms of money and power. Unlike the lifeworld, systems fulfil their functions through “a non-normative regulation of individual decisions that extends beyond the actor’s consciousnesses” (1981 [1987: 117]). This is a reference to an “invisible hand” mechanism of the kind that occurs in Mandeville, Smith, and Hegel, with the important difference that in these latter theorists the invisible hand, providence-like, serves the common good. For Habermas the function of the systems of the economy and bureaucracy is merely to harmonize and stabilize complexes of individual actions, and thereby to bring about societal integration and reproduction. They act as relief mechanisms that ease the burden on communicatively acting subjects. Of course, in functioning societies the economy and bureaucracies should also serve the common good. According to Habermas’s theory this only happens when the system bears the right kind of relation to the lifeworld.

When Habermas first introduced the idea of the system he was engaging with Luhmann’s work and thinking mainly of cybernetic systems (1976a [1979: 170]; 1973a [1975: 130–142]). A good example of a cybernetic system is a heater linked to a thermostat, where each serves as input and output. If the heat rises beyond a fixed temperature, say 20 Celsius, the thermostat switches the heater off, and if it drops below that temperature the thermostat switches the heater on again (Heath 2011: 83–84). The result (or goal-state to use the slightly misleading technical term) is to keep the room at an even ambient temperature.
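The feedback loop in the thermostat example can be made concrete with a short simulation. What follows is only an illustrative sketch of the heater-and-thermostat case described above, not anything drawn from Habermas, Luhmann, or Heath: the names `Thermostat` and `simulate`, and the heating and cooling rates, are assumptions introduced for the example. Only the 20 degree set point comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """Hypothetical thermostat: the controller half of the feedback loop."""
    setpoint: float = 20.0   # degrees Celsius, the figure used in the text
    heater_on: bool = False

    def sense(self, room_temp: float) -> bool:
        # Above the set point: switch the heater off.
        if room_temp > self.setpoint:
            self.heater_on = False
        # Below the set point: switch the heater on again.
        elif room_temp < self.setpoint:
            self.heater_on = True
        # Exactly at the set point: leave the heater as it is.
        return self.heater_on


def simulate(start_temp: float, steps: int,
             heat_gain: float = 0.5, heat_loss: float = 0.3,
             setpoint: float = 20.0) -> list[tuple[float, bool]]:
    """Run the loop: room temperature is the thermostat's input, the
    heater state is its output, and the heater state in turn changes
    the room temperature. The gain and loss rates are assumed values."""
    thermostat = Thermostat(setpoint=setpoint)
    temp = start_temp
    history = []
    for _ in range(steps):
        heater_on = thermostat.sense(temp)
        temp += heat_gain if heater_on else -heat_loss
        history.append((round(temp, 2), heater_on))
    return history


if __name__ == "__main__":
    for temp, heater_on in simulate(start_temp=18.0, steps=12):
        print(f"{temp:5.2f} C  heater {'on' if heater_on else 'off'}")
```

No component of the loop represents or aims at the "goal-state": the even temperature emerges from the coupling of the two parts, which is the feature that makes the cybernetic model attractive as an image of coordination that bypasses the intentions of participants.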

Habermas construes systems, in the light of Luhmann, as spheres of “norm free” sociality. “In capitalist societies the market is the most important example of a norm-free regulation of cooperative contexts” (1981 [1987: 150, 154]). This puts him on the side of those who see markets as destroying rather than nourishing the web of moral relations. His predominant way of thinking about subsystems is that they are “demoralized”. This is true of the market economy, bureaucracies and state administration, and the law, which in Theory of Communicative Action he treats as a subsystem. That said, even in this text, while he claims that law in the process of rationalization becomes “detached from the ethical motivations of the legal person” (1981 [1987: 174]), he nonetheless thinks of basic rights and the principle of popular sovereignty as sources of legitimation which act as a

bridge between a de-moralized and externalized legal sphere and a deinstitutionalized and internalized morality. (1981 [1987: 178])

Systems, which facilitate integration and social order through “delinguistified steering media” like money and power, have great advantages for citizens of modern societies. They fulfil functions that are too complex or burdensome to be undertaken by communicative action, that is, by individuals acting consciously in concert. For example, markets distribute goods and resources to where they are most needed, using price signals and laws of supply and demand.

However, systems also have disadvantages. For one thing, systems, once in place, operate independently of human agents. There is, consequently, a gap between an actor’s agency, and their conscious intentions and aims, and the purpose that they serve in the system. This lack of transparency is evident in firms, for instance, where the agents fulfil their roles and tasks, whether using instrumental, communicative, or moral rationality, or a mix of all, while all the time behind their backs or “beyond their consciousness” at the macro-level they are making profit for the firm’s owners and shareholders. For another, Habermas claims, agents operating in spheres steered by delinguistified media are inclined to shift from communicative to instrumental and strategic action orientations with the result that

success-oriented action steered by egocentric calculations of utility loses its connection to action oriented by mutual understanding. (1981 [1987: 196])

Whether Habermas holds that agents’ actions in economic and bureaucratic domains are merely constrained by system imperatives of the relevant steering media, or reduced to instrumental and strategic actions, is moot (Jütten 2013). But it is empirically false, for reasons given by Honneth, Joas, and others, that the mediatization of a domain of social life would force agents to adopt only one type of action. As Joas puts it, every sphere of action contains “a wealth of different types of action” (Joas 1986 [1991: 104]). Systems, economic and bureaucratic, and the specific organisations they comprise, all involve numerous different kinds of action. This is not only an empirical claim but a conceptual one that follows from Habermas’s own theory that instrumental and strategic action is parasitic on communicative action. Habermas has also been criticized for being seduced by systems theory into merely accepting spheres of norm free sociality, and the uncoupling of system and lifeworld, as a normal result of modernization and social differentiation (McCarthy 1991).

2.3.3 The Relation of Lifeworld to System

The relation of lifeworld to system is pivotal to Habermas’s social theory. Most commentators (for example McCarthy 1991: 154 and Baxter 2011: 166) take this to be the main problem facing the theory. Actually, as Habermas notes, there are two related problems: the problem of constructing a theory that can combine systems theory and action theory fruitfully (1981 [1987: 201]), and the problem of articulating the actual relation between system and lifeworld.

So what is that relation? Habermas accords primacy to the lifeworld, on the grounds that it “defines the pattern of the social system as a whole”. Systemic mechanisms, he contends, “need to be anchored in the lifeworld” (1981 [1987: 154]). It is tempting to think that Habermas’s thesis of the primacy of the lifeworld over the system must have to do with the primacy of communicative over instrumental and strategic action, the view that instrumental and strategic action are “parasitic” on communicative action, but not vice versa. However, that would be problematic since Habermas does not succeed in establishing by means of speech-act theory that communicative action is the basic form of action on which instrumental and strategic action essentially depend.

Even if Habermas had an argument that conclusively demonstrated the primacy of communicative over instrumental action, that would not suffice to establish the primacy of the lifeworld over the system. The primacy of one kind of action over another could not itself establish the primacy of one kind of social order over another. That would be a conflation of levels. Habermas does in fact quite often conflate types of action with spheres of action, as we see from his claim cited above that success-oriented individual actions are “steered” by egocentric calculations (Joas 1986 [1991: 104]). For example, he talks of “subsystems of purposive rational action” and “normative steering” (1981 [1987: 180–1]).

2.4 Habermas’s Critical Social Theory

Habermas’s thesis of the primacy of the lifeworld over the system gives him a framework in which to criticize neo-liberalism and its mania for the marketization and financialization of everyday life: “the whole program of subjecting the lifeworld to the imperatives of the market must be subject to scrutiny”, whether in health care, public transport, military security, or secondary and tertiary education (2009: 186). He does this with his theory of the colonization of the lifeworld by the system.

2.4.1 Colonization

Recall that mediatization—the intrusion of steering media into social domains that were hitherto not systematically organized and integrated—is not, according to Habermas, itself a pathological development. It has a positive side, namely as a relief mechanism, a way of coping efficiently with complexity. But colonization, as the label suggests, is bad. Colonization occurs where

systemic mechanisms suppress forms of social integration even in those areas where a consensus-dependent coordination of action cannot be replaced, that is, where the symbolic reproduction of the lifeworld is at stake. (1981 [1987: 196])

Symbolic reproduction is where society is maintained through communicative actions. As a consequence of colonization, the lifeworld shrinks, and its capacity for symbolic reproduction atrophies (1981 [1987: 154, 173]). This is functionally bad. To the extent that the system, and society as a whole, depend on the lifeworld and its communicative resources, colonization is self-stultifying and destabilizing.

Habermas follows Marx and Weber on this, rather than Luhmann, insofar as he shows that

the rationalization of the lifeworld makes possible the emergence and growth of subsystems whose independent imperatives turn back destructively upon the lifeworld. (1981 [1987: 186])

This not only eradicates normative contexts of action but, he maintains, supplants these with instrumental and strategic action orientations, which he calls a “pathological de-formation of the communicative infrastructure of the lifeworld” (1981 [1987: 375, 180, 181]). This recalls Adorno’s discussion of bourgeois coldness, but broadens and extends it and decouples it specifically from the moral abomination of Auschwitz (Adorno 1966 [1973: 363]). For example, consider that once healthcare and housing are privatized and seen as mere commodities, landlords can evict tenants because of rent arrears, and hospitals can turn away patients who cannot afford to pay, with no moral qualms or social responsibility, because they see these actions as merely economic transactions. Colonization is also bad, according to Habermas, insofar as it leads to a variety of social pathologies pertaining to culture, society, and person, including loss of freedom, loss of meaning, crises in legitimation, anomie, and alienation (1981 [1987: 385; 142–3]).

2.4.2 Reification

Habermas presents his colonization thesis as a reformulation of the Lukácsian idea of reification that was so influential on the first generation of Frankfurt School critical theorists, especially Adorno (1981 [1984a: 399]; 1981 [1987: 1]). Lukács argues that the commodification of all domains of modern society, and consciousness thereof, has led to its “taking on the character of a thing”, hence the economy and bureaucracy and the legal system appear to subjects as a “second nature” (Lukács 1971: 83, 86). Consequently they adopt a passive attitude towards it, in theory as contemplation, in practice as adaptation. This results not only in an illusion, but also in a kind of inaction, since people see and treat things as natural, and so not up to them, and not alterable, when such things are in fact historical and in principle reversible, and up to them. That idea needs reformulating, Habermas maintains, because it mistakenly conflates rationalization and social differentiation with reification. Lukács, for example, thinks of rationalization as the destruction of social totality, namely the substantial ethical life of a community, as an unqualified bad, whereas Habermas sees it as having upsides and downsides.

Reification is the major downside, which Habermas sees as a specific alienating effect of the destruction of the communicative capacity for symbolic reproduction and social integration provided by the lifeworld. In particular, reification arises when the intrusion of steering media causes the “conversion to another mechanism of action coordination” (1981 [1987: 375]). Since this conversion happens as it were behind the backs of social agents rather than through their conscious intentions and choices, colonization gives rise to “objectively false consciousness” and its effects “have to remain hidden” (1981 [1987: 187]). Notably, contra the Marxist tradition up to Lukács, the reification effect arises in class-unspecific ways and thus the crises do not precipitate class conflict, which, Habermas argues, the welfare state has largely succeeded in pacifying (1981 [1987: 187]). Furthermore, though reification effects are “filtered through the pattern of social inequality”, the latter is not one of the pathologies he focuses on (1981 [1987: 349]). He focuses more on welfare state clientism, and juridification (1981 [1987: 357, 363–4]). This is one of the important differences between Habermas’s social theory and Rawls’s Theory of Justice, and leads to a completely different outlook in the diagnosis of social ills (Jütten 2011).

2.4.3 Normative Grounds

This brings us to the third significant feature of Habermas’s critical social theory: the problem of normative grounds. In the introduction to Theory of Communicative Action Habermas describes his project as “not a meta-theory but the beginning of a social theory concerned to validate its own critical standards” (Schnädelbach 1986 [1991: 8]), inviting the contrast with the critical “social theory” developed in the mid twentieth century by Horkheimer and Adorno, which, according to Habermas’s criticism, had foundered on “the difficulty of giving an account of its own normative foundations” (1981 [1984a: 374]). He also criticizes systems theory for failing to see the primacy and fragility of the lifeworld, its function of embedding and facilitating all spheres of action, and its crucial role in producing and reproducing social order. Systems theory conflates system and social integration and “deprives itself of the standard of communicative rationality” (1981 [1987: 186]).

This has led commentators to assume that the idea of communicative action, together with the theory that in modern society socialization takes place through communicative action, provides Habermas with an account of normative foundations, of the kind he claimed was missing from earlier Frankfurt School theory. The trouble is that it is unclear to what extent this is a functionalist theory, supported by mainly empirical claims, or a normative one; and, if normative, in what sense. As Herbert Schnädelbach was among the first to point out, Habermas’s approach of rationally reconstructing the communicative infrastructure of modern societies does not obviously fit the bill here. Functionalist explanation is not normative justification (Schnädelbach 1986 [1991: 21]). This also applies to his reconstructed theory of reification (Jütten 2011). It is clear that Habermas’s is a normative theory, to the extent that if his diagnosis is correct, the “damage” done by colonization is not only social and structural: individuals too are harmed and suffer as a result of it, and their legitimate expectations are disappointed. This is the case whether or not they are “wronged”. The problem can be put as a dilemma. Either Habermas’s social theory is genuinely critical, in which case it judges that a colonized lifeworld is bad and ought to be changed (and such judgement requires substantive normative premises), or it stops short of such a judgement, and is hence not properly critical.

Maeve Cooke argues that Habermas’s theory contains implicitly a “utopian promise” of a rationalized lifeworld, that is “reproduced through processes of intersubjective evaluation of validity claims” (Cooke 1994: 162), so that substantial ethical reasons ground its normative claims. Other commentators, such as Honneth, assume that discourse ethics provides the account of its normative foundations that, according to Habermas, earlier critical theory stood in need of (Honneth 1985 [1991: 282]). However, there are numerous problems with this view. One is that the normative claims of critical theory stand in need of substantive moral reasons, rather than accounts of those reasons (Finlayson 2013a), and as Schnädelbach points out it is not clear that Habermas’s rational reconstruction of communication as discourse can provide such reasons.

3. Discourse Ethics

Discourse ethics was developed contemporaneously with the theory of communicative action and fits into roughly the same framework. Discourse ethics comprises a number of different interlocking theories:

  • the lineaments of a social theory of morality
  • a normative moral theory
  • a philosophical justification of the moral standpoint (the principle of universalization)
  • a theory of the development of moral consciousness

Unlike the Theory of Communicative Action there is no single work in which Habermas’s discourse ethics is given a settled and canonical statement. Instead, it is an evolving research programme, presented in a series of essays most of which are contained in two volumes published in English as Moral Consciousness and Communicative Action (1990a) and Justification and Application (1993). The development of Habermas’s discourse ethics falls into two phases, roughly 1983–1990 and 1991–1996. Broadly speaking, the essays in Moral Consciousness and Communicative Action comprise phase one, and those from Justification and Application through to Between Facts and Norms, phase two. We will call the theory developed in phase one “discourse ethics”, and that in phase two, “discourse morality”. For in the beginning of the 1990s Habermas introduces a distinction between morality as a normative theory of rightness, and ethics as a theory of the good, in which light, he himself later notes, to be accurate he should have renamed his theory “the discourse theory of morality” (1993: 1; 1991: 7).

3.1 Discourse

Discourse is not a synonym for language or speech. Only speech that is explicitly oriented towards reaching rationally motivated consensus counts as discourse (1981 [1984a: 42]). In other words, discourse is communication, for Habermas stipulates that communication just is speech oriented towards reaching understanding. But discourse is communication of a special kind. When communication breaks down and the everyday shared meanings and understandings that normally coordinate interactions are disrupted, interlocutors have to switch from action to discourse. Discourses are a “reflective form of agreement-oriented action that …sit on top of the latter” (1986a [1990b: 245–6]). Discourse is a higher order of communication, with the aim of renewing or replacing a problematized consensus. To participate in discourse just is to provide reasons with the aim of convincing all interlocutors to accept a disputed validity claim. So, on Habermas’s view, discourse just is the language game of argumentation in which disputed validity claims are “redeemed”, and when a validity claim is redeemed successfully the disrupted consensus is restored, renewed, or replaced by a new one.

On this picture, argumentation is a rule-governed practice, not a verbal free-for-all. Habermas identifies three levels of rules. First, there are the basic rules of logic such as the principle of contradiction, and the basic semantic rules of universalizability and consistency (1983 [1990: 86]). Second, there are procedural norms such as the principles of sincerity and accountability, namely that every participant must undertake, if only implicitly, to assert only what she genuinely believes and always either to justify upon request what she asserts or to provide reasons for not offering a justification. These are, contends Habermas, preconditions for all genuine argumentation, i.e., argumentation conceived as a “search for truth” and organized like “a competition for better arguments” (1983 [1990: 87–8]). Third, there are the processual preconditions that immunize discourse against coercion, repression, and inequality. These norms are supposed to insulate discourse from all persuasive forces except the “unforced force of the better argument”, and they must be followed, if a rationally motivated consensus is to be reached.

Habermas suggests that the following rules of discourse can be established:

(1) Every subject with the competence to speak and act is allowed to take part in a discourse.
(2) a. Everyone is allowed to question any assertion whatever.
    b. Everyone is allowed to introduce any assertion whatever into the discourse.
    c. Everyone is allowed to express his attitudes, desires, and needs.
(3) No speaker may be prevented, by internal or external coercion, from exercising his rights as laid down in (1) and (2) (1983 [1990: 89]; Rehg 1994: 62).

The above list is not intended to be complete, and Habermas nowhere provides a complete list. The rules are a representative sample borrowed from Robert Alexy’s Theory of Practical Discourse (Alexy 1978 [1990: 165–7]).

3.2 Performative Self-Contradiction and Transcendental Pragmatic Justification

Habermas assumes that the rules (1), (2) a–c, and (3) above can be identified by a test of performative self-contradiction. A performative contradiction arises when a rule that speakers pragmatically invoke by the illocutionary act of, say, assertion, is contradicted by the semantic content of that assertion. An example is Moore’s paradox: “It is raining but I don’t believe it” (Moore 1993). Habermas maintains that the test of whether the denial of a rule yields a performative contradiction does not justify rules of discourse so much as identify them (1983 [1990: 95]). On that point, he differs from his colleague Karl-Otto Apel, who in his seminal 1973 paper attempts what he calls a transcendental-pragmatic, “ultimate” justification of the norms of a minimal ethics, by showing how they follow directly from the pragmatic preconditions of communication (Apel 1976 [1980]). According to Christoph Lumer, Habermas made the same attempt in the original manuscript of Moral Consciousness and Communicative Action, though he abandoned it in subsequent versions (Lumer 1997: 50). Later, as Heath (2014: 843) notes, Habermas denies that his principle of universalizability (U) (see §3.3 below) can be justified through a “performative contradiction argument”, not only because applying the test of performative contradiction to a rule is in his view heuristic, not justificatory, but also because, he thinks, no such principles, and certainly no substantive moral norms, follow directly from those rules.

3.3 The Principles of Discourse Ethics and their Justification

The two principles of discourse ethics are principle (D) and principle (U), the principle of universalizability. In phase one, Habermas formulates (D) as follows:

Only those norms can claim to be valid that meet (or could meet) with the approval of all affected in their capacity as participants in a practical discourse (1983 [1990: 66]).

He initially calls (D) “the principle of discourse ethics” (1983 [1990: 66]). It takes the form of a straightforward validity-to-consensus (or possible consensus) conditional. The scope of the consensus is “all those affected” by the norms. As such it is universal among participants in practical discourse, namely among all agents.

In addition, there is principle (U)—the principle of universalization, which spells out the “criterion for generalizing maxims of action” (1983 [1990: 62]). Every valid norm has to fulfil the following condition:

All affected can accept the consequences and the side effects its general observance can be anticipated to have for the satisfaction of everyone’s interests (and these consequences are preferred to those of known alternative possibilities) (1983 [1990: 65]).

Habermas first introduces (U) as “a rule of argument that makes agreement in practical discourses possible” (1983 [1990: 66]). In phase one he begins with a justification of (U), and then infers (D) (1983 [1990: 93]). The initial idea was that (U) was to be given a “transcendental pragmatic” derivation from the rules of discourse, and (D) inferred from (U). The nature of the putative transition between (U) and (D) however remains unclear (Lumer 1997: 49), and Habermas has not succeeded in clarifying it.

In the 1983 programme, after having distanced himself from Apel’s programme, Habermas proposes to justify (U) by deriving it logically from two premises: (1) the rules of discourse and (2) “the idea of the justification of norms” or “a weak idea of normative justification” (1983 [1990: 92, 97]). Later, in phase two, Habermas reverses the order of justification, and infers (U), now designated the “moral principle”, from (D). At the same time, he weakens (D) by making it apply generally to all norms of action whether moral or not, and labels it “the discourse principle”:

Only those action norms are valid to which all possibly affected persons could agree as participants in rational discourse (1992b [1996b: 107]).

(D) now merely “expresses the meaning of post-conventional requirements of justification” (1992b [1996b: 107]), and is given along with premise (2). It supposedly merely explicates “the point of view from which action norms can be impartially grounded” (1992b [1996b: 109]). While Habermas weakens (D), he strengthens (U), which now reads:

a norm is valid if and only if the foreseeable consequences and side effects of its general observance for the interests and value-orientations of each individual could be freely accepted jointly by all concerned. (1996a essay 1: 60 [1998a: 42 translation amended])

Neither Habermas’s followers, with the possible exception of Rehg (1991 & 1994), nor any of his critics, think that this logical derivation of (U) goes through. Gradually, Habermas backs away from the claim that (U) can be given a logical or formal derivation. He presents discourse ethics as a “programme” of philosophical justification and keeps adding stronger premises to that programme (Lumer 1997: 53; Habermas 1983 [1990: 43]). According to Kettner the fact that in phase two (D) is no longer the principle of discourse ethics, but construed as “still neutral” with regard to morality and law, together with the fact that (U) is now the moral principle, means that the grounding of discourse ethics is normatively incomplete, and this effectively amounts to the “disappearance” of the original project, which was supposed to show that the moral principle and the moral point of view fall out of the success conditions for communicative action (Kettner 2002: 207, 211–12).

To sidestep the absence (or failure) of a logical derivation of (U), Habermas presents (U) provisionally as an abductive hypothesis (1996a essay 1: 60 [1998a: 42]), where abduction is an inference to the best explanation. Ott makes an attempt to justify (D) and (U) as a conjunction of pragmatic implications (Ott 1996: 42–45), while Finlayson claims that (U) is best thought of as an inference to the best explanation, and both agree that it rests on supplementary premises drawn from Habermas’s modernization theory (Finlayson 2000; Ott 1996). This raises the question of what kind of justification of the moral standpoint Habermas has in mind. In Moral Consciousness and Communicative Action he presents discourse ethics as a defence of a cognitivist account of morality against the moral sceptic on grounds that even the sceptic must accept (1983 [1990: 76ff]). The philosophical programme of justification looks like a constructivist (and what Gunnarsson calls a “rationalist”) argument, invoking slender premises and reaching thick normative conclusions, although these normative conclusions are not substantial moral norms, but a moral principle of universalization, and an account of the moral point of view, in which moral agents (participants in discourse) collectively validate substantial moral norms (Gunnarsson 2000: 99, see also Rees 2020: 678–81, 689–91). Gunnarsson argues that all Habermas’s rationalist arguments for (U) fail. Benhabib is also of this view. In an early critique she argued that (U) was redundant, and that (D) should be given a “weak justification programme” which adduced supplementary moral premises of “universal respect” and “egalitarian reciprocity” (Benhabib 1986: 308; 1992: 31). Habermas’s tendency to add ever stronger supplementary premises for his justification of (U) is a tacit acknowledgement of his failure to provide either a formal derivation or a “rationalist” justification of (U).
In doing so he moves away from his original idea of providing a philosophical justification of (U), towards a more modest explication and reconstruction of the moral point of view, which can be corroborated by insights drawn from an array of disciplines, including the moral psychology of Lawrence Kohlberg.

3.4 (U) and Kant’s Categorical Imperative

Principles (D) and (U) are supposed to capture the practice of universalization in ethics, but in a social sense that differs from apparently similar procedures in Rawls, whom Habermas takes to be working in the Kantian tradition of political philosophy. The differences are as follows. (U) is not presented as a moral norm: it is not formulated as an imperative. Initially, (U) is presented as “a bridging principle” on analogy with the principle of induction, except that (U) tests whether norms are amenable to consensus among all affected, given their interests. (U) is not a hypothetical test for generating moral norms. It rationally reconstructs an actual practice by which real moral agents, as participants in an actual discourse, themselves ascertain the validity of moral norms, or institute them. Finally, whereas, according to Habermas, the test of universalization contained in Kant’s first formulation of the Categorical Imperative is “monological”, since it can be performed by each person individually, the procedure captured by (U) is “dialogical” since it must involve other people acting together. As McCarthy puts it:

Rather than ascribing as valid to all others any maxim that I can will to be a universal law, I must submit my maxim to all others for the purposes of discursively testing its claim to universality. The emphasis shifts from what each can will without contradiction to be a general law, to what all can will in agreement to be a universal norm. (1983 [1990: 67], Habermas quoting McCarthy 1978: 326)

Moreover, on the Kantian conception a maxim is adopted in virtue of its universal form, whereas on Habermas’s conception it is not just the form of the maxim that is universalized. It is in part the content of the norm that is universalized. This is why (U) refers to the “interests” of everyone affected by it. But on Habermas’s view universalization is thought of as a social process that extends to the very self-conception of the moral agent. Habermas’s guiding idea here is G. H. Mead’s notion of “ideal role taking” (1988b: chapter 3 [1992a]; 1983 [1990: 56–68, 121]), in which moral agents learn by projecting themselves into the position of all other moral agents. This is a process whereby agents consider “every interest involved” and evaluate what is “good for everyone under the same conditions”, a process that, according to Mead, leads “to the development of a larger self, which can be identified with the interests of others” (Mead 1934 [1962: 363]).

3.5 The Return of Habermas’s Kantianism and the Morality/Ethics Distinction

Habermas thus develops a distinctively social and intersubjective version of ethics, different from Kant’s. But certain developments in phase two push back toward a Kantian conception of discourse ethics (Heath 2014: 846). In Between Facts and Norms, (U) is said to allow participants to agree on norms that are impartial, and that have “categorial validity” (1992b [1996b: 28]). And Habermas appears to hold, like Kant, that the rational will is the source of moral obligation (1992b [1996b: 110]; Kettner 2002: 212). Furthermore, the moral domain, which in phase one encompassed the entire domain of action norms and values, later “shrinks” to a domain of thinner, universal moral norms with characteristic overridingness and stringency (1991a chapter 6, 202 [1993: 91]).

It is primarily Habermas’s introduction of a tripartite typology of practical reason—the ethical, moral, and pragmatic employments of practical reason—that drives his return to Kant in Between Facts and Norms. In particular Habermas introduces a notion of ethical discourse that differs from moral discourse in that it addresses clinical questions such as “What is the good life (for me, or for us)?” and that is restricted in scope to communities with shared values. Ethical discourses differ, he claims, from moral discourses which deal with questions of justice, understood as what is equally in the interests of all (1983 [1990: 180]; 1991a chapter 5, 101–105 [1993: chapter 1, 3–8]; 1996a essay 1: 39–40 [1998a: 26–7]). He understands “justice” not as a political value, as the later Rawls does, but as the supreme moral value, and the designated value of moral discourse, on analogy with truth for theoretical discourse. Although he allows that many questions can be addressed by both ethical and moral discourses, he draws a sharp distinction between them and accords priority to morality over ethics (1983 [1990: 104]; 1991a chapter 5, 100–119 [1993: chapter 1, 1–17]). Ethical discourses always take place within the bounds of moral permissibility, and on Habermas’s account no moral norm can be weighed against, or trumped by, any ethical value. Along with various communitarians, Charles Taylor and Martin Seel both criticize Habermas on this point and defend the idea that the good (albeit construed in different ways) has priority over the morally right (Taylor 1989 [1992]; Seel 1995).

The ethics/morality distinction has also been a target of much criticism. Benhabib (1986), McCarthy (1991: chapter 7, 181–200), Putnam (2002), and Kettner (2002), among others, have shown that Habermas cannot and does not maintain a sharp distinction between the two. The main reason they give is that the notion of value that Habermas associates with ethics bleeds into his conception of a valid moral norm. Indeed, Habermas himself acknowledges the existence of what he calls a “hidden link” between justice and the common good (1983 [1990: 202]), and “the remnant of the good at the core of the right” (“das Gute im Gerechten”) (1996a essay 1: 43 [1998a: 29]). And he appears to think of this relation not as a threat to his conception of morality but as a feature of it, namely that it rests on an underlying web of solidary relations between human beings. Habermas claims that morality arose from this web of solidarity as a process of generalization and universalization in the course of modernization, and, he claims, although solidarity remains as the other side of justice (qua moral rightness), this does not smudge the bright line between ethics and morality (Habermas 1986a [1990b]).

Kettner objects that Habermas’s typology is dogmatic, established by “terminological fiat”, and that it should not be seen as an insight into the nature of practical reason (Kettner 2002: 208). Few critics are prepared to defend Habermas’s distinction, with the notable exception of Rainer Forst (Forst 2007 [2011: 60–79]). In the final analysis, Habermas settles on the view of discourse ethics in general, and (U) in particular, as a reconstruction of an alternative to Kant’s first formula of the Categorical Imperative, one that is allegedly superior to the latter in virtue of being dialogical rather than monological, and in virtue of reconstructing an actual practice involving real moral agents.

3.6 Dialogical vs. Monological Morality

Much depends on Habermas’s articulation and defence of the distinction between a dialogical and a monological ethics, and his argument for the superiority of the former, which he makes with vehemence and conviction both in his critical engagement with Lawrence Kohlberg and later in his debate with Rawls. Nonetheless, critics have argued that the putative distinction between dialogical and monological ethics is one without a difference, and have cast doubt on Habermas’s argument for the cognitive superiority of dialogical ethics. McMahon argues that (U) is ambiguous between two very different conceptions of dialogicality. Weak dialogicality consists in the concurrence of independent judgements about which norm would satisfy the interest of all concerned, a concurrence that is put together in piecemeal fashion (McMahon 2000: 521). Strong dialogicality, by contrast, involves a collective, joint judgement by all affected. McMahon claims that Habermas, and Rehg (1991 & 1994), whose interpretation of (U) Habermas cites approvingly (1992b [1996b: 109]), endorse the strong conception. But the strong conception, which requires that each participant in discourse suspend judgement until all others have cast their votes, is, so to speak, incoherent. For one thing, it robs each participant of any reason to judge from their own perspective whether a norm is valid. For another, it renders deviation from a norm impossible, because as soon as an agent deviates from a norm, it is no longer valid, and cannot be criticized as mistaken (McMahon 2000: 529).

Another reason to think that the distinction is without a difference is that Habermas allows participants in an actual discourse to conduct advocatory discourses whereby they imaginatively project themselves into other people’s points of view. He must allow this because it would be impossible to conduct an actual discourse with “all affected” by a norm. So Habermas must allow that the constituency of all participants in an actual discourse might be small, at the limit two people, and that the constituency of “all affected” by a norm might consist of every moral agent and patient including the unborn. But with that degree of idealization, the difference between an ideal discourse conducted monologically and an actual but advocatory dialogue between real people has all but disappeared, and the cognitive superiority of the latter cannot be maintained.

3.7 Discourse Ethics and Critical Social Theory

Habermas claimed in Theory of Communicative Action that Adorno and Horkheimer’s critical theory failed to provide an adequate account “of its own normative foundations” (1981 [1984a: 374]) and that his own project was, by contrast, the “beginning of a social theory concerned to validate its own critical standards” (1981 [1984a: xxxix]).

In Legitimation Crisis Habermas had explored the idea that critical social theory could find its point d’appui in “suppressed generalizable interests”, namely the unrealized rational potentials of modern society (1973a [1975: 111]). As we saw, and Habermas himself admits, he did not develop this approach in Theory of Communicative Action. He claimed that critical theory must refrain from “critically evaluating … forms of life and cultures … as a whole” and focused instead on investigating the way in which communicative pathologies arising from the colonization of the lifeworld hinder “learning potentials” (1981 [1984a: 383]). Critics such as Schnädelbach and Taylor countered that such an approach would only explicate critical or rational potentials, but never justify normative criticisms of society (Schnädelbach 1986 [1991]; Taylor 1986 [1991]). Recall that (U) is a moral principle that is supposed to validate norms containing generalizable interests. Some of Habermas’s supporters in the 1980s, for example, Honneth and Benhabib, argued that discourse ethics could serve as the account of the normative foundations that Habermas argued first generation critical theory was missing (Honneth 1985 [1991: 286]; Benhabib 1986: 279–81).

The trouble is that Habermas’s social theory does not set out to criticize society in virtue of its degree of conformity to principle (U) or to any substantive moral principle. Habermas continues to deny that this is the proper approach of social theory. He explicitly rebukes Rawls for thinking that the theorist can criticize society by first designing “the basic norms of a well-ordered society on the drafting table” and then checking society against it (1990d [1994: 101]; 1996a essay 3: 122 [1998a: 97]). That approach arrogates the task of criticism to the philosopher, instead of social agents and citizens. Besides which, the problem, as Schnädelbach understands it, is that social criticism proper requires normative judgements, which must be supported by normative reasons, while discourse ethics limits itself to the clarification of the moral point of view, and (U) is a procedural principle that leaves such judgements up to participants in discourse, and cannot furnish the requisite reasons (Finlayson 2013b).

3.8 Other Criticisms

In the 1980s and 1990s a number of feminist critics, inspired by Carol Gilligan’s critique of Kohlberg, In a Different Voice, developed two significant lines of argument against Kantian moral theory and Rawlsian liberalism. The first was that the “liberal” conception of the moral self is formal, abstract, and gender-neutral. The second was that the moral self is a male self masquerading as neutral and universal. The suggestion is that morality/moral theory is complicit with discrimination against women and patriarchal oppression. They aimed a similar criticism at Kantian, Rawlsian, and Kohlbergian conceptions of “the moral standpoint” (Benhabib & Cornell [eds] 1987a; Benhabib 1992; Meehan [ed.] 1995). Among these feminist critics, some, e.g., Benhabib and Meehan, were initially supportive of discourse ethics because of its emphasis on including other voices, and its insistence that its ideals must be won from the reconstruction of actual practices of real agents. Later, their arguments were recalibrated and turned against discourse ethics. Because the moral self is conceived as formal and abstract, and because the moral point of view is characterized by the formal criteria of universalizability and reversibility, Habermas’s discourse ethics cannot accommodate the kind of moral experiences that, Gilligan maintains, are characteristic of women: it is blind to the importance of considerations of care and responsibility for others. This leads, Benhabib claims, to a privatisation, personalisation, and devaluation of women’s moral experiences (Benhabib & Cornell 1987a: 7–9; Benhabib 1992: 152, 184). She argues further that “the restriction of the moral domain to questions of justice” results in “the privatization of women’s experience and leads to epistemological blindness towards the concrete other” (Benhabib 1992: 164). There is some doubt whether Benhabib’s position amounts to a refutation of Habermas’s theory or an explication of it.
The universalization procedure imposed by discourse involves participants in moral discourse imaginatively switching perspectives with concrete other people. If it were not so, if “ideal role taking” demanded only that one examined candidate norms from the perspective of others generally conceived, there would be nothing to gain from such a procedure (Finlayson 2013a). It is also unclear whether such criticisms apply to Habermas’s moral theory, or to the actually existing Kohlbergian “Stage 6” morality he takes as its object. Finally, it is unclear whether the criticism shows merely that discourse ethics is a flawed moral theory, or whether it shows that, by dint of these flaws, the theory (or the morality it theorizes) perpetuates discrimination or oppression against women.

Finally, discourse ethics together with the theory of modernity in Theory of Communicative Action has been criticized from an anti-colonial perspective. The most sustained and detailed criticism comes from Amy Allen. Noting that Habermas’s argument for principles (D) and (U) rests on assumptions drawn from modernization theory, she brings the charge of Eurocentrism. Furthermore, building on criticisms by Dussel (1993) and Bhambra (2011) among others, she argues that Habermas’s discourse ethics is wedded to a theory of social evolution and the idea of modernization as a learning process, which commits him to a “progressive view of history” (Allen 2016: 72–3). In spite of his attempt to deflate the presuppositions of the philosophy of history that weighed on Western Marxism, Allen and Dussel argue that Habermas’s theory is freighted with a dogmatic Hegelian universalism that in the final analysis asserts the developmental superiority of European modernity.

4. The Discourse Theory of Law and Democracy: Between Facts and Norms

Between Facts and Norms is a legal and political theory focused on the ways in which constitutional democracies produce and institutionalize democratically legitimate law. As Habermas wrote and researched the book in the late 1980s and early 1990s, political theory was in the grip of the debate between liberalism of a Rawlsian or Kantian stripe, and communitarianism, or, as Habermas preferred to call it, Neo-Aristotelianism. In Between Facts and Norms he attempts to mediate between these two opposed approaches to political theory. He does this by setting out and defending the thesis of the equiprimordiality (or co-originality) of liberal rights and popular sovereignty, and their correlative values of individual and collective autonomy.

One of the basic ideas in Between Facts and Norms is that the rule of law, and indeed (though this is implied, not stated) legitimate liberal constitutional states, cannot exist without radical democracy (1992b [1996b: xlii]). But at the same time radical democracy has to be made compatible with the exigencies of large-scale administrative and bureaucratic states organized through law. Habermas also tries to locate and defend the common ground between legal positivists, like Austin, and normativists like Dworkin and Rawls. He does this by identifying and rationally reconstructing the “normative self-understanding” of the legal system as embodied in

particles and fragments of an “existing reason” already incorporated in existing practices. (1992b [1996b: 287])

Habermas does not oppose the ideal, but attempts to reconstruct idealizations embodied in existing practices, and set these in play without relying on the optimistic providential assumptions of the philosophy of history. The practices he has in mind are legislative practices, and the “particles of existing reason” consist in the various ways in which discourse is built into those practices. In short Habermas reconstructs and describes the ways in which discourse is institutionalized by political and legal systems. Thus his approach is a hybrid between the sociology of law and jurisprudence, and normative political philosophy. The approach is captured by the title Faktizität und Geltung (literally “Facticity and Validity”), or Between Facts and Norms where the “between” designates a complex set of interrelations, rather than a middle ground.

4.1 The Two-Tracked Theory of Democracy

Habermas’s conception of democracy has been called a “two track” theory (Baynes 1995). That said, it is more like a “two complex” conception, since the “tracks” in question designate the formal and informal public spheres and each of these is a complex. At the centre is the parliamentary complex, not only parliament but the administrative and judicial bodies accompanying it. Parliament itself is a formal “public” forum that is legally established and organized to take decisions (1992b [1996b: 355]).

It is surrounded by and embedded in an “informal” public sphere, an “open and inclusive network” of various kinds of discourse (moral, ethical, and pragmatic) that form “a ‘wild’ complex” that is not formally organized, even though each form of discourse has its own internal discipline (1992b [1996b: 307]). (Habermas also calls the informal public sphere “civil society” to indicate that it is not legally or politically regulated.) When deliberative democracy works as it should, discourse and its outputs (moral norms, values, and more broadly public opinion) percolate into the parliamentary complex through a system of sluices and channels (1992b [1996b: 355]). These inputs are then worked up in parliamentary discussion and debate and eventually embodied in the form of laws and policies and returned as outputs into society at large. As they have been shaped by public opinion and shared moral values, and worked up through debate, they find acceptance by citizens on the basis of the reasons they embody. That is how, when things go well, according to Habermas’s theory, legitimate law is produced. Roughly, this is Habermas’s account of the production of the “validity” of modern law.

This model can also be thought of in terms of the circulation of communicative power from periphery to centre, whereby the unregulated flows of communication and discourse in civil society lay siege to the political system, “without, however, intending to conquer it” (1992b [1996b: 487]) but in a manner that allows them to influence judgement and decisions in the political system. The model is supposed to explain how the embers of radical democracy can stay aflame within a modern bureaucratic state. Earlier models of popular sovereignty presuppose that society is “an association writ large” or a macro-subject with a sovereign will: citizens are actually authors of the very laws to which they must submit, and hence encounter these as an expression of, not a constraint on, their autonomy. But such a model would only work, if at all, in small-scale, ethically homogeneous societies, with a very high degree of popular participation, none of which applies to modern Western states (1992b [1996b: 102–3]). By contrast, Habermas’s discourse theory offers a picture where members of civil society can, through participation in discourse, help shape public opinion, which through the circulation of communicative power from periphery to centre can indirectly “program” or “counter-steer” the political system (1992b [1996b: 372; 332]) by means of laws and policies that are “in the equal interest of all” (1992b [1996b: 98; 154]). In this way, he claims, the political autonomy of legal persons is ensured since they

can at the same time understand themselves as authors of the law to which they are subject as addressees. (1992b [1996b: 408 187])

William E. Scheuerman’s objection is pertinent here. On Habermas’s view, law acts as a “transformer” between the communicative power circulating in civil society and the administrative power of the legal and political systems, and it has to do so if it is to facilitate social integration. However, this view is incompatible with Habermas’s earlier conception of system and lifeworld, according to which the former, which includes the legal and administrative system, remains a “block of more or less norm free sociality” (1981 [1987: 171]). Scheuerman also points out that, not only is it improbable that communicative power can be transformed into administrative power and counter-steer the political system, but Habermas gives no detail about where this interface lies, institutionally speaking, or how it operates.

A more radical objection stems from the direction of the Rochester School, whose members consider that the very idea of a “common good” or “the equal interests of all” that might be served by law is an illusion (Riker 1982). Other realist theorists of democracy, e.g., Luhmann, deny that public reasons flowing into the political system from civil society can and do steer political decisions (Luhmann 1969).

4.2 The Co-Originality Thesis

Habermas’s theory asserts what he calls a co-originality thesis, between various pairs of political ideas: between the system of rights and the principle of democracy; between private/individual and public/political/civic autonomy; between individual rights that secure the former, and popular sovereignty that is the expression of the latter. Co-originality means that both enjoy equal priority, and that neither is reducible to the other: they reciprocally call one another into being. The co-originality relations are revealed when the idea of self-legislation, namely “that the addressees of law are simultaneously authors of their rights” is decoded in discourse-theoretical terms (1992b [1996b: 104, 314, 409]).

The co-originality thesis has both architectonic significance for Habermas’s theory and substantive implications. The architectonic implication is that both the principle of democracy, and the system of rights are derived independently of principle (U). One substantive implication is, as Ingeborg Maus argues (Maus 1996 [2002: 90–98]), that

the circular process in which the … legal form, and … the democratic principle—are co-originally constituted (1992b [1996b: 122])

indicates that basic rights are called into being by the democratic process and vice versa, in such a way that neither depends on, or is externally constrained by, an antecedently existing order of moral rights (pace Larmore 1995) or an ethical form of life (pace Bernstein 1996 [1998] & Michelman 1998). The co-originality thesis thus expresses Habermas’s view of the autonomy of the political (and legal) domain from the moral, and the sui generis nature of political legitimacy.

4.3 The Principle of Democracy

The keystone of Habermas’s political theory is the principle of democracy that states:

Only those statutes may claim legitimacy that can meet with the assent of all citizens in a discursive process of legislation that in turn has been legally constituted. (1992b [1996b: 110])

It also takes the form of a validity to consensus conditional, though democratic discourse is inclusive of ethical, moral, and pragmatic reasons, and indeed fair compromises, so that consensus is a messy and imperfect affair. This makes the process of reaching discursive agreement far more difficult, when one considers that many deeply held ethical values are limited to particular cultural communities, though this difficulty is supposed to be mitigated by the mediation of the legislative process.

A central contention of Between Facts and Norms is that what Habermas calls the principle of democracy “derives” from the interpenetration of principle (D) and the legal form (1992b [1996b: 122–3]). Recall that in phase two principle (D) is supposed to be a rule of practical argumentation in general that is neutral with respect to morality and law (1992b [1996b: 107]). In this respect, Habermas claims, as in respect of the circular process of reciprocal co-original constitution, the principle of democracy is “morally freestanding” (1992b [1996b: 80]). That is, the principle of democracy is derived completely independently of the moral principle (Finlayson 2019: 94). Here again Habermas insists on the autonomy of the democratic political process, and claims that the democratic procedure of the production of law is the sole source of its legitimacy, and he criticizes the views of Rawls, Dworkin, Larmore, and Apel, all of whom, in various different ways, claim that the legitimacy (or validity) of law is borrowed from that of morality.

One difficulty facing Habermas is to square his claims that (a) democratic discourse is an amalgam of all three kinds of discourse, and (b) that political legitimacy is sui generis and both ethically and “morally freestanding”, with (c) the absolute priority of morality in all spheres. He claims that legitimate laws “must harmonize with the universal principles of justice” (1992b [1996b: 99, 155]) and that legitimate laws must not “contradict basic moral principles” (1992b [1996b: 106]), by which he means valid moral norms. He places a moral permissibility constraint on political legitimacy, in such a way that it appears that morality constrains political legitimacy from the outside (Finlayson 2016). Habermas does not, though, see this as an “external” constraint, since he argues that morality flows into the political and legal domain through the constitutional role of basic rights, and circulates within it. Nevertheless, the moral permissibility constraint smudges the bright line that Habermas likes to draw between what he calls natural law theories of legitimacy, which are based on an antecedent morality, and discourse theory, which is not.

4.4 The System of Rights

Habermas argues that, alongside the principle of democracy, what he calls a “logical genesis of rights” arises from the “interpenetration” of the legal form and the discourse principle (D) (1992b [1996b: 121]). The argument is hard to follow. It begins from the premises of (D) and the form of modern law, and assumes that the idea of legitimate law presupposes that of a legal subject qua bearer of rights, no matter what the content of those specific rights is. The conclusion to the argument is a system of rights, of five different kinds.

  1. Basic rights to the greatest possible measure of equal individual liberties.
  2. Basic rights to membership in a voluntary association of consociates under law.
  3. Basic rights to the actionability of rights arising from the legal protection of rights-holders.
  4. Basic rights to the equal opportunity to participate in the processes of political will formation and the production of legitimate law.
  5. Basic rights to living conditions that are socially, technologically, and ecologically safeguarded, insofar as this is necessary for citizens to exercise their civil rights 1–4 (1996b: 123–4).

The first three rights are supposed to arise theoretically from the application of the discourse principle to the form of law. These are rights that citizens must grant to one another if they are “legitimately to regulate their living together by means of positive law” (1992b [1996b: 126; 82; 118]). The next two—political and social rights—are practical and material enabling conditions that ensure the effectiveness of the first three rights. The first three rights, Habermas claims, are not specific rights, but what he calls “unsaturated placeholders” for specific rights that have to “be interpreted and given concrete shape” by actual citizens in response to determinate historical conditions (1992b [1996b: 125–6]). This is crucial to Habermas’s theory, because it purports to reconstruct the ability of citizens, from their perspective, to reciprocally grant one another the rights necessary for their common existence as consociates under law. That’s why he claims that he, unlike Rawls, doesn’t design “the basic norms of a well-ordered society on the drafting table”, and then apply them to society (1990d [1994: 101]). In that sense, just as discourse ethics leaves the validation of moral norms to participants in discourse, the discourse theory of law and democracy has to leave the political process of establishing a system of rights up to citizens themselves as much as possible. This is the sense in which Habermas claims the discourse theory of democratic legitimacy is “strictly procedural” and more modest than “normative political theory” à la Rawls (Habermas 1995: 117 & 132; Rawls 1995: 175–177). For all that, unlike in discourse ethics where neither (U) nor (D) have the status of valid moral norms, Habermas nonetheless derives a system of rights that for all the world resembles T. H. Marshall’s Whiggish account of civic, political, and social rights, in his classic work of political sociology (Marshall 1950).

4.5 Objections to Between Facts and Norms

Joshua Cohen objects that the principle of discourse does not amount to a requirement of equal liberty, and that nothing so rich as Habermas’s scheme of individual liberties follows solely from the application of the discourse principle to the legal form (Cohen 1999: 393, 398). He objects even while acknowledging that the various rights are not yet saturated: they are not yet specific, historically and socially determinate rights. But contra Cohen, on Habermas’s account, the legal form, or modern “form of law”, is a richer idea than the mere rule of law, and refers to a complex of features that law has in a modern constitutional democratic state. As Baynes and Zurn point out, Habermas’s theory reconstructs the way that, via the discourse principle, the form of law in modern (that is, post-traditional and post-conventional) societies functions to compensate for the loss of shared traditions, and relieves the burden on citizens to reach reasoned agreement with one another and thereby coordinate their actions (Baynes 2016: 166; Zurn 2011).

Some critics argue that Habermas is wrong to look for a justification of basic rights that is functionalist, or merely “internal to law”, one that sees them only as necessary conditions for the institutionalization of the democratic process, or one that is strongly constructivist, that begins from slender premises that eschew moral or ethical considerations (Forst 2011; Larmore 1995; Michelman 1998; Bernstein 1996; and Cohen 1999; cf. Flynn 2003). The upshot of such criticisms is that Habermas’s justification of the system of rights requires stronger normative support of one kind or another, and that political legitimacy is not entirely sui generis.

Rawls, Cohen, and Larmore argue in addition that Habermas’s political theory rests on what Rawls calls a “comprehensive doctrine” since it is based on a controversial theory of meaning and communication and a controversial doctrine of method (Rawls 1995: 139; Cohen 1999; Larmore 1995). However, there is an important difference between comprehensive philosophical doctrines and comprehensive moral, ethical or religious doctrines. The fact that a normative political theory has controversial philosophical assumptions, which almost all do, does not create the kind of practical problems that arise when a political system, or constitution, is saddled with controversial moral or religious assumptions, and its citizens cannot regard it as legitimate (Lister 2007). To make that claim is to presuppose that political theory answers to the same canons of justification as political systems (Laden 2010).

5. Methodology and Philosophical Framework

In the transitional period of the 1970s when Habermas began his communicative turn, he developed various ideas about method that came to shape his mature work: for example, rational reconstruction (§5.1) as a method for critical social theory, postmetaphysical thinking (§5.2) as a framework for philosophy, and a set of related views about the proper role of philosophy.

5.1 Rational Reconstruction

Rational reconstruction is the method, and the label, for a cluster of methodological assumptions shaping the major philosophical projects of Habermas’s middle period: the theory of communicative action, discourse ethics, and the discourse theory of law and democracy. He originally developed it as part of an attempt to explain social phenomena and to recalibrate critical social theory on the basis of formal pragmatics.

Rational reconstruction is an approach that Habermas developed on the model of Noam Chomsky’s universal grammar (1976b [1998b 1: 35]), Jean Piaget’s developmental psychology, and Lawrence Kohlberg’s moral psychology (1983 [1990: 33–41]). These are theories that reconstruct universal human capacities—for language acquisition, cognitive development, and moral reasoning, respectively. Habermas’s use of rational reconstruction aims to set out the structures, rules, and competences underlying lifeworld practices. The targets of the method may also be described as the idealizing, counterfactual commitments which participants in a practice must make, in order for the practice to be meaningful or rational for them (1999a [2003a: 85–6]; 2005b chapter 3 [2008a: 81–4]). To rationally reconstruct a practice is to turn the implicit “know how” of participants into explicit “know that” (1976b [1998b: 33, 34–5]). For example, rationally reconstructing the everyday practice of communication gives access not to the semantic content of the speaker’s particular utterances, which is already explicitly known, but the implicitly known rules which the speaker follows in successfully communicating (1976b [1998b: 33]). Habermas calls this “illocutionary” or “pragmatic” meaning.

These underlying structures are

brought to consciousness through the choice of suitable examples and counterexamples, through contrast and similarity relations, through translation, paraphrase and so on—that is, through a well thought out, maieutic method of interrogation. (1976b [1998b: 40])

They are revealed not as timeless constants, but as they have developed over time, with their internal developmental logics (Pedersen 2008: 463, 474–9). Habermas originally claimed that rational reconstruction uncovers knowledge of universal human capabilities, “species competences”, rather than the competences of particular groups or individuals (1976b [1998b: 34–5]; McCarthy 1991: 130–2). For example, rationally reconstructing the practice of everyday speech uncovers the rules of communicative action as such, not the grammar of a particular language. However, Habermas’s later description of the discourse theory of law and democracy as a rational reconstruction of “the self-understanding of modern legal orders” of democratic constitutional states (1992b [1996b: 82], emphasis removed), evidently a local phenomenon, suggests that he has since modified the scope of the reconstructive method. Commentators are divided on this point, with some distinguishing between an “empirical” variety of reconstruction on display in the Theory of Communicative Action, and a “normative” variety in Between Facts and Norms (Peters 1994: 119), and others arguing that the same methodology underlies both projects (Patberg 2014: 511–3; however, see Gaus 2013: 561). This tension is partly resolved if we remember that democratic law-making draws on general communicative competencies, and makes use of pragmatic, ethical, and moral discourses. The phenomenon is local, but the capacities involved are general.

In terms of his own work, formal pragmatics rationally reconstructs the communicative capacities possessed by all human beings, making them explicit in Habermas’s accounts of communicative action and the rules for redeeming validity claims (1976b [1998b: 22–4]). The discourse theory of morality does this for our capacity for engaging in moral discourse, formalizing this in the (D) and (U) principles (1983 [1990a: chapter 2: 37, chapter 4: 174–5]), while the discourse theory of law and democracy does the same for the practice of lawmaking in democratic constitutional states, formalizing this in the system of rights and the principle of democracy (1992b [1996b: 110–1, 118–24, 287]). Importantly, Habermas claims that the theories produced by these processes of rational reconstruction have the status of falsifiable hypotheses—they are not a priori since they are not necessary claims, although they are supposed to be “universal” in the sense that they are, at present, without alternatives. Habermas sometimes refers to them having a “weakly transcendental” status.

Whether or not formal pragmatics accurately describes the practice of human communication can only be decided a posteriori, by the future “success” or “failure” of the theory as an input in further empirical investigations (1976b [1998b: 39]; 1983 [1990: 32]), with Habermas suggesting coherence between theories as the criterion of success (1981 [1987: 399–400]; 1983 [1990a: chapter 2, 39]). Jørgen Pedersen has argued that it is still not clear what constitutes success and failure in this context, and thus not fully clear how rationally reconstructed theories can be tested (Pedersen 2008: 478–1). Karl-Otto Apel, similarly, questions what it would mean to falsify the “unavoidable presuppositions of argumentation” itself (Apel 2002: 19).

Habermas claims that knowledge of the underlying structures and competences acquired through rational reconstruction can then be used for the purposes of social critique. Supposedly, a version of a practice can be identified as pathological, or not fully rationalized, if it does not meet the counterfactual idealization which the practice presupposes (1983 [1990: 31–2]). Habermas thus identifies systematically distorted communication, invalid moral norms, and illegitimate laws as deficits in communicative action, moral discourse, and democratic lawmaking, respectively. Rational reconstruction allows us to critique these actual practices according to their own internal standards, rather than the critic’s arbitrarily chosen standards or the philosopher’s supposedly transcendental standards (1992b [1996b: 5]). Similarly, the developmental logic revealed by rational reconstruction can be used to evaluate processes of historical development as progressive examples of collective learning, or as regressive and pathological. As a method for explaining social phenomena, rational reconstruction is neither merely empirical nor hermeneutic; the principles it produces are supposed to ground a kind of social and political theory which is neither “ideal” nor “real”, but somewhere in between. Habermas’s claim that philosophy can find within traditions construed as learning processes a “standpoint of critical evaluation” (1996a essay 3 [1998a: 97]; 1992b [1996b: 5]) is certainly in the spirit of critical theory, but arguably in tension with his other claims that philosophy should “limit itself to the clarification of the moral point of view and the procedure of democratic legitimation” (Habermas 1995: 131; Finlayson 2019: 205).

5.2 Postmetaphysical Thinking

Postmetaphysical thinking, Habermas’s paradigm for modern philosophy, must be understood through contrast with its predecessor, metaphysics. He labels much of the history of Western philosophy as “metaphysics”, counting Parmenides, Plato, Plotinus, Augustine, Aquinas, Spinoza, Leibniz, and Hegel as metaphysical thinkers (1988b essay 2 [1992a: 12–13]; 1988b essay 3 [1992a: 29]). He sometimes distinguishes between metaphysics proper and the “philosophy of consciousness” or “philosophy of the subject” associated with the rationalism of Descartes and the idealism of Kant, Fichte, and Schelling (1988b essay 3 [1992a: 31]; 1988b essay 8 [1992a: 158–62]), though it is not clear whether this should be considered a separate paradigm (1988b essay 2 [1992a: 12–13]) or simply a late stage of metaphysics.

Among the characteristics of metaphysics are:

  • A conception of philosophy as the queen of the sciences, with its own unique method and form of knowledge, distinct from the natural and social sciences, which can yield special insights into the nature of reality and the meaning of life. Plato’s theory of forms is an excellent example of this, since Plato thinks that philosophy, with its dialectical method, can offer true knowledge (episteme) of the forms, superior to mere opinion (doxa) about the material world.
  • A substantive conception of reason as an Archimedean point from which the philosopher can observe reality as a whole. The metaphysical philosopher’s goal is to attain an observer’s perspective on reality, from which they can learn universal and necessary truths. Philosophy can thus act as the judge and arbiter of both science and culture (1983 [1990: 2–3]).
  • Idealism and identity thinking (1988b essay 3 [1992a: 29–31]). Metaphysics assumes that ideas are primary and the material world secondary, and that to grasp the underlying intellectual reality is to grasp the whole: “the structures of being themselves are what is laid hold of in knowledge” (1988b essay 2 [1992a: 13]). Again, Plato’s theory of forms is a prime example.
  • For philosophy of consciousness, the use of strong transcendental arguments to ground claims about the nature of reality on the individual subject’s self-knowledge. Habermas thinks that this turn to introspective self-knowledge as foundational took place as a result of the pressure which natural science was putting on metaphysics by the eighteenth century. It may no longer be plausible that the individual subject can grasp “the structures of being themselves” in thought, but they can at least grasp the structure of their own thoughts, and build on these foundations some certainty about the world. Kant’s transcendental unity of apperception (1988b essay 7 [1992a: 124–5]) and refutation of idealism in the First Critique are the clearest examples.

Habermas concedes that this is a stipulative definition of metaphysical philosophy, focused on idealism. Ancient materialism and scepticism, medieval nominalism, and modern empiricism do not fit the mould, but Habermas argues that they should be seen as “antimetaphysical countermovements” within the horizon of metaphysics (1988b essay 3 [1992a: 29]). He often characterises modern philosophical trends of which he is critical as covertly metaphysical. Habermas sees some trends as attempted breaks with metaphysics which remain trapped within the paradigm (Nietzsche, Heidegger, Derrida; 1981 [1987: 83–105, 131–160, 161–184]), others as still being mired in the philosophy of consciousness (Niklas Luhmann’s systems theory; 1981 [1987: 368–385]), and others again as deliberate attempts to return to the philosophy of consciousness (Dieter Henrich; 1988b essay 2 [1992a: 10–27]).

Postmetaphysical thinking begins with the first generation of post-Hegelian thinkers (Feuerbach, Marx, Kierkegaard) (1988b essay 3 [1992a: 39]), and includes pragmatists such as C.S. Peirce and G.H. Mead, speech-act theorists such as J.L. Austin and John Searle, the later Wittgenstein, and Karl-Otto Apel.

Among the characteristics of postmetaphysical thinking are:

  • Rational reconstruction as a method (1988b essay 3 [1998b: 38]). Since rational reconstruction is also used by many of the social sciences, philosophy is no longer seen as unique in its method and form of knowledge, but simply as one discipline among others. Postmetaphysical philosophy has no priority over the natural and social sciences, but neither is it subordinate to them. It opposes scientism, the idea that the natural sciences have unique authority and privileged access to the truth.

  • A “weak but not defeatistic (sic) concept of linguistically embodied reason” (1988b essay 7 [1992a: 142]). Postmetaphysical philosophy conceives of reason as being historically, socially, and linguistically situated, embedded in the intersubjective communicative processes of the lifeworld, rather than transcending them. The philosopher is a participant in these processes, not an outside observer. Postmetaphysical reason is immanent in the communicative practices of the lifeworld, and procedural (1988b essay 3 [1992a: 34–5]), rather than being the external Archimedean point of metaphysics. Habermas thus sees postmetaphysical thinking as having a detranscendentalized conception of reason.

  • The use of weak transcendental arguments, rather than the strong transcendental arguments of metaphysics (Yates 2011: 41–4). Benhabib defines strong transcendental arguments as ones which aim to

    (prove) the necessity and singularity of certain conditions without which some aspect of our world, conduct, and consciousness could not be what it is.

    Descartes’ cogito and Kant’s refutation of idealism are examples, aiming to prove the existence of the subject and of the external world. Weak transcendental arguments, in contrast, focus on lifeworld practices such as communicative action, moral discourse, and democratic deliberation (rather than experience as such), and

    demonstrate more modestly that certain conditions need to be fulfilled for us to judge those practices to be of a certain sort rather than of a different kind. (Benhabib 2002: 38)

    Unlike strong transcendental arguments, weak transcendental arguments are both a posteriori, since they are based on rationally reconstructed experience, and falsifiable, since they can be refuted by further empirical experience (1976b [1998b: 42]).

  • The ability to provide context-transcending validity, despite abandoning metaphysical philosophy’s search for universal necessary truths. Validity claims are the clearest example (1988b essay 7 [1992a: 142]; 1983 [1990: 19, 203]). Although they are always advanced from within a particular lifeworld context, they hold that a certain thing is morally right for the intersubjective world, or true for the objective world, as a whole. The claims raised are not just valid within the context of the interlocutors raising them, but for all speaking and acting subjects, as they must be if postmetaphysical philosophy is to critique existing social and political conditions (1988b essay 3 [1992a: 50]). Habermas speaks in this context of the “immanent transcendence” or “transcendence from within” of language (2005c).

  • Modesty with regard to ethical and ontological claims (Rees 2018: 55–6, 60–2). Although it can clarify the procedures of moral discourse, it should refrain from making substantive contributions to ethical discourse (1988b essay 3 [1992a: 50]; 1988d [1990a chapter 5, 211]). Unlike metaphysical philosophy, it refrains from making ontological claims about matters such as the existence of God. In accordance with the postsecular orientation of Habermas’s recent work, postmetaphysical philosophy remains agnostic about such matters, while being open to the truth-contents of religious language.

The metaphysical philosopher is an isolated figure, aiming through the use of their individual reason to attain a neutral observer’s perspective on reality as a whole and produce knowledge of universal, necessary truths. The postmetaphysical philosopher, in contrast, aims at a participant’s perspective, working in dialogue with the natural and social sciences and making use of a situated, procedural conception of reason. There is no transcendent Archimedean point for the philosopher to occupy, since we are all embedded in our lifeworlds and our thinking conditioned by social, historical, and linguistic factors. What remains distinctive about philosophy as a discipline is that it can step back from the particular questions which the natural and social sciences focus on and produce general knowledge about the human condition; but this knowledge is now based on empirical data and rational reconstructions, and as such is contestable and revisable. Postmetaphysical philosophy no longer claims the role of judge and usher of science and culture, assigning each discipline and cultural practice to its proper place. Instead it acts as a placeholder for fruitful collaborations between empirical research and philosophical ideas, and as an interpreter mediating between the rationalized value-spheres of science, law, and art and the everyday discourse of the lifeworld (1983 [1990: 15–9]).

6. Constitutional Patriotism, Cosmopolitanism, and International Law

Habermas’s views on national identity, the nation state, and global politics have been shaped by the historical experience of living through the Third Reich, the division of Germany into East and West, reunification, and the development of the European Union. The key to his applied political theory

consists in differentiating between the three elements of statehood, democratic constitution, and civic solidarity, which are closely linked in the historical form of the constitutional state. (2009 chapter 7: 112)

Habermas argues that the conjunction of these elements is contingent, not necessary, and that in the “postnational constellation” of the late twentieth and early twenty-first centuries they should be disaggregated. His theory of constitutional patriotism argues that civic solidarity need not be reduced to national identity, but can rather be generated by the process of constitution-making itself (§6.1). With regard to the international arena, he argues that the constitutionalization of international law can proceed without a global state, and, as such, cosmopolitans should aim at a “politically-constituted world society” (2004 chapter 8 [2006c: 161]) rather than a world republic (§6.2).

6.1 Constitutional Patriotism

The term “constitutional patriotism” (Verfassungspatriotismus) was not coined by Habermas, but by the political scientist Dolf Sternberger, who popularized it in a 1979 article marking the thirtieth anniversary of the Federal Republic of Germany (Müller 2006). Habermas took up the term and developed his own distinct interpretation of it beginning with the “historians’ dispute” of 1986. Constitutional patriotism is the theory that in modern states the constitution can, and should, take the place of the nation as the focus of citizens’ feelings of collective identity and the source of their civic solidarity. While originally concerned with specifically post-war West German questions about history and national identity (1988a), and thus drawing on a rational reconstruction of the postnational form of identification developed in the divided Germany, Habermas later applied the theory to modern states more generally, and to the European Union in particular (1998c).

National identity, for Habermas, is a product of the modern era, dating from the time of the French Revolution and the later Romantic movement. It was constructed by linguists, historians, and writers and propagated through the education system and the public sphere. As such, the nation is intermediate between traditional and post-traditional forms of identity (1988a: 5–7; 1992b [1996b: 494–5]). In a traditional society, collective identity is accepted unreflexively, as are the society’s conventional morality and worldviews. It is a supposedly natural, pre-political given. Collective identity in post-traditional society, in contrast, is adopted in a reflexive manner, in light of reasons given in the public sphere. While in reality post-traditional, the nation is reified as traditional and quasi-natural, and projected into a distant past. From the beginning, the nation played a political role in generating civic solidarity among strangers. National identity establishes an abstract level of solidarity, transcending the face-to-face associations which bind people in traditional society: villages, clans, and localities. The feeling that they belong to the same nation motivates individuals to make sacrifices for others who they have never met, whether in terms of redistributive taxation or military service (1998c [2001a: 64–5]). This was crucial for the viability of democratic republics at the transition from traditional to post-traditional society. Nationhood provides the cultural substrate for the “nation of citizens” who govern themselves democratically (1996a essay 4 [1998a: 117–8]). But, for a number of reasons, the nation is less and less able to play this role today. For one, all contemporary nation states are increasingly multi-ethnic and multicultural, undermining the idea that a purportedly homogeneous collective identity can act as the substrate for democracy. 
For another, the dangers of extreme ethnonationalism are all too obvious: they generate in-group/out-group distinctions that can lead to discrimination, racism, and at the limit to ethnic cleansing and genocide. Finally, nation states themselves are increasingly powerless in the face of factors outside their control, such as global capitalism and climate change. This “postnational constellation” requires a different form of social integration (1998c [2001a: 58–112]).

Luckily, according to Habermas, the relationship between constitutional democracy and the nation is historically contingent (1992b [1996b: 495]; 1996a essay 5 [1998a: 132–3]; 1998c [2001a: 76]). A modern society does not need anything as substantial as a shared religion, way of life, or repository of values to serve as the basis of collective identity. A democratic political system can generate its own civic solidarity. Supposing that individuals have progressed from a traditional to a post-traditional level of identity and are willing to adopt a more reflexive understanding of their identity, the constitution can replace the nation as a source of civic solidarity and attachment. The democratic procedure of constitution-making not only produces legitimacy, as described in the discourse theory of law and democracy, but can also serve as the basis of belonging. The goal of constitutional patriotism is not, then, to eliminate national identity, but rather to decentre it and deprive it of its political function. In concrete terms, constitutional patriotism involves citizens developing critical and reflexive loyalties and attachments to their country’s constitution and the moral principles encoded therein. What results from this is a collective identity with a political function. Citizens’ status as joint makers and interpreters of the same constitution takes the place of their shared natality and status as co-nationals. This view is constitutional in that it revolves around the work of making, interpreting, and reflecting on the constitution, which takes place in the public sphere. It is patriotic in that it has a binding effect on the community of citizens, furnishing them with civic solidarity and a collective identity. The formation of constitutional-patriotic identity, significantly, takes place at the level of opinion- and will-formation in the public sphere. 
Unlike supposedly natural and pre-political national identity, it is formed in the light of rational discourse (2004 chapter 6 [2006c: 76–9]).

Critics of constitutional patriotism argue that it is either too “thin” to play the role Habermas assigns to it, or too “thick” to really be postnational and open to all, as he claims (Hayward 2007: 186–9). The first criticism is that the universal moral principles encoded in constitutions are too abstract and affectively thin to generate a sense of personal loyalty and collective belonging among citizens, compared to the thick bonds of national identity (1996a essay 5 [1998a: 132]; Canovan 2000). Habermas rejects this criticism. Citizens interpret the principles found in their constitution—which represent universal moral and political norms of democracy and human rights, and might be found in any liberal-democratic constitution—in the light of their community’s unique historical experience. They internalize these principles, not abstractly, but in the context of the history of their country (1996a essay 8 [1998a: 225–6]). Constitutional principles become part of the “dense web” of a society’s (and an individual’s) historical experiences and pre-political values (2004 chapter 6 [2006c: 77–8]). It follows that each country’s constitutional patriotism will be different, inflected by the particular past that country has worked through—Habermas notes that French constitutional patriotism, marked by a tradition of revolutionary democracy, will be different to German constitutional patriotism, marked by the historical failure to produce a working democracy (1992h: 240–1). Far from being an abstract, bloodless construct as critics have alleged, Habermas sees constitutional patriotism as intimately connected to each community’s particular history and culture, and to its concerns about identity and the common good—its ethical-political self-understanding.

The same universalistic content must each time be appropriated from out of one’s own specific historical life-situation, and become anchored in one’s own cultural form of life. Every collective identity, even a post-national one, is much more concrete than the ensemble of moral, legal and political principles around which it crystallizes. (1992h: 241)

The second criticism is that supposedly constitutional-patriotic identity is, in reality, the thick identity of the dominant majority culture, albeit disguised (Laborde 2002: 593–9). Habermas’s argument that constitutional principles are interpreted in the light of particular histories plays into this criticism. If a particular country’s constitutional patriotism is so closely bound up with its ethical-political self-understanding, then the language, culture, beliefs, and values of that country’s majority culture will make a deep impression on it. Constitutional patriotism consequently collapses into civic nationalism. At the same time, Habermas’s claim that the connection between national identity and constitutional democracy is merely contingent holds open the possibility that, even if they have overlapped in the past and the present, an ongoing learning process may lead to the two being fully decoupled in the future (1998c [2001a: 101–2]). A further question, articulated by Andrea Baumeister among others, is whether the liberal-democratic values which are to be interpreted in the light of national histories are, in fact, as widely accepted as Habermas believes (Baumeister 2007: 491–5). If not, constitutional patriotism may be more particularistic and Eurocentric than it at first appears, raising questions about how those who do not adhere to such values are to be integrated into constitutional-patriotic polities.

Habermas’s longstanding support for a European constitution (2001c [2006g]; 1996a essay 6 [1998a]) can be explained by his hope that a European constitutional patriotism would provide the civic solidarity that would allow the transnational polity of the European Union to fulfil its democratic potential (2004 chapter 6 [2006c]).

6.2 Cosmopolitanism and the Constitutionalization of International Law

Habermas’s views on international politics are characterised by a revision of Kant’s cosmopolitanism, and a total rejection of Carl Schmitt’s conception of the political as irreducibly antagonistic (1996a essay 7 [1998a: 193–201]; 2004 chapter 8 [2006c: 188–93]). After some early explorations (1996a essay 7 [1998a]), he rejects a Kantian version of cosmopolitanism, arguing for a “politically constituted world society” (2004 chapter 8 [2006c: 161]) rather than a world republic or a global federation. His focus is the “constitutionalization of international law” (2004 chapter 8 [2006c: 132–5]) without a world government, with the limited aims of ensuring peace and protecting human rights, rather than democracy on a global scale. Habermas pays particular attention to the different types of legitimation which different levels of global governance would require, and the different types of constitution they would need.

In Perpetual Peace, Kant looks forward to the transformation of international law, with states as its subjects, into cosmopolitan law, with global citizens as its subjects. Habermas, in contrast, argues that cosmopolitan law must be dualistic, with both states and individuals as its subjects. The cosmopolitan community is dualistic in the sense that it is a community of both human beings (individual subjects), and states (collective subjects) (2004 chapter 8 [2006c: 135]; 2005b chapter 11 [2008a: 317]). Both can act as the founding subjects of a world constitution (2009 chapter 7: 119; 2011 [2012: 58]).

Habermas describes “a multilevel political system that does not assume a state-like character as a whole” (2004 chapter 8 [2006c: 144]), divided into three levels (2005b chapter 11 [2008a: 322–7]), each with its own form of legitimation and type of constitution:

  • National: nation states (2009 chapter 7: 115–6). Although they are no longer fully sovereign in a globalized world, states still have the monopoly on the use of force within their territories. They are based on a defined self-legislating demos (“national” or otherwise), which gives itself laws by following democratic procedures, thus linking the rule of law to democracy. In other words, states have a more “republican” type of constitution, which gives a central role to popular sovereignty established in a revolutionary moment (1998d [2001a: 116–8]; 2004 chapter 8 [2006c: 122]). They can have full democratic legitimacy, as described in Between Facts and Norms, since within a state the authors of law can also be its addressees (2004 chapter 8 [2006c: 141]). States have the highest requirements of legitimacy within Habermas’s model. They have two functions: firstly, to provide military force for implementing human rights and policy decisions of the higher levels (2005b chapter 11 [2008a: 320–321]), and secondly, to generate indirect political legitimacy for the higher levels.
  • Transnational: continental blocs like the EU, Habermas’s prime example of a transnational polity, along with its less-integrated siblings such as ASEAN and Mercosur (2005b chapter 11 [2008a: 325–6]). Great powers such as the USA, China, Russia, and India also operate at the transnational level (2009 chapter 7: 114), as do economic organizations such as the WTO, IMF, and World Bank; and UN agencies such as the WHO and UNESCO (2011 [2012: 56]). This is the most “pluralist” level in Habermas’s model, since it contains very different types of polities, and includes ones which are both liberal and illiberal, democratic and non-democratic. At the transnational level there exists a “global domestic politics” (2004 chapter 8 [2006c: 136, 160]) addressing socioeconomic questions. Issues of wealth and redistribution, health and disease, trade, migration, and environmental policy can be discussed within and between transnational polities. Habermas still considers relations in and between them to count as international relations or foreign policy, although recourse to war should be ruled out (2005b chapter 11 [2008a: 325]). The normative yardstick for relations between transnational polities is fair negotiation, not the full democratic legitimacy which can exist within states (2009 chapter 7: 125–6; 2011 [2012: 57, 68]). They are nonetheless open to the influence of deliberative publics from below, and should institutionalize some degree of citizen participation, via referenda or mechanisms for responding to transnational public spheres. Transnational polities also receive some indirect legitimation in virtue of their member states being legitimate (2004 chapter 8 [2006c: 142]). They thus have middling requirements of legitimacy, which can be generated in a number of direct and indirect ways, and may have several different types of constitutions.
  • Supranational: global organizations with universal membership, comprising both individuals and states as members. Habermas envisages a reformed United Nations as the central component of the supranational layer, along with a stronger version of the International Criminal Court (2004 chapter 8 [2006c: 133–4, 173–4]). Their remit is strictly to secure peace and prevent human rights violations (2004 chapter 8 [2006c: 136]; 2005b chapter 11 [2008a: 322]; 2011 [2012: 60–1]), with the Security Council performing an executive function, the ICC a judicial function, and the UN Charter acting as a supranational constitution (2004 chapter 8 [2006c: 160–1]; 2009 chapter 7: 120). A reformed General Assembly could function as a world parliament, containing representatives of both states and cosmopolitan citizens (2009 chapter 7: 120–1), with its deliberations focused on interpreting and elaborating the meaning of the Charter, rather than the kind of political will-formation which takes place within national parliaments (2011 [2012: 60–1, 65]). All other cross-border political issues should be dealt with at the transnational level—the supranational level is the preserve of law, rather than politics (2005b chapter 11 [2008a: 333–4, 343]; 2011 [2012: 65]).

This division of labour has drawn criticism, with Cristina Lafont arguing that the relegation of all socioeconomic issues to the transnational layer ensures that human rights violations stemming from global inequality cannot be addressed (Lafont 2008). Habermas’s suggestion that the “slender but robust” cross-cultural consensus on basic human rights is enough to legitimate the supranational level’s policies (2004 chapter 8 [2006c: 143]; 2005b chapter 11 [2008a: 343–4]) has also been criticized. There is room for doubt on this point, especially with regard to contentious cases such as LGBT rights (1998d [2001a: 113–129]; Scheuerman 2008: 143–5).

Since it is neither democratic nor a state, the supranational level has the lowest requirements of legitimacy (2004 chapter 8 [2006c: 133–4, 143]; 2011 [2012: 65]), and is not suited to a republican type of constitution, contra Kant. It is suited to a liberal type of constitution, which prioritizes the rule of law rather than popular sovereignty (2004 chapter 8 [2006c: 137–9]). A liberal constitution constrains established power in accordance with human rights, but does not connect it to the will of a self-legislating demos, which is in any case lacking at the global level (2005b chapter 11 [2008a: 316]). Instead, the supranational level derives its legitimacy directly from the negative duties which it enforces (to prevent human rights abuses and wars of aggression), and indirectly from the legitimacy of the states which comprise it (2004 chapter 8 [2006c: 140–1, 143]; 2005b chapter 11 [2008a: 342–4]). This may be supplemented by the periodic emergence of a global public sphere, mobilised in opposition to wars of aggression or in condemnation of gross human rights violations (2004 chapter 8 [2006c: 142]; 2005b chapter 11 [2008a: 343–4]; 2009 chapter 7: 124–5).

In this model, supranational law is to have primacy over state law, in much the same way that EU law has primacy over the laws of member states (2004 chapter 8 [2006c: 137]). The plausibility of Habermas’s proposals depends on two learning processes. Individuals must learn to think and act as both national and cosmopolitan citizens, switching between a perspective that centres national interest and one that centres universal standards of justice (2009 chapter 7: 116–8), while states must learn to regard themselves as members of an international community, not absolute sovereigns (2011 [2012: 61]). Habermas regards the constitutionalization of international law as the “legal domestication of the intensified cooperation between states” (2014: 8), by which the Hobbesian state of nature at the global level can be gradually regulated and subject to law without the need for a global government, which could not in any case be fully legitimate. Some degree of global governance is unavoidable in the contemporary postnational constellation. What is crucial is that it should be constitutional and grounded in universal moral principles, rather than a technocracy at the service of neoliberal capitalism.

7. Religion and Postsecularism

Habermas’s views about religion and its place in modern society have changed strikingly over the course of his career. In a series of texts written mostly after 2001 he revises the secularist bent of his earlier social and political theory, as expressed in Theory of Communicative Action and Between Facts and Norms, so as to acknowledge religion’s close relation to philosophy and the central place of religious believers in democratic states.

Influenced by Weber and Durkheim, Habermas had earlier characterised religious beliefs as the worldviews of traditional societies that in the course of rationalisation are superseded and replaced by secular forms. In religious belief the validity claims of objective truth, moral rightness, and sincerity, and along with them the objective, intersubjective, and subjective world-relations, were fused together. Speakers could not thematize and contest them (1981 [1984a: 214]; 1981 [1987: 189]). This fusion, maintained by the strict segregation of sacred from profane domains of life, enabled a normative consensus to crystallize by non-discursive means (1981 [1987: 54]). That which was in accord with society’s ritually protected normative consensus was right, that which violated it was wrong, and the consensus itself was beyond questioning. The transition to modernity begins with the “linguistification of the sacred”, in which “the authority of the holy is gradually replaced by the authority of an achieved consensus” (1981 [1987: 77, see also 288]). Normative consensus and social integration are achieved in modern societies through communicative action, carried out by competent speaking and acting subjects who have mastered all three validity claims and world-relations (1981 [1987: 107, 145]). The sacred, the lynchpin of religious worldviews, has dissolved into unrestricted discourse, in which any validity claim may be contested. Religious belief may continue to exist, but it is now one worldview among many, and like every other social practice it must be continued by means of communicative action (1981 [1987: 88–9]).

Habermas at this stage saw post-traditional society as secular (1992b [1996b: 443–4]). He subsequently rejected this view, stating that

My earlier Hegelian view of religion as a formation destined to be dialectically superseded in the modern world has indeed changed. The empirical evidence of the survival of religion under modern conditions has accumulated in recent decades. (2012 chapter 6 [2017: 143])

His most recent work paints a very different picture of the role of ritual and the sacred (2012 chapter 3 [2017]), revising many of his central claims in the Theory of Communicative Action. Aside from these revisions to his social theory, Habermas’s postsecular writings address the philosophical theme of the relationship between religious faith and philosophical reason (§7.1) as well as the political theme of religion’s place within deliberative democracy (§7.2).

7.1 Jerusalem and Athens: Religious Faith and Philosophical Reason

Habermas has, since phase two of discourse ethics in the 1990s, considered religious traditions in modern societies to be fruitful sources of ethical values (1983 [1990]; 2001b chapter 1 [2003b]). Since postmetaphysical philosophy refrains from proposing concrete visions of the good, modern subjects must “appropriate” or “translate” insights from religion (alongside art and literature) to use as inputs in their ethical-existential and ethical-political discourses. Maeve Cooke suggests that this process of ethical appropriation involves re-presenting the semantic contents of ethical insights, shorn of their religious, literary, and artistic contexts, such that their exemplary force in disclosing visions of the good remains intact and can be used in secular ethical discourses (Cooke 2011).

Philosophy, too, has a long history of appropriating or translating concepts from religious traditions (2012 chapter 4 [2017: 63–4]). Many apparently secular philosophical ideas have genealogies stretching back to religion. Alongside well-known examples such as Schelling and Hegel’s concept of the Absolute (2005c: 304), Benjamin’s use of the Messiah, and Adorno’s use of the ban on graven images (2005b chapter 8 [2008a: 232]), Habermas discusses several examples from Kant: the summum bonum, the ethical community, moral faith, radical evil, and even the moral law itself can be seen as secular philosophical translations of the kingdom of God, the church, religious faith, original sin, and the Ten Commandments (2001b chapter 2 [2003b: 110]; 2005b chapter 11 [2008a: 220–3, 224–6]). Habermas lists universal egalitarianism and communicative action (2002 chapter 8: 149, 160) as concepts from his own work which have religious genealogies. Although some of the original concepts’ meaning is inevitably lost in translation (2005c: 309; 2002 chapter 8: 164), such “critical assimilation(s) of religious concepts” or “secularizing, but at the same time salvaging, deconstruction(s) of religious truths” (2003c: 110) can enrich the vocabulary of postmetaphysical philosophy (2005b chapter 5 [2008a: 142]). Despite this, Habermas insists that philosophical discourse itself remains secular, retaining its “methodological atheism” (2005c: 304, 309; 2002 chapter 8: 160)—it appropriates religious concepts, but they do not remain religious concepts. They must be contestable in justificatory discourses.

These conceptual appropriations link the apparently secular philosophical tradition to the major world religions, and focusing on them helps to establish a new postsecular self-understanding of philosophy. Nonetheless, the process of appropriation could be said to place philosophy in a superior position to religion. It seems as if philosophy is able to judge which elements of religion count as rational enough to be worth appropriating, and which can be discarded as irrational; religion is reduced to a fund of concepts for philosophy’s use (2012 chapter 4 [2017: 63]). Habermas resists this interpretation, arguing against a Kantian view in which pure rational faith should eventually replace historical faiths (2005b chapter 8 [2008a: 223]), or a Hegelian view in which religion is one moment in the dialectic of absolute spirit, soon to be sublated into philosophy (2005b chapter 8 [2008a: 230–1]). A truly postsecular self-understanding of philosophy, he claims, must move beyond this.

Habermas attempts to deflate philosophy’s stance of superiority with regard to religion by arguing that philosophical reason and religious faith have a shared origin. Both originated in the Axial Age, the intellectual revolution which took place in Greece, Israel, India, and China between 800 and 200 BCE, as described by Karl Jaspers (Jaspers 1949 [1953]). This transition from mythos to logos saw the emergence of universal, context-transcending thinking, and a deepening of human subjectivity and ethical thought. The Western philosophical tradition beginning with Socrates, Plato, and Aristotle is as much an axial phenomenon as Judaism, Buddhism, and Confucianism (2012 chapter 4 [2017: 66–9]; Rees 2017: 221–2). As Habermas puts it,

both modes, faith and knowledge, together with their traditions based respectively in Jerusalem and Athens, belong to the history of the origins of the secular reason which today provides the medium in which the sons and daughters of modernity communicate concerning their place in the world. (2008c [2010: 17])

Religious faith is not, then, the “opaque other of reason” (2005b chapter 5 [2008a: 142]). Rather than being hostile strangers, philosophy and religion are in reality estranged siblings and equal partners in a fruitful dialogue (2008c [2010: 17–18]).

7.2 Postsecular Deliberative Democracy

In writings since the turn of the millennium, Habermas adapts his discourse theory of law and democracy so as to take account of the postsecular nature of modern societies. He uses this expression

to describe modern societies which must assume that religious groups will continue to exist and that different religious traditions will remain relevant, even if the societies themselves are largely secularized. (2012 chapter 4 [2017: 63])

His postsecular political theory, then, applies mostly to European and other Western countries with an established tradition of secular democracy, which must take account of the presence of religious minorities (2009 chapter 5: 59). One alleged problem with earlier models of deliberative democracy, such as those outlined by Rawls in Political Liberalism or by Habermas himself in Between Facts and Norms, is that they place unfair cognitive burdens on religious citizens of secular states, and in doing so threaten to undermine these states’ legitimacy. The central issue is the exclusion of religious language from the public sphere, and it can be addressed by modifying the discourse theory of law and democracy so as to let religious citizens participate fully. Habermas identifies two problems with secular deliberative democracy. It forces religious citizens to “split their identities” between public (secular) and private (religious) personae (2005b chapter 5 [2008a: 126–7, 130]), and places asymmetrical burdens upon them, compared with their secular fellow citizens. Consider Rawls’s “duty of civility” which required citizens not to make use of their comprehensive doctrines while deliberating in public, and to restrict themselves to political conceptions of justice, which are within the bounds of public reason (Rawls 1995 [1996: 217]). Or consider Rawls’s later, more moderate view, namely the proviso which allows that citizens can adduce “comprehensive reasons” in public political discussion, provided that “in due course” these are replaced by proper political reasons (Rawls 1993 [2005: 462, l]). Following Paul Weithman and Nicholas Wolterstorff (Weithman 2002; Audi & Wolterstorff 1997), Habermas objects to Rawls’s proviso for placing morally unacceptable burdens on citizens of faith, since

many religious citizens would not be able to undertake such an artificial division within their own minds without jeopardizing the pious conduct of their lives. (2005b chapter 5 [2008a: 127])

He also claims that the burden is unfairly distributed, weighing only on believers (2001b chapter 2 [2003b: 109]; 2005b chapter 5 [2008a: 136]). Some critics have countered that identity-splitting can also affect non-believers (Boettcher 2009; Holst & Molander 2015), while others rejoin that it is a reasonable demand to make of modern democratic citizens in pluralist societies, which does not threaten their complex identities (Mautner 2014: 24; Finlayson 2019: 11). Habermas, however, argues that both objections apply to Rawls’s idea of public reason, governed by the proviso.

Now, Habermas’s account of legitimate law-making in Between Facts and Norms is wedded to the principle of the separation of church and state, and has an unabashedly secular conception of politics. The principle of democracy implies that laws are legitimate when citizens’ contributions to the informal public sphere filter through into the formal public sphere (the state apparatus, parliaments, and legal systems), and influence law-making (1992b [1996b: 371–2, 441–2]). When this cycle of feedback is operating correctly, citizens can understand themselves as the “authors and addressees” of the law (1992b [1996b: 120]), since they have indirectly participated in the legislative process.

However, religious citizens cannot contribute authentically to a secular public sphere. Their contributions to public discourse, if phrased in secular language, will have little meaning for them, and thus they will find themselves in the heteronomous position of only being the addressees of the law, not its authors (2005b chapter 5 [2008a: 128, 130]). Habermas fears that the secular nature of the public sphere prevents religious citizens from fully taking part in the discursive processes which legitimate laws. They might comply strategically, seeing the law as a mere fact, and not a norm, but for them the laws passed by secular democratic states would lack legitimacy. There is evidently a danger here of large numbers of religious believers becoming alienated from democratic law-making, with the concomitant danger of political instability and a legitimation crisis (2009 chapter 5: 76).

The problem is how to construe and design the modern political system in a way that is more congenial to religious citizens, without abandoning the secular political state, and how to show that the Weithman/Wolterstorff objections, which Habermas agrees apply to Rawls’s proviso, do not apply to his own theory. His proposed solution, a model of postsecular deliberative democracy, has two elements corresponding respectively to the formal and informal public spheres, namely to the political system and civil society.

First, echoing Rawls, Habermas proposes an “institutional translation proviso” at the threshold between the informal and formal public spheres. In the formal public sphere, public officials, politicians, and judges must restrict themselves to secular language, but in the informal public sphere, ordinary citizens are free to contribute to public discourse in religious language (2009 chapter 5: 76; 2005b chapter 5 [2008a: 130–2]). Before these religious statements pass through into the formal public sphere and impact the legislative process, they are to be translated into secular terms, a cooperative task in which both religious and non-religious citizens take part (2005b chapter 5 [2008a: 112–3]; 2012 chapter 7 [2017: 172]). The institutional translation proviso thus acts as a filter, maintaining the secular nature of the state, but allowing religious citizens free rein to air their religious reasons in informal public discourse (2005b chapter 5 [2008a: 131]).

Second, religious and non-religious citizens in civil society must undergo complementary learning processes, leading to them becoming reflexive about their beliefs (2005b chapter 5 [2008a: 111–2]). For religious believers, this involves coming to terms with three aspects of modern society: reasonable pluralism, the priority of scientific knowledge, and the secular nature of the state (2001b chapter 2 [2003b: 104]; 2005b chapter 5 [2008a: 137]; 2012 chapter 7 [2017: 173]). For non-believers, it involves accepting the continuing presence of religion in modern society, coming to view their disagreements with believers as reasonable, accepting that believers have a right to contribute to the public sphere in religious language, being willing to help translate those contributions into secular language, and finally forgoing scientism, and scientistically motivated atheism, in favour of political agnosticism (2005b chapter 4 [2008a: 113]; 2005b chapter 8 [2008a: 263–4]; 2005b chapter 10 [2008a: 309–10]; see also Baxter 2011: 205).

Taken together, Habermas claims, these learning processes help to equalize the cognitive burdens borne by religious and non-religious citizens. This model of postsecular deliberative democracy, Habermas argues, solves the problems of identity-splitting and de-legitimation which secularism inflicts on religious believers. Since they can now contribute to the informal public sphere in religious language, believers no longer have to split their identities in two; since they know that their religiously based contributions are making an impact on the formal public sphere, they can see the laws which it produces as legitimate.

Even if the religious language is the only one which they speak in public, and if religiously justified opinions are the only ones they can or wish to contribute to political controversies, they nevertheless understand themselves as members of a civitas terrena, which empowers them to be the authors of laws to which they are subject as addressees. (2005b chapter 5 [2008a: 130–1])

Habermas’s theory has come in for criticism from all sides. On the one hand critics like Amy Allen claim that there is a residual asymmetry, and that Habermas still “stacks the decks in favour of secularism” (Allen 2013: 149). Wolterstorff agrees, though he thinks the problem lies with the idea of “post-metaphysical reason”, while Allen traces it to his “genealogy of post-secular reason”. Cristina Lafont argues, by contrast, that the asymmetric burden falls the other way, since the cognitive learning process imposes a duty on secular citizens to give up their scientistically motivated atheism in favour of politically motivated agnosticism (Lafont 2013: 238; Finlayson 2019: 14). The idea of “sacred-to-secular translation” is central to Habermas’s model of postsecular deliberative democracy. As an example, Habermas cites German Christian groups translating the statement from Genesis that “God created man in his own image” into secular language as “a gamete fertilized ex utero has the status of a subject of human rights” as part of their arguments against stem-cell research (2001b chapter 2 [2003b: 109]). Yet the idea has proved controversial. It is not clear, for example, whether or not Habermas considers it to work the same way as ethical and philosophical appropriations of religious concepts (see Cooke 2011; Kerkwijk 2015; Rees 2018: 143–65). For Wolterstorff it relies on the idea of post-metaphysical reason that is skewed in favour of secularism, while Rees argues that the idea is empty, since no such translations are possible.


