A semantic approach to nonmonotonic reasoning: Inference operations and choice

This paper presents a uniform semantic treatment of nonmonotonic inference operations that allow for inferences from infinite sets of premisses. The semantics is formulated in terms of selection functions and is a generalisation of the preferential semantics of Shoham, Kraus et al., and Makinson. A selection function picks out from a given set of possible states (worlds, situations, models) a subset consisting of those states that are, in some sense, the most preferred ones. A proposition α is a nonmonotonic consequence of a set of propositions Γ iff α holds in all the most preferred Γ-states. In the literature on revealed preference theory, there are a number of well-known theorems concerning the representability of selection functions, satisfying certain properties, in terms of underlying preference relations. Such theorems are utilised here to give corresponding representation theorems for nonmonotonic inference operations. At the end of the paper, the connection between nonmonotonic inference and belief revision, in the sense of Alchourrón, Gärdenfors, and Makinson, is explored. In this connection, infinitary belief revision operations, which allow for the revision of a theory with a possibly infinite set of propositions, are introduced and characterised axiomatically. Several semantic representation theorems are proved for operations of this kind.

In standard deductive logic, a proposition α is a logical consequence of a set of propositions Γ (in symbols, Γ ⊢ α) just in case α holds (or is true) in every possible state (situation, world) in which all the propositions in Γ hold. In other words, we have the following semantic characterisation of logical consequence:

Γ ⊢ α iff ⟦Γ⟧ ⊆ ⟦α⟧,

where ⟦α⟧ and ⟦Γ⟧ are the sets of all possible states in which, respectively, α and all the propositions in Γ hold. If Γ ⊆ Δ, then, of course, ⟦Δ⟧ ⊆ ⟦Γ⟧. It follows that standard deductive logic is monotonic, that is:

if Γ ⊢ α and Γ ⊆ Δ, then Δ ⊢ α. (Monotonicity)

Notions of plausible inference or default reasoning do not in general satisfy monotonicity. From the information that x is a Quaker, we may plausibly infer that x is a pacifist. However, from the information that x is a Quaker and a Republican, it is not a plausible inference to conclude that x is a pacifist. Of course, the phenomenon of nonmonotonicity is familiar also from probabilistic contexts: from α being highly probable given β, we may not conclude in general that α is highly probable given β ∧ γ.
A common idea in the literature on nonmonotonic reasoning is the following: α is a nonmonotonic consequence of Γ (in symbols, Γ |∼ α) just in case α holds in all those Γ-states that are maximally plausible (from the viewpoint of some agent). Or more abstractly, Γ |∼ α obtains if α holds in all the best preferred Γ-states, namely in those Γ-states to which no other Γ-state is strictly preferred (or better).
Formally we represent this idea by introducing a selection function S which, given a set X of possible states, picks out the set S(X) of all the "best" elements in X. The relation |∼ of nonmonotonic consequence (or plausible inference) is then defined in terms of S in the following way:

Γ |∼ α iff S(⟦Γ⟧) ⊆ ⟦α⟧.

This definition will in general lead to |∼ being nonmonotonic, since there is no guarantee that S(⟦Γ⟧) ⊆ ⟦α⟧ will imply that S(⟦Γ ∪ Δ⟧) ⊆ ⟦α⟧ (see Figure 1). Clearly, one of the best preferred Γ ∪ Δ-states may fail to be a best preferred member of the more inclusive class of Γ-states. Therefore, it need not be the case that S(⟦Γ ∪ Δ⟧) ⊆ S(⟦Γ⟧). Neither does it follow that S(⟦Γ ∪ Δ⟧) ⊆ ⟦α⟧.
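The definition can be tried out on a small finite example. The following Python sketch is only an illustration: the three atoms, the `abnormality` ranking and the helper names are assumptions made for this example and do not come from the paper. The selection function picks the states of lowest rank, and the Quaker example from the introduction comes out nonmonotonic.

```python
from itertools import product

ATOMS = ("quaker", "republican", "pacifist")
# Every truth assignment to the atoms counts as one possible state.
STATES = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=len(ATOMS))]

def states_of(premises):
    """[[Gamma]]: the states in which every premise holds."""
    return [s for s in STATES if all(p(s) for p in premises)]

def abnormality(state):
    """Hypothetical plausibility ranking (lower is better); not from the paper."""
    rank = 0
    if state["quaker"] and not state["pacifist"]:
        rank += 1  # a Quaker who is not a pacifist is unusual
    if state["republican"] and state["pacifist"]:
        rank += 1  # a Republican who is a pacifist is unusual
    return rank

def select(states):
    """S(X): the states in X with the lowest abnormality rank."""
    if not states:
        return []
    best = min(abnormality(s) for s in states)
    return [s for s in states if abnormality(s) == best]

def nm_entails(premises, conclusion):
    """Gamma |~ alpha iff alpha holds in every state of S([[Gamma]])."""
    return all(conclusion(s) for s in select(states_of(premises)))

quaker = lambda s: s["quaker"]
republican = lambda s: s["republican"]
pacifist = lambda s: s["pacifist"]

print(nm_entails([quaker], pacifist))              # True
print(nm_entails([quaker, republican], pacifist))  # False
```

Adding the premise `republican` shrinks ⟦Γ⟧, but the best states of the smaller set are not among the best states of the larger one, which is exactly the failure of S(⟦Γ ∪ Δ⟧) ⊆ S(⟦Γ⟧) described above.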
Different choices of underlying language, different conceptions of possible states, and different formal requirements on the selection function will give rise to different nonmonotonic logics. In this paper we shall explore some of the possibilities that ensue. In particular, we are going to study correspondences between various conditions on the selection function S (many of which are well known from the literature on preference and choice) and conditions on the inference relation |∼. In this connection it is often more natural to look at nonmonotonic inference as a Tarski-style inference operation C on sets of propositions rather than as an inference relation |∼. The two notions are simply related by the equation: C(Γ) = {α : Γ |∼ α}.
The essential idea behind our semantic modelling of nonmonotonic inference goes back to McCarthy's classical paper (McCarthy, 1980) on circumscription. McCarthy presents circumscription as a formalised rule of nonmonotonic inference (he calls it a rule of conjecture) which is used in conjunction with the rules of standard logic. There are many versions of circumscription, but the essential model-theoretic idea is the same: among all the models of a formula α, some are singled out as being minimal. Minimality here can mean various things, for instance: (i) Domain Circumscription: the minimal models of α are those that have no proper submodels that are also models of α; (ii) Predicate Circumscription: the extensions of some designated predicates are minimised, while the domain together with the extensions of all other predicates are kept fixed; (iii) Parameterised Predicate Circumscription: this case is like (ii), except that the extensions of some predicates (the parameters) are allowed to vary freely; and (iv) Prioritised Circumscription: there is a priority ordering of the predicates to be minimised, and minimising a predicate with higher priority is always preferred to minimising a predicate with lower priority.
Given some notion of a minimal model, one can define a corresponding notion of minimal entailment: α minimally entails β iff all minimal α-models are β-models. In order to single out the minimal models of α, and thereby the sentences that are minimally entailed by α, a new sentence, called the circumscription of α, is associated with α. This new sentence has as its models just the minimal models of α. Thus, α minimally entails β just in case β is a logical consequence of the circumscription of α. It should be noted, however, that the circumscription of α is in general a sentence of second-order logic.
To make all this a little more concrete, let us look at a special case: predicate circumscription (McCarthy, 1980). Let α(P) be a sentence involving the predicate P (for simplicity, we let P be unary). The (Predicate) Circumscription of α with P is the second-order sentence Circum(α, P) defined as:

α(P) ∧ ∀Q(Q < P → ¬α(Q)),

where Q < P is an abbreviation for the sentence ∀x(Q(x) → P(x)) ∧ ∃x(P(x) ∧ ¬Q(x)). Next, we introduce a strict partial ordering ⊏_P on models of the language under consideration: M ⊏_P N iff M and N have the same domain, all predicate symbols in the language besides P have the same extension in M and N, but the extension of P in M is a proper subset of its extension in N. We say that a model M is a P-minimal model of α if M ⊧ α (M is a model of α) and there is no model N such that N ⊏_P M and N ⊧ α. Now, the models of Circum(α, P) are exactly the P-minimal models of α. Following McCarthy (1980), we say that α minimally entails β with respect to P (in symbols, α |∼_P β) if all P-minimal models of α are models of β. Thus, α |∼_P β holds just in case β is a logical consequence of Circum(α, P). Since a P-minimal model of α ∧ β may not be a P-minimal model of α, minimal entailment with respect to P is nonmonotonic.
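For a finite domain, P-minimal models can simply be enumerated. The Python sketch below assumes a hypothetical language with a single unary predicate P over the two-element domain {a, b}, so that a model is just an extension of P; the sentence α(P) = P(a) ∨ P(b) and all function names are chosen only for illustration.

```python
from itertools import combinations

DOMAIN = ("a", "b")

def all_extensions(domain):
    """All candidate extensions of P, i.e. all models over the fixed domain."""
    return [frozenset(c) for r in range(len(domain) + 1)
            for c in combinations(domain, r)]

def p_minimal_models(alpha):
    """Models of alpha with no alpha-model whose P-extension is a proper subset."""
    models = [ext for ext in all_extensions(DOMAIN) if alpha(ext)]
    # For frozensets, `n < m` tests for proper subset, i.e. N is below M in the ordering.
    return [m for m in models if not any(n < m for n in models)]

def minimally_entails(alpha, beta):
    """alpha |~_P beta iff every P-minimal model of alpha is a model of beta."""
    return all(beta(m) for m in p_minimal_models(alpha))

alpha = lambda ext: "a" in ext or "b" in ext            # P(a) or P(b)
not_both = lambda ext: not ("a" in ext and "b" in ext)  # not (P(a) and P(b))
alpha_and_both = lambda ext: alpha(ext) and "a" in ext and "b" in ext

print(minimally_entails(alpha, not_both))           # True
print(minimally_entails(alpha_and_both, not_both))  # False
```

The P-minimal models of α are {a} and {b}, so α minimally entails that P does not hold of both individuals. Strengthening α with P(a) ∧ P(b) leaves {a, b} as the only (and hence minimal) model, and the conclusion is lost.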
saying that (1) Tweety is a bird, (2) birds that are not abnormal₁ can fly, (3) all penguins are birds, (4) penguins are abnormal₁, and (5) all penguins, except those that are abnormal₂, cannot fly. Applying predicate circumscription to the abnormality predicates ab₁ and ab₂ (that is, minimising the extension of these two predicates while keeping the extensions of the other predicates fixed), we infer from α that Tweety can fly. We cannot make this inference from α ∧ penguin(tweety). Since Tweety is a penguin, she is abnormal₁. Hence, (2) cannot be used to infer that she can fly. On the other hand, it follows by minimality that Tweety is not abnormal₂. Therefore, it follows by (5) that she cannot fly.
That is, β is like α, except for not containing the so-called cancellation of inheritance axiom (4). Using ordinary predicate circumscription, we can only infer from β ∧ penguin(tweety) that one of the following cases obtains: (i) Tweety is an abnormal₁ bird that cannot fly; (ii) Tweety is an abnormal₂ penguin that can fly.
Nothing follows concerning Tweety's ability to fly. Intuitively, however, it seems reasonable to conjecture from β ∧ penguin(tweety) that Tweety cannot fly. The cases (i) and (ii) are not symmetrical: the information that Tweety is a penguin is more specific than the information that she is a bird. It seems reasonable to give higher priority to minimising abnormality with respect to the more specific predicate. In the choice between minimising abnormality₁ and abnormality₂, we choose the latter. Hence, we conclude that Tweety is not abnormal₂. Then, it follows by (5) that she cannot fly.
³ This is a shortened version of an example in McCarthy (1986).
Shoham (1987, 1988) generalised the concept of circumscription, or minimal entailment, to a more abstract notion: preferential entailment. Shoham's idea was to start from any ordinary model-theoretic semantics for a formal language L and add a new primitive notion to it: a strict partial ordering ⊏ of all the models of L. Intuitively, M ⊏ N means that the model M is preferred over the model N. Then, M is defined to be a preferred model of α iff (i) M ⊧ α, and (ii) there is no model N such that N ⊧ α and N ⊏ M. Finally, α is said to preferentially entail β (in symbols, α |∼_⊏ β) just in case every preferred model of α is a model of β. Shoham (1988) emphasises three ways in which his own approach generalises that of McCarthy: (i) preferential entailment can be defined relative to any logic having a model-theoretic semantics, not just to standard first-order logic (starting, for instance, with a modal logic and a preference relation over its Kripke models, one can define the corresponding nonmonotonic modal logic); (ii) a notion of preferential entailment can be defined in terms of any partial ordering of models, that is, one is not limited to those orderings that correspond to circumscription axioms; and (iii) there is a shift of emphasis from syntax (circumscription axioms) to semantics (partial orderings of models).
In the work of Kraus et al. (1990), Shoham's approach is generalised further: a new primitive is introduced into the semantics, namely the notion of a state. Each state is labelled by a set of models of the underlying nonmonotonic logic, and the states, not the models, are ordered by a binary relation ⊏. In general, it is not assumed that ⊏ satisfies any of the usual properties like irreflexivity or transitivity. A formula α holds in a state u (u is an α-state) iff α is true at every model that is labelled by the state. A state u is a preferred α-state iff (i) u is an α-state, and (ii) there is no α-state v such that v ⊏ u. α preferentially entails β (in symbols, α |∼ β) if all preferred α-states are β-states. The main objective of Kraus et al. (1990) is to study nonmonotonic inference relations |∼ both in terms of abstract proof-theoretic properties and semantically in terms of preferential models. Several important classes of inference relations are characterised semantically by means of representation theorems.
The study of abstract nonmonotonic inference relations was initiated by Gabbay (1985), who took |∼ to be a relation between a finite set Γ of premises and a single conclusion α. Gabbay (1985) defined a nonmonotonic logic as a relation of the described sort satisfying the following conditions:

if α ∈ Γ, then Γ |∼ α; (Reflexivity)
if Γ |∼ α and Γ, α |∼ β, then Γ |∼ β; (Finitary Cut)
if Γ |∼ α and Γ |∼ β, then Γ, α |∼ β. (Finitary Cautious Monotony)

He argued that these requirements should be satisfied by any reasonable inference relation. As we have seen, Shoham (1987, 1988) and Kraus et al. (1990) define |∼ as a relation taking only single propositions as premises. In the presence of conjunction in the object language, this is essentially equivalent to allowing finite sets of propositions as premises. A more general treatment is proposed in Makinson (1989), where |∼ is allowed to take infinite sets of premises. This generalisation makes it possible for Makinson to redefine nonmonotonic consequence as a Tarski-style operation C on arbitrary sets of sentences. Generalising Gabbay's conditions to the infinitary case and expressing them in terms of C rather than |∼, Makinson (1989) obtains the following conditions:

Γ ⊆ C(Γ); (Inclusion)
Γ ⊆ Δ ⊆ C(Γ) implies C(Δ) ⊆ C(Γ); (Infinitary Cut)
Γ ⊆ Δ ⊆ C(Γ) implies C(Γ) ⊆ C(Δ). (Cautious Monotony)

An operation on sets of sentences satisfying these conditions is called by Makinson a cumulative inference operation. Makinson (1994) is a comprehensive survey, from an abstract logical point of view, of systems of nonmonotonic logic: it focuses on properties of the inference relations (or operations) that are associated with the various systems.
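Makinson's three conditions can be checked by brute force on a small finite example. In the Python sketch below, which is an illustration under stated assumptions and not a construction from the paper, there are three states with a hypothetical ranking `RANK`, every set of states counts as a "sentence" (identified with the set of states where it holds), and C(Γ) collects the sentences that hold in all lowest-ranked Γ-states.

```python
from itertools import combinations

STATES = (0, 1, 2)
RANK = {0: 0, 1: 1, 2: 1}  # hypothetical ranking; lower means more preferred

def subsets(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# A "sentence" is identified with the set of states in which it holds.
SENTENCES = subsets(STATES)

def states_of(gamma):
    """[[Gamma]]: the states in which all sentences of Gamma hold."""
    result = frozenset(STATES)
    for alpha in gamma:
        result &= alpha
    return result

def select(X):
    """S(X): the lowest-ranked states in X."""
    if not X:
        return frozenset()
    best = min(RANK[x] for x in X)
    return frozenset(x for x in X if RANK[x] == best)

def C(gamma):
    """C(Gamma): all sentences that hold throughout S([[Gamma]])."""
    best = select(states_of(gamma))
    return frozenset(alpha for alpha in SENTENCES if best <= alpha)

def is_cumulative():
    """Check Inclusion, Infinitary Cut and Cautious Monotony exhaustively."""
    for gamma in subsets(SENTENCES):
        c_gamma = C(gamma)
        if not gamma <= c_gamma:          # Inclusion
            return False
        # Enumerate every Delta with Gamma ⊆ Delta ⊆ C(Gamma).
        for extra in subsets(c_gamma - gamma):
            if C(gamma | extra) != c_gamma:   # Cut and Cautious Monotony together
                return False
    return True

print(is_cumulative())  # True
```

Cut and Cautious Monotony are tested jointly as the single equation C(Δ) = C(Γ), which is what the two inclusions amount to when taken together.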
In the present paper, we follow Makinson (and differ from Kraus et al.) in viewing nonmonotonic consequence as an operation C on arbitrary sets of sentences (or equivalently, as a relation Γ |∼ α, where Γ is allowed to be infinite). In addition, we modify the preferential semantics of Shoham and Kraus et al. by defining C, in the way previously described, in terms of a selection function S on sets of states rather than in terms of a preference relation on states. This treatment is more general, since a given selection function may not be definable in terms of any preference relation.
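That a selection function need not be definable from a preference relation can be seen on three states. The Python sketch below assumes one specific reading of "definable", namely that S(X) is the set of P-maximal elements of X (those x in X for which no y in X satisfies yPx), and searches all binary relations on a hypothetical three-element set; the two selection functions and all names are invented for this illustration.

```python
from itertools import product

U = ("x", "y", "z")

# A hypothetical selection function, specified on two sets only.
S_ODD = {
    frozenset({"x", "y", "z"}): frozenset({"x"}),
    frozenset({"x", "y"}): frozenset({"y"}),
}
# A variant that keeps choosing x from the smaller set.
S_OK = {
    frozenset({"x", "y", "z"}): frozenset({"x"}),
    frozenset({"x", "y"}): frozenset({"x"}),
}

def maximal(X, P):
    """Elements of X to which no element of X is preferred under P."""
    return frozenset(x for x in X if not any((y, x) in P for y in X))

def rationalising_relations(S):
    """All binary relations P on U with S(X) = maximal(X, P) for every listed X."""
    pairs = [(a, b) for a in U for b in U]
    found = []
    for bits in product([False, True], repeat=len(pairs)):
        P = {pair for pair, bit in zip(pairs, bits) if bit}
        if all(maximal(X, P) == chosen for X, chosen in S.items()):
            found.append(P)
    return found

print(len(rationalising_relations(S_ODD)))      # 0
print(len(rationalising_relations(S_OK)) > 0)   # True
```

The first function selects x from the larger set but rejects it from the smaller one. If x is P-maximal in {x, y, z} it must also be P-maximal in {x, y}, so no relation P can reproduce these choices.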
Utilising various well-known results from preference theory on the rationalisability of a selection function by an underlying preference ordering (cf. Moulin, 1985), we are able to prove a series of representation theorems for nonmonotonic inference. The general strategy in proving these results is the following. First, it is shown that any inference operation C that satisfies some set X of conditions may be defined in terms of a selection function S on sets of states satisfying a corresponding set of conditions X*. Next, it is shown that if S satisfies the conditions X*, then S is based on a preference relation P between states (read xPy as: state x is preferred over state y) satisfying some suitable conditions like asymmetry, transitivity, etc. Finally, the two steps are combined to yield a representation theorem for the inference operation C in terms of the preference relation P. The connection between C and P is given by:

α ∈ C(Γ) iff {x ∈ ⟦Γ⟧ : there is no y ∈ ⟦Γ⟧ such that yPx} ⊆ ⟦α⟧,

that is, α is a nonmonotonic consequence of Γ iff every P-maximal member of ⟦Γ⟧ is also a member of ⟦α⟧.
At the end of the paper, we shall also briefly consider dyadic inference operations C, where C_Δ(Γ) is the set of all nonmonotonic consequences of the set of premisses Γ relative to the background assumptions Δ. Dyadic inference operations may be defined from dyadic selection functions on sets of states:⁴ α ∈ C_Δ(Γ) iff S(⟦Δ⟧, ⟦Γ⟧) ⊆ ⟦α⟧.
The notion of a dyadic nonmonotonic inference operation is, of course, closely related to Gärdenfors' concept of theory revision. If K * α is the revision of a theory K with the proposition α, then we have the following natural connection: β ∈ K * α iff β ∈ C_K({α}), or more briefly: K * α = C_K({α}). That is, the revision of K with α is identified with the theory consisting of all the nonmonotonic consequences of α relative to the background theory K.⁵ Conversely, a dyadic nonmonotonic inference relation may be viewed as a generalisation of ordinary theory revision: C_Δ(Γ) may be thought of as the result of revising Δ with the set Γ.

⁴ Binary selection functions were studied by Kanger (2001). However, Kanger's interpretation of S(V, X), where X, V are subsets of some grand domain U, differs from the one employed here. Kanger took S(V, X) to be "the set of those alternatives of V ∩ X which, compared with alternatives of V, are regarded as not being worse than any alternative of V ∩ X" (Kanger, 2001, p. 216). Here, on the other hand, S(V, X) is interpreted as the set of all those alternatives of X which are not farther removed from the set V than any alternatives in X. Thus, we think of the elements of V as the "ideal" alternatives; and the elements of S(V, X) are the elements of X that are as close to being ideal as possible.
⁵ This method of translating back and forth between theories of belief revision and nonmonotonic inference (with single propositions as premises) was suggested by Makinson and Gärdenfors (1991).

2 | DEDUCTIVE LOGICS
This section consists essentially of a review of selected, but well-known, material about consequence relations and consequence operations, some of it going back to the work of Tarski in the 1920s and 1930s. The concepts introduced here are basic to the development of nonmonotonic logic in the rest of the paper. We assume that a fixed object language L is given. The details of L are left open, except that we assume L to contain the standard connectives: ⊥ (falsity), → (the material conditional), ∧ (conjunction) and ∨ (disjunction). Hence, the set Φ of sentences of L is closed under the rules: (i) ⊥ ∈ Φ, and (ii) if α, β ∈ Φ, then (α → β), (α ∧ β), (α ∨ β) ∈ Φ. ¬α is taken as a metalinguistic abbreviation of (α → ⊥).
If Γ is a set of sentences in L and α is a sentence in L, then we write Γ ⊢₀ α just in case α is a tautological consequence of Γ (that is, if α follows from Γ in classical propositional logic). We also write Cn₀(Γ) = {α : Γ ⊢₀ α}, that is, Cn₀(Γ) is the closure of Γ under tautological consequence.
By a consequence relation we shall understand a binary relation ⊢ which takes sets of sentences (in L) as its first argument and single sentences (in L) as its second and which satisfies the following conditions: Here, Γ and Δ are any sets of sentences and α, β are any sentences. By a deductive logic L we shall understand a finitary consequence relation, that is, a consequence relation ⊢_L that satisfies: We say that a deductive logic L is {∧, ∨}-normal if it satisfies the standard natural deduction rules for conjunction and disjunction, that is, By a classical logic we understand a deductive logic that satisfies the following two conditions: That is, a classical logic is a deductive logic which extends the classical propositional calculus and satisfies the deduction theorem. Every classical logic is, of course, {∧, ∨}-normal.
A deductive logic L can equivalently be presented as a finitary consequence operation Cn_L, that is, an operation that takes sets of sentences in L into sets of sentences in L and satisfies the following conditions: In the presence of (Cn1) and (Cn2), (Cn3) is equivalent to the cut rule: , then it also satisfies: This lemma, like several of the theorems and lemmas below, is proved in the Appendix.
Of course, L is a classical logic if, in addition to (Cn1)-(Cn4), it satisfies the following two conditions: (Cn5) Cn₀(Γ) ⊆ Cn_L(Γ); (Supraclassicality) (Deduction Theorem) The two presentations of a deductive logic L are related by the following conditions: and If α ∈ Cn_L(Γ), we say that α is an L-consequence of Γ. We say that α is an L-theorem if α ∈ Cn_L(∅).
Lemma 2.2. If L is a classical logic, then it satisfies the following conditions: (Reductio Ad Absurdum) We omit the straightforward proof of Lemma 2.2. Let L be a deductive logic. L is (absolutely In view of the next lemma, we may speak of L-maximal sets as L-maximal theories. Observe the use of Iteration (i.e., Cut) in the proof of the lemma.
Lemma 2.3. Let L be a deductive logic. Then every L-maximal set is an L-theory.
The proof of the following lemma uses Inclusion, Cut, Monotonicity, Finiteness and the Axiom of Choice in the form of Zorn's Lemma. If L is a deductive logic, then we write M_L, T_L for the set of all L-maximal theories and the set of all L-theories, respectively. m, m′, m″, … are variables ranging over L-maximal theories and G, H, K, T, T′, … range over L-theories. We also introduce the following notation: In what follows, we shall often suppress the subscript L in contexts where the logic is assumed to be fixed.
Lemma 2.5. Let L be a deductive logic. Then, Cn_L(Γ) = ⋂{m ∈ M_L : Γ ⊆ m}.
Proof. (⊆) Suppose that α ∈ Cn_L(Γ) and that m is an L-maximal set such that Γ ⊆ m. It follows by Monotonicity that α ∈ Cn_L(m). Since m = Cn_L(m) (Lemma 2.1), α ∈ m.
(⊇) Suppose that α ∉ Cn_L(Γ). By Lemma 2.4, there exists an L-maximal theory m such that Cn_L(Γ) ⊆ m and α ∉ m. □
Lemma 2.6. Let L be a {∧, ∨}-normal deductive logic. Then, every L-maximal set m satisfies the conditions: Lemma 2.7. Let L be a classical logic. Then,

3 | NONMONOTONIC INFERENCE
We assume that a fixed consistent deductive logic L is given. We are next going to introduce the notion of an inference relation based on the underlying deductive logic L. We shall assume that all relations of nonmonotonic inference that we are going to study are inference relations in the sense defined below. In addition, we introduce the notion of an inference operation, which is just a notational variant of that of an inference relation.
According to (|∼1), an element of Γ is a nonmonotonic consequence of Γ. (|∼2) says that if β is an L-consequence of a set of nonmonotonic consequences of Γ, then β is itself a nonmonotonic consequence of Γ. In other words, the set of nonmonotonic consequences of Γ is closed under L-consequence. According to (|∼3), L-equivalent sets of sentences have the same nonmonotonic consequences.
(b) An inference operation based on L is an operation C : ℘(Φ) → ℘(Φ) satisfying the following conditions: Of course, there is a one-to-one correspondence between inference relations based on L and inference operations based on L. That is, we define the inference operation corresponding to |∼ by: C(Γ) = {α : Γ |∼ α}. Conversely, given C, we define |∼ by: Γ |∼ α iff α ∈ C(Γ). In other words, if Γ ⊢_L α, then Γ |∼ α.
Lemma 3.4. Suppose that L is a classical logic. Then, |∼ is an inference relation based on L iff it satisfies the following conditions: It is easy to verify that conditions (|∼4) and (|∼5) may be replaced in Lemma 3.4 by the single condition: (Right Weakening)
In the next definition, we introduce the notion of an L-maximal theory being Γ-optimal with respect to an inference relation |∼. The L-maximal theories may be thought of as (descriptions of) those possible worlds that are allowed by the underlying logic L. We may think of Γ |∼ α as expressing a (conditional) disposition on the part of an agent to expect α to be true, if she were to be given Γ as her total new information. The set C(Γ) = {α : Γ |∼ α}, then, consists of all the agent's Γ-expectations. A possible world is Γ-optimal if all the Γ-expectations of the agent are true in it. In other words, after having received the total information Γ, the agent would not be surprised at all if any of the Γ-optimal worlds turned out to be the actual one.
Definition 3.5. Let L be a deductive logic and |∼ an inference relation based on L. Let Γ be any set of sentences and m any L-maximal theory. We say that m is Γ-optimal (with respect to |∼) if for all α, if Γ |∼ α, then α ∈ m. In other words, m is Γ-optimal iff C(Γ) ⊆ m.
Lemma 3.6. Let L be a deductive logic and |∼ an inference relation based on L. Then, Γ |∼ α iff for every Γ-optimal m, α ∈ m.

4 | SEMANTICS: MODELS USING SET-VALUED SELECTION FUNCTIONS
In the following we let L be a deductive logic which we assume to be {∧, ∨}-normal.
The notion of a model based on L will be introduced in two steps. First, we define the notion of a structure based on L. After having defined the requisite concepts, a model will be defined as a structure of a special kind.
Definition 4.1. A structure based on L is a 4-tuple M = ⟨U, V, l, S⟩, where
(i) U is a non-empty set, the elements of which are called states (these might be thought of as representing the possible belief states of an agent). We use the lower case letters x, y, z, u as variables ranging over U. The letters X, Y, Z will be variables ranging over ℘(U).
(ii) V is a non-empty family of subsets of U.
(iii) l (the labelling function) is a function that assigns to every state u ∈ U a non-empty set l(u) of L-maximal theories. We may think of the members of l(u) as representing those possible worlds that are compatible with the agent's beliefs in state u (the agent's doxastically possible worlds in state u).
(iv) S is a function from V to V such that for every X ∈ V, S(X) ⊆ X. Such a function we call a selection function on V.
Let M = ⟨U, V, l, S⟩ be a structure based on L. We say that a sentence α holds (or is accepted) in the state u ∈ U (relative to M) and write M ⊩_u α iff for every m ∈ l(u), α ∈ m. That is, M ⊩_u α obtains just in case l(u) ⊆ |α|_L. Intuitively, a sentence α is accepted in a state u just in case α is true in all possible worlds that are compatible with the agent's beliefs in the state u.
The set of all states in which α holds will be written ⟦α⟧_M (or just ⟦α⟧). Thus, ⟦α⟧ = {u ∈ U : M ⊩_u α}. For a set of sentences Γ, we write: ⟦Γ⟧ = {u ∈ U : M ⊩_u α for all α ∈ Γ}, that is, ⟦Γ⟧ is the set of all states in which all sentences in Γ are accepted.
Given any set X of states in M, we may also define the set t(X) of sentences that are accepted in all the states in X, that is, t(X) = {α : X ⊆ ⟦α⟧}. Notice that the pair of mappings ⟦…⟧ and t together form a Galois connection between ℘(Φ) and ℘(U), that is, they satisfy: It follows from (i)-(iv) that these mappings also satisfy: For every set X ⊆ U, we define the closure of X, Cl(X), as the set

(*) ⟦t(X)⟧ = ⋂{⟦α⟧ : X ⊆ ⟦α⟧}.

⁹ The notion of a Galois connection and its use in model theory is discussed, for example, in Cohn (1965).
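The two mappings and the closure operation can be computed directly in a finite structure. The Python sketch below is only illustrative: the three worlds, the three labelled states and the three-sentence language are hypothetical, and the asserted properties are the usual ones for an order-reversing Galois connection (X ⊆ ⟦t(X)⟧ and Γ ⊆ t(⟦Γ⟧)), assumed here to be among those intended in the text.

```python
# Hypothetical worlds (valuations), states labelled by sets of worlds, and sentences.
WORLDS = {
    "w1": {"p": True, "q": True},
    "w2": {"p": True, "q": False},
    "w3": {"p": False, "q": True},
}
LABEL = {"u1": {"w1"}, "u2": {"w1", "w2"}, "u3": {"w3"}}
SENTENCES = {
    "p": lambda w: w["p"],
    "q": lambda w: w["q"],
    "p or q": lambda w: w["p"] or w["q"],
}

def accepted(u, alpha):
    """alpha is accepted in state u iff it is true in every world labelling u."""
    return all(SENTENCES[alpha](WORLDS[w]) for w in LABEL[u])

def states_of(gamma):
    """[[Gamma]]: the states in which all sentences of Gamma are accepted."""
    return frozenset(u for u in LABEL if all(accepted(u, a) for a in gamma))

def theory_of(X):
    """t(X): the sentences accepted in every state of X."""
    return frozenset(a for a in SENTENCES if all(accepted(u, a) for u in X))

def closure(X):
    """Cl(X) = [[t(X)]]."""
    return states_of(theory_of(X))

X = frozenset({"u2"})
print(sorted(theory_of(X)))  # ['p', 'p or q']
print(sorted(closure(X)))    # ['u1', 'u2']
```

Here u2 leaves q undecided, so everything accepted in u2 is also accepted in u1, and the closure of {u2} picks up u1. In a world model, where every l(u) is a singleton of a maximal theory, this kind of absorption would not occur for singleton sets.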
The closure of X is the intersection of all closed subsets of U that include X.
Lemma 4.2. Let M = ⟨U, V, l, S⟩ be a structure based on L. Then, the operator Cl : ℘(U) → ℘(U), defined by the equation (*) above, satisfies the following conditions. For all X, Y ⊆ U, We are now ready to define the notion of a model based on L.
Definition 4.3. Let M = ⟨U, V, l, S⟩ be a structure based on L. We say that M is a model (based on L) if the family V satisfies the following conditions: That is, a model is a structure in which the domain V of the selection function contains all closed subsets of U and is closed under finite unions and arbitrary intersections.
For any model M = ⟨U, V, l, S⟩ based on L, we define two corresponding relations ⊧_M and |∼_M between sets of sentences and single sentences:

Γ ⊧_M α iff ⟦Γ⟧ ⊆ ⟦α⟧;
Γ |∼_M α iff S(⟦Γ⟧) ⊆ ⟦α⟧.

That is, Γ ⊧_M α obtains just in case α is accepted in all the Γ-states (i.e., in all the states in which all sentences in Γ are accepted). And Γ |∼_M α obtains just in case α is accepted in all the most preferred Γ-states.
Lemma 4.4. If M = ⟨U, V, l, S⟩ is a model based on L, then ⊧_M is a consequence relation which extends L, that is, such that: and |∼_M is an inference relation based on L.
Proof. The easy verification that ⊧_M is a consequence relation extending L is omitted. We prove that |∼_M is an inference relation based on L.
We say that a model M = ⟨U, V, l, S⟩ is a world model if l(u) is a unit set for each u ∈ U. In a world model, the set of sentences accepted in a state is always L-maximal. Since L is assumed to be {∧, ∨}-normal, we have for any model M and all sentences α and β:
Theorem 4.5. Let L be a {∧, ∨}-normal deductive logic and let |∼ be an inference relation based on L. Then, there exists a world model M = ⟨U, V, l, S⟩ (based on L) such that: that is, ⊢_L and |∼ are, respectively, the consequence relation and inference relation determined by M.
Proof. We define a structure M = ⟨U, V, l, S⟩, which we shall call the canonical model for ⊢_L and |∼, as follows: (i) U = M_L, that is, U is the set of all L-maximal theories; (ii) V = {|Γ|_L : Γ is a set of sentences in L}, that is, V consists of all closed subsets of U; (iii) for each u ∈ U, l(u) = {u}; (iv) we define S as follows: for any set X ∈ V, consider the theory t(X) determined by X, namely: Thus, the canonical model for ⊢_L and |∼ is a structure based on L. In this structure, we have ⟦Γ⟧ = |Γ|_L, for all Γ.
Proof of (*): S(|Γ|) is the set of all t(|Γ|)-optimal L-maximal sets. But t(|Γ|) = Cn_L(Γ), so S(|Γ|) is the set of all L-maximal sets that are Cn_L(Γ)-optimal. However, m is Cn_L(Γ)-optimal iff m is Γ-optimal (since C(Cn(Γ)) = C(Γ)).
Hence, S(|Γ|) is the set of all L-maximal theories that are Γ-optimal. It follows by Lemma 3.6 that Γ |∼ α iff for all m ∈ S(|Γ|), α ∈ m. Q.E.D.
It only remains to show that the canonical model M = ⟨U, V, l, S⟩ for ⊢_L and |∼ is indeed a model, that is, satisfies conditions (i)-(iii) of Definition 4.3: Condition (i) is immediate from the definition of V.

Condition (ii):
We first prove that the closure operation of the canonical model satisfies: In order to prove the other direction, assume that m ∈ Cl(X
Condition (iii). Let F be a non-empty family of elements in V. By (Cl 2) of Lemma 4.2,
Remark 4.6. Let |∼ be an inference relation based on L. Let M = ⟨U, V, l, S⟩ be the corresponding canonical model. Then, we have for all Γ ⊆ Φ and X ∈ V: It follows that for all sets of sentences Γ and all X ∈ V, (v) C(t(X)) = t(S(X)); and (vi) S(⟦Γ⟧) = ⟦C(Γ)⟧, that is, the two diagrams in Figure 2 commute.
By a canonical model for L we understand a model M = ⟨U, V, l, S⟩ such that: (i) U is the set of all maximal L-theories; (ii) V is the set of all closed subsets of U, that is, It is easy to see that a canonical model for L is the canonical model for L and the inference operation C_M defined by: C_M(Γ) = t(S(⟦Γ⟧)). That is, we also have: The next lemma states that the set V of a canonical model has certain important closure properties: V is closed under finite unions and arbitrary intersections and contains all singleton sets. It follows that V contains all finite subsets of U.
Lemma 4.7. Let M = ⟨U, V, l, S⟩ be a canonical model for L. Then, for all X, Y ∈ V, and m ∈ U, is, all singleton sets are closed; (iii) all finite subsets of U are members of V.
Proof. We have already proved (i) in the course of proving Theorem 4.5. Observe that in the proof of (i), we used the fact that L is closed under the standard natural deduction rules for ∨ (see the Appendix). (ii) U is the set of all L-maximal theories. Hence, Cl({m}) = {m}. (iii) follows immediately from (i) and (ii). □
We shall now consider some natural conditions that we might want to impose on the selection function in a model. Most of these are taken from the literature on choice functions and
Together with (Cl 1)-(Cl 4), this condition implies that the closure operation of a canonical model is a topological closure operation in the sense of Kuratowski (see, for instance, Kelley (1955), p. 43).
We conclude this section by discussing some consequences of the conditions above in the context of L being classical. First of all, L being classical implies that a finite set of premises may be treated as a conjunction, that is, α ∧ β |∼ γ iff {α, β} |∼ γ.
Lemma 4.11. (Makinson) Suppose that C is an inference operation based on a classical logic L. If C satisfies Distribution, then the following conditions are also satisfied: Since Chernoff implies Distribution, the assumption of Chernoff yields, in the context of classical logic, Conditions (i)-(iv).
Condition (iv) would license inferences of the kind: (1) If Squeaky is a mammal, then it is expected that Squeaky cannot fly.
(2) If Squeaky is a mammal and a bat, then it is expected that Squeaky can fly.
(3) Hence: if Squeaky is a mammal, then it is expected that Squeaky is not a bat.
If L is classical and C satisfies Arrow, then we also have: This principle yields inferences of the kind: (1) If Squeaky is a mammal, then it is expected that Squeaky cannot fly.
(2) If Squeaky is a mammal, then it is not expected that Squeaky is not a dog.
(3) Hence: if Squeaky is a mammal and a dog, then it is expected that Squeaky cannot fly.
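Both Squeaky patterns can be reproduced in a small ranked model. The Python sketch below is an illustration only: the four atoms and the `rank` function are hypothetical, and "it is expected that" is read as truth in all lowest-ranked states satisfying the premises.

```python
from itertools import product

ATOMS = ("mammal", "bat", "dog", "fly")
STATES = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=len(ATOMS))]

def rank(state):
    """Hypothetical ranking (lower is more normal); not taken from the paper."""
    if not state["bat"] and not state["fly"]:
        return 0  # non-bats that do not fly are the most normal case
    if state["bat"] and state["fly"]:
        return 1  # flying bats are the normal kind of bat
    return 2      # flying non-bats and non-flying bats

def expected(premises, conclusion):
    """The conclusion holds in all lowest-ranked states satisfying the premises."""
    candidates = [s for s in STATES if all(p(s) for p in premises)]
    if not candidates:
        return True
    best = min(rank(s) for s in candidates)
    return all(conclusion(s) for s in candidates if rank(s) == best)

mammal = lambda s: s["mammal"]
bat = lambda s: s["bat"]
dog = lambda s: s["dog"]
can_fly = lambda s: s["fly"]
cannot_fly = lambda s: not s["fly"]
not_bat = lambda s: not s["bat"]
not_dog = lambda s: not s["dog"]

# First pattern: mammals cannot fly, mammal bats can, so mammals are not bats.
print(expected([mammal], cannot_fly), expected([mammal, bat], can_fly),
      expected([mammal], not_bat))
# Second pattern: mammals cannot fly, "not a dog" is not expected,
# so mammal dogs cannot fly either.
print(expected([mammal], cannot_fly), expected([mammal], not_dog),
      expected([mammal, dog], cannot_fly))
```

In this model the premises of each pattern come out as stated and so does the conclusion; a single model of course only illustrates the principles and does not establish them.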

5 | REPRESENTATION THEOREMS
In the last section, we proved a series of results connecting properties of the inference operation C with properties of the selection function S in the canonical model corresponding to C. In this section we wish to explore under what conditions a given selection function can be defined in terms of an underlying preference relation P on the set U of all states. In order to make this question precise, we introduce the notion of a choice structure:¹⁶
¹⁶ The term "choice structure" is borrowed from Hansson (1968), although his notion of a choice structure is not exactly the one defined here: Hansson's choice structures satisfy weaker structural conditions on the set V, but stronger conditions on the selection function S.
Definition 5.1. A choice structure is an ordered triple 𝒮 = ⟨U, V, S⟩, where U is a non-empty set, V is a non-empty family of subsets of U, and S is a function from V to V, such that:
U is the domain of 𝒮 and the elements of U are here called states (or points). Axioms (ii)-(iv) say that the elements of V form the closed sets of a topological space over U. Hence, it is appropriate to refer to the elements of V as the closed sets of 𝒮. For any X ⊆ U, we write Cl(X) for the closure of X, that is, the intersection of all closed sets that include X. Cl, of course, satisfies the axioms (Cl 1)-(Cl 4) of a topological closure operation. A topological space satisfying condition (i), that all singleton sets are closed, is called a T₁-space. It follows from (i) and (iii) that all finite sets are members of V. S is the selection function (or the choice function) of the structure 𝒮. According to (v), S selects a subset of elements from any closed subset X of U. Since S is an operation on V, S(X) is always a closed set.
The principal case we are interested in is the following: a {∧, ∨}-normal deductive logic L and an inference operation C based on L are given. 𝒮 = ⟨U, V, S⟩ is defined in terms of L and C as follows: (i) U is the set of all L-maximal theories; …
In this case, 𝒮 = ⟨U, V, S⟩ is essentially identical to the canonical model for L and C.
In this section, we shall think of the set U of states as being provided with a preference relation P ⊆ U × U (we read xPy as: x is better than y). In terms of such a relation, we can define the selection function S : V → V as follows: for all X ∈ V:
(∗) S(X) = {x ∈ X : ∀y(y ∈ X → ¬(yPx))}.
That is, S(X) is the set of all P-maximal elements of X. We say that S is based on the relation P, and that P rationalises S, if S is defined from P by means of the equation (∗). S is said to be rationalisable if there is a relation that rationalises it.
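For finite sets of states, the selection of the P-maximal elements can be sketched in a few lines. This is only an illustration under assumptions that are not part of the text: states are arbitrary hashable values, the relation P is represented as a set of ordered pairs (x, y) read as "x is better than y", and the function name `select` is hypothetical.

```python
def select(X, P):
    """Return the P-maximal elements of X.

    X is a finite set of states; P is a set of pairs (x, y), read as
    'x is better than y'. An element x of X is P-maximal in X iff no
    y in X satisfies yPx.
    """
    return {x for x in X if not any((y, x) in P for y in X)}


# A transitive chain a P b P c: the best element of each set is selected.
P_chain = {("a", "b"), ("b", "c"), ("a", "c")}

# A two-element cycle: neither element is maximal, so the selection is empty.
P_cycle = {("a", "b"), ("b", "a")}
```

With the cyclic relation the selection from {a, b} is empty, which shows why a further condition on P is needed if every non-empty set is to have a non-empty selection.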
We use the following terminology for preference relations. We use xRy as an abbreviation for ¬(yPx). P is said to be:
(i) a strict partial ordering iff P is asymmetric and transitive;
(ii) a strict weak ordering iff P is asymmetric and ∀x∀y∀z(xRy ∧ yPz → xPz);
(iii) a strict linear order iff P is a strict partial ordering and ∀x∀y(xRy ∧ x ≠ y → xPy);
(iv) neat iff every non-empty element X of V contains a P-maximal element, that is, an x such that ∀y(y ∈ X → ¬(yPx)).¹⁷
¹⁷ Our terminology here differs from Kanger (2001), who uses "neat" to refer to the stronger property of P⁻¹ (the converse of P) being well-founded. Thus, P is neat in Kanger's sense just in case every non-empty subset of U has a P-maximal element, that is, iff there are no infinitely ascending P-chains in U. Neatness in our sense only requires every non-empty closed subset of U to contain a P-maximal element. Neatness is analogous to the limit assumption of Lewis (1973).
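On a finite set of states these properties can be tested by brute force. The sketch below follows the definitions (i)-(iv) literally; the representation of P as a set of pairs, the passing of V as an explicit list of sets, and all function names are assumptions made for the purpose of illustration.

```python
from itertools import product


def weakly_preferred(P, x, y):
    """xRy, that is, not yPx."""
    return (y, x) not in P


def is_asymmetric(P):
    return all((y, x) not in P for (x, y) in P)


def is_transitive(P):
    return all((x, z) in P for (x, y) in P for (w, z) in P if y == w)


def is_strict_partial(P):
    return is_asymmetric(P) and is_transitive(P)


def is_strict_weak(U, P):
    """Asymmetric, and xRy together with yPz implies xPz."""
    return is_asymmetric(P) and all(
        (x, z) in P
        for x, y, z in product(U, repeat=3)
        if weakly_preferred(P, x, y) and (y, z) in P
    )


def is_strict_linear(U, P):
    """Strict partial ordering in which xRy and x != y implies xPy."""
    return is_strict_partial(P) and all(
        (x, y) in P
        for x, y in product(U, repeat=2)
        if weakly_preferred(P, x, y) and x != y
    )


def is_neat(V, P):
    """Every non-empty member of V contains a P-maximal element."""
    return all(
        any(all((y, x) not in P for y in X) for x in X)
        for X in V
        if X
    )
```

For example, on U = {1, 2, 3} the relation {(1, 2)} is a strict partial ordering but not a strict weak ordering, since 3R1 and 1P2 hold while 3P2 does not.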
(b) P is neat iff S satisfies Consistency Preservation.
(c) If P is irreflexive, then P is unique and for all x, y ∈ U, xPy iff y ∉ S({x, y}).
Proof. Suppose that P rationalises S. Then we have, for all Z ∈ V and all z ∈ Z,
(1) z ∈ S(Z) ↔ (∀y)(y ∈ Z → ¬(yPz)).
The irreflexivity of P then yields:
To prove the other direction of the lemma, suppose that for all X ∈ V,
(3) S(X) = {x ∈ X : (∀y)(y ∈ X → x ∈ S({x, y}))}.
(3) and (4) then yield:
that is, P rationalises S. □
In the theory of preference and choice, there are many well-known theorems relating conditions on the selection function S, like Chernoff, Aizerman, Gamma, etc., to the existence of an underlying preference relation P.¹⁸ The following theorem is a slight strengthening of a result by Sen (1971).¹⁹
¹⁸ Cf. Hansson (1968), Sen (1971), Moulin (1985), Kanger (2001).
¹⁹ Sen proved that a selection function that satisfies Consistency Preservation is rationalisable iff it satisfies Chernoff and Gamma. Theorem 2 of Moulin (1985) is Sen's theorem for the case when U is finite.
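The finite case of Sen's result can be explored experimentally. The sketch below assumes the standard choice-theoretic formulations, which may differ in detail from the conditions as stated for closed sets in this paper: Chernoff as "X ⊆ Y implies S(Y) ∩ X ⊆ S(X)" and Gamma in its binary form "S(X) ∩ S(Y) ⊆ S(X ∪ Y)". A selection function is represented as a dictionary over all non-empty subsets of a finite U, and the candidate relation is recovered by the clause xPy iff y ∉ S({x, y}). All names are hypothetical.

```python
from itertools import combinations


def subsets(U):
    """All non-empty subsets of the finite set U, as frozensets."""
    items = sorted(U)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def maximal(X, P):
    """The P-maximal elements of X."""
    return frozenset(x for x in X if not any((y, x) in P for y in X))


def chernoff(S):
    """Assumed form: X ⊆ Y implies S(Y) ∩ X ⊆ S(X)."""
    return all(S[Y] & X <= S[X] for X in S for Y in S if X <= Y)


def gamma(S):
    """Assumed binary form: S(X) ∩ S(Y) ⊆ S(X ∪ Y)."""
    return all(S[X] & S[Y] <= S[X | Y] for X in S for Y in S)


def revealed(S, U):
    """Candidate relation: xPy iff y is rejected from the set {x, y}."""
    return {(x, y) for x in U for y in U if y not in S[frozenset({x, y})]}


def rationalised_by_revealed(S, U):
    """True iff S selects exactly the maximal elements of the revealed relation."""
    P = revealed(S, U)
    return all(S[X] == maximal(X, P) for X in S)


U = {"a", "b", "c"}
P0 = {("a", "b"), ("b", "c"), ("a", "c")}

# A selection function defined from a relation.
S_good = {X: maximal(X, P0) for X in subsets(U)}

# The same function, except that b is chosen from {a, b} although a is chosen
# from the larger set {a, b, c}.
S_bad = dict(S_good)
S_bad[frozenset({"a", "b"})] = frozenset({"b"})
```

Here `S_good` passes both checks and is recovered from its revealed relation, whereas `S_bad` fails the assumed Chernoff condition and is not rationalised by its revealed relation.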

6 | DYADIC INFERENCE OPERATIONS AND INFINITARY BELIEF REVISION
In this section we shall explore the connection between nonmonotonic inference and belief revision in the sense of Alchourrón et al.²¹ In doing so, we generalise the notion of belief revision to allow for the revision of a set of beliefs with a possibly infinite set of propositions representing the new information.²² A representation theorem is proved for the generalised notion of belief revision in terms of systems of spheres of the kind introduced by Grove (1988).
In Makinson and Gärdenfors (1991) a method is described for translating postulates for belief revision into postulates for nonmonotonic inference, and vice versa.²³ The basic idea here is to interpret β ∈ K * α as a claim that β is a nonmonotonic consequence of α, relative to the background (or default) theory K. That is, β ∈ K * α is translated as α |∼_K β, where |∼_K is a nonmonotonic inference relation associated with the theory K. Expressing this equivalence in terms of an inference operation C_K instead, we get, for a fixed K, the identity:
K * α = C_K(α).
This idea can be generalised: thinking of C as a binary operation and allowing K to be replaced by an arbitrary set of sentences Δ, we get:
Δ * α = C_Δ(α).
In other words, for any Δ and α, the revision of Δ with α is identified with the set of nonmonotonic consequences of α, relative to the default assumptions Δ. Now, in order to get complete interdefinability between the notions of belief revision and nonmonotonic inference, we need just one more step: we must allow for the possibility of a set Δ being revised with a possibly infinite set of propositions Γ. Then, for all Γ and Δ, we obtain:
Δ * Γ = C_Δ(Γ).
²¹ Cf. Alchourrón et al. (1985) and Gärdenfors (1988).

²² Infinitary belief revision has been studied before in the literature. Fuhrmann (1988) considers both infinitary belief contraction operations and infinitary belief revision operations. He refers to these kinds of operations as multiple contraction and multiple revision, respectively. Via a generalisation of the so-called Levi identity, Fuhrmann defines infinitary belief revision in terms of infinitary belief contraction. He also formulates a set of postulates for infinitary belief revision which is equivalent to (BC1)-(BC4) together with (BC9) (Fuhrmann, 1988, p. 159). S. O. Hansson (1989) contains a theory of infinitary belief contraction.
Conversely, given a notion of belief revision Δ * Γ, we may, of course, define the corresponding notion of nonmonotonic inference via the same equality.
In our formal treatment, however, we shall not make a complete identification between the notions of belief revision and nonmonotonic inference. Instead, we take the former as a special case of the latter, in the sense of being characterised by stronger axioms. The axioms for belief revision presented here are straightforward generalisations to the infinitary case of Gärdenfors' (1988) basic axioms (K*1)-(K*6) for finitary belief revision.²⁴
Definition 6.1. Let L be a consistent deductive logic.
(a) A dyadic inference operation based on L is an operation C : ℘(Φ) × ℘(Φ) → ℘(Φ) satisfying the following conditions. For easy readability, we shall write C_Δ(Γ) instead of C(Δ, Γ). We also write Γ + Δ as an abbreviation of Cn_L(Γ ∪ Δ). We speak of Γ + Δ as the expansion of Γ with Δ.
Here, the preferred reading of C_Δ(Γ) is: 'the result of revising the set Δ with the new information Γ'.
Notice that (BC1)-(BC3) say no more than that, for any fixed Δ, C_Δ(…) is an inference operation in the sense of Definition 3.1 (b). The axioms (BC1)-(BC5) should be compared with the corresponding axioms in Gärdenfors (1988), namely:
The following two axioms correspond to Expansion:
Finally, there is:
To the basic axioms (BC1)-(BC5) for belief revision, we might want to add some of the following supplementary axioms:

²⁴ The reader should perhaps also be reminded that we make weaker assumptions concerning the underlying logic than Gärdenfors does. We assume only that it is a deductive logic, that is, a finitary Tarski-style consequence relation. He assumes, in addition, that it is classical, that is, closed under the axioms and rules (including the deduction theorem) of classical propositional logic.
, where F is any non-empty family of sets of sentences; (Gamma)
Provided that Cn_L({α ∧ β}) = Cn_L({α, β}), (BC6) yields the following supplementary axiom of Gärdenfors:
Under the same provision, (BC9) yields Revision by Conjunction:
which is equivalent to (K*7) together with the other of Gärdenfors' supplementary axioms:
It is straightforward to modify the notion of a model M = ⟨U, V, l, S⟩ based on L, which was introduced in Section 4, in such a way as to get a semantics for dyadic inference operations. The only difference occurs in clause (iii) of Definition 4.1, which has to be changed to:
we call a dyadic selection function on V.
Each model M = ⟨U, V, l, S⟩ of the new kind determines two operations: Cn_M(Γ) = t(⟦Γ⟧) and C^M_Δ(Γ) = t(S(⟦Δ⟧, ⟦Γ⟧)), where the first operation is a consequence operation that extends L and the second is a dyadic inference operation based on L (cf. Lemma 4.4). Now, for any {∧, ∨}-normal deductive logic L and any dyadic inference operation C, we may define the corresponding canonical model M = ⟨U, V, l, S⟩, where:
The proof of Theorem 4.5 carries over unchanged, so we have that Cn_L = Cn_M and, for all Γ, Δ, C_Δ(Γ) = C^M_Δ(Γ). That is, Cn_L and C are, respectively, the consequence operation and the dyadic inference operation that are determined by the canonical model M.
In addition to letting the selection functions take an extra argument, we apply the same procedure to the preference relations. That is, we write xP_X y and read it as: x is preferred over (or better than) y, relative to X. Intuitively, xP_X y means that x is closer to the optimal alternatives in X than y. We shall refer to ternary relations P ⊆ U × ℘(U) × U as (relativised) preference relations. Properties of relations like reflexivity, transitivity, being a weak ordering, etc., carry over to relativised preference relations as follows: a given property is said to apply to P iff for each X, P_X has the property in question. A dyadic selection function S :
We have now introduced the concepts that are required in order to state the following representation theorem.
Theorem 6.2. (Representation Theorem II) Let C be a belief revision operation based on the deductive logic L. Let M = ⟨U, V, l, S⟩ be the canonical model for L and C. Then, Cn_L = Cn_M, for all Γ, Δ, C_Δ(Γ) = C^M_Δ(Γ), and for all X, Y ∈ V:
Proof. (to be written)
We are next going to prove that any (infinitary) belief revision operation C that satisfies (BC1)-(BC5) together with (BC9) can be defined in terms of "systems of spheres" of the kind defined in Grove (1988). Theorems 6.4 and 6.5 below for infinitary belief revision operations should be compared with Grove's (1988) Theorems 1 and 2 for finitary belief revision operations. In the following, we let L be a fixed deductive logic and U the set M_L of all L-maximal theories. V is the set of all closed subsets of U.
Definition 6.3. (a) A family of spheres centred on X ∈ V is a collection $_X of elements in V satisfying the conditions:²⁵
In other words, for every closed set Y ∈ V, there exists a smallest sphere in $_X intersecting Y.
(b) A system of spheres is a function $ that associates a family of spheres with any set X ∈ V.
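The role of the smallest sphere intersecting Y can be illustrated on a finite example. The sketch below follows the construction familiar from Grove (1988), in which the states selected for revision by Y are those members of Y that lie in the smallest sphere meeting Y. It rests on assumptions not spelled out here: the family of spheres is given as a finite list of nested sets ordered from the innermost sphere outwards, and the names `is_nested`, `smallest_sphere` and `revise` are hypothetical.

```python
def is_nested(spheres):
    """True iff each sphere is included in the next one (innermost first)."""
    return all(a <= b for a, b in zip(spheres, spheres[1:]))


def smallest_sphere(spheres, Y):
    """Return the first (hence smallest) sphere that intersects Y, or None."""
    for sphere in spheres:
        if sphere & Y:
            return sphere
    return None


def revise(spheres, Y):
    """States selected when revising by Y: Y within the smallest sphere meeting Y."""
    sphere = smallest_sphere(spheres, Y)
    return frozenset() if sphere is None else frozenset(sphere & Y)


# A family of spheres centred on {1} over the states 1..5 (illustrative data).
SPHERES = [frozenset({1}), frozenset({1, 2, 3}), frozenset({1, 2, 3, 4, 5})]
```

With these spheres, revising by Y = {3, 4} selects the state 3, since the middle sphere is the smallest one meeting Y, while revising by a Y that meets the centre selects only states in the centre.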

²⁵ ($4) is a strengthening of Grove's (1988) limit assumption. For a discussion of the limit assumption in the context of possible worlds semantics for counterfactuals, see Lewis (1973).
Gamma: First, we assume (g) and prove (G). Condition (g) yields: x₀ ∉ S(X). It follows that there exists some x₁ ∈ X such that x₁Px₀. If x₁ ∉ S(X), then there exists some x₂ ∈ X such that x₂Px₁, and so on. For any n, if xₙ ∉ S(X), we choose xₙ₊₁ in such a way that xₙ₊₁Pxₙ and xₙ₊₁ ∈ X; and if xₙ ∈ S(X), we terminate the process. Since P⁻¹ is well-founded, this process must terminate after a finite number of steps. Thus, we get a finite sequence (with at least two terms) x₀, x₁, …, xₙ of elements in X such that x₀ ∉ S(X) and xₙ ∈ S(X) and xₙPxₙ₋₁P…x₁Px₀. By the transitivity of P, xₙPx₀. Since S(X) ⊆ Y, we get that xₙ ∈ Y. Thus, we have: x₀ ∈ S(Y) (by assumption), xₙPx₀ and xₙ ∈ Y, that is, a contradiction. Hence, we have proved that S(Y) ⊆ S(X). □
Lemma 2.4. (Lindenbaum's Lemma) Let L be a deductive logic. (a) Every L-consistent set is included in an L-maximal theory. (b) If α ∉ Cn_L(Γ), then there exists an L-maximal theory m such that Cn_L(Γ) ⊆ m and α ∉ m.

Lemma 5.2. Let S : V → V be a selection function and P a relation that rationalises S. Then, (a) P is irreflexive iff S satisfies the condition: (ir) S({x}) ≠ ∅, for each x ∈ U. (Irreflexivity)
Theorem 5.8. (Representation Theorem I) Let C be an inference relation based on the deductive logic L. Let M = ⟨U, V, l, S⟩ be the canonical model for L and C. Then, L = L_M and C = C_M and:
(i) C satisfies Chernoff and Gamma iff S is based on a relation P ⊆ U × U.
(ii) If C satisfies (CP), Chernoff, Gamma and Aizerman, then S is based on a neat strict partial ordering P ⊆ U × U.
(iii) If C satisfies (CP) and Arrow, then S is based on a neat strict weak ordering P ⊆ U × U.

(BC1) C_Δ(Γ) is an L-theory; (Closure)
(BC2) Γ ⊆ C_Δ(Γ); (Success)
(BC3) if Cn_L(Γ) = Cn_L(Δ) and Cn_L(Σ) = Cn_L(Π), then C_Σ(Γ) = C_Π(Δ). (Congruence)
(b) An (infinitary) belief revision operation based on L is a dyadic inference operation C satisfying, in addition to (BC1)-(BC3), the following conditions:
(a) If C, in addition to the basic axioms (BC1)-(BC5), also satisfies axioms (BC6) (Chernoff) and (BC7) (Gamma), then there exists a (relativised) preference relation P ⊆ U × ℘(U) × U such that P is neat and S is based on P.
(b) If C satisfies the conditions (BC1)-(BC8), then there exists a relation P ⊆ U × ℘(U) × U such that P is neat and transitive and S is based on P.
(c) If C satisfies (BC1)-(BC5) together with (BC9) (Arrow), then there exists a relation P ⊆ U × ℘(U) × U such that P is a neat strict weak ordering and S is based on P.