

Suppose some future psychologists are asked to investigate the mental faculties of a newly discovered intelligent life form. Sent to the planet that these creatures inhabit, the psychologists make observations, conduct experiments, administer tests, and finally compose a report for the Journal of Extraterrestrial Psychology. In reading this report, we would probably not be too surprised to find that the creatures have visual systems different from ours, and perhaps different perceptual experiences. Nor is it hard to imagine these creatures having vastly greater (or smaller) memory capacity. But suppose we are also told that the creatures reject simple, familiar inference principles in favor of equally simple but to our minds obviously incorrect principles.

To us earthlings, an intuitively straightforward inference principle is the one logicians call modus ponens. According to this principle, the proposition IF so-and-so THEN such-and-such and the proposition So-and-so jointly entail the proposition Such-and-such.1 For example, from the propositions IF Calvin deposits 50 cents THEN Calvin will get a coke and Calvin deposits 50 cents, it follows that Calvin will get a coke. We can write this in the form shown in (1), where the sentences above the line are called premises and the sentence below the line the conclusion.

(1) If Calvin deposits 50 cents, Calvin will get a coke.
    Calvin deposits 50 cents.
    ----------------------------------------
    Calvin will get a coke.

Now suppose that our extraterrestrials have exactly the same inference skills that we do, but with one important exception. They reject all modus ponens arguments, such as (1), and adopt instead a contrary principle we might call modus shmonens: From IF so-and-so THEN such-and-such and So-and-so, they conclude NOT such-and-such. For instance, they would say that the conclusion of (1) doesn't follow from its premises, but that the conclusion of (2) does.

(2) If Calvin deposits 50 cents, Calvin will get a coke.
    Calvin deposits 50 cents.
    ----------------------------------------
    Calvin will not get a coke.

The existence of creatures who systematically deny modus ponens and accept modus shmonens would be extremely surprising - much more surprising than the existence of creatures who differ from us in basic perceptual or memory abilities. In a situation like this one, we would probably


be more apt to blame the translation into English from whatever language the creatures speak than to accept the idea that they sincerely believe in modus shmonens (Davidson 1970; Dennett 1981; Frege 1893/1964; Lear 1982; Quine 1970, chapter 6). Indeed, our reluctance to attribute exotic inferences even to exotic creatures is an interesting property of our thought processes.2 Modus ponens and other inference principles like it are so well integrated with the rest of our thinking - so central to our notion of intelligence and rationality - that contrary principles seem out of the question. As Lear (1982, p. 389) puts it, "We cannot begin to make sense of the possibility of someone whose beliefs are uninfluenced by modus ponens: we cannot get any hold on what his thoughts or actions would be like." Deep-rooted modes of thought such as these are important objects of psychological investigation, since they may well turn out to play a crucial organizing role for people's beliefs and conjectures - or so I will try to argue.

This book leads up to the conclusion that cognitive scientists should consider deductive reasoning as a basis for thinking. That is, it explores the idea - which I call the Deduction-System Hypothesis - that principles such as modus ponens are central to cognition because they underlie many other cognitive abilities. Part of the content of this proposal is that the mental life of every human embodies certain deduction principles - that these principles are part of the human cognitive architecture and not just the properties of people schooled in logic. This approach also views deduction as analogous to a general-purpose programming system: Just as we can use a computer language such as Basic or Pascal to keep track of our finances or to solve scientific or engineering problems, we can use deductive reasoning to accomplish a variety of mental activities.3
In particular, we will see how deduction allows us to answer questions from information stored in memory, to plan actions to obtain goals, and to solve certain kinds of problems. The Deduction-System Hypothesis is necessarily vague at this point, but I hope to clarify it during the course of the book. In the meantime, it may help to think of deduction systems as similar to production systems but incorporating more flexible forms of representation and process.

Goals

The Deduction-System Hypothesis is controversial, both in cognitive science generally and in the psychology of reasoning in particular. Some


current approaches in cognitive science deal with deduction in the same way that they deal with mundane cognitive skills such as the ability to tell time on a standard clock. According to these approaches, deduction is a similarly specialized task that is useful mainly for solving puzzles in logic or proving theorems in mathematics.

There are several possible reasons for supposing that the Deduction-System Hypothesis is false. First, the results of many psychological experiments show that subjects' performance on tasks that explicitly call for deduction is often quite poor. If deduction were a truly central cognitive ability, then one would suppose that performance would be nearly optimal. Second, one might hold that deduction is in its essentials no different from other problem-solving activities (e.g., solving chess or crossword puzzles) and can be explained according to exactly the same principles. Hence, there may be no reason to suppose that deduction has psychological primacy with respect to other such tasks. Third, there is a possibility that deduction is not sufficiently flexible to provide a reasonable basis for higher cognition. Deduction operates by adding conclusions derived from stated premises, as in (1) above. However, much of our thinking involves retracting beliefs that we now suspect are erroneous, and deduction provides us with no obvious way to do that (at least, outside of special deduction forms such as reductio ad absurdum - see chapter 2).

There are some short answers one could give to these objections. For example, one could point out that, although people do have trouble with certain sorts of reasoning problems, they do perfectly well with others; their accuracy on the latter rivals their accuracy on almost any cognitive task. One could also mention that many cognitive scientists - including some who think that deduction is nothing special - take production systems extremely seriously as a basis for cognitive processing.
Yet these systems operate on principles that look suspiciously like modus ponens. However, a truly convincing defense of the Deduction-System Hypothesis entails setting out the details of such a theory and demonstrating that it successfully predicts mental successes and failures on a range of cognitive tasks. Developing and testing such a psychological theory is the main goal of this book.

General Plan of the Chapters

Although the study of reasoning has a long history in experimental psychology (see Woodworth 1938, chapter 30), its contribution to cognitive


theory has been slight, because researchers have tended to focus on very specialized deduction tasks. Only within the last few years have cognitive psychologists come up with models that are able to describe more than a trivially small set of inferences. My first chapter examines the limitations of traditional approaches and points to some design features for more adequate models.

A key concept of the theory developed in this book is the notion of a mental proof. According to this approach, a person faced with a task involving deduction attempts to carry it out through a series of steps that take him or her from an initial description of the problem to its solution. These intermediate steps are licensed by mental inference rules, such as modus ponens, whose output people find intuitively obvious. The resulting structure thus provides a conceptual bridge between the problem's "givens" and its solution. Of course, mental proofs and mental inference rules are supposed to be similar in important respects to proofs and rules in logic and mathematics. Chapters 2 and 3 describe the logical background of these concepts and outline the role they have played in computer theorem proving. Along the way, I also try to distinguish properties of logical systems that may have psychological relevance from properties that have more to do with elegance or efficiency. Chapter 3 also develops the idea that we can use mental proofs, not only to derive theorems in formal mathematics, but also to solve more general sorts of problems.

My approach to the problem of deductive reasoning is to construct a theory in the form of a computer simulation that mimics the abilities of a person who has had no special training in formal logical methods.
Along these lines, chapter 4 describes a model that is able to generate mental proofs for problems in sentential reasoning (i.e., for inferences whose correctness depends on sentence connectives such as IF, AND, OR, and NOT), and chapter 5 examines some experimental tests of this model. The complete model also handles reasoning with quantified variables, as chapters 6 and 7 show. Like all computer models, this one contains lots of specific assumptions - too many to test in a single experiment. For that reason, chapters 4-7 describe a variety of studies that highlight different aspects of the theory. Some of these employ the traditional method of asking subjects to evaluate the logical correctness of simple arguments (premise-conclusion pairs), but in others the subjects are asked to recall the lines of a proof or to decide (while they are being timed) whether the lines follow from earlier


ones. In still other experiments, subjects are asked to think aloud as they try to solve deduction puzzles.

The last four chapters deal with implications of the model. Chapter 8 points out some of the promises of a deduction-based cognitive theory, applying the model to a number of well-known cognitive tasks, including categorization and memory search. It also discusses the relation of the model to schema theories, nonmonotonic logics, and truth maintenance. As if these topics weren't controversial enough, most of the rest of the book deals with rival theories in the psychology of reasoning itself. Some of these alternative views go along with the idea of inference rules and proof but quarrel with the particular rules or proof strategies that I am advocating. I describe some of these alternatives in chapter 9, but my tendency is to try to accommodate these views in the current framework rather than to try to discriminate among them experimentally. Other alternatives, though, explicitly deny the need for rules and proofs and attempt to substitute psychological versions of Euler circles, Venn diagrams, truth tables, or similar diagram methods (see Erickson 1974 and Johnson-Laird and Byrne 1991). Chapter 10 is a critique of this position. Finally, chapter 11 deals with some general objections to deduction-based cognitive theories and tries to summarize these systems' successes and limits.

Of course, this selection of topics doesn't cover all those that might be treated in a work on reasoning, or even in one on deductive reasoning. A major omission is a systematic treatment of research on children's development of reasoning skills. The literature in this area is vast, especially when Piagetian studies are included; indeed, Piaget's logic is itself the target of several monographs, including Osherson 1974a and Seltman and Seltman 1985. Because my familiarity with developmental issues is slim, I stick mainly to adult reasoning.
I mention findings from the developmental literature when they bear on the main problems, but my coverage of them is narrow. Fortunately, Braine and Rumain's (1983) review of logical reasoning in children discusses this area from a point of view similar to the one developed in these chapters.

What's New?

Chapters 1-3 are intended in part to help advanced undergraduates or graduate students in the cognitive sciences understand recent research on


deduction, and they seem to have worked fairly well in the classes where I've auditioned them. Part I also contains discussions that I hope will be of interest to experts in this area, but I expect these experts to focus on the last two parts - particularly on the material that hasn't appeared elsewhere. Here is a brief guide for them to what's new: The theory of sentential reasoning in chapter 4 is a refinement of an earlier proposal (Rips 1983), but it also contains some new results on the completeness of the theory with respect to classical sentential logic. Likewise, chapter 5 reports some reaction-time data that haven't appeared previously, but also summarizes a few older experiments on the sentential model. In addition, chapter 5 applies the model for the first time to some well-known results on reasoning with negatives and conditionals. The theory for quantified variables and the supporting experiments in chapters 6 and 7 are entirely new. So, for the most part, are the extensions and comparisons in part III. Experts may be especially interested in the use of the deduction theory to handle problem solving and categorization in chapter 8 and in the discussion of pragmatic schemas and Darwinian algorithms in chapter 9. The comments on mental models draw on earlier papers (Rips 1986, 1990b) but are updated to take into account some later developments.

The field of deductive reasoning is currently mined with controversies. Although this makes the field more intense and interesting than it had been in the last few decades, it also makes it hard to issue final judgments. I hope, however, that the perspective I develop here is a stable one, and that it will prove helpful to others in discriminating progress from polemics.

Acknowledgments

I have accumulated so many intellectual debts in writing this book that it seems no coincidence that it ends in chapter 11. My most immediate creditors are Denise Cummins, Jonathan Evans, Leon Gross, Philip Johnson-Laird, Matt Kurbat, John Macnamara, Jeff Schank, and several anonymous reviewers who did their best to get me to improve the chapters. Jeff also helped with some of the experiments reported here, as did Ruth Boneh, Fred Conrad, David Cortner, Jia Li, Sandra Marcus, and others. For conversations about psychological and logical matters, I thank especially Christopher Cherniak, Allan Collins, Dedre Gentner, William



Harper, Reid Hastie, David Israel, David Malament, Ron McClamrock, Ed Smith, Roger Tourangeau, and many others at the University of Chicago, Stanford University, MIT, and the University of Arizona. Thanks also to Olivia Parker for permission to use the jacket photo and to Jia Li for help with the index. I used drafts of chapters 1-10 in classes at Chicago, and comments from the students are responsible for many changes. For actual monetary support I'm grateful for NIMH grant MH39633 and for a Cattell Foundation fellowship that enabled me to spend a sabbatical year at Stanford working on this project.

I also feel a deep debt to a number of teachers who have embodied for me, in different ways, the ideals of rigor and clarity in thinking. They have been an inspiration in the research and writing, though this book certainly falls far short of what they would like. But, more important, they have convinced me that clarity and rigor are what the study of reasoning is really about, rather than the details of categorical syllogisms or the four-card problem. It goes without saying that our thinking isn't always clear or rigorous; but sporadically it is, and it's important to try to understand how this is possible.

Finally, while attending an "authors' breakfast" for my seven-year-old daughter and her classmates a few months ago, I realized that I was the only member of my immediate family who wasn't the author of a book. As by far my biggest creditors, they've been remarkably helpful and remarkably patient while I've been trying to catch up with them.


The great difference, in fact, between that simpler kind of rational thinking which consists in the concrete objects of past experience merely suggesting each other, and reasoning distinctively so called, is this, that whilst the empirical thinking is only reproductive, reasoning is productive.
William James (1890, pp. 329-330)

It is tempting to begin by announcing that the aim of this book is to propose a psychological theory of deductive reasoning. That is, in fact, the goal. But isn't it redundant to speak of a psychological theory of deductive reasoning? What other kinds of theories of deductive reasoning could there be? Isn't reasoning by definition a psychological process?

Although there is some redundancy in this expression, the phrase "psychological theory" emphasizes some important differences between this approach and earlier ones. On the theory to be developed here, deduction is a type of psychological process, one that involves transforming mental representations. For most of its history, however, experimental psychology has approached deduction, not as the study of a mental process, but as the study of people answering questions about a particular kind of argument. In this formal sense, an argument is a set of sentences (the premises) and a single additional sentence (the conclusion). A deductively correct argument, roughly speaking, is one in which the conclusion is true in any state of affairs in which the premises are true. Thus, argument (1) has as premises the two sentences above the line and as conclusion the sentence below the line. Moreover, argument (1) is deductively correct; in any state of affairs in which it is true both that Calvin deposits 50 cents and that if Calvin deposits 50 cents he will get a coke, it is also true that he will get a coke.

(1) If Calvin deposits 50 cents, he'll get a coke.
    Calvin deposits 50 cents.
    ----------------------------------------
    Calvin will get a coke.

An experimenter may present subjects with a list of arguments like this and ask them to decide which are deductively correct. Or the experimenter may provide the premises of the argument along with a set of possible conclusions and have the subjects choose the conclusion that would make the argument correct (or choose "none of the above" if none of the conclusions are acceptable).
Or the subjects may see just the premises and have to come up with a conclusion on their own.
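The definition of deductive correctness given above can be checked mechanically for small sentential arguments. What follows is only a minimal sketch in Python and not part of the theory developed in this book. It assumes that IF ... THEN is read as the material conditional of classical sentential logic, and the names `implies` and `deductively_correct` are hypothetical helpers introduced purely for illustration. The sketch enumerates every state of affairs for argument (1) and for the contrary modus shmonens conclusion that Calvin will not get a coke.

```python
from itertools import product

def implies(p, q):
    # Assumption: IF p THEN q is treated as the material conditional.
    return (not p) or q

def deductively_correct(premises, conclusion, n_atoms):
    """Return True if the conclusion is true in every state of affairs
    (truth assignment) in which all of the premises are true."""
    for state in product([True, False], repeat=n_atoms):
        if all(prem(*state) for prem in premises) and not conclusion(*state):
            return False  # a state with true premises and a false conclusion
    return True

# d = "Calvin deposits 50 cents", c = "Calvin will get a coke"
premises = [lambda d, c: implies(d, c), lambda d, c: d]

modus_ponens = deductively_correct(premises, lambda d, c: c, 2)
modus_shmonens = deductively_correct(premises, lambda d, c: not c, 2)
print(modus_ponens, modus_shmonens)
```

Under these assumptions the first check succeeds and the second fails: the state in which Calvin deposits 50 cents and gets a coke makes both premises true and the shmonens conclusion false. An exhaustive check of this kind settles whether an argument is deductively correct, but it says nothing about the mental steps people go through in evaluating one.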

Chapter 1

Yoking deductive reasoning in this way to the study of deductively correct arguments leaves it open whether there are mental processes that are distinctive to reasoning of this sort. It is possible that whatever mental activities people go through in evaluating an argument like (1) are the same ones they use in nondeductive reasoning (Haviland 1974). It is even possible that there are no cognitive processes peculiar to reasoning, deductive or nondeductive. What we think of as reasoning may involve problem-solving activities much the same as those we use in balancing a checkbook or planning a menu (Newell 1990; Pollard 1982).

Whether there are distinctively deductive processes is one of the major problems to be addressed in this book, and it will not do to beg the question at the outset. In part II, I mount a full-scale attack on this question by proposing a theory of deductive reasoning that will permit us to see what is unique about thinking of this sort. In part III, I defend the idea that there are deductive processes from those who take a more homogeneous view. However, let us begin by simply trying to get a glimpse of the terrain, looking at some actual transcripts of people reasoning. These examples should prove more helpful to us than defining deductive reasoning at the start in some cooptive way. Although the cognitive processes underlying these examples may be shared with other forms of cognition, they may yet constitute the kind of stable cluster that is psychologically nonarbitrary and hence is a worthwhile target of scientific study.

The first two sections of this chapter, then, supply some instances that seem to fall under the heading of deductive reasoning, and they attempt to explain why such reasoning might be important. In general, earlier approaches within psychology have been far too limited in their scope.
These older proposals contain many astute observations; but they are handicapped because they rely on overrestrictive methods and assumptions and because they insulate themselves from important developments in logic and artificial intelligence. As a result, they have relegated deduction to a quite minor role in cognitive psychology. In the rest of the chapter, I consider in more detail the two main psychological approaches to deduction: the approach that stems from psychometrics and the one that stems from experimental and social psychology. Examining the strengths and weaknesses of these prior attempts can help us in formulating a more adequate theory. What I hope will emerge from this initial discussion is the possibility that deduction may be a centrally important cognitive component, one deserving a much more thorough examination than it has typically received.

Psychological Approaches to Deduction

Some Examples of Deductive Reasoning

Consider what people say while they are solving problems that have traditionally come under the heading of deduction. In experiments of this sort, subjects are under instructions to "think aloud," saying whatever comes to mind during the solution process. After a little practice they usually comply with this request, particularly if the solution is fairly lengthy. Tape recordings and transcripts of their monologues can then provide some hints about the underlying mental processes. Newell and Simon's (1972) tour de force, Human Problem Solving, is the classic example of this method in current psychology. Controversy surrounds the claim that these transcripts are faithful traces of the subjects' mental activities (Ericsson and Simon 1984; Nisbett and Wilson 1977; Wason and Evans 1975), since some of what the subjects say may be their own theories or rationalizations rather than direct reports of their thinking. This is especially true when subjects describe a completed process rather than an ongoing one - when the protocol is "retrospective" rather than "concurrent." However, we needn't take sides in this controversy for our present purposes; even if it should turn out that the monologues are after-the-fact justifications rather than true observations, they are well worth studying in their own right. Although the subjects may initially solve the problem in some other way, the transcripts provide examples of fairly complex analysis that itself calls for explanation.

Example 1: Sentential Reasoning

The first example is taken from an experiment (Rips 1983) in which subjects received 12 arguments (from table 11.6 of Osherson 1975) and had to decide while thinking aloud whether each was deductively correct or incorrect. The subjects' exact task was to decide "whether the conclusion has to be true if the premise is true." I told the
subjects to answer the question first in whatever way seemed natural; then, in order to encourage the subjects to mention all the steps in their thinking, I asked them to explain their answer again as if they were talking to a child who did not have a clear understanding of the premise and conclusion. Of the 12 problems, half were in fact deductively correct in classical sentential logic (see chapter 2 below) and half were incorrect. All the problems were described as being about "a set of letters that might be written on a blackboard in some other room of this building." Table 1.1 is a complete transcript from one


Table 1.1
Protocol from subject 7 on the argument "If it is not true that there is both an M and a P, then there is an R; therefore, if there is no M, then there is an R."

Initial solution
a. The sentence above the line reads, "If it is not true that there is both an M and a P, then there is an R."
b. The sentence below the line reads, "If there is no M then there is an R."
c. If it is not true that there is both an M and a P - if you come upon a blackboard and there is an M and a P, there will always be an R.
d. If you come upon any blackboard without an M or without a P, without both of them together, there will be an R.
e. So with an M and a P, no R, and if it is not true that they're both there, then there is an R.
f. Now the sentence below says that if there is no M, then there is an R.
g. That's true.
h. Now I'm construing the top sentence a little differently, but I think that shows how I'm reasoning, correctly or otherwise.
i. If it is not true that there is both, then there is an R.

Explanation to a "child"
j. OK. Anytime you see both the letters M and P on a blackboard, then you can be sure there is no R.
k. But if one of the letters is missing or if both of the letters is missing then you'll see an R on the blackboard.
l. Now look at this blackboard here.
m. There is no M on it, so if either an M or a P is missing or both of them are missing, then there has to be an R.
n. Now what about this blackboard?
o. There is an M missing.
p. What does that say?
q. That says there has to be an R there.

of the subjects, a graduate student in English (with no training in formal logic) who was working on the argument shown here as (2).

(2) If it is not true that there is both an M and a P on the blackboard, then there is an R.
    ----------------------------------------
    If there is no M, then there is an R.

In the subject's initial solution, he reads the premise and the conclusion of the argument (lines a and b in table 1.1) and then begins working over the premise by paraphrasing it in various ways, not all of which are necessarily correct (lines c-e). The most helpful of these paraphrases occurs in line d: "If you come upon any blackboard without an M or without a P . . . there will be an R." From this sentence the answer seems to be self-evident because, after repeating the conclusion in line f, the subject declares the


conclusion to be true. Although this last step is not elaborated in the initial solution, the subject's explanation to the imaginary child provides some insight into his reasoning. He first tries to get the child to understand that if either an M or a P is missing, or if both of them are missing, then there has to be an R - essentially a restatement of line d. He then has the child imagine a situation in which the antecedent (i.e., if-clause) of the conclusion is true: "Look at this blackboard here. There is no M on it." Because the M is missing, "there has to be an R there." We can summarize the main part of the subject's reasoning as in (3).

(3) The premise states that if there is not both an M and a P, then there is an R.
    This means that if there is no M or there is no P - if either an M is missing or a P is missing - then there is an R.
    Suppose, as the conclusion suggests, there is no M.
    Then, according to the premise, there will be an R.
    So the conclusion must be true: If there is no M, then there is an R.

The subject transforms the premise to make it more intelligible or more useful to the task at hand. He also makes use of the conclusion to direct his thinking while he is working out the answer. Not all the subjects were as successful or as articulate as this one, but nearly all followed similar patterns of transforming the premise in a way relevant to the conclusion.

Example 2: Reasoning with Instances

Our second example deals with a puzzle about a fictitious realm whose inhabitants are called "knights" and "knaves." Knights and knaves are impossible to distinguish by appearance. In fact, there is just one characteristic that separates them: Knights always tell the truth, whereas knaves always lie. There are many puzzles concerning knights and knaves; (4), from Smullyan 1978, is typical.

(4) Suppose there are three individuals, A, B, and C, each of whom is either a knight or a knave.
Also, suppose that two people are of the same type if they are both knights or both knaves. A says, "B is a knave." B says, "A and C are of the same type." Question: Is C a knight or a knave?
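Because problem (4) involves only three individuals, its answer can be confirmed by listing every assignment of types and keeping those that are consistent with what A and B say. The following Python fragment is a hedged sketch of that exhaustive check. It is not a model of how subjects reason, and the function name `consistent_worlds` and the encoding of a knight as `True` are assumptions made only for this illustration.

```python
from itertools import product

def consistent_worlds():
    """List the assignments of types to A, B, and C under which each
    speaker's statement is true exactly when the speaker is a knight."""
    worlds = []
    for a, b, c in product([True, False], repeat=3):  # True = knight
        a_statement = not b        # A says, "B is a knave."
        b_statement = (a == c)     # B says, "A and C are of the same type."
        # Knights always tell the truth; knaves always lie.
        if a == a_statement and b == b_statement:
            worlds.append({"A": a, "B": b, "C": c})
    return worlds

worlds = consistent_worlds()
c_types = {"knight" if w["C"] else "knave" for w in worlds}
print(worlds)
print(c_types)
```

On this encoding two assignments survive, one in which A is a knight and one in which A is a knave, and C has the same type in both. The check therefore fixes C's type without settling the types of A and B.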


To see how people tackle problems like these, I asked subjects to solve four knight/knave problems - problem (4) among them - while thinking aloud (Rips 1989a). They were told: "Read the problem aloud and then keep talking about it while you are trying to come up with your answer. Say everything that you are thinking about, even if it seems trivial. If you are silent for more than a few seconds, I'll remind you to keep talking." Table 1.2 contains a solution to problem (4) from one of these subjects - a college freshman who, like the subject of our earlier example, had no training in formal logic.

This subject begins by assuming that person A is a knight. Since what A says is true on this assumption and since A says that B is a knave, the subject infers that B would be a knave. B's statement that A and C are of the same type must therefore be a lie. But, by assumption, A is a knight; thus C must be a knave. So by line d of table 1.2 the subject is able to conclude that B and C are knaves if A is a knight, and she calls this her "first possibility." She then turns to the second possibility that A is a knave. This means that B is a knight, so that A and C are of the same type, namely knaves. In line g, though, the subject runs into a temporary problem: She has forgotten C's type under the first possibility. (This was not uncommon in the experiment, since the subjects were not allowed to write down their intermediate results.) In lines h and i she goes back to reconstruct the first part of her solution, and in line j she begins again on the second part. But before she develops the latter possibility she apparently has some second thoughts about the first. Finally, she reminds herself of the implications of the second possibility and, in line m, correctly concludes that on either possibility C is a knave.

Some Preliminary Observations

Examples 1 and 2 demonstrate that the subjects were capable of approaching such problems in a systematic way. Although they sometimes made mistakes (as in line c of table 1.1) or lost their place in the solution (line g of table 1.2), they nevertheless took into account the information given in the problem, made assumptions about unknown aspects, and drew inferences from these given and assumed facts. The overall pattern of the transcripts is similar in some respects to informal proofs in mathematics. That is, in solving these problems the subjects generated a series of statements linking the premises or givens of the problem to the solution. The steps in such a progression are intuitively sound, each step providing a justification for later steps, so the entire series forms a conceptual bridge between the parts of the problem. Furthermore, some of the steps are not only reasonable by the subjects' own criteria; they are also deductively correct, in the sense that the statement constituting the step is true whenever the given and assumed statements are true. For example, line d in table 1.1 - "If you come upon any blackboard without an M or without a P . . . there will be an R" - is certainly true whenever the premise of problem (2) is true. So is line d of table 1.2 - "That would mean that B and C are knaves" - in the context of problem (4) and the assumption that A is a knight.

Notice, too, that the subjects could handle these problems even though the specific tasks were novel. Although the subjects may have had some training on formal proof in high school mathematics, it is very unlikely that they had faced exactly the problems they were confronting in these experiments.
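The claim that line d of table 1.1 is true whenever the premise of problem (2) is true can itself be checked by enumerating the eight ways in which M, P, and R might or might not appear on the blackboard. The sketch below is an illustration under stated assumptions rather than a description of the subject's mental steps: it reads "if" as the material conditional, and the names `implies`, `follows`, `premise`, `line_c`, `line_d`, and `conclusion` are hypothetical labels for my own formalizations of the corresponding sentences.

```python
from itertools import product

def implies(p, q):
    # Assumption: "if p then q" is treated as the material conditional.
    return (not p) or q

def follows(first, second):
    """True if `second` holds in every blackboard state where `first` holds."""
    return all(implies(first(m, p, r), second(m, p, r))
               for m, p, r in product([True, False], repeat=3))

# m, p, r: whether an M, a P, or an R is on the blackboard.
premise = lambda m, p, r: implies(not (m and p), r)       # premise of (2)
line_c = lambda m, p, r: implies(m and p, r)              # paraphrase in line c
line_d = lambda m, p, r: implies((not m) or (not p), r)   # paraphrase in line d
conclusion = lambda m, p, r: implies(not m, r)            # conclusion of (2)

print(follows(premise, line_d))      # line d follows from the premise
print(follows(line_d, conclusion))   # the conclusion follows from line d
print(follows(premise, line_c))      # line c does not follow
```

On this reading line d is a faithful restatement of the premise and suffices for the conclusion, whereas the paraphrase in line c fails on a blackboard that shows an M and a P but no R. That is consistent with treating line c as one of the subject's mistakes.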
Despite the fact that human information-processing abilities are quite restricted, there seems to be no theoretical limit on the number of arguments we can recognize as deductively correct. This remarkable productivity is what James (1890) took to distinguish reasoning from associative thought. Consider another example: If some sentence S1 is true and if S2 is also true, then we can infer that S1 and S2 is true; given this new sentence, we can also infer that S1 and S1 and S2 is true; this new sentence in turn permits yet another inference to S1 and S1 and S1 and S2; and so on. Of course, one wouldn't want to produce a stream of inferences of this trivial sort unless one had to (e.g., in support of some further theorem);


nevertheless , the fact that these inferencescan be made on demand is that a deduction theory must explain. something The analogous problem in linguistics is well known (Chomsky 1957, 1965), and the analogy itself has been elaborated by Cohen ( 1981), Macnamara ( 1986), Osherson ( 1976, chapter 8), Pollock ( 1989), Sober ( 1978), and others. According to Chomsky ( 1965, p. 15), grammatical competence includes the ability to interpret infinitely many sentences ; therefore "a , generativegrammar must be a systemof rules that can iterate to " generatean infinitely large number of structures. The ability to iterate in the proper way is equally required by a theory of reasoning, since part of the inferencepotential of a sentencedependson its structural composition. Although there is considerable disagreementabout the extent to which this " logical fonn " mirrors " grammatical fonn ," on most syntactic theories there are systematicrelations betweenthe two.!
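The productivity point can be made concrete with a small sketch. It assumes, purely for illustration, that sentences are plain strings and that the conjunction inference is a function joining two sentences already taken to be true; the names `and_introduction` and `conjunction_chain` are hypothetical and belong to no theory discussed here. A single rule, applied repeatedly, yields as many distinct conclusions as one cares to ask for:

```python
from itertools import islice


def and_introduction(left, right):
    """From two sentences taken to be true, infer their conjunction."""
    return f"{left} and {right}"


def conjunction_chain(s1, s2):
    """Yield 'S1 and S2', then 'S1 and S1 and S2', and so on without end."""
    current = and_introduction(s1, s2)
    while True:
        yield current
        # Reapply the same rule to the sentence just derived.
        current = and_introduction(s1, current)


# Take only as many inferences as are wanted; the generator never runs dry.
first_three = list(islice(conjunction_chain("S1", "S2"), 3))
print(first_three)
# ['S1 and S2', 'S1 and S1 and S2', 'S1 and S1 and S1 and S2']
```

The generator is finite in its description but unbounded in its output, which is the sense in which a finite reasoner can recognize an unlimited number of correct arguments.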

The Centrality of Deductive Reasoning

The above examples suggest that deductive reasoning has some stable internal properties. It seems reasonable to suppose that subjects' utterances in these thinking-aloud experiments are the products of mental processes that represent the information contained in the problem, transform this information in a sequence of steps, and employ the transformed facts to decide on an answer to the experimenter's question. More evidence for this hypothesis will appear in later chapters (especially chapters 5 and 7), but here it provides an obvious framework for the data in tables 1.1 and 1.2.

In what follows, the terms deductive reasoning, deduction, and deductive inference will be used interchangeably for this transformation-and-decision process, and all three will be distinguished from deductive argument (which is a relation between sentence types, not a psychological process or entity). This conception is in fairly good agreement with ordinary usage; although people commonly apply "deductive" and "deduction" to almost any sort of reasoning,2 they certainly think of reasoning as a kind of mental process that creates new ideas from old ones. At this point, we are not in a position to define deductive reasoning precisely. Clearly, not all mental transformations are deductive ones, and we need to leave room for obviously nondeductive transformations that can occur in image manipulation, in forgetting, and in various sorts of inductive and analogical inference. Similarly, not all of what people do when they solve deduction problems like those in (2) and (4) is deductive reasoning. It is likely that nondeductive processes (e.g., retrieving information from memory) also affect people's answers to these problems, as we will see later. However, tables 1.1 and 1.2 do exemplify the kind of process that we would eventually like to explain,3 and they should suffice to fix our intuition for the time being.

It is not clear from the examples, however, whether there is anything important about the activity in which these subjects are engaged. People are able to solve all sorts of unusual puzzles (Rubik's cubes, retrograde chess problems, cryptarithmetic problems, and so on), but presumably the cognitive processes responsible for these solutions need not be especially important to mental life. The skills required for retrograde chess puzzles, for example, may be specialized components that have little role to play in other intellectual activities. There is evidence, however, that deductive reasoning, perhaps unlike some other problem-solving skills, is cognitively central in this respect. One source of evidence for this claim (the willingness to attribute to others simple deductive inferences, such as modus ponens, and the unwillingness to attribute contrary inferences, such as modus shmonens) was noted in the preface. But these proclivities show up, not just in science fiction cases, but also in ordinary situations. Support for the centrality of deduction comes from theoretical accounts of how we explain others' actions and from more specific observations concerning the role of deduction in other cognitive domains.
Theoretical Evidence for the Importance of Deduction

Explaining why people perform given actions ordinarily means attributing to them a set of beliefs and goals: Adele bought a chocolate cake because she likes eating delicious things and believes that anything chocolate is delicious; Gary made a U-turn because he was in a hurry and thought a U-turn would be quicker than driving around the block. But, as Davidson (1970) and Dennett (1971, 1981) have pointed out, interpretations of this kind presuppose more than a single belief. We assume that the beliefs we explicitly attribute to others must also be melded in a coherent way to further beliefs through proper inferential links. Adele's belief about chocolate serves to explain her behavior only if we also attribute to her the belief that the cake in question is chocolate and credit her with the ability to deduce that the cake must therefore be delicious. In other words, we tend

Chapter 1

to rely on assumptions about people's inference abilities (among other abilities) as implicit parts of our explanations.

One interesting thing about this, for our purposes, is that if we find that our explanations are wrong we are more likely to call into question the beliefs or desires we attribute to others than to blame their skill at drawing deductive inferences. If we use our explanation to predict that Adele will buy a chocolate cake on her next trip to the bakery and then find that she comes back with a cheesecake, we are likely to say that she must not have liked chocolate as much as we thought or that she must have bought the cheesecake for someone else. We don't doubt her ability to perform the inference (often called universal instantiation) that corresponds to the argument in (5).

(5) All chocolate things are delicious.
    This cake is chocolate.
    ------------------------------
    This cake is delicious.

Of course, there are limits to the amount of inferential power that we are willing to attribute to humans (Cherniak 1986; Stich 1981); nevertheless, assumptions about the correctness of others' reasoning are very resilient for simple deductions. The fundamental nature of these inferences is also evident if we try to envision disputes about the soundness of arguments such as (1) or (5). Although you could provide additional facts or arguments to convince someone of the truth of the premises, it is not clear what sort of evidence you could appeal to in overcoming another person's resistance to the soundness of the argument itself (Carroll 1895; Haack 1976; Quine 1936). To paraphrase Quine: If arguments like (1) and (5) are not conclusive, what is?

Empirical Evidence

The evidence so far suggests that people believe deduction to be a central mental activity. Whether it is in fact central is an empirical question for cognitive psychology. It is not hard to show, however, that deduction must be a component of many other psychological processes on any reasonable cognitive account of their operation.
Take comprehension. Theories of how people understand discourse require deduction (as well as nondeductive inference) to generate expectations about upcoming information and to knit unanticipated input with what has gone before.

Psychological Approaches to Deduction

Consider, as an example, the mini-dialogue shown as (6), which is taken from Sadock 1977.

(6) A: Will Burke win?
    B: He's the machine candidate and the machine candidate always wins.

As Sadock points out, understanding (6) involves knowing that B believes Burke will win. But B doesn't expressly state this; a hearer must infer this belief by universal instantiation from Machine candidates always win and Burke is the machine candidate to Burke will win, which has essentially the same form as (5). It is true that the causal relationship between comprehension and inference can sometimes be reversed, so that comprehension affects inference. For example, understanding the instructions for a reasoning problem is usually required for a correct response in any explicit test of reasoning. Similarly, in a logic class, comprehension of instructions can lead to improvements in inference skills. But these enablements are quite different from the direct dependence of comprehension on reasoning that is exhibited in (6). Comprehension seems to play a minor role, if any, during the course of reasoning itself, whereas deduction is often a proper part of ongoing comprehension.4

We can take planning as a second instance of a cognitive activity that presupposes deductive reasoning. To see this, suppose Adele is looking for something delicious to purchase. Since she believes that all chocolate things are delicious and notices that one of the displayed cakes is chocolate, she can use an inference corresponding to (5) to deduce that the displayed cake is delicious. Thus, from her original goal (to obtain something delicious) she can derive the subgoal of obtaining this particular cake by means of universal instantiation. It is easy to construct similar examples using other deductive patterns. For instance, any planning strategy that uses a process of elimination seems to involve the inference pattern called the disjunctive syllogism. Suppose Gary wants to see a movie but can't decide between Hiroshima, Mon Amour and Duck Soup.
After pondering the choice, he decides he is not in the mood for Hiroshima, Mon Amour. He should then choose Duck Soup by the argument shown here as (7).

(7) Either I will go to Hiroshima, Mon Amour or I will go to Duck Soup.


Gary can then amend his plan to go to a movie in favor of the more specific plan to see Duck Soup. These examples are of the simplest sort, but it seems likely that more complex cases of planning will also include a deductive component with inferences such as universal instantiation and the disjunctive syllogism.

The relationship between deduction and planning is somewhat less one-sided than that between deduction and comprehension. If a deduction problem is complex enough (as is, for example, proving a difficult theorem in mathematics), then planning may be necessary if one is to find a solution within a reasonable amount of time. However, for everyday instances of deduction, such as (5) and (7), explicit planning is probably minimal, whereas most episodes of planning seem to involve instantiating variables (via universal instantiation), reasoning by cases (the disjunctive syllogism), or reasoning from a conditional rule (modus ponens). (Chapter 3 contains a further discussion of the role that deduction has played in theories of planning in artificial intelligence.)

If deduction is part of higher-level mental processes, such as comprehension and planning, as these examples suggest, it is a little surprising that psychology has usually treated deduction as a relatively special-purpose mechanism (if, indeed, it has recognized deduction as a mechanism at all). The rest of this chapter offers a glimpse at how this state of affairs came about.
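The two inference patterns used in the planning examples can be sketched in a few lines. This is only an illustration under stated assumptions: beliefs are encoded as tuples and sets, and the function names `universal_instantiation` and `disjunctive_syllogism` are hypothetical helpers, not a proposal about how such rules are mentally represented.

```python
def universal_instantiation(rule, individual, facts):
    """From 'All P things are Q' and 'individual is P', infer 'individual is Q'.

    rule is a (P, Q) pair; facts is a set of (individual, property) pairs.
    Returns the inferred fact, or None if the rule does not apply.
    """
    premise_property, conclusion_property = rule
    if (individual, premise_property) in facts:
        return (individual, conclusion_property)
    return None


def disjunctive_syllogism(options, rejected):
    """From 'A or B' and 'not A', infer 'B'.

    Returns the single option left after elimination, or None if the
    rejected options do not narrow the choice down to exactly one.
    """
    remaining = [option for option in options if option not in rejected]
    return remaining[0] if len(remaining) == 1 else None


# Adele: all chocolate things are delicious; this cake is chocolate.
adele_subgoal = universal_instantiation(
    ("chocolate", "delicious"), "this cake", {("this cake", "chocolate")}
)

# Gary: one film or the other; not in the mood for the first.
gary_plan = disjunctive_syllogism(
    ["Hiroshima, Mon Amour", "Duck Soup"], {"Hiroshima, Mon Amour"}
)

print(adele_subgoal)  # ('this cake', 'delicious')
print(gary_plan)      # Duck Soup
```

In both cases the planner's new, more specific goal is simply the conclusion of the inference, which is why planning of this kind presupposes the deductive step.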

Main Trends in Reasoning Research

Psychology has seen two main approaches to the study of high-level cognition: psychometrics and experimental psychology. Both approaches began to deal with reasoning in the 1900s, and they have continued, at least until very recently, to pursue independent methods in this area. The theory of deduction that we will explore in this book has its roots in the experimental tradition, as does most of contemporary cognitive psychology. However, psychometrics has contributed to the study of deduction a unique perspective that also merits discussion. The review in this section is a kind of stage setting rather than an in-depth critique. Some of the experimental work will be examined in more detail when the results become pertinent in later chapters; at this point, an overview of these approaches may be helpful. Current work in psychology has also produced a number of theoretical proposals about the nature of deduction that


go under the names of natural-deduction systems (see, e.g., Braine 1978; Braine, Reiser, and Rumain 1984; Osherson 1974b, 1975, 1976; Rips 1983, 1989a), mental models (see, e.g., Johnson-Laird 1983; Johnson-Laird and Byrne 1991), and pragmatic reasoning schemas (see, e.g., Cheng and Holyoak 1985). These will be of vital concern in later chapters; for now, however, it is useful to put the psychological study of deduction in its historical context.

The Psychometric Approach

Psychometrics concerns itself with the measurement of mental traits such as authoritarianness or paranoia, but its main quarry has been intelligence. Indeed, psychometricians' interest in deduction is traceable to the tight relationship they perceived between reasoning and intelligence (Carroll 1989). The very first intelligence tests were designed to measure ability in children and were grab-bags of problems, including some that seem to call for deduction. For example, the Binet test (the first version of which appeared in 1905) contained the item shown here as (8).

(8) What's foolish about the following statement? "In an old graveyard in Spain, they have discovered a small skull which they believe to be that of Christopher Columbus when he was about ten years old."

Performance on such problems correlated fairly highly with global intelligence measures based on a variety of problem types. Thus, Burt (1919) claimed on the basis of some early research that, for measuring the ability of bright children, tests "involving higher mental processes, particularly those involving reasoning, vary most closely with intelligence." Burt's own test, which was designed for ages 7 through 14, contained a wide variety of reasoning problems, including problems of the type illustrated in (9) (now called linear syllogisms or three-term series problems).

(9) Three boys are sitting in a row: Harry is to the left of Willie; George is to the left of Harry. Which boy is in the middle?
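A three-term series problem such as (9) has a determinate answer that can be checked mechanically. The sketch below is a brute-force illustration, not a model of how children or adults solve the item; the function name `middle_boy` and the encoding of "to the left of" as an ordered pair are assumptions made for the example. It tries every seating order and keeps those consistent with both premises.

```python
from itertools import permutations


def middle_boy(boys, left_of):
    """Collect whoever sits in the middle of each seating order
    that satisfies every (left, right) constraint."""
    middles = set()
    for row in permutations(boys):
        # A constraint (a, b) holds when a sits somewhere to the left of b.
        if all(row.index(a) < row.index(b) for a, b in left_of):
            middles.add(row[len(row) // 2])
    return middles


answer = middle_boy(
    ["Harry", "Willie", "George"],
    [("Harry", "Willie"), ("George", "Harry")],
)
print(answer)  # {'Harry'}
```

Only one of the six possible orders (George, Harry, Willie) survives, so the premises fix the middle position uniquely.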
The designers of the early intelligence tests intended them to serve practical ends, such as deciding which children should go into special school programs or which enlisted men should get training for high-level positions, and they were quite unanalytic with respect to the abilities underlying the test takers' performance. Later work, particularly that of L. L. Thurstone and his followers, stressed instead basic mental skills that were supposed to be responsible for the global results. In his monograph Primary Mental Abilities, Thurstone singled out reasoning as an appropriate object of these more refined analyses, since one could ask "How many reasoning abilities are there, and just what is each of them like?" (Thurstone 1938, p. 2). The general idea was to assemble pencil-and-paper tests that seemed to tap some aspect of the ability in question. In the case of reasoning, these might include syllogism tests, number-series or letter-series tests, analogy tests, arithmetic word problems, geometrical puzzles, classification problems, and so forth.5 The investigators administered a test battery to a large number of subjects, correlated subjects' scores for each pair of tests, and factor-analyzed the correlation matrix. Factor analysis is a statistical technique whose aim is to determine the dimensions responsible for similarities and differences in performance on a test (see, e.g., Harman 1976), and the hope of these investigators was that the revealed factors would "isolate and define more precisely primary abilities in the domain of reasoning" (Green et al. 1953, p. 135).

Thurstone identified several factors that he believed had to do with reasoning. For example, he claimed that one of the factors from the 1938 study, Factor I (for induction), tapped subjects' ability to find a rule that applied to each item in a sequence, as in a number-series test (e.g., "Find the rule in the series below and fill in the blanks: 16, 20, __, 30, 36, 42, 49, __"). He also tentatively identified a Factor D as having to do with some sort of deduction skill, a factor that had high loadings from tests such as nonsense syllogisms ("Good or bad reasoning?: Red-haired persons have big feet. All june bugs have big feet. Therefore, some june bugs are red haired?"). Two other factors, V and R, also appeared to be connected with reasoning.
Thurstone said that tests loading on V, such as verbal analogies and antonyms, are "logical in character" and have to do with ideas and the meanings of words. Factor R had high loadings from a test of arithmetical reasoning (essentially a set of arithmetic word problems) and from other tests that "involve some form of restriction in their solution."

Unfortunately, the factor-analytic studies of reasoning that followed Thurstone's yielded results that were at odds with these initial findings and with one another, casting doubt on the success of the early psychometric program. For example, on the basis of a large project devoted entirely to reasoning tests, Adkins and Lyerly (1952) suggested that there were at least three factors responsible for induction, rather than the single I factor that Thurstone had isolated. They named these factors "perception of abstract similarities," "concept formation," and "hypothesis verification." At about the same time, Green et al. (1953) also found factor-analytic evidence that induction could be decomposed into three factors, but a different three, which they called "eduction of perceptual relations," "eduction of conceptual relations," and "eduction of conceptual patterns." The deduction factor D met a similar fate. Although some of the later investigations appeared to confirm a deductive or logical reasoning factor, Thurstone dropped the D factor in his further research on the grounds that "it has not been sustained in repeated studies" (Thurstone and Thurstone 1941, p. 6). A recent survey of factor-analytic studies (Carroll 1993) suggests evidence for a deductive or general sequential reasoning factor, an inductive reasoning factor, and a quantitative reasoning factor, but concludes that the evidence for them is "hardly compelling."

The ambiguities in the results of these studies highlight some defects in the methodology (see chapter 2 of Sternberg 1977 for a review of these problems). For example, the factor pattern derived in such experiments depends crucially on the choice of the reasoning tests. Unless the investigator samples tests very broadly, there is no guarantee that the resulting factors will reappear in further testing. Moreover, even the same data can produce different results, depending on the investigator's decisions about technical matters, including the method of rotating factors and their interpretive labels (e.g., "deduction" or "hypothesis verification"). However, the major deficiency, from the present point of view, is that the method can yield only factors corresponding to processes that vary substantially across individual test takers.
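The dependence of the method on individual variation can be illustrated numerically. The sketch below is not a factor analysis; it only computes the correlation that a factor analysis would take as input. The scores are made-up numbers, and the additive "shared ability" is an assumed, deliberately simple model in which every subject benefits from the ability to exactly the same degree.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of five subjects on two reasoning tests.
syllogism_scores = [3, 5, 4, 8, 6]
analogy_scores = [2, 6, 5, 7, 9]

# Assumed contribution of an ability that is identical in every subject.
shared_ability = 10
boosted_syllogism = [score + shared_ability for score in syllogism_scores]
boosted_analogy = [score + shared_ability for score in analogy_scores]

r_without = pearson(syllogism_scores, analogy_scores)
r_with = pearson(boosted_syllogism, boosted_analogy)
print(round(r_without, 6) == round(r_with, 6))  # True
```

Because the shared contribution shifts every score by the same amount, it cancels out of the deviations from the mean, and the correlation matrix is the same with or without it. Whatever is common to all test takers therefore leaves no trace for the analysis to recover.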
If there is a mental ability that is common to all the test takers (and if they exploit this ability to approximately the same extent on individual tests), then this ability cannot show up as a factor in the results. This is true whether the ability in question plays a role in all the tests or in just a subset. Since characteristics of deduction may well be universal in precisely this way, psychometric methods based on factor analysis are not appropriate for studying them.

Partly as the result of these problems, many contemporary researchers apply psychometric methods in a more refined way to a single type of reasoning test, such as verbal analogies or linear syllogisms, rather than to an assortment of tests (Embretson, Schneider, and Roth 1986; Rips and Conrad 1983; Sternberg 1980; Whitely 1980). The idea is to quantify individual differences that affect component cognitive processes within the


task. Suppose, for example, that performance on linear syllogisms such as (9) depends jointly on the ability to form images of spatial arrays and the ability to perform verbal coding and decoding (to simplify the theory of

that vary systematically in their hypothesized demands on spatial versus verbal processing. The interpretation of these estimates can then be checked by comparing them to the same subjects' performance on pure spatial tests and pure verbal tests.

Used in this way, however, psychometric methods are more an adjunct than an alternative to experimental methods in cognitive psychology. Carrying out this type of research means developing a substantive theory for the mental subparts of the task and testing subjects in experiments where the problems vary over trials along specific dimensions. Psychometric methods, such as factor analysis or latent-trait analysis, can be helpful in understanding reasoning, since they provide useful tools for measuring differences in subjects' abilities or strategies. However, it has become clear that an exclusive focus on individual differences cannot tell us all we would like to know about deduction, or about other complex cognitive skills.

The Experimental Approach

Experimental efforts in the domain of deductive reasoning began about the same time as the psychometric approach, around 1900. Woodworth (1938, chapter 30) reports that Gustav Störring introduced syllogistic reasoning problems into the psychological laboratory sometime before 1908. Experiments on deduction have continued to the present, though often at the periphery of experimental and social psychology. This book is largely devoted to explaining experimental findings in this area, and specific results will be examined in part II. At this point, though, let us take a brief look at trends in the field in order to prepare for later developments. (For detailed reviews see Braine and Rumain 1983, Evans 1982, Galotti 1989, Rips 1990a, and Wason and Johnson-Laird 1972.)

The typical experiment in the psychology of deduction is very simple. As the subject, you receive a booklet containing a number of problems.
Usually each problem consists of a set of premises and a conclusion, as in (1), (2), (5), and (7) above. The experimenter tells you that your job is to


decide whether the conclusion "follows logically" from the premises (or whether the conclusion "must be true when the premises are true"). You check a box marked "follows" or one marked "doesn't follow" to indicate your answer, then you go on to the next problem. The basic dependent variable is usually the percentage of subjects who give the "correct" response, with "correct" defined in terms of the validity of the problem as translated into some standard system of logic. (We will need to look at this practice, since there are many different systems of logic; for now, we will simply go along with standard terminology.) In other studies, response time also serves as a dependent variable. As was mentioned above, this basic experimental task has a variation in which the experimenter asks the subjects to choose the correct conclusion from a set of alternatives. Or subjects may see just the premises and produce a conclusion on their own. However, the main idea in each case is to determine how variations in the nature of the arguments affect subjects' ability to detect (or to generate) deductively correct conclusions.

Syllogisms and Deduction by Heuristics

Nearly all the early research on deduction concerned itself with syllogisms: arguments containing exactly two premises and a conclusion.6 Although linear syllogisms (such as (9)) and other argument forms sometimes figured in these studies, more often the problems were composed of categorical (or Aristotelian) syllogisms. For this reason, I will use the term "syllogism" to denote a categorical syllogism unless I specify otherwise. In arguments of this type, all three sentences contain explicit quantifiers: either the universal quantifier all or the existential (or particular) quantifier some. These same sentences can be negative or positive, producing four basic sentence types: Some ... are ..., All ... are ..., Some ... are not ..., and No ... are ... (the last being logically equivalent to All ... are not ...).
Examples (10) and (11) are typical instances; the former is deductively correct and the latter incorrect in standard systems.

(10) All square blocks are green blocks.
     Some big blocks are square blocks.
     ------------------------------
     Some big blocks are green blocks.

(11) All square blocks are green blocks.
     Some big blocks are not square blocks.
     ------------------------------
     Some big blocks are not green blocks.
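The difference in status between (10) and (11) can be verified by exhaustive search. The sketch below assumes the usual modern reading of the four sentence types (in particular, All x are y is true when nothing is x); the tuple encoding and the names `holds` and `is_valid` are illustrative choices. A "world" is any collection of kinds of block that exist, where a kind is one combination of the three properties, and an argument is correct if no world makes the premises true and the conclusion false.

```python
from itertools import combinations, product

TERMS = ("square", "green", "big")
# Each kind of block is one of the 8 combinations of the three properties.
KINDS = [dict(zip(TERMS, values)) for values in product([False, True], repeat=3)]


def holds(sentence, world):
    """Evaluate a categorical sentence (form, x, y) in a world of kinds."""
    form, x, y = sentence
    if form == "all":
        return all(kind[y] for kind in world if kind[x])
    if form == "some":
        return any(kind[x] and kind[y] for kind in world)
    if form == "some_not":
        return any(kind[x] and not kind[y] for kind in world)
    if form == "no":
        return not any(kind[x] and kind[y] for kind in world)
    raise ValueError(f"unknown sentence form: {form}")


def is_valid(premises, conclusion):
    """True if the conclusion holds in every world where all premises hold."""
    for size in range(len(KINDS) + 1):
        for world in combinations(KINDS, size):
            if all(holds(p, world) for p in premises) and not holds(conclusion, world):
                return False  # counterexample found
    return True


argument_10 = is_valid(
    [("all", "square", "green"), ("some", "big", "square")],
    ("some", "big", "green"),
)
argument_11 = is_valid(
    [("all", "square", "green"), ("some_not", "big", "square")],
    ("some_not", "big", "green"),
)
print(argument_10, argument_11)  # True False
```

The counterexample to (11) is a world whose only blocks are big, green, and not square: both premises are true there, yet no big block fails to be green.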


As these examples illustrate, syllogisms of this sort have three terms (in this case green blocks, square blocks, and big blocks); one term appears in the first premise and in the conclusion, another term appears in the second premise and in the conclusion, and the third (middle) term appears in both premises. The order of the terms within the premises can vary, however; thus, (12), which reverses the position of the subject and predicate terms in the premises of (10), also counts as an (incorrect) categorical syllogism.7

(12) All green blocks are square blocks.
     Some square blocks are big blocks.
     ------------------------------
     Some big blocks are green blocks.

The most noticeable psychological fact about syllogisms is that subjects' performance, as measured by the percentage of subjects who produce the "correct" answer, varies greatly from problem to problem. For instance, in experiments where subjects have to choose which of a set of possible conclusions (All ... are ..., Some ... are ..., Some ... are not ..., No ... are ..., or no valid conclusion) follows from the premises, subjects confronted with syllogisms similar to (10) and (11) tend to select the responses shown in these examples. That is, subjects respond that the premises of (10) imply Some big blocks are green blocks and that the premises of (11) imply Some big blocks are not green blocks. However, only the first of these is correct in standard logic systems; the right answer for (11) should be that none of the conclusions is valid. In one study (Dickstein 1978a), 100% of subjects were correct on (10) but only 14% were correct on (11).

From the 1920s through at least the 1950s, experimental research on deduction was essentially an attempt to explain errors on syllogisms in terms of biases affecting subjects' responses. Following tempting shortcuts can cause subjects to deviate from the path of correct reasoning. The most famous account of this is Woodworth and Sells' (1935) atmosphere hypothesis, according to which particular or existential premises (i.e., Some ... are ... or Some ... are not ...) tend to create in the subject a dominant impression of the correctness of a particular conclusion and negative premises (i.e., No ... are ... or Some ... are not ...) to create an impression of the correctness of a negative conclusion. Hence, given a syllogism to evaluate, subjects tend to choose a particular conclusion if either premise is particular, and they tend to choose a negative conclusion if either is negative. For example, because the second premise in (11) is


both particular and negative, it should create a tendency to choose the particular, negative conclusion (Some ... are not ...), which is indeed the usual error.

A second sort of bias depends on the content in which the syllogisms are framed. During World War II social psychologists found in syllogisms a means to explore the effects of propaganda or prejudice on reasoning by observing the effects of subjects' prior belief in a syllogism's conclusion (Janis and Frick 1943; Lefford 1946; Morgan 1945; Morgan and Morton 1944). Because the deductive correctness of an argument depends only upon the relation between the truth of the premises and the truth of the conclusion, the conclusion of a correct argument may happen to be false. (In deductively correct arguments, the conclusion has to be true only when all the premises are also true.) The hypothesis behind this research was that people are more critical of arguments if they have false conclusions (or conclusions they disagree with), and less critical if they have true conclusions. Although the results of these early studies tended to support this idea, these findings are difficult to evaluate because of internal flaws, including inadequate controls and dubious rewordings of the standard syllogistic forms (see Henle 1962 and Revlin and Leirer 1978 for critiques). Still, there are a few convincing examples of belief bias. For instance, in earlier research, Wilkins (1928) found that 31% of her subjects incorrectly thought the syllogism in (13) valid, whereas only 16% thought that the logically equivalent (14) was valid.

(13) No oranges are apples.
     No lemons are oranges.
     ------------------------------
     No apples are lemons.

(14) No x's are y's.
     No z's are x's.
     ------------------------------
     No y's are z's.

More recent experiments with tighter controls (Evans, Barston, and Pollard 1983; Revlin and Leirer 1978; Revlin et al. 1980) confirm that subjects' prior belief in the conclusion produces a significant (but often small) effect on validity judgments. For example, Evans et al. compared (15) and (16), which share the same syllogistic form.


(15) No cigarettes are inexpensive.
     Some addictive things are inexpensive.
     ------------------------------
     Some addictive things are not cigarettes.

(16) No addictive things are inexpensive.
     Some cigarettes are inexpensive.
     ------------------------------
     Some cigarettes are not addictive.

Prior ratings from a separate group of subjects demonstrated that the conclusion of (15) is more believable than the conclusion of (16). Accordingly, 81% of subjects decided correctly that syllogisms such as (15) were valid, whereas only 63% decided correctly that syllogisms such as (16) were valid (Evans et al. 1983, experiment 2, group 2).

Researchers have also attributed errors on syllogisms to the way subjects construe the basic sentence types. For example, subjects may interpret a sentence of the form Some x are y as suggesting that some x are not y and, conversely, interpret Some x are not y as suggesting that some x are y (Ceraso and Provitera 1971; Wilkins 1928; Woodworth and Sells 1935). These may be Gricean implicatures, arising from everyday uses of such sentences (see chapter 2). As a way of explaining errors on syllogisms like (11), several investigators have also proposed that subjects understand Some x are not y as entailing Some y are not x and understand All x are y as entailing All y are x (Ceraso and Provitera 1971; Chapman and Chapman 1959; Revlis 1975a,b; Wilkins 1928). In (11), for instance, if subjects take All square blocks are green blocks to imply that all green blocks are square, then the set of square blocks is the same as the set of green blocks. Thus, if some big blocks are not square (as is asserted in the second premise), it follows that some big blocks are not green, which is, again, the usual mistake.

A few of the early experimenters in this area apparently believed that error tendencies, such as those just discussed, could fully explain performance on syllogism problems.
For example, Morgan and Morton (1944) began their paper on belief bias with this statement: "Our evidence will indicate that the only circumstance under which we can be relatively sure that the inferences of a person will be logical is when they lead to a conclusion which he has already accepted." Pollard (1982) gives this idea a more modern cast by invoking the findings of Tversky and Kahneman (e.g., 1974) on heuristics and biases in judgment and choice. But a pure heuristics


approach to deduction is a minority position, and for good reason: The proposed heuristics seem to account for only a relatively small portion of the data. For example, if we compare the predictions of the atmosphere effect against published data sets, we find that it accounts for only 44% to 50% of responses in a multiple-choice format (Dickstein 1978a; Roberge 1970) and 43% in a fill-in-the-conclusion test (Johnson-Laird and Bara 1984). Even Woodworth and Sells (1935) saw the atmosphere hypothesis as an explanation of errors, and not as a complete theory of syllogistic reasoning.

Information-Processing Models for Syllogisms

After the emergence of information-processing psychology, generative grammar, and artificial intelligence in the 1960s, theories of syllogistic reasoning began to adopt a more analytic stance. These theories specify the individual mental steps people go through in producing answers to syllogism problems, and they attempt to account quantitatively for correct responses as well as for errors (Erickson 1974; Guyote and Sternberg 1981; Johnson-Laird and Bara 1984; Revlis 1975b). These models share the notion that typical performance on such a task includes some blend of correct reasoning and mistakes (or misinterpretations), where the latter might include processing limitations (e.g., from short-term memory), inaccurate heuristics (e.g., from atmosphere), or comprehension difficulties (e.g., from conversion of All x are y to All y are x).

According to Erickson's (1974, 1978) model, subjects internally represent each premise as a combination of Euler circles, as shown here in figure 1.1. Thus, an ideal subject represents the premise All square blocks are green blocks of (10) in two different ways: as a circle for square blocks within a circle for green blocks and as two coincident circles.
The second premise, Some big blocks are square blocks, should be represented in four ways: as a circle for big blocks within a circle for square blocks, as a circle for square blocks within a circle for big blocks, as two coincident circles, and as two overlapping ones. To evaluate the conclusion, the subject must combine the premise representations into representations of the syllogism as a whole. In general, this can be done in more than one way (as shown at the bottom of the figure), because each premise can have several representations and because a representation of the first premise can combine with a representation of the second premise in several distinct configurations. If a potential conclusion holds in all these legal combinations, then that conclusion is deductively correct. The

Chapter 1

[Figure 1.1: Euler-circle diagram not reproduced. Panel title: PREMISE REPRESENTATIONS, for the premises All square blocks are green blocks and Some big blocks are square blocks.]

Figure 1.1 A representation in terms of Euler circles of the syllogism All square blocks are green blocks. Some big blocks are square blocks. Some big blocks are green blocks. The solid circles represent big blocks, the dashed circles green blocks, and the dotted circles square blocks.


model attributes errors on the syllogisms to failure to consider all relevant representations and combinations of the premises. In addition, errors can arise from bias in the way subjects represent the premises or in the way they derive conclusions from the combined diagrams.

Models such as Erickson's have been fairly successful in accounting for observed responses in syllogism experiments, and they present a deeper explanation of reasoning than one gets by cataloging biases. Their weakness lies in the difficulty of extending them to a class of inferences wider than syllogisms. Although some researchers have tried to apply their models to related deductive arguments (see, e.g., Guyote and Sternberg's (1981) treatment of conditional syllogisms), it is fair to say that these extensions have been modest.8 You can get a feel for this limitation by trying to imagine how a model like Erickson's would handle arguments such as (2) and (4) above. One cause of this deficiency is, no doubt, the rigid form of the syllogisms: Since there are only a relatively small number of (deductively correct) syllogism types but an infinite number of deductively correct argument types, any model that is specialized for syllogisms is likely to encounter problems in generalizing to a broader class. In other words, syllogism models have difficulties in explaining the productivity of deduction, largely because syllogisms themselves are nonproductive. Although Aristotle believed that all deductive inferences could be captured by sequences of syllogisms, modern logic makes it clear that this is not the case, and that more powerful deductive machinery is needed even for proofs of some of the geometry theorems that Aristotle discussed (Mueller 1974).

Why then did experimental psychologists concentrate on syllogisms rather than on the wider range of arguments generated by modern logical techniques? This may be due in part to ignorance or historical accident. As Lemmon (1965, p. 169) put it, "predicate calculus is to syllogism what a precision tool is to a blunt knife. . . . Nonetheless, whenever a new piece of equipment is introduced, there will always be found those who prefer the outdated machinery with which they are more familiar." It is likely, however, that syllogisms appeal to researchers because their structure lends itself to experiments. The small number of syllogistic forms makes it possible for investigators to present all of them to subjects within a single experimental session. Moreover, the combinatorics of syllogisms provide a ready-made set of dimensions to manipulate in an experimental design. In order to find out what makes syllogisms easy or difficult, we can vary the quantifiers that appear in the premises and the conclusion (the


syllogistic mood), the order of terms in the premises (the syllogistic figure), the order of the premises, and the phrases that instantiate the terms. The very factors that limit syllogisms' generality are the factors that make them attractive from an experimental standpoint. Designing experiments around modern logical systems is a more challenging task than designing experiments on syllogisms, but it is a necessary step if we want to study reasoning of the sort that the subjects of tables 1.1 and 1.2 engage in. We will return to syllogisms in chapter 7 after considering what a more general theory of deduction might be like.

Deduction in Later Cognitive Research

Although research on categorical syllogisms remains the prototype, the psychology of reasoning broadened in the 1960s and the 1970s to include studies of other forms of deduction. Some of this research centered on simple arguments, in particular on conditional syllogisms (such as (1) above) and linear syllogisms (such as (9)). (For studies of conditional syllogisms, see Clement and Falmagne 1986; Cummins et al. 1991; Evans 1977; Marcus and Rips 1979; Markovits 1987, 1988; Rips and Marcus 1977; Taplin 1971; Taplin and Staudenmayer 1973; and Staudenmayer 1975. Studies of linear syllogisms include Clark 1969; DeSoto, London, and Handel 1965; Huttenlocher 1968; Potts and Scholz 1975; and Sternberg 1980.) Other relevant studies, inspired by earlier research in psycholinguistics and concept learning, focused on people's understanding of sentences containing logical connectives such as not (Carpenter and Just 1975; Clark and Chase 1972; Trabasso, Rollins, and Shaughnessy 1971) and if (Fillenbaum 1975; Johnson-Laird and Tagart 1969; Legrenzi 1970; Wason 1968). In research of the latter sort, subjects are not asked to assess individual arguments as such; they are asked to paraphrase sentences containing the connective, or to decide whether given information would make these sentences true or false.
On one trial, for example, the experimenter might present a visual display of green dots along with the sentence The dots aren't red and ask the subjects to decide whether the sentence is true or false of the display. The amount of time they take to reach their decision (their response time) would constitute the dependent variable. But, although no argument explicitly appears, many of the investigators believed that subjects mentally represent sentences and displays in a common logical form. Thus, determining whether the display made the sentence true is clearly similar to determining the deductive correctness of (17).


(17) The dots are green.
     --------------------
     The dots aren't red.

Findings from these experiments will be important in chapters 4 and 5, where we will take up propositional reasoning. For the moment, however, let us consider the general framework in which the investigators tried to explain their findings. The theories or models that researchers proposed for these results were specialized in much the same way as the syllogism models: They provided a rigorous account of how subjects handle a particular task by specifying the elementary information-processing steps (for example, encoding and comparison operations) that subjects carry out in determining their answers. But the models were hard to extend beyond the task. Models of the way subjects compare simple affirmative or negative sentences to pictures, for example, provided accurate quantitative predictions about response time in terms of matching of constituents in the mental representations of the sentences (e.g., those in (17)). But they weren't able to explain, without new assumptions, the comparison of other proposition types (e.g., If some of the dots are green then not all of the dots are red). What seemed to be missing was a broader theoretical framework that could bridge between models for isolated logical connectives in isolated paradigms. Thus, Newell (1980, pp. 694-695) complained: "Theorists seem simply to write a different theory for each task. Details get filled in experimentally, but the frameworks . . . are just written down. . . . The difficulty lies in the emergence of each of the microtheories full blown from the theorist's pen. There is no way to relate them and thus they help ensure the division of the study of human cognition into qualitatively isolated areas."

Reasoning as Problem Solving

Newell's solution to the problem is to view these specific models as special cases of general problem-solving abilities.
What unites the various tasks and models, according to this framework, is the notion of a problem space, which Newell (1980, p. 697) defines as "a set of symbolic structures (the states of the space) and a set of operators over the space. Each operator takes a state as input and produces a state as output. . . . Sequences of operators define paths that thread their way through sequences of states." Problem-solving theory is the study of behavior explainable in terms of problem spaces. The idea of a problem space is illustrated most easily in a game such as chess: the set of possible


configurations of pieces on the board can be taken as the states, and the legal moves in the game as operators. According to Newell, however, the problem-space concept is applicable to "any symbolic goal-oriented activity," and to deduction problems in particular. For example, we can redescribe Erickson's syllogism model in terms of a problem space by taking configurations of mental Euler circles as states and the repositionings of these circles as operators. Newell (1980, 1990) offers two rather different problem-space formulations for syllogisms.

There is no doubt, then, that problem-solving theory is applicable to clear-cut deduction tasks. Newell is also quite right in pointing to the ad hoc character of the reasoning models mentioned above. However, it is not clear whether problem-solving theory is the correct solution to this difficulty. The extremely general nature of the problem-space concept may itself be cause for concern, since it could turn out that the concept is too abstract to provide much insight into the nature of deduction. There are really two problems here: Problem-solving theory fails to explain some important distinctions that we need in accounting for inference, and the problem-space notion may itself be too loosely constrained to be empirically helpful.

The problem-space concept can describe an extraordinary range of cognitive tasks: not only chess puzzles and syllogisms, but also such relatively low-level processes as item recognition in short-term memory (Newell 1973b, 1990). This very flexibility creates a problem when we consider the role that deduction plays in explaining other cognitive activities. As was noted above, deduction (but not, say, solving chess puzzles) is a well-entrenched psychological concept, useful in accounting for other mental processes. The problem-space hypothesis cannot explain this contrast between central and peripheral processes, since both can be couched in terms of state-space search.
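To make Newell's definition concrete, here is a minimal sketch of a problem space as a set of states plus named operators, searched breadth-first. Everything in it is an illustrative assumption rather than Newell's own formulation: the function names, the tuple encoding of IF sentences, and the toy number puzzle are hypothetical. The point is only that one search routine serves a deduction task and an unrelated puzzle equally well.

```python
from collections import deque

def search(start, operators, is_goal):
    """Breadth-first search of a problem space.

    start     -- the initial state (must be hashable)
    operators -- dict mapping an operator name to a function state -> new state,
                 or None when the operator does not apply
    is_goal   -- predicate that recognizes goal states
    Returns the operator names along a shortest path to a goal, or None.
    """
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for name, op in operators.items():
            successor = op(state)
            if successor is not None and successor not in seen:
                seen.add(successor)
                frontier.append((successor, path + [name]))
    return None

# Hypothetical deduction task: a state is the set of sentences derived so far,
# and ("IF", p, q) is an assumed encoding of the sentence IF p THEN q.
def modus_ponens(state):
    derived = {s[2] for s in state
               if isinstance(s, tuple) and s[0] == "IF" and s[1] in state}
    return None if derived <= state else state | derived

premises = frozenset({("IF", "deposit", "coke"), "deposit"})
path = search(premises, {"modus ponens": modus_ponens}, lambda s: "coke" in s)

# Hypothetical puzzle: reach 10 from 1 by doubling or adding one.
puzzle = search(1,
                {"double": lambda n: n * 2 if n < 10 else None,
                 "add one": lambda n: n + 1 if n < 10 else None},
                lambda n: n == 10)
```

Nothing in `search` distinguishes the deduction task from the puzzle, which is one way of seeing why the problem-space notion by itself says little about what is special to deduction.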
What causes trouble for the idea of deduction as problem solving is the asymmetry of cognitive explanations: Although deduction may be helpful in accounting for how one solves such chess puzzles, no one would appeal to solving chess puzzles in accounting for deduction. In other words, a theory of deduction has to include, not only problem spaces for particular deduction tasks, but also an explanation of the role that deduction plays in the rest of the cognitive system.

Furthermore, some of the problems that Newell raises with respect to previous models of reasoning also occur with respect to the choice of a problem space. Since there are few limits, if any, on the states of the space


or on the operators defined over them, a theorist is free to choose his problem space in a new way for each task. This means that a problem-space account of deduction can be ex post facto in nearly the same way as other models. In simple tasks, it might be possible to construct the problem space from surface features of the problem statement (see Newell 1990 on sentence-picture comparison); however, predicting problem spaces is difficult for complex deduction tasks. One of Newell's own theories for syllogisms (Polk and Newell 1988; Newell 1990) consists of six separate problem spaces: some containing representations of objects corresponding to the terms of the syllogism, others containing representations of propositions expressed by the premises, and yet others containing both sorts of representations. Perhaps these are the problem spaces that subjects use, but it seems unlikely that they can be predicted from nothing more than the subjects' instructions for this particular task. In order to achieve a principled theory for deduction tasks within a problem-solving framework, investigators have to choose their problem spaces in a way that reflects subjects' general understanding of all, some, not, and if, as well as their understanding of what it means for one sentence to follow logically from another.

I don't want to suggest that Newell and other researchers in problem solving have consciously adopted the view that deduction is nothing but search in a problem space. In other psychological domains (e.g., parsing and object perception), they have been quite willing to accept more narrowly specified theories within the problem-space framework (Newell and Simon 1976).
But the problem-space hypothesis is not sufficient, and the missing parts of the theory are not to be found in problem-solving research. The idea of a problem space gives us a generalization that applies to any rule-governed cognitive performance, including item recognition, dichotic listening, phoneme monitoring, word or picture naming, lexical decision, letter matching, and other favorite tasks. This generalization may be a useful framework for cognitive research; however, as I will try to show in later chapters, it is not to be confused with a theory of deduction.

Conclusions and Prospects

The psychological approaches that I have surveyed falter in explaining basic facts about deduction because of their narrow view of the subject


matter. By concentrating exclusively on individual differences, psychometricians ignore universal aspects of deduction that may account for its productivity. Experimental psychologists tend to focus on deduction in the context of particular tasks, such as evaluating categorical or linear syllogisms. This is not necessarily a bad strategy; experimental precision may even require it. However, because the arguments that appear in these tasks have usually been nonproductive ones (and because the proposed models are tailored to the tasks), the deduction models are not productive. To make matters worse, a task-by-task approach makes it difficult to identify general properties of deduction or to understand the central role that deduction plays in supporting other mental processes.

It doesn't help very much to view these tasks within a problem-solving framework if deduction problems are treated on the same footing as chess puzzles and cryptarithmetic (Newell and Simon 1972; Newell 1990). The elements of problem-solving theory (problem states, operators, and control methods) are too unconstrained to explain what is essential about deduction. If problem-solving theory merely generates a new set of states and operators for each deduction task, we are not much better off than we would be with task-tailored models.

Clearly, what is missing is a psychological theory of deduction that is broader than a model for a particular type of argument or a particular type of experiment but which captures what is characteristic of deduction with respect to other kinds of thinking. The rest of this book is devoted to developing such a theory and evaluating its empirical consequences. The main statement of the theory will appear in part II; however, as a start on this project, we will look to modern logic and artificial intelligence in the next two chapters, since these disciplines may well provide us with hints about the shape that a cognitive theory should take.


Reasoning and Logic

In 1926 Professor J. Lukasiewicz called attention to the fact that mathematicians in their proofs do not appeal to the theses of the theory of deduction, but make use of other methods of reasoning. The chief means employed in their method is that of an arbitrary supposition.
S. Jaskowski (1934)

Natural deduction, however, does not, in general, start from basic logical propositions, but rather from assumptions . . . to which logical deductions are applied.
G. Gentzen (1935/1969)

Traditional logic restricted research in the psychology of reasoning. Because the main instrument of traditional logic, the categorical syllogism, is a narrow type of deductive argument, experimental studies that centered on syllogisms suffered in generality. For the most part, these studies (and the models derived from them) gave no hint as to how people reason about arguments whose sentences have more than one quantifier or whose grammar is more complex than the simple S is P form. Since modern logic overcomes some of the syllogism's limitations, it seems a natural place to look for guidance in constructing a more adequate reasoning theory. Of course, even current logic has its limits, since there are deductive inferences that aren't represented in these formal systems. The inference from Mike is a bachelor to Mike is unmarried is one example. Nevertheless, logic is surely the only well-developed system for assessing the deductive correctness of arguments, and it is therefore a worthwhile starting point.

The idea that formal logic bears a close relationship to human reasoning is extremely controversial within cognitive science. For example, Wason and Johnson-Laird (1972, p. 245) concluded their influential study of reasoning by stating that "only gradually did we realize first that there was no existing formal calculus which correctly modeled our subjects' inferences, and second that no purely formal system would succeed." This kind of opposition is based on evidence that subjects' judgments about an argument sometimes depart from the answer that the experimenter derived by translating the argument into some system of logic and assessing its correctness within that system. The strength of this evidence, however, clearly depends on the system the investigator uses.
If the investigator's conception is too narrow, what is classified as nonlogical behavior may turn out logical after all. To evaluate the evidence, we need some clear idea of what logic has to offer.

This chapter discusses aspects of logic that have played a role in the debate on logic in reasoning and that will play an important part in the

Chapter 2

theory to be developed in later chapters. Most cognitive theories that contain a logic-like component (e.g., Braine 1978; Johnson-Laird 1975; Osherson 1974b, 1975, 1976; Rips 1983) are based on the notion of proof, particularly the "natural deduction" proofs originally devised by Gentzen (1935/1969) and Jaskowski (1934). The first section describes these proof methods, and the second section illustrates them in a sample system. The basic notion is familiar to anyone who has taken a high school mathematics course: If you want to know whether a particular argument is deductively correct, you can find out by taking its premises as given and then trying to derive its conclusion by applying a specified set of rules. If a proof or a derivation is possible, then the argument is deductively correct; the conclusion is deducible from the premises. We can also say, by extension, that the argument as a whole is deducible. The left column of table 2.1 summarizes this terminology.

Contemporary logical theory supplements the notions of formal proof and deducibility with a twin "semantic" system whose central concepts are the truth of a sentence and the validity of an argument (Tarski 1936/1956). In the semantic system the deductive correctness of an argument is a matter of the relationship between the truth of the premises and the truth of the conclusion. In particular, an argument is deductively correct if and only if the conclusion is true in all states of affairs in which the premises are true. In this case, the conclusion is (semantically) entailed by the premises, and the entire argument is valid, as shown in the right column in table 2.1. (We can also speak of a single sentence as valid if it is true in all states of affairs.
This is, of course, a stronger notion of semantic correctness for a sentence than the simple truth of a sentence in the actual state of affairs; it is the difference between Insects have six legs, which is true in our present state but false in a logically possible state in which they have eight, and Six is less than eight, which is true in all possible states.)

Table 2.1
Summary of logical terminology.

                                     Proof theory                   Semantics
Correctness of the relation of       Deducibility (= conclusion     Semantic entailment (= conclusion
the conclusion to the premises       provable from the premises)    true in all states of affairs in
                                                                    which the premises are true)
Correctness of an argument           Deducibility                   Validity
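For sentential arguments, the semantic criterion can be sketched by brute force: treat each assignment of truth values to the atomic sentences as a state of affairs and check that no state makes the premises true and the conclusion false. This is only an illustration under stated assumptions. IF is read here as the classical material conditional, and the function and variable names are hypothetical.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """True if the conclusion holds in every state of affairs (truth
    assignment to the atoms) in which all the premises hold."""
    for values in product([True, False], repeat=len(atoms)):
        state = dict(zip(atoms, values))
        if all(p(state) for p in premises) and not conclusion(state):
            return False  # counterexample: premises true, conclusion false
    return True

# d = "Calvin deposits 50 cents", c = "Calvin gets a coke".
# Assumption: IF d THEN c is treated as the material conditional.
if_d_then_c = lambda s: (not s["d"]) or s["c"]
d = lambda s: s["d"]
c = lambda s: s["c"]
not_c = lambda s: not s["c"]
not_d = lambda s: not s["d"]

modus_ponens_valid = entails([if_d_then_c, d], c, ["d", "c"])
modus_shmonens_valid = entails([if_d_then_c, d], not_c, ["d", "c"])
modus_tollens_valid = entails([if_d_then_c, not_c], not_d, ["d", "c"])
```

Note that the check enumerates states rather than producing a derivation. That is the sense in which the semantic criterion is stated independently of any proof procedure, even though for this small system the two criteria agree.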


In this dual setup, then, we have two criteria of deductive correctness: deducibility and validity. We might hope that these two criteria coincide in confirming exactly the same set of arguments. For simple logical systems (for example, classical sentential and predicate logic) they do coincide; such systems are said to be complete. In these systems, it doesn't matter which criterion we use, as long as all we are interested in is determining which arguments are the correct ones. However, there are three reasons for keeping both criteria in mind. First, the proof-theoretic description is computationally relevant: The description contains rules that yield a finite proof of a deducible argument. By contrast, the semantic description is computationally independent, since the criterion it gives for validity does not depend on there being any finite procedure for assessing it. Second, there are more complex logical systems in which the criteria don't coincide: incomplete systems in which some entailments are not deducible. Third, according to an intriguing theory put forth by Johnson-Laird (1983), it is the semantic rather than the proof-theoretic criterion that is important in human reasoning (contrary to the other cognitive theories cited above). In the present chapter we will stick to concerns about deducibility; we will return to semantics and its putative role in reasoning in chapters 6 and 10.

Of course, changing the deduction rules of a logical system (or changing the way in which truth is assigned to propositions) can alter which arguments are deemed deductively correct. Logicians and philosophers of logic have in fact proposed a variety of plausible systems that differ in the elements of natural language that come in for formal treatment, and in the analysis that they give these elements. Current logics are available for concepts such as knowledge and belief, temporal precedence and succession, causality, obligation and permission, and logical necessity and possibility. For each of these concepts, there are alternative logics differing in exactly which arguments are deductively correct. Psychologists have almost completely overlooked this variety, tacitly assuming a single standard of deductive correctness. This means that, when subjects' judgments have failed to conform to the standard, psychologists have been too ready to label them illogical, or, at least, to assume that logic is of no help in understanding them. We can't begin to survey all these systems (see van Benthem 1985 and Haack 1974 for reviews), but we can consider in a preliminary way some alternatives that may be especially important for psychological


purposes: alternative rules for classical logic and strengthenings of the rules that go beyond classical logic.

This chapter does not aim to settle on a single logical system that best represents human reasoning. Determining which system (if any) is best in this sense will have to await the empirical evidence presented in part II. Nevertheless, we can keep psychological plausibility in mind in examining logical techniques. On one hand, this means that certain devices (e.g., truth tables) that appear in many logic texts will receive little or no attention, since they are unlikely to play a significant role in human reasoning. (For arguments against the psychological reality of truth tables, see Osherson 1974b, 1975.) On the other hand, much of this chapter will be devoted to natural-deduction systems. The first section, in particular, describes the general properties of natural deduction and contrasts it with earlier axiomatic approaches. The second section sets out in more detail a particular natural-deduction system for classical sentential and predicate logics and considers some possible modifications that might bring it into better agreement with intuitions about deductive correctness by people with no formal training in logic. Readers who are already familiar with natural deduction from introductory logic may want to refresh their memory by glancing at the rules in tables 2.3 and 2.5 and then skip to the subsections on "possible modifications." The third section discusses an objection by Harman (1986) that rules like those of natural-deduction systems can't possibly serve as "rules of reasoning."

Formal Proof

At the most general level, a formal proof is a finite sequence of sentences $(s_1, s_2, \ldots, s_k)$ in which each sentence is either a premise, an axiom of the logical system, or a sentence that follows from preceding sentences by one of the system's rules. An argument is deducible in the system if there is a proof whose final sentence, $s_k$, is the conclusion of the argument. For example, consider a system that includes modus ponens among its inference rules. Modus ponens stipulates that the sentence q follows from sentences of the form IF p THEN q and p. Thus, (1) is deducible in this system.

(1) IF Calvin deposits 50 cents THEN Calvin will get a coke.
    Calvin deposits 50 cents.
    --------------------------------------------------------
    Calvin will get a coke.


The proof consists simply of the sequence of three sentences in the order listed above, since (a) each sentence in the sequence either is a premise or follows from preceding sentences by modus ponens and (b) the final sentence is the conclusion of the argument. (Capital letters are used for IF and THEN to mark the fact that these words are parts of the proof system and are not necessarily equivalent in meaning or force to English if . . . then . . . ; this practice will continue through the book.) In this stripped-down system, we could also prove the deducibility of (2).

(2) IF Calvin deposits 50 cents THEN Calvin gets a coke.
    IF Calvin gets a coke THEN Calvin will buy a burger.
    Calvin deposits 50 cents.
    ----------------------------------------------------
    Calvin will buy a burger.

In this case a proof involves two applications of modus ponens, as (3) shows.

(3) a. IF Calvin deposits 50 cents THEN Calvin gets a coke.      Premise
    b. Calvin deposits 50 cents.                                 Premise
    c. Calvin gets a coke.                                       Modus ponens
    d. IF Calvin gets a coke THEN Calvin will buy a burger.      Premise
    e. Calvin will buy a burger.                                 Modus ponens
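The definition of a formal proof can be turned into a small checker for this stripped-down system, whose only rule is modus ponens and which has no axioms. The sketch below is illustrative only: the encoding of IF p THEN q as the tuple ("IF", p, q), the short atom names, and the function name are all assumptions made for the example.

```python
def check_proof(lines, premises, conclusion):
    """Return True if `lines` is a proof of `conclusion` from `premises`
    in a system whose only inference rule is modus ponens.

    A sentence is either an atom (a string) or a conditional ("IF", p, q).
    """
    for i, sentence in enumerate(lines):
        earlier = lines[:i]
        # Modus ponens: some earlier p and an earlier IF p THEN sentence.
        by_modus_ponens = any(("IF", p, sentence) in earlier for p in earlier)
        if sentence not in premises and not by_modus_ponens:
            return False  # this line is neither a premise nor derived
    return bool(lines) and lines[-1] == conclusion

# Proof (3), with hypothetical atom names standing for the English sentences.
a = ("IF", "deposit", "coke")
d = ("IF", "coke", "burger")
proof_3 = [a, "deposit", "coke", d, "burger"]
premises_2 = [a, d, "deposit"]

ok = check_proof(proof_3, premises_2, "burger")
# Dropping line (3c) breaks the proof: (3e) no longer follows by modus ponens.
broken = check_proof([a, "deposit", d, "burger"], premises_2, "burger")
```

The checker only verifies a proof that is handed to it; finding the proof in the first place is a separate problem.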

Sentences (3a), (3b), and (3d) are premises of argument (2), and sentences (3c) and (3e) are derived by modus ponens from preceding ones.

To prove a more diverse set of arguments, we will clearly need greater deductive power. We can get it, within this framework, by introducing axioms or additional inference rules. An axiomatic (or "logistic") proof system contains a set of axioms and usually has modus ponens as its only rule. Natural-deduction systems contain several distinct inference rules and eliminate axioms. The two kinds of system can be equivalent in proving exactly the same set of theorems, but they possess rival advantages and disadvantages in other respects. On one hand, axiomatic systems sometimes have an advantage over natural-deduction systems when we must


derive characteristics about the proof system itself (though natural-deduction systems have interesting metatheoretic properties of their own; see Fine 1985a,b; Gentzen 1934/1969; Prawitz 1965; Ungar 1992). On the other hand, it is usually easier to prove theorems within a natural-deduction system, and consequently most elementary textbooks on logic make use of natural deduction as a main proof method (e.g., Copi 1954; Fitch 1952; Lemmon 1965; McCawley 1981; Quine 1950; Suppes 1957; Thomason 1970a). For the same reason, natural deduction has been the method of choice in psychological models. Whether formal "natural-deduction" methods are natural psychologically is a major question that we will explore experimentally in part II. (The description of deduction systems in this chapter will necessarily be brisk compared to that of the textbooks just mentioned. Readers who would like a less condensed treatment should consult these sources.)

The Axiomatic Method

In this book we will focus on natural deduction, but it may be helpful to give an example of an axiomatic proof in order to highlight the differences. Of course, an axiomatic system might have generated the proof in (3); but since modus ponens was the only rule we needed, it is not very revealing. Instead, we can take as our example the somewhat similar argument shown in (4), which has the form known as modus tollens.

(4) IF Calvin deposits 50 cents THEN Calvin gets a coke.
    Calvin does not get a coke.
    ----------------------------------------------------
    Calvin does not deposit 50 cents.

We will take as a sample axiom system one that contains three axiom schemas in addition to the usual modus ponens rule (Mendelson 1964, p. 31).
We can spell out these schemas compactly if we allow P, Q, and R to stand for arbitrary sentences and abbreviate IF P THEN Q as $P \to Q$ and NOT P as $\neg P$. We then allow a sentence to count as an axiom if it is the result of substituting sentences for P, Q, and R in any of the following schemas:

(I)   $P \to (Q \to P)$
(II)  $(P \to (Q \to R)) \to ((P \to Q) \to (P \to R))$
(III) $(\neg P \to \neg Q) \to ((\neg P \to Q) \to P)$

Table 2.2
A sample axiomatic proof of the argument $d \to c$; $\neg c$; therefore, $\neg d$.

1.  $\neg\neg d \to (\neg d \to \neg\neg d)$    Axiom I
2.  $(\neg d \to \neg\neg d) \to ((\neg d \to \neg d) \to d)$    Axiom III
3.  $[(\neg d \to \neg\neg d) \to ((\neg d \to \neg d) \to d)] \to \{\neg\neg d \to [(\neg d \to \neg\neg d) \to ((\neg d \to \neg d) \to d)]\}$    Axiom I
4.  $\neg\neg d \to [(\neg d \to \neg\neg d) \to ((\neg d \to \neg d) \to d)]$    Modus ponens (from 2 & 3)
5.  $\{\neg\neg d \to [(\neg d \to \neg\neg d) \to ((\neg d \to \neg d) \to d)]\} \to \{[\neg\neg d \to (\neg d \to \neg\neg d)] \to [\neg\neg d \to ((\neg d \to \neg d) \to d)]\}$    Axiom II
6.  $[\neg\neg d \to (\neg d \to \neg\neg d)] \to [\neg\neg d \to ((\neg d \to \neg d) \to d)]$    Modus ponens (from 4 & 5)
7.  $\neg\neg d \to ((\neg d \to \neg d) \to d)$    Modus ponens (from 1 & 6)
8.  $(\neg d \to ((\neg d \to \neg d) \to \neg d)) \to ((\neg d \to (\neg d \to \neg d)) \to (\neg d \to \neg d))$    Axiom II
9.  $\neg d \to ((\neg d \to \neg d) \to \neg d)$    Axiom I
10. $(\neg d \to (\neg d \to \neg d)) \to (\neg d \to \neg d)$    Modus ponens (from 8 & 9)
11. $\neg d \to (\neg d \to \neg d)$    Axiom I
12. $\neg d \to \neg d$    Modus ponens (from 10 & 11)
13. $(\neg d \to \neg d) \to (\neg\neg d \to (\neg d \to \neg d))$    Axiom I
14. $\neg\neg d \to (\neg d \to \neg d)$    Modus ponens (from 12 & 13)
15. $[\neg\neg d \to ((\neg d \to \neg d) \to d)] \to [(\neg\neg d \to (\neg d \to \neg d)) \to (\neg\neg d \to d)]$    Axiom II
16. $(\neg\neg d \to (\neg d \to \neg d)) \to (\neg\neg d \to d)$    Modus ponens (from 7 & 15)
17. $\neg\neg d \to d$    Modus ponens (from 14 & 16)
18. $\neg c \to (\neg\neg d \to \neg c)$    Axiom I
19. $\neg c$    Premise
20. $\neg\neg d \to \neg c$    Modus ponens (from 18 & 19)
21. $(d \to c) \to (\neg\neg d \to (d \to c))$    Axiom I
22. $d \to c$    Premise
23. $\neg\neg d \to (d \to c)$    Modus ponens (from 21 & 22)
24. $(\neg\neg d \to (d \to c)) \to ((\neg\neg d \to d) \to (\neg\neg d \to c))$    Axiom II
25. $(\neg\neg d \to d) \to (\neg\neg d \to c)$    Modus ponens (from 23 & 24)
26. $\neg\neg d \to c$    Modus ponens (from 17 & 25)
27. $(\neg\neg d \to \neg c) \to ((\neg\neg d \to c) \to \neg d)$    Axiom III
28. $(\neg\neg d \to c) \to \neg d$    Modus ponens (from 20 & 27)
29. $\neg d$    Modus ponens (from 26 & 28)
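One way to appreciate how mechanical (and how indirect) the proof in table 2.2 is: it can be replayed step by step by a few lines of code. The sketch below assumes a tuple encoding of formulas and hypothetical helper names (imp, neg, ax1, ax2, ax3, mp). Each axiom helper builds a substitution instance of the corresponding schema, and mp raises an error if modus ponens does not apply, so the script runs to completion only if every step is legitimate.

```python
def imp(a, b):
    """The conditional a -> b."""
    return ("->", a, b)

def neg(a):
    """The negation of a."""
    return ("not", a)

# Substitution instances of axiom schemas I-III.
def ax1(p, q):
    return imp(p, imp(q, p))

def ax2(p, q, r):
    return imp(imp(p, imp(q, r)), imp(imp(p, q), imp(p, r)))

def ax3(p, q):
    return imp(imp(neg(p), neg(q)), imp(imp(neg(p), q), p))

def mp(conditional, antecedent):
    """Modus ponens: from a -> b and a, return b; fail otherwise."""
    if conditional[0] != "->" or conditional[1] != antecedent:
        raise ValueError("modus ponens does not apply")
    return conditional[2]

d, c = "d", "c"
nd, nnd = neg(d), neg(neg(d))
s = {}  # s[n] is line n of the proof
s[1] = ax1(nnd, nd)
s[2] = ax3(d, nd)
s[3] = ax1(s[2], nnd)
s[4] = mp(s[3], s[2])
s[5] = ax2(nnd, imp(nd, nnd), imp(imp(nd, nd), d))
s[6] = mp(s[5], s[4])
s[7] = mp(s[6], s[1])
s[8] = ax2(nd, imp(nd, nd), nd)
s[9] = ax1(nd, imp(nd, nd))
s[10] = mp(s[8], s[9])
s[11] = ax1(nd, nd)
s[12] = mp(s[10], s[11])
s[13] = ax1(s[12], nnd)
s[14] = mp(s[13], s[12])
s[15] = ax2(nnd, imp(nd, nd), d)
s[16] = mp(s[15], s[7])
s[17] = mp(s[16], s[14])          # NOT NOT d -> d
s[18] = ax1(neg(c), nnd)
s[19] = neg(c)                    # premise
s[20] = mp(s[18], s[19])
s[21] = ax1(imp(d, c), nnd)
s[22] = imp(d, c)                 # premise
s[23] = mp(s[21], s[22])
s[24] = ax2(nnd, d, c)
s[25] = mp(s[24], s[23])
s[26] = mp(s[25], s[17])
s[27] = ax3(nd, c)
s[28] = mp(s[27], s[20])
s[29] = mp(s[28], s[26])          # NOT d, the conclusion of (4)
```

Most of the 29 steps serve only to establish the intermediate result on line 17; the two premises enter late, at lines 19 and 22.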

Table 2.2 gives a full proof of (4), with d standing for Calvin deposits 50 cents and c standing for Calvin gets a coke. The justification for each line of the proof appears at the right; for example, "Axiom I" in the first line of the table means that this line is an instance of axiom schema I (above) when ¬¬d is substituted for P and ¬d for Q. The proof proceeds in this way by finding substitution instances for the axiom schemas and then applying modus ponens to the instances.

The point of this exercise is simply to illustrate the difficulty of finding an axiomatic proof for even a relatively brief argument such as (4). Of course, we could reduce this complexity for a given argument by choosing a different set of axioms; however, the indirection that usually accompanies axiomatic proofs suggests that other methods may be more appropriate for capturing ordinary deductive reasoning (for example, the reasoning we glimpsed in tables 1.1 and 1.2). The role that assumptions or suppositions play in human reasoning was noted in chapter 1: People often assume certain sentences are true "for the sake of argument" in order to simplify their thinking. This key notion leads to proofs that are much more lucid than the one in table 2.2 and that are presumably better candidates for human proof finding.

The Natural-Deduction Method

In table 2.2 we assumed the premises of (4) and used these premises as lines of the proof.¹ Natural-deduction methods enlarge on this idea by permitting us to make other temporary assumptions or suppositions. As an example of how suppositions can simplify a proof, consider (5), an informal justification of (4).

(5) a. According to the first premise, if Calvin deposits 50 cents then Calvin gets a coke.
    b. Suppose, contrary to the conclusion, that Calvin does deposit 50 cents.
    c. Then (by modus ponens), Calvin would get a coke.
    d. But the second premise tells us that Calvin does not get a coke.
    e. Hence, Calvin must not have deposited 50 cents.

This justification embodies a typical reductio ad absurdum pattern: In (5b) we assume temporarily the opposite of the conclusion we wish to prove
and then show that this leads to contradictory information. Since the supposition could not hold, the conclusion itself must follow from the premises. In other words, (5) tacitly appeals to an inference rule stating that if a supposition leads to a pair of contradictory sentences then the negation of that supposition must follow. This reductio rule, together with modus ponens, is sufficient to show that (4) is deducible.

Natural-deduction systems formalize this method of making suppositions in the service of a proof, as the quotations from Gentzen and Jaskowski at the beginning of this chapter attest. Within these systems, we can introduce suppositions freely as lines of a proof in order to draw further inferences from them. Before the proof is complete, however, we must apply a rule that resolves or "discharges" the supposition, since the conclusion of the proof must depend on the premises alone and not on any of the arbitrary assumptions we have made along the way. In (5), for example, the supposition made in line b is resolved when we conclude that the supposition could not in fact hold (owing to the contradiction).

To state this a bit more systematically, we can use the term domain to refer to a designated set of lines of the proof that are associated with a supposition. Then no supposition (apart from the premises) can include the conclusion of the proof in its domain. Figure 2.1 illustrates the notion of a domain, using (5) as an example. In the figure the premises in (5) establish a domain consisting of (5a), (5d), and (5e). The supposition in (5b) establishes a second domain that comprises just (5b) and (5c). As this example illustrates, one domain can be subordinated to another: The domain of (5b) and (5c) is a subdomain of the premises' domain. If D is the smallest domain that includes a domain D' as subordinate, then D' is an immediate subdomain of D. We can also say that D is the immediate superdomain of D'.
Thus, the domain of (5b) and (5c) is also the immediate subdomain of the premises' domain, and the latter is the immediate superdomain of the former.

A sentence can be said to hold in the domain in which it appears, and this means that deduction rules that apply to the domain can use it freely. In the systems for classical sentential and predicate logic, which are outlined in this chapter, sentences also hold in all subdomains of their domain. According to this system, for example, (5a) holds throughout the proof in (5), including the subdomain. (Of course, (5c) holds only in the subdomain.) Other logical systems, however, place special restrictions on which sentences hold in subdomains (see the first "Possible Modifications" subsection below).

Figure 2.1
Graph of the domains in (5) for the following argument: IF Calvin deposits 50 cents THEN Calvin gets a coke. Calvin does not get a coke. Calvin does not deposit 50 cents.

Texts on natural deduction have used a variety of devices to indicate domains and a variety of rule systems to manipulate them. As an example, let us consider a system (similar to that of Jaskowski 1934) that represents domains by a list of sentences, with a plus sign in front of the domain's suppositions. We indicate subdomains by indenting the list, as shown below. The system will consist of all simple English sentences (e.g., Dogs bark) and of more complex sentences formed in the following way: If P and Q are sentences of the system, then so are sentences of the form IF P THEN Q and NOT P. Again, IF ... THEN and NOT are capitalized to emphasize that these connectives may be somewhat different from their counterparts in natural language. The heart of the system consists of the following three inference rules.

Modus ponens (IF Elimination): If sentences of the form IF P THEN Q and P hold in a given domain, then the sentence Q can be added to that domain.

Reductio ad absurdum: If the sentences Q and NOT Q hold in a subdomain whose sole supposition is NOT P, then the sentence P can be added to the immediate superdomain. If the sentences Q and NOT Q hold in a subdomain whose sole supposition is P, then NOT P can be added to the immediate superdomain.²

Conditionalization (Conditional Proof, IF Introduction): If a sentence Q holds in a subdomain whose sole supposition is P, then IF P THEN Q can be added to the immediate superdomain.

The Conditionalization and Reductio rules make crucial use of suppositions and thus allow the system to emulate these characteristics of human reasoning. The rules are also compatible with the productivity of reasoning, since they can prove an unlimited number of arguments if we repeat and combine them. With this system, a formal proof of (4) would look like (6) (where a justification for each line appears at the right).

(6) a. + IF Calvin deposits 50 cents THEN Calvin gets a coke.    Premise
    b. + NOT Calvin gets a coke.                                  Premise
    c.     + Calvin deposits 50 cents.                            Supposition
    d.     Calvin gets a coke.                                    Modus ponens
    e. NOT Calvin deposits 50 cents.                              Reductio
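The three rules and the bookkeeping for domains can be sketched in a few lines of code. What follows is one possible rendering under assumptions of our own: the Domain class, the tuple encoding of sentences, and the function names are all hypothetical, and the reductio function returns P when the supposition has the form NOT P and NOT P otherwise (the rule as stated would also license NOT NOT P in the first case).

```python
def NOT(p):
    return ("NOT", p)

def IF(p, q):
    return ("IF", p, q)

class Domain:
    """A supposition together with the sentences derived under it."""

    def __init__(self, supposition=None, parent=None):
        self.supposition = supposition
        self.parent = parent
        self.lines = [] if supposition is None else [supposition]

    def holds(self, s):
        # Classical convention: superdomain sentences hold in subdomains.
        return s in self.lines or (
            self.parent is not None and self.parent.holds(s))

    def suppose(self, s):
        """Open an immediate subdomain whose sole supposition is s."""
        return Domain(s, parent=self)

def modus_ponens(dom, p, q):
    """IF P THEN Q and P hold in dom: add Q to dom."""
    assert dom.holds(IF(p, q)) and dom.holds(p)
    dom.lines.append(q)
    return q

def reductio(sub, q):
    """Q and NOT Q hold in sub: negate its supposition in the superdomain."""
    assert sub.holds(q) and sub.holds(NOT(q))
    s = sub.supposition
    result = s[1] if isinstance(s, tuple) and s[0] == "NOT" else NOT(s)
    sub.parent.lines.append(result)
    return result

def conditionalize(sub, q):
    """Q holds in sub with supposition P: add IF P THEN Q to the superdomain."""
    assert sub.holds(q)
    result = IF(sub.supposition, q)
    sub.parent.lines.append(result)
    return result

# Replaying proof (6).
deposit, coke = "Calvin deposits 50 cents", "Calvin gets a coke"
root = Domain()
root.lines += [IF(deposit, coke), NOT(coke)]    # (6a), (6b): premises
sub = root.suppose(deposit)                     # (6c): supposition
modus_ponens(sub, deposit, coke)                # (6d)
conclusion = reductio(sub, coke)                # (6e)
print(conclusion)
```

Note that the conclusion is added to the premises' domain, not to the subdomain, so it no longer depends on the supposition in (6c).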

This proof bears an obvious similarity to the informal one illustrated in (5) and in figure 2.1. We assume in (6c) that Calvin deposits 50 cents, contrary to the conclusion that we wish to prove. From this supposition, it follows that Calvin gets a coke; however, since NOT Calvin gets a coke is one of the premises (and therefore holds in the subdomain of (6c) and (6d)), the reductio rule applies and yields the conclusion we were hoping for. Indeed, the only differences between (5) and (6) are (6)'s more formal representation of domains and the consequent reordering of the proof's lines. Note also that the third rule, conditionalization, is not used in (6), though it is required to equate the power of our sample natural-deduction system with that of the axiomatic system described above. (We will soon make use of conditionalization in a further example.)

This proof of (4) is not quite as simple as the corresponding proof of (1), which contains just three lines: the premises and the conclusion itself. Modus ponens is the only rule required in such a proof, whereas both modus ponens and reductio are used in (6). This accords with the results of many experiments showing that (1)'s correctness is much easier to identify than (4)'s for college-age subjects. Sandra Marcus and I (Marcus and Rips 1979) found that 98-100% of the subjects we tested (across conditions) recognized that the conclusion of arguments like (1) followed from the premises but only 57-62% recognized the conclusion of (4) as following from the premises. This provides a first hint that natural-deduction systems have properties similar to those of human reasoning, though of course the difficulty with (4) may be due to other factors as well. (See Evans 1977; Markovits 1987, 1988; Rips and Marcus 1977; Taplin 1971; and Taplin and Staudenmayer 1973 for similar findings.)

The CPL (Classical Predicate Logic) System

In this section, to illustrate natural-deduction techniques, I develop a full rule system for classical predicate logic (or CPL). I start in the usual way, by specifying rules for the sentence connectives AND, OR, NOT, and IF ... THEN; I then consider possible modifications to these rules that might bring them in closer touch with human inference. Next I discuss rules for the quantifiers FOR ALL and FOR SOME. The complete set of rules (connectives plus quantifiers) provides the basis of the deduction system that will be developed in succeeding chapters.

Rules for Connectives

Although the natural-deduction scheme just discussed yields fairly direct proofs for (1) and (4), it is not quite so convenient for other arguments. The only parts of the language that we can formally analyze in this system are the sentence connectives IF ... THEN and NOT. This means that in order to establish the deducibility of arguments that depend on other words we would have to show how these arguments can be translated into ones with IF ... THEN and NOT as their only operational expressions. Even an argument as elementary as (7) is not yet deducible.

(7) Linda is a feminist AND Linda is a bank teller.
    ----------
    Linda is a bank teller.

A surprising amount can be accomplished by translation. Each of the two deduction systems that we have examined so far (the axiomatic system with axiom schemas I-III and the natural-deduction system with Modus ponens, Conditionalization, and Reductio) is able to prove all the arguments that are valid in classical sentential logic, the logical system generally presented first in elementary textbooks. Valid arguments in this system containing the sentence connectives AND, OR, or IF AND ONLY IF can be proved by paraphrasing these connectives using NOT and IF ... THEN. For example, as (8) shows, if we rephrase the first premise of (7) as NOT (IF Linda is a feminist THEN NOT Linda is a bank teller), we can then show that (7) is deducible.

(8) a. + NOT (IF Linda is a feminist THEN NOT Linda is a bank teller).    Premise
    b.     + NOT Linda is a bank teller.                                   Supposition
    c.         + Linda is a feminist.                                      Supposition
    d.     IF Linda is a feminist THEN NOT Linda is a bank teller.         Conditionalization
    e. Linda is a bank teller.                                             Reductio

In this example, lines (8a) and (8e) form a first domain, lines (8b) and (8d) a second, and line (8c) a third, as indicated by the indentation. Line (8d) is our first example of the Conditionalization rule. The underlying idea is that we can derive a sentence of the form IF P THEN Q provided that we can show (in a subdomain) that Q follows from the supposition P according to the rules of the system. In the present case, we assume in line (8c) that Linda is a feminist. NOT Linda is a bank teller (i.e., (8b)) holds in this innermost subdomain, because of the convention of classical logic, mentioned above, that superdomain sentences hold in their subdomains. It follows by Conditionalization that IF Linda is a feminist THEN NOT Linda is a bank teller in the domain of lines (8b) and (8d). The conclusion of Argument (7), Linda is a bank teller, then follows by Reductio, since the contradictory sentences (8a) and (8d) hold in the same domain, whose supposition is NOT Linda is a bank teller.
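The paraphrase that (8) relies on can be checked against the classical truth tables. The sketch below assumes the standard two-valued reading of the connectives; the helper names are illustrative, and the OR paraphrase shown (IF NOT P THEN Q) is the usual textbook one, not something spelled out at this point in the text.

```python
from itertools import product

def implies(a, b):
    """Truth function assumed for the classical IF ... THEN."""
    return (not a) or b

def equivalent(f, g, n):
    """True if f and g agree on every assignment to n sentence letters."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=n))

# P AND Q  versus  NOT (IF P THEN NOT Q)
and_ok = equivalent(lambda p, q: p and q,
                    lambda p, q: not implies(p, not q), 2)

# P OR Q  versus  IF NOT P THEN Q
or_ok = equivalent(lambda p, q: p or q,
                   lambda p, q: implies(not p, q), 2)

print(and_ok, or_ok)
```

Agreement on every row shows that the translation preserves truth conditions; it does not show that the translation is psychologically obvious.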

Formal Rules

Even though arguments such as (7) are provable in this natural-deduction system, the proofs are more roundabout than we would like. For one thing, the translation step may not be obvious: What justifies taking Linda is a bank teller AND Linda is a feminist as equivalent to NOT (IF Linda is a bank teller THEN NOT Linda is a feminist)? For another thing, the proof in (8) seems less straightforward than the argument (7), and from a psychological point of view it seems odd to derive a trivial argument by means of a nontrivial proof. Someone who was unsure about the correctness of (7) would probably not be convinced by the proof in (8).

What we require, then, are inference rules that will apply directly to arguments such as (7) and that embody the deductive potential of the connectives AND and OR. Gentzen's systems NK and NJ fulfill these requirements, and variations on these systems appear in the logic textbooks cited above. The rules in table 2.3 constitute one such variation that includes specific rules for AND, OR, IF ... THEN, and NOT. The table states them in the form of simple procedures for carrying out an individual inference step. These rules are essentially the same as those in Gentzen's NK, except for a minor modification in the rules for NOT (i.e., Double Negation Elimination and Reductio).

The system of table 2.3 derives exactly the same results as the three-rule system discussed above, but it does so in a more straightforward way. For example, the proof of argument (7) is now just a list of the premise and the conclusion, where the conclusion follows by the new AND Elimination rule. The proofs of arguments (1) and (4) are the same as before, since the expanded system also contains the Modus ponens and Reductio rules that we used in generating those proofs. We use "IF Elimination" as another name for Modus ponens and "IF Introduction" as another name for Conditionalization in table 2.3 to bring out the symmetry between them.³ We also refer to the two forms of Reductio as NOT Introduction and NOT Elimination.

As a fuller example of how these rules work, we can prove that argument (9) is deducible.

(9) (Jill is in Pittsburgh AND Robin is in Atlanta) OR ((NOT Jill is in Pittsburgh) AND (NOT Robin is in Atlanta))
    ----------
    (IF Jill is in Pittsburgh THEN Robin is in Atlanta) AND (IF Robin is in Atlanta THEN Jill is in Pittsburgh)

The proof of (9), which appears in table 2.4, illustrates many of the rules from table 2.3. To make the proof more readable, Jill is in Pittsburgh is

Table 2.3
Inference rules for classical sentential logic.

IF Elimination (Modus ponens)
(a) If sentences of the form IF P THEN Q and P hold in a given domain,
(b) then the sentence Q can be added to that domain.

IF Introduction (Conditionalization)
(a) If a sentence Q holds in a subdomain whose supposition is P,
(b) then IF P THEN Q can be added to the immediate superdomain.

NOT Elimination (Reductio ad absurdum 1)
(a) If the sentences Q and NOT Q hold in a subdomain whose supposition is NOT P,
(b) then the sentence P can be added to the immediate superdomain.

NOT Introduction (Reductio ad absurdum 2)
(a) If the sentences Q and NOT Q hold in a subdomain whose supposition is P,
(b) then NOT P can be added to the immediate superdomain.

Double Negation Elimination
(a) If the sentence NOT NOT P holds in a given domain,
(b) then the sentence P can be added to that domain.

AND Elimination
(a) If the sentence P AND Q holds in a given domain,
(b) then the sentences P and Q can be added to the domain.

AND Introduction
(a) If the sentence P and the sentence Q hold in a given domain,
(b) then the sentence P AND Q can be added to that domain.

OR Elimination
(a) If the sentence P OR Q holds in a given domain D,
(b) and the sentence R holds in an immediate subdomain of D whose supposition is P,
(c) and the sentence R holds in an immediate subdomain of D whose supposition is Q,
(d) then R can be added to D.

OR Introduction
(a) If the sentence P holds in a given domain,
(b) then the sentences P OR Q and Q OR P can be added to that domain, where Q is an arbitrary sentence.

abbreviated as p and Robin is in Atlanta as q. The basic strategy of the proof is to show that the conclusion of (9) follows from either Jill is in Pittsburgh AND Robin is in Atlanta (i.e., p AND q) or (NOT Jill is in Pittsburgh) AND (NOT Robin is in Atlanta) (i.e., (NOT p) AND (NOT q)), taken separately. Thus, the conclusion follows from the disjunction of these sentences in the premise by OR Elimination. Lines b-i of table 2.4 establish the first part of this strategy, and lines j-u the second part. In line b we assume the first disjunct, p AND q, and we can then get q by AND Elimination. In line d we also assume p, and since q still holds in the subdomain created by p we can derive IF p THEN q by IF Introduction. The rest of the proof follows through repetition of roughly the same combination of steps.



Table 2.4
Sample natural-deduction proof from premise (p AND q) OR (NOT p AND NOT q) to conclusion (IF p THEN q) AND (IF q THEN p).

a. (p AND q) OR (NOT p AND NOT q)           Premise
b.     + p AND q                            Supposition
c.     q                                    AND Elimination (from b)
d.         + p                              Supposition
e.     IF p THEN q                          IF Introduction (from c, d)
f.     p                                    AND Elimination (from b)
g.         + q                              Supposition
h.     IF q THEN p                          IF Introduction (from f, g)
i.     (IF p THEN q) AND (IF q THEN p)      AND Introduction (from e, h)
j.     + NOT p AND NOT q                    Supposition
k.     NOT q                                AND Elimination (from j)
l.         + q                              Supposition
m.             + NOT p                      Supposition
n.         p                                NOT Elimination (from k-m)
o.     IF q THEN p                          IF Introduction (from l, n)
p.     NOT p                                AND Elimination (from j)
q.         + p                              Supposition
r.             + NOT q                      Supposition
s.         q                                NOT Elimination (from p-r)
t.     IF p THEN q                          IF Introduction (from q, s)
u.     (IF p THEN q) AND (IF q THEN p)      AND Introduction (from o, t)
v. (IF p THEN q) AND (IF q THEN p)          OR Elimination (from i, u)
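To see how the rules of table 2.3 interlock, the proof in table 2.4 can be replayed step by step. The following is a sketch under assumptions of our own: the Domain class, the tuple encoding, and the rule functions are hypothetical names, only the rules needed for this proof are included, and each rule checks its conditions with an assertion before adding a line.

```python
def NOT(p): return ("NOT", p)
def AND(p, q): return ("AND", p, q)
def OR(p, q): return ("OR", p, q)
def IF(p, q): return ("IF", p, q)

class Domain:
    """A supposition plus the sentences derived under it."""
    def __init__(self, supposition=None, parent=None):
        self.supposition, self.parent = supposition, parent
        self.lines = [] if supposition is None else [supposition]
    def holds(self, s):
        # Classical convention: superdomain sentences hold in subdomains.
        return s in self.lines or (
            self.parent is not None and self.parent.holds(s))
    def suppose(self, s):
        return Domain(s, self)
    def add(self, s):
        self.lines.append(s)
        return s

def and_elim(dom, conj):
    assert conj[0] == "AND" and dom.holds(conj)
    dom.add(conj[1])
    dom.add(conj[2])

def and_intro(dom, p, q):
    assert dom.holds(p) and dom.holds(q)
    return dom.add(AND(p, q))

def if_intro(sub, q):
    assert sub.holds(q)
    return sub.parent.add(IF(sub.supposition, q))

def not_elim(sub, q):
    # The supposition must be NOT P; Q and NOT Q must hold in sub.
    s = sub.supposition
    assert isinstance(s, tuple) and s[0] == "NOT"
    assert sub.holds(q) and sub.holds(NOT(q))
    return sub.parent.add(s[1])

def or_elim(dom, disj, sub1, sub2, r):
    assert disj[0] == "OR" and dom.holds(disj)
    assert sub1.parent is dom and sub2.parent is dom
    assert sub1.supposition == disj[1] and sub2.supposition == disj[2]
    assert sub1.holds(r) and sub2.holds(r)
    return dom.add(r)

p, q = "p", "q"
goal = AND(IF(p, q), IF(q, p))
root = Domain()
premise = root.add(OR(AND(p, q), AND(NOT(p), NOT(q))))   # a

first = root.suppose(AND(p, q))                          # b
and_elim(first, AND(p, q))                               # c, f
if_intro(first.suppose(p), q)                            # d, e
if_intro(first.suppose(q), p)                            # g, h
and_intro(first, IF(p, q), IF(q, p))                     # i

second = root.suppose(AND(NOT(p), NOT(q)))               # j
and_elim(second, AND(NOT(p), NOT(q)))                    # k, p
sub_q = second.suppose(q)                                # l
not_elim(sub_q.suppose(NOT(p)), q)                       # m, n
if_intro(sub_q, p)                                       # o
sub_p = second.suppose(p)                                # q
not_elim(sub_p.suppose(NOT(q)), p)                       # r, s
if_intro(sub_p, q)                                       # t
and_intro(second, IF(p, q), IF(q, p))                    # u

result = or_elim(root, premise, first, second, goal)     # v
print(result == goal)
```

The two calls to suppose on root correspond to the two halves of the strategy (lines b-i and lines j-u), and or_elim succeeds only because the same sentence was reached in both.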

The new rules for Double Negation Elimination, AND Introduction, AND Elimination, and OR Elimination all seem intuitively reasonable. Double Negation Elimination enables us to rewrite a sentence with two initial negatives (i.e., NOT NOT P) as a positive sentence (P). AND Introduction permits us to add P AND Q to a domain in which we already have P and Q stated separately. AND Elimination simply restates the inference that is embodied in (7). Although the OR Elimination rule has more conditions than the others, its basic task is also quite reasonable, and we have just seen an example of its usefulness in the proof of table 2.4. Assume that we have a disjunction P OR Q that holds in a domain and we wish to show that some further sentence R follows. Then one way to establish R is (a) to show that R follows from P and (b) to show that R also follows from Q. If both of these hold, then R must follow, no matter which of P and Q is the case. OR Elimination merely formalizes this idea in terms of domains and subdomains.

Other rules in table 2.3 deserve special comment, however, since they lead to inferences that many find counterintuitive. The OR Introduction rule directly sanctions arguments from a given sentence to a disjunction of

that sentence and any other. For instance, argument (10) is deducible in any system that contains such a rule.

(10) Linda is a bank teller.
     ----------
     Linda is a bank teller OR Cynthia owns a Dell.

Although some subjects (even those without special training in logic) find arguments like (10) acceptable, many others do not (Rips and Conrad 1983). (This problem will be discussed further in the next section.)

The IF Introduction and Reductio rules also present a problem, though a more subtle one. Although these rules came in handy for the earlier proofs, they also allow us to show that some questionable arguments, such as (11), are deducible.

(11) Linda is a bank teller.
     ----------
     IF NOT Linda is a bank teller THEN Cynthia owns a Dell.

This argument has the form of one of the so-called paradoxes of implication: From any sentence one can deduce a conditional consisting of the negation of the original sentence as antecedent (if-clause) and an arbitrary sentence as consequent (then-clause). The proof of (11) proceeds as shown in (12).

(12) a. + Linda is a bank teller.                                       Premise
     b.     + NOT Linda is a bank teller.                               Supposition
     c.         + NOT Cynthia owns a Dell.                              Supposition
     d.     Cynthia owns a Dell.                                        NOT Elimination
     e. IF NOT Linda is a bank teller THEN Cynthia owns a Dell.         IF Introduction

Line d of (12) follows by Reductio (NOT Elimination) because the contradictory sentences Linda is a bank teller and NOT Linda is a bank teller both hold by assumption within the innermost subdomain whose supposition is NOT Cynthia owns a Dell.

Notice that arguments (10) and (11) are closely related since their conclusions are equivalent within the system: From Linda is a bank teller OR Cynthia owns a Dell you can prove IF NOT Linda is a bank teller THEN Cynthia owns a Dell, and from IF NOT Linda is a bank teller THEN Cynthia owns a Dell, you can prove Linda is a bank teller OR Cynthia owns a Dell. The premises of (10) and (11) are
also the same; thus, in one sense the two arguments are equivalent, even though their proofs involve different rules. Perhaps, then, we can trace the counterintuitive nature of these arguments to a common source.

Possible Modifications

The strange character of arguments (10) and (11) is evidence that the "OR" and "IF" of the system differ from the "or" and "if" of ordinary language. This discrepancy has caused controversy within linguistics, psychology, and philosophy of logic. One tactic is to defend the system of table 2.3 as an approximation of the everyday meaning of if and or, explaining away the oddity of these arguments on other grounds. The best-known recent defender of the classical system is Paul Grice, who argues in his 1989 lectures on logic and conversation that the conclusions in (10) and (11) are less informative than their premises. If you know that Linda is a bank teller, then it will be uncooperative of you to tell me only that either Linda is a bank teller or Cynthia owns a Dell. The latter statement "carries the implication of the speaker's uncertainty of which of the two it was" (Strawson 1952, p. 91). Similarly, it would be uncooperative in the same circumstances to say that if Linda is not a bank teller then Cynthia owns a Dell. On Grice's theory, people standardly use sentences containing or or if in situations where they don't already know the truth of the embedded sentences, and this is what accounts for the peculiarity of (10) and (11). On this theory there is nothing wrong with the arguments themselves; if we are playing a party game in which I have to guess the occupations of the guests, there can be no objection to your giving me the hint "Linda is a bank teller or Cynthia owns a Dell, but I won't tell you which" (or "If Linda is not a bank teller, then Cynthia owns a Dell") even if you already know that Linda is a bank teller. Hence, on Grice's account the meanings of or and if don't preclude the correctness of (10) and (11).⁴,⁵

The other reaction to the problematic arguments is to reject the parts of table 2.3 that generate them and devise an alternative system that brings OR and IF into better conformity with everyday usage. This strategy is quite common for IF (Lewis and Langford 1932; Stalnaker 1968, 1976) and for counterfactual conditionals (Lewis 1973a), and is sometimes extended to OR (Anderson and Belnap 1975). The connectives that result from these modifications can be called intensional OR and intensional IF to distinguish them from the extensional OR and IF of the table. One idea along these lines is to restrict when a sentence in one domain "holds"

within a subdomain. In the classical system sentences from a superdomain hold in all subdomains embedded beneath it, so we were able to use (12a) within the nested subdomains beginning in (12b) and (12c). We might consider, instead, allowing only specially designated sentences (for example, only those that are necessarily true) to hold within the subdomain mentioned in the IF Introduction rule, thus blocking the proof in (12). Intuitively, we can think of these subdomains as containing information about what would be the case in a state of affairs in which the supposition were true. Natural-deduction systems containing such restrictions are presented by Anderson and Belnap (1975), by Fitch (1952, chapter 3), and by Thomason (1970b).

The same restrictions will also block the derivation of (13), which seems dubious in much the same way as (11).

(13) Cynthia owns a Dell.
     ----------
     IF Linda is a bank teller THEN Cynthia owns a Dell.

As (14) shows, the conclusion follows for IF (but not for intensional IF), since the premise Cynthia owns a Dell holds within the subdomain whose supposition is Linda is a bank teller.

(14) a. + Cynthia owns a Dell.                                     Premise
     b.     + Linda is a bank teller.                              Supposition
     c. IF Linda is a bank teller THEN Cynthia owns a Dell.        IF Introduction
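The difference between the two kinds of subdomain comes down to which superdomain sentences are inherited. The sketch below makes that single choice explicit. It is only an illustration: the function name, its parameters, and the use of a set of "necessary" sentences are assumptions standing in for whatever designation an intensional system actually uses.

```python
def if_introduction(supposition, consequent, sub_lines, super_lines,
                    necessary=frozenset(), intensional=False):
    """Return IF supposition THEN consequent if the consequent holds in
    the subdomain; return None if the rule cannot be applied."""
    if intensional:
        # Only specially designated sentences carry over into the subdomain.
        inherited = [s for s in super_lines if s in necessary]
    else:
        # Classical convention: every superdomain sentence carries over.
        inherited = list(super_lines)
    holds = [supposition, *sub_lines, *inherited]
    if consequent in holds:
        return ("IF", supposition, consequent)
    return None

premise = "Cynthia owns a Dell"
supposition = "Linda is a bank teller"

classical = if_introduction(supposition, premise, [], [premise])
restricted = if_introduction(supposition, premise, [], [premise],
                             intensional=True)
print(classical)
print(restricted)
```

With the classical setting the conclusion of (13) is produced; with the restricted setting the rule finds nothing to conditionalize on, which is the blocking effect described above.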

However, if the subdomain introduced by (14b) is an intensional one, then Cynthia owns a Dell doesn't hold within it and consequently can't be used in IF Introduction to generate the conclusion. Stalnaker (1968) discusses other examples of questionable arguments that are deducible with IF but not with intensional IF. For these reasons, it seems likely that intensional IF provides a better match than IF to at least one sense of ordinary conditionals. It is an empirical question, however, whether the facts about human inference are best explained by positing a single intensional IF connective, or by positing both IF and intensional IF, or by positing some third connective that combines the advantages of both.

Other modifications could be made in the interest of a better fit to English usage or to psychological naturalness. For example, Braine (1978) and McCawley (1981) advocate extending the AND and OR rules to allow

each of these connectives to take more than two sentences as arguments. Thus, in addition to sentences of the form P1 AND P2 or P1 OR P2, we could have P1 AND P2 AND ... AND Pk or P1 OR P2 OR ... OR Pk. We could also supplement the rules in table 2.3 with others that we believe to be psychologically primitive. Consider, for instance, a rule for the so-called disjunctive syllogism that would enable us to infer Q directly from P OR Q and NOT P. Although any argument provable by means of such a rule can already be proved with the rules of table 2.3, including the disjunctive syllogism as a primitive rule may produce a proof system that corresponds more closely to human reasoning. (See Braine 1978, Johnson-Laird 1975, and Rips 1983 for proposals along these lines; see Anderson and Belnap 1975 for objections to the disjunctive syllogism.) In the chapters that follow, we will keep open the possibility of these modifications and examine subjects' judgments to see whether they warrant such a shift. Chapters 3, 4, and 6 successively alter the rules to make them both more general and more reasonable as psychological components. In the meantime, table 2.3 can usefully serve as a basic example of a natural-deduction system.

Rules for Quantifiers

In addition to arguments with sentence connectives such as AND and IF, we would like to be able to deal with arguments that depend on quantifiers (for example, all and some, which we encountered in our review of syllogism research in chapter 1). Consider (15), which repeats an argument from that chapter.

(15) All square blocks are green blocks.
     Some big blocks are square blocks.
     ----------
     Some big blocks are green blocks.

To determine whether (15) is correct, we might reason: "Some big blocks are square blocks; so take an arbitrary big square block and call it 'b'. Block b must be green, since b is square and all square blocks are green. Hence, some big blocks (b, for instance) are green, as stated in the conclusion." The strategy here is to consider an arbitrary example or instantiation of the premises, test whether this example guarantees properties mentioned in the conclusion, and then generalize to the entire conclusion. This strategy is, of course, common in mathematical reasoning. For instance, to prove a theorem we often consider an arbitrary case of the hypothesis

("Suppose c is an arbitrary vector in the null space of a matrix ..."), derive some properties of the instance ("c is orthogonal to the row space"), and finally generalize over all like entities ("Since c is arbitrary, the entire set of vectors in the null space is orthogonal ...").

Formalizing this reasoning means abandoning our practice of treating simple sentences as unanalyzed wholes. We will follow the usual procedure of taking a simple (atomic) sentence to be composed of a descriptor (a predicate) and one or more arguments. In this notation, the sentence "Adam's Rib" stars Katherine Hepburn would appear as Stars("Adam's Rib", "Katherine Hepburn"), where "Stars" is the predicate and "Adam's Rib" and "Katherine Hepburn" are its arguments. This example contains names of particular individuals (a movie and a person) as arguments, but we will also use arguments of two other types: variables and names of arbitrarily selected individuals. "Block b" and "vector c" in the preceding paragraph are examples of names of arbitrarily selected individuals. Thus, if b is an arbitrarily selected block, then Green(b) will represent the fact that b is green. We will continue to use letters from the beginning of the alphabet for these arguments, which we can call temporary names. By contrast, we will call names like "Adam's Rib" and "Katherine Hepburn" permanent names. Finally, we will employ variables in connection with quantifiers to generalize over individuals. For instance, we can express the fact that someone stars in "Adam's Rib" as (FOR SOME x) Stars("Adam's Rib", x); similarly, the sentence Everything is equal to itself will come out as (FOR ALL x) Equal(x, x), or, equivalently, (FOR ALL x) x = x. As in these examples, letters from the end of the alphabet will be used as variables. In this notation, then, argument (15) will have the form shown in (15') if we consider square-block, big-block, and green-block as single predicates.

(15') (FOR ALL x) (IF Square-block(x) THEN Green-block(x)).
      (FOR SOME x) (Big-block(x) AND Square-block(x)).
      ----------
      (FOR SOME x) (Big-block(x) AND Green-block(x)).

Formal Rules

In most natural-deduction systems, a proof of an argument with quantifiers proceeds by instantiating premises with quantifiers to get new sentences with temporary names, applying the rules for sentence connectives to the instantiations, and then generalizing to the quantifiers in the conclusion. This is much like the informal method we used in


discussing (15). We will therefore have to supplement our formal apparatus with rules for introducing and eliminating quantifiers, and table 2.5 offers one set of rules of this sort. These rules are essentially those proposed by Borkowski and Słupecki (1958) and by Suppes (1957), and are similar to the system of Copi (1954) and Kalish (1967). (See Fine 1985a,b for a comparison of these and other quantifier systems.) Combined with the rules in table 2.3, they allow us to derive exactly the arguments that are deductively correct in classical predicate logic.

Table 2.5 Inference rules for quantifiers in classical predicate logic. In the following rules, P(v) is a (possibly complex) formula containing variable v; t is a name (either temporary or permanent); a and b are temporary names.

FOR ALL Elimination
(a) If (FOR ALL v) P(v) holds in a given domain,
(b) and P(t) is the result of replacing all free occurrences of v in P(v) with t,
(c) then P(t) can be added to the domain.

FOR ALL Introduction
(a) If P(a) holds in a given domain,
(b) and a does not occur as a subscript in P(a),
(c) and a was not produced by FOR SOME Elimination,
(d) and a does not occur in any suppositions that hold in the domain,
(e) and a does not occur within the scope of (FOR ALL v) or (FOR SOME v) in P(a),
(f) and P(v) is the result of replacing all occurrences of a in P(a) by v,
(g) then (FOR ALL v) P(v) can be added to the domain.

FOR SOME Elimination
(a) If (FOR SOME v) P(v) holds in some domain,
(b) and b has not yet appeared in the proof,
(c) and a1, a2, ..., ak is a list of the temporary names (possibly empty) that appear in P(v) and that first appeared in a supposition or in an application of FOR ALL Elimination,
(d) and P(b_{a1, a2, ..., ak}) is the result of replacing all free occurrences of v in P(v) by b_{a1, a2, ..., ak},
(e) then P(b_{a1, a2, ..., ak}) can be added to the domain.

FOR SOME Introduction
(a) If P(t) holds in a given domain,
(b) and t does not occur within the scope of either (FOR ALL v) or (FOR SOME v) in P(t),
(c) and P(v) is the result of replacing all occurrences of t in P(t) with v,
(d) then (FOR SOME v) P(v) can be added to the domain.

In the new rules, P(v) represents a (possibly complex) expression containing variable v; for example, Big-block(v) AND Green-block(v). We say that v is free in this expression, since there is no quantifier to govern it. On the other hand, in the sentence (FOR ALL v) (Big-block(v) AND Green-block(v)), v is bound by the quantifier. P(t) in table 2.5 corresponds to an expression containing a name t, such as Big-block(t) AND Green-block(t), where t can be either a temporary or a permanent name. (For reasons that will be discussed shortly, we must sometimes subscript temporary names, such as a_b and a_{b1, b2}.) These rules contain a variety of conditions that we will examine after looking at an example of how the rules apply within a proof.

Given the quantifier rules, it is easy to show that the argument in (15) is deducible. In our usual proof style, the derivation looks like (16).

(16) a. + (FOR ALL x) (IF Square-block(x) THEN Green-block(x)).    Premise
     b. + (FOR SOME x) (Big-block(x) AND Square-block(x)).         Premise
     c. Big-block(b) AND Square-block(b).                          FOR SOME Elim.
     d. Big-block(b).                                              AND Elim.
     e. Square-block(b).                                           AND Elim.
     f. IF Square-block(b) THEN Green-block(b).                    FOR ALL Elim.
     g. Green-block(b).                                            IF Elim.
     h. Big-block(b) AND Green-block(b).                           AND Intro.
     i. (FOR SOME x) (Big-block(x) AND Green-block(x)).            FOR SOME Intro.
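The derivation in (16) shows that the argument in (15) is deducible. As an independent sanity check, the following minimal sketch searches small finite domains for a countermodel, that is, an interpretation making both premises true and the conclusion false. The function name and the encoding of predicates as tuples of truth values are assumptions made for illustration; a failed search over small domains is only suggestive and does not replace the proof.

```python
from itertools import product

def argument_15_holds(domain_size):
    """Return True if no interpretation over a domain of this size makes
    both premises of (15) true and its conclusion false."""
    domain = range(domain_size)
    # A predicate's extension is a tuple of booleans, one per individual.
    extensions = list(product([False, True], repeat=domain_size))
    for square, green, big in product(extensions, repeat=3):
        # (FOR ALL x) (IF Square-block(x) THEN Green-block(x))
        premise_1 = all((not square[x]) or green[x] for x in domain)
        # (FOR SOME x) (Big-block(x) AND Square-block(x))
        premise_2 = any(big[x] and square[x] for x in domain)
        # (FOR SOME x) (Big-block(x) AND Green-block(x))
        conclusion = any(big[x] and green[x] for x in domain)
        if premise_1 and premise_2 and not conclusion:
            return False  # countermodel found
    return True

# Check every interpretation over domains with one to four individuals.
results = {n: argument_15_holds(n) for n in range(1, 5)}
```

Every entry of `results` comes out true, which agrees with the existence of the proof in (16).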

The only novel aspects of the proof in (16) are the applications of FOR SOME Elimination and FOR ALL Elimination (lines c and f) and the application of FOR SOME Introduction (line i). Line b tells us that there is a big square block, which we can then represent with the temporary name "b" in line c, using FOR SOME Elimination. Line a says that all square blocks are green; so if block b is square then it is green. FOR ALL Elimination gives us this result in line f. Since block b is big and square, it must therefore be big and green (line h). Hence, according to the FOR SOME Introduction rule, there is a big, green block (line i).

Basically, the quantifier-introduction rules take us from sentences of the form P(a) to ones of the form (FOR ALL v) P(v) or (FOR SOME v) P(v). The elimination rules take us in the reverse direction. However, these rules require a number of conditions to keep the proof from lapsing into error. These restrictions are particularly important for FOR ALL Introduction and FOR SOME Elimination, since these inference patterns are reasonable only in special contexts. We can't generally conclude that everything has a given property on the grounds that an arbitrarily selected thing does (FOR ALL Introduction); similarly, we can't ordinarily go from the fact that some object has a property to the conclusion that an arbitrarily selected one does (FOR SOME Elimination).

In the case of FOR SOME Elimination, we have to ensure that the arbitrarily selected object is not one to which we have already ascribed special properties (condition b in table 2.5), and we have to specify that the object may depend on earlier choices for other temporary names (condition c). Without the former condition we would be able to prove, as in (17), that there is something that is not identical to itself from the premise that there are two things that are nonidentical.

(17)   a. + (FOR SOME x) (FOR SOME y) x ≠ y.    Premise
       b. (FOR SOME y) a ≠ y.                   FOR SOME Elim.
     * c. a ≠ a.                                FOR SOME Elim.
       d. (FOR SOME x) x ≠ x.                   FOR SOME Intro.

The asterisk indicates the incorrect line in this "proof." Condition b blocks this line because the temporary name "a" has already appeared in line b.

To understand the reason for the third condition on FOR SOME Elimination, we need to think about the relation of this rule to FOR ALL Introduction. Consider (18), which is a pseudoproof from the premise that everyone has a father to the conclusion that there is a particular person who is everyone's father.


(18)   a. + (FOR ALL x) (FOR SOME y) (IF Person(x) THEN Father(y, x)).    Premise
       b. (FOR SOME y) (IF Person(a) THEN Father(y, a)).                  FOR ALL Elim.
     * c. IF Person(a) THEN Father(b, a).                                 FOR SOME Elim.
       d. (FOR ALL x) (IF Person(x) THEN Father(b, x)).                   FOR ALL Intro.
       e. (FOR SOME y) (FOR ALL x) (IF Person(x) THEN Father(y, x)).      FOR SOME Intro.
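That the conclusion of (18) really does not follow from its premise can be confirmed with a small countermodel. The sketch below is illustrative only: the four-individual domain, and the decision to count just Lauren and James as persons so that the model stays finite, are assumptions introduced here.

```python
# Hypothetical finite model; the individuals and facts are illustrative only.
domain = ["Lauren", "Steve", "James", "Roger"]
person = {"Lauren", "James"}   # assumption: only these two satisfy Person
father = {("Steve", "Lauren"), ("Roger", "James")}   # (y, x): y is x's father

def conditional(x, y):
    """IF Person(x) THEN Father(y, x)."""
    return (x not in person) or ((y, x) in father)

# Line a of (18): (FOR ALL x) (FOR SOME y) (IF Person(x) THEN Father(y, x))
premise = all(any(conditional(x, y) for y in domain) for x in domain)

# Line e of (18): (FOR SOME y) (FOR ALL x) (IF Person(x) THEN Father(y, x))
conclusion = any(all(conditional(x, y) for x in domain) for y in domain)
```

In this model `premise` is true and `conclusion` is false, so some step in (18) must be illegitimate. The choice of y in the premise varies with x, which is exactly the dependence that the subscripting conditions are meant to record.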


Line a of (18) states that for any person x, there is some y who is x's father. So an arbitrarily selected individual a has some y as a father, as shown in line b. At this point it might seem that if an arbitrary person a has a father, then we can also give that father a temporary name (say, "b"), as in line c. But intuitively, the named father can't be just anyone; the father in question depends on the person we are dealing with (Lauren's father is Steve, James's is Roger, and so on). To mark this dependence of b on a, conditions c and d of FOR SOME Elimination require us to subscript b with a. Thus, in a correct proof, line c would appear as IF Person(a) THEN Father(b_a, a). In line d, we try to generalize over people. But since the father named by b_a depends on the previous choice of person a, this step is incorrect and is explicitly blocked by condition b of FOR ALL Introduction. So condition c of FOR SOME Elimination is motivated by the way it sets up FOR ALL Introduction.

Two conditions on FOR ALL Introduction also deserve comment (besides condition b, whose justification we have just seen). Condition c prohibits generalizing on a temporary name that FOR SOME Elimination has produced. This restriction keeps us from deriving a universal conclusion from a particular premise: in (19), the conclusion that everyone is rich from the premise that someone is rich.

(19)   a. + (FOR SOME x) Rich(x).    Premise
       b. Rich(a).                   FOR SOME Elim.
     * c. (FOR ALL x) Rich(x).       FOR ALL Intro.

Since the temporary name "a" comes from applying FOR SOME Elimination, we are barred from using it in line c of (19). Similarly, condition d forbids generalizing on a temporary name that appears in a supposition; that is, it forbids going from the supposition that there is an arbitrary object with some property to the conclusion that all objects have that property.

The remaining restrictions on FOR ALL Introduction and on the other three rules of table 2.5 are simply technical devices to ensure proper substitution of names or variables. For example, condition e of FOR ALL Introduction and condition b of FOR SOME Introduction prohibit substituting a variable for a name if that substitution results in the new variable's being accidentally bound by the wrong quantifier. If we have, for instance, (FOR SOME x) (Big-block(a) AND Green-block(x)), we don't want to generalize to (FOR ALL x) (FOR SOME x) (Big-block(x) AND Green-block(x)). The trouble here is that (FOR SOME x) already governs the inner expression Big-block(a) AND Green-block(x), so substituting x for a allows this new x to be bound by (FOR SOME x) rather than by (FOR ALL x) as desired. The region of a sentence within which a quantifier can bind a variable is called the scope of the quantifier. Thus, to eliminate the conflict, the conditions in question forbid substitution of a variable for a temporary name if that name already appears within the scope of a quantifier employing the same variable.

Possible Modifications

It might seem that the conditions on the quantifier rules block correct inferences as well as obviously faulty proofs such as (17)-(19). As was noted in chapter 1, many investigators believe that subjects in syllogism experiments interpret sentences of the form Some x are y as suggesting Some x are not y (or, equivalently, Not all x are y). Indeed, in a direct test of this argument, between 58% and 97% of subjects (across conditions) stated that it was correct (Newstead and Griggs 1983). But there is evidently no way to show that such an argument is deducible in the system of table 2.5. Although we might try to alter the quantifier rules to accommodate the argument, this is probably not the best corrective. If Some x are not y were deducible from Some x are y, then sentences like Some of the blocks are green; in fact, all of them are green would be contradictory. As Horn (1973, 1989) points out, however, such sentences are perfectly consistent. Thus, most investigators have taken the Some-to-Some-not relationship as a case of Gricean conversational implicature (see, e.g., Begg and Harris 1982; Horn 1973, 1989; Newstead and Griggs 1983; McCawley 1981). In a conversational context, it would be uncooperative of a speaker who knew that all x are y to assert only that some x are y.
Thus, the audience can normally assume that when a speaker sincerely says that some x are y (and is in a position to know whether all x are y), the speaker then believes that some x are not y. (See note 5 for a discussion of the inference from All x are y to All y are x.)

There are other facts about quantifiers, though, that do seem to show the limitations of our rules. Natural languages contain a much wider range of quantifiers than all and some, including (in English) most, many, few, a few, a lot, several, any, each, and no. We could paraphrase some of these using FOR ALL and FOR SOME, or we could formulate rules in the style of table 2.5 to handle them directly. (See Fitch 1973 for Introduction and Elimination rules for no, any, and every.) However, quantifiers such as most and more than half can't be defined in terms of FOR ALL and FOR SOME (Barwise and Cooper 1981), and valid arguments based on such quantifiers can outrun what can be captured in a natural-deduction system (or, in fact, in a mechanical inference method of any sort; see the discussion of Church's Thesis in the following chapter). Research on generalized quantifiers (Barwise 1987b; Barwise and Cooper 1981; Westerståhl 1989) indicates that these constructions involve resources of set theory for which there is no complete proof system. According to these proposals, for example, the generalized quantifier most politicians in the sentence Most politicians whine represents the set of all sets of objects that include most politicians, and the sentence as a whole will be true if the set of whiners is one of these sets. This complexity doesn't mean that people can't make inferences with such quantifiers; for example, from Most politicians whine loudly it follows intuitively that Most politicians whine. But it does suggest that the inferences that people recognize as correct will be only a selection of the valid ones, and how to describe these psychologically recognizable items is an open question. (See note 6.)

A final worry about the quantifier rules is that they sometimes make trivial inferences tediously cumbersome. For example, as (20) shows, the argument from Someone is not rich to Not everyone is rich, which seems straightforwardly correct, requires a five-line proof.

(20) a. + (FOR SOME x) NOT Rich(x).      Premise
     b.    + (FOR ALL y) Rich(y)         Supposition
     c.    NOT Rich(a)                   FOR SOME Elim.
     d.    Rich(a)                       FOR ALL Elim.
     e. NOT (FOR ALL x) Rich(x)          NOT Intro.

Although we may sometimes reason as in (20) by instantiating and then generalizing again, in many situations we seem able to move directly from one quantified expression to another (Braine and Rumain 1983). In fact, it is possible to view Aristotle's theory of syllogisms as a type of natural-deduction system comprising rules for deducing one kind of quantified expression from others in this more direct manner (Corcoran 1974; Smiley 1973). One way to attack this for psychological purposes is to supplement the rules in table 2.5 with explicit quantifier-to-quantifier rules, even though the latter are redundant (since they can be derived from the former). A rule that takes us from (FOR SOME x) NOT P(x) to NOT (FOR ALL x) P(x) might be one example of a redundant but psychologically primitive rule. Another "Aristotelian" example might be the inference from (FOR ALL x) (IF P(x) THEN Q(x)) and (FOR ALL x) (IF Q(x) THEN R(x)) to (FOR ALL x) (IF P(x) THEN R(x)). Another approach to this problem, one that we will explore in part II, is to work with a representation in which no explicit quantifiers appear and in which the work of the quantifiers is performed by names and variables. The CPL system, as formalized in the tables, provides hints about the form of a psychological model but is not the final statement of the theory.

The Place of Logic in a Theory of Reasoning

Even if we amend our natural-deduction rules in the ways contemplated in the preceding section, we may doubt that such rules are important in reasoning, or even that people use them at all. Some of the doubts have to do with computational problems that arise in incorporating rules that were originally designed for pencil-and-paper proofs. Other doubts center on the ability of the rules to predict data from psychological experiments. We will face these computational and psychological problems in later chapters, but there is a third set of theoretical difficulties that we should consider here. These difficulties threaten to show that, no matter how we tinker with rules like those in tables 2.3 and 2.5, they can never in principle explain human reasoning.

One objection of this kind, due to Harman (1986), is based on the idea that rules of reasoning and rules of proof are entirely different types of things. The former govern the way people change their beliefs in response to evidence, whereas the latter concern the way one sentence (or group of sentences) implies another. According to this view, the rules of tables 2.3 and 2.5 are rules of proof, not rules of reasoning. Of course, Jaśkowski, Gentzen, and others designed these rules as part of a proof system: a method for showing that the conclusion of an argument follows from its premises. But these logicians also intended the rules to come close to human reasoning. What is wrong, then, with supposing that they were right, that these rules are literally part of the mental equipment we use in reasoning?


Harman's main argument against taking proof rules as rules for belief change is that doing so would yield disastrous cognitive results. One of the problems is that AND Introduction and OR Introduction, if applied blindly, can produce an avalanche of completely trivial inferences. For example, from the sentences Tom is at work and Michelle is at school, it follows by AND Introduction that Tom is at work AND Michelle is at school. By a second application of AND Introduction, it follows that Tom is at work AND (Tom is at work AND Michelle is at school); by a third, we get Tom is at work AND (Tom is at work AND (Tom is at work AND Michelle is at school)); and so on. However, problems of this sort can be handled by modifications in the statement of the rules themselves, as will be shown in the next chapter.

For present purposes, a more interesting type of example is the following: Suppose you believe both that Calvin deposits 50 cents and that if he deposits 50 cents then he will get a coke. Suppose you also see nothing coming out of the coke machine and come to believe that Calvin will not get a coke. If you apply IF Elimination (modus ponens) to the first two beliefs, then you will deduce Calvin gets a coke; but you don't want to end up believing that, given what you just witnessed. Instead, you should take this result as evidence that one or more of your initial beliefs was false and cease believing it (perhaps Calvin didn't really deposit 50 cents, or perhaps it was not true that if he deposits 50 cents then he will get a coke). Hence, on Harman's account, IF Elimination can't be a rule about what sentences to believe, and therefore it is not a "rule of reasoning."

It is hard to see how any simple patching of the IF Elimination rule could escape this difficulty. Of course, we could amend the rule in table 2.3 by adding the further condition that no proposition of the form NOT Q holds in the domain in which IF P THEN Q and P hold.
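The amended rule can be sketched in a few lines. The encoding below is an assumption made for illustration (atomic sentences as strings, conditionals as ("IF", p, q) tuples, negations as ("NOT", p) tuples), and the function is not the formulation of table 2.3; it only contrasts plain IF Elimination with the guarded version on the coke-machine beliefs.

```python
def if_elimination(beliefs, guarded):
    """Apply IF Elimination once to every conditional in a belief set.

    With guarded=True, q is not added when NOT q is already believed.
    """
    derived = set()
    for belief in beliefs:
        if isinstance(belief, tuple) and belief[0] == "IF":
            _, p, q = belief
            if p in beliefs:
                if guarded and ("NOT", q) in beliefs:
                    continue  # the extra condition blocks the inference
                derived.add(q)
    return beliefs | derived

beliefs = frozenset({
    ("IF", "Calvin deposits 50 cents", "Calvin gets a coke"),
    "Calvin deposits 50 cents",
    ("NOT", "Calvin gets a coke"),
})

plain = if_elimination(beliefs, guarded=False)    # adds the contradictory belief
patched = if_elimination(beliefs, guarded=True)   # leaves the beliefs unchanged
```

The guarded call returns the original three beliefs untouched, so the outright contradiction is avoided, but nothing has been done about the inconsistency among the three beliefs themselves.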
This modified IF Elimination has the advantage that it will not produce the contradictory beliefs that Calvin does and does not get a coke in the broken coke-machine example. Unfortunately, though, directly contradictory beliefs are only part of the problem. We still need to change our beliefs about Calvin in order to get rid of the source of the inconsistency, and the only way to do that is to drop one or more of the beliefs IF P THEN Q, P, and NOT Q. Perhaps we could endow mental IF Elimination with the power to delete beliefs in situations like this, but which beliefs should it delete? We have our choice of Calvin deposits 50 cents, Calvin does not get a coke, and If Calvin deposits 50 cents, then Calvin gets a coke (or any combination of these). Furthermore, our decision ought to depend on how much support we have for one belief relative to the others, where the support relation should be able to take nondeductive factors into account. You might reasonably decide to drop the belief that Calvin deposited 50 cents on the grounds that your view of his deposit was obscured and so you didn't see exactly how much money he put in, or on the grounds that you know Calvin is forgetful and thus he may not have noticed the new price increase. A modified modus ponens that can take such factors into account is no longer much like the modus ponens we started with; it is no longer a purely deductive principle.

Harman does not imply that rules such as IF Elimination play no role in reasoning (i.e., change of belief). Principles governing change of belief could take such proof rules into account (though Harman also believes that logical rules such as those in tables 2.3 and 2.5 have no privileged position in this respect; see note 7). If you realize that your beliefs IF p THEN q and p imply q, then that is usually a reason for believing q. But implication isn't always decisive, as the preceding example shows. In other words, it is consistent with Harman's view to take the rules in the tables as psychologically real components of reasoning, even though they aren't "rules of reasoning" in his sense.

The approach just contemplated demotes IF Elimination and similar rules to a cognitive position in which they don't fully determine belief change. But there is another way of attacking Harman's problem that promotes these rules. Let us assume, with Harman, that rules of belief change include a set of heuristics for resolving recognizable inconsistencies among beliefs. Harman informally describes some heuristics of this sort: for example, "Make only minimal changes (deletions and additions) to your current beliefs" and "Don't give up beliefs that you would immediately get back again from other beliefs (e.g., by implication)." These are heuristics in the sense that they don't guarantee a way around the inconsistency, since some inconsistencies may be unresolvable for practical purposes.

Let us also suppose that these heuristics can be specified as computational procedures for actually modifying beliefs. (See Pollock 1987 and 1989 for one attempt at this; see later chapters of the present work for a discussion of related work in AI.) In particular, they could be formulated as a production system for belief revision, since any computational procedure can be so formulated. (See chapter 3 of Anderson 1976 for a proof.)


The system would consist of a set of productions of roughly the following form: IF conditions c1, c2, ..., ck are met by one's current beliefs THEN modify the beliefs through (internal) actions a1, a2, ..., am. The system applies these productions by monitoring beliefs and executing the specified actions when the conditions are satisfied. But notice that this method of applying the rules is nearly identical to IF Elimination; the only important difference is that the system executes the actions in the THEN part of the conditional rather than adding a new sentence to a proof. Thus, according to one way of formulating rules of belief change, the rules obey a simple variant of the IF Elimination proof rule. Promoting IF Elimination to the status of a general operating principle avoids Harman's problem because the rule no longer simply adds beliefs on the basis of others; it could very well discard some. What the principle happens to do depends on how it is "programmed"; that is, it depends on the contents of the individual productions.

Our promotional and demotional approaches to IF Elimination (and to the other rules in tables 2.3 and 2.5) offer us a choice as to how we should understand the role of logical rules in cognition. If Harman is right (as I believe he is), these rules are not methods for automatically changing beliefs. But should we think of them as general-purpose symbol manipulators, or as specific tests for determining when certain sentences logically imply others? On one hand, as demoted rules, they seem to have the better claim to the title "rules of reasoning." If we elevate logical rules to the level of production appliers, then the rules are quite remote from reasoning as it is ordinarily conceived. On the other hand, the hypothesis that these rules are part of our basic cognitive architecture is inviting, because it helps explain why such principles seem so deeply embedded in thinking.
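The production applier described above can be sketched as a loop that fires condition-action pairs against a set of beliefs. Everything specific in this sketch is hypothetical: the numeric support scores, the single "drop the weakest belief" production, and the cycle limit are illustrative assumptions, not Harman's heuristics or a model proposed in the text.

```python
def run_productions(beliefs, productions, max_cycles=10):
    """Fire the first production whose condition holds and whose action
    changes the beliefs; stop when nothing changes or the limit is hit."""
    beliefs = set(beliefs)
    for _ in range(max_cycles):
        for condition, action in productions:
            if condition(beliefs):
                updated = action(beliefs)
                if updated != beliefs:
                    beliefs = updated
                    break          # start a new cycle with the new beliefs
        else:
            break                  # no production changed anything
    return beliefs

COND = ("IF", "Calvin deposits 50 cents", "Calvin gets a coke")
P = "Calvin deposits 50 cents"
NOT_Q = ("NOT", "Calvin gets a coke")

# Hypothetical support scores (higher means better supported).
support = {COND: 0.8, P: 0.4, NOT_Q: 0.9}

def inconsistent(beliefs):
    """Condition: IF P THEN Q, P, and NOT Q are all currently believed."""
    return {COND, P, NOT_Q} <= beliefs

def drop_weakest(beliefs):
    """Action: delete whichever of the three beliefs has least support."""
    weakest = min((COND, P, NOT_Q), key=lambda b: support[b])
    return beliefs - {weakest}

productions = [(inconsistent, drop_weakest)]
final = run_productions({COND, P, NOT_Q}, productions)
```

With these made-up scores the applier discards the belief that Calvin deposited 50 cents. The control loop itself behaves like IF Elimination, while what it does to the beliefs is fixed entirely by the contents of the productions.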
But we needn't choose between promoting and demoting the rules, since both logic-based processing rules and lower-level reasoning principles could have separate parts to play in cognition. Imagine that a suitably modified version of the rules acts as a general-purpose mental programming system along the lines of PROLOG and other logic-based languages (Clocksin and Mellish 1981; Kowalski 1979). Such rules would generalize the production applier just discussed, extending to a wider range of logical forms than the simple IF ... THEN of production systems. In addition, the system could maintain a stock of "demoted" rules for determining implications among sentences in reasoning. Some of these rules might duplicate the "promoted" ones (perhaps being derived from them by thought experiments); others could be acquired through training in mathematics and logic. The general rules, by contrast, would be a fixed component of the individual's cognitive architecture. Although positing multiple levels of logical rules may seem strange, analogies are easy to find. Most computers, after all, have a kind of logic built into their wiring and can use this logic circuitry to run theorem-proving programs that embody their own logical principles. Of course, a multilayer system may be less parsimonious than we would like, but the psychological facts may require it.

We will pursue this two-tier approach in what follows, but the present point is simply that Harman's example should not be seen as evidence that logical rules can't be cognitive rules. Harman is right that the logical rules in the tables would produce psychological havoc if we used them to change our beliefs in a purely reflexive way. He may also be right that there must be a different species of rules that govern belief change. But there are ways that logical rules could sensibly fit in a cognitive theory, as I will try to show in parts II and III.

Reasoning and Computation


Children, who have only a little experience, are nevertheless able to understand a great deal that a skilled instructor explains to them, even if he doesn't show them anything but only describes. Therefore, it is necessary that concepts of all those many things are latent in them and arise from the few with which they are already acquainted. . . . It follows irrefutably that if somebody entered in a catalog all the primitive concepts which that child has, with a letter or character assigned to each, together with all the concepts composed of these (i.e., all the concepts which could be explained to that child without putting anything new before his eyes), he would be able to designate [all of these] with combinations of those letters or characters. . . . Thus I assert that all truths can be demonstrated about things expressible in this language with the addition of new concepts not yet expressed in it; all such truths, I say, can be demonstrated solo calculo, or solely by the manipulation of characters according to a certain form, without any labor of the imagination or effort of the mind, just as occurs in arithmetic and algebra.
Leibniz (cited in Mates 1986)

The study of deductive reasoning is the study of a psychological process. Although logical principles may be relevant to this process, they must be embodied as mental operations if they are to exert a direct causal influence on thinking. In chapter 2 we glimpsed some of the computational issues surrounding the use of logical principles as psychological procedures. For example, we noticed that even the seemingly innocent natural-deduction rule AND Introduction (p and q entail p AND q) can lead to a torrent of trivial inferences that would flood our cognitive resources. Along the same lines, consider the rule of OR Introduction in the form in which it appeared in table 2.3. This rule states that a sentence p entails p OR q for any q. Thus, from the sentence that James admires Pierre, we are entitled to conclude James admires Pierre OR Craig dislikes Julia. But the rule can apply again to this new sentence, leading to, say, (James admires Pierre OR Craig dislikes Julia) OR Simone approves of Alma. In one further application we get ((James admires Pierre OR Craig dislikes Julia) OR Simone approves of Alma) OR Gaston distrusts Leslie, and so on for as long as we care to continue. Applying the rules in an ungoverned fashion can yield an accumulation of worthless sentences.

As might be expected, this combinatorial problem is the crux of research on deduction in computer science. Any attempt to build an efficient theorem prover (or deduction-based problem solver) must find a way to guide the program toward the conclusion to be proved, avoiding runaway inferences like those of the preceding paragraph. In one of the earliest papers in computer theorem proving, Newell, Shaw, and Simon (1957) highlighted this problem in the description of what they called the British Museum Algorithm. Adapted to the rules of chapter 2 above, the algorithm goes like this:

    To prove that a given conclusion follows from a set of premises, start by writing down a list of the premises. Apply the rules to the premises, and add any new sentences they produce to the premise list. If the conclusion is in this list, then the algorithm halts and the conclusion has been proved. Otherwise, apply the rules once again to the enlarged list of sentences, adding any new items to the premise-and-derived-sentence list. Continue in this way until the conclusion is proved.

The completeness theorems of sentential and predicate logics guarantee that the British Museum algorithm will eventually come up with a proof if the argument is valid. (If the argument is invalid, the procedure may continue indefinitely.) But the computational cost of finding a proof in this way may be enormous. Newell et al. (1957) estimated that it would require hundreds of thousands of years for the computers of that time to use the British Museum algorithm to generate proofs for just the sentential theorems in chapter 2 of Whitehead and Russell's Principia Mathematica (1910-1913).

Since 1957 there have been big improvements in the speed with which programs can uncover logical proofs. Although these advances have not escaped the basic difficulty of exponential search time, they have made automatic theorem proving a practical matter for many kinds of problems. The advances have come from the discovery of more efficient proof methods and from the discovery of heuristic techniques for keeping the program on track. The LT (Logic Theory) program of Newell et al. was able to prove 38 of the first 52 theorems in the Principia through heuristic methods.
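The British Museum algorithm quoted above is easy to sketch, and doing so makes its cost visible. The version below is a minimal illustration under stated assumptions: it uses only three sentential rules (AND Introduction, AND Elimination, IF Elimination), encodes sentences as strings and tuples, and adds a `max_rounds` cap that the original description lacks, so that the search stops even when no proof turns up.

```python
from itertools import product

def apply_rules(sentences):
    """One blind pass of three sentential rules over every known sentence."""
    new = set()
    for s in sentences:
        if isinstance(s, tuple) and s[0] == "AND":      # AND Elimination
            new.update((s[1], s[2]))
        if isinstance(s, tuple) and s[0] == "IF" and s[1] in sentences:
            new.add(s[2])                                # IF Elimination
    for p, q in product(sentences, repeat=2):            # AND Introduction
        new.add(("AND", p, q))
    return new - sentences

def british_museum(premises, conclusion, max_rounds=3):
    """Grow the sentence list until the conclusion appears or the cap is hit.

    Returns (proved, sizes), where sizes records the list length per round.
    """
    derived = set(premises)
    sizes = [len(derived)]
    for _ in range(max_rounds):
        if conclusion in derived:
            return True, sizes
        derived |= apply_rules(derived)
        sizes.append(len(derived))
    return conclusion in derived, sizes

premises = [("IF", "p", "q"), "p"]
proved, sizes = british_museum(premises, ("AND", "p", "q"))
```

For this tiny problem the conclusion is found, but the sentence list grows from 2 to 7 to 52 items along the way, almost all of them pointless conjunctions. Each further round roughly squares the size of the list, which is the runaway behavior the text describes.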
Only three years after Newell et al.'s paper, Wang (1960) described a program that proved all of the 200-plus theorems in the first five chapters of the Principia in less than 3 minutes. Currently, automatic theorem provers are able to assist in answering previously open questions in mathematics. (For some examples, see chapter 9 of Wos et al. 1984.)

During the same period, the stake in finding successful theorem provers has increased, since AI investigators have turned up new ways of putting proof to use. Proof techniques have become valuable, not only in determining the correctness of arguments in symbolic logic, but also in guiding robots, parsing sentences, synthesizing computer programs, designing electronic circuits, and other problem-solving tasks. In fact, theorem proving now serves as the basis of a general AI programming language called PROLOG (short for PROgramming in LOGic), which is the chief rival to LISP (Clocksin and Mellish 1981). There is clear disagreement in the AI community about the desirability of deduction as a general problem-solving technique (see, e.g., Nilsson 1991 vs. Birnbaum 1991; see also McDermott 1987 and commentaries in the same issue of Computational Intelligence). But the current success of deduction systems places them in the top ranks of AI architectures.

How relevant is this progress to the study of human reasoning? On one hand, it seems extremely unlikely that ordinary people (without special instruction) can apply the newer techniques for speeding proofs, since these involve regimenting arguments in ways that are often unintuitive. To model the kind of deduction that subjects are able to carry out, it is necessary to forgo the more streamlined methods in this area. On the other hand, many of the procedures that Newell et al. (1957) and other researchers proposed in the early days of automatic theorem proving are still relevant to the study of natural reasoning. The first section of this chapter reviews some ways to improve on the British Museum algorithm and attempts to identify, in a preliminary way, which of them could be parts of everyday deduction. This review suggests some possible revisions to the deduction rules of the preceding chapter, revisions that help avoid the difficulties we have seen with the principles of AND Introduction and OR Introduction. The second section outlines some of these revisions, which will be developed in detail in part II. Of course, these revisions don't handle the combinatorial problem as well as do the more advanced techniques that we must sacrifice for psychological fidelity, but we can justify certain inefficiencies if they mimic the difficulties people experience when they engage in parallel tasks.

The third section of the chapter considers some AI methods for using theorem proving to solve general problems. Cognitive psychologists, for the most part, have ignored these efforts to solve problems through proof. Although theories of question answering in psycholinguistics took over a few of the main ideas (see the discussion of sentence-picture verification in chapter 1), deductive problem solving has usually been dismissed on the grounds that people are not very good at deduction. If ordinary people can't evaluate even simple syllogisms, why should we suppose that they use deduction as a means for dealing with other sorts of tasks? For example, in explaining why psychologists have usually not assumed that people remember information in the form of
The third section of the chapter considers someAI methods for using theorem proving to solve general problems. Cognitive psychologists, for the most part, have ignored theseefforts to solve problems through proof . Although theories of question answeringin psycholinguisticstook over a few of the main ideas (seethe discussionof sentence-picture verification in chapter 1), deductive problem solving has usually been dismissedon the grounds that people are not very good at ' deduction. If ordinary people can t evaluate even simple syllogisms, why should we supposethat they use deduction as a means for dealing with other sorts of tasks? For example, in explaining why psychologistshave usually not assumedthat people remember information in the form of


sentences in predicate calculus, Rumelhart and Norman (1988, p. 19) comment that the two most important reasons are "issues surrounding the organization of knowledge in memory and the notion that the logical theorem proving processes so natural to the predicate calculus formalism do not seem to capture the ways people actually seem to reason." As I have already mentioned, it is true that some methods of computer theorem proving may be out of the reach of untrained subjects. But psychologists' dismissal of theorem proving has kept them from noticing some interesting developments that narrow the gap between automated deduction and the ways "people actually reason."

Problems of Search in Deduction

Blindly applying logical rules is a poor strategy for even simple deduction problems. As long as the rules are stated in the unconstrained form of the preceding chapter (as is customary in most elementary logic textbooks), they will not automatically lead to a proof in an acceptable amount of time. In view of the potential of the rules to produce infinite sets of irrelevant sentences, finding a short proof by randomly applying rules would be a matter of sheer luck. If deductive reasoning is anything like a practical process, then there must be better ways to direct the search for a proof. The constraints that investigators in computer science have found differ in how encompassing they are. At one extreme, there are general methods that are applicable to all proof-finding situations; at the other, there are special-purpose heuristics that take advantage of particular types of rules or lines in the proof. It turns out that the heuristics are more important for psychological research, and we will concentrate on them in this chapter. However, the new general methods (resolution theorem proving and its variants) are the triumphs of automated deduction, and we should try to understand their impact.

There are some theoretical limits to what even the best general methods or heuristics can do to simplify the proof-finding process. The best known of these limits is the undecidability theorem of Church (1936a), which established that there is no decision procedure for classical predicate logic (CPL). That is, there is no procedure that, given an arbitrary argument, will necessarily stop after a finite number of steps, correctly labeling the

Reasoning and Computation

argument as valid or invalid in CPL. The notion of "procedure" employed here is general enough that we can reasonably assume it to encompass any function that is computable at all, an assumption called Church's Thesis (Church 1936b; for expositions see Boolos and Jeffrey 1974, Mendelson 1990, and Rogers 1967). Thus, the undecidability result applies to all actual computers, and presumably to humans. But although decision procedures are out of the question for CPL, there are proof procedures that will exhibit a proof of an argument if that argument is valid. Proof procedures suffice for many practical purposes, as we will see in the third section of this chapter. Further, there are parts of CPL that do have a decision procedure: for example, classical sentential logic (the logic associated with the sentential rules in table 2.3) and monadic first-order logic (CPL restricted to predicates with at most one argument). It is possible that human reasoning can make use of such decision procedures in situations that require less than the full power of CPL. These situations include almost all the arguments that have figured in psychologists' investigations of deductive reasoning: evaluation of categorical syllogisms, evaluation of sentential arguments, verification of negative sentences against pictures, and other tasks that we glimpsed in chapter 1.

The second theoretical limitation, however, concerns the computational complexity of these decision procedures. A landmark theorem due to Stephen Cook (1971) showed that the task of deciding the validity of an arbitrary sentence in classical sentential logic belongs to the class of problems termed "NP-complete." (See chapter 2 of Garey and Johnson 1979 for a clear exposition of Cook's theorem, and chapter 4 of Cherniak 1986 for a discussion of possible implications for cognitive science.)
This means that validity testing for sentential logic is equivalent in computational complexity to problems (such as the Traveling Salesman problem) for which every known algorithm requires an amount of time equal to some exponential function of the length of the problem statement.1 In particular, suppose we have a decision procedure that takes any argument as input and returns a proof of the argument if it is valid in sentential logic, stopping (with no proof) if it is invalid. If the number of symbols in the argument is n, then the amount of time consumed by such a procedure could be, in the worst case, on the order of k^n seconds (where k is a constant greater than or equal to 1). But although an exponential increase in time seems unavoidable for a decision procedure of this sort, the implications of Cook's theorem for theories of human and machine reasoning


are uncertain. First, in the case of human reasoning, we may be dealing with a deduction system that is in some ways less powerful than the full sentential logic to which Cook's theorem applies (Levesque 1988). Second, there may be decision procedures for people or computers that operate with a value of k sufficiently close to 1 that exponential complexity is no special restriction for problems of practical interest. Let us see what can be done in the way of improving proof methods.

Proof Heuristics: LT and GPS

One innovation in the Logic Theory machine of Newell et al. (1957) was to apply the system's proof rules in a backward direction. In order to spell out this idea, I will use the term assertions of a proof (at a given stage of the proof process) to mean the set consisting of axioms, premises, and all other sentences derived from the axioms and premises at that stage. The most obvious way to translate logical rules such as those in tables 2.3 and 2.5 into computational procedures is to fashion them as routines that take assertions as input and produce further assertions as output. On this pattern, Modus ponens would become a procedure that looks for assertions of the form IF P THEN Q and P and, if it finds them, adds Q to the proof. I will call such assertion-to-new-assertion routines forward rules. In LT, however, the logic routines worked backward from the conclusion in order to aim the proof in the right direction. Modus ponens in LT takes a conclusion Q as input and checks for a corresponding assertion IF P THEN Q; if such an assertion is available, the rule then tries to prove P. If it succeeds, then Q must follow. In this case, P serves as a subgoal for Q. I will call these goal-to-subgoal procedures backward rules. Evidence from students constructing geometry proofs suggests that these students employ both forward and backward rules (Anderson, Greeno, Kline, and Neves 1981).
People with greater expertise tend to rely more on forward rules and less on backward rules, both in geometry (Koedinger and Anderson 1990) and in other areas (Larkin, McDermott, Simon, and Simon 1980).

Selective Backward Rules

The forward/backward distinction and the notions of assertions and subgoals will play essential roles in the rest of this book, so it might be helpful to have a concrete example.2 Suppose we want to show that argument (1) is deducible.


(1) IF Calvin deposits 50 cents THEN Calvin gets a coke.
    IF Calvin gets a coke THEN Calvin buys a burger.
    Calvin deposits 50 cents.
    ------------------------------------------------
    Calvin buys a burger.

Using Modus ponens in the forward direction, we can combine the first and third premises in (1) to derive a new assertion: Calvin gets a coke. We can then apply the same rule to the second premise and the new assertion to prove the conclusion. This pure forward method is illustrated in the upper panel of figure 3.1, where the numbering of the assertions corresponds to the order in which the proof deals with them. Arrows indicate entailments among the sentences: Sentences 1 and 2 jointly entail 3, and sentences 3 and 4 jointly entail 5.

Compare this forward deduction strategy with the backward strategy, illustrated in the lower panel of figure 3.1. In the backward mode we start with the conclusion, which we can think of as the main goal of the proof. Goals and subgoals are represented in the figure with a question mark after the sentences; thus, Calvin buys a burger? means that the sentence Calvin buys a burger is one to be proved (and not that we are trying to prove a question). Assertions end with a period, as they did in the preceding diagram. Since the conclusion-goal matches the consequent of one of the premises (IF Calvin gets a coke THEN Calvin buys a burger), backward Modus ponens tells us that the goal follows if we can establish Calvin gets a coke. This latter sentence (the antecedent of the premise) becomes our new subgoal. To fulfill this subgoal we can employ the same backward Modus ponens method. Calvin gets a coke matches the consequent of another premise (IF Calvin deposits 50 cents THEN Calvin gets a coke); hence, we will fulfill this subgoal if we can show that the antecedent is true. This time, though, the antecedent, Calvin deposits 50 cents, is already given as a premise, and this means we have succeeded in proving the subgoal and, in turn, the conclusion of the argument.
The arrows again point from entailing to entailed sentences, but the order in which the sentences are considered is the opposite of figure 3.1a.

Newell et al. (1957) described two additional backward rules that are of interest in proving conditional sentences, and both of these take advantage of the transitivity of material conditionals (i.e., the IF ... THEN sentences within CPL). From IF P THEN Q and IF Q THEN R, it follows that IF P THEN R, an argument pattern sometimes called the


[Figure 3.1. Proof of the argument "Calvin deposits 50 cents. IF Calvin deposits 50 cents THEN Calvin gets a coke. IF Calvin gets a coke THEN Calvin buys a burger. Calvin buys a burger." Upper panel: forward strategy. Lower panel: backward strategy.]


hypothetical syllogism. Thus, in the backward mode, if we are interested in proving a conclusion of the form IF P THEN R, it is useful to find out if there are any assertions of the form IF P THEN Q or IF Q THEN R. Four outcomes are possible here:

(a) If both of the latter conditionals are assertions, the conclusion is proved.
(b) If we have an assertion IF P THEN Q, we can set up IF Q THEN R as a subgoal, since establishing that subgoal would suffice to prove the conclusion.
(c) If we have an assertion IF Q THEN R, we can set up IF P THEN Q as a subgoal, for the same reason.
(d) If neither conditional is an assertion, LT abandons the strategy.

Newell et al. referred to outcome b as "forward chaining" and to outcome c as "backward chaining," and treated them as separate rules or "methods." But since both rules operate in the backward direction (from a conclusion or main goal to a subgoal), the terminology is somewhat awkward. For uniformity, we can refer to them as the backward Transitivity rules, bearing in mind the two options (b and c) for establishing the relationship.

The empirical investigation that Newell et al. carried out with the Principia theorems convinced them that LT's strategy of working backward was a vast improvement over the British Museum algorithm. But why should this be so? At first glance, it is not clear why working backward from the conclusion should be any more of an advantage to a theorem prover than working forward from the premises. This is particularly puzzling with Modus ponens or Transitivity, since these rules don't possess the problems that we encountered with AND Introduction or OR Introduction. These latter rules produce an unlimited number of sentences when they operate in a forward direction; Modus ponens and Transitivity do not.
Once Modus ponens has produced Calvin gets a coke from IF Calvin deposits 50 cents THEN Calvin gets a coke and Calvin deposits 50 cents, it can produce no further conclusions from the same pair of sentences. (Of course, we could apply Modus ponens to these same sentences again to obtain a redundant token of Calvin gets a coke; but as long as we have some way to eliminate these duplicates, we won't have problems with runaway conclusions.) Thus, in view of the tame nature of Modus ponens and Transitivity, why should working backward help?
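The forward and backward versions of Modus ponens can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not a reconstruction of LT: conditionals are stored as (antecedent, consequent) pairs, all sentences are atomic strings, and the function names are hypothetical. The forward routine filters out duplicate assertions, and the backward routine sets up P as a subgoal only when IF P THEN Q is already an assertion.

```python
# Assumed representation: a conditional IF P THEN Q is the pair (P, Q).
CONDITIONALS = [
    ("Calvin deposits 50 cents", "Calvin gets a coke"),
    ("Calvin gets a coke", "Calvin buys a burger"),
]
FACTS = {"Calvin deposits 50 cents"}


def forward_modus_ponens(conditionals, facts):
    """Forward rule: from IF P THEN Q and P, add Q; repeat until nothing new."""
    assertions = sorted(facts)        # list keeps the order of derivation
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in conditionals:
            if antecedent in assertions and consequent not in assertions:
                assertions.append(consequent)   # redundant tokens are skipped
                changed = True
    return assertions


def backward_modus_ponens(goal, conditionals, facts, trail=None):
    """Backward rule: to prove Q, find an assertion IF P THEN Q and try P."""
    trail = [] if trail is None else trail
    trail.append(goal)                # order in which goals are considered
    if goal in facts:
        return True, trail
    for antecedent, consequent in conditionals:
        # Selective step: P becomes a subgoal only if IF P THEN Q is asserted.
        if consequent == goal and antecedent not in trail:
            proved, _ = backward_modus_ponens(antecedent, conditionals, facts, trail)
            if proved:
                return True, trail
    return False, trail


forward_order = forward_modus_ponens(CONDITIONALS, FACTS)
proved, backward_order = backward_modus_ponens("Calvin buys a burger", CONDITIONALS, FACTS)
```

On argument (1), `forward_order` runs from the deposit to the burger, whereas `backward_order` lists the same three sentences in the reverse order, which is the contrast between the two panels of figure 3.1.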

Chapter 3

The reason for the improvement is that the backward-oriented rules in LT are quite selective. The British Museum algorithm applies its rules in a forward direction, with no thought of the conclusion. The rules of LT work backward but keep the premises (and other assertions) in mind. LT's backward Modus ponens, for instance, will never attempt both P and IF P THEN Q as subgoals if it needs to deduce Q; rather, it attempts only P as a subgoal and only when it already has IF P THEN Q as an assertion. For example, we were able to get started in the backward proof of figure 3.1b because the conclusion-goal, Calvin buys a burger, was the consequent of one of the premise conditionals. Similarly, backward Transitivity never tries to prove both IF P THEN Q and IF Q THEN R as subgoals to IF P THEN R, but will attempt to prove one of them if the other is already an assertion. This assertion sensitivity means that LT will bother to apply a rule only when the assertions indicate that there is some reasonable chance of success; in a certain sense, the subgoals are ones that are already relevant to the assertions at a given stage of the proof.

Selective Forward and Hybrid Rules

This appraisal of LT's backward rules suggests that we might also incorporate selectivity in forward rules. We should be able to devise forward rules that are conclusion sensitive in a way that parallels the assertion sensitivity of the backward rules just discussed. For instance, it is easy to conceive of a forward version of AND Elimination that works as follows: Whenever P AND Q appears as an assertion, the rule checks to see whether the conclusion contains Q but not P AND Q (or P but not P AND Q). If it does, then AND Elimination is likely to be relevant to the proof of the conclusion, and the program should go ahead and apply it. If it does not, then AND Elimination is likely to be irrelevant, and the program should therefore try some other rule instead.
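A conclusion-sensitive version of AND Elimination along these lines might look as follows. The sketch assumes a hypothetical representation in which atomic sentences are strings and compound sentences are tuples such as ("AND", P, Q); the function names are illustrative only.

```python
def contains(formula, part):
    """True if `part` occurs in `formula` as a subformula (or is the formula)."""
    if formula == part:
        return True
    if isinstance(formula, tuple):
        # formula[0] is the connective; the remaining items are subformulas.
        return any(contains(sub, part) for sub in formula[1:])
    return False


def selective_and_elimination(assertion, conclusion):
    """Return the conjuncts of `assertion` worth deriving, given the conclusion."""
    if not (isinstance(assertion, tuple) and assertion[0] == "AND"):
        return []          # the rule applies only to conjunctions
    if contains(conclusion, assertion):
        return []          # the conclusion uses P AND Q as a whole
    _, left, right = assertion
    # Derive a conjunct only if the conclusion mentions it on its own.
    return [c for c in (left, right) if contains(conclusion, c)]
```

For example, with the assertion ("AND", "p", "q") and the conclusion ("OR", "q", "r"), the rule derives only q; if the conclusion is ("IF", ("AND", "p", "q"), "r"), it derives nothing.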
The advantage of restricting forward AND Elimination in this way may be minimal; in fact, it could sometimes be disadvantageous. But on balance the benefits may outweigh the costs, and the benefits may be much higher for other rules (AND Introduction, for example). Selective forward rules of this type form the basis of a theory of human deduction proposed by Osherson (1974b, 1975, 1976), which we will examine in chapter 9.

It is also possible to combine forward and backward procedures in a single rule. Hybrid rules of this sort formed the core of the program that became the successor to LT: the General Problem Solver of Newell and


Simon (1972). As its name implies, GPS was a much more ambitious program than LT, since its inventors intended it not just as a theorem prover but as a model for human problem solving in many domains. Later in the book, we will look at GPS as an empirical theory of deductive reasoning, but it is useful to mention it here because of its connection with heuristic proofs. GPS's central idea was means-ends analysis: the notion that, in order to fulfill a goal, operators should be applied to reduce the difference between the current state of knowledge and the goal state. In the context of logic problems, this meant that GPS applied rules to assertions in order to produce further assertions that were syntactically closer to the conclusion or goal sentence. For instance, if the conclusion contained a connective that did not appear in the assertions, then GPS would seek to apply a rule that would produce a new assertion having the connective in question. Thus, GPS works forward if there is nothing to deter it. However, if such a rule did not quite apply to the current assertions, GPS would propose a subgoal of producing an intermediate assertion to which the rule in question would apply.

As an example, consider how GPS might prove argument (1) using a hybrid version of Modus ponens. The conclusion of (1), Calvin buys a burger, occurs as the consequent of the second premise, IF Calvin gets a coke THEN Calvin buys a burger; thus, in order to produce the conclusion, the program should attempt to eliminate the difference by getting rid of that premise's antecedent, Calvin gets a coke. Modus ponens is obviously relevant to such a task, but the rule does not directly apply to the second premise as stated. To carry out the rule, we need to have the antecedent as a separate assertion. This suggests that we should form the subgoal of proving the antecedent.
But this sentence is itself the consequent of the first premise, IF Calvin deposits 50 cents THEN Calvin gets a coke, and hence we can derive the subgoal if we can eliminate the antecedent of this premise. Once again, Modus ponens is relevant to such a task, and this time there is no barrier to its application. The program can then generate the subgoal sentence, Calvin gets a coke, and use it to produce the conclusion.

In comparing GPS to LT, Newell and Simon (1972, p. 428) speak of GPS as working forward, and in one sense this is true. In the absence of any obstacle to applying the rules, GPS produces one assertion after another until the goal is reached. But, as the Calvin example demonstrates, whenever the assertions fail to meet the rules' conditions, subgoals are created in much the same backward-oriented manner as in LT. Indeed, the


order in which GPS considers the sentences in this example is the same as that of the pure backward proof in the lower panel of figure 3.1. We can therefore think of the GPS rules as hybrids: combinations of backward and forward search.

The point to be emphasized is that the sensitivity of the rules in LT and GPS to the assertions or to the conclusion simplifies proving by restricting the space within which the programs can look for a proof. But this selectivity comes at a high price, one that Newell et al. (1957) clearly identified: The restrictions on LT's or GPS's rules can keep the program from finding a proof for perfectly valid arguments. For example, it was impossible for LT to prove the valid sentence P OR NOT NOT NOT P from the axioms of the Principia. Proving this sentence would have required attempting as subgoals both premises of Modus ponens, something LT could not do without giving up its selectivity advantage. Thus, LT is not a complete proof procedure for classical sentential logic. It is in this sense that LT and GPS use "heuristic" methods; the rules forfeit the British Museum algorithm's guarantee of finding a proof for the chance of finding one within reasonable time bounds. It is natural to ask whether it is possible to have a procedure that takes no more time than LT but that ensures success for valid arguments.

Improved Algorithms for Proofs

In the 1950s, around the time of LT's birth, new techniques became available in logic (see, e.g., Beth 1955) that pointed the way to better proof algorithms. For sentential logic, LT's own domain, these techniques provided both a decision procedure and a proof procedure: For any argument at all, they could recognize whether or not the argument was valid and display a proof if it was. Moreover, these algorithms required only a small fraction of the effort that LT expended. The algorithms also generalized, in a natural way, to proof procedures for all of classical predicate logic (CPL).
Two of these algorithms are of special importance for theorem proving, both based on a reductio or refutation of the conclusion to be proved. We will not pursue the details of these methods, since they lead away from psychologically plausible theories. But we need to look at them briefly in order to understand their relation to the approach we will take in part II.


Tree Proofs

As we have seen, LT's backward rules had to be restricted in order to work efficiently. That is because Modus ponens and Transitivity, when applied in the backward direction, can lead to ever-more-complex subgoals: If we want to derive a conclusion Q by Modus ponens, we can try to prove as subgoals P and IF P THEN Q; to prove the former we can attempt the sub-subgoals R and IF R THEN P; to prove the latter, we can attempt the subgoals S and IF S THEN (IF P THEN Q); and so on. Exactly the same problem arises for LT's Transitivity rules.

If we could get along without Modus ponens and Transitivity and rely instead on rules leading only to subgoals less complex than the original goal, then there would be no need to impose restrictions. That is, if each subgoal produces only finitely many sub-subgoals that are less complex than itself, then production of subgoals must eventually come to a halt. Gentzen (1935/1969) laid the theoretical groundwork for such an algorithm in the same paper in which he introduced the natural-deduction systems mentioned in chapter 2 above. Gentzen's main theorem (known in English as the Cut Elimination Theorem) showed that within certain related systems, the derivations could be built up from proof lines no more complex than the derivation's final line (see also Prawitz 1965 and Ungar 1992). Wang (1960) proposed some computer implementations of Gentzen's idea that yield fairly economical proofs for at least the sentential part of CPL. Both the original theorem and the algorithm, however, require basic changes in the format of the proofs themselves, making the proofs a kind of cross between axiomatic and natural-deduction techniques. We will look instead at a related technique, called the tree method, which produces proofs that are more similar to those we have seen. We will make use of this method in chapter 4, so it is worth examining carefully. The tree method is based on a reductio ad absurdum strategy.
We start with the premises of an argument and assume the negation of the argument's conclusion. By applying certain rules to these sentences, we can deduce simpler sentences from them: ultimately, atomic sentences (sentences with no connectives or quantifiers) or negations of atomic sentences. If the procedure shows the set of sentences to be contradictory, then the premises and negated conclusion cannot be true simultaneously. Since the premises are given, the negation of the conclusion must be false, and the conclusion itself must follow from the premises. Prawitz, Prawitz, and Voghera (1960) incorporated the tree method in a theorem-proving program. A simple exposition of the tree method can be found in Jeffrey 1967, a more advanced treatment in Smullyan 1968.
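The reductio strategy can be sketched for sentential arguments as follows. This is a minimal illustration, not the program of Prawitz et al.: the decomposition rules are the standard ones for AND, OR, IF, and NOT, assumed here for the purpose of the example, and the tuple representation and function names are hypothetical. A path closes as soon as it contains an atomic sentence together with its negation.

```python
def is_literal(s):
    """A literal is an atomic sentence (a string) or the negation of one."""
    return isinstance(s, str) or (s[0] == "NOT" and isinstance(s[1], str))


def expand(s):
    """Decompose a nonliteral sentence into branches of simpler sentences."""
    op = s[0]
    if op == "AND":
        return [[s[1], s[2]]]
    if op == "OR":
        return [[s[1]], [s[2]]]                      # the path splits in two
    if op == "IF":
        return [[("OR", ("NOT", s[1]), s[2])]]
    inner = s[1]                                     # s is NOT applied to a compound
    if inner[0] == "NOT":
        return [[inner[1]]]
    if inner[0] == "AND":
        return [[("OR", ("NOT", inner[1]), ("NOT", inner[2]))]]
    if inner[0] == "OR":
        return [[("NOT", inner[1]), ("NOT", inner[2])]]
    if inner[0] == "IF":
        return [[inner[1], ("NOT", inner[2])]]
    raise ValueError(f"unknown connective in {s!r}")


def all_paths_closed(todo, literals=frozenset()):
    """True if every path contains an atomic sentence and its negation."""
    if not todo:
        return False                                 # fully decomposed, still open
    first, rest = todo[0], todo[1:]
    if is_literal(first):
        negation = first[1] if isinstance(first, tuple) else ("NOT", first)
        if negation in literals:
            return True                              # contradiction: path closed
        return all_paths_closed(rest, literals | {first})
    # New sentences go at the end of the list, once for each branch.
    return all(all_paths_closed(rest + branch, literals) for branch in expand(first))


def deducible(premises, conclusion):
    """Reductio: premises plus the negated conclusion must close every path."""
    return all_paths_closed(list(premises) + [("NOT", conclusion)])


d, c, b = "Calvin deposits 50 cents", "Calvin gets a coke", "Calvin buys a burger"
argument_1 = deducible([("IF", d, c), ("IF", c, b), d], b)
```

Here `argument_1` comes out true, whereas an invalid argument such as IF d THEN c, c, therefore d leaves a path open and is rejected.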


In the version of the tree system that we will look at, a proof begins with the premises and the negated conclusion in a vertical list. Let us call a sentence a literal if it is either an atomic sentence or the negation of an atomic sentence. If some of the sentences in the initial list are not literals, we apply one of the rules in table 3.1 to the first of them. Which rule we apply is completely determined by the form of the sentence in question. For example, if the sentence is a conjunction, we apply rule 1; if a negated conjunction, rule 2; and so on. The sentences produced by the rule are written at the end of the list we are compiling. (When two sentences appear beneath the inference line in the table, as in rules 1, 4, and 6, both sentences are written at the end of the list.) We then advance to the next nonliteral sentence and apply the corresponding rule, continuing in this way until all the nonliterals (both the original sentences and the newly produced ones) have been processed.

Most of the rules in table 3.1 are variations of ones we have already encountered. The only rule that calls for comment is the one for disjunctions, rule 3. This tells us that if one of the sentences is a disjunction, P OR Q, we must split the vertical list we are constructing into two branches, placing P at the beginning of one branch and Q at the beginning of the other. The "|" sign in the table represents this splitting. (Thus, the structure we are creating will generally be a tree structure rather than a list.) If we then apply a rule to a sentence that occurs above a branch point, we must write the sentences that the rule produces at the end of each lower branch.

Table 3.1
Rules for tree proofs. (The sentence(s) below the inference line is (are) to be placed at the end of each branch in the tree. In rule 3, the "|" indicates that each branch must split in two, with P at the head of one branch and Q at the head of the other.)

As an illustration, figure 3.2 shows a tree proof for argument (1). In the figure, the negation of the conclusion is written at the top, and then the three premises. Applying rule 5 to the first conditional produces (NOT Calvin deposits 50 cents) OR Calvin gets a coke, sentence 5 in the figure. Similarly, applying rule 5 to the second conditional premise produces sentence 6. Since these two new sentences are disjunctions, we must use rule 3 to divide the tree into branches. The first disjunction sprouts one branch headed by NOT Calvin deposits 50 cents and another headed by Calvin gets a coke. These two branches then split in turn when rule 3 is applied to the second disjunction.

A path in a tree is the set of all sentences that we encounter by starting at the top of the tree and following a route downward to one of the terminal sentences. For instance, if we take the left forks in the tree of figure 3.2, we obtain a path of sentences that consists of the six items along the "trunk" of the tree plus sentences 7 and 9. We can call a path closed if the path includes an atomic sentence and its negation; otherwise, the path is said to be open. Thus, the path just described is closed, because both Calvin deposits 50 cents and NOT Calvin deposits 50 cents occur in it. If all paths in a tree are closed, the entire tree is in a contradictory state, and the original argument is deducible. This is the case for the example tree, so (1) is deducible in the system. If any of the paths are open, then the argument is not deducible. There are some short cuts that make the tree method somewhat simpler (see Prawitz et al. 1960), but we will ignore these simplifications here.

The most important property of this method for automatic theorem proving is that each step of the proof leads to successively shorter sentences.
The rules in table 3.1 guarantee that we will eventually decompose all nonliteral sentences and that we can easily check the resulting literal sentences for contradictory pairs.

The tree method provides a simple algorithm for sentential logic that avoids the incompleteness of the heuristic approach of Newell et al. However, it has several clear drawbacks, both psychological and computational. On the psychological side, the new framework requires that all proofs assume a reductio form, which seems unintuitive for some arguments. As a computational method, the algorithm is problematic because








it can bog down badly in proving arguments in the full CPL (see Prawitz 1960). In a CPL proof, universally quantified variables must be replaced by names. In general, the number of possible instantiations that must be produced in this way is an exponential function of the number of these variables, so the proofs rapidly become unwieldy.

The Resolution Technique

Resolution theorem proving, first set out by J. A. Robinson (1965), is an attempt to overcome the multiple-instantiation problem that plagued Prawitz et al.'s (1960) algorithm and other early computational methods, such as that of Davis and Putnam (1960) and of Wang (1960). Robinson's paper was a watershed in automatic theorem proving, and much of the later progress in this field has come from refinements of the basic resolution method (see Wos and Henschen 1983 for a review). Unlike the tree method, resolution usually requires all sentences in the proof to have a special format called clausal form. First, each sentence in the argument is transformed to an equivalent one in which all quantifiers appear at the beginning of the sentence, followed by a conjunction (possibly just a single conjunct). Each conjunct in this expression must itself be a disjunction (possibly just a single disjunct) of negated or unnegated atomic formulas. For example, the sentence IF Calvin deposits 50 cents THEN Calvin gets a coke is rewritten as a single conjunct containing the disjunction (NOT Calvin deposits 50 cents) OR Calvin gets a coke. Second, all quantifiers are omitted, and existentially quantified variables are replaced by temporary names. For instance, the universally quantified sentence All square blocks are green, which we expressed earlier as (2a), is transformed to (2b).

(2) a. (FOR ALL x) (IF Square-block(x) THEN Green-block(x)).
    b. (NOT Square-block(x)) OR Green-block(x).

The x in (2b) retains its universal interpretation; (2b) asserts that any individual is either a nonsquare block or a green block.
The existentially quantified sentence Some big blocks are square goes from (3a) to (3b), where a is a temporary name in the sense of the FOR SOME Elimination rule of table 2.5.

(3) a. (FOR SOME x) (Big-block(x) AND Square-block(x)).
    b. Big-block(a) AND Square-block(a).


If an existential quantifier follows one or more universal quantifiers, then the corresponding existential variable is replaced by a temporary name with subscripts for the variables of those universals. For example, the sentence Every person has a father, which we earlier wrote as (4a), becomes (4b).

(4) a. (FOR ALL x) (FOR SOME y) (IF Person(x) THEN Father(y,x)).
    b. (NOT Person(x)) OR Father(a_x,x).

The subscript in (4b) reflects the fact that the identity of the father in question may depend on that of the person. (Person 1 may have father 1, whereasperson 2 has father 2. ) After the quantifiers have been eliminated in this way, we are left with sentencesof the form ( ~ OR . . . OR Pi ) AND ( Ql OR . . . OR Q. ) AND . . . AND ( R1 OR . . . OR R..) . We can then delete the ANDs , treating the disjunctions as separatesentencescalled clauses. Geneserethand Nilsson ( 1987) specify an algorithm for transforming an arbitrary CPL sentence into clausal form.3 Once the sentenceshave beenreexpressed , the method attempts to show that the argument is deducible through a reductio. It acceptsthe clausal forms of the premisesas givens, adds the clausal form of the negation of the conclusion, and tries to derive a contradiction from theseclauses. If it identifies such a contradiction , then the original argument must bededucible . Becauseof the uniform structure of the clauses, however, the method needsonly a single inferencerule to derive a contradiction if there is one. The rule applies to casesin which the proof contains one clause with disjunct P and a second with disjunct NOT P. That is, one clause must contain P, possibly with other disjuncts Ql' . . . , Q. ; the other must include NOT P, possibly with other disjuncts Rl ' .. . , R... In this case, the rule permits us to conclude that Ql OR . . . OR Q. OR Rl OR . . . OR R... In other words, we can combine the two original clauses, omitting the complementary pair P and NOT P. If repeatedapplications of this rule produce contradictory clauses(one clause containing P alone and another containing NOT P alone), then the reductio proof is complete and the original argument is deducible. Figure 3.3 illustrates this method applied to the argument in ( 1). Sentences 1, 4, and 6 in this figure are the premisesof the argument in clausal form, and sentence2 is the negation of the conclusion. Applying the reso-

Reasoningand Computation

~buy8 .~ . ~ buy8 (NOT ~ '1818 ' )OR I.NOT ~.oak8 ~ 1818 .oak8 a.NOT . ...m088 4.(NOT )OR ~ ~1818 .~oak8 . ~ ~...m... - I.NOT ~ ~..

m ~

Fiaure3.3 Proofof theargument

Calvindeposits SO<:cots . IF Calvindeposits SO~ ntsTHENCalvingetsa coke . IF Calvingetsa cokeTHENCalvinbuysa burger . Calvinbuysa burger . the resolution . method by


lution rule to sentences 1 and 2 yields sentence 3 (NOT Calvin gets a coke) as an additional proof line. We can then use this sentence with sentence 4 in a second application of the resolution rule to get sentence 5, NOT Calvin deposits 50 cents. However, this last sentence contradicts one of the original premises (sentence 6), eliminating this remaining pair. The argument is therefore deducible according to this method.

The proof in figure 3.3, however, is no more impressive than some of the earlier proofs we have seen of the same argument, so the advantages of resolution must lie elsewhere. As already mentioned, resolution's crucial achievement comes in its way of dealing with quantified sentences: Instead of generating arbitrary instantiations and then checking to see if they have the properties needed to complete the proof (as in the algorithm of Prawitz et al.), the resolution technique only instantiates when this is useful in applying the resolution rule. To understand the advantage of resolution, we therefore need to consider an argument that depends on quantification. We can get a taste of the advantages by considering the argument from Miranda is a person and Every person has a father to the conclusion There's someone who's Miranda's father, illustrated in (5).

(5) Person(Miranda).
    (FOR ALL x) (FOR SOME y) (IF Person(x) THEN Father(y,x)).
    ----------
    (FOR SOME z) Father(z, Miranda)

As we saw a moment ago, the clausal form of the second premise is (4b), and the negation of the conclusion reduces to NOT Father(z, Miranda). From these clauses we then get the resolution proof (6).

(6) a. (NOT Person(x)) OR Father(a_x, x)                      Premise
    b. Person(Miranda)                                        Premise
    c. (NOT Person(Miranda)) OR Father(a_Miranda, Miranda)    Instantiation of line a (substituting Miranda for x)
    d. Father(a_Miranda, Miranda)                             Resolution rule (from lines b and c)
    e. NOT Father(z, Miranda)                                 From negation of conclusion
    f. NOT Father(a_Miranda, Miranda)                         Instantiation of line e (substituting a_Miranda for z)
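The way proof (6) lets the resolution rule choose its own instantiations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reconstruction of any particular theorem prover: variables are written as strings beginning with "?", the temporary name a_x is encoded as the tuple ("a", "?x"), the function names are hypothetical, the two clauses are assumed to share no variable names, and the occurs check is omitted for brevity.

```python
def is_var(t):
    # Assumption: variables are strings that start with "?".
    return isinstance(t, str) and t.startswith("?")

def subst(t, s):
    """Apply substitution s (a dict) to a term or atomic formula."""
    if is_var(t):
        return subst(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return tuple(subst(part, s) for part in t)
    return t

def unify(a, b, s):
    """Return a substitution extending s that makes a and b identical, or None.
    No occurs check is performed (a simplification)."""
    a, b = subst(a, s), subst(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def resolve(c1, c2):
    """Yield each resolvent of two clauses, given as lists of (is_positive, atom)."""
    for i, (sign1, atom1) in enumerate(c1):
        for j, (sign2, atom2) in enumerate(c2):
            if sign1 != sign2:
                s = unify(atom1, atom2, {})
                if s is not None:
                    rest = c1[:i] + c1[i + 1:] + c2[:j] + c2[j + 1:]
                    yield [(sign, subst(atom, s)) for sign, atom in rest]

# Lines a, b, and e of proof (6).
line_a = [(False, ("Person", "?x")), (True, ("Father", ("a", "?x"), "?x"))]
line_b = [(True, ("Person", "Miranda"))]
line_e = [(False, ("Father", "?z", "Miranda"))]

line_d = next(resolve(line_a, line_b))   # instantiates x as Miranda (lines c and d)
empty = next(resolve(line_d, line_e))    # instantiates z as a_Miranda (line f)
print(line_d)
print(empty)
```

In this sketch the instantiation steps (lines c and f) never appear as separate moves: unification produces them only when a literal and its negation can be matched, and the empty clause at the end plays the role of the contradiction between lines d and f.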


Lines d and f of (6) are directly contradictory, so argument (5) must be correct. The important point is that the instantiations in lines c and f of (6) match (unify) an atomic formula from one clause with its negation in another. The attempt to set up the resolution rule guides the instantiation and thus leads directly to the contradiction.

The resolution technique clearly improves the efficiency of proofs with quantifiers, but it also takes a large step away from cognitive concerns. Indeed, Robinson (1965, pp. 23-24) was quite clear on this split:

    Traditionally, a single step in a deduction has been required, for pragmatic and psychological reasons, to be simple enough, broadly speaking, to be apprehended as correct by a human being in a single intellectual act. No doubt this custom originates in the desire that each single step of a deduction should be indubitable, even though the deduction as a whole may consist of a long chain of such steps. ... Part of the point, then, of the logical analysis of deductive reasoning has been to reduce complex inferences, which are beyond the capacity of the human mind to grasp as single steps, to chains of simpler inferences, each of which is within the capacity of the human mind to grasp as a single transaction. ... [But] when the agent carrying out the application of an inference principle is a modern computing machine, the traditional limitation on the complexity of the inference principles is no longer very appropriate.

The resolution rule is a principle of the latter (machine-oriented) sort, since "it condones single inferences that are often beyond the ability of a human to grasp (other than discursively)" (ibid., p. 24).

Non-Resolution Methods and Revisions to the Natural-Deduction Rules



Table 3.2 Revised inference rules for sentential logic.

Forward IF Elimination (modus ponens)
(a) If sentences of the form IF P THEN Q and P hold in some domain D,
(b) and Q does not yet hold in D,
(c) then add Q to D.
(d) Else, return failure.

Backward IF Introduction (Conditionalization)
(a) Set D to domain of current goal.
(b) If current goal is not of the form IF P THEN Q,
(c) then return failure.
(d) Set up a subdomain of D, D', with supposition P.
(e) Add the subgoal of proving Q in D' to the list of subgoals.

Backward NOT Elimination
(a) Set P to current goal and D to its domain.
(b) Set up a subdomain of D, D', with supposition NOT P.
(c) Add the subgoal of proving Q in D' to the list of subgoals.
(d) If the subgoal in (c) fails,
(e) then return failure.
(f) Add the subgoal of proving NOT Q in D' to the list of subgoals.

Backward NOT Introduction
(a) Set D to domain of current goal.
(b) If current goal is not of the form NOT P,
(c) then return failure.
(d) Set up a subdomain of D, D', with supposition P.
(e) Add the subgoal of proving Q in D' to the list of subgoals.
(f) If the subgoal in (e) fails,
(g) then return failure.
(h) Add the subgoal of proving NOT Q in D' to the list of subgoals.

Forward Double Negation Elimination
(a) If a sentence of the form NOT NOT P holds in some domain D,
(b) and P does not yet hold in D,
(c) then add sentence P to domain D.
(d) Else, return failure.

Forward AND Elimination
(a) If a sentence of the form P AND Q holds in some domain D,
(b) then: If P does not yet hold in D,
(c) then add sentence P to D.
(d) If Q does not yet hold in D,
(e) then add sentence Q to D.
(f) Else, return failure.

Backward AND Introduction
(a) Set D to domain of current goal.
(b) If current goal is not of the form P AND Q,
(c) then return failure.
(d) Add the subgoal of proving P in D to the list of subgoals.
(e) If the subgoal in (d) fails,
(f) then return failure.
(g) Add the subgoal of proving Q in D to the list of subgoals.

Backward OR Elimination
(a) Set P to current goal and D to its domain.
(b) If a sentence of the form R OR S does not hold in D,
(c) then return failure.
(d) Else, set up a subdomain of D, D', with supposition R.
(e) Add the subgoal of proving P in D' to the list of subgoals.
(f) If the subgoal in (e) fails,
(g) then return failure.
(h) Else, set up a subdomain of D, D'', with supposition S.
(i) Add the subgoal of proving P in D'' to the list of subgoals.

Backward OR Introduction
(a) Set D to domain of current goal.
(b) If current goal is not of the form P OR Q,
(c) then return failure.
(d) Add the subgoal of proving P in D to the list of subgoals.
(e) If the subgoal in (d) fails,
(f) then add the subgoal of proving Q in D to the list of subgoals.
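The three forward rules in the table can be illustrated with a short Python sketch. It is a minimal rendering under several assumptions that are not part of the table: sentences are nested tuples such as ("IF", P, Q), ("AND", P, Q), and ("NOT", P), atomic sentences are plain strings, there is a single domain with no subdomains, and the function name forward_closure is hypothetical. The "does not yet hold in D" conditions appear as the duplicate check before each addition, which is what brings the loop to a halt.

```python
def forward_closure(premises):
    """Apply Forward IF, Double Negation, and AND Elimination until nothing new appears."""
    domain = list(premises)
    changed = True
    while changed:
        changed = False
        for s in list(domain):
            new = []
            if isinstance(s, tuple):
                if s[0] == "IF" and s[1] in domain:
                    new.append(s[2])                 # Forward IF Elimination
                elif s[0] == "AND":
                    new += [s[1], s[2]]              # Forward AND Elimination
                elif s[0] == "NOT" and isinstance(s[1], tuple) and s[1][0] == "NOT":
                    new.append(s[1][1])              # Forward Double Negation Elimination
            for sentence in new:
                if sentence not in domain:           # "does not yet hold in D"
                    domain.append(sentence)
                    changed = True
    return domain

domain = forward_closure([
    ("AND", "p", ("NOT", ("NOT", "q"))),
    ("IF", "q", "r"),
])
print(domain)
```

Each of these rules produces only parts of sentences already in the domain, so the duplicate check alone is enough to guarantee termination in this sketch.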

that once IF Elimination has applied to a conditional, applying it again will produce no new information. Second, IF Elimination as a forward rule is psychologically natural. As soon as you recognize a conditional and its antecedent as true, it is hard to keep from considering the truth of its consequent (at least temporarily), no matter what goal you are working on at the time.4 Third, having a forward rule like this doesn't preclude having a backward version too. In the context of certain proof systems, it is sensible to include a rule like the backward Modus ponens of Newell et al. (1957) in addition to the forward version just described. There may be situations, for instance, where the system needs to prove Q, has IF P THEN Q as an assertion, but does not yet have P. In this case, it may be necessary to propose P as a subgoal in order to finish the proof. Two examples of this sort appear later in this chapter; see argument (9) and the itinerary planner.

Other obviously self-constraining rules include Double Negation Elimination and AND Elimination. Table 3.2 sets out forward versions of these rules that are similar to Forward IF Elimination; in each case, the system can hold the rules in check by making sure that there are no repeated assertions. In certain situations backward versions of these rules can also be helpful, but there is an additional difficulty in running them in this direction. For example, a backward Double Negation rule would presumably be triggered by a conclusion P and would result in a subgoal to prove NOT NOT P. Assuming that this subgoal was not readily provable by other means, the same backward rule would apply, yielding NOT NOT NOT NOT P and so on. Thus, we risk a cascade of subgoals in a backward direction that is analogous to the cascade of assertions in the forward direction from self-promoting rules. It is possible to avoid this problem by making backward rules for Double Negation Elimination and AND Elimination more selective, although this hasn't been done in table 3.2. (We will encounter such a formulation in the next chapter.) In general, forward is the preferred direction for self-constraining rules.

Self-Promoting Rules

Whereas it is possible to control self-constraining rules by simply monitoring for duplicate assertions, self-promoting rules call for more complex solutions and (usually) backward rather than forward application. In the case of AND Introduction it would not help to check for duplicate sentence tokens, because the rule can create new sentence types each time it applies. That is, once P and Q have been used to produce P AND Q, we can prevent new copies of P AND Q; but we can't prevent new sentences such as (P AND Q) AND Q, ((P AND Q) AND Q) AND Q, (((P AND Q) AND Q) AND Q) AND Q, which the rule readily generates. Moreover, arbitrarily terminating such a sequence would keep the program from forming conjunctions that it might need in certain proofs. It thus appears that the only way to keep AND Introduction from overgenerating is to make it a backward rule or a selective forward rule of the sort we considered in connection with LT and GPS.

Table 3.2 shows how AND Introduction might look as a backward rule. In this setting, AND Introduction applies to conclusions (or subgoals) of the form P AND Q and then generates the separate (sub-)subgoals P and Q. Using AND Introduction in a backward direction would itself keep the rule from producing hosts of irrelevant conjunctions, since the only conjunctions it tries to prove would be needed in the proof. Furthermore, if the program first tries to fulfill subgoal P and fails in the attempt, there is no longer any hope of proving P AND Q in this manner; it can therefore skip subgoal Q, thus avoiding some wasted effort. If subgoal P is fulfilled, then the rule succeeds if it can also fulfill subgoal Q. The rule seems psychologically natural, in that it retains the idea that proving P AND Q amounts to proving both P and Q separately, avoids specialized formats, and eliminates irrelevant assertions. The table also gives a similar backward rule for OR Introduction.
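The backward versions of AND Introduction and OR Introduction just described can be sketched as a small recursive procedure. This is a simplified illustration rather than the formulation in table 3.2: it assumes a single domain represented as a Python set of assertions, sentences encoded as nested tuples such as ("AND", P, Q), and a hypothetical prove function whose optional attempted list records each subgoal that was actually tried.

```python
def prove(goal, assertions, attempted=None):
    """Try to prove goal from assertions, working backward from the goal."""
    if attempted is not None:
        attempted.append(goal)
    if goal in assertions:
        return True
    if isinstance(goal, tuple) and goal[0] == "AND":
        # Backward AND Introduction: if the first conjunct fails, the second is skipped.
        return (prove(goal[1], assertions, attempted)
                and prove(goal[2], assertions, attempted))
    if isinstance(goal, tuple) and goal[0] == "OR":
        # Backward OR Introduction: the second disjunct is tried only if the first fails.
        return (prove(goal[1], assertions, attempted)
                or prove(goal[2], assertions, attempted))
    return False

assertions = {"p", "q"}
log = []
print(prove(("AND", ("AND", "p", "q"), "q"), assertions))   # a nested conjunction, proved on demand
print(prove(("AND", "r", "q"), assertions, log))            # fails at r
print(log)                                                  # q was never attempted
```

Because the procedure only ever decomposes the goal it is given, it builds no conjunctions beyond those the proof asks for, which is the point of running the rule backward.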


Rules Involving Subdomains

Of the rules in table 2.3, clear examples of self-constraining rules are AND, IF, and Double Negation Elimination, and equally clear examples of self-promoting rules are AND Introduction and OR Introduction. Symmetry would suggest that IF Introduction and NOT Introduction should be self-promoting, along with the rest of the Introduction rules, and that NOT Elimination and OR Elimination should be self-constraining, along with the other elimination rules. This intuition is right in the cases of IF Introduction and NOT Introduction. Allowing these rules to operate in the forward direction will produce runaway proofs, which are typical of other self-promoting rules. For example, (7) shows how to derive the sentences IF P THEN P, IF P THEN (IF P THEN P), IF P THEN (IF P THEN (IF P THEN P)), and so on, by means of a forward IF Introduction rule.

(7) a. + Linda is a bank teller.                                   Supposition
    b. IF Linda is a bank teller THEN Linda is a bank teller.      IF Intro.
    c. + Linda is a bank teller.                                   Supposition
    d. IF Linda is a bank teller THEN (IF Linda is a bank teller
       THEN Linda is a bank teller).                               IF Intro.
    e. + Linda is a bank teller.                                   Supposition
    f. IF Linda is a bank teller THEN (IF Linda is a bank teller
       THEN (IF Linda is a bank teller THEN Linda is a bank
       teller)).                                                   IF Intro.
    . . .

Since the supposition in (7a) holds trivially in the subdomain that it defines, IF Introduction allows us to derive (7b) in the superdomain. Sentence (7b), in turn, holds in the subdomain created in (7c), yielding (7d) in the superdomain, and so forth. A very similar proof will generate NOT NOT P, NOT NOT NOT NOT P, NOT NOT NOT NOT NOT NOT P, . . . from an arbitrary sentence P on the basis of a forward NOT Introduction rule. It seems correct, then, to handle IF Introduction and NOT Introduction in the same way as the other self-promoting rules, using them in the backward direction to fulfill conditional or negative goals. Table 3.2 gives one formulation of these rules.

This leaves NOT Elimination and OR Elimination. The first of these is the rule that yields P from a contradiction derived from the supposition NOT P. It might seem that this rule would be free from the problem of infinite assertions, since the output of this rule must be simpler than the supposition from which it is derived. But unless there are restrictions on the form of the suppositions, it is still possible to suppose sentences of any length and thereby obtain conclusions of any length. Proof (8) illustrates this difficulty.

(8) a. Linda is a bank teller.                              Premise
    b. + NOT NOT NOT Linda is a bank teller.                Supposition
    c. NOT Linda is a bank teller.                          Double Neg. Elim.
    d. NOT NOT Linda is a bank teller.                      NOT Elim.
    e. + NOT NOT NOT NOT NOT Linda is a bank teller.        Supposition
    f. NOT NOT NOT Linda is a bank teller.                  Double Neg. Elim.
    g. NOT NOT NOT NOT Linda is a bank teller.              NOT Elim.
    . . .

The obvious way around this problem is to treat NOT Elimination as a backward rule on a par with NOT Introduction and IF Introduction, and this is the way it appears in table 3.2. Much the same considerations show that OR Elimination is also best applied as a backward rule.

Although the rules in table 3.2 are clearly sound (i.e., will not lead from true premises to false conclusions), they will not quite do as they stand. For one thing, there are obviously valid arguments that they cannot prove. For example, there is no way in the present system to derive the conclusion of the simple argument (9).

(9) IF Calvin is hungry AND Calvin deposits 50 cents THEN Calvin gets a coke.
    Calvin is hungry.
    Calvin deposits 50 cents.
    ----------
    Calvin gets a coke.

AND Introduction and IF Elimination should be enough to prove (9), but the new directionality that we have imposed on these rules keeps them from cooperating in the right way. Apparently, we will have to introduce further rules if we want the system to handle all the deductive inferences


that people can grasp. Furthermore, some of the rules are still too unconstrained as they stand. The Backward NOT Elimination rule, for example, is applicable to any subgoal P and directs us to try to prove any pair of contradictory sentences Q and NOT Q. But unless we have some idea of which contradictory sentences might be derived from NOT P, this is surely not a strategy that we want to employ. Thus, although table 3.2 takes us part of the way toward a more tractable natural deduction, we still face residual problems of incompleteness and inefficiency. We will cope with these in part II.

Quantifiers

Natural deduction rules for quantifiers are often cumbersome, since they ordinarily demand that we apply FOR ALL Elimination and FOR SOME Elimination at the beginning of the proof to remove the quantifiers and then apply FOR ALL Introduction and FOR SOME Introduction to replace them at the end. Derivation (16) of chapter 2 provides an example of this style. Although it is possible that some set of quantifier rules of this sort will turn out to be feasible, it is worth considering other ways of handling inferences with quantifiers. Computer-based techniques offer some suggestions along these lines, since these methods generally represent quantifiers implicitly and dispense with rules for transforming them. Of course, these techniques have costs of their own; to a certain extent, they shift the burden of the quantifier rules to the process that obtains the initial representation. Nevertheless, they have the advantage of removing all the quantifiers in a single blow, including quantifiers that are deeply embedded within a premise. We have already glimpsed one method of this sort in regard to resolution proofs; however, those proofs also insist on straitjacketing other aspects of an argument's syntax. What would be desirable is a quantifier-free representation that has the expressive power of the usual CPL sentences and that stays reasonably close to surface structure.

One representation that seems to meet these requirements was proposed by Skolem (1928/1967), and it is similar to that employed in theorem-proving systems by Wang (1960), by Bledsoe, Boyer, and Henneman (1972), and by Murray (1982). It follows clausal form in eliminating quantifiers, but it doesn't insist on reducing a sentence to conjunctions of disjunctions. This format will be previewed here, since it will play an important role in later chapters.


Consider a given sentence of CPL. The sentence can contain any mix of the usual quantifiers and connectives, except for biconditionals, which must be eliminated in favor of other connectives. (For example, P IF AND ONLY IF Q can be rewritten as (IF P THEN Q) AND (IF Q THEN P).) This restriction seems palatable, since it is possible to argue that people ordinarily understand biconditional relationships as a combination of one-way conditionals. We can transform any such sentence into a logically equivalent one in prenex form, that is, with all the quantifiers appearing at the beginning. To do this, we successively move the quantifiers to positions of wider scope, using the entailments in (10)-(13). (In the a and c sentences here, x must not appear in Q; in the b and d sentences, x must not appear in P.)

(10) a. ((FOR SOME x)P AND Q)  ⇔  (FOR SOME x)(P AND Q)
     b. (P AND (FOR SOME x)Q)  ⇔  (FOR SOME x)(P AND Q)
     c. ((FOR ALL x)P AND Q)   ⇔  (FOR ALL x)(P AND Q)
     d. (P AND (FOR ALL x)Q)   ⇔  (FOR ALL x)(P AND Q)

(11) a. ((FOR SOME x)P OR Q)   ⇔  (FOR SOME x)(P OR Q)
     b. (P OR (FOR SOME x)Q)   ⇔  (FOR SOME x)(P OR Q)
     c. ((FOR ALL x)P OR Q)    ⇔  (FOR ALL x)(P OR Q)
     d. (P OR (FOR ALL x)Q)    ⇔  (FOR ALL x)(P OR Q)

(12) a. IF (FOR SOME x)P THEN Q  ⇔  (FOR ALL x)(IF P THEN Q)
     b. IF P THEN (FOR SOME x)Q  ⇔  (FOR SOME x)(IF P THEN Q)
     c. IF (FOR ALL x)P THEN Q   ⇔  (FOR SOME x)(IF P THEN Q)
     d. IF P THEN (FOR ALL x)Q   ⇔  (FOR ALL x)(IF P THEN Q)

(13) a. NOT((FOR SOME x)P)  ⇔  (FOR ALL x)(NOT P)
     b. NOT((FOR ALL x)P)   ⇔  (FOR SOME x)(NOT P)

To illustrate this transformation, take a complex sentence such as (14a).

(14) a. (FOR ALL x)(IF (FOR ALL y)(P(x,y)) THEN NOT (FOR SOME z)(R(x,z) AND Q(z))).
     b. (FOR ALL x)(IF (FOR ALL y)(P(x,y)) THEN (FOR ALL z)(NOT (R(x,z) AND Q(z)))).
     c. (FOR ALL x)(FOR SOME y)(IF P(x,y) THEN (FOR ALL z)(NOT (R(x,z) AND Q(z)))).
     d. (FOR ALL x)(FOR SOME y)(FOR ALL z)(IF P(x,y) THEN NOT (R(x,z) AND Q(z))).


According to (13a), we can move the existential quantifier (FOR SOME z) outside the scope of the negation, switching to a universal quantifier in the process. This produces the sentence in (14b). Similarly, we can move (FOR ALL y) outside the conditional, using (12c). This again requires a change in quantifiers, and it yields (14c). Finally, we extract (FOR ALL z) by means of (12d) to get the prenex form in (14d). In general, there will be more than one choice as to which quantifier to move next. For example, we could have started by moving (FOR ALL y) rather than (FOR SOME z) in (14a). This means that prenex form for a sentence is not unique. It usually makes sense, however, to prefer a prenex form in which existential quantifiers are as far to the left as possible. For a proof that any sentence of CPL can be transformed into an equivalent in prenex form, see page 87 of Mendelson 1964.

Given the transformations in (10)-(13), we can easily derive the quantifier-free form of a CPL sentence. We first make sure that each quantifier within the sentence is associated with a unique variable. For example, (FOR ALL x)(F(x)) AND (FOR SOME x)(G(x)), which is acceptable in CPL, will be replaced by the equivalent sentence (FOR ALL x)(F(x)) AND (FOR SOME y)(G(y)). Second, we determine the prenex form, following the rules above. We then delete each existential quantifier and replace the variable it binds with a new temporary name, subscripting the name with the variables of all universal quantifiers within whose scope it appears. As the last step, all universal quantifiers are deleted. In the case of (14d), we find that the only existential quantifier in the prenex form is (FOR SOME y). We must therefore replace y with a temporary name (say, a); but because (FOR SOME y) appears within the scope of the universal (FOR ALL x), the temporary name will have x as a subscript. The remaining quantifiers are universal and can be dropped, producing the final quantifier-free form shown in (15).

(15) IF P(x,a_x) THEN NOT (R(x,z) AND Q(z)).

Temporary names in this context are sometimes called Skolem constants (if unsubscripted) or Skolem functions (if subscripted). The form itself goes under the name Skolem function form (Grandy 1977) or functional normal form (Quine 1972). We will refer to it here simply as quantifier-free form. As further examples, (16)-(19) repeat some of the sentences with quantifiers from earlier parts of this chapter, along with their quantifier-free forms:


(16) a. [= (2a)] (FOR ALL x)(IF Square-block(x) THEN Green-block(x)).
     b. IF Square-block(x) THEN Green-block(x).
(17) a. [= (3a)] (FOR SOME x)(Big-block(x) AND Square-block(x)).
     b. Big-block(a) AND Square-block(a).
(18) a. [= (4a)] (FOR ALL x)(FOR SOME y)(IF Person(x) THEN Father(y,x)).
     b. IF Person(x) THEN Father(a_x, x).
(19) a. [= conclusion of (5)] (FOR SOME z) Father(z, Miranda).
     b. Father(a, Miranda).

Although the rule for going from CPL sentences to quantifier-free sentences is somewhat complex, the new sentences are not difficult to interpret; they may even have some advantages over CPL syntax in this respect. Variables continue to have universal meaning, as they did in clausal form. Temporary names stand for particular individuals, with the identity of these individuals depending on the values of any variables that subscript them. (Details of the semantics for quantifier-free form are developed in chapter 6.) For example, we can paraphrase (18b) as follows: Given any person x, there is a particular a who is the father of x. The dependencies among the quantifiers are more clearly displayed here than they are in CPL. In fact, the role of variables and temporary names in quantifier-free form is much like that of variables and symbolic constants in ordinary algebra. Algebraic equations, such as y = 3x + k, don't display their quantifiers explicitly; rather, they rely on conventions similar to ours to enable us to understand that (e.g.) there is a value of k such that, for any x, 3x + k = y.

Our quantifier-free sentences also present some benefits over clausal form. Whereas (17b) and (19b) are the same as their clausal representations, we have been able to hold onto the conditionals in going from (16a) to (16b) and from (18a) to (18b), instead of having to transform them to disjunctions as we must for their clausal counterparts.
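The two-stage procedure described above (move quantifiers outward with (10)-(13), then replace existentially bound variables by subscripted temporary names) can be sketched in Python. The sketch rests on assumptions that go beyond the text: formulas are nested tuples such as ("ALL", "x", body) and ("IF", P, Q), atomic formulas are tuples like ("P", "x", "y"), every quantifier already binds a unique variable, and a subscripted temporary name such as a_x is encoded as ("a", "x"). The function names are hypothetical, and the prenex step always extracts antecedent quantifiers before consequent ones rather than searching for the ordering that puts existentials farthest left.

```python
FLIP = {"ALL": "SOME", "SOME": "ALL"}

def prenex(f):
    """Return (prefix, matrix): a list of (quantifier, variable) pairs and a
    quantifier-free body. Assumes each quantifier binds a unique variable."""
    op = f[0]
    if op in FLIP:                                   # ("ALL", var, body) or ("SOME", var, body)
        prefix, matrix = prenex(f[2])
        return [(op, f[1])] + prefix, matrix
    if op == "NOT":                                  # (13): negation switches each quantifier
        prefix, matrix = prenex(f[1])
        return [(FLIP[q], v) for q, v in prefix], ("NOT", matrix)
    if op in ("AND", "OR", "IF"):
        left_prefix, left = prenex(f[1])
        right_prefix, right = prenex(f[2])
        if op == "IF":                               # (12a), (12c): antecedent quantifiers switch
            left_prefix = [(FLIP[q], v) for q, v in left_prefix]
        return left_prefix + right_prefix, (op, left, right)
    return [], f                                     # atomic formula

def quantifier_free(f):
    """Drop all quantifiers, replacing existential variables by temporary names."""
    prefix, matrix = prenex(f)
    universals, names = [], {}
    for quantifier, var in prefix:
        if quantifier == "ALL":
            universals.append(var)
        else:
            letter = "abcdefgh"[len(names)]          # a fresh letter per existential
            names[var] = (letter, *universals)       # subscripts: universals with wider scope
    def replace(t):
        if isinstance(t, tuple):
            return tuple(replace(part) for part in t)
        return names.get(t, t)
    return replace(matrix)

s14a = ("ALL", "x", ("IF", ("ALL", "y", ("P", "x", "y")),
                     ("NOT", ("SOME", "z", ("AND", ("R", "x", "z"), ("Q", "z"))))))
print(prenex(s14a)[0])
print(quantifier_free(s14a))
```

Applied to (14a), the sketch yields the quantifier order of (14d) and a body corresponding to (15), with ("a", "x") standing in for a_x.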
A further possible advantage to quantifier-free form is that it allows us to express relationships that are impossible to represent in standard CPL. As an example (from Barwise 1979), consider the sentence The richer the country, the more powerful one of its officials. This seems to mean that there is some mapping from each country to an official of that country such that, if country x is richer than country y, then the indicated official of x is more powerful than the official of y. We can express this reading as IF (Country(x) AND Country(y) AND Richer-than(x,y)) THEN (Official(b_x) AND Official(b_y) AND More-powerful-than(b_x,b_y)), where b is the appropriate function. What makes this impossible to express in CPL is the fact that the individual selected for the first argument of More-powerful-than(b_x,b_y) depends only on x, whereas the individual selected for the second argument depends only on y. The closest we can come in CPL is a sentence like (FOR ALL x)(FOR SOME u)(FOR ALL y)(FOR SOME v)(IF (Country(x) AND Country(y) AND Richer-than(x,y)) THEN (Official(u) AND Official(v) AND More-powerful-than(u,v))). But in this last sentence, (FOR SOME v) is within the scope of both (FOR ALL x) and (FOR ALL y) and so depends on the choices of both x and y. Moreover, any other plausible rearrangement of quantifiers leaves either u or v within the scope of both universal quantifiers. Intuitively, then, these CPL sentences can't be equivalent to the English original or to its rendering in quantifier-free form. (See Barwise 1979, Hintikka 1974, and Quine 1969 for discussions of such sentences.) This advantage of our quantifier-free notation is qualified, however, by the fact that NOT and similar operators do not apply freely to sentences such as our example (see chapter 6).

In part II we will make use of quantifier-free form to simplify a theory of mental theorem proving. That is, we will assume that people represent quantified sentences, at least for purposes of deduction, in this way. It will not be necessary to suppose, however, that in comprehending such sentences people translate natural language into standard CPL syntax and then translate again into quantifier-free form. There may well be a more direct route to the quantifier-free representation. Certainly, people could not understand the example sentence of the last paragraph in the intended way by translating via the indirect path. Still, the translation rules in (10)-(13) are handy because CPL notation is more familiar than quantifier-free form to many researchers, and the rules facilitate getting back and forth between them.

Solving Problems by Proving Theorems

The advances in automated deduction that were made in the 1960s and the 1970s suggested that theorem provers might be tools for more than


just finding derivations in logic or mathematics. Consider, for example, the problem of designing a system to answer questions about some empirical domain, such as geographical, taxonomic, or genealogical relationships. We would like the system to be able to answer as wide a range of questions about the domain as possible when given some stock of information. Although we might try to prestore in the system's memory answers to all questions that might arise, in most realistic situations there are far too many possibilities for this to be a practical alternative. Instead, we might store a basic set of relationships from the domain and have the system deduce others from the basic ones if it needs them in answering a query. Other intellectual tasks, such as planning and (some kinds of) game playing, have the same cognitive texture. A planner or a game player can't store in advance its responses to all contingencies that might develop. It has to remember some domain information, of course; but it must usually generate specific strategies as contingencies arise. Deduction might be an effective way of carrying out this generation step. Let us look at a few examples here, reserving more complicated instances of problem solving in a cognitive framework for chapter 8.

Sample Uses of Deduction

As a simple example of answering questions by deduction, suppose we have these geographical facts: In(Benghazi, Libya), In(Harare, Zimbabwe), In(Khartoum, Sudan) and also In(Libya, Africa), In(Zimbabwe, Africa), and In(Sudan, Africa). We could, of course, also store relationships like In(Benghazi, Africa), but this will be unnecessarily wasteful of memory space if the number of African cities we know is relatively large. A better approach is to derive facts like the last one from our general knowledge of the transitivity of In, which we can express by IF In(x,y) AND In(y,z) THEN In(x,z). That is, if we are asked whether Benghazi is in Africa, we can prove that the answer is "yes" by deducing In(Benghazi, Africa) from In(Benghazi, Libya), In(Libya, Africa), and the above conditional. In general, the answer should be "yes" if there is a proof of the conclusion, "no" if there is a proof of the negation of the conclusion, and "maybe" if neither can be proved. We can also answer Wh-questions, such as What city is in Sudan?, by a similar deduction procedure. In this case, we convert the question into a potential conclusion of the form In(a, Sudan) and try to prove it on the basis of the remembered information. A rule analogous to FOR SOME Introduction (table 2.5) will allow us to derive In(a, Sudan)


from In(Khartoum, Sudan); thus, by keeping track of which permanent name matched a in the proof, we can answer "Khartoum."

Deduction can also be used for simple planning if we express general constraints on plans through sentences with universal variables. For instance, imagine that we are planning an itinerary for a trip. We might have specific information about transportation from one city to a neighboring one (gleaned from railroad timetables, say), but no overall plan for the tour route. We can represent the domain-specific information as a series of statements asserting that there are transportation links between particular cities: Link(Bloomington, Champaign), Link(Champaign, Decatur), Link(Decatur, Springfield), and so on. To plan the route, we also need to represent the fact that a successful route is one that links the origin and destination cities in a chain. This we can do by means of two conditionals, as in (20).

(20) a. IF Link(x,y) THEN Route(x,y)
     b. IF Link(x,u) AND Route(u,y) THEN Route(x,y)

The first conditional asserts that there is a route between two cities if they are connected by a single link, and the second states that there is a route between two cities if there is a link from the origin to an intermediate city and a route from there to the destination. (Of course, there may be other constraints on a desirable route, such as bounds on total distance and time, but these are all we need for the purposes of the example.) We can thus derive a plan by asking our system to prove that there is a route from the origin to the destination. If we wanted to plan a route from Bloomington to Springfield, we would propose the conclusion Route(Bloomington, Springfield) and trace the derivation from the sentences in memory. A theorem prover might attempt to prove such a conclusion by a backward Modus ponens strategy. It would first try to prove Link(Bloomington, Springfield), since (20a) guarantees a route between two locations if there is a single link between them. Failing this, the theorem prover would try to establish Link(Bloomington, a) AND Route(a, Springfield) on the basis of (20b). The facts listed earlier would permit the first of these subgoals to be fulfilled by Link(Bloomington, Champaign), and the system would then attempt to prove Route(Champaign, Springfield). By applying the same strategy recursively to this Route subgoal and later ones, the system would eventually complete the proof by working backward, link by link. Reading off the intermediate cities from the proof would then give us the itinerary we sought.
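Both the geography questions and the itinerary planner can be run on one small backward-chaining procedure, sketched below in Python. The encoding is an assumption made for illustration: atoms are tuples such as ("In", "Benghazi", "Libya"), variables are strings beginning with "?", each memory sentence is a (head, body) pair with an empty body for facts, and the depth bound is an added safeguard (the transitivity conditional would otherwise let a failing search run on indefinitely). The names prove, KB, and used are hypothetical.

```python
import itertools

def is_var(t):
    return t.startswith("?")                # assumption: variables look like "?x"

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(goal, head, s):
    """Match two atoms under substitution s; return the extended substitution or None."""
    if goal[0] != head[0] or len(goal) != len(head):
        return None
    s = dict(s)
    for x, y in zip(goal[1:], head[1:]):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s

fresh = itertools.count()

def rename(rule):
    """Give the rule's variables fresh names so separate uses do not interfere."""
    n = next(fresh)
    def r(atom):
        return tuple(f"{t}_{n}" if is_var(t) else t for t in atom)
    head, body = rule
    return r(head), [r(b) for b in body]

def prove(goals, rules, s, depth, used=()):
    """Backward Modus ponens: yield (substitution, facts used) for each proof found."""
    if not goals:
        yield s, used
        return
    if depth == 0:
        return
    for rule in rules:
        head, body = rename(rule)
        s2 = unify(goals[0], head, s)
        if s2 is not None:
            step = used if body else used + (head,)
            yield from prove(body + goals[1:], rules, s2, depth - 1, step)

KB = [
    (("In", "Benghazi", "Libya"), []), (("In", "Harare", "Zimbabwe"), []),
    (("In", "Khartoum", "Sudan"), []), (("In", "Libya", "Africa"), []),
    (("In", "Zimbabwe", "Africa"), []), (("In", "Sudan", "Africa"), []),
    (("In", "?x", "?z"), [("In", "?x", "?y"), ("In", "?y", "?z")]),
    (("Link", "Bloomington", "Champaign"), []), (("Link", "Champaign", "Decatur"), []),
    (("Link", "Decatur", "Springfield"), []),
    (("Route", "?x", "?y"), [("Link", "?x", "?y")]),                        # (20a)
    (("Route", "?x", "?y"), [("Link", "?x", "?u"), ("Route", "?u", "?y")]),  # (20b)
]

yes = next(prove([("In", "Benghazi", "Africa")], KB, {}, 6), None) is not None
s, _ = next(prove([("In", "?city", "Sudan")], KB, {}, 6))
city = walk("?city", s)
_, used = next(prove([("Route", "Bloomington", "Springfield")], KB, {}, 10))
itinerary = [used[0][1]] + [link[2] for link in used]
print(yes, city, itinerary)
```

The Wh-question is answered by reading the binding of ?city out of the successful proof, and the itinerary is read off the Link facts that the proof used, in the order in which they were matched.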


Methods in Deductive Problem Solving

The idea that theorem proving could be used for planning and other problem-solving situations seems to have begun with McCarthy (1968) and has been elaborated by Kowalski (1979) and others. AI researchers who have explored these possibilities have followed one of two approaches, which line up with our earlier distinction between resolution and natural-deduction methods. The central dilemma on both approaches is that solving a problem means proving a conclusion from premises that are not usually fixed in advance. In the context of question-answering, the question corresponds to the conclusion that the system must try to prove, but the user of the system does not generally specify the premises. Any memory sentence is a potential premise; but unless the amount of stored information is small, most of this information will be irrelevant and searching through it will drastically slow the proof. One approach to this difficulty is to employ resolution methods and related strategies to speed the proof process. Another uses natural-deduction rules, but surrounds these rules with control strategies to make them more practical.

The resolution approach is illustrated by Green's (1969, 1969/1980) QA3, one of the early deduction problem solvers. This program divided sentences into two sets: a Memory set, which contained all the information in the system's database, and a Clauselist, which contained only the sentences currently participating in the resolution proof. Initially, the Clauselist included just the clausal form of the negation of the conclusion. The program then transferred sentences from Memory to the Clauselist according to resolution-based heuristics. For example, it included only Memory sentences that resolved with items already on the Clauselist, giving priority to resolutions in which one of the input sentences was atomic or was the negation of an atomic sentence. QA3 also placed bounds on the number of intermediate steps; that is, it deduced only sentences that could be obtained in fewer than a fixed number k of applications of the resolution rule. If the procedure exceeded these bounds without finding a proof, the program switched from the attempt to prove the conclusion true to an attempt to prove it false. If the second attempt also exceeded the bounds, QA3 noted that no proof had been found and gave the user the option of increasing the bounds on intermediate steps.

The second approach originated at about the same time and is exemplified by Hewitt's (1969) PLANNER. The distinctive feature of this program


(or programming language) is that, instead of encoding general relations as sentences, PLANNER encoded them as procedures. In our earlier example, we expressed the transitivity of In as a universal sentence, IF In(x,y) AND In(y,z) THEN In(x,z), and let the inference procedures decide how this fact would be used during the course of a proof. But it is also possible to specify the use of the relation directly. For instance, we could write a procedure that asserted In(x,z) whenever two statements of the form In(x,y) and In(y,z) appeared in the database. Or we could write a routine that produced the subgoals of proving In(x,y) and In(y,z) whenever In(x,z) was a goal. Essentially, the first procedure (called an "antecedent theorem" in PLANNER) has the effect of forward Modus ponens applied to the above conditional, whereas the second (a "consequent theorem") has the effect of backward Modus ponens. By expressing the transitivity relation in one of these two ways, we control how the theorem prover will use this information during a proof, rather than leaving it up to the inference rules. PLANNER also provided a facility for recommending which procedures the program should try first in proving a given goal or subgoal, thereby focusing the search for a proof.

These two approaches to solving problems by deduction have spawned separate research traditions. The resolution approach has been incorporated in the STRIPS program (Fikes and Nilsson 1971) for robot manipulation and in PROLOG, a general-purpose AI language (Clocksin and Mellish 1981). PLANNER was adopted as part of Winograd's (1972) language understander SHRDLU, and can be seen as the ancestor of CONNIVER (Sussman and McDermott 1972) and later truth-maintenance systems (de Kleer 1986; Doyle 1979; Forbus and de Kleer 1993; McAllester 1978).
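The contrast between the two procedural encodings of transitivity can be made concrete with a minimal Python sketch. It is not PLANNER syntax: the function names, the representation of In(x,y) as a pair (x, y), and the sample facts are all assumptions chosen for illustration, and the backward routine satisfies its first subgoal only by direct lookup in the database.

```python
# Facts are pairs (x, y) standing for In(x, y). Names and encoding are illustrative.

def antecedent_theorem(facts):
    """Forward use: whenever In(x, y) and In(y, z) are stored, assert In(x, z)."""
    facts = set(facts)
    changed = True
    while changed:                      # repeat until no new assertion is produced
        changed = False
        for (x, y) in list(facts):
            for (y2, z) in list(facts):
                if y == y2 and (x, z) not in facts:
                    facts.add((x, z))
                    changed = True
    return facts


def consequent_theorem(goal, facts, visited=frozenset()):
    """Backward use: to prove In(x, z), find a stored In(x, y) and set up the subgoal In(y, z)."""
    x, z = goal
    if goal in facts:                   # goal matches a stored assertion
        return True
    if goal in visited:                 # avoid looping on cyclic databases
        return False
    visited = visited | {goal}
    middles = {b for (a, b) in facts if a == x}   # candidates y with In(x, y) stored
    return any(consequent_theorem((y, z), facts, visited) for y in middles)


facts = {("cup", "cupboard"), ("cupboard", "kitchen"), ("kitchen", "house")}
forward_result = antecedent_theorem(facts)
backward_result = consequent_theorem(("cup", "house"), facts)
```

The forward routine fills the database with every derivable In fact whether or not it is needed, whereas the backward routine does work only when a goal asks for it. That is the control difference the two kinds of PLANNER theorem are meant to capture.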
It is common to view these two approaches as taking opposite stands on the issue of whether general relations, such as the transitivity of In, should be represented as sentences ("declaratively," as in QA3) or as routines ("procedurally," as in PLANNER); see Winograd 1975. This, in turn, is probably a reflection of the fact that natural-deduction systems must have more elaborate control processes than resolution systems. The proposal developed in the following chapters takes an intermediate position on this question, since it represents generalities in a declarative way but retains natural-deduction methods.

For present purposes, the differences between the QA3 and PLANNER traditions are of less interest than their shared belief in deduction as a central part of question answering, planning, and other forms of symbolic


computation. These programs achieved some notable successes, but they also met with criticism from other AI researchers. Marvin Minsky, in an appendix to his influential paper on frames (1975/1985), takes the deduction approach to task on several grounds. Minsky's most forceful comments concern the alleged inflexibility of this approach in everyday reasoning and decision making. We will return to these arguments in detail in chapter 8, where we will consider recent "nonmonotonic reasoning" systems that are intended to address them. It is possible, however, to view many of the difficulties that Minsky raised as boiling down to the idea that purely deductive systems can't account for ordinary nondeductive inference and decision making (e.g., deciding on the best time to cross a busy intersection); thus, an approach to AI based solely on deduction, with no means for nondeductive inference, will be too restrictive (see also McDermott 1987). This point is undoubtedly correct; however, for the reasons mentioned at the end of chapter 2, it doesn't preclude the hypothesis that deduction plays a superordinate role in governing other intellectual processes, including nondeductive ones. Since the research we have reviewed in psychology, logic, and AI has given us a rich base of knowledge about deduction, it seems worthwhile to explore this hypothesis in the context of both artificial and natural intelligence.




Mental Proofs and Their Formal Properties

"Much virtue in 'if'."
As You Like It (V, iv)

"Everything only connected by 'and' and 'and.' Open the book."
Elizabeth Bishop, "Over 2,000 Illustrations and a Complete Concordance"

[Part] I offered a glimpse of deduction in [some] of the forms it assumes in psychology, logic, and artificial [intelligence]. The [aim] of the following chapters is to draw on insights from these fields to shape a theory that accounts [...] no training in formal logic. [...] in part I suggests [...]-deduction [...] a good [...] the development of such a theory. Alternatives [...] as axiomatic [...] in artificial [...], though [...] have certain mathematical or computational [...], are psychologically [out] of reach for most of us. This [...] a theory [...] lines. The [...] explain data from several kinds of reasoning experiments in which subjects evaluate the deducibility of arguments or solve certain deduction puzzles.

The central [...] in the theory [...] be that of a mental [proof]. I assume that [...] a problem that calls for deduction [...] attempt to [...] it by [...] in [...] a set of sentences [...] the premises or givens of the problem to the conclusion or solution. Each link in this network [...] an inference [...] to those of table 3.2, which the individual [...] as intuitively sound. Taken together, this network of sentences then provides a bridge between the premises and the conclusion that explains why the conclusion follows.

Of course, people [...] (relative [...] system). They may not possess or may not be able to apply an inference [rule] that is crucial for a particular [...]. Resource limitations (for example, capacity restrictions in working memory) may keep them from completing a proof. They [...] even possess nonstandard [rules] of inference that lead them to conclusions not [...] in classical logic. Nevertheless, the claim is that people will at least [...] a mental [...] problem-solving process.
Chapter 4

The first sections of this chapter outline the main assumptions of the theory: the way it represents proofs in memory and the way it assembles these proofs on the basis of rules of inference. These assumptions are similar to those that I have proposed in earlier works (Rips 1983, 1984; Rips and Conrad 1983), but with a number of modifications that grew out of my experience in translating the theory into a computer model and in exploring its formal properties. The essential idea of the theory is that we can get a satisfactory account of human deduction by marrying the notion of a supposition from formal natural-deduction systems in logic with the notion of a subgoal from models of problem solving in artificial intelligence. Suppositions and subgoals are, in a way, conceptual duals, and they provide the twin columns to support the theory.

In the remainder of the chapter, I take a more abstract look at this system. I prove that when this system is given an argument to evaluate it always halts either in a state in which it has found a proof for the argument or in a state in which it has unsuccessfully tried all available subgoals. I also show that the system as outlined is incomplete with respect to classical sentential logic but can easily be extended to a complete system through the addition of two forward rules for conditionals. In a sense, these proofs locate the difference between the psychological system proposed here and standard logical systems in the way that conditional sentences are handled.

In this chapter and the next we will explore the idea of mental proof by asking how sentential proofs are produced. But of course this is only one of the challenges that a theory of deduction faces. We don't want to stop with sentential reasoning, since much of deduction's power depends on its ability to bind variables. To take advantage of this power, we need to expand the theory to handle quantification. Chapters 6 and 7 will deal with this.
However, let us first consider proof production in sentential logic, since it provides the basis for these further developments.

Overview of the Core Theory

The basic inference system consists of a set of deduction rules that construct mental proofs in the system's working memory. If we present the system with an argument to evaluate, the system will use those rules in an attempt to construct an internal proof of the conclusion from the premises. If we present the system with a group of premises and ask for entailments



of those premises, the system will use the rules to generate proofs of possible conclusions. The model comes up with a proof by first storing the input premises (and conclusion, if any) in working memory. The rules then scan these memory contents to determine whether any inferences are possible. If so, the model adds the newly deduced sentences to memory, scans the updated configuration, makes further deductions, and so on, until a proof has been found or no more rules apply. Thus, the inference routines carry out much of the work in the basic system, deciding when deductions are possible, adding propositions to working memory, and keeping the procedure moving toward a solution. In what follows, I will refer to the system as PSYCOP (short for Psychology of Proof).[1] The system exists as a PROLOG program for personal computers that includes the procedures described in this chapter and in chapter 6.

PSYCOP's strategy in evaluating arguments is to work from the outside in, using forward rules to draw implications from the premises and using backward rules to create subgoals based on the conclusion. (See chapter 3 for the forward/backward distinction.) The forward rules operate in a breadth-first way, and create a web of new assertions. The backward rules operate depth-first: PSYCOP pursues a given chain of backward reasoning until it finds assertions that satisfy the required subgoals or until it runs out of backward rules to apply. In the first case, the proof is complete, since there is a logical pathway that connects the premises to the conclusion. In the second case, PSYCOP must backtrack to an earlier choice point where some alternative subgoal presented itself and try to satisfy it instead. If all the subgoals fail, PSYCOP gives up. In situations where PSYCOP is expected to produce conclusions rather than to evaluate them, it can use only its forward rules to complete the task, since there is no conclusion-goal to trigger the backward rules.
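The scan-and-add cycle described above can be illustrated with a small Python sketch. It is not the PROLOG program itself: the tuple encoding of sentences, the function name, and the restriction to two forward rules (AND Elimination and IF Elimination) are assumptions made to keep the example short.

```python
# Sentences: atoms are strings; compounds are tuples ("and", p, q) or ("if", p, q).
# This encoding and the function name are illustrative assumptions, not PSYCOP's own.

def forward_closure(premises):
    """Apply Forward AND Elimination and Forward IF Elimination until nothing new appears."""
    memory = list(premises)              # working memory, in order of assertion
    changed = True
    while changed:                       # rescan after every round of additions
        changed = False
        for s in list(memory):           # iterate over a snapshot of memory
            derived = []
            if isinstance(s, tuple) and s[0] == "and":
                derived += [s[1], s[2]]              # AND Elimination: both conjuncts
            if isinstance(s, tuple) and s[0] == "if" and s[1] in memory:
                derived.append(s[2])                 # IF Elimination (Modus ponens)
            for d in derived:
                if d not in memory:                  # add only sentences not yet held
                    memory.append(d)
                    changed = True
    return memory


premises = [
    ("if", "Betty is in Little Rock", "Ellen is in Hammond"),
    ("and", "Phoebe is in Tucson", "Sandra is in Memphis"),
]
without_supposition = forward_closure(premises)
with_supposition = forward_closure(premises + ["Betty is in Little Rock"])
```

Because both rules only ever add subformulas of sentences already in memory, the loop must stop after finitely many rounds. Run on the two premises alone, it yields the two conjuncts but not Ellen is in Hammond; that sentence appears only once Betty is in Little Rock is also in memory.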
As an example of how the system operates, consider (1), a simple argument that it can evaluate using the rules of table 3.2.

(1) IF Betty is in Little Rock THEN Ellen is in Hammond.
    Phoebe is in Tucson AND Sandra is in Memphis.
    ------
    IF Betty is in Little Rock THEN (Ellen is in Hammond AND Sandra is in Memphis).

At the start of this problem, working memory will contain just these three sentences, as shown in figure 4.1a. The conclusion appears with a question mark to indicate its status as a goal, and the premises end with periods to show that they are assertions.

[Figure 4.1, panels a-e]
Figure 4.1 Development of PSYCOP's working-memory representation for the proof of IF Betty is in Little Rock THEN Ellen is in Hammond. Phoebe is in Tucson AND Sandra is in Memphis. IF Betty is in Little Rock THEN (Ellen is in Hammond AND Sandra is in Memphis). Solid arrows are deduction links, dashed arrows are dependency links, and double lines represent a match between a subgoal and an assertion. Sentences ending with periods are assertions; those ending with question marks are goals.

To begin, PSYCOP notices that the second premise is a conjunction and is therefore subject to Forward AND Elimination. Applying this rule creates two new sentences, Phoebe is in Tucson and Sandra is in Memphis, which it stores in working memory (figure 4.1b). (The solid and dashed arrows in the figure show how these two sentences are deduced; this will be discussed in the next section.) At this stage of the proof no other forward rules apply, but it is possible to begin some work in the backward direction. Since the conclusion (and goal) of the argument is a conditional, Backward IF Introduction is appropriate here. According to this rule (table 3.2), we should try to deduce the conclusion by setting up a new subdomain whose supposition is Betty is in Little Rock (the antecedent of the conditional conclusion) and attempting to prove Ellen is in Hammond AND Sandra is in Memphis in that subdomain. Figure 4.1c shows this supposition and the resulting subgoal in the developing memory structure. (The structure represents subdomains by means of the pattern of dashed arrows, as will be explained below.) Since we are now assuming both Betty is in Little Rock and IF Betty is in Little Rock THEN Ellen is in Hammond, the forward IF Elimination (i.e., Modus ponens) rule will automatically deduce Ellen is in Hammond (figure 4.1d). However, we must still satisfy the subgoal of proving Ellen is in Hammond AND Sandra is in Memphis. The relevant rule is, of course, Backward AND Introduction, which advises us to set up subgoals corresponding to the two halves of this conjunction. The first of these, Ellen is in Hammond, is easy to fulfill, since it matches the assertion that we have just produced. (Double lines are used in the figure to represent the match between assertion and subgoal.) The second subgoal can also be fulfilled by an earlier assertion.
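The sequence of steps just traced for argument (1) can be replayed with a short Python sketch that combines forward closure with two backward rules. This is only an approximation under stated assumptions: sentences are encoded as tuples, `close` and `prove` are hypothetical names, a subdomain is modeled simply as the current set of assertions plus the new supposition, and the bookkeeping conditions on the backward rules (such as checks for duplicate subgoals) are omitted.

```python
# Atoms are strings; compounds are ("and", p, q) or ("if", p, q). Illustrative encoding.

def close(assertions):
    """Forward AND Elimination and IF Elimination, applied to a fixed point."""
    held = set(assertions)
    while True:
        new = set()
        for s in held:
            if isinstance(s, tuple) and s[0] == "and":
                new |= {s[1], s[2]}
            if isinstance(s, tuple) and s[0] == "if" and s[1] in held:
                new.add(s[2])
        if new <= held:                  # nothing left to add
            return held
        held |= new


def prove(goal, assertions):
    """Depth-first backward search over AND Introduction and IF Introduction."""
    held = close(assertions)             # forward rules run before any backward step
    if goal in held:                     # subgoal matches an assertion
        return True
    if isinstance(goal, tuple) and goal[0] == "and":
        # Backward AND Introduction: prove each conjunct in turn
        return prove(goal[1], held) and prove(goal[2], held)
    if isinstance(goal, tuple) and goal[0] == "if":
        # Backward IF Introduction: suppose the antecedent, then prove the consequent
        return prove(goal[2], held | {goal[1]})
    return False


B, E = "Betty is in Little Rock", "Ellen is in Hammond"
P, S = "Phoebe is in Tucson", "Sandra is in Memphis"
premises = {("if", B, E), ("and", P, S)}
argument_1 = prove(("if", B, ("and", E, S)), premises)
unsupported = prove(("if", B, "Phoebe is in Memphis"), premises)
```

Each recursive call works on a strictly smaller goal, so the search always halts. A fuller model would also need backtracking among alternative backward rules, which this two-rule sketch never has to do.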
Satisfying these two subgoals satisfies the conjunction, and this in turn satisfies the main goal of the problem. Thus, figure 4.1e is a complete proof of argument (1).

PSYCOP's proof of argument (1) is more direct than is often the case when it evaluates arguments. One reason for this is that the set of rules in table 3.2 is quite small, so it is clear which rule PSYCOP should apply at any stage of the proof. In fact, as was noted above, the rules in the table are too limited and would keep the model from proving many arguments that people find obviously correct (e.g., argument (9) of chapter 3). But adding further rules complicates the search for a proof, since it makes it easier for the model to run into dead ends by following subgoals that it cannot



satisfy. With a richer set of inference routines, PSYCOP can be faced with a choice of which backward rule to try first, where a poor choice can mean wasted effort. For these reasons, PSYCOP contains some heuristics to point along the most promising path.

PSYCOP's core system fulfills some of the criteria that were discussed in chapter 1 for an adequate deduction theory in psychology. The productivity of the system, in particular, is ensured by the productivity of the inference rules and their modes of deployment. Even with the limited rule set in table 3.2, PSYCOP can prove an infinite set of theorems in sentential logic. Of course, we also need to be careful that the model isn't too productive, generating irrelevant inferences when it should be focusing more narrowly on the task at hand. The combination of forward and backward rules, however, helps eliminate irrelevancies while ensuring that the model will be able to derive needed sentences. Rules such as IF Introduction also allow PSYCOP to manipulate suppositions in order to advance its proof of complex arguments. This characteristic matches subjects' use of suppositions in some of our earlier examples (see tables 1.1 and 1.2). In figure 4.1, for instance, the system used the supposition Betty is in Little Rock as a crucial element in the proof. The memory links shown in the figure provide a way of keeping track of these suppositions and the sentences that depend on them.

Core Assumptions

Assumptions about Memory

The PSYCOP model possesses a standard memory architecture that is divided into long-term and working-memory components. (The distinction between the two memory systems will become important when we consider PSYCOP's ability to accomplish cognitive tasks that require long-term memory search.) The two systems are similar in structure, the main difference between them being working memory's smaller capacity.
Both memory systems contain internal sentences connected by labeled links, as in earlier memory models proposed by Anderson (1976, 1983), Collins and Loftus (1975), and others. We can examine the main features of both memory systems by returning to the working-memory proof in figure 4.1. The links connecting memory sentences are probably of a large variety of types, but there are two that are especially relevant to the deduction



process. Let us call them deduction links and dependency links. The [...] links, represented as solid arrows in the figure, run [...] sentences [...] entail [...]. For example, in figures 4.1b-4.1e [...] is a deduction [...] Tucson, indicating that the [...] sentence is [...] the former in a single [...] step. Each [...] is the product of a particular rule (AND Elimination, in this example), and I will sometimes tag the links [with] the name of that rule. In many cases two sentences combine to produce a third. In the present [...], the sentence [Ellen is] in Hammond is deduced by IF Elimination from both IF Betty is in Little Rock THEN Ellen is in Hammond and Betty is in Little Rock. To indicate that sentences like these are jointly responsible for an inference, I use an arc to connect the deduction links emanating from them, as in figures 4.1d and 4.1e. Thus, deduction links give us a record of the individual steps in a derivation, each link (or set of arc-connected links) corresponding to the application of a single deduction rule.[2]

The dependency links in memory (dashed lines) represent the way the sentences depend on premises or suppositions in the mental proof. In the natural-deduction systems that we surveyed in previous chapters, the same sort of dependency was captured by indenting sentences within their subdomains. The deductive status of any of these indented sentences depends on the supposition of the subdomain in which it appears, as well as on the suppositions or premises of any superdomains in which it is embedded. For example, in our old system a natural-deduction proof of argument (1) would look like (2).

(2) a. IF Betty is in Little Rock THEN Ellen is in Hammond.        Premise
    b. Phoebe is in Tucson AND Sandra is in Memphis.               Premise
    c. Phoebe is in Tucson.                                        AND Elim.
    d. Sandra is in Memphis.                                       AND Elim.
    e.    + Betty is in Little Rock.                               Supposition
    f.      Ellen is in Hammond.                                   IF Elim.
    g.      Ellen is in Hammond AND Sandra is in Memphis.          AND Intro.
    h. IF Betty is in Little Rock THEN
       (Ellen is in Hammond AND Sandra is in Memphis).             IF Intro.


In the context of this proof, Ellen is in Hammond AND Sandra is in Memphis in line g is ultimately derived from the supposition Betty is in Little Rock and the two premises of the argument. If we were to withdraw the supposition or premises, then we could no longer derive line g. I indicate this in the memory representation by directing three dependency links from the supposition and premises to Ellen is in Hammond AND Sandra is in Memphis in figure 4.1e. The conclusion of the argument in line h, however, depends only on the premises; in the language of formal natural-deduction systems, the supposition is "discharged" when the conclusion is drawn. The corresponding sentence in the figure therefore has incoming dependency links from the premises alone. In general, the indented sentences in (2) are those that receive a dependency link from the supposition Betty is in Little Rock, whereas the unindented sentences are those that are not linked to that supposition. The dependency links thus partition the proof sentences in the same way as the indentation, marking the same distinction between superdomain and subdomain. This is important because subdomain sentences usually do not hold in their superdomains, and confusion about a sentence's domain could lead to logical inconsistencies in memory.

PSYCOP's dependency links also fulfill a function similar to that of truth-maintenance systems in AI research (de Kleer 1986; Doyle 1979; Forbus and de Kleer 1993; McAllester 1978). These systems monitor the relation between sentences in a data base and the assumptions from which the sentences have been derived. Roughly speaking, they ensure that the currently believed sentences rest on logically consistent assumptions, identifying any assumptions that lead to contradictions in the data base (see chapter 8). This provides the system with a way of returning directly to problematic assumptions and revising them, rather than reviewing in chronological order all the assumptions that the system has made.
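One way to picture the bookkeeping that dependency links perform is to store, with each sentence, the set of premises and suppositions on which it rests. The Python sketch below does this for proof (2). The class name, method names, and abbreviated sentence strings are assumptions for illustration; the sketch models neither PSYCOP's actual memory structures nor any particular truth-maintenance system.

```python
class WorkingMemory:
    """Maps each sentence to the premises and suppositions it depends on."""

    def __init__(self):
        self.deps = {}                   # sentence -> frozenset of assumptions

    def assume(self, sentence):
        """A premise or supposition depends only on itself."""
        self.deps[sentence] = frozenset([sentence])

    def derive(self, sentence, parents, discharge=None):
        """A deduced sentence inherits the dependencies of its parents.

        `discharge` names a supposition that the rule removes (IF Introduction).
        """
        support = frozenset().union(*(self.deps[p] for p in parents))
        if discharge is not None:
            support = support - {discharge}
        self.deps[sentence] = support

    def withdraw(self, assumption):
        """Drop the assumption and every sentence that rests on it."""
        self.deps = {s: d for s, d in self.deps.items() if assumption not in d}


# Lines a-h of proof (2), with abbreviated sentence labels.
a, b = "IF Betty THEN Ellen", "Phoebe AND Sandra"
c, d, e = "Phoebe", "Sandra", "Betty"
f, g, h = "Ellen", "Ellen AND Sandra", "IF Betty THEN (Ellen AND Sandra)"

wm = WorkingMemory()
wm.assume(a)
wm.assume(b)
wm.derive(c, [b])                        # AND Elim.
wm.derive(d, [b])                        # AND Elim.
wm.assume(e)                             # Supposition
wm.derive(f, [a, e])                     # IF Elim.
wm.derive(g, [f, d])                     # AND Intro.
wm.derive(h, [g], discharge=e)           # IF Intro. discharges the supposition

g_support = wm.deps[g]
h_support = wm.deps[h]
wm.withdraw(e)                           # retract the supposition
```

In this sketch line g ends up supported by the supposition and both premises, while line h is supported by the premises alone, matching the pattern of dependency links described for figure 4.1e. Withdrawing the supposition removes lines e, f, and g but leaves the conclusion in place.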
Truth-maintenance systems also provide a record of the sentences for which each assumption is responsible; thus, if an assumption must be withdrawn, all the dependent sentences can be withdrawn too. This facility is important for PSYCOP in problem-solving situations where it must make arbitrary choices of suppositions that may later turn out to be incorrect.

In earlier versions of the deduction system (Rips 1983, 1984), working memory was divided into two parts, one concerned with assertions and the other with subgoals. Each of these two parts contained separate segments corresponding to subdomains in the proof. To indicate that a



particular subgoal was satisfied under a given set of suppositions, links connected segments in the subgoal structure to corresponding segments in the assertion structure. The present revision simplifies memory by merging these two parts. The subgoal or assertion status of a sentence is now part of the sentence's representation (the periods and question marks in figure 4.1), so there is no need to store them in separate memory locations. The newer method also eliminates some duplication of sentences between structures and therefore achieves a more compact proof.[3] Notice, too, that the dependency links make it unnecessary to place subdomains in different parts of memory. Because there seem to be no empirical data that would favor the earlier representation over the newer one, we settle on the revised version on grounds of simplicity.

We assume, of course, that working memory has a limited capacity. If mental proofs exceed this capacity, a person will usually have to recompute and recode information in order to avoid making mistakes. This is in line with demonstrations by Hitch and Baddeley (1976) and by Gilhooly et al. (1993) that a working-memory load from an unrelated task can produce errors on deduction problems. One of the initial examples in chapter 1 above (table 1.2) is evidence for this forgetting effect.

Assumptions about Inference Routines

The core system consists of inference routines from table 3.2, supplemented by additional rules that are intended to reflect people's primitive inference patterns. The additional rules are of two types: forward rules that capture intuitively obvious deductions and backward versions of some of the previously specified forward rules. We must also consider how the system should control these rules, since (as we saw in chapter 3) some of them would cause problems if applied blindly.

Additional Forward Rules

Exactly which inferences are immediate or primitive is an empirical matter, and so it is not possible to enumerate them before the evidence is in.
Nevertheless, theoretical considerations and previous experiments suggest some plausible candidates for rules to add to PSYCOP's repertoire. Table 4.1 lists some of them (along with the earlier forward rules), in their conventional argument form (on the left side of the table) and then in the form of inference routines (on the right). These rules come from the theories of Braine, Reiser, and Rumain (1984), Johnson-Laird (1975), Osherson (1975), Rips (1983), Sperber and Wilson (1986), and



Table 4.1 PSYCOP's forward routines.

Forward IF Elimination
    IF P THEN Q
    P
    ------
    Q
  (a) If a sentence of the form IF P THEN Q holds in some domain D,
  (b) and P holds in D,
  (c) and Q does not yet hold in D,
  (d) then add Q to D.

Forward AND Elimination
    P AND Q          P AND Q
    ------           ------
    P                Q
  (a) If a sentence of the form P AND Q holds in some domain D,
  (b) then if P does not yet hold in D,
  (c) then add P to D,
  (d) and if Q does not yet hold in D,
  (e) then add Q to D.

Forward Double Negation Elimination
    NOT NOT P
    ------
    P
  (a) If a sentence of the form NOT NOT P holds in some domain D,
  (b) and P does not yet hold in D,
  (c) then add P to D.

Forward Disjunctive Syllogism
    P OR Q           P OR Q
    NOT Q            NOT P
    ------           ------
    P                Q
  (a) If a sentence of the form P OR Q holds in some domain D,
  (b) then if NOT P holds in D and Q does not yet hold in D,
  (c) then add Q to D.
  (d) Else, if NOT Q holds in D and P does not yet hold in D,
  (e) then add P to D.

Forward Disjunctive Modus Ponens
    IF P OR Q THEN R     IF P OR Q THEN R
    P                    Q
    ------               ------
    R                    R
  (a) If a sentence of the form IF P OR Q THEN R holds in some domain D,
  (b) and P or Q also holds in D,
  (c) and R does not yet hold in D,
  (d) then add R to D.

Forward Conjunctive Modus Ponens
    IF P AND Q THEN R
    P
    Q
    ------
    R
  (a) If a sentence of the form IF P AND Q THEN R holds in some domain D,
  (b) and P holds in D,
  (c) and Q holds in D,
  (d) and R does not yet hold in D,
  (e) then add R to D.

Forward DeMorgan (NOT over AND)
    NOT (P AND Q)
    ------
    (NOT P) OR (NOT Q)
  (a) If a sentence of the form NOT (P AND Q) holds in some domain D,
  (b) and (NOT P) OR (NOT Q) does not yet hold in D,
  (c) then add (NOT P) OR (NOT Q) to D.

Forward DeMorgan (NOT over OR)
    NOT (P OR Q)     NOT (P OR Q)
    ------           ------
    NOT P            NOT Q
  (a) If a sentence of the form NOT (P OR Q) holds in some domain D,
  (b) then if NOT P does not yet hold in D,
  (c) then add NOT P to D,
  (d) and if NOT Q does not yet hold in D,
  (e) then add NOT Q to D.

Forward Conjunctive Syllogism
    NOT (P AND Q)    NOT (P AND Q)
    P                Q
    ------           ------
    NOT Q            NOT P
  (a) If a sentence of the form NOT (P AND Q) holds in some domain D,
  (b) then if P also holds in D,
  (c) and NOT Q does not yet hold in D,
  (d) then add NOT Q to D.
  (e) Else, if Q holds in D,
  (f) and NOT P does not yet hold in D,
  (g) then add NOT P to D.

[rule name and argument form missing from the extracted text]
  (a) If a sentence of the form P OR Q holds in some domain D,
  (b) and IF P THEN R holds in D,
  (c) and IF Q THEN R holds in D,
  (d) and R does not yet hold in D,
  (e) then add R to D.

others. (Not all rules from these sources necessarily appear in the table. The names given these rules are not standard ones in all cases, but they may help in keeping the rules in mind.)

Many of the inference patterns embodied in the rules of table 4.1 could be derived in alternative ways (possibly with help of the backward rules in the following section). For example, we can capture Disjunctive Modus Ponens, the inference from IF P OR Q THEN R and P to R, by means of the regular Modus ponens rule (IF Elimination) together with OR Introduction. Psychologically, however, Disjunctive Modus Ponens appears simpler than OR Introduction. In research to be discussed in chapter 5, I estimated the probability that subjects correctly applied each of these rules when it was appropriate for them to do so in evaluating a sample of sentential arguments. According to this measure, subjects applied the Disjunctive Modus Ponens rule on 100% of relevant trials, but applied OR Introduction on only 20%. If these estimates are correct, the subjects could not have been using OR Introduction to achieve the effect of Disjunctive Modus Ponens. Thus, it seems reasonable to suppose that Disjunctive Modus Ponens functions as a primitive inference rule on its own, despite its logical redundancy. Braine et al. (1984) and Sperber and Wilson (1986) present further reasons for favoring Disjunctive Modus Ponens as a primitive.

Psychological models of deduction differ in which rules they consider to be the primitive ones. These differences are not merely notational, since some arguments are provable in one system but not in others. For example, the present model is able to prove argument (3), which could not be proved in the model of Osherson (1975).

(3) (p OR q) AND NOT p.
    ------
    q.

But the choice of rules is not the most important point of comparison between models. This is because most investigators have been cautious about claiming that the particular rules incorporated in their models exhaust the primitive ones. (The work by Braine (1978) and Braine, Reiser, and Rumain (1984) is an exception.) The usual claim (Osherson 1975; Rips 1983) is that the incorporated rules are a subset of the primitive ones. It is also worth noting that the evidence on the status of a rule is sometimes ambiguous, particularly if primitive rules can vary in how easy they are to deploy. It is not always clear, for example, whether subjects are proving a given argument by means of two very efficient primitives or by means of one more global but not-so-efficient primitive. The problem becomes even more complex if we allow for individual differences and for the learning or the compiling of new primitive rules. For these reasons, it seems wise to concentrate on other criteria in comparing and evaluating deduction systems. In developing PSYCOP, I have tried to include most of the rules that earlier models have taken as primitive, so the model's current repertoire is fairly large. We will return to the question of model comparison in chapters 9 and 10, after finishing our tour of PSYCOP's inference abilities.
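The logical redundancy of Disjunctive Modus Ponens mentioned above can be checked with a small Python sketch that reaches the same conclusion by two routes: the single-step routine of table 4.1, and OR Introduction followed by ordinary IF Elimination. The tuple encoding of sentences and all function names are assumptions made for illustration.

```python
# Atoms are strings; compounds are ("or", p, q) or ("if", antecedent, consequent).

def is_disjunctive_conditional(s):
    """True for sentences of the form IF P OR Q THEN R."""
    return (isinstance(s, tuple) and s[0] == "if"
            and isinstance(s[1], tuple) and s[1][0] == "or")


def disjunctive_modus_ponens(domain):
    """One pass of the Forward Disjunctive Modus Ponens routine."""
    added = set()
    for s in domain:
        if is_disjunctive_conditional(s):
            _, (_, p, q), r = s
            if (p in domain or q in domain) and r not in domain:
                added.add(r)             # add R directly, in a single step
    return set(domain) | added


def via_or_introduction(domain):
    """Same effect in two steps: OR Introduction, then IF Elimination."""
    result = set(domain)
    for s in domain:
        if is_disjunctive_conditional(s):
            _, antecedent, r = s
            _, p, q = antecedent
            if p in domain or q in domain:
                result.add(antecedent)   # OR Introduction: P (or Q) yields P OR Q
            if antecedent in result:
                result.add(r)            # IF Elimination on IF (P OR Q) THEN R
    return result


domain = frozenset({("if", ("or", "p", "q"), "r"), "q"})
direct = disjunctive_modus_ponens(domain)
indirect = via_or_introduction(domain)
```

Both routes add r, but only the second leaves the intermediate sentence p OR q in memory. If subjects rarely apply OR Introduction, as the estimates cited above suggest, the second route would seldom be available to them, which is the motivation for treating the one-step rule as a primitive.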
Additions and Changes to Backward Rules

The system described so far is psychologically incomplete in that it is unable to prove some arguments that subjects find fairly easy. One reason for this is that we have been too restrictive in confining IF Elimination and other principles to forward inference routines. Although these rules should be used as forward rules, we sometimes need them as backward routines in order to motivate the search for subgoals. For example, we currently have no way to show that (4) is deducible.



Table 4.2 PSYCOP's backward rules.

Backward IF Introduction (Conditionalization)
    + P
      Q
    ------
    IF P THEN Q
  (a) Set D to domain of current goal.
  (b) If current goal is of the form IF P THEN Q,
  (c) and neither D nor its superdomains nor its immediate subdomains contain supposition P and subgoal Q,
  (d) and IF P THEN Q is a subformula of the premises or conclusion,
  (e) then set up a subdomain of D, D', with supposition P.
  (f) Add the subgoal of proving Q in D' to the list of subgoals.

Backward NOT Elimination
    + NOT P
      Q AND (NOT Q)
    ------
    P
  (a) Set P to current goal and D to its domain.
  (b) If P is a subformula of the premises or conclusion,
  (c) and Q is an atomic subformula in the premises or conclusion,
  (d) and neither D nor its superdomains nor its immediate subdomains contain supposition NOT P and subgoal Q AND (NOT Q),
  (e) then set up a subdomain of D, D', with supposition NOT P,
  (f) and add the subgoal of proving Q AND (NOT Q) in D' to the list of subgoals.

Backward NOT Introduction
    + P
      Q AND (NOT Q)
    ------
    NOT P
  (a) Set D to domain of current goal.
  (b) If current goal is of the form NOT P,
  (c) and P is a subformula of the premises or conclusion,
  (d) and Q is an atomic subformula of the premises or conclusion,
  (e) and neither D nor its superdomains nor its immediate subdomains contain supposition P and subgoal Q AND (NOT Q),
  (f) then set up a subdomain of D, D', with supposition P,
  (g) and add the subgoal of proving Q AND (NOT Q) in D' to the list of subgoals.

Backward OR Elimination
    P OR Q
    + P
      R
    + Q
      R
    ------
    R
  (a) Set D to domain of current goal.
  (b) Set R to current goal.
  (c) If a sentence of the form P OR Q holds in D,
  (d) and both P and Q are subformulas or negations of subformulas of the premises or conclusion,
  (e) and R is a subformula or negation of a subformula of the premises or conclusion,
  (f) and neither D nor its superdomains nor its immediate subdomains contain supposition P and subgoal R,
  (g) and neither D nor its superdomains nor its immediate subdomains contain supposition Q and subgoal R,
  (h) then set up a subdomain of D, D', with supposition P,
  (i) and add the subgoal of proving R in D' to the list of subgoals.
  (j) If the subgoal in (i) succeeds,
  (k) then set up another subdomain of D, D'', with supposition Q,
  (l) and add the subgoal of proving R in D'' to the list of subgoals.

Backward AND Introduction
    P
    Q
    ------
    P AND Q
  (a) Set D to domain of current goal.
  (b) If current goal is of the form P AND Q,
  (c) and D does not yet contain the subgoal P,
  (d) then add the subgoal of proving P in D to the list of subgoals.
  (e) If the subgoal in (d) succeeds,
  (f) and D does not yet contain the subgoal Q,
  (g) then add the subgoal of proving Q in D to the list of subgoals.

Backward OR Introduction
    P                Q
    ------           ------
    P OR Q           P OR Q
  (a) Set D to domain of current goal.
  (b) If current goal is of the form P OR Q,
  (c) and D does not yet contain the subgoal P,
  (d) then add the subgoal of proving P in D to the list of subgoals.
  (e) If the subgoal in (d) fails,
  (f) and D does not yet contain subgoal Q,
  (g) then add the subgoal of proving Q in D to the list of subgoals.

Backward IF Elimination (modus ponens)
    IF P THEN Q
    P
    ------
    Q
  (a) Set D to domain of current goal.
  (b) Set Q to current goal.
  (c) If the sentence IF P THEN Q holds in D,
  (d) and D does not yet contain the subgoal P,
  (e) then add P to the list of subgoals.

Backward AND Elimination
    P AND Q          Q AND P
    ------           ------
    P                P
  (a) Set D to domain of current goal.
  (b) Set P to current goal.
  (c) If the sentence P AND Q is a subformula of a sentence that holds in D,
  (d) and D does not yet contain the subgoal P AND Q,
  (e) then add P AND Q to the list of subgoals.
  (f) If the subgoal in (e) fails,
  (g) and the sentence Q AND P is a subformula of a sentence that holds in D,
  (h) and D does not yet contain the subgoal Q AND P,
  (i) then add Q AND P to the list of subgoals.

Backward Double Negation Elimination
    NOT NOT P
    ------
    P
  (a) Set D to domain of current goal.
  (b) Set P to the current goal.
  (c) If the sentence NOT NOT P is a subformula of a sentence that holds in D,
  (d) and D does not yet contain the subgoal NOT NOT P,
  (e) then add NOT NOT P to the list of subgoals.

Backward Disjunctive Modus Ponens
    IF P OR Q THEN R     IF P OR Q THEN R
    P                    Q
    ------               ------
    R                    R

(a) SetD to domainofcurrentgoal. (b) SetR to currentgoal. IF P ORQ THENR holdsin D, (c) If thesentence P, (d) andD doesnotyetcontaintheSUbgoal . (e) thenaddP to thelistof subgoals in (e) fails, (f ) If thesubgoal (g) andD doesnotyetcontainthesubgoal Q, . (h) thenaddQ to thelistofsubgoals

Backward Dill . - ed. e SyllOlilm


ofcurrent . (a) SetD todomain goal . (b) SetQtothecurrent goal oftheConn PORQorQORPholds in (c) If asentence D. ofasentence thatholds (d) andNOTPisasubfonnula in D. notyetcontain thesubgoal NOTP. (e) andD does . (f) thenaddNOTPtothelistofsubgoals BackwardCOD ) aI CdYeSyUOl8m NOT (P AND Q) P NOTQ NOT (Q AND P) P

(a) SetD to domainof currentgoal. (b) If thecurrentgoalisof theformNOTQ ~ NOT(P ANDQ) or NOT (c) andeitherofthesenten (Q ANDP) holdsin D. P. (d) andD doesnotyetcontainthesubgoal . (c) thenaddP to thelistofsubgoals

NOTQ BackwardDeMorpn ( NOTo, er AND ) (8) SetD to domainof currentgoal. (b) If currentgoalis of theform( NOTP) OR (NOT Q), (c) andNOT (P AND Q) is 8 subformul8of 8 sentence that holdsin D.

NOT P AND ( ) OT Q N OT P ( )OR (N ) Q


(d) thenaddNOT(PANDQ) tothelistofsubgoak

BackwardDeMorau ( NOTo,er OR) (a) SetD to domainof currentgoal. (b) If currentgoalis of theConn( NOTP) AND (NOT Q), (c) andNOT (P OR Q) is a subfonnulaof a sentence that holdsin D, . (d) thenaddNOT (P OR Q) to thelist of subgoals



(4) IF NOT NOT Calvin deposits 50 cents THEN Calvin gets a coke.
    Calvin deposits 50 cents.
    -----------
    Calvin gets a coke.

We would like to apply IF Elimination to derive the conclusion. But since IF Elimination is a forward routine, we must wait until the antecedent of the first premise (NOT NOT Calvin deposits 50 cents) comes along. Although there is a way to deduce the antecedent from the second premise, it requires NOT Introduction, a backward rule. Thus, before we can bring NOT Introduction into play, we must have a subgoal that asks the system to derive NOT NOT Calvin deposits 50 cents. We are stuck in the present situation because the system has no way to propose this subgoal.

One solution is to add a backward version of IF Elimination (similar to the modus ponens rule in LT (Newell et al. 1957)) that can handle subgoals (note 4). Table 4.2 states this rule (along with similar ones for backward disjunctive modus ponens and conjunctive syllogism). For each rule, the table gives the operational format together with the nearest analogue in traditional natural-deduction style. The new Backward IF Elimination is triggered when a subgoal Q matches the consequent of some conditional assertion IF P THEN Q. The rule then proposes the further subgoal of proving P, since if this subgoal can be fulfilled then Q itself must follow. PSYCOP's internal proof of (4) would then proceed as shown in figure 4.2. As the first step, Backward IF Elimination notices that the main goal of the proof, Calvin gets a coke, matches the consequent of the first premise. The antecedent of that conditional is the double-negative sentence mentioned above, and Backward IF Elimination turns it into the next subgoal. Once we have this subgoal, however, NOT Introduction can apply, positing NOT Calvin deposits 50 cents as a supposition and looking for a contradiction as a subgoal. Since the supposition itself contradicts the second premise, the subgoal is easily fulfilled, and this completes the proof.

In addition to Backward IF Elimination, we also add to PSYCOP backward versions of some of the other forward rules. For example, it seems reasonable to use AND Elimination and Double Negation Elimination in a backward direction in order to show that (5) and (6) are deducible.
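The triggering condition for Backward IF Elimination described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of sentences and the function name `backward_if_elimination` are hypothetical and are not PSYCOP's own representation. Domains are ignored, so the sketch covers only steps (c) to (e) of the rule.

```python
# Assumed encoding (not PSYCOP's): atoms are strings; compound sentences are
# tuples ("NOT", A), ("AND", A, B), ("OR", A, B), ("IF", A, B).

def backward_if_elimination(goal, assertions, subgoals):
    """Propose new subgoals for `goal` within a single domain.

    For every assertion IF P THEN Q whose consequent Q matches the goal,
    propose P, unless P is already among the subgoals.
    """
    proposed = []
    for sentence in assertions:
        is_conditional = isinstance(sentence, tuple) and sentence[0] == "IF"
        if is_conditional and sentence[2] == goal:
            antecedent = sentence[1]
            if antecedent not in subgoals and antecedent not in proposed:
                proposed.append(antecedent)
    return proposed


deposits = "Calvin deposits 50 cents"
coke = "Calvin gets a coke"
premises = [("IF", ("NOT", ("NOT", deposits)), coke), deposits]

# The main goal of argument (4) matches the consequent of the first premise,
# so the doubly negated antecedent becomes the next subgoal.
new_subgoals = backward_if_elimination(coke, premises, subgoals=[coke])
```

In the full system the proposed subgoal would then be handed to NOT Introduction, which this sketch does not model.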

Figure 4.2 PSYCOP's working-memory representation for the proof of IF NOT NOT Calvin deposits 50 cents THEN Calvin gets a coke. Calvin deposits 50 cents. Calvin gets a coke.



However, the new Backward IF Elimination rule will not work here either: To trigger that rule, some subgoal must match the consequent of the conditional (Calvin gets a coke AND Calvin buys a burger), but the only subgoal in play is the conclusion (Calvin gets a coke). The proof procedure ought to realize that the consequent entails the goal and that it is therefore worthwhile trying to prove the antecedent; yet it has no way of knowing this. Backward AND Elimination would be helpful in this context; it would tell us that we can prove Calvin gets a coke if we can show that Calvin gets a coke AND Calvin buys a burger. From that subgoal, we can then use Backward IF Elimination and NOT Introduction, as in figure 4.2, to show that (5) is deducible. Much the same strategy, coupled with a backward Double Negation rule, will handle (6).

The difficulty with backward versions of AND Elimination and Double Negation Elimination is that they can lead to an infinity of subgoals, as was observed in chapter 3. For example, if we use AND Elimination backward on goal P to produce subgoal P AND Q, then we should be able to use it again on P AND Q to obtain the subgoal (P AND Q) AND R, and so on. Perhaps the most reasonable solution, under these circumstances, is to restrict the rules so that no subgoal can be produced that isn't already a "part" of the current set of assertions. To pin down this notion of part more precisely, let us use the term subformula to denote any consecutive string of symbols in a sentence (including the entire sentence) that would also qualify as a grammatical sentence on its own. Thus, Calvin gets a coke AND Calvin buys a burger is a subformula of the first premise of (5), but (Calvin gets a coke AND Calvin buys a burger) AND Calvin deposits 50 cents is not.
Since the proposed backward AND Elimination rule can only produce subgoals that are subformulas of an assertion or conclusion, we can generate a subgoal for the first of these sentences (thus allowing PSYCOP to prove (5)) but not the second. A similar restriction can be placed on Backward Double Negation Elimination and the DeMorgan rules. Table 4.2 gives one formulation of these backward rules as they are incorporated in PSYCOP (note 5).

We must also confront the problem of controlling the rules that produce subdomains: IF Introduction, OR Elimination, NOT Introduction, and NOT Elimination. Although it is not really difficult to find ways to prevent these rules from leading to infinite backward searches, it is not at all obvious whether these restrictions will also reduce the overall power of the system. We could stipulate, for instance, that OR Elimination can



apply only once per derivation, but this type of arbitrary limitation would almost certainly prevent PSYCOP from showing that some simple valid arguments are deducible. The version of these rules in table 4.2 gives one possible solution to this difficulty. The most general of the restrictions we impose on these rules is that each new subdomain must have a supposition and a subgoal distinct from those of its superdomains. This prohibits infinite embedding of subdomains. Some of the rules also have unique restrictions. For example, we require that the supposition created by NOT Introduction be a subformula of the premises or of the conclusion of the argument. Later it will be proved that there is a certain sense in which these restrictions do not reduce PSYCOP's deductive power: We can include the restricted rules in a system quite similar to PSYCOP that is in fact complete with respect to classical sentential logic.

Assumptions about Control

The rules as formulated in tables 4.1 and 4.2 leave room for some further decisions about the order in which PSYCOP should deploy them. We need to specify, for example, when PSYCOP should apply forward rules and when backward rules, as well as which rules in each class it should try first. In making these choices, we need to consider the internal characteristics of the rules. Clearly, PSYCOP can apply its forward rules in a nearly automatic way, since their self-constraining nature requires little external monitoring and will never lead to infinite forward searches (as will be shown below). It therefore seems reasonable to activate them as soon as possible whenever a triggering assertion appears in the database. Backward rules, however, present more of a control problem. Although the constraints we have placed on the backward rules keep them from producing infinite loops, they nevertheless have the potential to produce extremely inefficient searches. This means that we might want to adopt a flexible approach to using these rules, allowing the system to profit from heuristic advice about which subgoals to follow up.

Currently, when PSYCOP has to evaluate an argument it begins by applying its forward rules to the premises until no new inferences are forthcoming. It then considers the conclusion of the argument, checking to see whether the conclusion is already among the assertions. If so, the proof is complete; if not, it will treat the conclusion as a goal and attempt to apply one of the backward rules, as in the example of the previous section. PSYCOP tests each of the backward rules to see if it is appropriate in this situation, and it does this in an order that is initially determined by the



complexity of the backward rules. The idea is that simple rules should be tried first, since less work will have been lost if these rules turn out to lead to dead ends. PSYCOP prefers backward rules that can be satisfied by a single subgoal and that do not require new subdomains; thus, it tries Backward IF Elimination and similar rules first. If none of these rules is applicable, it next tries Backward AND Introduction, which requires two subgoals to be satisfied but which does not use subdomains. Finally, it will resort to the subdomain-creating rules IF Introduction, OR Elimination, NOT Introduction, and NOT Elimination. In later phases of the proof, PSYCOP revises this initial ordering in such a way that any rule it used successfully is given first priority in the next round of deduction. Thus, PSYCOP incorporates a simple procedure for learning from practice, a procedure that can give rise to a kind of "set" effect in deduction (cf. Lewis 1981).

Once a backward rule has been activated and has produced a subgoal, PSYCOP checks whether the new subgoal matches an assertion. If not, PSYCOP places the subgoal on an agenda, reruns the forward rules in case some assertions were added, and repeats its cycle. In principle, PSYCOP could try the subgoals on the agenda in any order, for instance according to a heuristic measure of how "easy" these subgoals seem. In the absence of other instructions, however, it will follow a depth-first search; that is, it first tries to fulfill the conclusion-goal, then a subgoal to the conclusion that a backward rule has proposed, then a sub-subgoal to the first subgoal, and so on. If it reaches a subgoal to which no backward rules apply, it backtracks to the preceding subgoal and tries to fulfill it another way via a different backward rule. There is also a provision in PSYCOP for a bound on the depth of its search, limiting the length of a chain of subgoals to some fixed number.
Finally, PSYCOP halts with a proof if it has found assertions to fulfill all the subgoals along some path to the conclusion. It halts with no proof if it can complete none of the subgoal paths. As will be proved later in this chapter, even without a depth bound it will always end up, after a finite number of steps, in one of these two conditions.

Formal Properties


For the most part, psychologists have not been concerned with the properties of deduction systems that are of most interest to logicians. All the



psychological deduction systems for sentential reasoning cited earlier in this chapter are sound in the logician's sense of only producing proofs for arguments that are valid in classical sentential logic (see chapters 2 and 6 for the notion of validity). But there are no proofs that these systems are complete, that is, that they generate proofs for all arguments that are valid. Since completeness is one of the first properties that a logician would want to know about a logical system, it is surprising at first glance that psychologists have paid it so little attention. There is some justification, however, for the psychologists' indifference. After all, the purpose of these cognitive proposals is not to capture inferences that are valid according to some abstract standard, but to capture the ones that untrained people can accept in ideal circumstances. The systems should be psychologically complete (since we want the systems to find a proof for all the sentential arguments to which people will agree in ideal situations), but not necessarily logically complete (since people may not, even in ideal situations, accept all the sentential inferences sanctioned by a given logical system). Psychological completeness, in contrast to logical completeness, is for the most part an empirical matter, to be settled by observation and experiment rather than by mathematics. (See Osherson 1975 for the distinction between logical and psychological completeness.)

There are, nevertheless, some reasons for exploring the formal properties of PSYCOP. Since the classical sentential calculus is a landmark system in logic and philosophy, it is appropriate to wonder whether the sort of reasoning that people do lives up to this standard. Of course, the fact (if it is one) that human sentential reasoning isn't equivalent to classical logic doesn't imply that humans are irrational; there are logical systems other than the classical one, and these could be viewed as providing alternative criteria of rationality. However, a divergence from classical logic would be interesting, particularly if we could spot the source of the discrepancy. This would help us locate human-style reasoning within the space of possible logical theories.

In this section, I proceed with the following agenda. I first show that PSYCOP is not complete with respect to classical logic, by exhibiting as a counterexample an argument that is valid in classical logic but which PSYCOP is unable to prove. I then supplement the set of rules by adding two new forward rules whose purpose is to transform conditional sentences into ones without conditionals. It is then possible to prove several related facts about this enlarged system, which I call PSYCOP+. First,



the system always halts when given an argument to evaluate. (This establishes that PSYCOP itself always halts, since PSYCOP's routines are a subset of PSYCOP+'s.) Second, any proof in a demonstrably complete system of classical sentential logic (one similar to that of Jeffrey (1967) or to that of Smullyan (1968)) can be mapped into a natural-deduction proof of a certain type. This mapping means that any valid argument has a proof in what I will call a canonical natural-deduction format. Finally, I establish that PSYCOP+ will always find this canonical proof if it hasn't found a simpler one instead. This guarantees that PSYCOP+ will find a proof for any valid argument, that is, that the system is complete. Since the rules of the system are sound, and since it always halts, PSYCOP+ is in fact a decision procedure for classical sentential logic. Thus, although PSYCOP is not complete, we can make it complete by endowing it with the ability to paraphrase conditional sentences in a particular way.

The upshot of these results is somewhat appealing, since classical logic's handling of conditional sentences has always seemed its most unintuitive aspect. This intuition, however, might seem hard to reconcile with the fact that the rules for conditionals in most natural-deduction systems, IF Elimination (modus ponens) and IF Introduction (conditionalization), seem entirely acceptable. One way of viewing the results, then, is that a deduction system can contain versions of both IF Introduction and IF Elimination (as indeed PSYCOP does) and still not force one to recognize all inferences involving conditionals that are valid in classical logic.

In proving these results, I have tried to keep the development in the text simple; the more involved aspects of the proofs are given in the appendix to this chapter. Readers with no patience for formalizing should skip to the final section for a summary.

PSYCOP's Incompleteness with Respect to Classical Logic

Consider the following argument (from Adams 1965):

(7) NOT (IF Calvin passes history THEN Calvin will graduate).
    -----------
    Calvin passes history.

This argument is valid in classical logic (i.e., when IF is interpreted as CPL's material conditional), since any negated conditional semantically entails its antecedent. The argument nevertheless seems suspect and has the form of one of the "paradoxes" of implication. Of course, it is derivable in the formal deduction systems surveyed in the previous chapters, since



these systems are complete; but PSYCOP cannot derive it. In what follows, I work with a "vanilla" PSYCOP that excludes special heuristics on the rules and that adopts the depth-first strategy (without depth bounds) for backward search that I discussed in the subsection on control. I also assume that there are no limitations on the size of working memory. That is, I focus on a slightly idealized PSYCOP in order to study the nature of its rules.

To see that PSYCOP is unable to prove (7), notice that the only rule in table 4.1 or table 4.2 that will apply to this premise-conclusion pair is the rule for NOT Elimination. Thus, PSYCOP's first move in trying to find a mental proof for this argument is the attempt shown in figure 4.3a, where p has been substituted for Calvin passes history and q for Calvin will graduate. The supposition NOT p in the figure begins a new subdomain, which I have indicated by indenting the supposition and its subgoal as in the earlier natural-deduction notation. According to the formulation of the NOT Elimination rule in table 4.2, the subgoal the rule produces must be the conjunction of an atomic sentence and its negation, where the atomic sentence is one that appears in the premise or in the conclusion. PSYCOP could begin with either p or q as the atomic sentence; however, for definiteness we can assume that PSYCOP always tries the atomic sentences in the order in which they appear in the argument (in this case, p before q), although this ordering will not affect the results. So we now need to prove p AND NOT p, and the only rule we can apply in this context is backward AND Introduction. (Another instance of NOT Elimination is blocked at this point by part b of that rule.) AND Introduction instructs PSYCOP to attempt to prove p, with the result shown in figure 4.3a. At this stage, backward NOT Elimination is again the only applicable rule, and using it (together with AND Introduction) produces the memory structure in figure 4.3b. Note that part d of NOT Elimination keeps PSYCOP from setting up supposition NOT p with subgoal p AND NOT p once again; so this time PSYCOP is forced to try subgoal q AND NOT q. From this point, PSYCOP's only course is to continue with NOT Elimination and AND Introduction. After two further applications of these rules, its memory structure looks like figure 4.3c. No further progress is possible, even with NOT Elimination, since part d of the rule prohibits any further subdomains. This structure activates none of the other backward or forward rules; hence PSYCOP must back up to the last point at


Figure 4.3 (panels a, b, c) PSYCOP's unsuccessful attempt to prove NOT (IF p THEN q). p.

which choice of a subgoal was possible. It returns to the place where it supposed NOT q with the intent of proving p AND NOT p and tries subgoal q AND NOT q instead. After several more futile attempts of this sort, it finally backs up to the initial goal of proving p from NOT (IF p THEN q). But, since it has now exhausted all its resources, it halts, deciding that the conclusion is not deducible from the premise.

Thus, it is clear that PSYCOP is unable to prove all arguments that are valid in classical logic. Since (7) seems questionable on psychological grounds, this result seems reasonable. But of course there is still a long way to go to show that PSYCOP is a correct psychological model, so not too much should be read into this finding. There may be further arguments that PSYCOP fails to prove that seem intuitively correct, and others that PSYCOP proves that are psychologically marginal or unacceptable. A possible example of the first kind is the argument in note 5. As a possible example of the latter sort, PSYCOP has no trouble with (11) of chapter 2, another of the paradoxes of implication.

A Halting Theorem for PSYCOP and PSYCOP+

Although PSYCOP is not itself complete with respect to classical sentential logic, it is quite easy to extend it to a complete proof procedure while keeping all its current rules and its control structure intact. All that is needed, in fact, is the addition of the two forward rules shown in table 4.3. The first of these rules allows us to deduce both P and NOT Q from a sentence of the form NOT (IF P THEN Q); the second allows us to deduce (NOT P) OR Q from IF P THEN Q. In effect, these rules translate conditional and negated conditional sentences into sentences without conditionals. Obviously, the negated conditional rule makes the proof of (7) trivial, since the conclusion follows from the premise in one forward step.
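The two transformation rules, and the fact that they are classically sound, can be checked with a brief Python sketch. The tuple encoding and the names `transform`, `value`, and `entails` are assumptions made for this illustration. The sketch leaves out the rules' bookkeeping conditions (whether the new sentence already holds in the domain) and verifies soundness by brute-force truth tables, reading IF as the material conditional.

```python
from itertools import product

# Assumed encoding: atoms are strings; compound sentences are tuples
# ("NOT", A), ("AND", A, B), ("OR", A, B), ("IF", A, B).

def atoms(f):
    """Collect the atomic sentences occurring in f."""
    return {f} if isinstance(f, str) else set().union(*(atoms(a) for a in f[1:]))


def value(f, world):
    """Classical truth value of f, with IF read as the material conditional."""
    if isinstance(f, str):
        return world[f]
    op = f[0]
    if op == "NOT":
        return not value(f[1], world)
    if op == "AND":
        return value(f[1], world) and value(f[2], world)
    if op == "OR":
        return value(f[1], world) or value(f[2], world)
    if op == "IF":
        return (not value(f[1], world)) or value(f[2], world)
    raise ValueError(op)


def entails(premise, conclusion):
    """True if no truth assignment makes the premise true and the conclusion false."""
    names = sorted(atoms(premise) | atoms(conclusion))
    for bits in product([True, False], repeat=len(names)):
        world = dict(zip(names, bits))
        if value(premise, world) and not value(conclusion, world):
            return False
    return True


def transform(sentence):
    """Sentences that the two new forward rules would add for `sentence`."""
    if isinstance(sentence, tuple) and sentence[0] == "IF":
        return [("OR", ("NOT", sentence[1]), sentence[2])]
    if (isinstance(sentence, tuple) and sentence[0] == "NOT"
            and isinstance(sentence[1], tuple) and sentence[1][0] == "IF"):
        return [sentence[1][1], ("NOT", sentence[1][2])]
    return []


premise_7 = ("NOT", ("IF", "p", "q"))
derived = transform(premise_7)                       # p and NOT q
sound = all(entails(premise_7, d) for d in derived)  # each output follows classically
```

The first element of `derived` is the conclusion of (7), which is why the argument becomes a one-step forward inference once the rule is added.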
Also, these rules seem to be the sort that one would not want to include in a cognitively realistic deduction system, for exactly the same reason that makes (7) itself dubious. I call the PSYCOP system that includes the table 4.3 rules "PSYCOP+" to mark its departure from the official system.

The proof that PSYCOP+ is complete is somewhat lengthy, so I will take it in several steps. In this subsection, I first show that, given any list of assertions and any subgoal, PSYCOP+ will always come to a stop after a finite number of steps. We will need this result later, but it is important in its own right since it implies that PSYCOP (without the plus) also never attempts endless searches. The next subsection demonstrates how any proof in a complete tree-style proof system (such as that of Smullyan 1968) can be turned into a proof in a natural-deduction system containing a subset of PSYCOP+'s rules. Finally, I will show that PSYCOP+ will eventually discover this proof.

Table 4.3 New forward rules for PSYCOP+.

Negated Conditional Transformation

  NOT (IF P THEN Q)
  -----------
  P

  NOT (IF P THEN Q)
  -----------
  NOT Q

(a) If a sentence of the form NOT (IF P THEN Q) holds in some domain D,
(b) then if P does not yet hold in D,
(c) then add P to D,
(d) and if NOT Q does not yet hold in D,
(e) then add NOT Q to D.

Conditional Transformation

  IF P THEN Q
  -----------
  (NOT P) OR Q

(a) If a sentence of the form IF P THEN Q holds in some domain D,
(b) and (NOT P) OR Q does not yet hold in D,
(c) then add (NOT P) OR Q to D.

Proving that PSYCOP+ halts entails showing that its ensemble of rules never leads to infinite forward or backward searches or to infinite loops. For this purpose, it is useful to consider the different types of rules separately, beginning with the rules that create subdomains (note 6).

Halting Proof I: PSYCOP+'s rules create only finitely many domains for a given argument. Clearly, one way that a system such as ours could fail to halt is if rules that create subdomains (i.e., IF Introduction, OR Elimination, NOT Introduction, and NOT Elimination) create an infinite number of them. Is this possible? Notice that each of these rules produces only subdomains whose supposition and goal are chosen from a fixed set of sentences (subformulas of the premises and the conclusion, or restricted combinations of these subformulas). Moreover, the conditions on these rules require the supposition-goal pairs that initiate nested subdomains to be unique. (See figure 4.3 for an example of nesting.) Since only a finite number of unique pairs can be formed from the finite set of candidate suppositions and goals, only a limited set of subdomains is possible for a particular argument. The proof of proposition HP-I in the appendix gives the details of this result.

Halting Proof II: The backward rules produce only a finite number of subgoals within a domain. Having eliminated the possibility that infinite



searches could occur through an infinite number of subdomains, we are left with the possibility that an infinite search could occur within a single domain. This could happen either because of the forward rules from tables 4.1 and 4.3 or because of the backward rules of table 4.2 (excluding the four domain-creating backward rules just examined). Notice, though, that the forward and backward processes are independent inside a domain, in the sense that once the forward rules have run in that domain then further application of the backward rules can't trigger any more forward inferences. This is a consequence of the fact that the backward rules that operate within a domain don't produce any new assertions, so there are no further sentences for the forward rules to operate upon. Thus, to prove that PSYCOP+ converges it suffices to show, first, that the backward rules produce only finitely many subgoals within a given domain and, second, that the forward rules can produce only finitely many new assertions when applied to a fixed set of assertions.

The backward case is quite easy. An infinite backward search in a domain would mean that the rules would have to produce an infinite sequence of subgoals G0, G1, G2, ..., where G1 is a subgoal of G0, G2 a subgoal of G1, and so forth. However, the conditions of the within-domain backward rules (i.e., those that follow OR Elimination in table 4.2) restrict the possible subgoals to those formed from subformulas of the assertions or from subformulas of preceding goals in the series. This means that only a finite set of distinct subgoals is possible within a domain. And since the conditions of the rules prohibit duplicate subgoals in this situation, there can be no infinite sequence of them. (See proof HP-II in the appendix.)

Halting Proof III: The forward rules produce only a finite number of assertions within a domain. The only remaining way that PSYCOP+ could get stuck in an endless search is for its forward rules (in tables 4.1 and 4.3) to continue producing assertions indefinitely. Because these rules have varied forms, it is not obvious at first that this can't happen. However, several factors conspire to limit the number of assertions. First, the rules produce only sentences that contain no more atomic sentences than appear in the set of sentences to which they applied. For example, IF Elimination produces a sentence Q on the basis of the sentences P and IF P THEN Q, so any atomic sentences in Q must already be in IF P THEN Q. Second, the conditions on the rules limit the number of NOTs in the


assertions they produce. Proof HP-III in the appendix demonstrates that these conditions restrict the forward rules to producing only finitely many new sentences within a domain.

Taken together, the results on domain-creating backward rules, within-domain backward rules, and forward rules that appear in HP-I, II, and III lead directly to our main conclusion that PSYCOP+ must always halt. HP-I establishes that PSYCOP+ can produce only a finite number of domains, and HP-II and III show that within the domains there can be only a finite number of subgoals and assertions. PSYCOP+ can continue only as long as new subgoals or assertions are generated, and so it must always come to a stop. Recall, too, that PSYCOP+ differs from PSYCOP only by the addition of the two forward rules in table 4.3. Removing these two rules would clearly lead to shorter, not longer, searches, since there are fewer opportunities for new inferences. Thus, PSYCOP, like its cousin PSYCOP+, never ventures beyond a finite number of rule applications.

Canonical Proofs

Of course, it is one thing to show that our program always halts and another to show that it halts with a correct answer. To get the latter result, I take a somewhat indirect path that makes use of the tree method, which we examined in chapter 3. This method has a simple algorithmic character and is also known to be complete. Tree proofs, as it turns out, can be reexpressed in a natural-deduction system like PSYCOP+'s, and we will call these reexpressed versions canonical proofs. Since any valid argument has a tree proof (by the completeness of the tree method), any valid argument also has a canonical natural-deduction proof in a PSYCOP-like system. The final section of the chapter shows that PSYCOP+ (but not PSYCOP, of course) will actually hit on this proof if it hasn't found a simpler one. (See Prawitz 1965 and Smullyan 1965 for other canonical natural-deduction systems.)

The Tree Method of Proof. The tree method yields a proof by contradiction. In the version of chapter 3, inference rules break down the premises and negated conclusion into atomic sentences and negations of atomic sentences. An argument is deducible just in case all of the paths in the tree that is created in this way contain both an atomic sentence and its negation. Both Jeffrey (1967) and Smullyan (1968) give proofs of the soundness and completeness of the tree method. The tree proofs summarized here use


a somewhat different set of rules than are found in these two sources, but the differences do not affect either of these properties. As an illustration, figure 4.4 shows a tree proof for (8) (which is equivalent to (1)), using the rules from table 3.1.

(8) r AND s.
    IF p THEN q.
    -----------
    IF p THEN (q AND s).

Figure 4.4 A tree proof of the argument IF p THEN q. r AND s. IF p THEN (q AND s).

In the figure, the negation of the conclusion is written at the top, and then the two premises. Applying rule 6 to the negated conclusion yields p and NOT (q AND s), and applying rule 1 to r AND s produces the two

Mental Proofs and Their Fonnal Properties


additional sentences r and s. (NOT p) OR q comes from IF p THEN q by rule 5, and (NOT q) OR (NOT s) from NOT (q AND s) by rule 2. When we get to (NOT p) OR q, however, we must split the tree into two branches, according to rule 3. Finally, each of these branches must divide when we apply rule 3 to (NOT q) OR (NOT s). Each of the paths in this tree is closed (i.e., contains a pair of contradictory sentences). For example, the path that follows the left branches of the tree contains both p and NOT p. Since all paths are closed, argument (8) is deducible.

Canonical Proofs I: An algorithm for translating trees to natural-deduction structures

Tree proofs such as that in figure 4.4 can be mimicked by means of natural-deduction proofs in PSYCOP+. Most of the tree rules in table 3.1 correspond to forward rules in the program: rule 1 to AND Elimination, rule 2 to DeMorgan (NOT over AND), rule 4 to DeMorgan (NOT over OR), rule 5 to Conditional Transformation, rule 6 to Negated Conditional Transformation, rule 7 to Double Negation Elimination. These are the only forward rules that we will need in mimicking tree proofs, and we can call them the key forward rules. (This is where the use of the conditional transformation rules is crucial; without them, we can't find natural-deduction proofs that correspond to arbitrary tree proofs.) One exception to the correspondence between the tree rules and PSYCOP+'s forward rules is rule 3. The forward rules don't give us a way of dividing the proof into parts on the basis of a disjunction. However, there is a backward rule, OR Elimination, that we can use for this purpose. The OR Elimination rule produces from a disjunctive assertion P OR Q two subdomains, one with supposition P and the other with supposition Q. This is analogous to the way rule 3 operates on P OR Q to divide a tree into two branches.
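The tree rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the seven rules are reconstructed from the way this section describes them (table 3.1 itself is not reproduced here), sentences are encoded as nested tuples, and the names `expand`, `closed`, `all_paths_closed`, and `deducible` are hypothetical, not part of PSYCOP or PSYCOP+.

```python
# Sentences are nested tuples; atomic sentences are plain strings.
# ("not", P), ("and", P, Q), ("or", P, Q), ("if", P, Q)

def expand(sentence):
    """Apply one tree rule to a sentence.

    Returns a list of branches (each a list of new sentences), or None
    when the sentence is an atom or a negated atom.
    """
    if isinstance(sentence, str):
        return None
    op = sentence[0]
    if op == "and":                      # rule 1: P AND Q gives P, Q
        return [[sentence[1], sentence[2]]]
    if op == "or":                       # rule 3: P OR Q splits the tree
        return [[sentence[1]], [sentence[2]]]
    if op == "if":                       # rule 5: IF P THEN Q gives (NOT P) OR Q
        return [[("or", ("not", sentence[1]), sentence[2])]]
    inner = sentence[1]                  # remaining case: op == "not"
    if isinstance(inner, str):
        return None
    if inner[0] == "and":                # rule 2: NOT over AND
        return [[("or", ("not", inner[1]), ("not", inner[2]))]]
    if inner[0] == "or":                 # rule 4: NOT over OR
        return [[("not", inner[1]), ("not", inner[2])]]
    if inner[0] == "if":                 # rule 6: negated conditional
        return [[inner[1], ("not", inner[2])]]
    return [[inner[1]]]                  # rule 7: double negation


def closed(path):
    """A path is closed if it holds an atomic sentence and its negation."""
    atoms = {s for s in path if isinstance(s, str)}
    return any(not isinstance(s, str) and s[0] == "not" and s[1] in atoms
               for s in path)


def all_paths_closed(path, todo):
    """Expand the sentences in `todo` and check that every path closes."""
    if closed(path):
        return True
    if not todo:
        return False
    first, rest = todo[0], todo[1:]
    branches = expand(first)
    if branches is None:
        return all_paths_closed(path, rest)
    return all(all_paths_closed(path + new, rest + new) for new in branches)


def deducible(premises, conclusion):
    """Tree method: negate the conclusion and look for a contradiction."""
    start = [("not", conclusion)] + list(premises)
    return all_paths_closed(start, start)


# Argument (8): r AND s. IF p THEN q. Therefore IF p THEN (q AND s).
argument_8 = ([("and", "r", "s"), ("if", "p", "q")],
              ("if", "p", ("and", "q", "s")))
print(deducible(*argument_8))
```

In this sketch only rule 3 returns more than one branch, which mirrors the point made above: the other six rules behave like forward rules, and the split is the one step that needs something like OR Elimination.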
Given this correspondence, we can turn a tree proof T into a natural-deduction structure N in roughly the following way: The original premises of the argument become the premises of N, and the conclusion of the argument becomes the conclusion of N. Following the reductio strategy imposed by the tree proof, we next set up a subdomain in N whose supposition is the negation of the conclusion. If this subdomain leads to contradictory sentences, then the conclusion must follow by NOT Elimination. We can then use the key forward rules to decompose the premises and the supposition in N, in the way that rules 1-2 and 4-7 of table 3.1 decompose the premises and the negated conclusion in T. In general, this will leave us with some undecomposed disjunctions. For any such disjunction,
P OR Q, we can use OR Elimination to form two subdomains within the domain of the negated conclusion, one with supposition P and the other with supposition Q. The subgoal of these subdomains will be a contradictory sentence, R AND NOT R, formed from one of the contradictory literals in T. Further disjunctions will result in additional pairs of subsubdomains embedded in both of those just formed. Procedure CP-I in the appendix gives an algorithm for mapping any tree proof to a natural-deduction structure. Sometimes the structure produced by CP-I will not quite be a proof as it stands, but it can be made into one by inserting one extra level of subdomains (as shown in CP-II of the appendix). As an illustration of this mapping between tree proofs and natural-deduction proofs, table 4.4 shows the natural-deduction proof that corresponds to the tree in figure 4.4. The premises of argument (8) appear at the beginning of the proof along with the sentences derived from them by the key forward rules. The first subdomain has as supposition the negation of the conclusion, NOT (IF p THEN (q AND s)), with the sentences that the forward rules produced from it directly beneath. The last sentence of this domain (line 34 in the table) is p AND (NOT p). The rationale for this domain lies in an application of NOT Elimination, an attempt to show that the conclusion follows by a reductio. Within the domain of the negated conclusion are two immediate subdomains, the first with supposition NOT p and the other with supposition q, each of which has p AND (NOT p) as its final line. We can think of these domains as the result of applying OR Elimination to the sentence (NOT p) OR q, which appears above in line 5. The intent is to show that p AND (NOT p) follows from the disjunction by showing that it follows from NOT p and also from q. Most of the remaining subdomains are related to OR Elimination in the same way.
The innermost subdomains of table 4.4 (lines 18-19, 21-22, 26-27, and 29-30) are not part of the structure produced by the mapping procedure CP-I, but have been added to this structure to justify the line that follows. The natural-deduction proof that we construct from a tree in this way is the canonical proof (referred to as N' in the appendix). Each line of the canonical proof in table 4.4 can be justified using PSYCOP+'s rules, though it remains to be shown that this is possible in the general case. The justifications appear in the rightmost column. The canonical proof is not the most elegant proof of argument (8) within our system; figure 4.1 shows



Table 4.4 The canonical natural-deduction proof for argument (8).

 1. IF p THEN q                        Premise
 2. r AND s                            Premise
 3. r                                  AND Elim. (2)
 4. s                                  AND Elim. (2)
 5. (NOT p) OR q                       Cond. Transformation (1)
 6.   +NOT (IF p THEN (q AND s))       Supposition
 7.   p                                Neg. Cond. Transformation (6)
 8.   NOT (q AND s)                    Neg. Cond. Transformation (6)
 9.   (NOT q) OR (NOT s)               DeMorgan (NOT over AND) (8)
10.     +NOT p                         Supposition
11.       +NOT q                       Supposition
12.       p AND (NOT p)                AND Intro. (7, 10)
13.       +NOT s                       Supposition
14.       p AND (NOT p)                AND Intro. (7, 10)
15.     p AND (NOT p)                  OR Elim. (9, 11-14)
16.     +q                             Supposition
17.       +NOT q                       Supposition
18.         +p                         Supposition
19.         q AND (NOT q)              AND Intro. (16, 17)
20.       NOT p                        NOT Intro. (18, 19)
21.         +NOT p                     Supposition
22.         q AND (NOT q)              AND Intro. (16, 17)
23.       p                            NOT Elim. (21, 22)
24.       p AND (NOT p)                AND Intro. (20, 23)
25.       +NOT s                       Supposition
26.         +p                         Supposition
27.         s AND (NOT s)              AND Intro. (4, 25)
28.       NOT p                        NOT Intro. (26, 27)
29.         +NOT p                     Supposition
30.         s AND (NOT s)              AND Intro. (4, 25)
31.       p                            NOT Elim. (29, 30)
32.       p AND (NOT p)                AND Intro. (28, 31)
33.     p AND (NOT p)                  OR Elim. (9, 17-32)
34.   p AND (NOT p)                    OR Elim. (5, 10-33)
35. IF p THEN (q AND s)                NOT Elim. (6-34)





the way PSYCOP would normally prove an argument of this form. So the lengthy nature of the proof in table 4.4 should not be taken as an argument for the tree method over natural deduction. Our goal at this point is a purely theoretical one: to find a uniform procedure for getting from tree proofs to natural-deduction proofs, not to find a quick one.

Canonical Proofs II: Canonical proofs are deducible via PSYCOP+'s rules

Of course, in order to show that the structure N that our translation method produces can be turned into a legitimate proof in PSYCOP+, we must show that each of its sentences is either a premise or a supposition, or else that it follows from the premises and the suppositions by one of the rules in this system. What enables us to do this, of course, is the correspondence between the tree rules and PSYCOP+'s rules. As a preliminary, it is helpful to recognize that if a sentence is copied from the tree proof to the natural-deduction structure by the algorithm then all the sentences that were used to derive that sentence in the tree are also available for its derivation in the natural-deduction proof. This result is called the Inheritance Lemma in the appendix, where a proof of it is given. As an example, the eighth line of figure 4.4, (NOT p) OR q, is derived in the tree proof by applying rule 5 to the premise IF p THEN q. (NOT p) OR q also appears in the natural-deduction proof of table 4.4 at line 5, where it is shown as derived from the same premise by the Conditional Transformation rule. Given the Inheritance Lemma, we can easily see that any sentence S in tree T that our algorithm copied to the natural-deduction structure N is either a supposition or a premise in N, or else that it follows from the suppositions and the premises by one of the key forward rules. That is, each of these sentences can be legitimately derived in N by a rule in PSYCOP+. However, as was remarked above, the structure N is not yet a canonical proof.
The canonical proof N' contains some lines that are not in T: the last lines of the various domains. For example, the conclusion of the proof in table 4.4 and the sentences p AND (NOT p) are not part of the tree proof. To justify these lines, we need to employ the backward rules OR Elimination, AND Introduction, NOT Elimination, and NOT Introduction. (Examples of the uses of all four occur in table 4.4.) As was noted above, this justification process can involve the addition of one further level of subdomains (e.g., that in lines 18-19 of table 4.4). However, proof CP-II in the appendix shows that each line of the canonical proof so constructed follows by PSYCOP+'s rules.



Completeness and Decidability of PSYCOP+

Any proof that PSYCOP or PSYCOP+ produces will be sound: If it can find a proof for an argument, then that argument will be valid in classical sentential logic. This is a consequence of the fact that all the rules in tables 4.1-4.3 are special cases of rules that are known to be sound in the classical system. What we saw in the previous subsection also goes a long way toward showing that PSYCOP+ is a complete proof procedure. Since every valid argument has a tree proof and since every tree proof corresponds to a canonical proof of the same argument, all that is missing for completeness is to show that PSYCOP+'s control structure allows it to discover the canonical proof. Of course, it is perfectly all right if PSYCOP+ happens to find a simpler proof of a valid argument. All we require is that PSYCOP+ be able to find the canonical proof if it hasn't found an easier one. Suppose PSYCOP+ is given a valid argument to evaluate, and let N' denote a canonical proof of this argument. PSYCOP+'s first step is to run its forward rules on the premises. If it is lucky, this first step will yield the conclusion, and this will amount to a proof of the argument (though, of course, it will not be proof N'). In the general case, however, the conclusion won't be forthcoming. As has already been shown, forward rule application must eventually stop, and at that point PSYCOP+ will try a backward rule. The rule that leads to N' is NOT Elimination, but another of the rules may apply first in this situation. This rule will lead either to a different proof of the argument or to a situation in which no further rules apply. (Infinite searches are barred by the halting theorem proved above.) In the latter case, PSYCOP+ will apply another of its backward rules, continuing in this way until it either finds a novel proof or attempts to apply NOT Elimination.
A check of the conditions on NOT Elimination in table 4.2 shows that there are no barriers to the rule in this situation. NOT Elimination itself allows a choice of which contradiction it will attempt to prove. (See step c in table 4.2.) However, since there are only a finite number of candidates, it will eventually hit on the critical one, say R AND (NOT R), if it hasn't already found a proof on the basis of some other contradiction. At this point, PSYCOP+ will set up a new domain whose supposition is the negation of the conclusion and whose subgoal is R AND (NOT R), and it will then apply its forward rules to the new set of assertions. Repeating the argument of the last two paragraphs at each
stage of the proof shows that PSYCOP+ will (if it has found no earlier proof) eventually apply the right sequence of backward rules to yield proof N'. This implies that PSYCOP+ is indeed a complete proof procedure. Moreover, we have proved that PSYCOP+ will halt for any argument whatever. If the argument happens to be invalid, it must halt without achieving a proof, since its rules are sound. This means that PSYCOP+ is also a decision procedure for classical sentential logic.
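For comparison, the standard semantic test of classical validity is a truth table: an argument is valid just in case no assignment of truth values makes every premise true and the conclusion false. The sketch below is not PSYCOP+ and uses none of its rules; it is a minimal reference check, with a hypothetical tuple encoding of sentences and hypothetical function names, against which the verdicts of any sound and complete decision procedure for sentential logic should agree.

```python
from itertools import product

# Sentences are nested tuples; atomic sentences are plain strings.
# ("not", P), ("and", P, Q), ("or", P, Q), ("if", P, Q)

def atoms(sentence):
    """Collect the atomic sentence types that occur in a sentence."""
    if isinstance(sentence, str):
        return {sentence}
    return set().union(*(atoms(part) for part in sentence[1:]))


def value(sentence, world):
    """Classical truth value of a sentence under an assignment."""
    if isinstance(sentence, str):
        return world[sentence]
    op = sentence[0]
    if op == "not":
        return not value(sentence[1], world)
    if op == "and":
        return value(sentence[1], world) and value(sentence[2], world)
    if op == "or":
        return value(sentence[1], world) or value(sentence[2], world)
    if op == "if":
        return (not value(sentence[1], world)) or value(sentence[2], world)
    raise ValueError(f"unknown connective: {op!r}")


def valid(premises, conclusion):
    """True if no assignment makes the premises true and the conclusion false."""
    names = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for bits in product([True, False], repeat=len(names)):
        world = dict(zip(names, bits))
        if all(value(p, world) for p in premises) and not value(conclusion, world):
            return False
    return True


# Argument (8) from this chapter.
print(valid([("and", "r", "s"), ("if", "p", "q")],
            ("if", "p", ("and", "q", "s"))))
```

The loop visits 2^n assignments for n atomic sentence types, so it always terminates, which is the semantic counterpart of the halting property established above.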

Summary

The PSYCOP model is a set of representational and processing assumptions that attempts to explain deductive reasoning. This chapter has surveyed PSYCOP's core theory: its memory, sentential inference rules, and control structure. In essence, PSYCOP has a working memory and a long-term memory containing sentences that are interrelated by deduction links and dependency links. The former connect sentences to their direct entailments; the latter connect premises or suppositions to direct or indirect entailments. The entailment relation is established by the inference rules, which divide into our now-familiar forward and backward types. These rules include many natural-deduction principles well known from elementary logic, but they are specialized in a way that keeps them from spawning irrelevant inferences. PSYCOP applies these rules in the context of a fairly simple control structure: The forward rules create (breadth first) a web of assertions in working memory, while the backward rules generate (usually depth first) a succession of subgoals aimed at the web. The program halts if it finds a match between assertions and subgoals that suffices for a proof or if its rules no longer apply. Our investigation of PSYCOP's mathematical properties demonstrated that if we give the program an argument to evaluate it will always halt, but not necessarily with the answer sanctioned in classical logic. Although PSYCOP produces no false alarms, its hit rate for classically valid arguments isn't perfect. For certain sorts of conditional arguments, such as (7), PSYCOP will stop without having found a proof. This means that PSYCOP is incomplete for classical logic, but there is a sense in which it is close: By adding a pair of rules that translate conditionals into truth-functionally equivalent expressions (i.e., the rules of table 4.3), we get both completeness and decidability. Accepting these somewhat unintuitive
rules, then, is one measure of the cost of achieving classical completeness. This result must be interpreted cautiously, though, since completeness could be achieved by other means. For example, we can get a complete system by relaxing some of the conditions on PSYCOP's rules in tables 4.1 and 4.2, and it is possible that some such alteration could yield a computationally and psychologically plausible theory. In the absence of an acceptable alternative, however, we will assume that human reasoners may not be perfect classical-logic machines, even when we abstract over time and memory limitations, as we have in the present chapter.

Appendix: Proofs of the Major Propositions

HP-I: PSYCOP+'s rules create only finitely many domains for a given argument.

Proof. A perusal of the IF Introduction, OR Elimination, NOT Introduction, and NOT Elimination rules shows that any subdomain that they produce has a supposition S and an initial goal G that are tightly constrained. For example, the IF Introduction rule produces a subdomain where both S and G are subformulas of the premises and the conclusion of the argument. In general, across all four rules, S and G are either (a) subformulas of the premises and conclusion, (b) the negation of such a subformula, or (c) a sentence of the form Q AND NOT Q, where Q is such a subformula. However, there are only a finite number of premises, and the conclusion and premises are each of finite length. Thus, there are only a finite number of sentences of type a, and for this reason the number of sentences of types b and c must be finite too. Consider a particular domain D in some proof. The conditions of the four domain-creating rules require that D's immediate subdomains have distinct pairs of supposition and goal. (See chapter 2 for the definition of immediate subdomain.) That is, if D1 is an immediate subdomain of D with supposition S1 and goal G1, and D2 is another immediate subdomain with supposition S2 and goal G2, then it cannot be the case that both S1 = S2 and G1 = G2. We have just seen that the set of possible sentences that can serve as supposition or goal is finite; hence, the number of possible pairs (Sj, Gj) is finite too. It follows that there can be only finitely many immediate subdomains in D. This limitation on the number of immediate subdomains means that if the rules produce an infinite number of domains then this can occur only
through embedding one subdomain within another. The initial domain D must have immediate subdomain D1, which itself has immediate subdomain D2, which has immediate subdomain D3, and so on. Figure 4.3c illustrates embedding of subdomains to five levels. (The fact that there must be an infinite nesting of subdomains in an infinite backward search follows from König's lemma, which states that any infinite tree in which each point has only finitely many branches must have at least one infinite path. See Smullyan 1968 for a proof.) However, the conditions on the rules make this infinite nesting impossible. As above, the rules require each domain in such a series to have a distinct pair of supposition and goal; but since there are only finitely many candidates for these pairs, the sequence of nested domains must end. Thus, PSYCOP+ can produce only a finite number of domains in attempting to prove any given argument.

HP-II: The backward rules produce only a finite number of subgoals within a domain.

Proof. Consider a situation in which PSYCOP+ is given a (finite) set of assertions Σ that hold in some domain D and a goal G0 in that domain. An infinite backward search within D means that PSYCOP+ would produce a succession of subgoals G1, G2, ..., where G1 is a subgoal of G0, G2 is a subgoal of G1, and so on, all within D. (We are again relying on König's lemma.) Each Gi must be unique, since the rules do not allow duplicate subgoals. Notice that Backward IF Elimination, AND Elimination, Double Negation Elimination, Disjunctive Modus Ponens, Disjunctive Syllogism, Conjunctive Syllogism, and the DeMorgan rules can apply only a finite number of times in this situation. This is because any subgoal they produce must be a subformula of some assertion in Σ (see table 4.2), and there are only a finite number of subformulas from that set. The only remaining single-domain rules are Backward AND Introduction and OR Introduction.
Thus, after some stage k, all the subgoals Gk+1, Gk+2, ..., would have to be produced by one of these two rules. However, a glance at the statement of these rules in table 4.2 shows that the subgoals they generate must be subformulas of the preceding goal. Thus, Gk+1 must be a subformula of Gk, Gk+2 must be a subformula of Gk+1, and so on. Since the subformula relation is transitive, Gk+1, Gk+2, ..., must all be unique subformulas of Gk. But this is impossible, since Gk has only finitely many unique subformulas.
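The finiteness claim at the end of this proof can be made concrete by enumerating the subformulas of a goal. The sketch below is illustrative only: the tuple encoding of sentences and the function name `subformulas` are assumptions made for the example, not part of the program.

```python
# Sentences are nested tuples; atomic sentences are plain strings.
# ("not", P), ("and", P, Q), ("or", P, Q), ("if", P, Q)

def subformulas(sentence):
    """Return the set of distinct subformulas of a sentence, itself included."""
    found = {sentence}
    if not isinstance(sentence, str):
        for part in sentence[1:]:
            found |= subformulas(part)
    return found


# Goal: IF p THEN (q AND s)
goal = ("if", "p", ("and", "q", "s"))
for sub in sorted(subformulas(goal), key=str):
    print(sub)
```

Each call descends into strictly smaller parts of the sentence, so the set it returns is finite. A chain of unique subgoals that are all subformulas of the same goal can therefore be no longer than the size of this set.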


HP-III: The forward rules produce only a finite number of assertions within a domain.

Proof. Call a sentence produced by one of the forward rules (tables 4.1 and 4.3) a conclusion for that rule, and the sentence or sentences that trigger the rule its premises. (E.g., Forward IF Elimination has as premises IF P THEN Q and P and has as conclusion Q.) We also need to distinguish between an atomic sentence token that might occur in a complex sentence and an atomic sentence type. The former is a specific appearance or occurrence of an atomic sentence; the latter is the class to which these occurrences belong. Thus, there are two tokens of the atomic sentence type Calvin deposits 50 cents in the complex sentence IF Calvin deposits 50 cents AND Calvin is hungry THEN Calvin deposits 50 cents. In these terms, the number of atomic sentence tokens in the conclusion of a forward rule is no greater than the number of tokens in the set of its premises. (In the case of IF Elimination, for example, the conclusion Q appears in the premise IF P THEN Q. For this reason, the number of atomic sentence tokens in Q must be less than or equal to the number in IF P THEN Q and thus less than or equal to the number in the set of its premises {P, IF P THEN Q}.) Furthermore, the forward rules don't introduce any new atomic sentence types, because the conclusions of these rules contain only atomic sentence types that are already contained in the premises. (For example, the conclusion of IF Elimination is Q, which is already part of IF P THEN Q, so any atomic sentence type in the former is also in the latter.) Now consider a particular domain D, and the finite set of sentences Σ that hold in that domain before application of the forward rules. Let m be the number of atomic tokens in Σ and n the number of atomic types (n ≤ m). From the first of the two observations above it follows that any sentence produced by the forward rules in D can contain no more than m atomic sentence tokens, and from the second observation it follows that the number of types from which these tokens can be drawn is n.
For example, if there are a total of three atomic tokens in Σ, each belonging to one of two types (say, p and q), then the forward rules can produce a sentence that is at most three tokens long, each token being either p or q. If our language contained only the binary connectives AND, OR, and IF ... THEN, this would be sufficient to prove the theorem: Each sentence contains at most m atomic tokens, and hence at most m - 1 binary
connectives. Thus, there are only a finite number of distinct sentences that can be produced by using these connectives to bind the tokens into sentences. (In our three-token, two-type example, there are 144 distinct sentences that are three tokens long, 12 sentences that are two tokens long, and two that are one token long.) However, the connective NOT introduces a complication. Unless the number of NOTs is limited, new sentences can continue to be produced even when these sentences have a constant number of atomic tokens (e.g., NOT p, NOT NOT p, etc.). Inspection of the rules shows, though, that they can produce sentences with only a finite number of NOTs. The only forward rules that contain NOT in their conclusion are DeMorgan (NOT over OR), Conjunctive Syllogism, Negated Conditional Transformation, Conditional Transformation, and DeMorgan (NOT over AND). The first three of these generate conclusions that have no more NOTs than appear in their premises. Conditional Transformation and DeMorgan (NOT over AND) are the only forward rules that allow PSYCOP+ to produce a sentence with more NOTs than any sentence that appears in Σ. The first of these applies to sentences of the form IF P THEN Q and the second to sentences of the form NOT (P AND Q). Notice, however, that none of the forward rules produce sentences that add an AND or an IF ... THEN. Thus, the total number of NOTs that can be produced by Conditional Transformation is limited to the number of IF ... THENs contained in Σ, and the total number of NOTs that can be produced by DeMorgan (NOT over AND) is limited to the number of ANDs in Σ. Thus, if x is the number of NOTs in Σ, y the number of ANDs, and z the number of IF ... THENs, no sentence produced by the forward rules can have more than x + y + z NOTs. Thus, any sentence produced by the forward rules can have no more than m atomic sentence tokens, m - 1 binary connectives, and x + y + z NOTs.
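The counts in the three-token, two-type example can be checked with a short recursion. The sketch assumes that two sentences are distinct whenever they differ in bracketing, connective, or atom (so p AND q and q AND p count separately) and that NOT is excluded, as in the passage; the function name `count_sentences` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_sentences(tokens, types=2, connectives=3):
    """Count NOT-free sentences with exactly `tokens` atomic tokens.

    A one-token sentence is one of the atomic types. A longer sentence
    joins a left part and a right part with one binary connective, so we
    sum over every way of dividing the tokens between the two parts.
    """
    if tokens == 1:
        return types
    return connectives * sum(
        count_sentences(left, types, connectives)
        * count_sentences(tokens - left, types, connectives)
        for left in range(1, tokens)
    )


for m in (1, 2, 3):
    print(m, count_sentences(m))
```

With two types and the three connectives AND, OR, and IF ... THEN, the recursion gives 2, 12, and 144 sentences of one, two, and three tokens, matching the figures in the text. The count grows quickly with m but is finite for every m, which is all the proof requires.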
But the number of distinct sentences meeting these upper bounds must be finite. Hence, the forward rules can produce only finitely many new sentences.

CP-I: An algorithm for translating tree proofs to natural-deduction structures.

In general, the tree method leads to a structure similar to that shown at the top of figure 4.5, where P1, P2, ..., Pk are the premises of the argument to be evaluated, NOT C is the negation of its conclusion, and the Qs are the result of applying the rules in table 3.1.

[Figure 4.5: The general form for tree proofs (a) and for the analogous natural-deduction structure (b).]

In particular, each branch
in the tree is the result of applying rule 3 to a disjunction that occurs higher up. Define a branch sentence to be one that appears immediately beneath the points where the tree splits; thus, Q2,1,1, Q2,2,1, and Q3,1,1 are all branch sentences. Let pivot sentence refer to any sentence that is either a branch sentence, a premise, or the negated conclusion. A final piece of terminology: By derivation of a sentence Q in tree T, we mean the sequence of sentences S1, S2, ..., Sk, where S1 is a premise or the negated conclusion, Sk = Q, and Si+1 was derived in tree T from Si by the application of one of the rules in table 3.1.

Suppose we have an arbitrary tree T for some deducible argument. All paths of T will be closed, and it will have the general form shown in figure 4.5. The algorithm below constructs a natural-deduction structure N that is analogous to T. The bottom half of figure 4.5 shows the basic details of N in our old indented notation for natural deduction, and the steps in constructing it are these:

(1) Let the outermost (most superordinate) domain of N include the premises P1, P2, ..., Pk of T as the first k lines and the conclusion C as the last line.

(2) Let the first subdomain include the negated conclusion, NOT C, as its supposition. As the last line, put the conjunction R AND (NOT R), where R is an atomic proposition and both R and NOT R appear in some path in T. (Since T is closed, there must be such an R.)

(3) Insert in the subdomain created in step 2 two further subdomains: one with supposition Q2,1,1 and the other with supposition Q2,2,1 from T. The last line of each of the new subdomains is again R AND (NOT R). (See figure 4.5b.)

(4) Continue embedding subdomains in this way: If Qi,j,1 is the supposition for some domain D in N and if the first branch in T below Qi,j,1 leads to the branch sentences Qi+1,k,1 and Qi+1,k+1,1, then place two new subdomains within D. Qi+1,k,1 will be the supposition of the first, and Qi+1,k+1,1 will be the supposition of the second. Both will have R AND (NOT R) as the last line.

(5) Take the first sentence in T that is not a pivot. The derivation of this sentence must have begun with either a premise or the negated conclusion, and may have included one or more branch sentences. We then have the following (mutually exclusive) possibilities: (a) If the derivation included a premise but no branch sentence, copy the sentence in the outermost domain of N (i.e., the main proof) beneath any sentences in its derivation. (b) If the derivation included the negated conclusion but no branch sentence, copy the sentence into the second domain (i.e., the one whose supposition is NOT C) beneath any sentences in its derivation. (c) If the derivation involved a branch sentence, copy the sentence into the domain whose supposition corresponds to the lowest branch sentence in its derivation, again beneath any sentences in that derivation. Repeat this step for each of the remaining nonpivots.

Inheritance Lemma: Suppose sentence Q is copied by the above algorithm from tree T to some domain D in the natural-deduction structure N. Then all sentences in the derivation of Q in T also hold in D.

Proof. The proof is by induction on the length of the derivation of Q in T: In the first place, the derivation must begin with one of the premises or with the negated conclusion, and steps 1 and 2 ensure that these sentences hold in D. Let us assume, then, that the first k sentences of the derivation hold in D. If sentence k + 1 is a pivot, then it also holds in D, given the procedure in steps 3 and 4. If k + 1 is a nonpivot, step 5 will have placed it in N in a domain whose supposition is a pivot among the first k sentences. Since this pivot holds in D, so does a nonpivot within the same domain.

CP-II: Canonical proofs are deducible via the rules of PSYCOP+.

Proof. Algorithm CP-I, above, is a procedure for mapping a tree proof T into a natural-deduction structure N. However, to turn N into the final canonical proof N', we must usually add a few additional lines. To show that N' is a legitimate proof, we first demonstrate that each line in N that came from T follows by one of PSYCOP+'s rules. We then specify the remaining lines of N' and show that these lines also follow by the rules.
Let T be a tree proof and N be the natural-deduction structure created from it by algorithm CP-I. Any sentence S in N that was copied from T is deducible in N by the PSYCOP+ rules. If the sentence is a pivot in T, steps 1-4 guarantee that it will be a premise or a supposition in N, and hence its deductive status is not in question. If the sentence is not a pivot, it must have been derived in T from a higher sentence by one of the rules in table 3.1. The rule in question cannot be rule 3, since this would make
the sentence a pivot; hence, the rule must correspond to one of the key natural-deduction rules mentioned earlier. By the Inheritance Lemma, all the sentences in the derivation of S must also hold in S's domain. Thus, we can rederive S in N using a rule of PSYCOP+. The canonical proof N' also contains sentences that were not in T, including each domain's final line. We still need to show that these lines follow according to the natural-deduction rules. For this purpose, we will use just four of the backward rules from table 4.2: AND Introduction, NOT Introduction, NOT Elimination, and OR Elimination. Take the most embedded pair (or pairs) of domains in N constructed in steps 3 or 4 above. (The special case in which N contains no such pairs will be considered below.) The final sentence of these domains is R AND (NOT R). By construction, the sentences that hold in each of these domains correspond to those on some complete path through T. Since all paths in T are closed, this path contains some atomic sentence S and its negation NOT S. If S = R, then the sentence R AND (NOT R) can be derived in this domain by AND Introduction. If S ≠ R in one of the domains D, then we embed one further pair of subdomains in D. One of these subdomains consists of just two sentences, supposition R and goal S AND (NOT S); the other consists of supposition NOT R and goal S AND (NOT S). Since both S and NOT S hold in these subdomains, S AND (NOT S) can be derived in them (by AND Introduction again). Because of the first subdomain, we can derive NOT R in D by NOT Introduction; because of the second, we can derive R in D by NOT Elimination. (Inspection of the statement of these rules in table 4.2 shows that all their conditions are satisfied.) Hence, R AND (NOT R) holds in D, too, by AND Introduction. The structure derived from N by the possible addition of these subdomains is the canonical proof, N'.
Thus, the final lines follow in each member of the most embeddeddomain pairs constructed in steps 3- 5. It remains to be shown that this is also true for the rest of the domains. We do this by working outward from the most embeddedsubdomainsjust considered. That is, take the domainD * , which is the immediate superdomainof the most embeddedpairs. The in the subdomains , say QiJ,l and Q' ,J+l ,l ' must have been suppositions branch sentencesin T , derived by rule 3 from the disjunction Q' ,J,l OR Q' ,J+l ,l . This disjunction was copied to N along with the rest of the T sentences , and by the Inheritance Lemma the disjunction must also hold



in D*. Thus, since the sentence R AND (NOT R) follows in both subdomains, this same sentence must be derivable in D* by OR Elimination. We can repeat this reasoning with all embedded domain pairs. For example, if D* itself happens to be a member of such a pair, then R AND (NOT R) must also be deducible in the other member of the pair; hence, it will follow by OR Elimination in D*'s immediate superdomain. By iterating this process, we will eventually be able to derive R AND (NOT R) in the outermost pair of domains (i.e., the ones whose suppositions are Q2,1,1 and Q2,2,1 in figure 4.5b). Thus, with one more application of OR Elimination, R AND (NOT R) must follow in the domain of the negated conclusion. (In the special case in which there are no embedded pairs of subdomains created in steps 3 and 4, T has no branches and consists of a single path. Since R and NOT R are both in T, they must be in the path and thus hold in the subdomain of the negated conclusion. Therefore, R AND (NOT R) follows in this domain by AND Introduction.) Finally, C itself must follow in the premises' domain by NOT Elimination. The construction N' is then a correct proof of C from P1, P2, ..., Pi via PSYCOP+'s rules.


Mental Proofs and Their Empirical Consequences

This chapter is devoted to experimental results that compare the model's performance to that of human reasoners. If the general theory is right, mental proofs may figure in many different kinds of cognitive tasks. As we saw in the third chapter, deduction may play a central role in planning or in question answering, which at first glance are not specifically deduction problems. Theoretically, then, we could use findings about such cognitive skills to test our theory of mental proofs. There is something to be gained, however, from beginning with experiments that focus on more obviously deductive tasks, for example, tasks in which subjects evaluate the deductive correctness of arguments. In the first place, these tasks are the traditional ones in the psychology of reasoning (see chapter 1) and so provide some common ground for comparing our theory with alternative proposals. Second, these tasks allow us some flexibility in testing the theory, since we can single out particular inferences for examination. Of course, we need to proceed cautiously. I assume that subjects bring to bear rules that they judge appropriate for solving math-type puzzles of this sort (the ones I called demoted rules in chapter 2). These derive from the abstract rules in tables 4.1 and 4.2, but exactly which of these rules subjects actively employ may be open to individual differences and pragmatic pressures. Moreover, instructing subjects to decide whether an argument is "logically valid" or whether the conclusion "necessarily follows" from the premises doesn't guarantee that they will understand these instructions in the intended way. We can't always rule out the possibility that processes other than deduction interfere with the results. Still, these deduction tasks are probably as close as we can come to pure tests of the theory, and so they provide a good empirical base from which to begin.
In the first section of this chapter, I review a number of experiments that I designed to test the theory of chapter 4 (or a closely related theory). In these experiments subjects attempted to evaluate the validity of arguments, or to follow proofs or remember proofs presented to them. Taken as a group, the experiments provide a fairly broad test of the core assumptions. In the second section, I consider whether the PSYCOP theory is consistent with earlier research on simpler problems involving conditional and negative sentences, which were briefly encountered in chapter 1. Although I will discuss some additional experiments with PSYCOP in chapter 7, the present set of findings should provide a "consumer report" on the model's empirical strengths and weaknesses.



Some Tests of the Proof Model

PSYCOP consists of too many assumptions to test with any precision in a single experiment or type of experiment. The following studies therefore highlight different aspects of the model (inference rules, real-time processing, and memory structure) that combine to produce subjects' responses. Although some of these studies were designed as specific tests of PSYCOP, we can begin with a look at an older experiment that was conceived as a test of an earlier version of the model. The PSYCOP theory grew out of a somewhat more restricted model of propositional reasoning called ANDS (for A Natural-Deduction System), which I developed in the early 1980s (Rips 1983; Rips and Conrad 1983). To a certain extent, PSYCOP can be considered an extension of that earlier theory, and so some of the experiments based on ANDS will also serve as tests of PSYCOP.

Evidence from Evaluation of Arguments

The traditional experiment in the psychology of reasoning is one in which subjects study a set of arguments and decide which of them are valid. This paradigm dates at least as far back as Storring's (1908) experiments; in view of this long history, it might be useful to show that the model can predict its results. In order to conduct such a test, I assembled the 32 problems listed in table 5.1. (The table uses "&" for AND, "v" for OR, "-" for NOT, and "→" for IF ... THEN.) These problems are all deducible in classical sentential logic and are also deducible in the model by means of rules very similar to PSYCOP's. The critical rules for these problems (see tables 4.1 and 4.2) are IF Elimination, DeMorgan (NOT over AND), Disjunctive Syllogism, Disjunctive Modus Ponens, AND Elimination, AND Introduction, OR Introduction, NOT Introduction, and OR Elimination. The reason for singling out these rules was that they constituted the inference schemas for the earlier ANDS theory; however, all of them also appear among PSYCOP's rules.
Although the experiment does not test the full range of PSYCOP's inference skills, it does test an important subset. The arguments were constructed so that each could be proved by means of three rules selected from the above list. In addition to these deducible arguments, there were also 32 nondeducible ones, created by recombining the premises and conclusions of the first set. Finally, 40 filler arguments were added to the ensemble; most of these were simple deducible problems.


Table 5.1
Predicted and observed percentages of correct "necessarily true" responses to deducible arguments. (Columns: Argument, Observed, Predicted.)




The subjects in this study saw the arguments in a single randomized list. For each argument, they were to circle the phrase "necessarily true" beneath the problem if the conclusion had to be true whenever the premises were true, and the phrase "not necessarily true" otherwise. Subjects responded to each problem in this way even if they had to guess. For half of the subjects, the arguments appeared in instantiations having to do with the location of people in cities. For example, argument E in table 5.1 would have looked like (1).

(1) If Judy is in Albany or Barbara is in Detroit, then Janice is in Los Angeles.
    ----
    If Judy is in Albany or Janice is in Los Angeles, then Janice is in Los Angeles.

The remaining subjects saw the same problems rephrased in terms of the actions of hypothetical machines. Thus, for these subjects, the sample argument appeared in the following guise:

(2) If the light goes on or the piston expands, then the wheel turns.
    ----
    If the light goes on or the wheel turns, then the wheel turns.

The subjects in both groups were students or nonstudents of approximately the same age. None of them had taken a formal course in logic. (This holds true, as well, for all the experiments reported in this book.)

According to our theory, subjects should correctly respond that the conclusion of such an argument is necessarily true if they can construct a mental proof of that conclusion, and success in doing so will obviously depend on whether they can muster all the inference rules needed to complete the proof. As a working assumption, we will suppose that errors on these problems are due to a failure to apply the rules. The failure may be due to retrieval difficulties, slips in carrying out the steps of the rule, failure to recognize the rule as applicable in the current context, or other factors. In general, we can think of each rule $R_i$ as associated with a probability $p_i$ that the rule will be available on a given trial.
This means that there will be some occasions on which $R_i$ would be useful in completing a proof but is not available to the subject. On these occasions, the subject will have to search for an alternative proof that uses rules other than $R_i$. (Such alternatives are sometimes possible because of the redundancy of the system.) If no such alternative exists, we will assume that the subject either guesses at the answer (with probability $p_g$) or simply responds incorrectly that the conclusion does not necessarily follow (with probability $1 - p_g$). For example, when all the rules listed above are available, the model will prove (1) or (2) using a combination of IF Introduction, OR Elimination, and Disjunctive Modus Ponens. If these rules are available with probabilities $p_1$, $p_2$, and $p_3$, respectively, then (assuming independence) the probability of a correct "necessarily true" response might be (3), where the first term is the probability of a correct mental proof and the second term reflects a correct guess after failure to find the proof.

(3) $P(\text{``necessarily true''}) = p_1 p_2 p_3 + 0.5\,p_g\,(1 - p_1 p_2 p_3)$

This equation is not quite right, however, since the model can still find a proof of these arguments even if Disjunctive Modus Ponens is missing. OR Introduction and IF Elimination can combine to fill the same role played by the unavailable rule. (All the remaining rules are necessary for the problem, since omitting them keeps the model from producing any proof at all.) To correct for this alternative derivation, we must add some new terms to the equation. If $p_4$ is the probability that IF Elimination is available and $p_5$ is the probability that OR Introduction is available, then the proper expression is (4).

(4) $P(\text{``necessarily true''}) = p_1 p_2 p_3 + (1 - p_3)\,p_1 p_2 p_4 p_5 + 0.5\,p_g\,[1 - p_1 p_2 p_3 - (1 - p_3)\,p_1 p_2 p_4 p_5]$

The first term is again the probability of finding a proof by the original method, the second term is the probability of finding the alternative proof, and the third is the probability of a correct guess.

To derive predictions from the model, then, we need two pieces of information about each of the arguments in table 5.1: the rules that are used in a proof of that argument, and the probability that each of these rules will be available. We can obtain the first type of information by simulation, giving the model the argument and inspecting the proof to find which rules it employs. These rules can then be omitted (singly and in combination) to determine whether there are alternative proofs. The process is then repeated until no new proofs are forthcoming. This simulation allows us to formulate an equation like (4) for each of the arguments. The rule availabilities can be estimated by treating them as parameters when fitting the resulting equations to the data.
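The way an expression like (4) arises from a set of alternative proofs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `p_necessarily_true`, the representation of each proof as a set of rule names, and the availability values in the example are all hypothetical (they are not the fitted estimates), and rule availabilities are assumed to be independent, as in the text.

```python
from itertools import product

def p_necessarily_true(proofs, avail, p_guess):
    """Probability of a correct "necessarily true" response.

    proofs  -- list of sets of rule names; each set is one way to prove the argument
    avail   -- dict: rule name -> availability probability (assumed independent)
    p_guess -- probability of guessing after failing to find any proof
    """
    rules = sorted(set().union(*proofs))
    p_proof = 0.0
    # Enumerate every pattern of available / unavailable rules.
    for pattern in product([True, False], repeat=len(rules)):
        available = {r for r, on in zip(rules, pattern) if on}
        weight = 1.0
        for r, on in zip(rules, pattern):
            weight *= avail[r] if on else 1.0 - avail[r]
        # A proof is found if every rule of at least one proof is available.
        if any(proof <= available for proof in proofs):
            p_proof += weight
    # A guess is correct half the time.
    return p_proof + 0.5 * p_guess * (1.0 - p_proof)

# Hypothetical availabilities, chosen only to exercise the function.
avail = {
    "IF Introduction": 0.9,            # p1
    "OR Elimination": 0.8,             # p2
    "Disjunctive Modus Ponens": 0.7,   # p3
    "IF Elimination": 0.6,             # p4
    "OR Introduction": 0.5,            # p5
}
proofs = [
    {"IF Introduction", "OR Elimination", "Disjunctive Modus Ponens"},
    {"IF Introduction", "OR Elimination", "IF Elimination", "OR Introduction"},
]
example = p_necessarily_true(proofs, avail, p_guess=0.4)
```

With these two proofs the enumeration gives the same value as the closed form in (4), because the alternative proof contributes only on trials where Disjunctive Modus Ponens is unavailable.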

Table 5.1 gives the obtained and predicted percentages of correct "necessarily true" responses for the critical deducible problems. An analysis of variance of the data turned up no effect of the problem content (people in locations vs. machine actions) and no interaction of this factor with scores on the individual problems. Hence, the data from the two groups of subjects are combined in the table. The overall rate of correct responses is fairly low (50.6%), though there is obviously a very wide range across individual problems, from 16.7% correct on the hardest problem to 91.7% correct on the easiest. The percentage of incorrect "necessarily true" responses to the nondeducible problems (i.e., false alarms) was 22.9%. Thus, despite the low hit rate, subjects were distinguishing the deducible from the nondeducible items. In experiments like this one involving choice of alternative responses, the absolute response rate depends not only on subjects' accuracy but also on the criterion they adopt for a "necessarily true" response. Cautious subjects, for example, are likely to give somewhat low rates of positive responses, even though they are able to discriminate correct from incorrect arguments with reasonable accuracy. In accounting for these results, we therefore need to concentrate on the relative scores across the set of problems.¹

The predicted scores in the table are the result of fitting equations similar to (4) to the data. The full model requires a large number of availability parameters, since we need a distinct parameter for each inference rule. To reduce this number somewhat, we have collapsed forward and backward versions of a given rule, using the same parameter for both members of a pair. For example, the same parameter represented the availability of backward and forward IF Elimination. We also set the guessing parameter, $p_g$, using the data from the nondeducible problems: If subjects respond "necessarily true" to these nondeducible items only because of bad guesses, then the guessing rate (after failure to find a proof) should be twice the false-alarm rate, or 45.8%. (This is not the overall probability of guessing; it is the conditional probability of guessing rather than saying "not necessarily true" given that no proof was forthcoming.) These economy moves still leave the availability parameters listed in table 5.2, but there are 21 remaining degrees of freedom for a test of the model.² Although the fit of the model is difficult to summarize because of the varied nature of the problems, table 5.1 shows that the predictions are reasonably accurate. The correlation between predicted and observed scores, 0.93, yields a significant proportion of variance accounted for when tested



against the Problem × Subject interaction from an analysis of variance: F(10, 1054) = 26.43, p < 0.01. The residual variance is fairly small, but it is also significant because of the large number of residual degrees of freedom: F(21, 1054) = 1.88, p < 0.05.

The parameter estimates are those in table 5.2. For the most part, these parameters are what we might expect on the basis of the intuitive nature of the rules. The easiest rules are those that seem obviously correct, including AND Introduction, AND Elimination, and Disjunctive Modus Ponens. The most difficult rule is OR Introduction, which allows us to deduce sentences of the form P OR Q from P. Many subjects apparently fail to apply this rule, probably for pragmatic reasons (Grice 1989; Gazdar 1979; McCawley 1981; Pelletier 1977): The conclusion of such an inference contains information that may seem to be irrelevant to the premise on which it is based and thus to violate conversational conventions, as was discussed in chapter 2.³ (For more empirical evidence on the difficulty of OR Introduction, see Rips and Conrad 1983.)

One puzzling aspect of these estimates is the very low availability for NOT Introduction versus the relatively high availabilities for OR Elimination and IF Introduction. The routines for these rules in table 4.2 make them seem about equally complex: All three are backward rules that involve subdomains. So why should NOT Introduction be so much harder to apply? The difficulty might be explained as an artifact of the particular

Table 5.2
Parameter estimates for model as applied to argument-evaluation experiment.

Rule                          Estimate
Disjunctive Modus Ponens      1.000
AND Introduction              1.000
AND Elimination               0.963
IF Introduction               0.861
OR Elimination                0.858
IF Elimination                0.723
DeMorgan (NOT over AND)       0.715
Disjunctive Syllogism         0.713
NOT Introduction              0.238
OR Introduction               0.197



sample of problems, but in fact we can also find evidence for such difficulty in other paradigms, as we will see in later subsections of this chapter ("Evidence from Proof Comprehension" and "Inferences with Conditionals") and in the chapters that follow. Evidently, the reductio strategy of assuming the opposite of what one wants to prove is not an obvious move for subjects who haven't had extensive mathematics training.⁴ A plausible guess is that this difficulty is related to the conceptual distance between the main goal of proving NOT P and the subgoal of proving a contradiction Q and NOT Q on the basis of P. By contrast, OR Elimination and IF Introduction seem more direct, more intimately tied to the goals and assertions that trigger them.

One way to see what the model buys us is to compare the fits just mentioned against what we can obtain using other possible measures of problem difficulty. We might expect, for example, that the greater the number of premises in an argument, the more difficult that argument would be to evaluate. Similarly, the greater the number of atomic sentences in the argument, the harder it should be. For instance, argument F' in table 5.1 contains four types of atomic sentences (r, s, t, and u), and seven atomic sentence tokens or occurrences. It should therefore be more difficult than B', which contains only two atomic sentence types and five tokens. In general, however, these measures of surface complexity fail to provide a good account of the data. The correlation between the percentage of correct responses and the number of premises is -0.23, and the correlations with number of types and tokens of atoms are -0.04 and 0.10, respectively. This suggests that the true difficulty of the arguments is associated with the inference patterns they display, where these patterns are close to those specified in PSYCOP.

Evidence from Proof Comprehension

Argument-evaluation experiments, such as the one just discussed, allow us to gather data quickly on a broad sample of problems. However, in these experiments each subject proceeds at his or her own pace, and hence the studies can't tell us much about the real-time properties of the model. To get a better test of these processing characteristics, we need an experiment in which we can time subjects as they carry out their inferences. One type of task that is useful for this purpose involves presenting subjects with consecutive parts of a proof and having them respond to each part under timed conditions. In general, the hypothesis is that those parts of the proof
Evidencefrom Proof Comprehension , allow us Argument-evaluation experiments, suchas the onejust discussed to gather data quickly on a broad sample of problems. However, in these experimentseach subject proceedsat his or her own pace, and hencethe ' studiescan t tell us much about the real-time properties of the model. To get a better test of theseprocessingcharacteristics, we needan experiment in which we can time subjectsas they carry out their inferences. One type of task that is useful for this purpose involves presenting subjects with consecutiveparts of a proof and having them respond to each part under timed conditions. In general, the hypothesisis that those parts of the proof



that require more effort from PSYCOP should also be the parts producing the slowest response times from subjects. To understand the details of the predictions, though, we need to step back for a moment and consider some further features of the model. So far, we have discussed the way that PSYCOP evaluates arguments by constructing proofs. The present task, however, requires PSYCOP to deal with proofs that arrive from external sources.

Proof Comprehension in PSYCOP

At first glance, we might expect proof comprehension to be no problem at all for the model. After all, PSYCOP can construct its own proofs in working memory; thus, if a proof is simply given to it in input, the model shouldn't have much difficulty understanding it. In fact, though, comprehending proofs is no small matter, as you will probably agree if you recall struggling to understand proofs in difficult mathematics texts. One reason for this is that most mathematical proofs (outside of elementary logic and geometry) don't present every step in the inference chain. Instead, they are usually "gappy proofs" or "enthymemes" that give only the main landmarks on the route to the conclusion, leaving the reader to fill in the rest. The problem that PSYCOP faces is the same: It must translate the externally presented proof into one that contains all the missing details and therefore shows explicitly how each of the proof's lines follows from the preceding ones. To avoid confusion, we can call the proof that is given to PSYCOP the external proof and the proof that the model creates in response (the one that it stores in working memory) the internal proof. One difference between them is that, whereas the external proof can be extremely gappy, the internal proof of the same argument will have all the inference steps intact if PSYCOP is successful in comprehending it. (See Singer et al. 1990 and 1992 for evidence that people also deduce information from their general knowledge in order to close the gaps in ordinary text.)
How does PSYCOP manage to fill the gaps in an external proof? As each sentence of the proof appears, PSYCOP makes a decision about how to handle it and reports that decision to the user. For example, the model may decide that the new sentence is a restatement of one it has already deduced; in that case, it will print out the message "I already know this to be true on the basis of these sentences ...," filling the blank with the names of the working-memory sentences from which it had earlier deduced the



item. Or it may judge that the input sentence is some new consequence that it hasn't yet inferred, and then try to deduce it from what it currently knows, using its stock of inference rules. If it succeeds, it will print out "I have deduced [the new sentence] ...," specifying the working-memory sentences immediately involved in the inference and the suppositions that were in effect at the time. PSYCOP can recall this information about memory sentences and assumptions, since it is represented in its deduction and dependency links (see chapter 4). If the model fails to deduce the new sentence, it will note "This doesn't seem to follow" and will hold the sentence in a pending state, trying to deduce it again after more information from the proof has appeared.

In coming to these decisions, PSYCOP makes use of phrases (discourse markers) in the input such as "suppose that," "under these circumstances," "this is because," and "therefore," as well as input cues that a new sentence "follows by" a certain rule or "follows from" an earlier theorem. These cues can often speed up PSYCOP's performance; however, the model will often make the right decisions in the absence of these cues, using its default assumptions. In the special case where the input consists of just the premises and the conclusion of an argument, PSYCOP will treat the text as a maximally gappy proof (one in which all the intermediate steps are omitted) and will proceed to try to deduce the conclusion in just the way discussed in chapter 4.

We can get a feel for how PSYCOP's proof interpretation works by considering how it responds to a simple external proof. The example we will use (which figures in the reaction-time experiment to be described later) concerns the argument shown here as (5), which is deducible by means of the rules in tables 4.1 and 4.2.

(5) IF q THEN NOT p.
    p.
    ----
    IF (IF NOT q THEN s) THEN s.

An external proof of this argument is shown in the dialogue (6), in the lines labeled a, c, e, g, i, and k.
After each line, PSYCOP tries to figure out how the line fits into its internal proof and indicates its actions by printing the responses shown in lines b, d, f, h, j, and l.

(6) a. Suppose IF q THEN NOT p.
    b. Okay, I'll suppose IF q THEN NOT p.
    c. Suppose p.
    d. Okay, I'll suppose p.
    e. Suppose IF NOT q THEN s.
    f. Okay, I'll suppose IF NOT q THEN s.
    g. Under these circumstances, NOT q.
    h. I have deduced NOT q ... under assumptions IF q THEN NOT p, p, and IF NOT q THEN s.
    i. Under these circumstances, s.
    j. I already know that s is true on the basis of IF NOT q THEN s and NOT q ...
    k. Therefore, IF (IF NOT q THEN s) THEN s.
    l. I have deduced IF (IF NOT q THEN s) THEN s ... under assumptions IF q THEN NOT p and p.

In lines a and c the external proof asks us to suppose that the premises of the argument are true, and PSYCOP follows this advice by placing these sentences in working memory, typing "Okay ..." when it has done so. The same goes for the extra supposition in line e. In some situations, as we will see momentarily, the supposed sentences will be sufficient to trigger PSYCOP's forward rules, producing some unprompted assertions in memory; but at the beginning of this proof no inference routines operate. Thus, the state of working memory is that shown in figure 5.1, containing just the three suppositions.

When line g of (6) appears, PSYCOP must do some work to figure out why NOT q is supposed to hold. An impasse like this one causes the model to try to prove the new line from previous information in essentially the same way that it would prove a conclusion from a set of premises in an argument-evaluation task. In this case, the new sentence, NOT q, triggers PSYCOP's Backward NOT Introduction rule. This routine tells PSYCOP to suppose q (see sentence 5 in figure 5.1b) and attempt to deduce a contradiction. Part of this contradiction is immediately forthcoming, since the new supposition q together with the premise IF q THEN NOT p yields NOT p by Forward IF Elimination. Because PSYCOP is already assuming p (sentence 2 in the figure), the contradiction is complete. At this point PSYCOP prints out the message in line h, noting the assumptions that are in effect after it infers NOT q. (Supposition q is not mentioned, since this


Figure 5.1
Development of PSYCOP's internal proof in comprehending (6).



supposition is discharged by NOT Introduction when the conclusion is drawn.) However, this is not quite all that the model does in response to line g. Since it is assuming IF NOT q THEN s, and since it now knows NOT q, it can also derive s by a second application of Forward IF Elimination. Hence, when the external proof gets around to saying that s follows in line i, PSYCOP can respond that it "already knows" this is true. (See figure 5.1b.)

Finally, the external proof asserts the main conclusion, one that the model has not anticipated. To see whether the conclusion follows, PSYCOP again turns to its backward rules, specifically to IF Introduction, since the conclusion is a conditional. This routine asks PSYCOP to assume the antecedent, IF NOT q THEN s, and attempt to deduce the consequent, s. However, before doing any deductive work on its own, PSYCOP always checks to see if the necessary inferences are already in working memory. In this case it notices that it is already assuming the antecedent (as sentence 3) and that it has deduced the consequent on this basis (as sentence 9). Thus, all the conditions on IF Introduction have been carried out, and the conclusion follows just as asserted in the external proof. The state of the internal proof in working memory is that of figure 5.1, and this allows PSYCOP to announce in line l of (6) that it has deduced the conclusion "under the assumptions IF q THEN NOT p and p." (The supposition IF NOT q THEN s is discharged by IF Introduction.) Since these are the premises of the original argument in (5), the internal proof is complete.

The key point of this example is that PSYCOP progresses through a proof by fits and starts, sometimes anticipating a line before it appears in the external proof and sometimes pausing to see how a line follows by means of its backward rules. This pattern of hesitations and anticipations provides the basis for our reaction-time test.
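The pattern of anticipations and hesitations in dialogue (6) can be imitated with a very small sketch. It is not PSYCOP: the only inference rule implemented is Forward IF Elimination, the backward search is not modeled at all (an unanticipated line is simply accepted, as if the search had succeeded), and suppositions are never discharged. The sentence encoding and the names `NOT`, `IF`, `forward_close`, and `comprehend` are hypothetical choices made for this illustration.

```python
# Sentences: atoms are strings; compound sentences are tuples.
def NOT(x):
    return ("not", x)

def IF(antecedent, consequent):
    return ("if", antecedent, consequent)

def forward_close(memory):
    """Apply Forward IF Elimination until nothing new can be added."""
    memory = set(memory)
    changed = True
    while changed:
        changed = False
        for s in list(memory):
            is_conditional = isinstance(s, tuple) and s[0] == "if"
            if is_conditional and s[1] in memory and s[2] not in memory:
                memory.add(s[2])
                changed = True
    return memory

def comprehend(external_proof):
    """Label each external line: 'okay', 'already known', or 'backward search'."""
    memory, responses = set(), []
    for kind, sentence in external_proof:
        if kind == "suppose":
            responses.append("okay")
        elif sentence in memory:
            responses.append("already known")
        else:
            # ASSUMPTION: the backward search itself is not modeled; the
            # line is accepted as if the search had succeeded.
            responses.append("backward search")
        memory = forward_close(memory | {sentence})
    return responses

# The external proof of dialogue (6), lines a, c, e, g, i, and k.
proof_6 = [
    ("suppose", IF("q", NOT("p"))),
    ("suppose", "p"),
    ("suppose", IF(NOT("q"), "s")),
    ("assert", NOT("q")),
    ("assert", "s"),
    ("assert", IF(IF(NOT("q"), "s"), "s")),
]
responses = comprehend(proof_6)
```

Under these simplifications the three suppositions trigger no forward inference, NOT q and the final conditional each require backward work, and s is already in memory when its line arrives, which mirrors responses h, j, and l in (6).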
Reaction-Time Predictions

The experiment compared subjects' performances on pairs of external proofs, such as the example in table 5.3. The B proof (labeled "more difficult") is exactly the same as the one we took up in (6). The A proof ("simpler proof") is identical to B with the exception of the very first line, which substitutes IF p THEN NOT q for IF q THEN NOT p. Since these two lines are logically equivalent, the two proofs differ minimally in surface form and in logical structure. However, PSYCOP handles these two proofs in very different ways, yielding a somewhat surprising set of predictions.



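Table 5.3
Sample proof pair from comprehension experiment.

A. Simpler proof
1. Suppose IF p THEN NOT q.
2. Suppose p.
3. Suppose IF NOT q THEN s.
4. Under these circumstances, NOT q.
5. Under these circumstances, s.
6. Therefore, IF (IF NOT q THEN s) THEN s.

B. More difficult proof
1. Suppose IF q THEN NOT p.
2. Suppose p.
3. Suppose IF NOT q THEN s.
4. Under these circumstances, NOT q.
5. Under these circumstances, s.
6. Therefore, IF (IF NOT q THEN s) THEN s.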

The first point to notice about these proofs is that PSYCOP is able to anticipate the fourth line in A but not in B. After PSYCOP reads lines A1 and A2, it will immediately deduce NOT q using Forward IF Elimination; so when A4 eventually comes in, PSYCOP will respond "I already know NOT q is true ...." By contrast, we have already noticed that PSYCOP has no way of anticipating B4 and must rely on its backward rules to determine why it is supposed to follow. Assuming that this difference also holds for human reasoners, we ought to predict that subjects would take longer in assimilating B4 than A4.

A second point of difference concerns the relationship between lines 4 and 5. Given proof A, PSYCOP can deduce s immediately after reading line 3: lines 1 and 2 yield NOT q by Forward IF Elimination, as just explained, and line 3 and NOT q produce s by the same inference rule. This means that we could omit line 4 entirely, and PSYCOP would still be able to deduce s before it appears in line 5. Not so, however, in proof B. Although PSYCOP can also deduce s before line 5 in B, this depends crucially on line 4. As we noticed in connection with the dialogue above, line 4 sets up the string of inferences leading to s. Thus, if line 4 were omitted in B, PSYCOP would have to pause and conduct a lengthy backward search when it came upon line 5. This yields an interaction prediction for response times: Processing should be relatively fast for line A5, whether line A4 is present or absent. But processing of B5 should be fast if line B4 is present, slow if it is absent.

In the experiment, each subject studied proofs that appeared one line at a time on a CRT display. The subjects had been instructed to assume that lines beginning with "Suppose" were true, and that they could take as much time as they liked in understanding them. When a subject felt ready to continue he or she was to press an "advance" button, which then



displayed the next line of the proof on the screen. Old lines stayed on the screen after new ones appeared. Each subject continued in this way until he or she had studied all the "Suppose" lines. At that point, a row of asterisks was displayed on the screen as a ready signal, and the subject pushed the advance button once more when prepared to go on. This brought up a line beginning "Under these circumstances," and the subjects were under instructions to decide whether this line necessarily followed from preceding lines or did not necessarily follow. Subjects pressed one of two buttons on the response panel to indicate their decisions, and the computer recorded their response times from the point at which the sentence appeared until the button was pressed. This sequence was repeated until the subjects had responded to all the remaining lines in the proof.

The critical proofs follow the pattern of those in table 5.3: Although we constructed the eight proof pairs using different inference rules, they were all similar to A and B. The within-pair differences were confined to the early "Suppose" lines; the critical "Under these circumstances" and "Therefore" lines were identical from one member of the pair to the other. The proof pairs were also similar to A and B with respect to which lines PSYCOP could deduce in advance and which it had to wait to derive. In the simpler member of the pair, there were two consecutive lines (a Preparatory line analogous to A4 and a Succeeding line analogous to A5), both of which PSYCOP could derive before reading them. In the more difficult member of the proof pair, PSYCOP could anticipate the Succeeding line only if it had processed the Preparatory line. Each proof occurred in two versions: one in which the Preparatory line (e.g., line 4) was present and one in which it was absent. Thus, subjects viewed 32 critical proofs: eight proof pairs with two versions of each member.
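The two-by-two design (simpler vs. more difficult proof, Preparatory line present vs. absent) can be checked against the forward-rule account with a short sketch. As before, this is a deliberately reduced stand-in rather than PSYCOP itself: only Forward IF Elimination is implemented, and the names `NOT`, `IF`, `forward_close`, and `succeeding_line_anticipated` are hypothetical. The proofs are the schematic A and B described above, with suppositions p and IF NOT q THEN s, Preparatory line NOT q, and Succeeding line s.

```python
# Sentences: atoms are strings; compound sentences are tuples.
def NOT(x):
    return ("not", x)

def IF(antecedent, consequent):
    return ("if", antecedent, consequent)

def forward_close(memory):
    """Apply Forward IF Elimination until nothing new can be added."""
    memory = set(memory)
    changed = True
    while changed:
        changed = False
        for s in list(memory):
            is_conditional = isinstance(s, tuple) and s[0] == "if"
            if is_conditional and s[1] in memory and s[2] not in memory:
                memory.add(s[2])
                changed = True
    return memory

def succeeding_line_anticipated(first_line, preparatory_present):
    """Is s already in memory when the Succeeding line arrives?"""
    memory = forward_close({first_line, "p", IF(NOT("q"), "s")})
    if preparatory_present:
        # The Preparatory line (NOT q) is taken as accepted once read.
        memory = forward_close(memory | {NOT("q")})
    return "s" in memory

first_lines = {
    "A": IF("p", NOT("q")),   # simpler proof
    "B": IF("q", NOT("p")),   # more difficult proof
}
predictions = {
    (proof, present): succeeding_line_anticipated(line, present)
    for proof, line in first_lines.items()
    for present in (True, False)
}
```

In this sketch the Succeeding line is anticipated in three of the four cells; the only cell that requires backward work is the more difficult proof with the Preparatory line omitted, which is the interaction pattern the experiment was designed to detect.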
In addition, there were 32 further proofs that contained a "Therefore" or "Under these circumstances" line that was not a logical consequence of the "Suppose" lines. We included these proofs as fillers to keep subjects honest. Without them, subjects could have responded with perfect accuracy by always pressing the "necessarily follows" button. Subjects saw instantiations of the proofs, rather than the schematic forms shown here in table 5.3. The instantiations all employed people-in-places sentences like those of the preceding experiment; for example, the first sentence in A would have appeared to subjects as Suppose that if Martha is in Boston, Claire is not in LA. The two versions of a proof were differently instantiated to disguise their logical similarity, and the

Figure 5.2 Mean correct response time for Preparatory and Succeeding lines in simpler and more difficult proofs. [Plot not reproduced: response time by proof type, with separate curves for the Preparatory line, the Succeeding line not following the Preparatory line, and the Succeeding line following the Preparatory line.]

computer presented the entire set of 64 proofs to each subject in a distinct random permutation.

Figure 5.2 displays the mean correct reaction times that are relevant to our predictions. Consider first the data for the Preparatory line, represented by the top curve of the figure. Clearly, when this line is present, subjects take longer to process it in the context of the more difficult proof than in the simpler proof (F(1, 27) = 6.48, p = 0.02). This is consistent with the idea that subjects were able to infer beforehand the contents of the line in the simpler proof, but not in the more difficult one. (See Lea et al. 1990 for further evidence on sentential inferences during comprehension.)



The more interesting prediction concerns the Succeeding line, which is indicated by the bottom two curves in the figure. According to our predictions, when the Preparatory line is omitted, the times should once again increase for the more difficult proof. However, when the Preparatory line is present, subjects should be able to anticipate the Succeeding lines in both proofs, eliminating the difference. The results show that this prediction is correct, as evidenced by a significant interaction between proof type (simple or difficult) and the presence or absence of the Preparatory line (F(1, 27) = 6.78, p < 0.01). The figure also shows that responses were faster overall for the Succeeding than for the Preparatory line. This is probably the result of a difference in the complexity of these sentences: In all but two of the eight proof pairs, the Preparatory line contained more logical connectives and atomic propositions. The error rates for these data were 14% for both the Preparatory and Succeeding lines, and they showed only a main effect of difficulty.

These results help confirm the forward/backward distinction that is an integral part of PSYCOP. It is the forward rules that account for the model's ability to foresee the upcoming Preparatory and Succeeding lines in the simpler proofs. When the forward rules are unable to derive the upcoming lines (as in the case of the Preparatory line of the difficult proofs, or the Succeeding line of the same proofs when the Preparatory line is omitted), PSYCOP must wait until these target lines appear and verify them by backward inference. The response times therefore suggest that our division of inferences into forward and backward sets may reflect a similar split in subjects' deductive processes.

Evidence from Memory for Proofs

The PSYCOP theory assumes that working memory holds the mental proofs that people use in dealing with deductive questions, but so far there has been little to say about working memory apart from its role as a container.
The design of the previously described experiments deliberately reduced memory limitations by giving subjects free access to the premises and the conclusion of the problem (in the argument-evaluation study) or to the individual proof lines (in the reaction-time study). This strategy was motivated by the fact that the experiments were tests of the inference rules rather than of the memory assumptions. Nevertheless, the natural-deduction structure of mental proofs makes some predictions about memory that deserve a closer look. One test of this structure comes from an



experiment in which Marcus (1982) studied people's recall of individual lines from simple arguments. Although this experiment addressed hypotheses that Marcus derived from a generic natural-deduction framework, they apply to PSYCOP as a specific instance.

What Marcus observed was that lines from subdomains of proofs may have a different memory status than lines from superdomains. Subdomain sentences result from rules such as IF Introduction and NOT Introduction, and they play a supporting role in the proof, helping to establish superdomain inferences. These sentences are also hypothetical, in the sense that they hold only within the confines of the subdomain, and not in the larger proof. By contrast, sentences that belong to the main proof (that is, to the outermost domain) have the same status as the premises and the conclusion. Although these sentences may act as subgoals during the proof process, their position is guaranteed if the proof is successful, and they hold in every subdomain embedded beneath them. Experiments on text comprehension (Kintsch et al. 1975; Meyer 1975) have demonstrated that subjects are better able to recall sentences from higher levels in the organization of a passage than those from subordinate levels, and a parallel logic would predict higher recall scores for superdomain than subdomain lines in a proof.

By manipulating the top-level assertions and suppositions, Marcus was able to construct pairs of proofs in which essentially the same sentence appeared either in the top domain or in a subdomain. For example, one of Marcus's proofs reads as shown here in (7).

(7) a. Suppose the runner stretches before running.
    b. If the runner stretches before running, she will decrease the chance of muscle strain.
    c. Under that condition, she would decrease the chance of muscle strain.
    d. If she decreases the chance of muscle strain, she can continue to train in cold weather.
    e. In that case, she could continue to train in cold weather.
    f. Therefore, if the runner stretches before running, she can continue to train in cold weather.

In this passage, lines b and d are assertions at the top level of the proof, and line f is the conclusion (also at the top level). However, lines a, c,



and e are all parts of a subdomain, which justifies the conclusion via IF Introduction. Compare the proof in (7) with that in (8), in which the sentence about the runner stretching is elevated to the status of a premise.

(8) a. The runner stretches before running.
    b. If the runner stretches before running, she will decrease the chance of muscle strain.
    c. Therefore, she will decrease the chance of muscle strain.
    d. If she decreases the chance of muscle strain, she can continue to train in cold weather.
    e. Thus, she can continue to train in cold weather.
    f. If she did not decrease the chance of muscle strain, she would ruin her chance of running in the Boston Marathon.

Proof (8) contains no subdomains, so all its lines are equally factual. In particular, sentences (8a), (8c), and (8e) are now part of the premises' domain, rather than being embedded in a subdomain as their counterparts in (7) are. We can refer to (7a), (7c), and (7e) as embedded sentences and to (8a), (8c), and (8e) as unembedded controls. In these terms, then, we should predict that subjects will recall the unembedded controls better than the embedded sentences, for the reasons just discussed. Of course, we need to be sure that such a difference is actually due to embedding and not to other variables associated with (7) and (8). One way to check this possibility is to compare the superordinate sentences (7b) and (7d) with the analogous sentences (8b) and (8d), which we will call superordinate controls. Since these sentences don't differ with respect to embedding, we should observe no difference in their recall.

In Marcus' (1982) experiment the subjects heard five passages similar to (7) and (8) and then attempted to write them down in as much detail as possible. The same procedure was then repeated with another five passages. Marcus constructed the proofs with embedded sentences using the rules IF Introduction (as in (7)), OR Elimination, and NOT Introduction.
She created the control proofs (e.g., (8)) from the experimental ones by changing the supposition to a premise and altering the concluding line. For a given subject, an experimental proof and its control were instantiated with different content; however, the assignment of content to proof type was balanced across subjects.








Figure 5.3 Percent correct recall of proof lines as a function of proof type and line status, from Marcus 1982. [Plot not reproduced: recall by proof type, with legible curve labels including Unembedded Control and Embedded Lines.]

Figure 5.3 shows the main recall proportions from Marcus' study. As predicted, the subjects were significantly more accurate in recalling unembedded controls than embedded sentences. Thus, sentences in subdomains are less well remembered than those in superdomains, even when the form and content of the lines are the same. However, the comparison between superordinate sentences in the experimental and control passages shows a trend in the opposite direction, with superordinate experimental sentences (such as (7b)) recalled slightly but not significantly better than controls (such as (8b)). This supports the idea that the embedding effect is not attributable to the greater complexity of proofs like (7) or to other incidental factors associated with the global proof structure.6 The recall



deficit for embedded sentences, together with the absence of a deficit for superordinate sentences, produced a significant interaction for the data in figure 5.3.

Marcus' results demonstrate that we can predict memory for natural-deduction proofs like (7) in terms of the underlying configuration of these proofs, in particular the subdomain-superdomain relations. Of course, the experiment gives us no information about whether natural deduction is more "natural" psychologically than other possible proof methods, since it didn't include such a comparison. But the results are important, nevertheless, in establishing that subjects' recall is sensitive to properties central to PSYCOP's own memory structure.

Summary

We are not done with experiments based on PSYCOP; chapter 7 will describe further tests. However, the three experiments that we have just glimpsed provide us with initial evidence about the core theory. The first of these showed that we can predict the likelihood with which subjects correctly evaluate an argument if we assume that they behave like PSYCOP in constructing mental proofs. Most of the rules are apparently obvious to subjects, and they find it possible to evaluate arguments that these rules can prove. However, a few of the rules cause difficulties, perhaps because of interfering pragmatic or strategic factors. The second experiment employed a reaction-time design that let us test the model without having to fit a large number of free parameters. This study used PSYCOP's distinction between forward and backward rules to predict how quickly subjects can assimilate the lines of an explicit proof. We found that when PSYCOP could anticipate a line through a forward rule, subjects' response times were relatively fast; however, when PSYCOP needed a backward rule to determine that a line followed, subjects' times were correspondingly slow.
This accords with what we would expect if the human proof-following ability is similar to PSYCOP's, employing forward rules on an automatic basis and backward rules in response to specific goals. Finally, Marcus' experiment buttresses the idea that PSYCOP's superdomain-subdomain structure also describes subjects' memory for simple proofs.

While these three experiments are by no means a complete test of the model, we have some support for its basic tenets. However, the studies discussed in this section are rather atypical, since they employ stimulus items that are much more complex than is usual in reasoning experiments. The ability to handle complex arguments of this


sort is one of the advantages I would claim for the PSYCOP approach, but it is nevertheless important to show that the theory can also handle the simpler arguments that pervade earlier research. This is because there is now a wealth of data about these simple arguments and because some of the earlier experiments have turned up surprises.

Consistency with Earlier Findings

Most research on sentential reasoning has centered on arguments containing a single connective, usually if or not. The goal of these experiments has been to locate the sources of difficulty associated with the connectives: the problems subjects have in comprehending them and in combining them with further information. As was noted in chapter 1, this has led to the development of a large number of mini-models, each specific to a connective and a task (models of conditional syllogisms, for example, or of verifying negative sentences against pictures). It would be useful to show that these data and models are consequences of more general assumptions, and in this section I try to show that the PSYCOP theory allows us to do this. Of course, each type of experiment poses special demands that go beyond pure reasoning. In sentence-picture verification, for instance, subjects must be able to represent the picture in some suitable format. Still, PSYCOP should be helpful in providing the central deductive machinery. (It might shed some light on other aspects of the tasks as well, following the "promotional" strategy discussed in chapter 2. I will return to this possibility in chapter 8 after introducing PSYCOP's handling of variables.)

Inferences with Negatives

The most detailed findings on negatives come from experiments in which subjects decide whether individual sentences correctly describe accompanying pictures. (See, e.g., Carpenter and Just 1975; Clark and Chase 1972.) In their classic study, Clark and Chase (1972) presented on each trial a display consisting of a sentence at the left (e.g., Star isn't above plus) and a picture to the right (either a "star" (asterisk) directly above a plus sign or a plus directly above a star). The sentences varied in whether or not they contained a negative and whether they were true or false of the picture.
On a given trial, subjects might have seen a picture of a star above a plus, together with one of the sentences Star is above plus, Plus is above star, Star isn't above plus, Plus isn't above star. The time between the presentation of the display and the subject's true/false decision was the basic dependent variable.

On average, Clark and Chase's subjects took longer to verify the negative sentences than the positive ones, as we might expect from the added complexity involved in encoding the negative items. The more interesting result, however, was that for positive sentences the reaction times were longer when the sentence was false, whereas for negative sentences the times were longer when the sentence was true. Figure 5.4 illustrates these results. For example, if the picture showed a star above a plus, then times were shorter for the true positive sentence Star is above plus than for the

Figure 5.4 Mean response times for correctly verifying positive and negative sentences with the relation above against a picture of a star above a plus or a plus above a star (from Clark and Chase 1972, experiment 1). [Plot not reproduced: response time by truth value (True, False), with a separate curve for negative sentences.]



false positive Plus is above star; however, times were shorter for the false negative Star isn't above plus than for the true negative Plus isn't above star (Clark and Chase 1972, experiment 1).

In order to account for this interaction between the truth of a sentence and its polarity (positive or negative), let us suppose along with Clark and Chase that subjects were encoding both the sentence and the picture into a common sentential format. We can then think of the task as one of deciding whether the picture representation entails the sentence representation, as was noted in chapter 1. But in order to perform correctly on this task, subjects need to know more than just what was given in the display. They must also bring into play semantic information about the spatial relation above. They must recognize, in particular, that above is asymmetric, so that, for example, if the star is above the plus then the plus isn't above the star. We might represent the asymmetry in this experimental setup by the premise (IF star is above plus THEN NOT plus is above star) AND (IF plus is above star THEN NOT star is above plus). With this addition, determining whether the sentence is true of the picture in the four conditions is equivalent to determining whether the conclusions of the arguments in (9) follow from their premises.

(9) a. Star is above plus.                        Encoded picture
       (IF star is above plus THEN NOT plus is above star) AND
       (IF plus is above star THEN NOT star is above plus).
       Star is above plus.                        True positive

    b. Star is above plus.                        Encoded picture
       (IF star is above plus THEN NOT plus is above star) AND
       (IF plus is above star THEN NOT star is above plus).
       Plus is above star.                        False positive

    c. Star is above plus.                        Encoded picture
       (IF star is above plus THEN NOT plus is above star) AND
       (IF plus is above star THEN NOT star is above plus).
       NOT plus is above star.                    True negative

    d. Star is above plus.                        Encoded picture
       (IF star is above plus THEN NOT plus is above star) AND
       (IF plus is above star THEN NOT star is above plus).
       NOT star is above plus.                    False negative

It seems safe to assume that subjects can respond with an immediate "true" response if there is a direct match between the conclusion and one of the premises and can respond with an immediate "false" if the conclusion directly contradicts one of the premises. Thus, arguments (9a) and (9d) will be disposed of quickly. However, arguments (9b) and (9c) require further inferences. These arguments, unlike (9a) and (9d), depend on the extra premise concerning above, and hence subjects must unpack this premise. Applying Forward AND Elimination will produce IF star is above plus THEN NOT plus is above star, from which we can derive NOT plus is above star by Forward IF Elimination. This last assertion directly contradicts the conclusion of (9b) and directly matches the conclusion of (9c), correctly yielding a "false" response to the former and a "true" response to the latter. Thus, the additional inferences account both for why false positive sentences take longer to verify than true positive ones and for why true negative sentences take longer to verify than false negative ones. (Of course, since forward rules carry out the extra inferences, subjects should make these deductions in all four conditions. However, the decision for true positive and false negative sentences needn't wait for these inferences to appear, whereas they must be present before subjects can respond to false positives and true negatives.)

Although our assumptions about the representations of the picture and sentence are the same as those of Clark and Chase (1972), the rest of the account differs. On their view, the response-time increments for false positive and true negative sentences are due to changes in a special truth register or index whose value depends on a comparison between subparts of the sentence and picture representations.
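The inference-based account of arguments (9a)-(9d) can be made concrete with a minimal sketch. It assumes a toy verifier that checks for a direct match or direct contradiction first and otherwise applies rounds of Forward AND Elimination and Forward IF Elimination. The encoding of sentences as tuples, the function names, and the use of a round count as a stand-in for extra processing are all illustrative assumptions. This is not PSYCOP's implementation and not a response-time model.

```python
# Sentences are atoms (strings) or tuples: ("not", x), ("if", a, c), ("and", x, y).

STAR_PLUS = "star is above plus"
PLUS_STAR = "plus is above star"
ASYMMETRY = ("and",
             ("if", STAR_PLUS, ("not", PLUS_STAR)),
             ("if", PLUS_STAR, ("not", STAR_PLUS)))

def negate(sentence):
    """Return the direct contradictory: strip a leading NOT, or add one."""
    if isinstance(sentence, tuple) and sentence[0] == "not":
        return sentence[1]
    return ("not", sentence)

def forward_round(known):
    """One round of Forward AND Elimination and Forward IF Elimination."""
    new = set()
    for s in known:
        if isinstance(s, tuple) and s[0] == "and":
            new.update(s[1:])                      # AND Elimination: both conjuncts
        if isinstance(s, tuple) and s[0] == "if" and s[1] in known:
            new.add(s[2])                          # IF Elimination: modus ponens
    return new - known

def verify(picture, sentence):
    """Return (truth value, number of forward rounds needed before deciding)."""
    known = {picture, ASYMMETRY}
    rounds = 0
    while True:
        if sentence in known:
            return True, rounds                    # direct match
        if negate(sentence) in known:
            return False, rounds                   # direct contradiction
        new = forward_round(known)
        if not new:
            return None, rounds                    # nothing more to derive
        known |= new
        rounds += 1

conditions = {
    "true positive (9a)": STAR_PLUS,
    "false positive (9b)": PLUS_STAR,
    "true negative (9c)": ("not", PLUS_STAR),
    "false negative (9d)": ("not", STAR_PLUS),
}
for label, sentence in conditions.items():
    print(label, verify(STAR_PLUS, sentence))
```

In this sketch, (9a) and (9d) are decided before any forward round is needed, whereas (9b) and (9c) are decided only after the asymmetry premise has been unpacked, which mirrors the ordering of verification times described above.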
According to the Clark-Chase theory, the representations can be divided into an inner string (plus above star or star above plus) and an outer string (which contains NOT in the case of a negative sentence and is otherwise empty). For example, the sentence NOT star above plus has NOT as its outer string and star above plus as its inner string. To make their decisions, subjects are supposed to



begin with the truth index set to the value True. They then compare the inner strings of the sentence and picture representations, and then the two outer strings. Mismatches on either comparison cause the truth index to switch its value from True to False or from False to True, each switch producing a constant increase in time. In the case of a true positive trial, for instance, the sentence and the picture match on both strings (see the first premise and the conclusion in (9a)), so the response time should be minimal. However, for false positive trials, the inner strings mismatch (see (9b)), causing the truth index to change from True to False and increasing the decision time. Carpenter and Just (1975) and Trabasso, Rollins, and Shaughnessy (1971) present related models that retain the notion of a comparison between segments of the sentence and picture encodings.

My account predicts Clark and Chase's results. In both models, the main effect of negation is attributed in part to encoding the negative particle. But the interaction, which Clark and Chase explain through extra changes to the truth index, is here the result of extra inference steps. One reason to prefer the present account is that it provides a better rationale for the subjects' performance. If we look at the task as involving a simple comparison between parts of the representations, then it is difficult to see why a mismatch should produce a change in truth value, as it does on the Clark-Chase theory. After all, if the sentence was star is near plus and the representation of the picture was plus is near star, then a mismatch between these inner strings would not mean that the sentence was false of the picture. The truth index would have to remain unchanged to yield the right answer. On the theory just outlined, the reason why star is above plus conflicts with plus is above star is that subjects know that above is an asymmetric relation, as captured in the second premise of (9a)-(d).
No such information is available for near, since this relation is a symmetric one. Another way to see this is to notice that the first premise in (9a) would imply its conclusion and the first premise of (9d) would contradict its conclusion, even if we knew nothing about the properties of above. For example, substituting a nonsense relation for above would not affect the correct answer. However, the first premise of (9c) does not entail its conclusion and the first premise of (9b) does not contradict its conclusion without the extra information about above. Our account locates the difficulty of false positives and true negatives in the need to derive these facts.

A complaint about models like Clark and Chase's is that they seem overly specialized to the demands of the task (Newell 1973a; Tanenhaus,



Carroll, and Bever 1976); they do not offer a clear understanding of how subjects manage to go from their general knowledge and skills to the specific processing strategy that the model postulates. The approach we have been pursuing may have an advantage over older models in this respect, since our explanation of the data largely follows from the general representational and processing assumptions of the core PSYCOP model.7

Inferences with Conditionals

Reasoning with conditional sentences has been a popular topic within the psychology of reasoning, second only to Aristotelian syllogisms. This research has also been the one with the greatest impact on investigators in other fields, since the results seem to show that subjects are prone to serious mistakes on problems whose logical requirements are purportedly very simple. It is easy to interpret such data as proving that people are inherently irrational, or that, at best, their reasoning abilities are extremely limited. In this subsection we will look at this evidence from the perspective of the PSYCOP model in the hope that it might clarify the source of the difficulty with if. Although subjects probably do make errors in handling conditionals, part of the trouble may turn out to be a consequence of the structure of the deduction system. Most of this research has focused on conditional syllogisms and on Wason's selection task. Let us consider these two kinds of experiments in turn.

Conditional Syllogisms

Conditional syllogisms are arguments that consist of a conditional first premise, a second premise containing either the antecedent or consequent of the conditional, and a conclusion containing the conditional's remaining part (the consequent if the second premise contained the antecedent; the antecedent if the second premise contained the consequent).
The second premise and the conclusion can appear in either negated or unnegated form, yielding a total of eight argument types.8 Table 5.4 exhibits these types along with some data from experiment 2 of Marcus and Rips (1979), in which subjects attempted to decide whether the conclusion of the argument "followed" or "didn't follow" from the premises. The table gives the arguments in schematic form, but the subjects saw the arguments in three instantiated versions. Thus, the conditional in the first premise might have appeared in any of the following ways: If there's a B on the left side of the card, then there's a 1 on the right


Table 5.4 Percentages of "follows" and "doesn't follow" responses and mean response times to eight conditional syllogisms (from Marcus and Rips 1979). (n = 248 responses.)

Columns: Syllogism; Response proportions ("Follows", "Doesn't follow"); Mean response time (ms) ("Follows", "Doesn't follow").

Syllogism
1. Premises: IF A, C; A. Conclusion: C.
2. Premises: IF A, C; A. Conclusion: NOT C.
3. Premises: IF A, C; NOT A. Conclusion: C.
4. Premises: IF A, C; NOT A. Conclusion: NOT C.
5. Premises: IF A, C; C. Conclusion: A.
6. Premises: IF A, C; C. Conclusion: NOT A.
7. Premises: IF A, C; NOT C. Conclusion: A.
8. Premises: IF A, C; NOT C. Conclusion: NOT A.































side; If the ball rolls left, then the green light flashes; If the fish is red, then it is striped. These different versions are collapsed in the percentages and reaction times in the table.

In classical logic and in the PSYCOP system of chapter 4, the only arguments that are deducible are the first and last in the table. It is clear, however, that subjects were much more apt to accept argument 1 (corresponding to the modus ponens form) than argument 8 (corresponding to modus tollens). Our subjects responded "follows" 98% of the time on the former but only 52% of the time on the latter. Performance on the remaining nondeducible problems also varied widely. For arguments 2, 3, 6, and 7 over 90% of the responses were "doesn't follow," which is the response



that both classical logic and PSYCOP dictate. This same response is also appropriate for arguments 4 and 5, but only 79% of subjects said that the conclusion of argument 4 didn't follow and only 67% said that the conclusion of argument 5 didn't follow. These last two arguments are sometimes labeled fallacies in textbooks on elementary logic (argument 4 as the "fallacy of denying the antecedent" and argument 5 as the "fallacy of affirming the consequent"), and, taken at face value, the data suggest that subjects are often ready to commit them. Evans (1977) and Taplin and Staudenmayer (1973) have reported similar results.

An account of these findings based on an intensional interpretation of IF, similar to the one discussed in chapter 2, is given in Rips and Marcus 1977. The PSYCOP system suggests a simpler explanation: The difference between arguments 1 and 8 can be ascribed to the fact that the model contains a rule for modus ponens (IF Elimination) but not one for modus tollens. As a consequence, subjects would have to derive the conclusion of the latter argument by means of an indirect proof, using both NOT Introduction and IF Elimination. This is, in fact, exactly the difference that I noted between these arguments at the beginning of chapter 2. On this account, the longer times and lower response rates for "follows" in argument 8 are both due to the extra inference step (Braine 1978).

However, the tendency of some subjects to respond "follows" to arguments 4 and 5 must also be explained. Clearly, if subjects interpret the conditional sentences in this experiment using the IF of the model, then no proof is possible for either argument; so where are these responses coming from? In modeling the data of table 5.1, we assumed that "follows" responses could sometimes be due to guessing by subjects who had failed to find a proof for a particular argument.
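The claim that only arguments 1 and 8 are deducible can be checked mechanically. The following is a minimal sketch that assumes a classical truth-table test for validity. It is not a PSYCOP proof search, and it says nothing about how hard each argument is to prove. The dictionary of syllogisms follows the order of table 5.4, and all names are illustrative.

```python
from itertools import product

# Second premise and conclusion of each syllogism; the first premise is always IF A, C.
SYLLOGISMS = {
    1: ("A", "C"),
    2: ("A", "NOT C"),
    3: ("NOT A", "C"),
    4: ("NOT A", "NOT C"),
    5: ("C", "A"),
    6: ("C", "NOT A"),
    7: ("NOT C", "A"),
    8: ("NOT C", "NOT A"),
}

def value(literal, a, c):
    """Truth value of 'A', 'C', 'NOT A', or 'NOT C' given values for A and C."""
    atom_value = a if literal.endswith("A") else c
    return (not atom_value) if literal.startswith("NOT ") else atom_value

def follows(number):
    """Classical validity: the conclusion holds in every row where both premises hold."""
    second, conclusion = SYLLOGISMS[number]
    for a, c in product([True, False], repeat=2):
        conditional_true = (not a) or c          # material reading of IF A, C
        if conditional_true and value(second, a, c) and not value(conclusion, a, c):
            return False
    return True

deducible = [n for n in SYLLOGISMS if follows(n)]
print(deducible)
```

The check confirms that arguments 4 and 5 have counterexamples under this reading of IF (A false and C true), so any "follows" responses to them need a separate explanation.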
But mere guessing would not explain why arguments 4 and 5 attract so many more "follows" responses than the other invalid argument schemas in table 5.4 (i.e., arguments 2, 3, 6, and 7). A more reasonable suggestion is that some subjects treated the conditional premise as suggesting its converse (for example, interpreting If p then q as implicating IF q THEN p as well as IF p THEN q). This interpretation would yield the same results as if subjects had taken the conditional as a biconditional (p IF AND ONLY IF q), and therefore it is consistent with earlier research by Taplin (1971; see also Taplin and Staudenmayer 1973). On this reading arguments 4 and 5 (but not arguments 2, 3, 6, and 7) are deducible, yielding the elevated response rates. Note, too, that PSYCOP could deduce argument 5 in this case by the



same IF Elimination strategy that it would use with argument 1, which explains the relatively fast responses for these two arguments.

This propensity to consider both the conditional and its converse might well have been encouraged by the nature of one of the conditionals we employed: If the ball rolls left, then the green light flashes. This sentence was supposed to refer to a pinball-machine-like device containing differently colored lights and different channels along which a ball could travel. There seems to be a natural tendency in this situation for subjects to assume a one-to-one relationship between channels and lights, which leads them to think that if the green light flashes then the ball must have rolled left (Cummins et al. 1991; Legrenzi 1970; Markovits 1988; Rips and Marcus 1977). In the experiment reported in Marcus and Rips 1979, 36% of the subjects consistently accepted arguments 1, 4, 5, and 8 (and rejected the remaining arguments) when the problems were phrased in terms of the conditional about the pinball machine, but less than 10% of subjects produced this response pattern when the conditional was about a deck of cards or the markings on tropical fish. This sensitivity to content appears to be a hallmark of subjects' performance with conditionals, and it will recur in a more extreme form in the following subsection. The present point is that, although PSYCOP does not by itself explain why subjects bring this extra information into the problem (see note 5 to chapter 2), the model can use it to yield the observed results by means of the same mechanisms that we have used to account for other tasks.

The Selection Task

A more popular type of experiment with conditionals concerns a problem invented by Wason (1966). Imagine a display of four index cards that (in a standard version of the problem) show the characters E, K, 4, and 7 on their visible sides (one character per card). Each card has a number on one side and a letter on the other.
The problem is "Name those cards, and only those cards, which need to be turned over in order to determine whether the [following] rule is true or false": If a card has a vowel on one side, it has an even number on the other. According to Wason and Johnson-Laird (1972), the two most popular answers to this selection task are that one must turn over both the E and 4 cards (46% of subjects respond in this way) and that one must turn over just the E card (33%). But neither of these answers is right. Certainly the E card must be checked, since an odd number on its flip side would be inconsistent with the rule. The K card, however, is consistent with the rule



no matter whether it is paired with an even or an odd number, so this card cannot discriminate whether the rule is true or false. Likewise, if the 4 card has a vowel on its flip side it is consistent with the rule, and if it has a consonant it is also consistent. The 4 card is therefore irrelevant to the test. This leaves the 7 card. If its flip side contains a vowel this card contradicts the rule, whereas if the flip side contains a consonant it conforms to the rule. Therefore, the E and 7 cards must be turned over. Wason and Johnson-Laird found that only 4% of their subjects discovered this. In later replications, the percentage of correct responses on similar tasks varied from 6% to 33% (Evans 1982). To determine "whether the rule is true or false" in this context means to find out whether the rule applies to all four cards or whether it fails to apply to one or more cards. And there are several ways that subjects could go about choosing which cards must be checked. The simplest possibility, however, might be to find out for each card whether the character on its face-up side, together with the rule, implies the vowel/consonant or even/odd status of the character on the flip side. If such an implication can be drawn, then the card must be turned over, since the rule is false of this card when the implication is not fulfilled. If no such implication can be drawn, the card need not be turned over. In the case of the E card, the fact that E is a vowel, together with the rule IF vowel THEN even, implies that the flip side contains an even number, so this card must be checked. We can represent the inferential tasks that correspond to the four cards as in (10), assuming that the K is encoded as NOT vowel and the 7 as NOT even in this context.

(10) E card: IF vowel THEN even. Vowel. ?
     4 card: IF vowel THEN even. Even. ?
     K card: IF vowel THEN even. NOT vowel. ?
     7 card: IF vowel THEN even. NOT even. ?
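As a rough illustration (not part of PSYCOP itself), the card-by-card analysis above can be sketched in a few lines of Python. The function and variable names are hypothetical, and the sketch assumes that the hidden sides are drawn from the same characters that are visible (E or K for letters, 4 or 7 for numbers). The first function computes the logically correct selection; the second mimics a reasoner who can take only a single modus ponens step from each premise set in (10), optionally with the converse conditional added.

```python
# Illustrative sketch only; all names here are hypothetical, not PSYCOP's.
VOWELS = set("AEIOU")

def violates(letter, number):
    """True when a letter/number pair falsifies IF vowel THEN even."""
    return letter in VOWELS and int(number) % 2 == 1

def must_turn(visible, letters="EK", numbers="47"):
    """Normative test: a card matters iff some hidden value could violate the rule.
    Assumption: hidden letters are E or K, hidden numbers are 4 or 7."""
    if visible.isdigit():
        return any(violates(letter, visible) for letter in letters)
    return any(violates(visible, number) for number in numbers)

# Face-up characters encoded as in (10).
ENCODING = {"E": "vowel", "K": "NOT vowel", "4": "even", "7": "NOT even"}

def forward_only(conditionals):
    """Select a card iff its encoding equals the antecedent of some conditional,
    i.e. iff one modus ponens step yields a conclusion about the flip side."""
    antecedents = {antecedent for antecedent, _ in conditionals}
    return [card for card, fact in ENCODING.items() if fact in antecedents]

RULE = ("vowel", "even")
CONVERSE = ("even", "vowel")

normative = [card for card in ENCODING if must_turn(card)]
print(normative)                       # ['E', '7']
print(forward_only([RULE]))            # ['E']
print(forward_only([RULE, CONVERSE]))  # ['E', '4']
```

Under these assumptions the single-step reasoner reproduces the two most popular answers (E alone, or E and 4 when the converse is assumed), while the exhaustive check yields the correct E and 7.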

Mental Proofs and Their Empirical Consequences


Viewed in this way, the selection task bears an obvious similarity to the conditional syllogisms in table 5.4. The difference is that the present task provides no conclusions, so subjects must produce the conclusions on their own. PSYCOP is able to deduce Even from the first of these sets of premises by means of its Forward IF Elimination rule; however, for the other three sets it draws a blank. The reason is that, with no explicit conclusion, PSYCOP relies solely on its forward rules, and the only forward rule that is applicable in this situation is Forward IF Elimination. In particular, nothing follows from the premises corresponding to the 7 card, because PSYCOP lacks a forward rule for modus tollens and lacks a subgoal to trigger a backward search for the same inference. (Backward search by means of NOT Introduction is how PSYCOP managed to handle the comparable argument in the syllogism task.) Thus, the difference between the E card and the 7 card is much like the difference between the two proofs in table 5.3. Given the simple strategy outlined above, we would therefore expect PSYCOP to respond by turning over just the E card, which is the response given by about a third of the subjects. The other popular response (both the E and 4 cards) may be the result of subjects' assuming the converse of the conditional, as was discussed above. When the converse conditional IF even THEN vowel is added to the premises above, PSYCOP will draw a conclusion for the premise sets corresponding to the E and 4 cards (using Forward IF Elimination again) but to neither of the other sets.9 Of course, a small percentage of subjects do manage to solve this problem, and we need to be able to explain how this is possible on the present theory. A reasonable guess is that the successful subjects explicitly consider potential letters or numbers that might be on the flip sides of the cards, corresponding to conclusions for the premise sets in (10).
This means thinking about the possibility that there might be an even number or a noneven number on the back of the E card, a vowel or a nonvowel on the back of the 4 card, and so on. Considering these explicit possibilities would invoke backward rules and allow thorough logical processing. In terms of the arguments in (10), this translates into filling in the conclusions Even and NOT even for both the E and K premises and filling in the conclusions Vowel and NOT vowel for both the 4 and 7 premises. If a subject determines that any of these conclusions is deducible, then he or she should check the corresponding card. Enumerating possible conclusions on a case-by-case basis this way will produce the same responses for



the E, K, and 4 cards as did the simpler strategy; the 7 card, however, will now be selected, since subjects can determine (via Backward NOT Introduction) that IF vowel THEN even and NOT even imply NOT vowel. This more complex strategy obviously requires extra effort and presumably is something that subjects will not do except in unusual circumstances. Even then they may fail to deduce the NOT vowel conclusion because of the difficulty of the backward inference. If a correct answer to the selection task requires projecting possible values for the conclusions in (10), we should expect factors that encourage this projection to improve performance. Some experimental variations on the task are consistent with this prediction. First, it is possible to highlight possible conclusions by telling subjects exactly which symbols might appear on each card's back (e.g., that the E card might have an 8 or a 9 on its back, that the 4 card could have a U or a V, and so on). Smalley (1974) found improved choices after such a manipulation. Second, integrating the values of the antecedent and the consequent as parts of a unified object, rather than segregating them on different parts of a card, should make it easier for subjects to imagine possible conclusions. In line with this, Wason and Green (1984) found more correct answers when the problem was about differently colored shapes than when the problem was about cards that had a shape on one half and a color patch on the other. Third, projection should be simpler if subjects' choice is limited to the values of the consequent (even vs. odd numbers in the above example) and if subjects actually get to view possible completions for the missing value. In selection-task variants of this sort (Johnson-Laird and Wason 1970; Wason and Green 1984), the instructions specify that there are several cards containing odd numbers and several cards containing even numbers.
On each of a series of trials, subjects must choose to examine an odd or an even card in an effort to determine whether the rule (e.g., IF vowel THEN even) correctly applies to the entire pack. The results of these studies show that subjects learn quite quickly to examine every card whose value contradicts the consequent (cards with odd numbers) and to disregard every card whose value is the same as the consequent (cards with even numbers). PSYCOP can therefore capture the main aspects of subjects' responses in the standard version of the selection task and in some of its variations. However, much of the recent work on this problem has focused on some dramatic effects of the conditional rule's content. For example, if the rule is phrased in terms of police checking compliance with a drinking regulation (If a person is drinking beer, then the person must be over 19) and the cards represent ages (e.g., 15 and 25) and beverages (Coca-Cola and beer), performance can run as high as 70% correct (Griggs and Cox 1982; Pollard and Evans 1987). Similarly, improvement on the selection task has been observed when the problem is phrased in terms of authorities checking compliance with rules that emphasize permission or obligation (e.g., If one is to take action 'A', then one must first satisfy precondition 'P') (Cheng and Holyoak 1985; Cheng et al. 1986). However, not all types of content benefit performance. Pollard and Evans (1987) report little facilitation when the problem is stated in terms of secret police checking compliance with a regulation about identity cards (If there is a B on the card, then there is a number over 18 on it). And there seems to be no benefit for a rule such as If I eat haddock, then I drink gin when the cards indicate "what I ate" and "what I drank" at a particular meal (Manktelow and Evans 1979). These results may be due to subjects' memories of situations involving the same or similar information, rather than to a change in the way the subjects reason (Griggs 1983). For instance, if subjects already know that violators of the drinking rule are underage people and beer drinkers (or know of an analogous situation), then they can determine the correct answer from their previous knowledge. Similarly, a rule phrased in terms of necessary preconditions may remind subjects of a prior situation in which violators were those who had taken the action and had not satisfied the preconditions. A complete account would, of course, have to explain this retrieval step; however, the present point is that subjects' performance is a function, not only of the inferences that PSYCOP sanctions, but also of the information that the subjects include in their working-memory representation of the problem.
The cover stories that experimenters tell to set up the contentful versions of the selection task may invite subjects to rely on background information, and this will, in turn, enable them to make inferences that go beyond what can be derived in the more pristine form of the task. (See chapters 9 and 10 for further discussion of these content effects. In these chapters we also explore the possibility that facilitation in the selection task is due to use of modal operators.)

Summary

The experiments reviewed in this chapter provide some evidence for the generality and accuracy of the model. PSYCOP seems generally able to



account for the way subjects follow and remember proofs, verify sentences against pictures, and evaluate propositional arguments (conditional syllogisms, as well as more complicated argument forms). For many of these results, it is PSYCOP's distinction between forward and backward reasoning that provides the major explanatory tool. Subjects have an easier time following proofs based on forward rules, since this allows them to predict what lies ahead; in addition, performance on conditional syllogisms and the selection task is better when only forward reasoning is required. However, several other factors come into play in predicting subjects' inference ability. The more difficult problems are those that require more inference steps, more complex rules, and more embedded proof structures. In the second section of the chapter, we focused on PSYCOP's consistency with some prior results on negative and conditional sentences. The model gives us a new way of looking at these results and a uniform perspective on the kinds of inference skills that they require. Most earlier explanations of these findings took the form of information-processing models that the investigators had tailored to the specific task. Although the fits of these models are often excellent in quantitative terms, they require many "hidden parameters" in the form of assumptions about representation, processing components, and order of processing operations. Of course, no model can entirely avoid some specific presuppositions about the task. PSYCOP's goal, however, is to reduce the number of these ad hoc assumptions, attempting to account for these results with a common set of representations and inference mechanisms.
To a first approximation, the model seems successful in this respect; it appears to capture the basic effects, including the interaction between truth and polarity in verifying sentences against pictures, the relative difficulty of conditional syllogisms, and the most common pattern of responses in the selection task. The findings we have just considered certainly don't exhaust the previous data on propositional reasoning. For example, Braine et al. (1984) and Osherson (1974-1976) have claimed that several experiments similar to the argument-evaluation task reported at the beginning of this chapter confirm their own deduction models. However, the findings reviewed in the second section of this chapter are surely the best-known and best-replicated ones and are therefore a good place to begin assessing PSYCOP's adequacy. We will return to the other models in chapter 9.


Variables in Reasoning

The momentum of the mind is all toward abstraction.
Wallace Stevens, "Adagia"

At this point, we need to return to the discussion of quantifiers and variables that we began in chapter 3 and see how we can incorporate them in our framework. Chapter 3 described a representation that allows us to express quantified sentences without explicit quantifiers, such as the logician's ∃ (for some) and ∀ (for all). This Skolemized representation simplifies the deduction system to some extent, since we don't need to manipulate quantifiers, but it still forces us to consider the variables (x, y, ...) and temporary names (a, b, a_x, b_y, ...) that the quantifiers leave in their wake. Recall that, in this notation, universally quantified sentences, such as Everything is equal to itself, are represented in terms of variables, x = x; existentially quantified sentences, such as Something is fragile, are represented in terms of temporary names, Fragile(a); and sentences with combinations of universal and existential quantifiers, such as Every satellite orbits some planet, are represented in terms of combinations of variables and temporary names, with subscripts to indicate dependencies, IF Satellite(x) THEN (Planet(a_x) AND Orbits(x, a_x)). The first section of this chapter proposes a way for PSYCOP to deal with these variables and temporary names, a method that gives it most of the power of predicate logic. The second section looks at some formal results concerning the correctness of the system. The following chapter takes up experimental data on reasoning with classical syllogisms and on reasoning with more complex arguments containing multivariable sentences. The importance of extending the core system to variables lies partly in its ability to express general rules. We would like a deduction system to be able to carry out cognitive operations over a range of examples without having to fix these examples in advance.
For instance, the system should be able to store the definition that any integer that is evenly divisible by 2 is an even number and to use this definition to realize that 752, for example, is even. PSYCOP must recognize new instances as something about which it has general knowledge, and the obvious way to do this in the current framework is to have it instantiate variables to the names of the instances or generalize the instances to match the variables. The problem-solving illustrations in chapter 3 provide a taste of the advantages of this approach; some additional cases with more of a cognitive-psychology flavor appear in part III. It is worth emphasizing, however,


Chapter 6

that instantiation or generalization is a feature of any reasonable cognitive system. Production systems (e.g., Newell 1990), schema theories (Brachman and Schmolze 1985), and network theories (Ajjanagadde and Shastri 1989; Smolensky 1990) all need some technique for binding variables in their data structures to specific examples. Results on generalizing and instantiating are likely to have implications, then, beyond the PSYCOP system itself. Moreover, variables have a close relationship to pronouns in natural language. Understanding the sentence Calvin gave the clock to the woman who fixed it presupposes a way of correctly associating who with the woman and it with the clock. Variables allow us to make these relationships explicit; for example, we can translate this sentence as Calvin gave the clock x to the woman y such that y fixed x. Thus, variables, or indexing devices very similar to them, are independently needed to represent the products of the comprehension process.

Extending the Core System to Variables

By working with quantifier-free representations, we can avoid some of the problems of quantifier introduction and elimination rules that were noted in chapter 2. But in order to incorporate these representations in our proofs we must make some additional provisions. First, we confine our attention to sentences in our quantifier-free notation that are logically equivalent to sentences in classical predicate logic. As was noted in chapter 3, quantifier-free form provides some extra degrees of freedom, allowing us to express dependencies among temporary names and variables that CPL outlaws. However, the very same dependencies create difficulties if we want to manipulate quantifier-free sentences with the types of natural-deduction rules we have used so far. Some of these rules require us to negate an arbitrary sentence or extract an antecedent from an arbitrary conditional, and these operations aren't expressible in quantifier-free form if the unnegated sentence or the antecedent is not also representable in CPL. (See Barwise 1979 on similar logics with branching quantifiers.) We will therefore work with quantifier-free sentences that are also expressible in CPL. Second, as already mentioned, we need some regulations for matching or unifying the variables and temporary names that the quantifiers leave behind. We need to be able to deduce, for example, a specific



conclusion from a more general premise. Third, we must modify parts of the sentential rules in order to pass variable bindings from one part of the rule to another. These modifications will be considered informally in this section; some more rigorous justifications will be offered in the next.

A Sample Proof with Variables and Temporary Names

To see what is needed to adapt the system, let us consider what a proof with variables and names should look like. We can take as an example the following simple syllogism from chapter 2: All square blocks are green blocks; Some big blocks are square blocks; Therefore, some big blocks are green blocks. For reasons that we will consider later, we must first make sure that no sentence in the argument shares variables or temporary names with any other. Also, temporary names in the premises must be distinguished from temporary names in the conclusion; we will do this by placing a caret over the former. Thus, the quantifier-free form of our syllogism is as shown in (1).

(1) IF Square-block(x) THEN Green-block(x).
    Big-block(â) AND Square-block(â).
    Big-block(b) AND Green-block(b).

The proof of this argument is shown in figure 6.1. As before, double solid lines indicate that matching has taken place; but this time matching will couple, not only identical propositions, but also certain propositions whose variables and names differ. Initially, the proof consists of just the two premises and the conclusion (sentences 1, 2, and 5 in the figure). As the first deductive step, we can split the second premise, Big-block(â) AND Square-block(â), into its two conjuncts by means of Forward AND Elimination; these appear as sentences 3 and 4. No other forward rules are applicable at this point, though, so we must try using a backward rule on the conclusion. Backward AND Introduction is the obvious choice, since the conclusion has the form of a conjunction (i.e., Big-block(b) AND Green-block(b)), and this rule therefore proposes Big-block(b)? as a subgoal (sentence 6). This subgoal asks whether there is a big block, and as it happens we already know there must be one because of the assertion Big-block(â) in sentence 3. Hence, the subgoal should match (i.e., be fulfilled by) this assertion, as the double lines indicate in the figure.

[Figure 6.1: proof diagram not reproduced.]

Figure 6.1
PSYCOP's proof of the syllogism
IF Square-block(x) THEN Green-block(x).
Big-block(â) AND Square-block(â).
Big-block(b) AND Green-block(b).
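The order of steps in the proof of figure 6.1 can be traced with a short Python sketch. This is only an illustration under strong simplifying assumptions, not PSYCOP's actual machinery: every predicate takes one argument, conditionals are stored as a mapping from consequent predicate to antecedent predicate, there is no backtracking, and all function names are hypothetical. A name of None stands for the unbound temporary name b in the conclusion.

```python
# Illustrative sketch only; names and data layout are hypothetical.

def forward_and_elimination(conjuncts):
    """Split a conjunction (a list of atoms) into separate assertions."""
    return list(conjuncts)

# Sentences 3 and 4: the conjuncts of Big-block(â) AND Square-block(â).
assertions = forward_and_elimination([("Big-block", "â"), ("Square-block", "â")])

# IF Square-block(x) THEN Green-block(x), indexed by its consequent predicate.
conditionals = {"Green-block": "Square-block"}

def prove(pred, name, trace):
    """Backward search for the subgoal pred(name)?.
    name=None means 'some individual' (the conclusion's temporary name b).
    Returns the name that fulfils the subgoal, or None on failure."""
    trace.append(f"{pred}({name or 'b'})?")
    for assertion_pred, assertion_name in assertions:
        if assertion_pred == pred and name in (None, assertion_name):
            return assertion_name          # subgoal matched to an assertion
    if pred in conditionals:               # Backward IF Elimination
        return prove(conditionals[pred], name, trace)
    return None

def prove_conjunction(preds):
    """Backward AND Introduction: the binding found for the first conjunct
    is substituted into the subgoal for the next conjunct."""
    trace, name = [], None
    for pred in preds:
        name = prove(pred, name, trace)
        if name is None:
            return False, trace
    return True, trace

ok, trace = prove_conjunction(["Big-block", "Green-block"])
print(ok)     # True
print(trace)  # ['Big-block(b)?', 'Green-block(â)?', 'Square-block(â)?']
```

The three subgoals in the trace correspond to sentences 6, 7, and 8 of the figure; the point of the sketch is that the second subgoal is set for â rather than for b.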

At this stage of the proof, Backward AND Introduction has done half its job of proving the conclusion. The conclusion of (1) demands that we prove that some block is both big and green, and AND Introduction has so far succeeded in showing that there is a big block (namely, â). It remains to be shown that this same block is also green. (We can't prove the conclusion merely by showing that some block is big and some block green, since two different blocks might be involved.) For this reason AND Introduction must ensure that the second subgoal it sets is Green-block(â)?, as shown in sentence 7 of figure 6.1. That is, AND Introduction must substitute the new temporary name â for b in the second part of the conclusion. This illustrates one of the ways in which we will need to modify the backward rules. We have no assertion that states outright that â is a green block, but we do have a conditional premise that tells us that if something is a square block then it is green. This suggests that we should use a variation on our old backward IF Elimination strategy: Since we want to show Green-block(â)? and since we know IF Square-block(x) THEN Green-block(x), it makes sense to try to prove Square-block(â)?. Backward IF Elimination makes the relevant substitution in the antecedent of the conditional and produces this subgoal as sentence 8. However, this subgoal is easy to fulfill, for we have already deduced Square-block(â) at the beginning of the proof; it's just sentence 4. Since â is square, IF Elimination tells us it must be green, and this completes the second half of the proof of the syllogism. (Further examples of proofs using the newly modified rules appear in chapters 7 and 8.)

Matching

In the proof of figure 6.1, we needed the fact that the subgoal Big-block(b)? could be fulfilled by the assertion Big-block(â), despite the fact that these sentences have different temporary names. This raises the general question of when we can deduce one sentence from another that has the same predicates and logical constants but may have different variables and names. To see the issue associated with matching, consider the subgoal Red(a)? (i.e., Is anything red?), and suppose that the assertions of our proof include Red(y) (i.e., Everything is red). Intuitively, this assertion suffices to prove the subgoal, since Red(y) means that everything is red and Red(a) means that there is a particular individual that is red. Hence, if Red(y) is true, so is Red(a). By contrast, if Red(y)? were the subgoal, we could not fulfill it by means of an assertion Red(a), since the first of these sentences is more general than the second.
We clearly need an explicit policy for deciding when we can match variables and temporary names in assertions and subgoals. Let us call two sentences isomorphic if they are alike except for their variables and names. Thus, isomorphic sentences must have exactly the same predicates and logical constants in the same structural relations, but can differ in their arguments. (For example, Red(a) is isomorphic to Red(x), but not to Green(a).) Our example suggests that we should be



allowed to match subgoals to some isomorphic assertions, but not to all. The simplest way to proceed might be to consider the variables and names of the subgoal one at a time. If the variable or name can be matched to one in a corresponding position in the assertion (according to the rules we are about to state), then we can produce a new subgoal that is also isomorphic to the original but that has the new variable or name from the assertion substituted for the old one in the subgoal. We continue in this way until we either create a subgoal that is identical to an assertion or fail in the matching process. For example, suppose we have Similar(x,y) as a premise (everything is similar to everything), and Similar(Kathleen Turner, Lauren Bacall) as a conclusion (Kathleen Turner is similar to Lauren Bacall). To show that the premise implies the conclusion, we can generalize the latter sentence to Similar(x, Lauren Bacall) (i.e., everything is similar to Bacall) and then generalize again to obtain the premise. In effect, this process generalizes the original subgoal incrementally in order to mate it with an assertion. The rules in table 6.1 specify the conditions that govern this generalization process. Basically, these rules permit variables in subgoals to match variables in assertions, permanent names in subgoals to match variables in assertions, and temporary names in subgoals to match variables, permanent names, or temporary names. However, we need to impose some restrictions on matching to avoid fallacious proofs, just as we had to restrict the quantifier introduction and elimination rules in chapter 2. As a preliminary step in carrying out proofs in quantifier-free form, we must make sure that each sentence in the argument to be proved contains distinct variables and temporary names.
For example, if one premise states that some integer is even and another states that some integer is odd, we don't want to symbolize the former as Integer(a) AND Even(a) and the latter as Integer(a) AND Odd(a). The usual AND Introduction and Elimination rules would then permit us to conclude Integer(a) AND Even(a) AND Odd(a), which is to say that some integer is both even and odd. We must also take the further precaution of differentiating the temporary names that appear in the premises from those that appear in the conclusion, using carets over all temporary names in the premises as we did earlier. Hence, if Integer(a) AND Even(a) is a premise it will appear as Integer(â) AND Even(â), but if it is a conclusion it will appear as plain Integer(a) AND Even(a). In a sense, temporary names in the premises


Table 6.1
Matching rules for predicate logic. In these rules, P(n) represents an arbitrary quantifier-free sentence that contains argument n. P and P' are said to be isomorphic if they are identical except for the particular variables and temporary names that they contain (i.e., they must have the same predicates and connectives in the same positions). Some backward rules (e.g., IF Elimination) require a subgoal to match a subformula of an assertion, and in that case the phrase "target formula" in the following rules refers to that subformula. Otherwise, the target formula is a complete assertion.

Matching 1 (variables in subgoal to variables in assertion)
Conditions:
  (a) P(x) is a target formula and P'(y) is an isomorphic subgoal.
  (b) x and y are variables.
  (c) wherever x appears in (a nonsubscript position in) P(x), either y or a temporary name t_y with y as subscript appears in P'(y).
  (d) if t_y appears in the same position as x, then t_y appears only in positions occupied by variables.
  (e) wherever x appears as a subscript in P(x), y appears as a subscript in P'(y).
Action: Add the subgoal of proving P'(x) to the list of subgoals, where P'(x) is the result of:
  (a) substituting x for all occurrences of y in P'(y).
  (b) substituting x for all occurrences of t_y in P'(y) if t_y appeared in some of the same positions as x.
  (c) substituting x for y in P'(y) at each remaining subscript position that y occupies in P'(y).

Matching 2 (temporary names in subgoal to temporary or permanent names in assertion)
Conditions:
  (a) P(n) is a target formula and P'(t) is an isomorphic subgoal.
  (b) t is a temporary name, and n is a (temporary or permanent) name.
  (c) any subscripts of n are included among the subscripts of t.
  (d) t is not of the form â.
  (e) either n or a variable appears in P(n) in all positions that t occupies in P'(t).
Action: Add the subgoal of proving P'(n) to the list of subgoals, where P'(n) is the result of substituting n for t in P'(t) at all positions that t occupies in P'(t).

Matching 3 (permanent names in subgoal to variables in assertion)
Conditions:
  (a) P(x) is a target formula and P'(m) is an isomorphic subgoal.
  (b) x is a variable and m is a permanent name.
  (c) m appears in P'(m) in each nonsubscript position that x occupies in P(x).
  (d) if a temporary name t appears in P'(m) in the same position that an x-subscripted temporary name t_x appears in P(x), then any additional tokens of t also appear in positions occupied by t_x.
Action: Add the subgoal of proving P'(x) to the list of subgoals, where P'(x) is the result of:
  (a) substituting x for m in P'(m) at each nonsubscript position that x occupies in P'(x).
  (b) substituting t' for t in P'(m) wherever t occurs in the same position as an x-subscripted temporary name in P(x) and where t' is a new temporary name with subscripts consisting of x and all subscripts of t.

Matching 4 (temporary names in subgoal to variables in assertion)
Conditions:
  (a) P(x_1, ..., x_k) is a target formula and P'(t) is an isomorphic subgoal.
  (b) x_1, ..., x_k are variables and t is a temporary name.
  (c) x_1 or x_2 or ... x_k appears in nonsubscript positions in P(x_1, ..., x_k) at each position that t occupies in P'(t).
  (d) if a temporary name t' appears in P'(t) in the same position that an x_i-subscripted temporary name t_xi appears in P(x_1, ..., x_k), then any additional tokens of t' also appear in positions occupied by t_xi.
Action: Add the subgoal of proving P'(x_1, ..., x_k) to the list of subgoals, where P'(x_1, ..., x_k) is the result of:
  (a) substituting x_i for t wherever t occupies the same position in P'(t) that x_i occupies in P(x_1, ..., x_k).
  (b) substituting t'' for t' in P'(t) wherever t' occurs in the same position as an x_i-subscripted temporary name in P(x_1, ..., x_k) and where t'' is a new temporary name whose subscripts consist of x_i and all subscripts of t'.
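The headings of the four rules in table 6.1 already fix which kinds of argument may be paired at all. The following Python sketch encodes just that pairing, together with condition (d) of Matching 2. It is a deliberately partial illustration: it checks a necessary condition for matching a single argument, and it ignores subscripts and the distribution conditions on positions. The names Kind, ALLOWED, and kinds_compatible are hypothetical.

```python
from enum import Enum

class Kind(Enum):
    VARIABLE = "variable"
    TEMPORARY = "temporary name"
    PERMANENT = "permanent name"

# Subgoal kind -> assertion kinds it may be matched to (headings of table 6.1).
# Identical permanent names need no matching rule and are not covered here.
ALLOWED = {
    Kind.VARIABLE: {Kind.VARIABLE},                                    # Matching 1
    Kind.PERMANENT: {Kind.VARIABLE},                                   # Matching 3
    Kind.TEMPORARY: {Kind.VARIABLE, Kind.TEMPORARY, Kind.PERMANENT},   # Matching 2, 4
}

def kinds_compatible(subgoal_kind, assertion_kind, subgoal_has_caret=False):
    """Necessary (not sufficient) condition for matching one subgoal argument
    to one assertion argument; subscripts and positions are ignored."""
    if (subgoal_kind is Kind.TEMPORARY
            and assertion_kind is not Kind.VARIABLE
            and subgoal_has_caret):
        return False  # Matching 2, condition (d): t is not of the form â
    return assertion_kind in ALLOWED[subgoal_kind]

# Red(a)? can be fulfilled by the assertion Red(y) ...
print(kinds_compatible(Kind.TEMPORARY, Kind.VARIABLE))  # True
# ... but Red(y)? cannot be fulfilled by an assertion about one individual.
print(kinds_compatible(Kind.VARIABLE, Kind.TEMPORARY))  # False
```

A fuller implementation would also have to check the conditions on subscripts and on the positions that each argument occupies, which is where most of the work in table 6.1 lies.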



Second, we must observe the opposite distribution requirement when matching a temporary name such as b in the subgoal to a name (permanent or temporary) in the assertion. In this case, the assertion name (or some variable) must cover each position in which b appears in the subgoal. As (3) shows, breaking this rule would allow us to prove that there is something that is not identical to itself from the premise that there are two things that are nonidentical. (3)

a. â ≠ ĉ
b. â ≠ b     Matching
c. b ≠ b     Matching

Again, the converse argument is valid and is not blocked by the rules of table 6.1. (In the following section, we will return to these rules in order to show that they never lead to invalid inferences.) The matching rules are for subgoal-to-assertion matching, which is the most important kind. But there may be situations when we need to match one assertion (or part of an assertion) to another. For example, it follows from the assertions IF Aardvark(x) THEN Mammal(x) and Aardvark(Ginger) that Mammal(Ginger). As currently formulated, however, the obvious rule for generating this sentence, Forward IF Elimination, does not apply, since Aardvark(Ginger) isn't the same as the antecedent Aardvark(x). To get around this problem, we could modify the Forward IF Elimination rule, allowing it to match a conditional's antecedent to another assertion. Setting x to Ginger would give us IF Aardvark(Ginger) THEN Mammal(Ginger), which would lead to the right conclusion. The trouble is that the self-constraining nature of IF Elimination no longer holds under this type of matching. If the conditional contains a subscripted temporary name in the consequent, then matching of this type can produce an infinite number of new assertions. For instance, suppose we have the conditional assertion IF Integer(x) THEN (Successor(x, â_x) AND Integer(â_x)), which states that every number has a successor. If the proof also contains Integer(0), then matching and Forward IF Elimination will produce Successor(0, â_0) AND Integer(â_0) and, hence, Integer(â_0) by AND Elimination. This latter sentence, however, can also be matched to the antecedent of the conditional, leading to Integer(â_(â_0)), and so forth.
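The runaway character of this kind of forward instantiation is easy to see in a small Python sketch. The generator below is hypothetical and does only one thing: starting from Integer(0), it repeatedly applies the successor conditional and records the new Integer assertion that AND Elimination would extract. Square brackets stand in for subscripts, so â[0] is the temporary name â subscripted by 0.

```python
from itertools import islice

def successor_chain(start="0"):
    """Yield the Integer(...) assertions that unrestricted forward matching
    would keep producing from IF Integer(x) THEN
    (Successor(x, â_x) AND Integer(â_x)). The generator never terminates."""
    name = start
    while True:
        yield f"Integer({name})"
        # Instantiating x to `name` yields a temporary name subscripted by it.
        name = f"â[{name}]"

# Without an external cap the forward rule would never stop firing.
first_four = list(islice(successor_chain(), 4))
print(first_four)
# ['Integer(0)', 'Integer(â[0])', 'Integer(â[â[0]])', 'Integer(â[â[â[0]]])']
```

The cap imposed by islice is an artifact of the demonstration; nothing in the rule itself limits the chain.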



There may be ways around this problem,¹ but one obvious possibility is to leave well enough alone: We may not want to make inferences such as Mammal(Ginger) on the fly, for reasons of cognitive economy, especially if we have many other assertions: Aardvark(Marie), Aardvark(Fred), and so on. If we really need to prove Mammal(Ginger), we can always do so by means of the Backward IF Elimination rule (in the modification discussed in the next subsection). Of course, Church's Theorem (see chapter 3) makes it inevitable that there will be some infinite searches in any deduction system that is complete with respect to CPL, and it is possible to construct examples in which Backward IF Elimination can produce a subgoal that can be taken as input to a second application of the same rule, and so on (Moore 1982). However, it seems reasonable to shield the forward rules from these difficulties, given the automatic way in which these rules operate in our present formulation. Backward rules allow more opportunity for controlling these problems, since we can choose to abandon a series of subgoals if it appears to be leading nowhere. In special contexts where there is no danger of unconstrained production, we might find a place for a Forward IF Elimination rule of the sort envisioned in the preceding paragraph; in general, however, the difficulties with this rule seem to outweigh its advantages. For the time being we will keep the forward rules in their old versions, which don't include an instantiation step.

Modifications to Rules

Some New Forward Rules

In addition to carrying over the old forward rules from table 4.1, we increase our stock with new ones that seem well adapted to predicate-argument structure. These rules do not involve instantiation and so do not present the problems that beset the modifications to the forward rules we just considered. In particular, Braine and Rumain (1983) and Guyote and Sternberg (1981) suggest rule schemas similar to the ones listed here in table 6.2. The Transitivity rule allows us to conclude that IF F(x) THEN H(x), provided that IF F(x) THEN G(x) and IF G(x) THEN H(x). In other words, this rule licenses the simple universal syllogism from all F are G and all G are H to all F are H. The Exclusivity rule captures a similar intuition that if all F are G and no G are H, then no F are H either. In our notation, this becomes NOT(F(x) AND H(x)), provided that IF F(x) THEN G(x) and NOT(G(x) AND H(x)). Finally, the Conversion rule allows us to conclude that no G are F on the grounds that no F are G, which amounts to reversing the order of the conjuncts under negation: NOT(F(x) AND G(x)) entails NOT(G(x) AND F(x)). All three rules seem obvious and are self-constraining in the sense of chapter 3. As will be shown in chapter 7, they appear to play a crucial role in the way subjects handle Aristotelian syllogisms.

Table 6.2 New forward inference rules for predicate logic.

Transitivity
    IF F(x) THEN G(x)
    IF G(y) THEN H(y)
    -----------------
    IF F(z) THEN H(z)
  (a) If sentences of the form IF F(x) THEN G(x) and IF G(y) THEN H(y) hold in some domain D,
  (b) and IF F(z) THEN H(z) does not yet hold in D,
  (c) then add IF F(z) THEN H(z) to D.

Exclusivity
    IF F(x) THEN G(x)
    NOT(G(y) AND H(y))
    ------------------
    NOT(F(z) AND H(z))
  (a) If sentences of the form IF F(x) THEN G(x) and NOT(G(y) AND H(y)) hold in some domain D,
  (b) and NOT(F(z) AND H(z)) does not yet hold in D,
  (c) then add NOT(F(z) AND H(z)) to D.

Conversion
    NOT(F(x) AND G(x))
    ------------------
    NOT(G(y) AND F(y))
  (a) If a sentence of the form NOT(F(x) AND G(x)) holds in some domain D,
  (b) and NOT(G(y) AND F(y)) does not yet hold in D,
  (c) then add NOT(G(y) AND F(y)) to D.

Modifications to Backward Rules

The use of quantifier-free form forces some revamping in the backward rules of table 4.2. For one thing, many of the backward rules require the theorem prover to coordinate several sentences, and this means that the rules must check that any matching that takes place for one sentence carries over to the others. For example, suppose we want to prove the conclusion Child-of(Ed,a) AND Child-of(Ann,a); that is, there is someone who is the child of Ed and Ann. Backward AND Introduction tells us that we can prove this conclusion if we can prove each conjunct. But if we fulfill the subgoal Child-of(Ed,a) by matching to an assertion like Child-of(Ed,Benjamin), we must then try to show that Child-of(Ann,Benjamin). We can't get away with matching a to one child in fulfilling the first conjunct and to another child in fulfilling the second. The proof of figure 6.1 presented a similar problem for AND Introduction, since we had to show that the same item was both a big block and a green block. Coordination of this sort is also necessary for other rules that require both locating an assertion and fulfilling a subgoal (e.g., Backward IF Elimination).

One way to coordinate matching is to keep a record of the substitutions that occur when one condition of a rule is satisfied and to impose these same substitutions on sentences that appear in later conditions. In the preceding example, we matched a to Benjamin in fulfilling the first subgoal of Backward AND Introduction and then substituted Benjamin for a in the second subgoal. We can think of each match as producing an ordered pair, such as (a, Benjamin), in which the first member is the "matched" argument and the second member the "matching" argument. In carrying out later parts of the rule, these pairs must be used to produce new versions of subgoals or assertions in which each occurrence of a previously matched argument is replaced by its matching argument. A reformulated AND Introduction rule that includes this substitution step appears in table 6.3.

A related point concerns the consequences of matching one temporary name to another. Suppose we again want to prove Child-of(Ed,a) AND Child-of(Ann,a). We certainly want the subgoal Child-of(Ed,a) to match an assertion such as Child-of(Ed,b): The subgoal asks whether anyone is the child of Ed, and the assertion states that someone is indeed the child of Ed. But once we have matched these sentences and substituted b for a to obtain Child-of(Ann,b), we are no longer free to match b to other temporary or permanent names. In particular, we can't at this point satisfy Child-of(Ann,b) by matching to Child-of(Ann,f), since f may denote someone other than the person denoted by b. There may be someone who is the child of Ed and someone who is the child of Ann without Ed and Ann's having any children in common. In effect, temporary names that come from premises and other assertions act like permanent names with respect to matching.
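The coordination step can be sketched in a few lines. This is an illustration under simplifying assumptions, not the rule of table 6.3: sentences are tuples of a predicate followed by its arguments, the helper names (match, substitute, prove_conjunction) are hypothetical, and the matching conditions of table 6.1 are reduced to "a subgoal temporary name may match any single assertion name, and everything else must be identical."

```python
def match(subgoal, assertion, temporary):
    """Return the (matched, matching) pairs as a dict, or None on failure.
    `temporary` is the set of subgoal temporary names still free to match."""
    if subgoal[0] != assertion[0] or len(subgoal) != len(assertion):
        return None
    pairs = {}
    for s_arg, a_arg in zip(subgoal[1:], assertion[1:]):
        if s_arg in temporary:
            # One assertion name must cover every position of the same
            # subgoal temporary name.
            if pairs.setdefault(s_arg, a_arg) != a_arg:
                return None
        elif s_arg != a_arg:
            return None
    return pairs


def substitute(sentence, pairs):
    """Replace each matched argument by its matching argument."""
    return (sentence[0],) + tuple(pairs.get(arg, arg) for arg in sentence[1:])


def prove_conjunction(p, q, assertions, temporary):
    """Backward AND Introduction, restricted to direct matches to assertions."""
    for assertion in assertions:
        pairs = match(p, assertion, temporary)
        if pairs is None:
            continue
        q_prime = substitute(q, pairs)       # impose the recorded substitutions
        remaining = temporary - set(pairs)   # substituted names are no longer free
        if any(match(q_prime, other, remaining) is not None for other in assertions):
            return pairs
    return None


goal_p = ("Child-of", "Ed", "a")
goal_q = ("Child-of", "Ann", "a")
facts = [("Child-of", "Ed", "Benjamin"), ("Child-of", "Ann", "Benjamin")]
print(prove_conjunction(goal_p, goal_q, facts, {"a"}))
```

With the assertions Child-of(Ed,b) and Child-of(Ann,f) in place of the two facts above, the same call fails: once b has been substituted for a, the sketch treats b exactly as it would a permanent name, so it cannot be matched to f.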
For that reason, the matching rule in table 6.1 will not permit matching of tented temporary names in subgoals to names in assertions; and this provides the rationale for distinguishing temporary names that originate in the premises from those that originate in the conclusion. Another modification concerns rules that deal with conditional and negative sentences. As we noticed in defining quantifier-free form, we must be careful in dealing with variables and temporary names when they are within the scope of a negative or in the antecedent of a conditional. For example, the sentence written as (∃x)Rich(x) in standard predicate logic will be symbolized Rich(a) in quantifier-free form, whereas its opposite or



Table 6.3 Backward inference rules for predicate logic. In these rules, an asterisk indicates the result of reversing arguments by means of the procedure described in the text. P and P' denote isomorphic sentences (ones which are identical except for their variables and names). P* is the result of applying the Argument Reversal procedure to P, unless otherwise indicated. Notational variants are isomorphic sentences that differ only by substitution of variables for other variables or temporary names for other temporary names. Conditions for matching isomorphic sentences are described in table 6.1. Procedures for Argument Reversal and Subscript Adjustment appear in table 6.4.

Backward IF Elimination (Modus ponens)
  (a) Set R to the current goal and set D to its domain.
  (b) If R can be matched to R' for some sentence IF P' THEN R' that holds in D,
  (c) then go to Step (e).
  (d) Else, return failure.
  (e) If P' and R' share variables and one or more names or variables in R matched these variables,
  (f) then set P to the result of substituting those names or variables for the corresponding variables in P'. Label the substituting arguments the matched arguments of P, and the residual arguments the unmatched arguments of P.
  (g) Else, set P to P'. Label all its arguments as unmatched.
  (h) Apply Argument Reversal to unmatched arguments of P.
  (i) Apply Subscript Adjustment to output of Step (h). Call the result P*.
  (j) If D does not yet contain the subgoal P* or a notational variant,
  (k) then add the subgoal of proving P* in D to the list of subgoals.

Backward IF Introduction (Conditionalization)
  (a) Set D to domain of current goal.
  (b) If current goal is of the form IF P THEN R,
  (c) and neither D nor its superdomains nor its immediate subdomains contains both supposition P and subgoal R (or notational variants),
  (d) and IF P THEN R is a subformula of the premises or conclusion,
  (e) then let P' be the result of substituting in P new tented temporary names for any variables that P shares with R,
  (f) and label substituting arguments matched arguments and the residual arguments unmatched,
  (g) and set up a subdomain of D, D', with supposition P* (where P* is the result of applying the Argument Reversal procedure to any unmatched arguments in P').
  (h) Add the subgoal of proving R in D' to the list of subgoals.

Backward NOT Elimination
  (a) Set P to current goal and D to its domain.
  (b) If P is a subformula of the premises or conclusion,
  (c) and Q is an atomic subformula in the premises or conclusion,
  (d) and neither D nor its superdomains nor its immediate subdomains contains both supposition NOT P* and subgoal Q or supposition NOT P* and subgoal NOT Q* (or notational variants),
  (e) then set up a subdomain of D, D', with supposition NOT P*,
  (f) and add the subgoal of proving Q AND NOT Q* in D' to the list of subgoals.
  (g) If the subgoal in (f) fails,
  (h) and neither D nor its superdomains nor its immediate subdomains contains both supposition NOT P* and subgoal Q* or supposition NOT P* and subgoal NOT Q (or notational variants),
  (i) then set up a subdomain of D, D'', with supposition NOT P*, and add the subgoal of proving Q* AND NOT Q in D'' to the list of subgoals.


Backward NOT Introduction
  (a) Set D to domain of current goal.
  (b) If current goal is of the form NOT P,
  (c) and P is a subformula (or notational variant of a subformula) of the premises or conclusion,
  (d) and Q is an atomic subformula of the premises or conclusion,
  (e) and neither D nor its superdomains nor its immediate subdomains contains both supposition P* and subgoal Q, or supposition P* and subgoal NOT Q* (or notational variants),
  (f) then set up a subdomain of D, D', with supposition P*,
  (g) and add the subgoal of proving Q AND NOT Q* in D' to the list of subgoals.
  (h) If the subgoal in (g) fails,
  (i) and neither D nor its superdomains nor its immediate subdomains contains both supposition P* and subgoal Q* or supposition P* and subgoal NOT Q (or notational variants),
  (j) then set up a subdomain of D, D'', with supposition P*, and add the subgoal of proving Q* AND NOT Q in D'' to the list of subgoals.

Backward AND Introduction
  (a) Set D to domain of current goal.
  (b) If current goal is of the form P AND Q,
  (c) and D does not yet contain the subgoal P,
  (d) then add the subgoal of proving P in D to the list of subgoals.
  (e) If the subgoal in (d) succeeds, and P is matched to P',
  (f) and P and Q share temporary names,
  (g) then set Q' to the result of substituting in Q any names that were matched to those temporary names.
  (h) Else, set Q' to Q.
  (i) If D does not yet contain the subgoal Q',
  (j) then add the subgoal of proving Q' in D to the list of subgoals.

Backward OR Elimination
  (a) Set R to current goal and set D to its domain.
  (b) If a sentence of the form P OR Q holds in D,
  (c) and both P and Q are subformulas or negations of subformulas of the premises or conclusion (or notational variants of a subformula),
  (d) and neither D nor its superdomains nor its immediate subdomains contains both supposition P and subformula R (or notational variants),
  (e) and neither D nor its superdomains nor its immediate subdomains contains both supposition Q and subformula R (or notational variants),
  (f) then replace any variables that appear in both P and Q with temporary names that do not yet appear in the proof. Call the result P' OR Q'.
  (g) Set up a subdomain of D, D', with supposition P',
  (h) and add the subgoal of proving R in D' to the list of subgoals.
  (i) If the subgoal in (h) succeeds,
  (j) then set up a subdomain of D, D'', with supposition Q',
  (k) and add the subgoal of proving R in D'' to the list of subgoals.

Backward OR Introduction
  (a) Set D to domain of current goal.
  (b) If current goal is of the form P OR Q,
  (c) then if D does not yet contain the subgoal P or a notational variant,
  (d) then add the subgoal of proving P in D to the list of subgoals.
  (e) If the subgoal in (d) fails,
  (f) and D does not yet contain subgoal Q or a notational variant,
  (g) then add the subgoal of proving Q in D to the list of subgoals.

Backward AND Elimination
  (a) Set D to domain of current goal.
  (b) Set P to current goal.
  (c) If P can be matched to P' in some subformula P' AND Q' of a sentence that holds in D,
  (d) and D does not yet contain the subgoal P' AND Q' or a notational variant,
  (e) then add P' AND Q' to the list of subgoals.
  (f) Else, if P can be matched to P' in some subformula Q' AND P' of a sentence that holds in D,
  (g) and D does not yet contain the subgoal Q' AND P' or a notational variant,
  (h) then add Q' AND P' to the list of subgoals.

Backward Double Negation Elimination
  (a) Set D to domain of current goal.
  (b) Set P to the current goal.
  (c) If P matches P' in a subformula NOT NOT P' of a sentence that holds in D,
  (d) and D does not yet contain the subgoal NOT NOT P' or a notational variant,
  (e) then add NOT NOT P' to the list of subgoals.

Backward Disjunctive Syllogism
  (a) Set D to domain of current goal.
  (b) If the current goal Q matches Q' in a sentence (P' OR Q') or (Q' OR P') that holds in D,
  (c) and NOT P' is isomorphic to a subformula of a sentence that holds in D,
  (d) then go to Step (f).
  (e) Else, return failure.
  (f) If P' shares variables with Q' and one or more names or variables in Q matched these variables,
  (g) then set P to the result of substituting those names or variables for the corresponding variables in P'. Label the substituting arguments the matched arguments of P, and the residual arguments the unmatched arguments of P.
  (h) Else, set P to P', labeling all its arguments as unmatched.
  (i) Apply Argument Reversal to unmatched arguments of P and then Subscript Adjustment. Call the result P*.
  (j) If D does not yet contain the subgoal NOT(P*) or a notational variant,
  (k) then add the subgoal of proving NOT(P*) in D to the list of subgoals.

Backward Disjunctive Modus Ponens
  (a) Set D to domain of current goal.
  (b) Set R to current goal.
  (c) If R can be matched to R' for some sentence IF P' OR Q' THEN R' that holds in D,
  (d) then go to Step (f).
  (e) Else, return failure.
  (f) If P' shares variables with R' and one or more names or variables in R matched these variables,
  (g) then set P to the result of substituting those names or variables for the corresponding variables in P'. Label the substituting arguments the matched arguments of P, and the residual arguments the unmatched arguments of P.
  (h) Else, set P to P', labeling all its arguments as unmatched.
  (i) Apply Argument Reversal and then Subscript Adjustment to P. Label the result P*.
  (j) If D does not yet contain the subgoal P* or a notational variant,
  (k) then add the subgoal of proving P* in D to the list of subgoals.
  (l) If the subgoal in (k) fails,
  (m) and Q' shares variables with R' and one or more names or variables in R matched these variables,
  (n) then set Q to the result of substituting those names or variables for the corresponding variables in Q', labeling the arguments as in Step (g).
  (o) Else, set Q to Q', labeling all its arguments as unmatched.
  (p) Apply Argument Reversal and then Subscript Adjustment to Q. Label the result Q*.
  (q) If D does not yet contain the subgoal Q* or a notational variant,
  (r) then add the subgoal of proving Q* in D to the list of subgoals.

Backward Conjunctive Syllogism
  (a) Set D to domain of current goal.
  (b) If the current goal is of the form NOT Q,
  (c) and Q matches Q' in a sentence NOT(P' AND Q') or NOT(Q' AND P') that holds in D,
  (d) then go to Step (f).
  (e) Else, return failure.
  (f) If P' shares variables with Q' and one or more names or variables in Q matched these variables,
  (g) then set P to the result of substituting those names or variables for the corresponding variables in P'. Label the substituting arguments the matched arguments of P, and the residual arguments the unmatched arguments of P.
  (h) Else, set P to P', labeling all its arguments as unmatched.
  (i) Apply Argument Reversal to unmatched arguments of P and then Subscript Adjustment. Call the result P*.
  (j) If D does not yet contain the subgoal P* or a notational variant,
  (k) then add the subgoal of proving P* in D to the list of subgoals.

Backward DeMorgan (NOT over AND)
  (a) Set D to domain of current goal.
  (b) If current goal is of the form (NOT P) OR (NOT Q),
  (c) and some subformula of a sentence that holds in D is of the form NOT(P' AND Q'),
  (d) and P AND Q matches P' AND Q',
  (e) and D does not yet contain the subgoal NOT(P' AND Q') or a notational variant,
  (f) then add NOT(P' AND Q') to the list of subgoals.

Backward DeMorgan (NOT over OR)
  (a) Set D to domain of current goal.
  (b) If current goal is of the form (NOT P) AND (NOT Q),
  (c) and some subformula of a sentence that holds in D is of the form NOT(P' OR Q'),
  (d) and P OR Q matches P' OR Q',
  (e) and D does not yet contain the subgoal NOT(P' OR Q') or a notational variant,
  (f) then add NOT(P' OR Q') to the list of subgoals.



contradictory, NOT (∃x)Rich(x) (= (∀x)NOT Rich(x)), will become NOT Rich(x). Variables are interpreted as if they were attached to quantifiers having wide scope. The procedure for translating sentences from classical predicate logic to quantifier-free form (chapter 3) mandates this interpretation, since it first converts the CPL sentence to prenex form before dropping the quantifiers. This means that if we want to deduce two sentences that are contradictory, as we must for the NOT Introduction and NOT Elimination rules, we must prove Rich(a) and NOT Rich(x). These sentences are truly contradictory, despite their appearance, since they mean that someone is rich and no one is rich. By contrast, the pair Rich(a) and NOT Rich(a), as separate sentences, are not contradictory, since together they merely assert that someone is rich and someone is not rich. These sentences are "subcontraries" in scholastic logic. The same sort of adjustment must be made for IF Elimination and other rules that handle conditionals. Take the sentence IF Rich(a) THEN Famous(y), which means that someone is such that if she is rich then everyone is famous. If we want to use IF Elimination with this sentence in order to derive Famous(y), we must also have Rich(x) and not just Rich(a). To prove that everyone is famous, it is not enough to show that there is one rich person (since that person might not be the one relevant to the conditional). We must show instead that everyone is rich. For these reasons, we sometimes need to reverse the roles of variables and temporary names in sentences on which the rules for IF and NOT operate. Usually, this is simply a matter of transforming some sentence P into an isomorphic sentence P* in which temporary names in P are replaced by new variables in P* and variables in P are replaced by new temporary names in P*. The main complication is due to subscripts, since we have to determine which of P*'s temporary names should be subscripted by which of its variables. The procedures in table 6.4, called Argument Reversal and Subscript Adjustment, yield the correct pattern of subscripts for the backward deduction rules. Some of the rules (e.g., NOT Introduction) require just Argument Reversal. Other rules (e.g., IF Elimination) require both Argument Reversal and Subscript Adjustment in order to deal with variables and temporary names that have been substituted from an earlier subgoal. The next section of this chapter gives proofs of the soundness of some of these rules, proofs that also show why the procedures of table 6.4 are necessary. Table 6.3 collects the changes to all the backward rules.



Table 6.4 Argument Reversal and Subscript Adjustment for backward rules.

Argument Reversal
Let P be an arbitrary sentence in quantifier-free form. Then the following steps produce the argument-reversed, isomorphic sentence P_r:
  1. Assign to each temporary name a_i that appears in P a variable y_i that has not yet appeared in the proof.
  2. Assign to each variable x_i in P a temporary name b_i that has not yet appeared in the proof.
  3. Let {a_i} be the set of all temporary names in P that do not have x_i as a subscript. Let {y_i} be the set of variables assigned in step 1 to the a_i's in {a_i}. Subscript b_i with the variables in {y_i}.
  4. Replace each a_i in P with the y_i assigned to it in step 1. Replace each x_i in P with the b_i assigned to it in step 2, together with the subscripts computed in step 3. The result is P_r.

Subscript Adjustment
Let P be a sentence in quantifier-free form, and let P_r be the result of applying the Argument Reversal procedure above to unmatched arguments of P (i.e., arguments that do not arise from substitution). The following operations yield the subscript-adjusted sentence P* when applied to P_r:
  1. Let c_j be a temporary name in P_r that derives from reversing an unmatched variable x_j in P. For each such c_j, add to its subscripts any variable y_i in P_r, provided either of the following conditions is met:
     (a) y_i was matched to a variable x_i in P or y_i is a subscript of a temporary name that matched x_i in P, and no temporary name in P contains x_j but not x_i.
     (b) y_i is a subscript of a temporary name that matched some temporary name b_i in P, and b_i does not have x_j as subscript.
  2. Let d_j be a temporary name in P_r that derives from matching to some variable x_j of P. Then for each such d_j, add to its subscripts any variable x_i in P_r, provided all the following conditions are met:
     (a) x_i derives from reversing a temporary name b_i from P.
     (b) b_i does not have x_j as subscript.
     (c) b_i does not have x_k as subscript, where y_k matched x_k and y_k is not a subscript of d_j.
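The four steps of Argument Reversal can be sketched directly. The following is an illustration only, under stated assumptions: it operates on the argument list of a single sentence, the Var and Temp types and the function name are hypothetical, the caller supplies the fresh names, and Subscript Adjustment is not covered. Permanent names are plain strings and pass through unchanged.

```python
from collections import namedtuple

Var = namedtuple("Var", "name")
Temp = namedtuple("Temp", "name subs")  # subs: tuple of subscript variable names


def argument_reversal(args, fresh_vars, fresh_temps):
    """Return the argument-reversed list for the arguments of one sentence."""
    fresh_vars, fresh_temps = iter(fresh_vars), iter(fresh_temps)
    temps, variables = [], []
    for arg in args:
        if isinstance(arg, Temp) and arg not in temps:
            temps.append(arg)
        elif isinstance(arg, Var) and arg not in variables:
            variables.append(arg)
    # Step 1: each temporary name gets a new variable.
    var_for = {t: Var(next(fresh_vars)) for t in temps}
    # Steps 2 and 3: each variable x gets a new temporary name, subscripted
    # by the new variables of those temporary names that lack x as a subscript.
    temp_for = {}
    for x in variables:
        subs = tuple(var_for[t].name for t in temps if x.name not in t.subs)
        temp_for[x] = Temp(next(fresh_temps), subs)
    # Step 4: carry out the replacements.
    return [var_for.get(arg, temp_for.get(arg, arg)) for arg in args]


# Child-of(x, b_x): everyone is the child of someone.
print(argument_reversal([Var("x"), Temp("b", ("x",))], ["y"], ["c"]))
# Child-of(x, a): someone has everyone as a child.
print(argument_reversal([Var("x"), Temp("a", ())], ["y"], ["c"]))
```

In the first call the reversed arguments are an unsubscripted temporary name and a variable, the pattern of "someone is the child of everyone"; in the second, the new temporary name picks up the new variable as a subscript, which is the work done by step 3.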

Formal Properties

In examining the sentential rules, we stressed questions of completeness and decidability. In scrutinizing the way PSYCOP deals with variables, however, we need to give more emphasis to soundness: the system's ability to produce only valid inferences. The main reason for this is that the matching rules in table 6.1 and the deduction rules in table 6.3 are more complex than the sentential rules and not as obviously correct. The methods introduced for handling variables are not as closely linked to earlier methods in formal logic and stand in need of verification. Completeness is less of an issue here. We already know that PSYCOP is not complete with respect to classical sentential logic on the basis of the counterexample of chapter 4, and a similar counterexample is enough to show that PSYCOP in its updated version will also be incomplete with respect to classical predicate logic. For these reasons, the bulk of this section is devoted to finding out whether the system gets into trouble by producing proofs, such as (2) and (3), that are not valid. However, we will look briefly at completeness at the end of the chapter. If you don't need reassurance about soundness, you can skip the proofs in the appendix to this chapter; but it might be helpful, even if you skip those proofs, to look at the material on semantics in the subsection just below, since we will compare semantic models of this sort with mental models in chapter 10.

Soundness of Matching

In order to prove that the rules of table 6.1 are sound, we need a standard of correctness with which to compare them. One way to do this is to link the matching rules directly to a semantic interpretation for our quantifier-free sentences. If we can specify what it means for such a sentence to be true, then we can establish the soundness of the rules by showing that they produce only true conclusions from true premises. In terms of table 2.1, we can try to establish that any argument deducible by these rules is also valid. In using this direct method, however, we should bear in mind that our semantics for quantifier-free sentences is not supposed to be a proposal about how people understand them. "Semantics" here is used in the logician's sense of a formal, set-theoretic method for specifying truth and validity. I will try to argue in chapter 10 that this sense of "semantics" is inappropriate for cognitive purposes, and that it should be sharply distinguished from psycholinguists' use of "semantics" to describe the inferential role of meaningful sentences.
Nevertheless, logical semantics is valuable for our present concerns since it provides an abstract norm against which to test our rules.

Semantics for Quantifier-Free Sentences

Descriptions of formal semantics appear in many logic textbooks (see, e.g., Bergmann et al. 1980 and Thomason 1970a); more advanced treatments can be found in Chang and Keisler 1973 and van Fraassen 1971. The point of this endeavor is to make precise the sense in which the validity of an argument depends on the truth of its premises and its conclusion. We defined validity informally in chapter 2 by saying that an argument is valid if and only if its conclusion is true in all states of affairs in which the premises are true. Formal semantics proceeds by defining a model, a set-theoretic entity that might represent a state of affairs in this definition. Given the concept of a model, we can then restate the definition of validity in mathematically precise terms by substituting "models" for "states of affairs": An argument will be valid if and only if the conclusion is true in every model in which the premises are true. I will first define what it means for a sentence to be true in a model, and then give an example of how the definition works.

A model M for a logical system like ours is simply a pair (D,f), in which D is a nonempty set (the domain of entities to which the names and variables in the language refer). The second part of the model, f, is an interpretation function that designates for each permanent name in the language an individual element in D and for each predicate a set of "tuples" of elements in D: If P is a one-place predicate, f assigns to it a set of elements from D; if P is a two-place predicate, f assigns to it a set of ordered pairs of elements from D; if P is a three-place predicate, f assigns to it a set of ordered triples from D; and so on. These tuples contain the elements that bear the relation specified by the predicate. For example, suppose D is the set of all people. Since Swedish in the sentence Swedish(x) is a one-place predicate, f(Swedish) will be a subset of elements of D (i.e., Swedes in the intended model); since Similar in the sentence Similar(x,y) is a two-place predicate, f(Similar) is the set of all ordered pairs (x,y) of elements of D such that (in the intended model) x is similar to y. To determine the truth of a sentence in our quantifier-free language with respect to a model, we also need to specify possible assignments of the variables and temporary names to elements of D. Let g be such a function that designates for each variable (x, y, etc.) an element of D.
For example, if D consists of all people, then g(x) might equal Napoleon, g(y) Madonna, and g(z) William Estes. For temporary names (a, b_x, c_{x,y}, etc.), we will employ a second function, h, defined in the following way: If a temporary name b has no subscripts, then h assigns it to a single element of D. If a temporary name b_{x_1,...,x_n} has n subscripts, then h assigns it to a function from n-tuples of D elements to D elements. So, for example, h(b) might be Napoleon again; however, h(b_x) would be a function from each element in D to a second (possibly identical) element in D; h(b_{x,y}) would be a function from all ordered pairs of D elements to an element in D; and so on. For example, h(b_x) might have as its value the function from each person to his or her father.

Given a model M = (D,f) and functions g and h, we can specify the truth of an arbitrary sentence S in the model. We do this by defining what it means for g and h to jointly satisfy a sentence in the model and then generalizing over different choices of g and h. Thus, suppose we have an atomic sentence consisting of an n-place predicate P followed by a sequence of n arguments (variables, temporary names, or permanent names) t = (t_1, t_2, ..., t_n). Let t' be the sequence formed by replacing each variable x in t by g(x), each permanent name m in t by f(m), and each temporary name b_{x_1,x_2,...,x_k} by h(b)(g(x_1), g(x_2), ..., g(x_k)). (This last expression is the function picked out by h(b) applied to the k-tuple of elements picked out by (g(x_1), g(x_2), ..., g(x_k)).) For example, consider the sentence Child-of(x,b_x), which says intuitively that every individual is the child of some individual. Then t = (x, b_x) and t' = (g(x), h(b)(g(x))). If g and h are the sample functions mentioned in the last paragraph, then t' = (Napoleon, Napoleon's father). In the intended model, this pair ought to be among those that the predicate Child-of refers to, and if it is we can say that g and h jointly satisfy the sentence Child-of(x,b_x). In symbols, g and h jointly satisfy Child-of(x,b_x) if and only if t' ∈ f(Child-of).

In general, g and h jointly satisfy a sentence S in M if and only if one of the following holds:

  (a) S is an atomic sentence Pt and t' ∈ f(P).
  (b) S has the form S1 AND S2, and g and h jointly satisfy S1 and S2.
  (c) S has the form S1 OR S2, and g and h jointly satisfy S1 or S2.
  (d) S has the form NOT S1, and g and h do not jointly satisfy S1.
  (e) S has the form IF S1 THEN S2, and g and h do not jointly satisfy S1 or g and h jointly satisfy S2.

Finally, a sentence S is true in a model M iff there is a function h such that, for every function g, h and g jointly satisfy S.

To see how this apparatus works, take the sentence IF Satellite(x) THEN Orbits(x,b_x); that is, every satellite orbits some object. The intended model for this sentence might have D as the set of all astronomical objects and f(Satellite) = {x: x is a satellite} and f(Orbits) = {(x,y): x orbits y}. Suppose g assigns x to the moon and h assigns b_x to the function from each individual in D to the object it orbits, if any, and otherwise to Alpha Centauri. Then g and h jointly satisfy the sentence with respect to the model: Clause e above states that the sentence is satisfied if its consequent Orbits(x,b_x) is. But g and h map the arguments (x,b_x) of the consequent to (g(x), h(b)(g(x))) = (moon, earth). Since (moon, earth) ∈ {(x,y): x orbits y}, the consequent is satisfied according to clause a. Furthermore, this sentence is true in the model. No matter which object g assigns to x, g and h will jointly satisfy the sentence. On one hand, if g assigns x to a satellite, h(b)(g(x)) will be the object it orbits, and clause e will satisfy the whole sentence. On the other hand, suppose g assigns x to a nonsatellite. Then the antecedent Satellite(x) will not be satisfied, since a nonsatellite is not a member of {x: x is a satellite}. So the whole sentence will again be satisfied for any choice of g by clause e.

With the notion of truth in a model in hand, we can go on to define semantic entailment and validity, the crucial semantic concepts in table 2.1. A finite set of premises P = {P_1, P_2, ..., P_n} semantically entails a conclusion C iff in all models for which each sentence in P is true C is also true. (This implies that the empty set of premises semantically entails a conclusion whenever the conclusion is itself true in all models, because it is vacuously true in this special case that all sentences of P are true in all models.) We can also take an argument with premises P and conclusion C to be valid iff C is semantically entailed by P.

Semantics and Argument Reversal

When we discussed the deduction rules for negative and conditional sentences, we found that we sometimes had to reverse arguments (temporary names and variables) to preserve inference relationships. For example, to obtain the opposite of a sentence like NOT(Bird(x)) (i.e., nothing is a bird) we had to go to Bird(a) (i.e., something is a bird).
However, the semantics never explicitly mentions reversing arguments for negatives and conditionals. For example, clause d of the definition of satisfaction relates the sentence NOT S1 to S1 rather than to a similar sentence with variables replacing temporary names and temporary names replacing variables. This may seem puzzling at first, but the reason is that the notion of joint satisfaction defined by clauses a-e depends on the relation between specific elements of D (as given by the functions g and h) and formulas of the language; it doesn't directly relate whole sentences. If g(x) is an element of D, for example, then we want g(x) to satisfy a sentence like NOT(Bird(x)) just in case it is not an element of the subset that Bird denotes (i.e., is not a bird, does not satisfy Bird(x)), because any individual element of D must be either a bird or a nonbird. But this satisfaction relation does not imply that NOT(Bird(x)) is true if and only if Bird(x) is false. It had better not, since the first says that everything is a nonbird, whereas the second says that everything is a bird. Thus, NOT(Bird(x)) and Bird(x) can both be false, provided some member of D is a bird while another is a nonbird.

To see that the semantics and the Argument Reversal procedure agree in this example, we need to show that NOT(Bird(x)) and Bird(a), which are contradictory according to Argument Reversal, also have opposite truth values according to the semantics. To do this, notice that if NOT(Bird(x)) is true in some model then, by the definition of truth, g(x) must satisfy NOT(Bird(x)) for all g. That is, no matter which element we pick from D, that element satisfies NOT(Bird(x)). By clause d of the definition of satisfaction, this means that, for all g, g(x) does not satisfy Bird(x). In other words, there is nothing (no element of D) that satisfies Bird(x): nothing is a bird. But this means that there can be no h such that h(a) satisfies Bird(a), and this implies that Bird(a) is false according to our truth definition. So if NOT(Bird(x)) is true, then Bird(a) is false. Since this relationship also holds in the opposite direction, NOT(Bird(x)) is true if and only if Bird(a) is false: These sentences are contradictories, as we claimed earlier.² We prove later in this chapter that the Argument Reversal procedure is correct in general, not just in this simple example.

Are the Matching Rules Sound?

The satisfaction conditions give us the truth of a complex sentence containing any mix of connectives.
However, for purposes of testing the matching rules in table 6.1 we needn't worry about a sentence's internal structure. Each of the matching rules states the conditions for generalizing a given sentence to another isomorphic one: The two sentences have exactly the same pattern of connectives and predicates, and they differ only in the arguments to those predicates. We can therefore regard the entire sentence and its generalization as if they were atomic propositions whose arguments consist of all their names and variables. For example, we can represent the sentence IF Satellite(x) THEN Orbits(x,b_x) from our earlier example as P(x,x,b_x). We note, too, that some of the conditions of the matching rules are irrelevant for soundness. The overall purpose of these rules is to generalize a subgoal until it exactly matches an assertion, and some of the conditions are intended to "aim" the generalization toward a likely target. For example, if we have Similar(Kathleen Turner, Lauren Bacall)? as a subgoal and Similar(x,y) as an assertion, we want to generalize the former sentence to produce something identical to the latter. However, all that is required for the rules to be sound is that, at each generalization step, the more specific sentence is semantically entailed by the more general one. That is, if the rule produces a new generalized subgoal P1 from the original subgoal P2, we need only show that P1 semantically entails P2. Soundness does not depend on whether we eventually achieve a match to the assertion.

Our strategy for proving soundness proceeds in the way illustrated in figure 6.2. We assume that the generalization P1 is true in some model M. In terms of the semantics for our language, this means that there is a function for temporary names, h1, such that, for any function g1 for variables, h1 and g1 will jointly satisfy P1. We also assume, for purposes of exposing a contradiction, that the original subgoal P2 is false in the same model M. Thus, for every function h2, there is a function g2 such that h2 and g2 do not jointly satisfy P2. We can define h2 in a way that makes its assignments quite similar to those of h1, and then consider a g2 that fails to satisfy sentence P2. Finally, we show that there is a g1 that, in conjunction with h1, gives to the terms of P1 the same denotations that h2 and g2 give to the terms of P2. Since P1 and P2 differ only in their terms, h1 and g1 must not satisfy P1, contrary to our initial assumption. This contradiction implies that P2 must also be true in M, and therefore that P1 semantically entails P2. The appendix to this chapter contains proofs of soundness along these lines for each of the rules in table 6.1.

[Figure 6.2: diagram pairing the sentence assumed true (the new subgoal produced by the rule), with its functions h1 for temporary names and g1 for variables, against the sentence assumed false (the original subgoal), with its functions h2 and g2.]

Figure 6.2 Proof strategy for soundness of matching rules: (a) We assume P1 true in M, so that there is an assignment h1 that satisfies P1 for any g. (b) We construct h2 so that h2 and h1 make similar assignments to temporary names. (c) We assume for reductio that P2 is false in M; thus, there exists a g2 such that h2 and g2 do not satisfy P2. (d) We construct g1 so that h1 and g1 give the same denotations to terms of P1 that h2 and g2 give to P2. Thus, h1 and g1 must fail to satisfy P1 in M, contrary to (a).

Soundness of Deduction Rules

The proofs for the matching rules verify PSYCOP's method for deducing a conclusion from a premise that differs from it only in its terms. But, in general, PSYCOP must also deduce conclusions from premises whose forms differ also in the predicates and connectives they contain. For this purpose, it relies on the forward rules of tables 4.1 and 6.2 and the backward rules of table 6.3. Most of these rules are variations on the usual introduction and elimination rules in natural-deduction systems for CPL; however, the backward rules that deal with conditionals and negatives are rather unusual, since they require Argument Reversal, which flips temporary names and variables. So it might be a good idea to show that the reversal procedure gives us the right results.

Soundness with Respect to CPL: Rules for Negation

Although we can establish the soundness of the deduction rules directly through the semantics, as we did with the matching procedures, it might be more revealing to show that the negation rules yield the same inferences we would get had we applied standard CPL rules to the equivalent sentences. If so, then, since the CPL rules are sound, the rules for quantifier-free form must be sound too. To see this, consider first the rules for negatives: Backward NOT Introduction and Backward NOT Elimination. These rules use the argument-reversal procedure as part of a proof by contradiction: They show that some supposition leads to a pair of contradictory sentences and that therefore the opposite of the supposition follows.
What the argument-reversal procedure does is ensure that the pair of sentences is contradictory and that the supposition contradicts the rule's conclusion. In other words, Argument Reversal should take as input a quantifier-free sentence P and produce as output another quantifier-free sentence P* such that P and NOT(P*) (or, alternatively, P* and NOT(P)) are contradictory. The procedure delivers the right result, relative to CPL, if P and NOT(P*) are the translations of two contradictory CPL sentences. Proving the soundness of the negation rules thus amounts to proving the soundness of the argument-reversal procedure.

As an example, the CPL sentence (∀x)(∃y)(∀z)P(x,y,z) becomes P(x,b_x,z) in quantifier-free form, according to the translation procedure in chapter 3. The contradictory of the original sentence is NOT((∀x)(∃y)(∀z)P(x,y,z)) = (∃x)(∀y)(∃z)NOT(P(x,y,z)), which becomes NOT(P(a,y,c_y)) in our notation. Thus, we want the argument-reversal rule to map P(x,b_x,z) to P(a,y,c_y), or some notational variant. To check that it does, we follow the steps of the procedure in table 6.4: (a) We assign b to a new variable, say y'. (b) We assign x and z to new temporary names, a' and c'. (c) The set of temporary names in P(x,b_x,z) that do not have x as subscript is empty, and the set of temporary names that don't have z as subscript is {b}. The variable assigned in step a to b is y'. Hence, we must subscript c' with y'. (d) We therefore replace b_x with y', x with a', and z with c'_y' in P(x,b_x,z) to get P(a',y',c'_y'), which is indeed a notational variant of P(a,y,c_y).

In general, then, we would like to prove the following proposition: Let P be a sentence in quantifier-free form that has been derived from a CPL sentence F, according to the procedure of chapter 3. (Recall that the rules in table 6.3 apply only to sentences that can be derived from CPL formulas in this way.) Let P* be the sentence we get by applying the argument-reversal rule to P. Then NOT(P*) can be derived from a CPL sentence that is logically equivalent to NOT(F). Figure 6.3 illustrates these relationships, and a proof of the proposition appears in the appendix.

Soundness with Respect to CPL: Conditionals

Like negative sentences, conditionals require special handling in our quantifier-free format, and for much the same reason.
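Returning briefly to the argument-reversal steps traced above for P(x,b_x,z), the procedure can be sketched in a few lines of Python. This is a minimal illustration built only from steps (a)-(d) as described in the worked example, not a transcription of table 6.4: the term encoding, the function name reverse_arguments, and the fresh names y1, y2, ... and c1, c2, ... are assumptions of the sketch, and the fresh names are assumed not to clash with names already in the sentence.

```python
# Hypothetical encoding (an assumption of this sketch): a variable is a
# string such as "x"; a temporary name is a pair (name, subscripts), e.g.
# ("b", ("x",)) for b_x. A sentence is reduced to its list of arguments.

def reverse_arguments(terms):
    """Swap variables and temporary names, following steps (a)-(d)."""
    distinct = list(dict.fromkeys(terms))  # distinct terms, first-seen order
    temps = [t for t in distinct if not isinstance(t, str)]
    variables = [t for t in distinct if isinstance(t, str)]

    # (a) Assign each temporary name to a new variable.
    new_var = {t: f"y{i}" for i, t in enumerate(temps, 1)}

    # (b) Assign each variable to a new temporary name, and
    # (c) subscript that name with the variables assigned in (a) to the
    #     temporary names that do NOT carry the variable as a subscript.
    new_temp = {}
    for i, v in enumerate(variables, 1):
        subs = tuple(new_var[t] for t in temps if v not in t[1])
        new_temp[v] = (f"c{i}", subs)

    # (d) Carry out the replacements position by position.
    return [new_temp[t] if isinstance(t, str) else new_var[t] for t in terms]

# P(x, b_x, z) from the example in the text.
original = ["x", ("b", ("x",)), "z"]
reversed_once = reverse_arguments(original)
reversed_twice = reverse_arguments(reversed_once)
print(reversed_once)
print(reversed_twice)
```

On the example from the text, the first call yields a constant, a plain variable, and a temporary name subscripted by that variable, which is the shape of P(a,y,c_y). Applying the function a second time yields a notational variant of the original P(x,b_x,z), as one would expect of an operation meant to produce contradictories.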
In CPL a conditional sentence such as IF (∀x)F(x) THEN (∀y)G(y) is equivalent to (∃x)(∀y) IF F(x) THEN G(y), and this goes into quantifier-free form as IF F(b) THEN G(y). In this case, we simply switch the universal quantifier that is embedded in the antecedent to an existential quantifier when it is brought out front. But there is an additional complication with conditionals that we didn't have to face in dealing with negatives: Some of the variables or temporary names that appear in the antecedent may also be shared by the consequent, and reversing these latter arguments in the antecedent will not do. Rules like Backward IF Elimination must distinguish these global (or matched) arguments from the arguments that are local to the antecedent in order to carry out the deduction correctly.

[Figure 6.3: diagram linking the CPL sentences F, S1, and NOT(F) to the quantifier-free sentences P and NOT(P*) by way of the translation procedure and the argument-reversal procedure.]

Figure 6.3 Proof strategy for soundness of Argument Reversal. Assume P is a quantifier-free sentence that derives from a sentence F that is logically equivalent to the prenex sentence S1. Let P* be the result of applying the Argument Reversal procedure of table 6.4 to P. We then show that NOT(P*) is the translation of a CPL sentence equivalent to NOT(F).

Let us study this situation more carefully in the case of the Backward IF Elimination rule in table 6.3. We are given a conditional assertion IF P' THEN R' and a subgoal R that we can match to the consequent R'. By the results proved earlier, we know that if we can deduce R' then R must follow validly. The IF Elimination strategy is to deduce the consequent R' by deducing the antecedent; but we shouldn't just attempt to prove P' itself. We may have to adjust some of the arguments in P' for the reasons just mentioned. The steps of the rule in table 6.3 spell out how these arguments should be changed. Essentially, we apply the argument-reversal procedure to just the local arguments of the antecedent and then add subscripts to ensure that global variables correctly govern local temporary names. The subscript-adjustment procedure in table 6.4 is responsible for this latter change.



As an example, we can consider a problem related to the African cities database that we looked at in chapter 3. In that context, our domain consisted of locations in Africa, and we used In(Khartoum, Sudan) to mean that Khartoum is in Sudan and In(Sudan, Africa) to mean that Sudan is in Africa. To express the transitivity of the In relation, we relied on the generalization IF In(x,y) AND In(y,z) THEN In(x,z). Now, suppose we want to prove, for some reason, that all locations in the database are in Africa, given the transitivity generalization and the fact that any such location is in some place that is in Africa. This amounts to a proof of (4).

(4) IF In(x,y) AND In(y,z) THEN In(x,z)
    In(w,a_w) AND In(a_w,Africa)
    ------------------------------------
    In(u,Africa)

To prove this by Backward IF Elimination, we must match the conclusion to the consequent of the first premise (steps a-d of the IF Elimination rule in table 6.3). In this process, u will be matched to x and Africa to z. According to steps e-g, we must also substitute the same terms in the antecedent, producing In(u,y) AND In(y,Africa). This is sentence P in the statement of the rule, with u and Africa as matched arguments and y as the sole unmatched argument. In step h, argument reversal applies to this unmatched argument, changing the antecedent to In(u,b) AND In(b,Africa). But this last sentence isn't quite what we want as a subgoal, for it says that there is some particular location b in Africa that every place is in. The proper subgoal should instead say that each location is in some place or other in Africa: In(u,b_u) AND In(b_u,Africa). The subscript adjustment in step i yields precisely this result, which is called P* in the rule. (See table 6.4 for the details of subscript adjustment.) It is easy to see that this subgoal matches the second premise of (4).
In order to show that this adjustment produces valid results in the general case, we can again demonstrate that the IF Elimination rule yields exactly the same result we get using standard modus ponens in CPL. More formally, what we would like to prove is this: Suppose IF P' THEN R' derives from a CPL sentence S1 and R from a CPL sentence S3. Let P* be the result of applying the Argument Reversal and Subscript Adjustment procedures of table 6.4 to P', subject to the other conditions listed in the Backward IF Elimination rule. Then P* is a translation of a CPL sentence S2 such that S1 and S2 entail (in CPL) S3. (The use of primes and asterisks here corresponds to the way the rule is stated in table 6.3.) A proof of this proposition is given in the appendix to this chapter.

Of course, the proofs we have given here of the matching rules, the negation rules, and IF Elimination do not show that the entire system is sound. However, they do establish the soundness of some of its main parts, parts that may not be as intuitively obvious as PSYCOP's other principles.

Incompleteness

As we have seen, PSYCOP is unable to prove some valid propositional arguments that involve transforming conditional sentences to nonconditionals. Similarly, our extended model for predicate-variable sentences is unable to prove arguments like (5), which is valid according to the semantics outlined above (and whose CPL counterpart is, of course, also valid).

(5) NOT(IF P(x) THEN Q(x))
    ------------------------------------
    P(a)

The cause of this incompleteness is the same as before: By design, PSYCOP doesn't possess rules that are capable of deducing the conclusion from the premise. However, there are some issues concerning completeness that are relevant to our choice of quantifier-free form and that merit our attention.

On one hand, as long as we stick to quantifier-free sentences that are equivalent to sentences of CPL, the form itself does not pose a barrier to completeness. The trouble with (5), for example, has nothing to do with the fact that we have eliminated explicit quantifiers in its premise and in its conclusion. There are complete proof systems in logic that make use of quantifier-free form, systems that originated in the work of Skolem (1928/1967) (see also Quine 1972, chapter 34). Along the same lines, resolution systems in AI are complete even though they operate (after some stage) on quantifierless representations (see chapter 3). These systems have many advantages: They are more elegant than PSYCOP, since they are based on a uniform reductio procedure. Also, the problems associated with argument reversal don't arise for such systems. After the first step (in which the conclusion is negated), they get along without rules such as IF Elimination (or they use these rules in ways that don't require switching quantifiers).



But what such systems gain in simplicity they lose in psychological fidelity, since they don't correctly reflect human proof strategies.

On the other hand, if we take full advantage of quantifier-free form and use sentences that have no equivalents in CPL, completeness is impossible no matter which finite set of rules we choose. This is a consequence of the fact that we can create sentences in this form with temporary names that have distinct sets of variables as subscripts. For example, in chapter 3 we used IF (Country(x) AND Country(y) AND Richer-than(x,y)) THEN (Official(b_x) AND Official(b_y) AND More-powerful-than(b_x,b_y)) to represent Barwise's (1979) example The richer the country, the more powerful one of its officials. In this sentence, the identity of official b_x depends on the country we choose as the value of x, and that of official b_y on y; but b_x does not depend on the choice of y, and b_y does not depend on x. It is known, however, that logics that can encode such relationships are incomplete (Westerståhl 1989).

Psychologists and philosophers have sometimes taken this theoretical incompleteness as indicating a defect in proof-based approaches. If there are valid arguments that it is impossible for such a system to prove, doesn't that mean that humans aren't proof-based (or "syntactic") systems after all but instead operate on the basis of other (perhaps "semantic" or even "pragmatic") principles? (See Lucas 1961 for related arguments; see Benacerraf 1967 and Dennett 1978 for refutations.) The trouble with this conclusion is that it presupposes that the alternative methods for evaluating arguments are ones that humans can use to do what proof systems can't. Although it is true that model-theoretic semantics can describe valid arguments that can't be proved, these methods do not automatically yield procedures by which people can recognize such arguments.
On the contrary, what the incompleteness theorems establish is that any procedure within a very broad class (i.e., those that can be carried out by Turing machines) is incapable of recognizing all valid arguments. Such a procedure will be incomplete whether it uses standard proof rules or not. For example, "mental model" theories (Johnson-Laird 1983), which are advertised as "semantic" methods, are also supposed to be Turing computable; if so, they are subject to incompleteness just as proof-based systems are. (See chapter 10 below for further discussion of mental models.) Moreover, even if an alternative method can correctly recognize valid arguments that slip by a specific proof system, it remains to be shown that there isn't a further valid argument that slips by the alternative method. It is possible that humans really do have abilities that outstrip the algorithmic procedures to which the incompleteness theorems apply. But no psychological proposal that I'm aware of comes close to being both sound and complete over the range of arguments that can be expressed in quantifier-free form.

Perhaps objections based on incompleteness have more point if they focus on the specific difficulties that PSYCOP has in proving arguments like (5). As just mentioned, there are many well-known systems that are complete with respect to CPL; but PSYCOP is incomplete even in this narrower domain. Nor is this incompleteness limited to extremely lengthy problems: Any student of elementary logic should be able to show that (5) is correct. However, whether this is a defect in PSYCOP is an empirical matter. The theory is supposed to represent the deductive abilities of people untutored in logic; thus, we are on the wrong track if these people can correctly assess arguments that PSYCOP can't. We need data from such people to determine the plausibility of PSYCOP's restrictions.
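The claim that (5) is valid according to the semantics can be spot-checked by brute force over small finite models. The Python sketch below is only an illustration under stated assumptions: the function names are invented for the example, domains are taken to be nonempty, and checking domains of up to three elements is a sanity check, not a proof of validity. It uses the truth definition directly: the premise NOT(IF P(x) THEN Q(x)) is true in a model when every element satisfies it, and the conclusion P(a) is true when some choice of denotation for a is in the extension of P.

```python
from itertools import chain, combinations

def subsets(domain):
    """All subsets of the domain, as tuples."""
    return chain.from_iterable(
        combinations(domain, r) for r in range(len(domain) + 1))

def argument_5_holds(max_size=3):
    """Search small models for a counterexample to argument (5)."""
    for n in range(1, max_size + 1):          # nonempty domains only
        domain = range(n)
        for P in map(set, subsets(domain)):
            for Q in map(set, subsets(domain)):
                # Premise: for every g, g(x) satisfies NOT(IF P(x) THEN Q(x)).
                premise = all(not ((d not in P) or (d in Q)) for d in domain)
                # Conclusion: for some h, h(a) satisfies P(a).
                conclusion = any(d in P for d in domain)
                if premise and not conclusion:
                    return False              # counterexample found
    return True

print(argument_5_holds())
```

No counterexample turns up, since the premise can be true only when every element of the (nonempty) domain is in the extension of P. The point made in the text is untouched by this check: the argument is semantically valid, yet PSYCOP has no rules that derive its conclusion from its premise.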

Appendix: Proofs of the Major Propositions

In the proofs of soundness for the matching rules, β, γ, and δ range over temporary names, and σ ranges over variables.

Matching rule 1 of table 6.1 (variables in subgoal to variables in assertion) is sound.

Proof For this rule, we take P1 to be the new subgoal P'(x) and P2 to be the original subgoal P'(y) (see table 6.1). As in figure 6.2, we assume that P1 is true in M and choose an assignment h1 that satisfies P1 for all g. If P2 is false in M, then for any h there is a g such that h and g fail to satisfy P2. In particular, we can define h2 in the following way: Let h2(β) = h1(β) for all temporary names β that do not appear in P2 in the same position in which x appears in P1. For temporary names γ_{...y...} that do appear in the same position as x, let h2 assign γ_{...y...} to the function f2(...g(y)...) = g(y). In other words, γ_{...y...} denotes the function whose value is the same as its argument g(y). (Because of condition d and action b in rule 1, γ_{...y...} cannot appear in P1; so we are free to define h2(γ) as we like.) We can then let g2 be an assignment to variables so that h2 and g2 do not jointly satisfy P2.



Define g1(σ) = g2(σ) for all σ ≠ x, and g1(x) = g2(y). The conditions of rule 1 imply that there are only three ways in which P1 and P2 can differ: (a) P2 has y where P1 has x; (b) P2 has γ_{...y...} where P1 has x; and (c) P2 has β_{...y...} where P1 has β_{...x...}. Since g1(x) = g2(y), terms in positions covered by case a have the same denotations. Moreover, in case c, h1 and h2 assign β to the same function (say, f_β). Since f_β(...g1(x)...) = f_β(...g2(y)...), these terms too have the same denotations. Finally, in case b, h2 assigns γ_{...y...} to g(y) by definition of h2, and g2(y) = g1(x) by definition of g1. So, once again, x in P1 and γ_{...y...} in P2 denote the same element of the model. This means that each term in P2 has exactly the same denotation as the analogous term in P1. Since P1 and P2 are otherwise identical, h1 and g1 cannot satisfy P1. This contradicts our assumption that h1 satisfies P1 for all g.

Matching rule 2 of table 6.1 (temporary names in subgoal to temporary or permanent names in assertion) is sound.

Proof To prove rule 2 sound, we let P1 equal P'(n), the generalized subgoal, and let P2 equal P'(t), the original subgoal. Again, assume P1 true in M = (D,f), and for the reductio suppose P2 is false in M. Let h1 be an assignment that satisfies P1 for all g, and define h2 as follows: h2(β) = h1(β) for all β ≠ t_{x1,...,xi,...,xk}, where t_{x1,...,xi,...,xk} is the one temporary name that appears in P2 in the same position that n occupies in P1. Because n replaces all occurrences of t_{x1,...,xi,...,xk} when P1 is formed, t_{x1,...,xi,...,xk} itself does not appear in P1. Hence, no conflict can arise over the assignment for t_{x1,...,xi,...,xk}. If the name n mentioned in the rule is a permanent name, we can let h2 assign t_{x1,...,xi,...,xk} to the function f2(g(x1),...,g(xi),...,g(xk)) = f(n) (that is, to the constant function whose value is the element of D that the model gives to proper name n). If n is a temporary name, say d_{x1,...,xi}, then by condition c of the rule its subscripts are a subset of those of t_{x1,...,xi,...,xk}. So we can let h2 assign t_{x1,...,xi,...,xk} to the function f2(g(x1),...,g(xi),...,g(xk)) = f1(g(x1),...,g(xi)), where f1 is the function that h1 assigns to d. (That is, the value of f2 is just the value that f1 gives to the subset of arguments g(x1),...,g(xi).) Because of condition e and the action of rule 2, the terms of P1 and P2 are identical, except that the former has n where the latter has t_{x1,...,xi,...,xk}. But by the above construction the denotation of n under h1 must be the same as the denotation of t_{x1,...,xi,...,xk} under h2 for any choice of g. Hence, if h2 does not satisfy P2 for some g2, then h1 will not satisfy P1 for all g. But this contradicts the hypothesis that h1 satisfies P1 for every g.

Matching rule 3 of table 6.1 (permanent names in subgoal to variables in assertion) is sound.

Proof This time let P1 = P'(x) and P2 = P'(m), from rule 3. Following the usual procedure of figure 6.2, we assume P1 true in M = (D,f), so there is an h1 that satisfies this sentence for any g. We also suppose P2 false in M and define h2 as follows: h2(β) = h1(β), for all temporary names β, except those that appear in P2 in the same positions occupied by x-subscripted temporary names in P1. In the latter case, if δ_{y1,...,yk} appears in the same position as γ_{x,y1,...,yk}, let h2 assign δ to the function f2(g(y1),...,g(yk)) = f1(d,g(y1),...,g(yk)), where f1 is the function that h1 assigns to γ and d = f(m) (i.e., the constant element of the model associated with the permanent name m). Since P2 is supposed to be false, there must be a g2 such that h2 and g2 do not jointly satisfy P2. Let g1(σ) = g2(σ) for all σ ≠ x, and let g1(x) = f(m). Rule 3 ensures that the terms of P1 and P2 are the same except when (a) P2 has m where P1 has x and (b) P2 has δ_{y1,...,yk} where P1 has γ_{x,y1,...,yk}. But terms in positions covered by (a) have the same denotation, since g1(x) = f(m). And the same is also true of the terms covered by (b), since, under h2 and g2, δ_{y1,...,yk} refers to

f2(g2(y1),...,g2(yk)) = f1(d,g2(y1),...,g2(yk)) = f1(g1(x),g1(y1),...,g1(yk)),

which is the denotation of γ_{x,y1,...,yk} under h1 and g1. Since analogous terms in P1 and P2 have the same referents, h1 and g1 must fail to satisfy P1 if h2 and g2 fail to satisfy P2, contradicting the premise that h1 satisfies P1 for all g.
Matching rule 4 of table 6.1 (temporary names in subgoal to variables in assertion) is sound.

Proof In this case, P1 = P'(x1,...,xk) and P2 = P'(t). Suppose that P1 is true and, for contradiction, that P2 is false in some model M. Let h1 be an assignment for temporary names that satisfies P1 for any g. We can then define h2 so that h2(β) = h1(β) for all temporary names β, with the following exceptions: First, if t is the temporary name mentioned in rule 4 that appears in the same positions as x1,...,xk, then let h2 assign t to a constant function whose value is an arbitrary member, d, of the model's domain. Second, suppose γ_{y1,...,yj} is a temporary name in P2 that appears in the same position as a temporary name β_{x'1,...,x'i,y1,...,yj} in P1, where the subscripts x'1,...,x'i are chosen from among x1,...,xk. Then let h2 assign γ_{y1,...,yj} to the function

f2(g(y1),...,g(yj)) = f1(d,...,d,g(y1),...,g(yj)),

where f1 is the function that h1 assigned to β_{x'1,...,x'i,y1,...,yj}, and where the first i arguments of f1 are d. We also let g2 be a variable assignment such that h2 and g2 do not jointly satisfy P2. Define g1(σ) = d if σ = x1 or x2 or ... or xk, and g1(σ) = g2(σ) otherwise. Conditions c-d, together with the action portion of rule 4, force P1 and P2 to be identical, except in cases where (a) P2 has t where P1 has x1, x2, ..., or xk, or (b) P2 has γ_{y1,...,yj} where P1 has β_{x'1,...,x'i,y1,...,yj}. By the definitions of h2 and g1, however, h2(t) = d = g1(x1) = ... = g1(xk). So terms in positions covered by (a) have the same denotations. Similarly, the denotation of γ_{y1,...,yj}, as defined above, is

f1(d,...,d,g2(y1),...,g2(yj)) = f1(d,...,d,g1(y1),...,g1(yj)) = f1(g1(x'1),...,g1(x'i),g1(y1),...,g1(yj)),

which is the denotation of β_{x'1,...,x'i,y1,...,yj}. For this reason, if h2 and g2 fail to satisfy P2, then h1 and g1 must fail to satisfy P1, contradicting our original assumption.

Soundness of Argument-Reversal Procedure (table 6.4)

Let P be a sentence in quantifier-free form that has been derived from a CPL sentence F, according to the procedure of chapter 3. Let P* be the sentence we get by applying the argument-reversal rule to P.
Then NOT(P*) can be derived from a CPL sentence that is logically equivalent to NOT(F).

Proof According to the translation method of chapter 3, P comes from F via a logically equivalent CPL sentence in prenex form:

(Q1x1)(Q2x2)...(Qkxk) G(x1,x2,...,xk),

where each Qi is an existential or universal quantifier. Call this prenex sentence S1 (see figure 6.3). Hence,

NOT(F) ⟺ NOT(S1)
       ⟺ NOT((Q1x1)(Q2x2)...(Qkxk) G(x1,x2,...,xk))
       ⟺ (Q̄1x1)(Q̄2x2)...(Q̄kxk) NOT(G(x1,x2,...,xk)),

where Q̄i = ∀ if Qi = ∃ and Q̄i = ∃ if Qi = ∀. It then suffices to show that NOT(P*) is a translation of this last expression, which we can label S2. (See figure 6.3.)

First, suppose that a is a temporary name (possibly subscripted) in P. The argument-reversal procedure (table 6.4) will replace a with a novel variable y in P* in steps 1 and 4. But note that, by the translation method of chapter 3, a must have come from some existentially quantified variable in S1. When S1 is negated to produce S2, the same variable will be universally quantified (owing to the switch in quantifiers). Thus, we can choose y to stand for this variable when S2 is translated into quantifier-free form.

Second, suppose x is a variable in P. The argument-reversal procedure will assign x to some new temporary name b in step 2. In step 3 the procedure gathers the set of all temporary names {a_j} in P that do not have x as a subscript, and then subscripts b with the variables in {y_j} that have been assigned to these temporary names in step 1. Step 4 replaces x with the temporary name b_{y_j} in P* (where b_{y_j} is b subscripted with each variable in {y_j}). Now, x must have originally been derived from a universally quantified variable in S1. Similarly, the temporary names in {a_j} must have been derived from a set of existentially quantified variables {z_j} in S1, in particular, from those existential quantifiers that precede (∀x) in S1, given the translation method. (If they followed (∀x), they would have received x as a subscript.) Thus, when S1 is negated to produce S2, x will become existentially quantified, and the z_j's will be those universally quantified variables whose quantifiers precede x in S2.
To translate S2 into quantifier-free form , we must selecta temporary name for x and subscript it with this sameset of universally quantified variables. In the preceding paragraph, we saw that elementsof {zJ} will appear as the elementsof { YJ} in p ' . If we choose b as the temporary name to replace x , then x will becomeb(YJ} in translation. The translation of S2must be of the form NOT ( P' ) , where P' is identical to p ' , with the possible exception of P' and p ' terms. However, we



have just seen that each variable in P' corresponds to a like variable in P*, and each temporary name in P' to a like temporary name in P*. This means that P' = P*, which is what we set out to prove.

Soundness of Backward IF Elimination

Suppose IF P' THEN R' derives from a CPL sentence S1, and R from a CPL sentence S3. Let P* be the result of applying the argument-reversal and subscript-adjustment procedures of table 6.4 to P', subject to the other conditions listed in the Backward IF Elimination rule. Then P* is a translation of a CPL sentence S2 such that S1 and S2 entail (in CPL) S3.

Proof We tackle this proof in three parts. The first two parts show that S1 entails a further CPL sentence, shown in (8) below. The third part completes the proof by demonstrating that P* is the translation of a sentence S2, which, together with (8), entails S3. Since S1 entails (8), it follows that S1 and S2 entail S3. Here are the details:

1. Suppose IF P' THEN R' is the translation of S1, which we can write as in (6).

(6) (Q_1x_1) . . . (Q_kx_k)(IF F THEN G).

Then P' will be the translation of (Q_1x_1) . . . (Q_kx_k) F, and R' is a translation of (Q_1x_1) . . . (Q_kx_k) G. To simplify matters, we need some preliminary transformations of (6). Note that when IF Elimination is carried out variables and names in R will be matched to the variables and names in R' (see step b of the rule in table 6.3). Consider, first, those universal quantifiers in (6) that correspond to matched variables in R'. (For example, (Q_ix_i) will be such a quantifier if Q_i = ∀, x_i appears in R', and x_i is matched by a variable or temporary name from R.) By moving some of these universals leftward in (6), we can arrange it so that any universal (∀x_i) whose variable x_i matches a variable y_i from R precedes universals and existentials whose variables match temporary names containing y_i as subscript. (Suppose R' is G(x_1, x_2) and R is G(c_y, y), so that c_y matches x_1 and y matches x_2. Then we want (∀x_2) to precede (∀x_1) in (6), and we can write (∀x_2) < (∀x_1) to denote this ordering.) In this rearrangement we must also make sure that (∀x_i) follows universals and existentials whose variables match temporary names that do not contain y_i as subscript. In other words, the ordering of the quantifiers should copy the ordering of the corresponding quantifiers in S3.



What justifies this rearrangement? In the first place, such an ordering is possible, since its analogue already appears in S3. Second, if we can achieve this ordering by moving the universal quantifiers in (6) leftward over existentials or other universals, then the rearranged sentence is logically entailed by (6): In CPL (∀x_{i-1})(∀x_i) H entails (in fact, is equivalent to) (∀x_i)(∀x_{i-1}) H, and (∃x_{i-1})(∀x_i) H entails (∀x_i)(∃x_{i-1}) H (see Reichenbach 1966, pp. 134-135, or Suppes 1957, p. 115). Thus, to move a universal quantifier leftward in CPL we can strip preceding quantifiers by means of ∀-Elimination or ∃-Elimination from table 2.5, apply one of these entailments, and then restore the preceding quantifiers by ∀-Introduction or ∃-Introduction. Sentence (6) entails this result. Third, we can indeed achieve the desired ordering by leftward movement of universals alone. To see this, suppose that whenever (∀x_i) < (∃x_j) in (6) then (∀y_i) < (∃y_j) in S3, where x_i corresponds to y_i and x_j to y_j. Then a series of left shifts of universal quantifiers is sufficient to achieve the correct ordering. (Suppose (6) is (∃x_2)(∃x_4)(∀x_5)(∀x_3)(∃x_6)(∀x_1)(IF F THEN G) and S3 is (∀y_1)(∃y_2)(∀y_3)(∃y_4)(∀y_5)(∃y_6) G, where y_1 matches x_1, y_2 matches x_2, and so on. Then by jumping the universal quantifiers in (6) leftward we can obtain the correct ordering. One algorithm would be to start with the universal quantifier that should appear in leftmost position, (∀x_1), moving it in front of the others by a series of leftward steps, and then to move the universal that should appear next, (∀x_3), into position.) The only barrier to complete alignment occurs if (∀x_i) < (∃x_j) in (6) but (∃y_j) < (∀y_i) in S3, for in this case it would be necessary to move the universal in (6) to the right rather than to the left to achieve the proper ordering. But this situation is not consistent with our hypothesis that R' matches R, because in such a case the arguments of R' would include x_i and b_{...x_i...} and the arguments of R would include y_i and c (where c's subscripts do not include y_i). The first two matching rules of table 6.1 would then prohibit matching y_i to x_i and b_{...x_i...} to c.

An additional change is needed in (6), however. Within unbroken strings of universals (∀x_i) . . . (∀x_j) in the newly rearranged sentence, we can move universals whose variables match terms in R in front of universals whose variables are not matched (i.e., variables that appear only in P'). This further reshuffling is justified by the same CPL principles mentioned above, so that the new sentence is also entailed by (6). Let us denote the result of these changes as in (7).

(7) (Q'_1x_1) . . . (Q'_kx_k)(IF F THEN G).
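The leftward-shift procedure described in the parenthetical example can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `shift_universals_left`, the encoding of quantifiers as `("A", i)` and `("E", i)` pairs, and the `target_rank` map are hypothetical conveniences, not part of PSYCOP. Each swap moves a universal one step left over an existential or another universal, which are the two moves licensed by the CPL entailments cited above.

```python
def shift_universals_left(prefix, target_rank):
    """Reorder a quantifier prefix using only leftward moves of universals.

    prefix: list of (kind, index) pairs, with kind "A" for a universal and
            "E" for an existential, e.g. ("A", 1) for (∀x_1).
    target_rank: maps each variable index to its position in the ordering
            we want to copy (the ordering of the quantifiers in S3).
    """
    prefix = list(prefix)
    # Handle universals in the order in which they should finally appear.
    universals = sorted(
        (q for q in prefix if q[0] == "A"), key=lambda q: target_rank[q[1]]
    )
    for u in universals:
        i = prefix.index(u)
        # Jump left over any neighbor that should follow u in the target.
        while i > 0 and target_rank[prefix[i - 1][1]] > target_rank[u[1]]:
            prefix[i - 1], prefix[i] = prefix[i], prefix[i - 1]
            i -= 1
    return prefix


# The example from the text: (∃x_2)(∃x_4)(∀x_5)(∀x_3)(∃x_6)(∀x_1).
sentence_6 = [("E", 2), ("E", 4), ("A", 5), ("A", 3), ("E", 6), ("A", 1)]
rank = {i: i for i in range(1, 7)}  # S3 orders the quantifiers y_1 ... y_6
aligned = shift_universals_left(sentence_6, rank)

# The barrier case: (∀x_1) < (∃x_2) in (6) but (∃y_2) < (∀y_1) in S3.
blocked = shift_universals_left([("A", 1), ("E", 2)], {1: 2, 2: 1})
```

For the example in the text, `aligned` alternates universals and existentials in the order x_1 through x_6, whereas `blocked` is returned unchanged, since aligning it would require moving a universal to the right.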




The sentence (Q'_1x_1) . . . (Q'_kx_k) G translates into a quantifier-free sentence that is similar to R, except that the former sentence may have variables in some of the positions where R has temporary names. Hence, if we substitute existential quantifiers in corresponding positions in (Q'_1x_1) . . . (Q'_kx_k) G, we get a sentence that translates into R itself. In particular, S3 is (Q'_1x_1) . . . (∃x_i) . . . (Q'_kx_k) G, where we have set any universal quantifiers (∀x_i) to existentials if R has a temporary name in place of x_i. Thus, in order to prove our main theorem, it suffices to show that (7) entails a sentence of the form shown in (8) and that P* is the translation of (8)'s antecedent.

(8) IF (Q''_1x_1) . . . (Q''_kx_k) F THEN (Q'_1x_1) . . . (∃x_i) . . . (Q'_kx_k) G.

2. To show that (8) follows from (7), we need a few more facts from CPL. First, if (Q_kx_k)(IF F THEN G) entails IF (Q''_kx_k) F THEN (Q'_kx_k) G, then (Q_1x_1) . . . (Q_{k-1}x_{k-1})(Q_kx_k)(IF F THEN G) entails (Q_1x_1) . . . (Q_{k-1}x_{k-1})(IF (Q''_kx_k) F THEN (Q'_kx_k) G). (To show this, we can strip away the first k - 1 quantifiers using ∀-Elimination and ∃-Elimination from table 2.5, make use of the entailment, and then restore the quantifiers with ∀-Introduction and ∃-Introduction.) This means that we can move the quantifiers in on a step-by-step basis, so that (Q_1x_1) . . . (Q_kx_k)(IF F THEN G) entails IF (Q''_1x_1) . . . (Q''_kx_k) F THEN (Q'_1x_1) . . . (Q'_kx_k) G. Second, in CPL, (∃x)(IF F THEN G) entails IF (∀x) F THEN (∃x) G; (∀x)(IF F THEN G) entails IF (∀x) F THEN (∀x) G; (∀x)(IF F THEN G) entails IF (∃x) F THEN (∃x) G; and if x does not appear in G, then (∀x)(IF F THEN G) entails IF (∃x) F THEN (∀x) G. (See Reichenbach 1966, pp. 134-135.)

These facts allow us to perform the following transformation on (7): (a) If Q_i = ∃, move the quantifier inward with ∀ in the antecedent and ∃ in the consequent; (b) if Q_i = ∀ and x_i in R' is matched by a variable in R, then move in the quantifier with ∀ in the antecedent and ∀ in the consequent; (c) if Q_i = ∀ and x_i does not appear in G, move in the quantifier with ∃ in the antecedent and ∀ in the consequent; (d) if Q_i = ∀ and x_i in R' is matched by a temporary name in R, then move in the quantifier with ∃ in the antecedent and ∃ in the consequent. The final product of this transformation follows from (7) in CPL on the basis of the entailments mentioned above. Moreover, the consequent of this product is the same as the consequent of (8). Thus, let us take the transformation as fixing the quantifiers in the antecedent of (8) and



attempt to prove that this antecedent translates into quantifier-free form as P*.

3. Figure 6.4 summarizes our current state of affairs. We have seen that S1 (= (6)) entails (8), as indicated at the top of the figure. The consequent of (8) itself entails S3, so if we take S2 to be the antecedent of (8) then S1 and S2 will entail S3. It remains to show that P* is the translation of S2.

Consider an arbitrary quantifier (Q_jx_j) in the antecedent of (8). If Q_j = ∀, it must have derived from clause a or clause b of part 2 of this proof. If from clause a, then Q_j was originally an existential quantifier in (6) and (7). Since existentially quantified variables in CPL become temporary names in quantifier-free sentences, this variable will become a temporary name in IF P' THEN R', which is the translation of (6) (see the left side of figure 6.4). However, the argument-reversal procedure will change this (unmatched) temporary name back to a variable when P' is converted to P* (along the bottom half of figure 6.4). So in this case the antecedent of (8) agrees with P*: The variable in P* correctly translates (∀x_j) in the antecedent of (8).

Similarly, if (∀x_j) derives from clause b, then it was also a universal quantifier in (6) and (7), and its variable matched a variable in R. Because of this match, it will remain a variable in P*. (Argument reversal applies only to unmatched arguments, as stipulated in the IF Elimination rule.) So again P* agrees with the antecedent of (8).

Suppose, however, that Q_j = ∃. Then it must have derived from either clause c or clause d of part 2. If it comes from clause c, then Q_j was originally a universal quantifier in (6) and (7) and corresponds to an unmatched variable in P'. Argument reversal will therefore change it to a temporary name, say c, in P*. This agrees with (8) (i.e., an existentially quantified variable is translated as a temporary name), but we must also check to see whether c's subscripts also agree. Suppose there is some (∀x_i) that precedes (∃x_j) in the antecedent. If x_i is unmatched (i.e., comes from clause a), then (∃x_i) < (∀x_j) in (7). Since both variables are unmatched, they must have been in the same order in (6). (Only matched universals are moved in the rearrangement that turned (6) into (7).) This means that x_i is translated into a temporary name in P', one that will not have x_j as a subscript. Argument Reversal will then change this temporary name into a variable, as we saw earlier, and will also add the variable as a subscript to c. (Under the same conditions, if (∀x_j) < (∃x_i) in (6) and (7), the temporary name corresponding to x_i will have x_j as subscript, so the converse also holds.)
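The four CPL entailments that part 2 relies on, and that clauses a-d apply, can be sanity-checked by brute force over small domains. The sketch below is our own illustration, not a proof: it treats F and G as one-place predicates, tries every assignment on domains of up to three individuals, and reports whether any assignment makes the premise true and the conclusion false. The helper names (`entails`, `exists_if`, `forall_if`) and the domain bound are assumptions made for the example. The `constant_g` flag models the proviso that x does not appear in G, by requiring G to have the same truth value for every individual.

```python
from itertools import product


def implies(p, q):
    return (not p) or q


def exists_if(F, G):
    """(∃x)(IF F THEN G)"""
    return any(implies(f, g) for f, g in zip(F, G))


def forall_if(F, G):
    """(∀x)(IF F THEN G)"""
    return all(implies(f, g) for f, g in zip(F, G))


def entails(premise, conclusion, max_size=3, constant_g=False):
    """True if no assignment on domains of size 1..max_size is a countermodel."""
    for n in range(1, max_size + 1):
        for F in product((False, True), repeat=n):
            for G in product((False, True), repeat=n):
                if constant_g and len(set(G)) > 1:
                    continue  # x does not appear in G, so G cannot vary with x
                if premise(F, G) and not conclusion(F, G):
                    return False
    return True


# The four entailments cited in part 2 of the proof.
fact_1 = entails(exists_if, lambda F, G: implies(all(F), any(G)))
fact_2 = entails(forall_if, lambda F, G: implies(all(F), all(G)))
fact_3 = entails(forall_if, lambda F, G: implies(any(F), any(G)))
fact_4 = entails(forall_if, lambda F, G: implies(any(F), all(G)), constant_g=True)

# Without the proviso on G, the fourth pattern has a countermodel.
fact_4_without_proviso = entails(forall_if, lambda F, G: implies(any(F), all(G)))
```

On these small domains the four facts come out true, and the last check comes out false, which is why clause c is restricted to variables that do not appear in G.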








On the other hand, suppose that x_i is matched to a variable in R (i.e., comes from clause b). Then (∀x_i) < (∀x_j) in (7). If this same order appeared in (6), then there can be no temporary name in P' that contains x_j but not x_i as subscript, in view of the translation rules for quantifier-free form in chapter 3. Hence, the variable that matched x_i will become a subscript of c because of Subscript Adjustment in step i of the IF Elimination rule (see tables 6.3 and 6.4). It is possible, though, that (∀x_j) < (∀x_i) in (6). If both quantifiers come from the same unbroken string of universals, then again there will be no temporary name in P' that has x_j but not x_i as subscript, and the variable matching x_i will be added to c's subscripts. However, if (∀x_j) and (∀x_i) come from different strings in (6), then in order to prompt the movement of (∀x_i) there must have been a third quantifier (Q_kx_k) in (6) such that (Q_kx_k) < (∀x_j) < (∀x_i), where x_k matched a temporary name b_{...y_i...} and x_i matched y_i. If Q_k = ∀, then there is no temporary name in P' that has x_j but not x_k as subscript; so y_i will become a subscript of c in step i. If Q_k = ∃, then in P' there is a matched temporary name that does not contain x_j as subscript, and y_i will again be added as a subscript to c by Subscript Adjustment. (Conversely, if (∃x_j) < (∀x_i) in the antecedent of (8), with x_i matched and x_j unmatched, then (∀x_j) < (∀x_i) in (7). Since universal quantifiers with matched variables in (7) precede any that are unmatched in the same string of universals, there must be some existential quantifier between the two: (∀x_j) < (∃x_k) < (∀x_i). This same order must be preserved in (6), since existentials and unmatched quantifiers like (∀x_j) don't move and matched universals like (∀x_i) can move only to the left. So x_k will translate as a temporary name containing x_j but not x_i as a subscript, and the variable matching x_i will not be a subscript to c.)

We must backtrack, however, to consider the case in which Q_j = ∃ but this existential derives from clause d. Here the quantifier originated as a universal in (7) that was matched by a temporary name, c. This temporary name will be substituted in P' and will be maintained in P*. This conforms to the antecedent of (8), since (∃x_j) must also translate as a temporary name. However, we must again check to make sure that the subscripts of c also coincide with (8). Suppose that in (8) (∀x_i) < (∃x_j). If the universal quantifier comes from clause a, then in (7) we have (∃x_i) < (∀x_j). The ordering must have been the same in (6), since universals can move only from right to left. P' will therefore contain a temporary name corresponding to (∃x_i) that does not have x_j as subscript. Argument Reversal will



turn this temporary name back into a variable, and Subscript Adjustment will add it as a subscript of c. (And conversely: If (∃x_j) < (∀x_i) in (8), then (∀x_j) < (∃x_i) in (7), with x_j matched to c and x_i unmatched. If the same ordering holds in (6), then (∃x_i) becomes a temporary name in P' that has x_j as subscript. This temporary name becomes a variable in P*, but Subscript Adjustment prohibits it from subscripting c. If (∃x_i) < (∀x_j) in (6), then in order to trigger the shift we must have (∀x_k) < (∃x_i) < (∀x_j), where x_k matches a variable y_k from R and c does not have y_k as subscript. In this case, too, Subscript Adjustment prohibits subscripting c; see line 2c of table 6.4.) Finally, if (∀x_i) is the result of clause b, then in (7) (∀x_i) < (∀x_j), where x_i matched a variable in R and x_j matched a temporary name. Because of the way we constructed (7) from (6), this ordering occurs iff the variable matching x_i is already a subscript of c. In all cases, then, c receives a subscript in P* iff (∀x_i) < (∃x_j) in the antecedent of (8). Since all terms agree, P* must be a translation of the same antecedent.


Reasoning with Variables

Experiment. Make that your fashion motto for 1991. When times are tough, it's not the time to retreat into old safe and sure styles.
Carrie Donovan, New York Times

It is an odd fact that psychologists who favor natural deduction mostly support their theories with experiments on sentence connectives, whereas psychologists who lean toward images or diagrams or "mental models" mostly experiment with quantifiers, usually syllogisms. Perhaps this is because the natural-deduction rules for connectives have an appeal that some of the rules for quantifiers lack. As was noted in chapter 2, the introduction and elimination rules for quantifiers (table 2.5) sometimes make the proof of an argument more roundabout than it should be. But natural-deduction theories aren't necessarily stuck with quantifier introduction and elimination. In chapter 6 we came up with a way to represent quantifiers implicitly in terms of the variables they bind, letting the rules for connectives pass these variables from premise to conclusion. One point in favor of this approach is that it explains some similarities in the ways people reason with connectives and quantifiers. Independent variables that affect inferences with conditional sentences, for example, also tend to affect inferences with universals (Revlis 1975a), a fact that is easy to account for if a universal such as All psychologists are batty is represented as IF Psychologist(x) THEN Batty(x) and is manipulated by the same rules that govern other conditional sentences.

However, we need to take a more systematic look at the experimental results on variables and quantifiers in order to tell whether the principles of chapter 6 handle them adequately. Research on quantifiers in psychology has concentrated on the categorical sentences: All F are G, Some F are G, No F are G, and Some F are not G. We will therefore begin with these sentences and the syllogistic arguments in which they appear. However, the system described in chapter 6 covers a much wider set of arguments than syllogisms, and this means that we must come up with new experiments in order to get a fair test of the extended model.

Empirical Predictions for Sentences with a Single Variable

We can translate the four categorical sentences into our system, using IF F(x) THEN G(x) for All F are G, NOT(F(x) AND G(x)) for No F are G, F(a) AND G(a) for Some F are G, and F(a) AND NOT G(a) for Some



F are not G.1 These correspond to the standard ways of expressing the propositions in classical predicate logic, and within this framework none of these sentences implies any of the others. The same is, of course, true within our own quantifier-free system. For example, if we try to show that All F are G entails Some F are G, our proof will stall almost immediately. If we use Backward AND Introduction to establish that F(a) AND G(a), we must then fulfill both of the subgoals F(a) and G(a). But both of these fail. We have nothing to match F(a) against, given only IF F(x) THEN G(x). None of the other rules help us out of this jam. We could use Backward IF Elimination to try to get G(a), but this rule then requires us to satisfy F(a) again.

Scholastic logic differs from CPL in identifying two entailment relations among the four categorical sentences. In the older logic, All F are G entails Some F are G, and No F are G entails Some F are not G. In their standard representation in CPL, however, both All F are G and No F are G are true when there are no Fs at all, and in this same situation both Some F are G and Some F are not G are false. Hence, the former statements can't entail the latter ones. (For example, if we represent All unicorns are white as IF Unicorn(x) THEN White(x) and Some unicorns are white as Unicorn(a) AND White(a), then the former sentence will be vacuously true and the latter false.) In order to get the scholastic inferences, we have to supplement the representation of the All and No sentences to include the assertion that there are some Fs. If we translate the All sentence as (IF F(x) THEN G(x)) AND F(a), and the No sentence as NOT(F(x) AND G(x)) AND F(a), we can then derive Some from All and Some-not from No in CPL and in PSYCOP.
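The contrast between the CPL and the scholastic readings can be illustrated with a small model check. The sketch below is our own and is not PSYCOP's proof procedure: it treats each individual as a pair of truth values for F and G, enumerates every nonempty set of such individuals, and asks whether any of them makes a premise true and a conclusion false. All the function names are hypothetical.

```python
from itertools import product

# Each kind of individual is a pair (is_F, is_G); a model is a nonempty
# set of inhabited kinds.
KINDS = list(product((False, True), repeat=2))
MODELS = [
    [kind for i, kind in enumerate(KINDS) if (mask >> i) & 1]
    for mask in range(1, 2 ** len(KINDS))
]


def all_fg(model):       # IF F(x) THEN G(x)
    return all(g for f, g in model if f)


def no_fg(model):        # NOT(F(x) AND G(x))
    return not any(f and g for f, g in model)


def some_fg(model):      # F(a) AND G(a)
    return any(f and g for f, g in model)


def some_not_fg(model):  # F(a) AND NOT G(a)
    return any(f and not g for f, g in model)


def some_f(model):       # F(a), the added assumption that there are some Fs
    return any(f for f, g in model)


def entails(premise, conclusion):
    return all(conclusion(m) for m in MODELS if premise(m))


# CPL readings: neither scholastic inference goes through.
cpl_all_to_some = entails(all_fg, some_fg)
cpl_no_to_some_not = entails(no_fg, some_not_fg)

# Supplemented readings: both inferences go through.
sup_all_to_some = entails(lambda m: all_fg(m) and some_f(m), some_fg)
sup_no_to_some_not = entails(lambda m: no_fg(m) and some_f(m), some_not_fg)
```

The first two checks fail on the model with no Fs (the unicorn case), and the last two succeed once the existence of an F is conjoined to the premise.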
Subjects' judgments sometimes go along with these inferences.2 Begg and Harris (1982) and Newstead and Griggs (1983) found that 56-80% of their subjects agreed that All implies Some and 48-69% that No implies Some-not.

Presuppositions and Implicatures of Categorical Sentences

One way to understand the scholastic inferences (and subjects' willingness to draw them) is to interpret All and No statements as involving a presupposition about the existence of members of the set named by the subject term (Strawson 1952, chapter 6). On this view, the existence of members of this set is a condition on All or No sentences having a truth value. If there are no unicorns, for example, then the question of the truth or falsity of All unicorns are white and No unicorns are white simply doesn't arise. Thus, if



subjects are told to accept such a premise as true (as they were in the experiments just cited), they must also accept the related existential presupposition, and the scholastic inference will follow. For example, if we assume that All unicorns are white is true, we are also presupposing that there are unicorns. And because these unicorns must be white on the original assumption, then it must also be true that Some unicorns are white. Parallel reasoning yields Some unicorns are not red from No unicorns are red.

It is possible to contend that these presuppositions are in force only when the All or No sentences contain other phrases, such as definite descriptions or proper names, that independently presuppose the existence of a referent (Vendler 1967, chapter 3). Thus, All the unicorns are white will, but All unicorns are white will not, presuppose that there are unicorns. Although this may well be correct, the difference between these two universal forms may be sufficiently subtle that subjects overlook it in typical experiments and adopt existential presuppositions for both cases.

Some subjects also go beyond both CPL and scholastic logic in assuming that Some F are G implies Some F are not G and conversely. In the Begg-Harris and Newstead-Griggs experiments, 58-97% of subjects agreed to the inferences for Some to Some-not and 65-92% for Some-not to Some. As was noted in chapter 2 above, most researchers have ascribed these inferences to Gricean conversational implicatures (Begg and Harris 1982; Horn 1973, 1989; Newstead 1989; Newstead and Griggs 1983; McCawley 1981). If a speaker knows that all F are G, then it would be misleading, though not false, to say that some F are G. On the assumption that the speaker is being cooperative, then, we should be able to infer that not all F are G; in other words, some F are not G. The conversational implicature operates in the opposite direction too. A speaker who knows that no F are G would mislead us by saying that some F are not G.
Hence, from the sincere assertion that some F are not G, we can assume that it is not the case that no F are G (equivalently, that some F are G).

This Gricean view seems quite reasonable and extends to the scholastic cases. For example, it is misleading to assert that all the pencils on a table are yellow when there are in fact no pencils on the table at all. Strawson (1952, p. 179) recognized the possibility of this kind of explanation for the existential presuppositions:

Certainly a 'pragmatic' consideration, a general rule of linguistic conduct, may perhaps be seen to underlie these points: the rule, namely, that one does not make the (logically) lesser, when one could truthfully (and with equal or greater linguistic



economy) make the greater, claim. Assume for a moment that the form 'There is not a single . . . which is not . . .' were introduced into ordinary speech with the same sense as '~(∃x)(fx . ~gx)' [= (∀x)(f(x) ⊃ g(x))]. Then the operation of this general rule would inhibit the use of this form where one could say simply 'There is not a single . . .' (or '~(∃x)(fx)'). And the operation of this inhibition would tend to confer on the introduced form just those logical presuppositions which I have described. . . . The operation of this 'pragmatic rule' was first pointed out to me, in a different connexion, by Mr. H. P. Grice.

On this view, sentences of the form All F are G have F(a) as an implicature, and G(a) follows as well, since this last sentence is entailed by IF F(x) THEN G(x) and F(a). Similarly, No F are G might appear misleading, though perhaps not as obviously so, when nothing at all is an F. In that case, we can treat both F(a) and NOT G(a) as implicatures of the No sentence in the same way. In some recent pragmatic theories conversational implicatures and presuppositions are treated as distinct entities, whereas in others presuppositions reduce to conversational implicatures plus ordinary entailments (see Levinson 1983 for a review). Although we will treat them similarly in what follows, we needn't take a stand on whether purported presuppositions (e.g., the inference from All to Some) and purported implicatures (e.g., the inference from Some to Some-not) ultimately have the same source.

We can summarize all four of the non-CPL inferences just discussed in a more uniform way if we assume that each categorical statement has both an asserted and an implicated component. The asserted part corresponds to the usual CPL interpretation, which appears in the b sentences of (1)-(4). The implicated part is an assumption about the existence of Fs that are either Gs or not Gs, and these appear as the c sentences.

(1) a. All F are G.
    b. IF F(x) THEN G(x).
    c. F(a) AND G(a).

(2) a. No F are G.
    b. NOT(F(x) AND G(x)).
    c. F(a) AND NOT G(a).

(3) a. Some F are G.
    b. F(b) AND G(b).
    c. F(a) AND NOT G(a).

(4) a. Some F are not G.
    b. F(b) AND NOT G(b).
    c. F(a) AND G(a).

The c components are similar to presuppositions in that they meet the usual criterion of being unaffected by negation of the original sentence. Thus, the implicature of the All sentence in (1c), F(a) AND G(a), is the same as that of Some-not, its negation, in (4c). Similarly, the implicature of the Some sentence in (3c), F(a) AND NOT G(a), is the same as that of No in (2c).3 Note, too, that when we consider the implicature and the assertion together the representation for Some and that for Some-not will be equivalent, in accord with a hypothesis of Begg and Harris (1982). However, we must distinguish the roles of the b and c sentences if we want to preserve all four of the non-CPL inferences. The assertion and implicature of a Some sentence jointly entail both the assertion and implicature of a Some-not sentence (i.e., (3b) and (3c) jointly entail both (4b) and (4c)). But the relation between All and Some sentences works differently, since (1b) and (1c) do not jointly entail both (3b) and (3c). What is true of all the non-CPL inferences is that the implicature of the premise yields the assertion of the conclusion. That is, for All to Some (1c) = (3b), for No to Some-not (2c) = (4b), for Some to Some-not (3c) = (4b), and for Some-not to Some (4c) = (3b). Thus, the non-CPL inferences require the implicature of the premise, but sometimes require overriding the implicature of the conclusion. (Along similar lines, the non-CPL inferences can't be transitive; for if they were we could derive Some-not from All (All → Some → Some-not) and Some from No (No → Some-not → Some).) These facts will become important when we consider syllogisms. Of course, not all subjects accept the extra information in the c sentences; thus we also need to bear in mind the stricter b readings in predicting how subjects will reason about
arguments that contain categorical sentences.

Categorical Syllogisms

A categorical syllogism consists of two categorical premises and a categorical conclusion, but with some additional restrictions on the form of the argument. In traditional syllogisms, one of the terms from the first ("major") premise reappears as the predicate of the conclusion, and one of the terms from the second ("minor") premise serves as the subject of the



conclusion. The premises share the one additional ("middle") term. Thus, argument (5), which was used in chapter 6 to illustrate PSYCOP's proof procedure, is a syllogism in good standing.

(5) IF Square-block(x) THEN Green-block(x).
    Big-block(a) AND Square-block(a).
    Big-block(b) AND Green-block(b).

There is an opaque medieval code that researchers use to denote syllogisms and to intimidate readers. When I have to talk about the surface forms of individual syllogisms, I will use a somewhat less compressed but more comprehensible notation in which All(F,G) stands for All F are G, No(F,G) for No F are G, Some(F,G) for Some F are G, and Some-not(F,G) for Some F are not G. We can then name the syllogism simply by listing the form for the premises and conclusion. For example, in this notation syllogism (5) is (All(S,G), Some(B,S), ∴ Some(B,G)), where S is square-blocks, G is green-blocks, and B is big-blocks. In this example, B is the subject of the conclusion, G is its predicate, and S is the middle term. There are plenty of wrangles about the nature of the syllogism from a historical perspective (see, e.g., Adams 1984 and Smiley 1973), but our concern here is whether our extended model is consistent with the way subjects reason about these argument forms.

Syllogism Experiments

To see how well PSYCOP does with syllogisms, we need information about the difficulty of the individual problems. Although there have been many experiments with syllogisms as stimuli, only a few of these have both included the full set of items and have reported data for each item separately. The main exceptions are the studies of Dickstein (1978a) and Johnson-Laird and Bara (1984a), and we will examine both of these data sets. There are some aspects of these studies, however, that make them less than ideal for our purposes. Johnson-Laird and Bara gave their subjects pairs of syllogistic premises and asked them to produce conclusions that logically followed. This method puts a premium on subjects' ability to formulate natural-language sentences to represent the inferences they have derived, and therefore it is open to biases in this formulation step. Subjects may be more likely to consider conclusions that are prompted by the form of the premises, that seem more natural in the



premises' context, or that are simpler or more salient. All experiments are liable to interference from response factors, but these problems seem especially acute for production tasks (as we will see later).

Dickstein employed the more usual method of presenting subjects with the premises and asking them to choose one of the conclusions All(S,P), No(S,P), Some(S,P), and Some-not(S,P) or to say that none of these conclusions is valid. This avoids the formulation problem, since the range of potential conclusions is available for the subjects to inspect. However, forced choice also means that, for a given pair of premises, subjects' responses to the different conclusions are not independent. For example, a subject who receives the premises All(G,H) and All(F,G) cannot respond that both All(F,H) and Some(F,H) follow, even though he or she may believe this is the case. Any factor that increases the frequency of one of these responses must automatically decrease the frequency of the others. Moreover, the combinatorics of syllogisms are such that only 15 of the 64 possible premise pairs have deducible categorical conclusions in CPL (a slightly larger number in Aristotelian and scholastic logic). Thus, the correct choice will be "none of these" on most trials. As Revlis (1975a) has pointed out, this can encourage subjects to give incorrect categorical conclusions (All, No, Some, or Some-not) rather than "none of these" in order to balance the frequency of the five response types over trials.

Jeffrey Schank and I have carried out a new syllogism experiment that attempts to avoid these difficulties. Subjects received a list of all 256 traditional syllogisms (the 64 premise pairs combined with the four possible categorical conclusions) and they decided for each of these items whether the conclusion "would have to be true in every situation in which the first two sentences are true." We also informed subjects that the conclusion would be true for only about 10% of the problems, since we wanted to discourage subjects from adopting a response bias of the sort just discussed. There were no special instructions, however, about the meaning of the quantifiers. Although every subject judged all 256 syllogism types, the content of the syllogisms varied from one subject to the next. For each subject, we constructed the syllogisms by randomly assigning one of 256 common nouns (e.g., doors, ashtrays, stockings) to each syllogistic form. To generate the three terms of the syllogism, we combined this noun with three adjectives (red or blue, large or small, and old or new), and these noun phrases were then randomly assigned to the subject, predicate, and middle term of the syllogism. Thus, argument (6) was a possible item in our set.



Chapter 7

(6) All red stockings are old stockings.
    Some old stockings are large stockings.
    Some large stockings are red stockings.
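The item set just described (64 premise pairs crossed with four conclusions) can be enumerated mechanically, and the CPL status of each form can be checked by searching for countermodels. The sketch below is our own illustration and makes no claim about how PSYCOP or the subjects evaluate syllogisms: it uses the All(F,G) notation introduced above, takes the asserted readings from the b sentences of (1)-(4) and the implicated readings from the c sentences, and represents a model as a nonempty set of inhabited combinations of F, G, and H. All function and variable names are hypothetical.

```python
from itertools import product

TERMS = ("F", "G", "H")
MOODS = ("All", "No", "Some", "Some-not")

# A "kind" of individual fixes membership in F, G, and H; a model is a
# nonempty set of inhabited kinds (2**8 - 1 = 255 models).
KINDS = [dict(zip(TERMS, bits)) for bits in product((False, True), repeat=3)]
MODELS = [
    [kind for i, kind in enumerate(KINDS) if (mask >> i) & 1]
    for mask in range(1, 2 ** len(KINDS))
]


def some(model, subj, pred, positive):
    """True if some individual is in subj and is (or is not) in pred."""
    return any(k[subj] and k[pred] == positive for k in model)


def asserted(sentence, model):
    """The b readings of (1)-(4)."""
    mood, subj, pred = sentence
    if mood == "All":
        return not some(model, subj, pred, False)
    if mood == "No":
        return not some(model, subj, pred, True)
    return some(model, subj, pred, mood == "Some")


def implicated(sentence, model):
    """The c readings of (1)-(4)."""
    mood, subj, pred = sentence
    return some(model, subj, pred, mood in ("All", "Some-not"))


def holds(sentence, model, with_implicature):
    ok = asserted(sentence, model)
    return ok and (implicated(sentence, model) if with_implicature else True)


def valid(premises, conclusion, with_implicature=False):
    """No model makes every premise true and the conclusion false."""
    return all(
        holds(conclusion, m, with_implicature)
        for m in MODELS
        if all(holds(p, m, with_implicature) for p in premises)
    )


def syllogisms():
    """Major premise over G and H, minor over F and G, conclusion over (F,H)."""
    for m1, (s1, p1) in product(MOODS, (("G", "H"), ("H", "G"))):
        for m2, (s2, p2) in product(MOODS, (("F", "G"), ("G", "F"))):
            for mc in MOODS:
                yield ((m1, s1, p1), (m2, s2, p2)), (mc, "F", "H")


all_forms = list(syllogisms())
cpl_valid = [s for s in all_forms if valid(*s)]
robust = [s for s in cpl_valid if valid(*s, with_implicature=True)]
print(len(all_forms), len(cpl_valid), len(robust))  # 256 15 6
```

Under these readings the search finds 15 forms with no countermodel, and 6 of those survive when the implicatures of both premises and conclusion are conjoined to the assertions. Item (6), (All(G,H), Some(H,F), ...) in this notation, is not among the 15.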

\ \ ' e randomized the order of the syllogisms for each subject and then split the list in half . Subjects received a booklet containing the first 128 items in one session and received the remaining items in a second session one to four days later . The subjects were 20 University of Chicago students , none of whom had taken a course in logic . The table in the appendix to this chapter gives the percentage of the subjects that judged each of the syllogisms valid . A glance at this table reveals a fair degree of accuracy . On the 15 syllogisms that are deducible in CPL , subjects said the conclusion followed on 72.3% of trials ; on the 241 syllogisms that are not deducible in CPL , they said the conclusion did not follow on 88.5% . The lower rate on deducible problems may be due to our warning that only a small number of the syllogisms had categorical conclusions . Nevertheless , it is clear that the subjects were able to discriminate deducible from nondeducible syllogisms in the aggregate : The hit and correct - rejection rates just mentioned correspond to a d' of 1.79. In what follows , we will rely most heavily on the data from this experiment and on those from sample 2 of Dickstein 1978a.4 However , Johnson - Laird ' and Bara s production task will be discussed at the end of this section . Our look at the categorical Syllogism Performance : Dedllcible Argllmellts statements suggests that a subject ' s response to a syllogism should depend in part on whether he or she accepts the implicatures of the premises and conclusion . There are 15 syllogisms that are deducible when the premises and conclusions are represented as in the b sentences of ( 1)(4), and of these items only six remain deducible when we add both the ' ' premises and conclusion s implicatures . Table 7.1 identifies these two " " groups of arguments , along with the percentages of follows responses in ' Dickstein s experiment and our own . 
The table shows that subjects tend to offer more "follows" responses for syllogisms that are unaffected by the implicatures. In our experiment, subjects gave 85.8% "follows" responses to these syllogisms, but only 63.3% "follows" responses to the nine remaining problems. In Dickstein's study, the corresponding percentages are 89.4 and 70.8. This difference is, of course, confounded by factors other than the status of the implicatures, but the data hint that implicatures may have played a role in the subjects' reasoning.
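The d' of 1.79 quoted earlier for the aggregate data can be recovered from the reported rates under the usual equal-variance signal-detection assumption, taking the hit rate as 72.3% and the false-alarm rate as 100% - 88.5% = 11.5%. The following is a minimal sketch of that arithmetic; the function name `d_prime` is ours, not the chapter's.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance signal-detection index: z(hits) - z(false alarms)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Rates reported for the 256-syllogism experiment.
hit_rate = 0.723               # "follows" responses on deducible items
false_alarm_rate = 1 - 0.885   # "follows" responses on nondeducible items

print(round(d_prime(hit_rate, false_alarm_rate), 2))  # about 1.79
```

The false-alarm rate is simply the complement of the correct-rejection rate, so no information beyond the two percentages in the text is needed.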



Table 7.1
Percentage of "follows" responses and inference rules for each of the deducible syllogisms.

                                                                                                       % "follows" responses
Syllogism                                   Inference rules                                             Schank-Rips   Dickstein

Deducible with or without implicatures
All(G,H), All(F,G), ∴ All(F,H)              Transitivity
No(G,H), All(F,G), ∴ No(F,H)                Exclusivity
No(H,G), All(F,G), ∴ No(F,H)                Exclusivity, Conversion
All(H,G), No(F,G), ∴ No(F,H)                Exclusivity, Conversion
Some(G,H), All(G,F), ∴ Some(F,H)            AND Elim. (forward), AND Intro. (back), IF Elim. (back)
Some-not(G,H), All(G,F), ∴ Some-not(F,H)    AND Elim. (forward), AND Intro. (back), IF Elim. (back)

Deducible without implicatures
All(G,H), Some(F,G), ∴ Some(F,H)            AND Elim. (forward), AND Intro. (back), IF Elim. (back)
All(G,H), Some(G,F), ∴ Some(F,H)            AND Elim. (forward), AND Intro. (back), IF Elim. (back)
All(H,G), No(G,F), ∴ No(F,H)                Exclusivity, Conversion
Some(H,G), All(G,F), ∴ Some(F,H)            AND Elim. (forward), AND Intro. (back), IF Elim. (back)
No(G,H), Some(F,G), ∴ Some-not(F,H)         AND Elim. (forward), AND Intro. (back), Conj. Syll. (back)
No(H,G), Some(F,G), ∴ Some-not(F,H)         AND Elim. (forward), AND Intro. (back), Conj. Syll. (back)







89.5 65.0









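Table 7.1 (continued)

                                                                                                                  % "follows" responses
Syllogism                                   Inference rules                                                        Schank-Rips   Dickstein

No(G,H), Some(G,F), ∴ Some-not(F,H)         AND Elim. (forward), AND Intro. (back), Conj. Syll. (back)
No(H,G), Some(G,F), ∴ Some-not(F,H)         AND Elim. (forward), AND Intro. (back), Conj. Syll. (back)
All(H,G), Some-not(F,G), ∴ Some-not(F,H)    AND Elim. (forward), AND Intro. (back), NOT Intro. (back), IF Elim. (back)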





If we take into account the implicatures of the syllogisms and the rules that are needed to derive the conclusions, we can begin to understand the range of difficulty of these problems. Consider first the arguments at the top of table 7.1, which are deducible even when their implicatures are included. We have just seen that these syllogisms are easier to verify than the remaining ones, but the table also shows that, within this group, syllogisms that PSYCOP can prove via forward rules (see table 6.2) have higher scores than those that also require backward rules (Backward AND Introduction and IF Elimination). The percentage of "follows" responses was 88.8% for the first group and 80.0% for the second group in our experiment, and 90.8% vs. 81.6% in Dickstein's. Scores also vary with rules among the syllogisms in the bottom part of table 7.1. Among this group, the more difficult syllogisms all require either the backward NOT Introduction or the backward Conjunctive Syllogism rule. The overall percentage of "follows" responses for this subset of problems is 56.0% (Schank and Rips) and 55.8% (Dickstein). By contrast, the remaining four syllogisms require either Exclusivity and Conversion or AND Introduction, AND Elimination, and IF Elimination. These problems are decidedly easier, with overall scores of 72.5% and 89.5% in the two experiments. In general, then, we can account for these data if we assume that the rules for NOT Introduction and Conjunctive Syllogism are more difficult for subjects to deploy than the rules for AND and modus ponens. This assumption seems intuitively right, and it also accords with the results of chapter 5. (See table 5.2: AND Introduction, AND Elimination, and IF Introduction all received higher parameter estimates than NOT Introduction. Conjunctive Syllogism did not figure in the earlier problems.) In fact, the only anomaly in the table is that the syllogism (All(H,G), No(G,F), ∴ No(F,H)), which requires only Exclusivity and Conversion, is no easier than the syllogisms that PSYCOP proves with AND Introduction, AND Elimination, and IF Elimination.

In earlier research, much was made of the fact that the difficulty of a problem varies with the order of the terms in the premises. Premises of the form (Q1(G,H), Q2(F,G)) tend to be easier than those of the form (Q1(H,G), Q2(F,G)) or (Q1(G,H), Q2(G,F)), and the latter premises are themselves easier than (Q1(H,G), Q2(G,F)), where Q1 and Q2 are the usual quantifiers (All, No, Some, or Some-not). This "figure effect" also holds in the data of table 7.1: In our study, the scores for these three groups were 78.8% for (Q1(G,H), Q2(F,G)), 71.3% for (Q1(H,G), Q2(F,G)) and (Q1(G,H), Q2(G,F)), and 66.7% for (Q1(H,G), Q2(G,F)); in Dickstein's experiment the comparable figures are 86.2%, 78.3%, and 67.6%. It is possible, however, that this difference is a by-product of implicatures and rule use, as just discussed, rather than due to the order of the terms per se. When these rules and implicatures are held constant, the figure effect becomes more equivocal. Thus, there is no advantage of (No(G,H), All(F,G), ∴ No(F,H)) over (No(H,G), All(F,G), ∴ No(F,H)) in either study. In our study, (All(G,H), Some(F,G), ∴ Some(F,H)) has a higher score than (All(G,H), Some(G,F), ∴ Some(F,H)), in accord with the figure effect, but in Dickstein's there is no such difference. And although Dickstein found a figure effect for (No, Some, ∴ Some-not) syllogisms, it disappears in our experiment (see items 11-14 in table 7.1). Although it is possible to reformulate PSYCOP's rules to make them more sensitive to the order of the terms in a problem, it is not clear from the present data that such a move is necessary. (Figure effects might be more prominent in experiments in which subjects produce their own conclusions, since subjects may have a tendency to consider conclusions that are similar in surface features to the premises. They may also be more prominent for nontraditional syllogisms in which the subject of the conclusion comes from the first premise and the predicate from the second premise, e.g., (All(F,G), All(G,H), ∴ All(F,H)).)

Syllogism Performance: Nondeducible Arguments

According to the present approach, subjects test syllogisms (and other argument forms) by attempting to prove the conclusion. For nondeducible arguments, of course, no such proof is possible, and subjects must determine how to interpret this negative evidence. From a subject's point of view, failure to find a proof may mean that the argument is not deducible and that no proof is available; however, it is also possible that a perfectly good proof exists, but one that the subject was unable to derive because of information-processing limits. What subjects do when they can't find a proof may depend on properties of the experimental setup. In the mathematical model of chapter 5, we simply assumed that subjects in this situation respond "doesn't follow" with probability $(1 - p_g)$ and guess at random between the two alternatives with probability $p_g$. Thus, $(1 - p_g)$ represented the subjects' certainty that no proof is possible. For syllogisms, however, there is a large body of research suggesting that subjects' decisions on nondeducible problems are not entirely random but are biased by features of the syllogisms and of the presentation method. We considered some of these (atmosphere and belief bias) in chapter 1, and a response-frequency effect was mentioned earlier in this section. (See Chapman and Chapman 1959, Dickstein 1978b, and Evans 1989. Chapter 10 contains a further discussion of bias.) For the experiment of the appendix, we took steps to neutralize belief bias and response frequency. But there are two other factors that may play a role in these results. One of these depends on the implicatures of categorical statements; the other is related to atmosphere.

We noticed earlier that the implicatures of the premises and the conclusion can turn a deducible syllogism into a nondeducible one. It is therefore pertinent to ask whether the opposite can also happen: Can a syllogism that is nondeducible when the premises and the conclusion are represented as in the b sentences of (1)-(4) become deducible if we add the implicatures in the c sentences? The answer is that this does occur, though it is rare. In fact, only 2 of the 241 nondeducible syllogisms become deducible when the implicatures supplement the usual representations of the premises and the conclusion. (The two syllogisms in question are (Some(G,H), All(G,F), ∴ Some-not(F,H)) and (Some-not(G,H), All(G,F), ∴ Some(F,H)).) Subjects can get a more liberal crop of new deducible syllogisms, however, if they are willing to accept the implicatures of the premises without those of the conclusion. In this case, 18 previously nondeducible syllogisms become deducible, including many of those that Aristotelian and scholastic logic sanctioned (see Adams 1984). The appendix identifies these items. Subjects may be tempted to accept just the premise implicatures as a kind of opportunistic strategy. If they fail to find a proof on a stricter interpretation, then including the premises' implicatures will sometimes give them a second chance.

There is evidence that these premise implicatures influence the way subjects appraise the syllogisms. In our own data set, subjects judged that these items followed on 41.7% of trials, versus 9.0% for the remaining nondeducible syllogisms. This difference is much narrower in Dickstein's data (15.1% vs. 12.1%), perhaps in part because of the forced-choice format of that experiment. When subjects judge the premise pair (All(G,H), All(F,G)), for example, they can't check both of the alternatives All(F,H) and Some(F,H), since only one response is allowed. Because the first of these conclusions is deducible whether or not the subjects take the implicatures into account, it should dominate the second, which is deducible only if subjects add the premises' (but not the conclusion's) implicatures.

However, it is clear that implicatures can't be the whole story behind responses to the nondeducible syllogisms. A glance at the appendix shows a much larger proportion of Some and Some-not conclusions than of All or No conclusions, particularly for syllogisms that contain a Some premise or a Some-not premise. This tendency might suggest an atmosphere explanation (Sells 1936; Woodworth and Sells 1935), but the responses don't always mesh well with the classical atmosphere predictions. For example, atmosphere predicts that when both premises are Some subjects should reach a Some conclusion; but the results show about equal proportions of Some and Some-not responses. Similarly, syllogisms that contain two Some-not premises or one Some and one Some-not premise should produce Some-not responses; but again there are about equal numbers of Some and Some-not conclusions. Rather than atmosphere, these data suggest that our subjects may have been trying to hedge their bets, on the assumption that it is more likely that some F are H or that some F are not H than that all F are H or no F are H. Subjects may believe that the Some and the Some-not conclusions take less information to establish and that they are therefore more likely to be the correct answers than their universal counterparts, All and No. If this is right, it seems closer to what Woodworth and Sells called the principle of caution than to their much-better-known atmosphere effect.



Summary and Model Fitting

We are supposing that subjects in syllogism experiments are guided by the rules of chapter 6. However, they also have some options about the way they should interpret the premises and conclusion and about how to respond when no proof of the interpreted syllogism is forthcoming. On the interpretation side, they can choose to adopt the implicatures of the premises, the conclusion, or both, where accepting the premises' implicatures will tend to increase their chance of finding a proof and accepting the conclusion's implicatures will decrease it. On the response side, if subjects fail to find a proof they may be more likely to hazard a guess at the answer with a Some or a Some-not conclusion than with an All or a No conclusion. These assumptions are admittedly ad hoc, motivated as they are by our survey of the available syllogism data sets. Nevertheless, it is of interest to see how close they come to providing an accurate fit to the data: Systematic departures from the model can indicate mistakes in our assumptions or places where further factors are playing a role.

Figure 7.1 summarizes the assumptions in a way that shows how they lead to a "necessarily true" or "not necessarily true" response to a given syllogism. We suppose that subjects will accept the premises' implicatures with probability $P_{\mathrm{prem}}$ and that they will accept the conclusion's implicatures with probability $P_{\mathrm{conc}}$. Then they seek a proof for the interpreted syllogism, where the probability of finding such a proof depends on the availability of the deduction rules in their repertoire. The rules that are necessary in constructing a proof will, of course, differ depending on the implicatures adopted; in general, more rules will be needed if the subjects have to prove the conclusion's implicatures as well as the conclusion itself.
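The flow that figure 7.1 summarizes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are ours; `p_proof_given` stands for the chance of finding a proof under each of the four possible readings (both sets of implicatures, premises only, conclusion only, neither); and `p_guess` stands for the chance of guessing after a failed search, with half of the guesses assumed to come out "necessarily true". The numeric values in the usage lines are hypothetical, not estimates from the experiment.

```python
def p_proof(p_prem: float, p_conc: float, p_proof_given: dict) -> float:
    """Chance of finding a proof, averaged over the four possible readings.

    p_proof_given maps (premise_implicatures, conclusion_implicatures),
    each True or False, to the chance of a proof under that reading.
    """
    total = 0.0
    for prem in (True, False):
        for conc in (True, False):
            weight = (p_prem if prem else 1 - p_prem) * (p_conc if conc else 1 - p_conc)
            total += weight * p_proof_given[(prem, conc)]
    return total


def p_necessarily_true(p_found: float, p_guess: float) -> float:
    """A proof yields "necessarily true"; otherwise half of the guesses do."""
    return p_found + 0.5 * (1 - p_found) * p_guess


# Hypothetical values, chosen only to exercise the functions: a syllogism
# that is provable only when the premises' implicatures alone are adopted.
readings = {(True, True): 0.0, (True, False): 0.8,
            (False, True): 0.0, (False, False): 0.0}
found = p_proof(p_prem=0.5, p_conc=0.2, p_proof_given=readings)
print(round(found, 3))                                   # 0.5 * 0.8 * 0.8 = 0.32
print(round(p_necessarily_true(found, p_guess=0.3), 3))  # 0.32 + 0.5 * 0.68 * 0.3 = 0.422
```

Keeping the interpretation step and the response step in separate functions mirrors the two kinds of options described above: how to read the syllogism, and what to do when no proof turns up.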
We can abbreviate the availability of the rules, using $P_{r1}$ as the probability of having available all rules needed to prove the conclusion when subjects adopt both sets of implicatures, $P_{r2}$ as the comparable probability when subjects adopt the premises' implicatures but not the conclusion's implicatures, $P_{r3}$ as the probability when subjects adopt the conclusion's implicatures but not the premises', and $P_{r4}$ as the probability when subjects adopt neither set of implicatures. (In fact, none of the syllogisms are deducible when only the conclusion's implicatures are present; thus, $P_{r3} = 0$.) The values of the $P_r$ parameters are functions of the availabilities of the individual deduction rules. For example, if a syllogism is provable on the basis of Forward AND Elimination, Backward AND Introduction, and Backward IF Elimination when both sets of implicatures are accepted, then $P_{r1}$ will be a function of the probabilities associated with each of these rules.

Figure 7.1
Summary of the application of the PSYCOP model to syllogisms.

The rules that appear in the proofs of the syllogisms are Transitivity, Exclusivity, Conversion, Forward AND Elimination, Backward AND Introduction, Backward Conjunctive Syllogism, Backward IF Elimination, and Backward NOT Introduction. To keep the model simple, we can assume that the availabilities of these rules are independent, as in the simulations of chapter 5. We also assume that the chance of deriving an alternative proof is negligible if the model's initial proof of the argument fails. Finally, to reduce the number of rule parameters, we can take the probabilities of the three rules in table 6.2 to be equal, since they are all forward rules of about equivalent complexity. A further reduction in rule parameters occurs because AND Elimination and AND Introduction always appear together in a proof, so we can estimate a single parameter to represent their joint availability. The overall probability that subjects will find a proof for the syllogism is therefore

$$P(\text{proof}) = P_{\mathrm{prem}} P_{\mathrm{conc}} P_{r1} + P_{\mathrm{prem}} (1 - P_{\mathrm{conc}}) P_{r2} + (1 - P_{\mathrm{prem}}) P_{\mathrm{conc}} P_{r3} + (1 - P_{\mathrm{prem}})(1 - P_{\mathrm{conc}}) P_{r4}. \tag{7}$$

(As was mentioned earlier, $P_{r3}$ is 0 for these problems, so the third term disappears.) If subjects find a proof in this way, they should respond "necessarily true"; if there is no proof, they may either respond "not necessarily true" or guess at the correct answer. As figure 7.1 shows, we assume that the probability of guessing is $P_{g1}$ when the conclusion is either All or No and $P_{g2}$ when the conclusion is Some or Some-not. This allows for (though it doesn't enforce) the possibility that subjects will guess more often with the "safer" Some or Some-not conclusions. If subjects choose "necessarily true" on half of their guesses, then the probability of a "necessarily true" response for syllogisms with All or No conclusions is

$$P(\text{``necessarily true''}) = P(\text{proof}) + 0.5\,(1 - P(\text{proof}))\,P_{g1}, \tag{8}$$

and that for a syllogism with a Some or a Some-not conclusion is

$$P(\text{``necessarily true''}) = P(\text{proof}) + 0.5\,(1 - P(\text{proof}))\,P_{g2}. \tag{9}$$

To evaluate these assumptions, we have fitted these equations to the data from our experiment using nonlinear least-squares regression. The predictions from the model appear beneath the observed data in the appendix. The model fits fairly well overall, although there are some clear deviations. The correlation between the 256 observed and predicted responses is 0.879, and the root-mean-square deviation is 9.55. The rule parameters estimated from these data are very much in accord with intuition and with the earlier model fitting in chapter 5. These estimates appear in table 7.2 and show that rule availabilities are highest for the simple forward rules from table 6.2 and for the AND Elimination/Introduction pair. Of the backward rules, IF Elimination receives higher availability scores than the Conjunctive Syllogism rule, and NOT Introduction has quite low availability, just as it did in the experiment discussed in chapter 5. The interpretation and response parameters are also close to what we anticipated from our look at the data in the appendix. Subjects appear more willing to accept the implicatures of the premises than those of the conclusion, and they are more likely to guess at a response when the conclusion begins with Some or Some-not than with All or No (note 5).

Table 7.2
Parameter estimates for the model of equations (7)-(9), fitted to the data of the appendix.

Parameter                                      Interpretation                                              Estimate

Presupposition parameters
  $P_{\mathrm{prem}}$                          Probability of accepting premise implicatures
  $P_{\mathrm{conc}}$                          Probability of accepting conclusion implicatures
Response parameters
  $P_{g1}$                                     Probability of guessing with All/No conclusions             0.08
  $P_{g2}$                                     Probability of guessing with Some/Some-not conclusions      0.31
Rule parameters
  $P_{tr} = P_{ex} = P_{con}$                  Probability of forward rules' being available for proof     0.95
  $P_{and}$                                    Probability of AND Introduction and Elimination             0.93
  $P_{ie}$                                     Probability of IF Elim.                                     0.85
  $P_{cs}$                                     Probability of Conjunctive Syllogism                        0.70
  $P_{ni}$                                     Probability of NOT Intro.                                   0.30

The most serious departures from the predictions appear in nondeducible syllogisms containing two All premises or an All and a No premise. The model clearly underestimates the frequency of All conclusions in the former case, and to a lesser extent it underestimates the frequency of No conclusions in the latter. In the present context, it seems natural to attribute these discrepancies to mistakes in applying the Transitivity and Exclusivity rules. If subjects reversed the order of the terms in the All premises on some portion of the trials, these rules would lead to exactly the conclusions that pose difficulties for the model. If this tendency exists it will be quite similar to the traditional notion of "illicit conversion," in which subjects are supposed to interpret All F are G as equivalent to (or as inviting the inference that) All G are F (Ceraso and Provitera 1971; Chapman and Chapman 1959; Revlis 1975a,b). This reversal of arguments must be relatively infrequent, however, since in these data no more than 40% of the subjects respond that these deductively incorrect syllogisms are necessarily true (see also Begg and Harris 1982 and Newstead and Griggs 1983). In this context, it seems better to regard conversion as an occasional tendency than as a dominant part of subjects' understanding of All statements.

It is possible to revise the model in order to take this tendency into account, but I have not tried to do so. The point of fitting equations (7)-(9) is not to give a complete account of syllogism responses but to show how PSYCOP can be extended in that direction. It is clear that specific interpretation and response processes have to supplement PSYCOP's basic tenets in order to reproduce details in the response profiles. But it is also important to see that subjects' handling of syllogisms is part of a broader deductive scheme that can encompass many other sorts of sentential and variable-based problems.

Producing Conclusions to Syllogistic Premises

As has been mentioned, it is possible to present just the premises of a syllogism and have the subjects fill in a conclusion. Johnson-Laird and Bara (1984a) have reported experiments of this sort and have given the production frequency for each of the 64 pairs of syllogistic premises. We can therefore ask whether a model like the one we are developing can also account for these data. This question is an especially interesting one, since Johnson-Laird (1983; see also Johnson-Laird and Byrne 1991) has expressly denied that syllogistic reasoning is accomplished by mental rules like those PSYCOP incorporates.
In the present subsection we will see whether the syllogism results justify this denial; we will consider Johnson-Laird's other reasons for skepticism about rules when we take up his theory in chapter 10.

PSYCOP already has a mechanism for generating conclusions to premises: its forward rules. Since these rules don't need a conclusion or subgoal to trigger them, they can apply directly to syllogistic premises to generate a correct answer. In the case of categorical syllogisms, we would expect the forward rules in table 6.2 to be the important ones, and we should predict that the deducible conclusions produced by these rules will be much more frequent than deducible conclusions that require backward processing. This is easily confirmed by the data of Johnson-Laird and Bara (note 6): There are six premise pairs that yield conclusions by the forward rules alone, and for these pairs 77.5% of subjects produced one of these conclusions as a response. By contrast, there are 16 pairs that yield conclusions by backward rules, and for these pairs only 31.6% of subjects produced such a conclusion (note 7).

Suppose, however, that subjects are confronted with a premise pair for which no forward conclusions are possible. How should they proceed? One simple strategy might be to select tentative conclusions from the possible categorical sentences linking the end terms and then test these possibilities to see if any are deducible. (We proposed a similar strategy for the Selection task in chapter 5.) For example, with a pair such as (Some-not(A,B), All(C,B)) no forward conclusion will be forthcoming; so subjects may generate possible categorical conclusions like Some-not(A,C) or All(C,A) and check if there is a proof for them. This generate-and-test strategy seems quite reasonable in the context of an experiment like this, where all the stimulus sentences have a common categorical format and where there is an implicit demand to link the end terms (i.e., A and C) of the premises. The strategy is a powerful one, since it allows subjects to bring their backward rules into play.

Of course, not all subjects will have the inclination to test each of the possible conclusions: All(A,C), All(C,A), Some(A,C), Some(C,A), and so on. This is especially true since, on the majority of trials, subjects would have to test exhaustively all eight conclusions; only 22 of the 64 premise pairs have conclusions that can be deduced in this fashion (see note 6). The subjects may therefore settle for checking just one or two salient possibilities. If they find a proof for one of these items, they can then produce the conclusion as their response; if they don't find a proof, they can either guess or respond that nothing follows, just as in the syllogism-evaluation model that we discussed earlier. In deciding which conclusions to test, subjects are probably influenced by the form of the premises they are considering. If they are studying a premise pair such as (Some-not(A,B), All(C,B)), then the conclusions Some-not(A,C), Some-not(C,A), All(A,C), and All(C,A) naturally come to mind, since the quantifiers in these sentences are the same as those of the premises. Newell (1990) incorporates a similar assumption in his account of Johnson-Laird and Bara's results. (This may qualify as a kind of "matching" bias, related to that discussed in Evans and Lynch 1973, or it may be the result of an availability heuristic such as that documented in Tversky and Kahneman 1973. See chapter 10 for further discussion of these biases.)

A look at the data of Johnson-Laird and Bara shows that this correspondence between premise quantifiers and response quantifiers is clearly present. Of the 743 categorical conclusions produced by subjects in their experiment, 87.9% had quantifiers that were the same as those in the premises. The percentage expected by chance is 43.8%. (The chance percentage is slightly less than 50% because 16 of the premise pairs, e.g., (All(A,B), All(C,B)), contain just one type of quantifier; for these items, the probability that a randomly generated categorical conclusion will have the same quantifier is, of course, 0.25.) In some instances, conclusions with the same quantifiers as the premises are deducible; but even if we confine our attention to syllogisms with no deducible conclusion, the effect is about the same. There were 567 (incorrect) categorical conclusions that Johnson-Laird and Bara's subjects generated for these items, and 91.9% had a quantifier that duplicated one of the premises'.

One further fact about this congruence between the quantifiers of the premises and those of the conclusion-response is worth noticing: For pairs in which the premise quantifiers differ (e.g., (All(A,B), Some(B,C))), there are two possible choices for the premise quantifier (All or Some in the example). In these cases, the responses seem to follow a systematic pattern in which No dominates Some-not, Some, and All; Some-not dominates Some and All; and Some dominates All. Given (All(A,B), Some(B,C)), for example, the modal response is Some(A,C); with (Some-not(B,A), Some(C,B)), the modal response is Some-not(C,A). Gilhooly et al. (1993; see also Wetherick and Gilhooly 1990) have independently noted the same pattern. The ordering clearly differs from an atmosphere bias and from the penchant for Some and Some-not conclusions that we met in studying the syllogism-evaluation experiments. Instead, it seems to reflect the notion that the best responses are ones that posit the least overlap between the terms. If we have to find a relation between terms A and C, we might start with the minimal one in which A and C have least in common.

We can then summarize the subjects' production strategy in the following way: When they receive a premise pair, they first see whether any conclusion follows spontaneously via forward rules. If so, they produce that conclusion as a response. If not, they consider as a tentative conclusion a categorical statement that links the end terms and whose quantifier is grafted from one of the premises. In most cases, the premises will have two different quantifiers, and the one they choose apparently follows the ordering just described. (When the premises have only a single type of quantifier, then of course that quantifier is the one they select.) Subjects attempt to determine whether this tentative conclusion follows, this time employing their full repertoire of rules, and they will again respond with the conclusion if a proof is available. However, should the search for a proof turn out to be unsuccessful, subjects follow a procedure similar to that of figure 7.1, either guessing about the validity of the conclusion, responding that no conclusion follows, or persevering in checking other possible conclusions. The data suggest that only a small number of subjects persevere; for simplicity, we can assume that most subjects stop after checking just the first possibility, a small minority continuing until they have found a deducible conclusion or exhausted all the possibilities.

This procedure is consistent with the qualitative features of Johnson-Laird and Bara's results. For any pair of premises, the responses should consist of (a) any conclusions that follow from forward rules; (b) conclusions that share the dominant quantifier of the premises, based either on guesses or on backward processing; (c) "no conclusion follows" responses; and (d) a small number of correct conclusions with nondominant quantifiers, based on backward processing by the persistent subjects. This predicts the possibility of 200 different response types across all pairs of premises. In fact, 165 of these appear in the data, accounting for 90.3% of the reported response tokens (see note 6). Conversely, there are 376 response types that are predicted not to appear, only 27 of which occur (and which constitute the residual 9.7% of the response tokens). This seems an extremely high success rate, in view of the relatively unconstrained nature of the production task (note 8).

Of course, we are again relying on post facto assumptions that go beyond the basic PSYCOP framework, the most serious of which is the dominance ordering among the quantifiers. But even if we forget about dominance and assume simply that subjects attend first to a conclusion containing one of the premise quantifiers, we still do quite well. We now predict the possibility of 296 response types, 175 of which appear in the data, capturing 93.1% of the reported response tokens. In the other direction, there are 280 response types that are not predicted, and only 17 of these actually occur. In other words, even with very simple response assumptions we can account for the major qualitative properties of the production data.

We would have to refine these assumptions in order to produce a quantitative model of the sort we worked up in the previous section. But our aim is accomplished if we can show that PSYCOP is consistent with these previous syllogism data, and the statistics already reported suggest that this is the case. To get a firmer grip on the strengths and weaknesses of the model, we need to generate new empirical and theoretical results for the wider domain of inferences that PSYCOP can prove, and this is the goal that we take up in the following section.
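The quantifier-selection step of the production strategy described in this subsection can be sketched as follows. The sketch is only illustrative: the names `DOMINANCE`, `tentative_quantifier`, `candidate_conclusions`, and `chance_match_rate` are ours, and the code covers just the choice of a tentative conclusion, not the search for a proof. It also recomputes the 43.8% chance baseline from the counts given in the text (16 premise pairs with one type of quantifier, the remaining 48 with two).

```python
from itertools import product

QUANTIFIERS = ("All", "No", "Some", "Some-not")

# Dominance ordering reported for the production data, strongest first.
DOMINANCE = ("No", "Some-not", "Some", "All")


def tentative_quantifier(q1: str, q2: str) -> str:
    """Pick the premise quantifier that ranks higher in the dominance order."""
    return min((q1, q2), key=DOMINANCE.index)


def candidate_conclusions(end1: str = "A", end2: str = "C") -> list:
    """All eight categorical sentences linking the two end terms."""
    orders = ((end1, end2), (end2, end1))
    return [f"{q}({s},{p})" for q, (s, p) in product(QUANTIFIERS, orders)]


def chance_match_rate() -> float:
    """Chance that a random conclusion quantifier matches a premise quantifier."""
    pairs = list(product(QUANTIFIERS, repeat=2))  # 16 quantifier pairings
    # Each pairing occurs in four figures, so averaging over pairings gives
    # the same value as averaging over all 64 premise pairs.
    return sum(len({q1, q2}) / len(QUANTIFIERS) for q1, q2 in pairs) / len(pairs)


print(tentative_quantifier("All", "Some"))       # Some
print(tentative_quantifier("Some-not", "Some"))  # Some-not
print(len(candidate_conclusions()))              # 8
print(chance_match_rate())                       # 0.4375
```

The two `tentative_quantifier` calls correspond to the modal responses cited above for (All(A,B), Some(B,C)) and (Some-not(B,A), Some(C,B)).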

Predictionsfor MultiVariable Contexts As further tests of the extended model, we need problems with variables that are not confined to the All , Some, No , Some-not categorical format. We look at two such testsin this section, the first centeredon PSYCO P's matching rules and the second basedon textbook problems in predicate logic. An Experiment on Matching Time The essential components in PSYCO P's treatment of variables are the matching rules of table 6.1 that enable the systemto recognizewhen one sentenceis a generalization of another. These rules ensure, for example, that PSYCOP can derive instanceslike Calvin = Calvin from the abstract premise x = x. Generalization and instantiation are at the heart of all symbolic cognitive theories, and we want our systemto account for them. We would like the theory to be able to predict the relative difficulty people have in coping with terms at different levels of abstraction, and we can make someheadwaytoward a test by considering the number of different . types of rules that PSYCOP must employ when matching sentences As an example of how we could frame such a test, consider argument ( 10) (with the variablesx and y ranging over people). ( 10) DazzIes(x,y) DazzIes(Fred, Mary) This argument says that Fred dazzles Mary follows from the premise Everyonedazzleseverybody; it seemsquite obvious, sincethe only rule we


Reasoningwith Variables

needto derive it is the one relating permanent namesto variables. It takes a bit more thought, however, to deal with ( 11). ( 11) Dazzles(x, Mary ) Dazzles(Fred, b) We can paraphrase( 11) as: Everyone dazzlesMary ; therefore, Fred dazzles somebody. In this case, we need two distinct rules: one relating variables and permanentnames(x and Fred) and the other relating permanent and temporary names(Mary and b). Figure 7.2 surveyssome of the possibilities for matching in a sentence that contains two arguments(variables or names). Here m and n stand for permanent names; x and y stand, as usual, for variables; a and b stand for temporary names. So P( m, b) in the figure correspondsto a sentence





Figure 7.2
Deducibility relations for sentences containing two terms. Arrows connect one sentence to another that can be deduced by means of a matching rule from table 6.1. Labels on the arrows correspond to the order of the rule in the table. m and n denote permanent names, a and b temporary names, and x and y variables in quantifier-free notation.



like Dazzles(Fred,b). The arrows indicate the matching relations. For example, the arrow connecting P(x,y) to P(m,y) at the top right of the figure means that PSYCOP can match these sentences on the basis of one of the matching rules - in this case, the third rule in table 6.1. The labels on the arrows represent the rules that are responsible for the matches: R2 stands for the second rule in table 6.1 (the rule matching temporary to permanent names), R3 for the third rule (permanent names to variables), and R4 for the fourth (temporary names to variables). Pathways of arrows indicate derivations involving combinations of rules. For example, the path from P(x,y) through P(m,y) to P(m,n) shows one way to derive P(m,n) from P(x,y) via two applications of matching rule 3, and the path from P(x,n) through P(m,n) to P(m,b) shows the derivation of P(m,b) from P(x,n) via matching rules 2 and 3. The first of these pathways corresponds to a proof of argument (10), the second to a proof of argument (11).

First, the heuristics in the matching rules guarantee that PSYCOP always takes one of the shortest paths when there is more than one route available. In matching P(a,b) at the bottom of figure 7.2 to P(x,y) at the top, for example, PSYCOP traverses the path through P(x,b) rather than any of the more circuitous routes, although those routes are also valid generalizations. Second, the figure shows a single arrow (labeled R3) connecting sentences like P(x,y) and P(m,y), but most of these adjacent sentences can't appear together as the premise and conclusion of an argument. As was noted earlier, each sentence in an argument must have distinct variables and temporary names; so if P(x,y) were a premise, it could not have P(m,y) as a conclusion, but might have P(m,z) instead. The inference from P(x,y) to P(m,z) requires R1 to match z to y and R3 to match m to x. However, this will not affect the predictions that we derive below.
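The correspondence between kinds of terms and matching rules described above can be sketched in a few lines of Python. This is only an illustration, not PSYCOP's implementation: the function names are hypothetical, the letter conventions are those of figure 7.2, and the sketch ignores the conditions and subscript handling that the actual rules of table 6.1 impose.

```python
# Term kinds follow the conventions of figure 7.2 (an assumption of this sketch):
# x, y, z are variables; m, n are permanent names; a, b, c are temporary names.
VARIABLES = set("xyz")
PERMANENT = set("mn")
TEMPORARY = set("abc")


def kind(term):
    """Classify a one-letter term as 'var', 'perm', or 'temp'."""
    if term in VARIABLES:
        return "var"
    if term in PERMANENT:
        return "perm"
    if term in TEMPORARY:
        return "temp"
    raise ValueError(f"unknown term: {term!r}")


def rule_for(premise_term, conclusion_term):
    """Return the rule that matches one conclusion term to one premise term.

    Returns "" when both terms are the same permanent name (no rule needed)
    and None when no matching rule applies.
    """
    p, c = kind(premise_term), kind(conclusion_term)
    if p == "var":
        # Any conclusion term can be matched to a premise variable.
        return {"var": "R1", "perm": "R3", "temp": "R4"}[c]
    if c == "temp":
        # A temporary name in the conclusion matches a name in the premise.
        return "R2"
    if p == "perm" and c == "perm" and premise_term == conclusion_term:
        return ""
    return None


def rules_needed(premise, conclusion):
    """List the rules needed for a match, or None if the argument is nondeducible."""
    rules = []
    for p_term, c_term in zip(premise, conclusion):
        rule = rule_for(p_term, c_term)
        if rule is None:
            return None
        if rule:
            rules.append(rule)
    return rules


print(rules_needed("xy", "mn"))  # argument (10): ['R3', 'R3']
print(rules_needed("xn", "mb"))  # argument (11): ['R3', 'R2']
print(rules_needed("ab", "xy"))  # premise and conclusion interchanged: None
```

On this simplified picture, argument (10) calls for one rule applied twice and argument (11) for two different rules, which is the contrast the experiment exploits.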
Third, there are other types of atomic sentences containing two variables that do not appear in the figure (e.g., P(x,a_x) or P(y,y)). These sentences also entail (or are entailed by) some of the depicted sentences, but since these items don't occur in the present experiment I have omitted them to keep the diagram simple.

Our main prediction concerns arguments, such as (10) and (11), in which PSYCOP must change two terms in order to obtain a match between the conclusion and the premise: It should take subjects longer to determine the correctness of an argument like (11) that requires the use of two different rules than to determine the correctness of an argument like (10) that requires two applications of a single rule. Double application of a single



rule should be simpler since PSYCOP will find the rule quicker after its first success (see chapter 4) - a type of priming effect. There are three arguments of this simpler type with sentences from figure 7.2: the arguments from P(x,y) to P(m,n), from P(x,y) to P(a,b), and from P(m,n) to P(a,b). Table 7.3 lists these under the heading "One rule, two steps," together with the rule that each requires. There are a larger number of arguments that depend on two distinct rules (six in all), but the experiment included just three of them in order to equate the frequency of the argument types. These appear under "Two rules, two steps" in table 7.3. Notice that within these two groups of arguments the same rules are used equally often; what differs between the groups is the distribution of the rules to the individual items. Thus, any difference in the internal complexity of the matching rules should not affect the results of the experiment. The same is true of the numbers of variables, temporary names, and permanent names

Table 7.3
Reaction times and error rates for two-place arguments involving matching rules.


Argument form               Rules     Mean correct          Error rate (%)
(premise; conclusion)                 response time (ms)

Two rules, two steps
P(m,y); P(a,b)              R2, R4
P(x,n); P(m,b)              R2, R3
P(x,y); P(a,n)              R3, R4

One rule, two steps
P(x,y); P(a,b)              R4, R4
P(m,n); P(a,b)              R2, R2
P(x,y); P(m,n)              R3, R3

Additional arguments
P(x,y); P(z,n)              R1, R3
P(a,n); P(c,b)              R2, R2
P(m,y); P(m,b)              R4





















that appear in the premises and conclusions. For example, variables appear four times in the premises of the two-rule-two-step arguments and four times in the one-rule-two-step arguments; permanent names appear twice in the premises of each argument type; and so on. So sheer frequency of the different types of terms does not confound the comparison.

We also included three additional deducible arguments that correspond to sentences connected by a single arrow in figure 7.2. For reasons already mentioned, we sometimes have to alter these sentences when they appear together in an argument, and in that case they require two rules for a match. Table 7.3 shows the corrected arguments and rule sets. Only the third of these "additional arguments" can PSYCOP actually deduce in one step.

The experiment mixed the nine types of deducible arguments just described with nine nondeducible ones that interchanged the premises and the conclusions of the items in table 7.3. So, for example, in addition to the deducible argument P(x,y), Therefore, P(a,b), subjects also received the nondeducible argument P(a,b), Therefore, P(x,y). As a result, half of the arguments were deducible and half nondeducible.

In this study, 24 University of Chicago students were told that they would see problems concerning groups of people and that each problem concerned a separate group. The problems appeared on a monitor, each argument consisting of a premise written above a line and a conclusion beneath the line. The subjects were supposed to assume that the premise was true of the group of people and decide whether the conclusion was necessarily true of the same group on the basis of this information. Each subject indicated his or her decision by pressing a key labeled "follows" or one labeled "does not follow" on a keyboard, and a computer measured the response time from the presentation of the argument to the key press.
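The construction of the stimulus set, and the check that term frequencies are balanced across the two main groups, can be illustrated with a short Python sketch. The transcription of the argument forms from table 7.3 and the helper name count_terms are assumptions made for this illustration.

```python
# Deducible arguments of table 7.3 as (premise, conclusion) pairs of terms.
# x, y, z are variables; m, n permanent names; a, b, c temporary names.
# The transcription of the table is an assumption of this sketch.
TWO_RULES = [("my", "ab"), ("xn", "mb"), ("xy", "an")]
ONE_RULE = [("xy", "ab"), ("mn", "ab"), ("xy", "mn")]
ADDITIONAL = [("xy", "zn"), ("an", "cb"), ("my", "mb")]


def count_terms(arguments, letters, position=0):
    """Count the given term letters in premises (position 0) or conclusions (1)."""
    return sum(term in letters
               for argument in arguments
               for term in argument[position])


deducible = TWO_RULES + ONE_RULE + ADDITIONAL
# Nondeducible items interchange the premise and the conclusion.
nondeducible = [(conclusion, premise) for premise, conclusion in deducible]
block = deducible + nondeducible

print(len(block))                     # 18 arguments per block
print(count_terms(TWO_RULES, "xyz"))  # 4 variables in two-rule premises
print(count_terms(ONE_RULE, "xyz"))   # 4 variables in one-rule premises
print(count_terms(TWO_RULES, "mn"))   # 2 permanent names in two-rule premises
print(count_terms(ONE_RULE, "mn"))    # 2 permanent names in one-rule premises
```

The counts agree with the figures given in the text, so a difference between the two groups cannot be attributed to how often each kind of term appears.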
The computer provided each subject with feedback on his or her performance after every block of 18 arguments. The feedback consisted of the subject's average response time and the percentage of correct responses during that block, but there was no feedback on individual problems. There were six blocks of trials in this experiment, each block consisting of one instance of each of the deducible arguments in table 7.3 and one instance of each of the nondeducible arguments. To frame the problems, we compiled six sets of 18 transitive verbs, so that the mean word frequency of the verbs was approximately the same per set. One set was then assigned to each of the six main groups of arguments: the two-rule-two-step



group, the one-rule-two-step group, the additional group, and their three nondeducible counterparts. Within these groups, we randomly assigned the verbs to the argument instances. In constructing the stimulus items, we used everyone or everybody to translate variables, someone or somebody for temporary names, and common male and female first names for the permanent names. Thus, subjects might have seen (12) as one instance of the first argument in table 7.3.

(12) Janet rewards everybody.
     ------------------------
     Someone rewards somebody.

Universal and existential quantifiers never appear together in any of the sentences in this experiment, so none of the sentences exhibit scope ambiguities.

Table 7.3 gives the average times for correct responses and the error rates for the deducible arguments. Each mean represents 144 observations (24 subjects x 6 repetitions of each argument type), less the number of errors. Overall, subjects took 3882 milliseconds to decide that the two-rule-two-step arguments followed, 3235 ms for the one-rule-two-step arguments, and 3529 ms for the additional arguments. An analysis of variance indicates that the main effect of argument type is significant (F(2,46) = 24.03, p < 0.001), and Newman-Keuls tests confirm that each mean differs from the others. The difference between two-rule and one-rule arguments is exactly what the PSYCOP theory predicts. And although we had no direct hypothesis about the relation of the additional problems to the other two groups, the times for the separate additional arguments also fit what we would expect from the rule breakdown in the table: The argument that takes two rules and two steps is slower than the argument that takes one rule and two steps, and the latter argument is, in turn, slower than the argument that takes only one step. The error rates in this experiment generally follow the pattern of the response times.
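The translation scheme used to build stimulus sentences such as those in (12) can be sketched as follows. The name assignment and the function names are hypothetical, and the choice of everyone/someone for subject position and everybody/somebody for object position is inferred from example (12) rather than stated in the text.

```python
# Hypothetical assignment of first names to the permanent names m and n.
NAMES = {"m": "Janet", "n": "Fred"}
# Word choice by position, inferred from example (12).
SUBJECT_WORDS = {"var": "everyone", "temp": "someone"}
OBJECT_WORDS = {"var": "everybody", "temp": "somebody"}


def kind(term):
    """x, y, z are variables; a, b, c temporary names; anything else is permanent."""
    if term in ("x", "y", "z"):
        return "var"
    if term in ("a", "b", "c"):
        return "temp"
    return "perm"


def noun_phrase(term, words):
    """Translate one term into a first name or a quantifier word."""
    term_kind = kind(term)
    return NAMES[term] if term_kind == "perm" else words[term_kind]


def render(verb, subject, obj):
    """Turn a two-place sentence such as P(m,y) into an English sentence."""
    s = noun_phrase(subject, SUBJECT_WORDS)
    o = noun_phrase(obj, OBJECT_WORDS)
    return f"{s[0].upper()}{s[1:]} {verb} {o}."


print(render("rewards", "m", "y"))  # Janet rewards everybody.
print(render("rewards", "a", "b"))  # Someone rewards somebody.
```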
Errors are highest for the two-rule-two-step problems (12.0%) and lowest for the one-rule-two-step problems (2.6%), in accord with our prediction about their relative difficulty. For the additional arguments, the error rate is 3.2%. The main effect of argument type is significant (F(2,46) = 18.61, p < 0.001), with the two-rule-two-step arguments differing from the other two groups. Within the additional group, the errors agree with the response time differences that we just examined. The only surprise in the error data is the very high error rate (30.6%) for one of



the two-rule-two-step arguments, the problem corresponding to argument (11). The reason for the large number of errors is unclear, although this argument is the only one that has different permanent names in the premise and the conclusion. It may have confused subjects to have to generalize from one permanent name while generalizing to another.

For the most part, however, the results of this experiment provide support for PSYCOP's method of handling variables and names. The matching rules differ in the requirements they impose on subgoals and assertions, and these distinctions among the rules evidently affect people's facility with them. Arguments that require two sorts of matching are more difficult than arguments that require only one. In this experiment we have highlighted the matching process by keeping other features of the arguments as simple as possible. But the simplicity of the arguments is a bit deceptive, since the deductive operations must be fairly sophisticated to avoid difficulties of the type we studied in chapter 6 (e.g., the incorrect proofs in examples (2) and (3) of that chapter). It may be, in general, that much of the deductive work that we carry out from day to day consists of steps of this kind - steps so routine that they seem not to require deduction at all. The seeming transparency of these inferences may also account for why (to my knowledge) no previous experiments on deduction have focused directly on the process of abstraction or that of instantiation.

Complex Multi-Variable Arguments

In the experiment discussed above, we examined PSYCOP's method for matching variables and temporary names in a pure context where there were no other logical operators to worry about. However, our system can handle inferences that depend on both sentential connectives and terms. We have seen a simple example of this dual capacity in the case of syllogisms, but the arguments that PSYCOP can deal with have much greater variety.
To get a better idea of the system's adequacy, we ought to apply it to a larger range of arguments. For this purpose, I collected a set of argument exercises from introductory textbooks on predicate logic that might form the basis of a stimulus set. The textbooks were Bergmann et al. 1980, Copi 1973, Guttenplan and Tamny 1971, and Leblanc and Wisdom 1976; the arguments selected were all items that PSYCOP could prove by means of the rules of chapter 6 and that seemed short enough to be comprehensible to subjects.9 Moreover, in order to obtain stable parameter estimates and to avoid capitalizing on



chance, the key arguments were ones whose proofs used rules that also figured in the proof of at least one other argument in the group. Thus, none of the proofs required a unique rule. The final set of arguments (in quantifier-free form) appears in table 7.4. The subjects saw these 25 valid arguments randomly mixed with an equal number of invalid ones, created by re-pairing premises and conclusions. The stimuli also included 30 filler arguments (14 valid and 16 invalid), for a total of 80 arguments.

All the arguments were presented to subjects in the form of sentences about members of a club, and the predicates of the sentences therefore referred to human characteristics or relationships. The one-place predicates were selected randomly from the phrases has blue eyes, is tall, has glasses, has long hair, and is cheerful; the two-place predicates from praises and helps; and the three-place predicates from introduces . . . to . . . and talks about . . . to . . . (No predicates with more than three places were used.) For example, one of the problems (argument 1 in table 7.4) is shown in (13).

(13) There's a person A such that for any person B: if B is cheerful then A has glasses.
     ------------------------------------------------------------------
     There's a person C such that: if C is cheerful then C has glasses.

The subjects read the problems in printed booklets, with the order of problems in a different random permutation for each subject. The instructions were generally similar to those used in the earlier experiments we have reviewed, but with special warnings about the nature of the variables. The subjects were told that each club had at least one member and that the different letters in the arguments could refer to the same individual: "A person labeled 'A' in one sentence might also be labeled 'B' in another sentence in the same problem. . . . Exactly the same goes for letters that occur within a single sentence.
For example, the sentence 'There are people A and B such that A likes B' would be true if there is a club member George who likes himself. The A's and B's can refer independently to anyone in the club. It's not necessary that there be two distinct club members for 'A' and 'B' to refer to." The instructions included several sample arguments to clarify these points. The subjects, 20 Stanford undergraduates, were asked to respond to each of the experimental arguments by circling "follows" or "does not follow" in their booklets.

The percentage of "follows" responses for each of the valid arguments appears in table 7.4. Overall, the subjects said that these valid arguments



Table 7.4
Predicted and observed percentages of "follows" responses for arguments with multivariable sentences.

Argument                                                        Observed    Predicted

1-9.  [arguments not legible]
10. IF F(x,y) THEN F(x,a), F(z,b_z); Therefore, F(u,c)
11. IF F(x) THEN G(x); Therefore, IF (F(y) AND H(z,y)) THEN (G(y) AND H(z,y))
12. IF F(x,y) THEN G(m,x), F(z,w); Therefore, G(a,a)
13. IF F(x) THEN G(y); Therefore, IF NOT(G(z)) THEN NOT(F(z))
14. IF F(x,m) THEN G(y), F(n,z); Therefore, IF H(u,o) THEN G(u)
15. IF F(x,y) THEN F(z,x), F(m,n); Therefore, F(u,v)
16. F(a) AND G(x); Therefore, F(b) AND G(b)
17. F(a), G(b); Therefore, F(c) AND G(d)
18. IF F(x) THEN G(x), IF (G(y) AND H(z,y)) THEN J(z), K(a) AND (F(b) AND H(a,b)); Therefore, K(c) AND J(c)
19. IF F(x,m) THEN F(x,n); Therefore, IF (G(y,z) AND F(z,m)) THEN (G(y,a_y) AND F(a_y,n))
20. [argument not legible]
21. IF F(x) THEN G(a_x); Therefore, IF F(y) THEN G(b)
22. F(a,b) OR G(x,y); Therefore, F(c,d) OR G(c,d)
23. [argument not legible]
24. IF F(x,y) THEN F(y,z); Therefore, IF F(u,v) THEN F(v,u)
25. [argument not legible]























followed on 60% of the trials and that the matched group of invalid arguments did not follow on 79% of trials. Thus, their performance was roughly comparable to the performance on other argument-evaluation experiments discussed above: Although the success rate on valid problems was low, subjects were able to discriminate reasonably well between valid and invalid items. As was also true of those earlier experiments, the percentage of correct responses varied greatly over individual arguments. For example, all subjects responded "follows" to argument 17 of table 7.4 (F(a), G(b); Therefore, F(c) AND G(d)), whereas only 15% of subjects responded "follows" to argument 14.

Argument 14 is, in fact, something of an outlier among the problems in the set, and it suggests a difficulty with our formulation of the IF Introduction rule. The argument looked as follows in the guise in which our subjects saw it:

(14) For any people A and B: if A helps Linda then B is cheerful.
     For any person C, Rob helps C.
     ------------------------------------------------------
     For any person D, if D praises Ben then D is cheerful.

Notice that the antecedent of the conclusion, D praises Ben, has a predicate that is not in the premises, and so the suggested relationship between praising Ben and being cheerful is not one that the premises establish directly. The conclusion does follow, however, by IF Introduction: This rule tells us to assume that D praises Ben and to prove that D is cheerful, and this goes through on a technicality. Since Rob helps everyone (according to premise 2), he helps Linda in particular, thus making everyone cheerful (according to premise 1). So D must be cheerful too. This reasoning seems a bit suspicious, though, for exactly the same reasons as in argument (13) of chapter 2. The fact that the conclusion's consequent is deducible seems too weak a reason to accept the entire conditional. The various pragmatic and logical accounts of conditionals mentioned in chapter 2 suggest ways of accommodating subjects' intuitions.
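That argument (14) is classically valid, even though its conclusion mentions a predicate absent from the premises, can be checked by brute force on a small domain. The sketch below is an illustration rather than part of PSYCOP: it searches every interpretation over a two-member club for a countermodel, and the restriction to two individuals is an assumption that keeps the search small (it does not by itself prove validity for larger domains).

```python
from itertools import product

DOMAIN = (0, 1)  # a club with two members; an assumption of this sketch
PAIRS = list(product(DOMAIN, repeat=2))


def countermodels(use_premise_2=True):
    """Yield interpretations with true premises and a false conclusion."""
    truth_values = (False, True)
    for helps_bits in product(truth_values, repeat=len(PAIRS)):
        helps = dict(zip(PAIRS, helps_bits))
        for cheerful_bits in product(truth_values, repeat=len(DOMAIN)):
            cheerful = dict(zip(DOMAIN, cheerful_bits))
            for praises_bits in product(truth_values, repeat=len(PAIRS)):
                praises = dict(zip(PAIRS, praises_bits))
                for linda, rob, ben in product(DOMAIN, repeat=3):
                    # Premise 1: for any A and B, if A helps Linda then B is cheerful.
                    premise_1 = all(not helps[(a, linda)] or cheerful[b]
                                    for a in DOMAIN for b in DOMAIN)
                    # Premise 2: for any C, Rob helps C.
                    premise_2 = all(helps[(rob, c)] for c in DOMAIN)
                    # Conclusion: for any D, if D praises Ben then D is cheerful.
                    conclusion = all(not praises[(d, ben)] or cheerful[d]
                                     for d in DOMAIN)
                    premises = premise_1 and (premise_2 or not use_premise_2)
                    if premises and not conclusion:
                        yield helps, cheerful, praises, (linda, rob, ben)


# With both premises there is no countermodel on this domain.
print(any(True for _ in countermodels()))
# Dropping premise 2 lets countermodels through.
print(any(True for _ in countermodels(use_premise_2=False)))
```

The second check shows that the conclusion depends on both premises together, which is the "technicality" described above: the premises make everyone cheerful, so the conditional holds whatever its antecedent says.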
Although the model clearly fails with argument 14, it is still interesting to ask how it fares with the remaining problems. To find out, I fitted equation 8 to the response proportions under assumptions similar to those I have discussed for syllogisms. As in the earlier experiment, the model assumes that the probability of finding a proof for an argument is equal to the product of the probabilities that each of the rules in the proof is available. (PSYCOP proves the arguments of table 7.4 with rules drawn



from the following set: Forward AND Elimination, Backward AND Introduction, Backward IF Introduction, Backward IF Elimination, Backward NOT Introduction, and the four matching rules of table 6.1.) The model also assumes that the likelihood is negligible that subjects will find an alternative proof if their first attempt has failed.

However, there are two differences between the syllogism model and the model for these multivariable problems. One is that no assumptions are made in the latter model about presuppositions. The experimental instructions and the wording of the problems were supposed to minimize effects of these factors; although this may not have been completely successful, it is likely that presuppositions play a smaller role than they do in syllogism experiments. The second difference between the models has to do with the matching rules. In fitting the syllogism data, I tacitly assumed that these rules were always available to subjects; but preliminary model fitting suggested that this assumption was unlikely for the new problems. A possible reason for this is that the rules are more difficult to apply to a sentence that contains more than one variable or temporary name. In a syllogism, where there is only one variable or name per sentence, there are never any subscripted names; hence, there can be no need to carry out the b parts of the actions in matching rules 1, 3, and 4 (see table 6.1). The model therefore included parameters for the matching rules as well as for the deduction rules.

The predictions from this model appear next to the observed responses in table 7.4, and the parameter estimates are listed in table 7.5. The correlation between predicted and observed values is 0.806, and the root-mean-square deviation is 9.57 - values that are comparable to those for the syllogism experiment. The rule parameters are also what we would have predicted from the earlier experiments: high availabilities for the AND rules and for IF Elimination, intermediate availability for IF Introduction, and low availability for NOT Introduction. In fact, the parameter estimates come very close to duplicating those of the syllogism experiment for rules that overlap the two sets of arguments (see table 7.2). For IF Elimination the parameter values are 0.90 for the present arguments and 0.85 for syllogisms; for NOT Introduction they are respectively 0.31 and 0.30. The current values for AND Elimination (0.96) and AND Introduction (0.83) are also close to the combined parameter for the AND rules in the syllogism study (0.93). The remaining deduction parameter, the one for IF Introduction, has no counterpart in the previous experiment.

Table 7.5
Parameter estimates for the model of equation (8), fitted to the data of the multivariable experiment.

Parameter             Interpretation                            Estimate

Rule parameters
P_and-e               Probability of Forward AND Elim.          0.96
P_and-i               Probability of Backward AND Intro.        0.83
P_if-e                Probability of Backward IF Elim.          0.90
P_if-i                Probability of Backward IF Intro.         0.66
P_not-i               Probability of Backward NOT Intro.        0.31

Matching parameters
P_m1 = P_m3 = P_m4    Probability of matching to variables      0.62
P_m2                  Probability of matching to names          1.00

Response parameters
P_g                   Probability of guessing                   0.70

The rest of the parameters also tell an intelligible story about these arguments. Preliminary model fitting indicated that, under a variety of assumptions, values for matching rules 1, 3, and 4 were always quite close to one another and were considerably lower than that for matching rule 2. (This is also consistent with the response-time differences among the one-rule-two-step arguments in the previous experiment; see table 7.3.) For that reason, I have fitted a single value for the former rules in the model reported in tables 7.4 and 7.5. Notice that matching rules 1, 3, and 4 are the ones responsible for generalizing subgoal terms to assertion variables, whereas matching rule 2 generalizes terms to names.
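The way rule availabilities combine into a predicted response rate can be illustrated with a short Python sketch. Equation 8 itself is not reproduced in this section, so the form used here is an assumption: the probability of a proof is the product of the availabilities of its rules, and a subject who finds no proof says "follows" with the guessing probability. The availability values are the estimates quoted in the text; the example proof and the function names are hypothetical, and matching-rule parameters are left out.

```python
from math import prod

# Availability estimates quoted in the text for the multivariable experiment.
AVAILABILITY = {
    "and_elim": 0.96,
    "and_intro": 0.83,
    "if_elim": 0.90,
    "not_intro": 0.31,
}
P_GUESS = 0.70  # estimated probability of guessing "follows"


def p_proof(rules, availability=AVAILABILITY):
    """Probability that every rule needed for the proof is available."""
    return prod(availability[rule] for rule in rules)


def p_follows(rules, p_guess=P_GUESS):
    """Assumed response model: find a proof, or else guess."""
    found = p_proof(rules)
    return found + (1.0 - found) * p_guess


# A hypothetical proof that needs AND Introduction and IF Elimination.
example = ["and_intro", "if_elim"]
print(round(p_proof(example), 4))    # 0.747
print(round(p_follows(example), 4))  # 0.9241
```

Because the availabilities multiply, a proof that depends on a low-availability rule such as NOT Introduction pulls the predicted rate down sharply, while the guessing term keeps predictions from falling below the guessing probability.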
Part of the reason for the difference in parameter values may be that the matching-to-variable rules require subjects to keep track, not only of the main term that the variable will replace, but also of other temporary names in the subgoal (see condition d and action b of rules 1, 3, and 4 in table 6.1). Rule 2 is simpler in this respect, and it tends to be used in the present proofs to match a subgoal to an assertion that is a notational variant (e.g., to match G(b) to G( Ii ) in problem 8).

The only clear disparity between the two experiments is that the parameter for guessing in this experiment (0.70) is larger than the parameters for guessing in the syllogism experiment (0.08 and 0.31). This disparity may be due to differences in the sample of arguments or, perhaps, to our having warned the subjects in the previous study that only a small percentage of the syllogisms were valid. In that situation, subjects may have been more willing to think that an argument for which they had no proof simply



didn't follow. The subjects in the present experiment lacked such assurance and may have been more likely to hazard a guess.

Summary

We have been aiming for a unified account of deductive reasoning, one that can explain how people handle arguments that depend on sentential connectives and on predicate-variable structure. Most previous cognitive models have concentrated on just one of these argument types, and so explanations for inferences with connectives tend to look quite different from explanations of inferences with syllogisms. This is a state of affairs we should avoid, if we can, since it misses some generalizations. It is an empirical fact that variables that affect reasoning with conditionals have similar effects on reasoning with universals - something that is hard to understand if your theory says that people represent universals with Euler circles and conditionals with truth tables. Moreover, it is obvious that many arguments depend jointly on connectives and variables. The arguments in table 7.4 are examples of just this sort, and it is not at all clear how earlier disparate representations can hybridize to yield a mechanism that would explain these problems. Ideally, we would like a theory that can handle arguments with connectives (e.g., those of table 5.1), categorical arguments (e.g., those of table 7.1), and arguments with mixed structure (e.g., those of table 7.4).

The theory of chapter 6 gives us an approach that may help us explain all these problem types. The basic representation uses connectives explicitly and captures quantifiers implicitly through the structure of the quantified terms. The basic mechanisms are the rules for connectives, which also pass along terms during the proof. The experimental results of this chapter provide some support for the theory. It does reasonably well with syllogisms - psychologists' favorite argument forms - provided that we allow for presuppositions of the categorical sentences. Of course, we don't have an explanation for every effect that researchers have demonstrated in experiments with syllogisms.
(We will return to some of these additional effects in chapter 10.) But the theory does seem to account for much of the variation among syllogisms whose subject matter isn't strongly biased, and it operates fairly accurately both in contexts in which subjects evaluate conclusions and in contexts in which they produce conclusions from premises. Furthermore, the results of the last two



experiments suggest that the same model applies to nontraditional arguments that contain more than one term per sentence.

The parameter estimates from these studies enhance our hopes for a unified theory. Parameter values for the same deduction rules are quite consistent from the syllogisms to multivariable arguments, and there is also substantial agreement with the parameters from our study of sentential reasoning in chapter 5. In all three sets of estimates, the introduction and elimination rules for AND have highest availabilities, followed by the rules for IF. NOT Introduction has consistently low availability in each experiment (see tables 5.2, 7.2, and 7.5). The stability of the parameters across different types of problems, different wording conventions, and different groups of subjects suggests that the success of our model is not merely due to local features of the data sets, but rather that it reflects deeper properties of inference.

Appendix: Syllogism Results

Table 7.6 gives the percentages of "yes" responses for all classical syllogisms. (Bold entries are observed responses; lightface entries are predictions from the model described in the text; n = 20.)




( AII(H,G),AII(GiF)> ( AII(G,H),No(F,G)> ( AII(H,G),No(F,G) ( AII(G.H),No(GiF ( AII(H.G),No(GiF (F.G) ( AII(G.H).Some (F.G) ( AII(H.G).Some (G.F)> ( AII(G.H).Some (GiF)> ( AII(H.G),Some -not(F.G)> ( AII(G.H).Some -not(F.G)> ( AII(H.G).Some -not(G.F) > ( All(G.H).Some -not(GiF)> ( AII(H.G).Some ( No(G.H).AII(F.G)> ( No(H.G).AII(F.G)> ( No(G.H).AII(G.F)> ( No(H,G).AII(GiF) ( No(G.H).No(FiG)

.u ,-II

( AII(G,H),AII(GiF)>

Some not(F,H}


40.0 5.0 2S .0 5.0 30.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 5.0 5.0 5.0 5.0 0.0 5.0 5.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 5.0 5.0

Some (F,H)


( AII(H,G),AII(F,G)>


A WJIn ~ ~~Or

AI1 (F,H) .oa.b.c 9O 89.1


Premises ( AII(G,H),AII(F,G)>





Table 7.6 (continued) Conclusion

( No(H,G),No(GiF ( No(G,H),Some (FiG ( No(H,G),Some (FiG ( No(G,H),Some (GiF)) ( No(H,G),Some (G,F -not(FiG ( No(G,H),Some -not(F,G ( No(H,G),Some -not(G,F ( No(G,H),Some -not(G,F ( No(H,G),Some ( Some (G,H),AII(F,G ( Some (H,G),AII(F,G ( Some (G,H),AII(GiF

( Some (G,H),No(FiG ( Some (H,G),No(FiG ( Some (G,H),No(GiF ( Some (H,G),No(GiF ( Some (G,H),Some (F,G ( Some (H,G),Some (F,G ( Some (G,H),Some (G,F))

5.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0


( Some (H,G),AII(GiF))


5.0 15 .0 5.0 36.0 5.0 5.0 5.0 5.0 5.0 10 .0 5.0 5.0 5.0 0.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 0.0 5.0 15 .0 5.0 5.0 5.0 5.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0


Some not(F,H) .O


Some (F,H)


( No(G,H),No(GiF

5.0 5.0 5.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 5.0 5.0



( No(H,G),No(FiG)}






Table 7.6 (continued) Conclusion Premises

(H.G).Some (GiF ( Some -not(F.G)> (H.G).Some ( Some -not(F.G)> ( Some (G.H).Some -not(GiF)> (H.G).Some ( Some -not(G.F)> (G.H).Some ( Some -not(G.H).AII(F.G) > ( Some -not(H.G).AII(F.G)> ( Some -not(G.H).AII(G.F)> ( Some -not(H.G).AII(GiF) > ( Some -not(G.H).No(F.G)> ( Some -not(H,G).No(F,G)> ( Some -not(G,H).No(G,F)> ( Some -not(H,G),No(G,F)> ( Some -not(G,H).Some(F,G)> ( Some -not(H,G).Some(F,G)> ( Some -not(G,H).Some(G,F)> ( Some -not(H,G).Some(GiF) > ( Some -not(G,H).Some -not(F,G)> ( Some -not(H,G).Some -not(F,G)> ( Some -not(G,H).Some -not(G,F) > ( Some -not(G,F)> -not(H,G).Some ( Some

AII(F.H) 0.0 5.0 0.0 5.0 0.0 5.0 5.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 0.0 5.0 5.0 5.0 0.0 5.0 0.0 5.0 s.o 5.0 0.0 5.0 U 5.0 U 5.0 0.0 5.0 U 5.0 0.0 5.0 0.0 5.0 U 5.0 0.0 5.0



Snme - - ---- not(F.H)

0.0 5.0 0.0 5.0 5.0 5.0 0.0 5.0 0.0 5.0 5.0 5.0 5.0 5.0 0.0 5.0 0.0 5.0 10.0 5.0 5.0 5.0 0.0 5.0 5.0 5.0 U 5.0 0.0 5.0 U 5.0 U 5.0 U 5.0 U 5.0 U 5.0 U 5.0

30.0 15.0 20.0 15.0 10.0 15.0 15.0 15.0 15.0 15.0 20.0 15.0 15.0 15.0 35.0b,c 47.3 JO .ob 43.7 5.0 15.0 5.0 15.0 IU 15.0 2S .o 15.0 15.0 15.0 2S .o 15.0 20.0 15.0 2S .o 15.0 20.0 15.0 15.0 15.0 10.0 15.0 15.0 15.0

30.0 15.0 15.0 15.0 10.0 15.0 20.0 15.0 10.0 15.0 30.0 15.0 3S .O 15.0 8 .076.0 15.0 15.0 10 .0 15.0 10 .0 15.0 15.0 15.0 10 .0 15.0 30.0 15.0 15 .0 15.0 15 .0 15.0 15 .0 15.0 % 5.0 15.0 % 0.0 15.0 .0 .10 15.0 15.0 15.0

a. Valid in CPL.
b. Valid with premise implicatures (but not conclusion implicatures).
c. Valid with premise and conclusion implicatures.


Implications and Extensions


The Role of Deduction in Thought

Our texture of belief has great holes in it.
M. F. K. Fisher, How to Cook a Wolf

One reason that deduction has played a rather minor role in cognitive psychology is that it has been hard for psychologists to envision what purpose deduction serves. If there are no specifically deductive mechanisms and if what pass for deductions are just the results of general-purpose heuristics, then of course deduction has no proper use; it has been explained away. Similarly, if it is just a matter of diagram manipulation, then again deduction is only a special case of a more general-purpose mechanism. What is really of interest is the general mechanism, not deduction itself. I will argue in a later chapter that both the heuristics approach and the diagram/model approach fail, for internal reasons, to give a correct explanation of inference. If this is right, then deduction may have a bigger part to play in cognitive theory.

I touched on the theme that deduction has a role in other cognitive skills back in part I, and I'd like to return to it here. The crucial ingredient that we need to realize this potential is the ability of our model to bind values to variables, because that gives it the power to manipulate symbols in memory and thus to perform cognitively useful tasks. It seems clear, at least in retrospect, that the inability of previous deduction theories to instantiate variables is what makes them so anemic and useless. In the first section of this chapter, we will look at some ways in which the instantiating of variables can be combined with other logical operations to provide a model for cognitive processes such as simple problem solving and categorization.

Categorization poses an especially interesting problem for the present theory, since beliefs about category membership - for example, the decision that a perceived object is a birthday present - usually aren't deducible from the evidence available to us.
More often, the evidence provides inductive warrant for categorizing, just as in more obviously judgmental situations (e.g., whether I should bring red or white wine to a dinner party, whether Harry Aleman is guilty or innocent, or whether a particular scientific theory is true or false). But if so, how can we handle such decisions on our deduction-based approach?

AI has faced very similar questions about how to handle nondemonstrative (i.e., nondeductive) inferences, and we will examine some of its responses in the second section. The problem comes up most clearly in reasoning with defaults or typical properties. For example, we know that,


Chapter 8

by and large, cups have handles. (Perhaps this information is part of a mental schema or frame or mini-theory that we have about cups.) Thus, if we are told that Calvin sipped tea from a cup, we assume quite reasonably that the cup had a handle. If we later learn that the cup was Chinese or styrofoam, however, we are likely to reverse our decision, and this "defeasibility" strongly suggests that our original conclusion can't have been deduced from a fixed set of axioms. Defeasible inferences must be extremely common in everyday thinking, and any general theory in AI or psychology must accommodate them. One possibility is to pass the buck to schemas, frames, or similar data structures and let them do the inferencing. Another is to revise the logical system in order to take these inferences into account, a possibility that has motivated "nonmonotonic" logics in AI. In what follows I will try to suggest, though, that schemas don't provide any special inference abilities, and that nonmonotonic logics are suspect as psychological theories. In contrast, the monotonic system developed in part II, although it certainly doesn't provide a general theory for inductive inferences, is at least consistent with them and can supply all the purely logical machinery that they require.

Deduction as a Cognitive Architecture

It is worth exploring how our extended model could be useful in tasks that go beyond those of standard reasoning experiments. At the end of chapter 3, we looked at some ways in which deduction might come into play in general problem solving. PSYCOP's new ability to manipulate variables gives it everything it needs to implement these examples. Rather than replay these illustrations, however, it might be more instructive to see whether PSYCOP can handle tasks that cognitive psychologists have investigated directly, such as fact retrieval and classification. The aim is not to provide a new detailed model of how people accomplish these activities: PSYCOP is abstract enough to be consistent with many alternative models. Instead, the point is simply to be sure PSYCOP is capable of carrying out these processes in a human-like way. I have picked these two examples, out of the many tasks that cognitive psychologists have studied, on the grounds that enough facts are known about them to provide some constraints on theorizing and that they don't obviously embody deductive reasoning. To the extent that PSYCOP can provide a framework for these tasks, it seems likely that it can do the same for many other cognitive processes.



Problem Solving by Deduction

Let us take as an initial example PSYCOP's performance on an experimental task, devised by J. R. Hayes (1965, 1966), that calls for memorizing and retrieving information. In Hayes' study, subjects first memorized a list of paired associates, which were supposed, according to the experiment's cover story, to represent code names of pairs of spies. Each pair designated two spies who were able to communicate; thus, the pair Tango-China meant that the spy named Tango could talk to the one named China. Figure 8.1 shows a possible list of pairs of this sort. Notice that the pairs, taken together, form a "spy network"; for example, the pairs on the left in figure 8.1 constitute the network in the diagram on the right (though the subjects saw only the pairs, not the network). In the test phase of the experiment, subjects received instructions like "Get a message from Tango to Larynx," and they were to solve the problem by saying aloud the path that the message would take (e.g., "Tango to China, China to Shower, Shower to Larynx").

Problem Representation. To represent Hayes' problem in the PSYCOP framework, the program must know both the relevant pairs and the definition of a message pathway. We can use the predicate Talk-to as a way of expressing the fact that two spies can communicate directly; thus, let us assume that the elements of the paired-associate list are stored in memory as the assertions Talk-to(Tango,China), Talk-to(China,Shower), and so on. To record the fact that there is a pathway between two spies along which a message can travel, we need a second predicate (say, Path); so Path(Tango,Larynx) will mean that there is a path between Tango and Larynx. PSYCOP can then use this predicate to represent the experimental problem as a goal; for example, Path(Tango,Larynx)? represents Is there a path between Tango and Larynx?
Solving the problem amounts to searching the Talk-to pairs to satisfy the goal, and this requires an explicit way to recognize which pairs constitute a path. Clearly, if one spy can Talk-to another then there is a trivial, one-link path between them. PSYCOP can represent this simple case by the conditional assertion (1a).

(1) a. IF Talk-to(u,v) THEN Path(u,v)
    b. IF Talk-to(x,y) AND Path(y,z) THEN Path(x,z)
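Read declaratively, the two conditionals in (1) fix which Path sentences follow from a given set of Talk-to pairs. The sketch below computes that set by brute force. It is only an illustration of what (1) entails, not of PSYCOP's goal-directed procedure, and the four pairs are a hypothetical fragment chosen to match the pairs mentioned in the text (the full list of figure 8.1 is not assumed).

```python
# Hypothetical fragment of the paired-associate list (not the full figure 8.1 list).
TALK_TO = [
    ("Tango", "China"),
    ("China", "Hill"),
    ("China", "Shower"),
    ("Shower", "Larynx"),
]


def path_closure(talk_to):
    """Return every pair (x, z) for which (1a) and (1b) entail Path(x, z)."""
    # (1a): IF Talk-to(u,v) THEN Path(u,v)
    paths = set(talk_to)
    changed = True
    while changed:
        changed = False
        # (1b): IF Talk-to(x,y) AND Path(y,z) THEN Path(x,z)
        for x, y in talk_to:
            for y2, z in list(paths):
                if y == y2 and (x, z) not in paths:
                    paths.add((x, z))
                    changed = True
    return paths


PATHS = path_closure(TALK_TO)
print(("Tango", "Larynx") in PATHS)  # Path(Tango,Larynx) is entailed
print(("Hill", "Larynx") in PATHS)   # no path leads out of Hill in this fragment
```

Computing the whole closure does more work than the task requires; the experimental problem asks about a single goal, which is why a backward, goal-driven search is the more natural fit.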


[Figure 8.1: a possible list of spy pairs (left) and the network they form (right). Figure not reproduced.]



Of course, a path will usually consist of more than one pair, but we can simplify the problem by taking it a step at a time. We can find a path from Tango to Larynx, for example, by first finding someone Tango can talk to and then finding a path from that intermediate spy to Larynx. Assertion (1b) represents this stepwise method: There is a path between any two spies x and z if x can talk to a third spy y and if there is a path from y to z. In order to solve the spy problem, all PSYCOP needs are the Talk-to pairs (i.e., the paired-associate list), the goal, and the assertions in (1) about the relation between Talk-to and Path. To handle this problem we need make only minimal assumptions about memory storage. It is enough to suppose that the Talk-to relations are accessible in long-term or short-term memory, and that PSYCOP can retrieve them as part of the process of looking for matches to its subgoals. For example, a subgoal like Talk-to(Shower,c)? should be able to retrieve any Talk-to sentence in which Shower is the first argument. This is similar to Norman and Bobrow's (1979) notion of retrieval by partial description.

Problem Solution. Given the assertions and the goal, PSYCOP looks for a way to derive the goal from the assertions, using its general rules. Figure 8.2 shows the steps it takes in solving the problem when the spy pairs have the configuration of figure 8.1. As we will see, the order of the system's search depends on the order in which the assertions are found in memory. For purposes of this illustration, we have assumed that the order of the Talk-to assertions is the same as the ordering of the pairs in figure 8.1, and that (1a) occurs before (1b). Changing the order of the Talk-to relations will affect the order in which PSYCOP searches the links, but will not keep it from solving the problem. The same is true of (1a) and (1b); PSYCOP will solve the problem when these assertions are reversed, though the solution will be much less direct.
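The dependence of the search on memory order can be made concrete with a small backward-chaining sketch. It is a simplification under stated assumptions, not PSYCOP itself: the pair list is a hypothetical fragment, the function names `retrieve` and `find_path` are invented for the illustration, the trace records only the Path and direct Talk-to subgoals (it omits the conjunctive subgoals and does not reproduce the subgoal numbering of figure 8.2), and the loop guard is an addition of the sketch.

```python
# Hypothetical fragment of the spy list, in memory order (not the actual figure 8.1 list).
TALK_TO = [
    ("Tango", "China"),
    ("China", "Hill"),
    ("China", "Shower"),
    ("Shower", "Larynx"),
]


def retrieve(first, memory):
    """Retrieval by partial description: every pair whose first argument matches."""
    return [pair for pair in memory if pair[0] == first]


def find_path(start, goal, memory, trace, ancestors=frozenset()):
    """Backward search for Path(start, goal); returns a list of links or None."""
    trace.append(f"Path({start},{goal})?")
    # Assertion (1a): a direct Talk-to link is a one-link path.
    trace.append(f"Talk-to({start},{goal})?")
    if (start, goal) in memory:
        return [(start, goal)]
    # Assertion (1b): talk to some intermediate spy, then find a path from there.
    seen = ancestors | {start}
    for _, mid in retrieve(start, memory):
        if mid in seen:  # loop guard; an addition of this sketch
            continue
        rest = find_path(mid, goal, memory, trace, seen)
        if rest is not None:
            return [(start, mid)] + rest
    return None  # dead end: the caller backs up and tries its next pair


trace = []
route = find_path("Tango", "Larynx", TALK_TO, trace)
print(route)
print(trace)
```

With the pairs in the order shown, the trace visits Hill before Shower; moving the China-Shower pair ahead of the China-Hill pair yields the same route with a shorter trace, which is the sense in which order affects the search but not its success.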
(In the next example, order will be more crucial.) The numbered subgoals in figure 8.2 give the order in which PSYCOP considers these subgoals as it solves the problem; figure 8.2a is the initial segment of the solution and figure 8.2b is the finale. The figure shows the assertions only when they directly match the subgoals (as indicated by double lines). The main goal of the problem is the one shown at the bottom of figure 8.2a: Is there a path between Tango and Larynx? There is no direct assertion about any such path; so PSYCOP has to use its inference rules to see if it can find one. In the present case, Backward IF Elimination notices


[Figure 8.2: PSYCOP's solution to the problem of tracing through the spy network of figure 8.1 from Tango to Larynx. Panel a is the initial part of the solution. Panel b shows the continuation, after the program has returned to subgoal 8. Figure not reproduced.]


that the goal Path(Tango,Larynx)? matches the consequent of the conditional assertion (1a). The rule recognizes that in this situation the goal will be fulfilled if the antecedent of the conditional holds; that is, if Talk-to(Tango,Larynx) holds, it must be the case that Path(Tango,Larynx) holds. The IF Elimination rule stipulates that the main goal is fulfilled if PSYCOP can establish the subgoal of proving Talk-to(Tango,Larynx)?, which becomes subgoal 2 in the figure. Unfortunately, though, Talk-to(Tango,Larynx) isn't among the pairs PSYCOP knows, and there is no indirect means of establishing it. So subgoal 2 fails outright.

PSYCOP needs some alternative way to get at subgoal 1, and assertion (1b) suggests a second possibility. The same IF Elimination rule notices that it can also fulfill the goal by showing that there is a third spy c1 to whom Tango can talk (i.e., Talk-to(Tango,c1)) and from whom there is a path to Larynx (Path(c1,Larynx)). Subgoal 3 represents this conjunctive subgoal, an instantiated version of the IF part of assertion (1b). (IF Elimination substitutes the temporary name c1 for y when it applies the argument-reversal procedure to the antecedent; see table 6.3.) PSYCOP splits this two-part goal in half using backward AND Introduction. It first tries subgoal 4: Talk-to(Tango,c1)? This subgoal succeeds easily, since it matches several of the Talk-to pairs. The first one it finds is Talk-to(Tango,China), which takes it one link along the correct solution path. However, PSYCOP must still establish that there is a path between China and Larynx, and it takes this as subgoal 5. At this point, then, PSYCOP must deal with Path(China,Larynx)?, which has the same form as the main goal that it started with, Path(Tango,Larynx)?
In doing this, it follows exactly the same procedure it used before, first checking unsuccessfully for a one-link connection between China and Larynx (subgoal 6) and then looking for an indirect path (subgoal 7). This involves finding someone to whom China can talk (subgoal 8). But this time PSYCOP unluckily finds Hill as a possibility, since Talk-to(China,Hill) is the first such pair it locates in memory. As is apparent in figure 8.1, however, this choice leads to a dead end: Apart from China, there is no one to whom Hill can talk; thus, there is no way to show that Hill can talk directly to Larynx (subgoal 10) and no way to find a path from Hill to Larynx (subgoals 11-13). Since all these subgoals fail, PSYCOP must back up and try another route.1

Figure 8.2b illustrates PSYCOP's progress when it gets back on track. (The unsuccessful subgoals from figure 8.2a are omitted to give a clearer



view of the solution strategy.) The system returns to subgoal 8, the point at which it last retrieved a Talk-to assertion. This time, it satisfies Talk-to(China,c2)? by finding Talk-to(China,Shower), and then attempts Path(Shower,Larynx)? (subgoal 14 in figure 8.2b). For the earlier subgoals, assertion (1a) was of no help, since there were no one-link pathways among the spies that these subgoals mentioned. For Path(Shower,Larynx)?, however, that assertion turns out to be crucial; it allows PSYCOP to satisfy the subgoal if it can show that Talk-to(Shower,Larynx)? (subgoal 15). Because this is one of the pairs that PSYCOP studied, it can satisfy the new subgoal with a direct match. At this point, then, the solution is complete in that PSYCOP has found a connected path from Tango to Larynx. Reading from top to bottom in figure 8.2b, we find that Talk-to(Shower,Larynx) satisfies Path(Shower,Larynx)?; this in turn satisfies Path(China,Larynx)? in view of the fact that China can talk to Shower; finally, we can conclude that Path(Tango,Larynx), since we have found that Tango can talk to China. PSYCOP can stop, its task accomplished.

Adequacy. Hayes' (1965) data are in reasonable agreement with the way in which PSYCOP solves such problems. First, solution time increased with the length of the correct pathway, and this is obviously consistent with PSYCOP's step-by-step method. Second, solution time also increased with the number and length of the dead ends (e.g., the one from China to Hill) that run from the junction points along the correct route. This is analogous to Anderson's (1976, 1983) "fan effects" and to other interference phenomena in memory. PSYCOP exhibits the same behavior, since it simply chains from its current position to a neighboring one, sometimes entering the dead ends. Once it heads down a dead end, it will continue until it reaches the terminus and is forced to back up.
Third, Hayes' subjects carried out the final step of the problem more quickly than the preceding steps. PSYCOP does too: On each step it first tries for a quick solution via (1a) and then resorts to (1b); on each step but the last, the quick solution fails.2 The purpose of our example, however, is not to model the results precisely but simply to demonstrate the main features of PSYCOP's performance in a cognitive task.

Deduction in Classification

Hayes' experiment is conceptually simple, since the definition of the problem and the information that subjects need to solve it are mostly specified



as part of the task. Matters become less clear-cut, however, when we turn from fact retrieval to categorization. Categorization (or classification) is the process of recognizing something as part of a larger category. This includes assigning individual entities (or pictures of them) to specified groups, for example, deciding whether a particular object is an apple. However, psychologists also use "categorization" to include judgments about subset-superset relations, such as whether apples are fruits. In these examples the categories are ones that subjects know before the experiment, but investigators can, of course, construct artificial categories that subjects must learn during the task session. Theories in this area focus on the way people represent information about categories in memory and the processes they use to decide on category membership. In looking at categorization from the PSYCOP perspective, we will focus on a situation in which a person classifies a perceived object as a member of a well-known category, since this case seems central to the field and preserves nearly all its interesting issues.

Classification as Deduction. Suppose you believe that an object is a bird if it has feathers. If you need to determine whether a feathered creature named George is a bird, you can then use the argument shown here as (2).

(2) IF Feathered(x) THEN Bird(x)
    Feathered(George)
    ----------
    Bird(George)

The argument is simple, and PSYCOP can easily categorize George in this way. But of course the first premise of (2) isn't strictly true, since arrows, pillows, dusters, and other objects can have feathers without thereby being birds; thus, the conclusion of (2) needn't be true either, despite the evident validity of the argument. If we want to ensure that this deductive method categorizes things correctly, we need premise information that is sufficient for category membership. However, sufficient properties are notoriously hard to find for all but a handful of natural-language categories (see, e.g., Fodor 1981; Rosch 1978; Smith and Medin 1981).3

These considerations, however, aren't enough to show that there is anything wrong with (2) as a description of how people classify birds. All they show is that people will sometimes make incorrect classifications (false alarms) if they accept the corresponding inference. And making mistakes in categorizing is not that uncommon; it is exactly what we would expect



from nonexperts. What would cast doubt on a deductively correct argument like (2) as a classification method is, not the absence of sufficient properties, but the absence of people's belief in them. In order to use (2) in classifying George, people would have to accept the conditional premise. If they don't believe the premise, then (2) is useless for classification, despite its validity.

Do people believe there are sufficient properties for ordinary categories? Of course, people aren't likely to accept something as obviously faulty as the first premise of (2) and be duped into thinking that arrows are birds; but then (2) doesn't give a very realistic set of properties. By adding other bird characteristics to the antecedent of the first premise, we might be able to cut the false-alarm rate to a more reasonable size. We are not concerned with the conditional in (2) itself, but with the possibility that there is some conditional that could fill its role. If there is any nontrivial conditional with Bird(x) as its consequent that people take as true, then we might be able to use a deductive argument as a classification device in situations where the antecedent of the conditional is fulfilled. (The conditional must be nontrivial, since there are obviously sentences (e.g., IF Bird(x) THEN Bird(x)) that are of no help in categorizing.)

Some evidence from McNamara and Sternberg (1983) suggests that people do think they know sufficient properties. These investigators asked subjects whether they could identify individual properties or sets of properties that are sufficient for membership in eight natural-kind and eight artifact categories. As it turned out, the subjects named properties they deemed sufficient for each of the natural kinds and nearly all the artifacts. Examples are "bird that appears on quarters" for eagle and "light source that has a shade" for lamp.
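The point about false alarms, and about cutting them down by strengthening the antecedent, can be sketched in a few lines. Everything specific here is an assumption made for the illustration: the objects, the property names (such as "beaked" and "winged"), and the helper functions are hypothetical and do not come from the studies cited above.

```python
# Hypothetical objects and properties, for illustration only.
OBJECTS = {
    "George": {"feathered", "beaked", "winged"},
    "arrow": {"feathered", "pointed"},
    "pillow": {"feathered", "soft"},
    "duster": {"feathered", "handled"},
}
ACTUAL_BIRDS = {"George"}


def classified_as_bird(properties, antecedent):
    """Apply IF antecedent(x) THEN Bird(x): the rule fires only if every antecedent property holds."""
    return antecedent <= properties


def false_alarms(antecedent):
    """Non-birds that the conditional nevertheless classifies as birds."""
    return sorted(
        name
        for name, props in OBJECTS.items()
        if classified_as_bird(props, antecedent) and name not in ACTUAL_BIRDS
    )


print(false_alarms({"feathered"}))            # the bare premise of (2)
print(false_alarms({"feathered", "beaked"}))  # a richer antecedent
```

In this toy memory the bare conditional of (2) misclassifies every feathered artifact, while the two-property antecedent misclassifies none; whether any such antecedent exists for real categories is exactly the empirical question at issue.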
This suggests that in some circumstances an inference along the lines of (2) might not be implausible as a classifying mechanism. These circumstances must be rather limited, however. We may be able to use light source with shade to determine that something is a lamp if we hear or read this description; however, it isn't going to be much help in perceptually classifying objects, because it relies on light source, a category whose members are probably no easier to classify than lamp itself. Ditto for eagle, which relies on bird. So we must impose additional restrictions on the antecedent in (2) to make it useful for perceptual classification.

Classification as Induction. It is certainly not impossible that deductively correct arguments along the lines of (2) could play a part in categorization.



The discussion so far suggests that whether they do is a more subtle question than it might at first appear. Of course, if they do, then we have all we need to explain how PSYCOP could simulate categorization. But consider the alternative possibility. Suppose that, on at least some occasions, people categorize things on the basis of evidence that they themselves consider inconclusive (Rips 1989b; Smith 1989). We take the feathered object on the sill to be a bird, even though we're quite aware that the feathers (and the other properties we have noticed) don't guarantee birdhood, that there are objects that have these properties but aren't birds at all. Still, feathers make it likely that the object is a bird, and this may be all the certainty we demand in this situation. It seems undeniable that people sometimes engage in this kind of plausible inference, especially in contexts where the information and the time available for the decision are limited. We therefore need to ask how a deductive model like PSYCOP can handle these nondeductive judgments.

Maybe we can salvage this situation if we turn (2) on its head. Although we may not be able to deduce directly that George is a bird, we may be able to deduce the observable properties of George on the assumption that he is a bird. This is illustrated by (3).

(3) IF Bird(x) THEN Feathered(x)
    Bird(George)
    ----------
    Feathered(George)

The strategy here is similar to the "hypothetico-deductive" method in the philosophy of science. If we can deduce the facts about George from the hypothesis that he is a bird, then we can induce the truth of that hypothesis. In order for this strategy to work, of course, people would have to believe that the properties in question are necessary for category membership. The conditional in (3) makes feathered necessary in exactly the same way that the conditional of (2) made feathered sufficient. But perhaps necessary properties aren't as difficult to find as sufficient ones; for example, it seems more reasonable to attribute to people the belief that all birds are feathered than the belief that all feathered things are birds. (There is some resemblance here to the use of deduction in AI systems for "explanation-based learning" (see, e.g., DeJong 1988 and Minton et al. 1990). However, in these systems proof is used, not to classify instances directly, but to determine why a given category member - a training



example - satisfies the prespecified criteria for that category. These systems then generalize the derived explanation on an inductive basis to classify further instances.)

However, this second categorization strategy suffers from the same defects as the hypothetico-deductive method. Being able to deduce a true proposition from a hypothesis generally isn't enough to show that there is good evidence for that hypothesis (see, e.g., Osherson et al. 1986 and Salmon 1967). In (4), for example, we can correctly conclude that the property of being cold is true of Siberia from the "hypotheses" in the premises.

(4) IF Bird(x) THEN Cold(Siberia)
    Bird(George)
    ----------
    Cold(Siberia)

Moreover, Cold(Siberia) is something we can observe to be true. But (4) clearly gives us little confidence in Bird(George), even though we can use this premise to deduce a true conclusion.

Another possible solution to the categorization problem would be to show that a deduction system like PSYCOP can indirectly implement the types of models that already exist in the categorization literature. In this case, we have an array of possibilities to choose from. (See Smith and Medin 1981 and Medin 1989 for reviews.) Most of these models are based on the idea that categorizing is a matter of computing the similarity between the item to be classified and representatives of its potential categories. For example, we might classify the object on the sill as a bird if it is sufficiently similar to a prototypical bird (Posner and Keele 1968; Reed 1972) or to previously encountered bird instances (Brooks 1978; Hintzman 1986; Medin and Schaffer 1978; Nosofsky 1986), or if it has sufficient "family resemblance" to these instances (Rosch 1978). There is little doubt that PSYCOP could duplicate the behavior of such theories; the problem is that none of these similarity-based models is completely satisfactory for natural categories (Rips 1989b, 1991).
There are demonstrations that subjects will classify an instance as a member of category A rather than category B even when they judge the very same instance more similar to B than A (Rips 1989b; Rips and Collins, in press). Carey (1985), Gelman and Markman (1986), and Keil (1989) have produced related evidence that subjects will sometimes overlook similarity in making category judgments.



Unless the notion of similarity is gerrymandered in such a way that any attempt to consult the properties of the category and of the to-be-classified instance counts as determining their "similarity," similarity can't be the only relevant factor in categorizing.

Although I believe that neither the pure deductive approach (in either its direct or its inverse version) nor the similarity approach will work for natural categories, there is something right about each of them. On one hand, similarity-based theories seem right in supposing that classification depends in part on the goodness of fit between the category and the instance. Goodness of fit in this case may be relative to a contrast class of mutually exclusive categories. For example, in classifying George the contrast class might consist of the categories of things commonly found on sills; in other situations, it might be the set of basic-level animal categories (bird, mammal, fish, etc.) that this specimen might fall under. On the other hand, what is right about the deduction theories is that deductive inference may play a role in directing the classification process. This role must be more indirect, however, than either deducing category membership from known properties or deducing properties from the category.

Classification as Explanation. What both the deductive and the similarity theories are missing is that classifying the instance in the way we do helps us solve a problem. The instance confronts us with an array of properties in a particular combination that we have likely never experienced. By classifying the instance as a member of a category, we can explain many of these properties and many more that are unobserved or unobservable. This idea of classification as explanation is akin to the thought that classifying things is a matter of predicting their behavior or seeing how they might serve practical goals (see, e.g., Anderson 1990 and Holland et al. 1986).
But I doubt that most of the classifying we do is tied to prediction or goal satisfaction to any significant extent, unless of course the goal is simply to know more about the instance. Classifying a plane figure as an isosceles triangle, for example, tells us something of its properties, but it would be odd to say that it allows us to predict them or to use them to serve our goals. The same goes for classifying a piece of music as eighteenth-century or a painting as abstract expressionist. Even classifying George may be mainly a theoretical enterprise on our part. Of course we do sometimes classify things for practical purposes, but this seems unlikely to be the basic motive.



Thinking about classification as explanation doesn't take us very far without some way of explaining explanation, and adequate general theories of explanation are not much in evidence. (See Salmon 1989 and chapter 5 of Van Fraassen 1980 for reviews of previous theories of scientific explanation.) Still, I think some headway can be made if we look at simple cases of categorization for natural kinds such as bird. For these categories, it is reasonable to suppose that what is in back of many of the instances' manifest properties are causal relations to whatever qualifies the instances in the first place as members of the kind. Obviously the details of the causal story are mostly unknown to us nonbiologists, and before the advent of modern genetic theory any available stories may have been largely false. But the fact that we believe there is a causal connection between being a member of a species and manifesting certain properties may itself be enough to go on (Medin 1989). Determining which natural kind an instance belongs to seems to entail (a) finding out whether there are causal links between membership in one of these kinds and some of the manifest properties of the instance and (b) making sure no other kind in the same contrast class provides a better causal account.

Perhaps we could dig deeper and attempt to determine what people believe about how membership in a category causes the properties of interest, but I won't pursue that project here. Since our focus is on categorization rather than on recognition of causality, we can take the causal links as given in long-term memory and see how PSYCOP might use them in determining category membership.

A Causal Approach to Categorizing. Categorization, according to our causal theory, is relative to a set of manifest properties and to a contrast class.
Thus, to construct a simple example of a causal model, suppose we observe George's feathers and seek to classify him in a situation where the relevant contrast categories are birds, mammals, and fish. Long-term memory tells us that membership in the bird category leads to feathers, versus fur for mammals and scales for fish; therefore, this information can help us make the decision about George. Argument (5) produces one sort of procedure that results in the information that Isa(George,bird). I am not assuming, however, that this information is represented in memory in the form of an argument. Listing it in this way merely highlights the crucial sentences. (Representational issues will be taken up in the following section.) In essence, the method works by checking which categories in the

contrast class are "Possible" for the object (i.e., which ones can cause the object's properties), and then asserting that the object "Isa" member of a category if that category is the only possible one. The conditional premises in (5) specify the method for carrying out the categorization, the Cause premises give the long-term memory information about the links between categories and their properties, and the Concat and Notequal premises provide the facts about the contrast class. Although the procedure is essentially self-contained, it makes use of the special predicate Assert (similar to the PROLOG predicate of the same name), which places its argument (for example, the sentence Isa(George,bird)) in working memory whenever Assert is encountered during a proof.

(5) IF (Test(xinst1,xprop1) AND Trouble(xinst1)) THEN Categorize(xinst1,xprop1).
    IF (Possible(xinst2,xcat2) AND Assert(Isa(xinst2,xcat2))) THEN Categorize(xinst2,xprop2).
    IF (Concat(xcat3) AND (Cause(xcat3,xprop3) AND (Assert(Possible(xinst3,xcat3)) AND False(m)))) THEN Test(xinst3,xprop3).
    IF True(m) THEN Test(xinst4,xprop4).
    IF (Possible(xinst5,xcat5) AND (Possible(xinst5,xcat6) AND Notequal(xcat5,xcat6))) THEN Trouble(xinst5).
    True(x).
    Cause(bird,feathers).
    Cause(mammal,fur).
    Cause(fish,scales).
    Concat(bird).
    Concat(mammal).
    Concat(fish).
    Notequal(bird,mammal).
    Notequal(bird,fish).
    Notequal(mammal,fish).
    ----------
    Categorize(George,feathers).

How does this work? Notice, first, that the conclusion of this argument sets up the goal to categorize George on the basis of his feathers, and the first two premises give two alternative methods of fulfilling the Categorize



goal: The top one checks whether any categorization is possible, and it aborts further processing if any "Trouble" occurs. The second premise comes into play only if the first fails, and it is the one that actually classifies by adding an Isa sentence to memory (e.g., Isa(George,bird)). Premises 3 and 4 exhaustively check the categories in the contrast class to see which of them can cause the given property, and they add the sentence Possible(x) to memory for each category x that can do so. The final conditional notes trouble in case there is more than one category in the contrast class that is possible for this instance.

Figures 8.3-8.6 show how this procedure applies to George's case. As in the earlier example, the figures depict only the sequence of goals that PSYCOP places in working memory; assertions appear only if they directly match a goal (indicated by double lines). Since PSYCOP's main goal is Categorize(George,feathers)?, it begins by applying Backward IF Elimination to the first premise, which has a predicate of this type as its consequent. This leads it to set up the subgoals Test(George,feathers)? and Trouble(George)? via AND Introduction (figure 8.3, goals 3 and 4). To handle the first of these subgoals, it tries the third premise of (5), which initiates exhaustive search of the contrast categories. To satisfy the Test subgoal with this premise, we need to find one of the contrast categories a (i.e., one that satisfies Concat(a)?), show that this category causes feathers (Cause(a,feathers)?), assert that George is possibly a member of this category (Assert(Possible(George,a))?), and finally satisfy the subgoal False(m)? In fact, this last subgoal can never be satisfied, but the failure to fulfill it causes PSYCOP to cycle back through all the categories in the contrast class. Figure 8.3 represents the effect of trying these subgoals (subgoals 6-11) when Category a = bird.
In this case, PSYCOP easily determines that birdhood causes feathers (because of the seventh premise in (5)) and then asserts that George is possibly a bird. (Notice that the AND Introduction rule that is responsible for these subgoals is also in charge of communicating the fact that a has been instantiated to bird.) But because the False subgoal fails, PSYCOP must go back and try another possibility from the contrast categories. The point of this is to make sure no rival categories also cause feathers. In figure 8.4, we try Category a = mammal. Subgoals 1-7 in this figure are the same as in the previous diagram; PSYCOP does not rederive them. The only new activity is shown at the top, where the program tries to fulfill subgoals 6 and 7 again, using mammal instead of bird. This means trying


The Role of Deduction in Thought










to show that mammals cause feathers (subgoal 12); of course the attempt fails immediately, since we have no way to prove anything of the kind. Thus, PSYCOP skips subgoal 13; that is, it is not tempted to assert that George is possibly a mammal. An identical failure (not shown in the figures) occurs when Category a = fish. This inability to satisfy the antecedent of the third premise of (5) means that PSYCOP needs to find another way to fulfill the subgoal Test(George,feathers)? The fourth premise offers the only other possibility: The Test subgoal follows if we can show True(m)? Figure 8.5 indicates this as subgoal 16, and it is easily satisfied because of the True(x) premise. The purpose of this device is simply to allow computation to resume after examining the contrast categories.

Even after satisfying the Test subgoal, however, we still have to deal with Trouble(George)? (subgoal 4 in figure 8.5). The point of this latter goal is to ensure that any category we have found for our to-be-classified object is unique. The fifth premise, together with our usual IF Elimination/AND Introduction pair, generates subgoals 17-22, which ask whether there are categories b and c such that Possible(George,b), Possible(George,c), and b ≠ c. This could happen only if we had found in the previous step that more than one of the contrast categories could cause feathers. As it is, the only relevant assertion is Possible(George,bird), which we derived in fulfilling subgoal 10 (figure 8.3). This means that there is no way to show Trouble(George); subgoal 22 fails, and so do subgoals 4 and 2. If PSYCOP had succeeded in finding more than one possible category for George, we would be finished, and the procedure would stop, having noted the possibilities but not having asserted that George "isa" member of either category. At this point, the procedure has failed all the way back to the main goal Categorize(George,feathers)?, and PSYCOP therefore needs some other way of handling it.
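The control flow that (5) imposes on the proof can also be pictured procedurally. The following is only a rough sketch under simplifying assumptions, not PSYCOP's actual implementation: it replaces backward chaining with an ordinary loop, and the names `categorize`, `CAUSE`, and `CONTRAST` are invented for the illustration. It does preserve the key feature of the example, namely that the Isa sentence is placed in memory as a side effect rather than returned as the conclusion.

```python
# Facts taken from (5); the Python names themselves are illustrative.
CAUSE = {("bird", "feathers"), ("mammal", "fur"), ("fish", "scales")}
CONTRAST = ["bird", "mammal", "fish"]


def categorize(instance, prop, cause=CAUSE, contrast=CONTRAST):
    """Return (goal_fulfilled, working_memory) for Categorize(instance, prop)."""
    memory = []  # sentences placed in working memory by Assert

    # Premises 3 and 4 (Test): scan every contrast category and assert
    # Possible(instance, c) for each c that can cause the property.
    for category in contrast:
        if (category, prop) in cause:
            memory.append(("Possible", instance, category))

    possible = [s[2] for s in memory if s[0] == "Possible"]

    # Premises 1 and 5 (Trouble): two distinct possible categories
    # fulfil the goal without classifying the instance.
    if len(set(possible)) > 1:
        return True, memory

    # Premise 2: with a single candidate, assert the Isa sentence.
    if possible:
        memory.append(("Isa", instance, possible[0]))
        return True, memory

    return False, memory  # no contrast category causes the property
```

Calling `categorize("George", "feathers")` fulfils the goal and leaves both Possible(George,bird) and Isa(George,bird) in memory. If a second category were also able to cause feathers, the goal would still be fulfilled, but no Isa sentence would be asserted.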
The second premise of (5) allows us to finish up, since it tells us that the Categorize goal will be fulfilled if we can prove Possible(George,d)? and Assert(Isa(George,d))? for some category d. These appear as subgoals 24 and 25 in figure 8.6. The first of these matches Possible(George,bird), which PSYCOP now knows to be the only such match. It then asserts Isa(George,bird) and stops, having completed the classification.

Adequacy

Although the above example is a simplified one, it does seem to capture some of what we want from a categorization theory. It avoids









Figure 8.6 The categorization example concluded. PSYCOP asserts that George is a bird.

reliance on an unconstrained notion of similarity. It also provides more flexibility in categorizing than either the deductive method in (2) or the hypothetico-deductive approach in (3). Although PSYCOP derives the conclusion of (5) in a deductively correct manner, this conclusion is not that the instance is a member of a specific category. PSYCOP can prove the conclusion of (5) while remaining undecided about the instance's category membership; the conclusion is merely a dummy proposition that triggers the classification process. The point at which the process decides on membership is the point at which it asserts the Isa proposition, and this proposition does not follow deductively from anything in (5). Of course, it is possible to rewrite (5) so that the Isa sentence does follow whenever there is a unique causal connection between a category and the instance's properties. But this would mean reworking the premises in the argument, and these revised premises would presumably correspond to beliefs that seem questionable in precisely the same way as the conditional premises of (2) and (3). We need to distinguish carefully between two different claims: (a) that people's belief that there are causal relations of a certain sort produces their belief that George is a bird and (b) that people believe that if there are causal relations of a certain sort then George is a bird. What (5) commits us to is (a), not the less plausible (b).4

The limitations of (5) are not hard to see. For example, the causal relations are much too simple. The bird species doesn't cause feathers, even in folk biology. Rather, something about being a bird (maybe having the



right genetic structure) causes the properties. Moreover, we often classify objects on the basis of second-hand information, say, because someone informs us that an instance is a bird. In that case, if there are causal relations involved, those relations must run through our informant's ability to recognize and communicate facts about the category. These indirect connections are also apparent in classifying instances as members of artifact categories, such as pajamas and chairs. Being a chair doesn't cause an object to have a back or seat. If we ask why a particular chair has a back, the answer might be a causal story, but it would reflect design decisions on the part of whoever was responsible for the chair's manufacture. Clearly, then, a procedure like (5) needs to be elaborated with a theory of how people derive causal relations and which of these indirect relations qualify instances as category members. (Of course, for definitional categories, such as prime number or isosceles triangle, causal relationships may be entirely irrelevant. For these categories, however, we already have a method for classification via arguments like (2).)

There are also questions about specifying the contrast class in the Concat premises of (5). This information must be peculiar to the situation in which the classifying occurs, whereas the conditional premises are parts of long-term generic memory. But, of course, the contrast class isn't given in the external situation in the way that George and his properties are. People must infer the relevant contrast class from other facts. If one is touring a zoo then bird, mammal, and fish might be relevant, whereas if one is glancing at a windowsill then bird, vase, and leaf might be the right categories. It might be possible to treat such inferences on a par with (5), where we narrowed the contrast set of categories to those that are "possible" and then narrowed further to the final category.
We might start instead with an even larger set of categories and cull the contrast categories on the basis of the situation we are in. An alternative possibility, however, is that the contrast categories depend on the way long-term memory is organized. People may represent coherent facts about specific settings, such as zoos, so that, when they recognize the setting, related information (including the contrast categories) becomes available automatically (Tversky and Hemenway 1983). This is the idea that, in the mid 1970s, prompted many AI and memory researchers to propose larger-scale memory representations: frames (Minsky 1975/1985), scripts (Schank and Abelson 1977), or schemas (Rumelhart 1975). Indeed, many investigators



believe that categorizing is just a matter of applying a script or a frame; thus, we should examine how our assumptions about reasoning sit with these proposals.

Frames, Schemas, Theories, and Deduction

The motive behind frames, schemas, and scripts was the idea that skilled cognition depends on integrated packages of information. The usual example is that in comprehending stories we do more than string together the individual sentences of the text; we elaborate these sentences into a coherent structure by bringing to bear our background knowledge about the story's events and objects. The story produces a higher-level memory representation if we are successful in understanding it; but the background facts are parts of higher-level representations, too, according to these theories. They are something like mental encyclopedia entries that we can consult to make predictions or fill in missing information. In fact, we might think of the whole of our long-term memory for generic information as composed of encyclopedia-entry chunks of this sort. The entries would specify facts that are constant across instances; the entry for bird would mention egg-laying, and the entry for restaurants would note that they serve food. Cross-references to schemas for these subconcepts (see egg, or see food) would also appear, in the form of memory pointers. In addition, the schemas would specify facts that are usually but not invariably associated with the relevant instances: default values that should be assumed unless specifically contravened (birds fly; restaurants have menus). Finally, the entries would have variables that can be instantiated when the schemas are applied to a specific example (the color of a bird, or the location of a restaurant). (See Brachman and Schmolze 1985 for a worked-out version of this idea.)

Research on concepts and categories has also pointed to the need for higher-level representations.
Murphy and Medin (1985; see also Medin 1989) invoke lay theories to explain why some sets of instances constitute coherent categories, while others (e.g., the set of all things that are either parts of Atlas missiles or ingredients for cheesecake) appear to be arbitrary assemblages. Similarly, Carey (1985) appeals to children's theories of psychology and biology to explain how they generalize properties from one instance to related ones. I have used the same idea to explain why adults'



category decisions sometimes override their judgments of similarity between instance and category (Rips 1989b, 1991; Rips and Collins, in press). There is a quibble about whether the sets of beliefs that children, or even adults, have about such domains are large enough and coherent enough to qualify as "theories" in the sense in which this term is used in science or in philosophy of science. But call it what you will; people seem to draw on a set of beliefs about a category in order to explain why the category has the instances and the properties it does. Although the emphasis may be slightly different, these mini-theories seem quite close to schemas or frames, and it will probably do no harm to identify them.5 I will use schema as the cover term for these sorts of structures.

Although I emphasized the logical form of individual mental sentences in spelling out the details of the theory in part II, there are several reasons to think that there is no real incompatibility between schema theory and our deduction system. In the first place, Patrick Hayes (1979) has convincingly argued that the inferential powers of schemas are all available in first-order logic. In particular, PSYCOP's representations have all the important features of schemas identified by Rumelhart (see Rumelhart and Norman 1988, p. 537): They allow the use of variables; they can express knowledge at any level of abstraction; they can spell out encyclopedic information (rather than simple necessary and sufficient features); and they can serve as classification devices.6 In fact, these properties were illustrated in the categorization example of the preceding section. The only inferential property of schemas that goes beyond what I have discussed is the use of default values to express typical characteristics. Hayes believes that defaults, too, can be handled in a first-order system if we admit names that refer to the state of the system's own knowledge base.
(I will return to the problems posed by defaults in the following section.)

The importance of schemas seems to lie, not in their inference powers (which are quite simple), but in the way they organize long-term memory and guide retrieval. Schemas are methods for bundling related facts in long-term memory. From this perspective, however, there is no reason why the organized information can't itself appear in logical form à la PSYCOP. In fact, the PSYCOP system is already committed to schema-like structures that relate individual sentences. First, the deduction and dependency links (e.g., those pictured in figures 8.2-8.6) associate the mental sentences within a proof. The proof itself can be considered a type of schema, with the links providing retrieval pathways. Second, the routine



for categorization that we considered above also presupposes higher-level organization to ensure that PSYCOP uses the conditional sentences in the order in which they appear in (5). The classification routine operates correctly only if the system tries the third premise before the fourth premise of (5) in attempting to satisfy a Test goal. Otherwise, Test could be fulfilled without any evaluation of the causal relationships that are supposed to be the key parts of the process. There is some flexibility in the way the sentences in (5) are ordered, but we usually have to observe the given sequence for conditionals that share the same consequent. We could get around this, if we wanted to, by disjoining the antecedents in a single complex conditional. This would force the correct ordering of antecedents, because the Disjunctive Modus Ponens rule that processes such conditionals observes a left-to-right ordering of the disjuncts. If we want to keep the conditionals in their simpler form, however, we need to regard them as parts of a partially ordered structure.

As originally conceived, schemas were permanent units in long-term memory. But the inventors of schemas no longer seem to regard permanence as an important property. Schank (1982) thinks of scripts as structures created on the fly from more fundamental units called "memory organization packets," and Rumelhart, Smolensky, McClelland, and Hinton (1986) view schemas as emerging from temporary activation patterns in a connectionist network. Whether we should regard schemas as static or as dynamic structures in human memory seems to be partly an empirical question, one whose answer might depend on the system's state of learning or on the schema's subject matter. It is also an issue on which we can afford to be neutral. If schemas must be assembled from more elementary units, we can do so by taking the units to be our usual mental sentences.
As long as a cognitive system can put together information about a key object or event for purposes of retrieval and can instantiate variables for purposes of inference, then it has nearly all it needs to account for the phenomena usually attributed to schemas.

Natural Deduction, Nonmonotonic Logic, and Truth Maintenance

Schemas appear to be consistent with deduction-based approaches, with one potential exception: default values. The problem is that we often want to make simple inferences on the basis of typical or normal cases. By and large, birds fly, mammals live on land, fruit is sweet, chairs have legs, cars



burn gas, cups have handles, and crimes have victims. If we learn that something is a member of one of these categories, then we should conclude, all things being equal, that the typical properties hold. But of course things aren't always equal. We have to acknowledge the existence of flightless birds (ostriches, penguins), aquatic mammals (whales, dolphins), nonsweet fruits (olives, lemons), legless chairs (beanbag chairs, booster chairs), gasless cars (electric cars, bumper cars), handleless cups (styrofoam cups, Chinese teacups), and victimless crimes (littering, jaywalking). These exceptions don't mean that people are wrong to make the default assumptions (in most cases, the inferences are reasonable), but they do make life more difficult for deduction systems.

Minsky (1975/1985) pointed out this problem and stimulated a great deal of research in AI aimed at solving it. Minsky's own solution, and that of many other schema theorists who followed him, was to let the schemas themselves perform the default inferences. If we learn that object1 is a fruit, then the fruit schema provides the information that object1 is sweet. Of course, if we then learn that object1 is an olive, both the olive schema and the fruit schema are activated, and we will get conflicting inferences about the sweetness of this item. In such conflict situations, these theories usually decide in favor of the default from the more specific schema; thus, object1 will be judged not sweet. This specificity principle is simple enough to make it seem plausible that schemas can handle default reasoning on their own, without the support of an external inference engine.

It is not hard to see, however, that there are severe difficulties for schema-based default reasoning. Although we can rely on the specificity rule for simple cases of conflicting defaults, there are other situations in which rival information comes from schemas at roughly the same level of abstraction.
To combine an example of Reiter's (1987; see also Reiter and Criscuolo 1981) with one of Tversky and Kahneman (1983), suppose we believe that bank tellers are typically conservative and that feminists are typically not conservative. If we then meet Linda, the famous feminist bank teller, we are left with a situation in which our defaults collide but there is no hope of determining the best inference (conservative or not conservative) on the ground of specificity alone. Conflicts of this sort defy our stereotypes or schemas and call for either withholding the inference (Touretzky 1984) or using some outside strategy to adjudicate it (Hastie et al. 1990; Kunda et al. 1990; Reiter 1980). But if schemas can't resolve the conflict, what can?
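To make the contrast between the olive case and the Linda case concrete, here is a minimal sketch of the specificity principle. It is an illustration under assumptions rather than any schema theorist's actual mechanism: the table `SCHEMAS`, its `parents` and `defaults` fields, and the function names are all hypothetical. A default survives only if no more specific active schema supplies a default for the same attribute, and the function withholds an answer when the surviving defaults disagree.

```python
# Hypothetical schema table: each schema names its more general parents
# and its default values. None of these names come from an existing library.
SCHEMAS = {
    "fruit": {"parents": [], "defaults": {"sweet": True}},
    "olive": {"parents": ["fruit"], "defaults": {"sweet": False}},
    "bank_teller": {"parents": [], "defaults": {"conservative": True}},
    "feminist": {"parents": [], "defaults": {"conservative": False}},
}


def ancestors(name, schemas=SCHEMAS):
    """Return all schemas strictly more general than `name`."""
    found = set()
    stack = list(schemas[name]["parents"])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(schemas[parent]["parents"])
    return found


def default_value(active, attribute, schemas=SCHEMAS):
    """Apply the specificity principle to the active schemas.

    Returns the default value, or None when no schema supplies a default
    or when the surviving defaults disagree.
    """
    holders = [s for s in active if attribute in schemas[s]["defaults"]]
    # A schema is overridden if it is an ancestor of another holder.
    overridden = set()
    for s in holders:
        overridden |= ancestors(s, schemas)
    values = {
        schemas[s]["defaults"][attribute] for s in holders if s not in overridden
    }
    return values.pop() if len(values) == 1 else None
```

With only the fruit schema active, `default_value(["fruit"], "sweet")` gives `True`; adding the olive schema gives `False`, since olive is the more specific of the two. For `default_value(["bank_teller", "feminist"], "conservative")` neither schema is more specific than the other, so the sketch returns `None`, which corresponds to withholding the inference.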



Nonmonotonic Logic

A different response to Minsky's problem is to remedy it by modifying the logic responsible for default inferences. (See Ginsberg 1987 and Reiter 1987 for reviews.) Traditional logics are "monotonic" in the sense that the entailments of any set of sentences are also entailments of any superset of those sentences. For example, if an entailment of {P1, P2, ..., Pi} is C, then C is also an entailment of {P1, ..., Pi, ..., Pn}. But, intuitively, this seems wrong. If we deduce Sweet(object1) from Fruit(object1), we don't want the result about sweetness once we learn that the object is an olive. To fix this problem, several groups of AI researchers attempted to construct "nonmonotonic logics" that could retract entailments as new information appeared in the database.

One way to proceed, for instance, is to include in the relevant conditionals a provision that the defaults hold as long as the object in question is not abnormal. In the case of sweet fruit, this might take the form of the default condition (6).

(6) IF Fruit(x) AND NOT (Abnormal(x)) THEN Sweet(x).

We would then list the kinds of things we consider abnormal, the abnormality conditions. For instance, as (7) notes, olives and lemons are abnormal with respect to sweetness.

(7) a. IF Olive(x) THEN (Fruit(x) AND Abnormal(x)).
b. IF Lemon(x) THEN (Fruit(x) AND Abnormal(x)).

Next we would stipulate that the set of objects that are Abnormal must be the smallest set satisfying the default and abnormality conditions. We would also stipulate that the objects that are Sweet can be any set satisfying the same default and abnormality conditions, plus the restriction just mentioned on Abnormal. These stipulations are what McCarthy (1980, 1986) calls the circumscription of (6) and (7), with respect to the predicate Abnormal and with the variable predicate Sweet.
For example, suppose the domain of interest contains just three objects (object1, object2, and object3) and suppose we know that Fruit(object1), Lemon(object2), and Olive(object3). Under the default and abnormality conditions in (6) and (7), the extension of Fruit is {object1, object2, object3}, the extension of Lemon is {object2}, and the extension of Olive is {object3}. Moreover, the extension of Abnormal must include at least object2 and object3, since we



declared lemons and olives abnormal in (7). To circumscribe the Abnormal predicate in this case means taking its extension to include no more than these two objects, so that Abnormal = {object2, object3}. What about Sweet, the main predicate of interest in this example? Well, since object1 is in Fruit and not in Abnormal, it must be in Sweet by the default condition (6).

Circumscription gives the intuitively correct result in our example. If all we know of object1 is that it is a fruit, then we should take it to be sweet on the basis of our beliefs. However, if we should later find out that object1 is an olive, then our conclusion must change. Since olives are abnormal by (7a), the extension of Abnormal is now forced to include object1 as well as object2 and object3. But the default condition says only that normal fruits are sweet. So object1 is not necessarily sweet in this case, and this blocks the default inference. Circumscription produces nonmonotonic inferences, because new knowledge about abnormal cases reduces the number of instances eligible for default conclusions.

In this example, of course, we are using the predicate Abnormal in a way that is tacitly linked to sweetness. Lemons and olives are abnormal fruits with respect to sweetness, but they are perfectly normal fruits in other ways (say, in having seeds and in growing on trees). So we need to distinguish different abnormality predicates for different typical characteristics of fruits: one for being sweet, another for having seeds, a third for growing on trees, and so on.

In our sweet fruit example, we followed the usual practice of discussing default reasoning as proceeding from knowledge of category membership (i.e., object1 is a fruit) to the conclusion that the instance has typical properties of the category (object1 is sweet). However, default assumptions might be made in the opposite direction as well.
This seems to be exactly what we did in reasoning from George has feathers to George is a bird in the category decision of the preceding section: If George has feathers, then, unless George is abnormal in some way, he is a bird. We can make the circumscription trick that yielded the default inference above apply to categorizing just by reversing the position of the predicates representing the category and its property, since the logic is indifferent to which predicate is which. In place of (6), we would have IF Feathered(x) AND NOT (Abnormal(x)) THEN Bird(x), and in place of (7) we would have sentences like IF Arrow(x) THEN Abnormal(x). This suggests that circumscription and other forms of nonmonotonic logic might succeed where we failed earlier; they might provide a purely deductive solution to the problem of categorizing.
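The three-object example can be checked mechanically. The sketch below is not a general implementation of circumscription (which, in its general form, is not a first-order notion); it simply hard-codes conditions (6) and (7) for a finite set of facts and keeps Abnormal as small as those conditions allow. The function name `circumscribe` and the representation of facts as (predicate, object) pairs are assumptions made for the illustration. "Sweet" here lists only the objects that are forced to be sweet.

```python
def circumscribe(facts):
    """Minimal extensions under (6) and (7) for a finite set of facts.

    `facts` is a set of (predicate, object) pairs. Returns the extensions
    of Fruit and Abnormal, plus the objects that must be Sweet.
    """
    fruit = {obj for pred, obj in facts if pred == "Fruit"}
    abnormal = set()
    # (7a) and (7b): olives and lemons are fruits and are abnormal.
    for pred, obj in facts:
        if pred in ("Olive", "Lemon"):
            fruit.add(obj)
            abnormal.add(obj)  # nothing else is added, so Abnormal stays minimal
    # (6): every fruit that is not abnormal is sweet.
    sweet = fruit - abnormal
    return {"Fruit": fruit, "Abnormal": abnormal, "Sweet": sweet}


facts = {("Fruit", "object1"), ("Lemon", "object2"), ("Olive", "object3")}
before = circumscribe(facts)
after = circumscribe(facts | {("Olive", "object1")})
```

In `before`, object1 is the only member of Sweet; in `after`, where we have learned that object1 is an olive, Sweet is empty. Adding a fact has removed a conclusion, which is the nonmonotonic behavior described above.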



Unfortunately, though, nonmonotonic logics are less than ideal for cognitive purposes. First, despite the fact that they have been developed by AI investigators, nonmonotonic logics don't lend themselves to simple implementations. Circumscription, in its general form, can be stated only in second-order logic. But second-order logic is not complete: there are no algorithms that can determine whether an arbitrary argument is valid. Much the same is true for other forms of nonmonotonic logic: There are no general ways to compute the proper nonmonotonic conclusions from a given database of beliefs. As we saw in the case of PSYCOP, incompleteness is not necessarily a problem. It may be possible to implement a significant fragment of nonmonotonic logic that achieves much of what this logic was intended for, but this remains to be shown.7 (See Lifschitz 1985 for some special cases where circumscription can be reduced to first-order sentences.)

Second, these logics don't seem to reflect the deliberations that actually underlie human reasoning with defaults (Harman 1986; Israel 1980; McDermott 1987). In these situations we seem to go along with the main conclusion (e.g., object1 is sweet) even while acknowledging that the range of potential exceptions (nonsweet fruits) is open-ended: fruit may be rotten, unripe, injected with quinine, and so on. But then we must also believe that the result of circumscription is false, since circumscription forces the conclusion that abnormal cases are restricted to those explicitly stated. Although there is a perfectly valid deductive argument from the circumscription of (6) and (7) and Fruit(object1) to Sweet(object1), we cannot use this argument to sanction our belief that Sweet(object1), because we don't believe that the circumscription of (6) and (7) is true. Instead, we regard Sweet(object1) as a plausible or inductively strong conclusion from our general knowledge.8

The same negative lesson applies to categorizing.
In deciding that feathery George is a bird, we do so in the face of the fact that there is an open class of feathery nonbirds (arrows, pillows, hats, people who have been tarred and feathered, etc.). Since we regard the result of circumscription as false (it is not the case that feathery nonbirds are confined to an explicit list of exceptions), we can't use circumscription as a deductive ground for our belief that George is a bird. What this means is not that deduction has no role to play in producing this conclusion, but rather that the conclusion doesn't follow by a deductively correct inference from other beliefs. Even in deduction systems, there is no need to suppose that every belief in the database either is an axiom or follows deductively from axioms.



There is, however, a remaining problem for deduction systems. If we conclude Isa(George,bird) by some nondemonstrative means (as in (5)) and then on closer inspection decide NOT (Isa(George,bird)), how do we avoid troubles caused by the contradiction? A possible solution comes from another AI innovation: truth-maintenance systems.

Truth Maintenance

A truth-maintenance system (TMS) is a routine for maintaining the logical consistency of a database for an AI problem solver (de Kleer 1986; Doyle 1979; Forbus and de Kleer 1993; McAllester 1978, 1982). Strictly speaking, these systems have little to do with the actual truth value of the data, and "database consistency maintenance" would be a more accurate way to describe what they do. "Truth maintenance," however, is the term that has stuck. In a typical application, the problem solver plugs along independently, making assumptions and applying its own set of inference rules to derive further information. The problem solver, for example, might be solving the Towers of Hanoi or troubleshooting an electronic circuit. The TMS monitors the assumptions and consequences that the problem solver draws. If a contradiction arises, the TMS identifies the set of assumptions responsible and the consequences that depend on them. Thus, the TMS makes it easy to retract assumptions (and dependent information) once it becomes clear that they are faulty. In addition, the TMS inspects the database to ensure that there are no circular beliefs in which the problem solver uses assumption A to derive B and also uses B to derive A. In carrying out these tasks, the TMS doesn't care how the problem solver arrives at its consequences; it might do so by means of deductively correct inferences, inductively strong inferences, or mere guesses. The TMS simply guarantees that these processes, whatever they are, will not lead to an incoherent set of beliefs.

Since TMSs are in charge of ridding databases of contradictions, they seem to be exactly what we need to take care of the problem of second thoughts about George's birdiness or object1's sweetness. But they provide less help than we might expect, even though they provide an important method for managing assumptions. Notice, first, that many of the benefits of truth maintenance already exist in PSYCOP in virtue of its natural-deduction framework. As we saw in chapter 4, PSYCOP's dependency links connect the assumptions and premises of a proof to the sentences that depend on them. PSYCOP also keeps a record of the converse



relation. Every sentence in a proof contains pointers to the assumptions on which it depends, and every assumption contains pointers to the consequences that depend on it. Thus, it is a simple process to identify the assumptions that produce a contradiction; all that is necessary is to follow the (converse) dependency links from the contradictory sentences. The union of these assumptions is the set that is responsible. (For example, if P has converse dependency pointers to assumptions Q and R, and NOT P has converse dependency pointers to assumptions R and S, then the set {Q, R, S} is itself faulty: a nogood, in TMS terminology.) Clearly, we also need to extend this facility to assertions that PSYCOP produces by nondeductive means. For example, when the sentence Isa(George,bird) is entered in memory as the result of the classification procedure in (5), PSYCOP will need to record the fact that it depends on some of the assumptions that appear as (5)'s premises. Then, if later inferences lead us to conclude NOT (Isa(George,bird)), we can reexamine the assumptions that landed us in this mess.

More important, truth maintenance doesn't grapple with the crux of the problem that we started with: Of two contradictory propositions, which one should we believe? If we have managed to infer that object1 is both sweet and not sweet, how do we resolve this contradiction? Although these systems can identify the assumptions that cause the contradictory result, they leave open how these assumptions should change. Ruling out inconsistent assumptions usually isn't enough to determine which of the consistent sets warrant our belief. For purposes of philosophy (and perhaps AI), we need normative rules about how to change our minds in the face of conflicting evidence (Goldman 1986; Harman 1986; Israel 1980); for purposes of psychology, we also need descriptive information on how people deal with the same sort of conflict.
Implementing these rules amounts to placing a priority ordering on sets of assumptions, but coming up with the rules, both normative and descriptive, is a core problem of cognitive science.
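The bookkeeping half of this story, as opposed to the choice among assumptions, is easy to sketch. The fragment below assumes a simple dictionary from sentences to the sets of assumptions they depend on; this representation and the names `nogood` and `retract` are illustrative and are not drawn from any particular TMS or from PSYCOP itself.

```python
def nogood(dependencies, sentence, negation):
    """Return the union of the assumptions behind two contradictory sentences."""
    return dependencies[sentence] | dependencies[negation]


def retract(dependencies, assumption):
    """Drop every sentence whose support includes the retracted assumption."""
    return {s: deps for s, deps in dependencies.items() if assumption not in deps}


# Converse dependency pointers from the example in the text.
deps = {"P": {"Q", "R"}, "NOT P": {"R", "S"}}
faulty = nogood(deps, "P", "NOT P")   # the set {Q, R, S}
remaining = retract(deps, "Q")        # only NOT P survives
```

The sketch identifies {Q, R, S} as the responsible set, and retracting Q removes P while leaving NOT P in place. What it cannot do is say whether Q, R, or S is the assumption that ought to go; the caller has to supply that choice, which is exactly the part that requires a priority ordering.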

Open Questions

In this chapter we have tried to gain some perspective on our model by viewing it in a larger cognitive context. The effort has been to show that a deduction system like PSYCOP can serve as a type of cognitive architecture, one much on a par with production-based systems such as ACT*



(Anderson 1983) and Soar (Newell 1990). That this is possible should come as no surprise, since production systems are a kind of special case of the logical approach we are considering: one in which the only rules are modus ponens and universal instantiation. The fact-retrieval and categorization examples that we developed at the beginning of this chapter demonstrate how a deduction-based system can account for these typically cognitive operations. Of course, there are many types of cognitive skills, and our two examples don't represent all of them. Nevertheless, it is not too hard to extrapolate these examples to other cognitive tasks that require symbol manipulation. Whether this exhausts the domain of "cognition" (whether there are subsymbolic processes that are properly cognitive) is a delicate and controversial issue (see, e.g., Rumelhart et al. 1986 and Fodor and Pylyshyn 1988). By most estimates, however, the progress in adapting nonsymbolic models to higher tasks (inference, problem solving, decision making, language processing) hasn't been startling, and for these tasks it doesn't seem too outlandish to pin our hopes on a symbol manipulator like PSYCOP. The most successful connectionist systems for inference (see, e.g., Ajjanagadde and Shastri 1989) encode predicate-argument structures, bind variables, and operate according to principles such as modus ponens.

But if deduction really has this directorial role, we seem to be left with a puzzle. How do inductively generated beliefs come about if the cognitive system operates on deductive principles? This puzzle arises because we think of a belief as the conclusion of some supporting argument. Since by hypothesis the argument can't be deductively correct, it must be an argument of some other sort: an inductive argument. But, of course, the only arguments that a system like PSYCOP can handle are deductively correct ones; so it looks as if inductively generated beliefs are impossible in PSYCOP.
The considerations in this chapter suggest that the way out of this dilemma is to suppose that there are ways of creating and justifying a belief other than making it the conclusion of a special sort of argument (Harman 1986). The categorization example is a case in point, since the critical belief that George is a bird was not the conclusion of the argument that produced it. From the point of view of the system, the belief was a side effect of the deduction process. This doesn't mean that the belief was unjustified; on the contrary, it was justified by a kind of causal theory about the properties of category members. However, the example does suggest that




critical beliefs need not be represented as conclusions, and this allows us to reconcile the presence of these beliefs with our overall framework.

Of course, it is one thing to admit beliefs with inductive support and quite another to give a general explanation for them. I have tried to argue here that, for psychological purposes, neither schema theory nor nonmonotonic logics give a satisfactory account of these beliefs. Similarly, Osherson et al. (1986) have examined a number of psychological proposals about inductive support and found all of them lacking (see also Rips 1990a). It may be, as Osherson et al. suggest, that no general theory of induction is possible and that inductive support can take a variety of forms for different purposes. We may have to content ourselves with a number of smaller-scale theories like the one sketched for categorization. For our purposes, though, we can reserve judgment about this issue. If our project can provide a reasonable cognitive framework for specifying possible forms of inductive support, we will have done well enough.


Alternative Psychological Theories: Rule-Based Systems

Rules always come right if you wait quietly.
Kenneth Grahame, The Reluctant Dragon

Psychological theories of deduction seem to divide into those that are based on rules that are sensitive to an argument's logical form and those that are based on other properties of the argument or its context. I will use this split to organize the discussion in this chapter and the next, describing rule-based systems here and non-rule-based alternatives in chapter 10. It is convenient to do this, and much of the debate in the research literature has been fought under these banners. But this distinction shouldn't be pushed too far.

In PSYCOP we have a clear example of what a rule-based system is like, and there are several other proposals with much the same character. It is also clear that a theory stipulating that people decide about the correctness of arguments on a purely random basis wouldn't qualify as rule-based. Although rules may be involved (e.g., "valid if a coin lands on heads and invalid if on tails"), this sort of rule obviously isn't sensitive to logical form. But the distinction is rarely this clear-cut. In distinguishing rule-based and non-rule-based theories, investigators tend to have in mind prototype rules, such as IF Elimination and AND Introduction, that operate on linguistically structured representations, that center on the usual connectives and quantifiers, and that are parts of standard logical systems such as classical predicate logic. If an investigator devises a system whose principles deviate from this prototype, then there is a temptation to say that the system is not based on "rules." For example, it is easy to come up with principles that implement the atmosphere effect in syllogisms (Begg and Denny 1969). These principles depend on the presence of SOME and NOT (or NO), just as standard rules do, but they produce conclusions that are far from acceptable in either scholastic or CPL systems.
(See chapters 1 and 7 for discussions of atmosphere.) Likewise, some psychologists have proposed that there are principles that capture inferences based on expressions of permission or obligation, such as the modal must in the sentence She must be 16 before she can legally drive (Cheng and Holyoak 1985; Cheng et al. 1986). These principles seem to be defined over linguistic structures, at least on some views, but they go beyond the logical connectives in CPL. It is also possible to find psychological theories whose principles yield exactly the same valid arguments as in the sentential portion of CPL, and whose representations include the logical constant for negation (i.e., NOT) and proposition symbols analogous to P


Chapter 9

and Q but don't otherwise have typical structure. Johnson-Laird et al. (1992) fiercely defend theories of the latter kind as not based on formal rules. If the principles of a deduction theory differ from more typical rules, it is seductive to claim that the principles aren't rules (or aren't formal rules, or aren't logical rules). In the same vein, because the more typical rules operate on syntactically structured logical form, it is tempting to say that the new principles aren't "syntactic" but something else; "pragmatic" (Cheng and Holyoak 1985) and "semantic" (Johnson-Laird 1983) are the obvious alternatives. There are differences among these theories, but neither the rule/non-rule distinction nor the syntactic/semantic/pragmatic distinction conveys them appropriately (see Stenning 1992).

It is important to keep in mind, as we examine these theories, that all the serious ones operate according to principles (i.e., routines, procedures, algorithms) that inspect the format of the argument's mental representation. Each of them depends indirectly on some method (usually unspecified) for translating natural-language sentences into the chosen format. Each retains options for consulting background information from long-term memory and selecting responses in accord with the systems' goals. With a clear view of these common properties, it becomes easier to see that the main differences among the theories lie in the interaction of the principles and the representational format.

In this chapter we will consider deduction theories that are similar to PSYCOP in sticking fairly close to the prototype of principles keyed to logical constants. Chapter 10 strays further afield in examining theories based on heuristics and diagrams.

Alternative Theories Based on Natural Deduction

PSYCOP has a number of ancestors and siblings that rely directly on natural-deduction rules, and in some ways they provide the clearest comparison to what has been done in previous chapters. However, it is not always easy to devise experimental tests that discriminate cleanly among these theories. Although there is no shortage of research in this area, contrasts are difficult, both because the theories cover somewhat different domains and because they continue to evolve as the result of new findings. There are no theories that cover both sentential and general predicate-argument structures in the way that PSYCOP does, but Osherson's (1976)



theory handles certain modal arguments (with operators for time, necessity, and obligation) to which PSYCOP does not extend. Thus, the overlap between the domains of the theories is only partial. Braine et al. (1984) have reported several experiments that may help in discriminating among the natural-deduction models; however, the message from these studies is somewhat mixed.

Despite the difficulty in contrasting the models, it is informative to look at the array of options available within the natural-deduction approach. In addition to Osherson's and Braine's models, we should examine some early research by Newell and Simon (1972), who can claim to be the first to study an entire deduction system experimentally. The only remaining proposal along these lines is one by Johnson-Laird (1975); but since Johnson-Laird no longer regards this theory as correct, I will skip over it here, returning to his current approach in chapter 10. (See Rips 1983 for comments on the earlier theory.)

GPS Revisited

As was noted in chapter 3, Newell and Simon's goal was broader than developing a theory of deduction; their concern was to formulate and test a general model of human problem solving. Their book concerns itself with a computational theory called GPS (the General Problem Solver). According to the GPS approach, problem solvers represent internally the overall goal that would constitute a solution to the problem at hand. They also describe their own present state of knowledge in terms that are commensurate with the goal state. By comparing their current state to the goal state, they are able to detect differences that they must eliminate in order to solve the problem. For example, in the logic problems that Newell and Simon looked at, the goal state was a description of the conclusion of an argument, the initial state was a description of the premises of the argument, and the relevant differences were the structural discrepancies between the premises and the conclusion.
One such difference might be that the atomic sentences occur in a different order in the premises than in the conclusion; another might be that a particular atomic sentence appears more often in the conclusion than in the premises; a third possibility could be that there is a connective in the conclusion that isn't in the premises; and so on. GPS reduces the discrepancies between the current state and the goal state by what Newell and Simon call "operators," which are triggered by



the differences they are qualified to eliminate. Applying one of these operators to the current state causes the problem solver to advance to a new state. Differences between the new state and the goal state are then reassessed, and new operators are called into play to reduce these residual differences. This process continues until the problem is solved or abandoned. In the logic context, the operators are deduction rules that allow one sentence to be rewritten as another, just as in the sort of model discussed above. Thus, OR Introduction might be used as an operator to eliminate the difference between a premise S and a conclusion S OR T. If these operators don't quite apply as they stand, subgoals may be set up to achieve some state to which the original operators do apply. This may involve applying other operators, which may invoke other subgoals, and so on. The general principle of solving problems by applying operators in order to reduce differences is means-ends analysis.

Newell and Simon's interest in logic was confined to its use in testing the GPS theory; they weren't concerned with how people naturally tackle reasoning problems, or with the rules or operators that people intuitively know. As a consequence, the logic experiments they report are extremely artificial. In fact, although the questions that the subjects answered in these experiments were standard sentential logic problems, the subjects were probably unaware that the problems had anything to do with and, or, if, and not, because all the problems appeared in symbolic form rather than as English expressions. Instead of asking the subjects to decide whether, for example, (IF R THEN NOT P) AND (IF NOT R THEN Q) implies NOT ((NOT Q) AND P), they phrased the problem as in (1).

(1) (R ⊃ -P) · (-R ⊃ Q)
    -(-Q · P)

In some logic systems, "·" is an alternative notation for AND, "-" for NOT, and "⊃" for IF ... THEN. But the subjects were not told about these equivalences.
They were told only that they were to "recode" the first string into the second one.

Newell and Simon conducted their experiments by giving the subjects a list of 12 rules for transforming one expression into another. The subjects could consult these rules whenever they wanted. One of the rules, for example, was stated in the form shown here in (2).

(2) A ⊃ B ↔ -A ∨ B


This meant that the subjects could recode a string that matched the right- or the left-hand side into one that matched the other side. The subjects were then given a set of starting strings and a target string, with instructions that they should try to transform the starting strings into the target string using only the rules on their list. (This experimental method had originally been suggested by Moore and Anderson (1954).) The problems that Newell and Simon presented to their subjects were very difficult ones, involving as many as 11 steps in their derivations. The subjects were asked to think aloud, and the transcripts of their monologues constitute the data from the experiment. Thus, the subjects' task wasn't just to determine if a derivation was possible; they had to produce explicitly each intermediate step. (They were not allowed to write anything down; the experimenters wrote down the steps for them.)

The nature of the data makes it very difficult to summarize them neatly. Newell and Simon broke down their protocols into individual problem-solving episodes that centered around the application of one of the rules. They then qualitatively compared the sequence of these episodes for each individual subject with the sequence that the GPS program had generated for the same problem. Although it is not always clear how the investigators determined that a subject was trying to apply a certain rule at a given moment, the protocols do seem to support the GPS orientation. It seems true, for example, that the subjects tried to achieve the goal state by applying rules that looked relevant on the basis of superficial differences between the premise strings and the conclusion strings. For example, they talk about applying the rule shown here as (2) if the premises contained a formula with a horseshoe (⊃) and the conclusion contained a formula


with a wedge (∨). If these rules did not apply to the strings as given, the subjects tried to make the strings fit the rules, taking this as a subgoal and achieving it with additional rules.

This research served Newell and Simon's purposes in demonstrating that GPS could be applied to symbol-pushing problems of this sort. But, of course, it is much less clear whether these results are representative of the subjects' deductive reasoning. It is easy to imagine that the subjects would have employed different strategies if the meaning of the symbols had been explained to them. This could have made the true relevance of particular rules much more apparent, allowing more direct solutions and less dependence on the surface appearance of the rules and formulas. A related problem is that the rules that Newell and Simon gave their subjects may not be the ones that the subjects would have employed naturally had they been permitted to justify the correctness of the arguments according to their own logical intuitions. The rules on their list might not correspond to psychologically primitive rules. Rule 2 in particular is one that we have rejected as part of the PSYCOP system.

Some evidence of the generality of Newell and Simon's results comes from a pair of experiments by Reed, McMillan, and Chambers (1979). In the second of their studies, subjects were given six of Newell and Simon's problems under conditions approximating those of the original experiment. Half of these subjects saw the rules and problems in the form of uninterpreted symbols, as in the original study. In fact, Reed et al. went further in disguising the connectives, using novel symbols (. for AND, t for OR, and # for IF ... THEN) to eliminate the possibility that subjects would recognize the symbols from math courses they had taken. The other half of the subjects saw the same rules and problems, but were told about the meaning of the connectives. They knew, in other words, that t meant OR and so on. Thus, if there was any benefit to understanding the natural-language counterparts for these strings of symbols, the subjects in the second group should have turned in better performances than those in the first group.

The results of this experiment were quite surprising, however, since they indicated that, if anything, knowing the meaning of the symbols hurt rather than helped the subjects. Overall, the two groups of subjects were equally able to solve the problems: 74% of the subjects in the Meaning group and 75% in the No Meaning group were successful. Among the successful subjects, however, those in the No Meaning group reached their solutions faster. The average solution time for the No Meaning group was 410 seconds, versus 472 seconds for the Meaning group, a difference of about a minute.
An analysis of the derivations showed that the No Meaning subjects applied fewer irrelevant rules in the course of their problem solving. In this respect, then, Newell and Simon's conclusions appear to be more general than might be expected: Merely knowing the interpretation of the symbols did not improve the subjects' strategies. It is still possible to object, though, that the Meaning group's relatively poor performance could have been due to the unnaturalness of the rules. To take an extreme case, suppose that to subjects in the Meaning group the rules made no sense at all as ways of drawing inferences from sentences containing and, or, if, and not. For these subjects, the rules would have been just as arbitrary with respect to English connectives as with respect to the meaningless symbols. Forcing the subjects to apply intuitively inappropriate rules to these familiar connectives might even have confused them.

Reed et al. provide some data on the latter point, too. In a separate experiment, subjects were given arguments formed from each of Newell and Simon's rules. For example, rule 2 above would have appeared as the two arguments shown in (2').

(2') a. -A t B
        A # B
     b. A # B
        -A t B

Like the subjects in the Meaning group, these subjects were informed about the nature of the connectives. The subjects then decided whether the conclusions followed from the premises and rated their confidence on a scale from -3 (for "very certain that the conclusion doesn't follow") to +3 (for "very certain the conclusion follows"). As a control, invalid arguments were included among those that the subjects rated. Reed et al. found that, although most of the arguments corresponding to Newell and Simon's rules received positive confidence, there were three arguments whose average ratings were negative. The two arguments in (2') were among these, consistent with PSYCOP's rejection of this conditional transformation principle. This rule was also the one subjects were most likely to misapply in the derivation task that we considered earlier. The third negatively rated item had -(-A . -B) (i.e., NOT (NOT A AND NOT B)) as premise and A t B (i.e., A OR B) as conclusion, an argument that PSYCOP also fails to deduce with its current stock of inference rules. So there seems to be warrant for the suggestion that inferences made with some of Newell and Simon's rules might have seemed quite foreign to the subjects and might have promoted problem-solving strategies that aren't typically used in deduction.

Newell and Simon's original experiment was, of course, never intended as a study of psychologically natural inference making.
Their focus was on symbol transformation strategies, and for this purpose there is a tremendous advantage in controlling the rules that subjects use. Problem-solving strategies are much easier to spot in thinking-aloud (or any other) data when you know the elementary operations that subjects have available than when you have to infer simultaneously both the operations and the strategies. But something like this more complex process seems to be necessary in domains such as this, since there appears to be no simpler method to guide us toward the right choice of psychological primitives.
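It may help to separate logical validity from psychological naturalness in the items just discussed. The sketch below is a brute-force truth-table check, assuming the classical (material) reading of the connectives; the helper names `implies` and `valid` are illustrative and do not come from Newell and Simon or from Reed et al. On this reading, problem (1), both directions of rule (2), and the third negatively rated item are all valid, so the subjects' negative ratings reflect intuition rather than a logical error in the rules.

```python
from itertools import product


def implies(a, b):
    # Material conditional: false only when a is true and b is false.
    return not (a and not b)


def valid(premise, conclusion, n_vars):
    """True if the conclusion holds in every row where the premise holds."""
    rows = product([False, True], repeat=n_vars)
    return all(conclusion(*row) for row in rows if premise(*row))


# Problem (1): (IF R THEN NOT P) AND (IF NOT R THEN Q), therefore NOT (NOT Q AND P)
problem_1 = valid(
    lambda p, q, r: implies(r, not p) and implies(not r, q),
    lambda p, q, r: not ((not q) and p),
    3,
)

# Rule (2), read in each direction: IF A THEN B versus (NOT A) OR B
rule_2_forward = valid(lambda a, b: implies(a, b), lambda a, b: (not a) or b, 2)
rule_2_backward = valid(lambda a, b: (not a) or b, lambda a, b: implies(a, b), 2)

# Third negatively rated item: NOT (NOT A AND NOT B), therefore A OR B
third_item = valid(lambda a, b: not ((not a) and (not b)), lambda a, b: a or b, 2)

# An invalid control of our own invention: A OR B, therefore A AND B
control = valid(lambda a, b: a or b, lambda a, b: a and b, 2)
```

Because `implies` is defined directly as the material conditional, the two checks on rule (2) amount to confirming that the two notations have the same truth table; the more informative checks are those for problem (1) and the third item.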



Osherson's Models of Deduction

If we abandon Newell and Simon's experimental task, we face the methodological problem of studying deduction under conditions where we have no prior assurance about the psychological reality of either the inference rules or the manner in which they are applied. There may be several ways to proceed in these circumstances. One possibility is to construct a model including both kinds of assumptions, draw predictions from the model, and hope that the predictions will be sufficiently accurate to lend credibility to the whole package. The first person to adopt this approach in studying deduction was Daniel Osherson, who used it in his work Logical Abilities in Children (1974b, 1975, 1976).

The details of Osherson's theory change slightly from volume to volume, but in essence he proposes that people have a stock of mental inference rules that apply in fixed order to the premises of an argument. Thus, unlike Newell and Simon, Osherson assumes that people have internal natural-deduction rules, not merely that they can use externally presented rules. One of Osherson's (1975, table 11.1) rules for sentential reasoning is the DeMorgan rule, shown here in (3), with the provision ("helping condition") that neither P OR Q nor Q OR P appear in the conclusion of the argument.

(3) NOT (P OR Q)
    (NOT P) AND (NOT Q)

Another example is a rule for contraposing conditionals, shown here as (4).

(4) IF P THEN Q
    IF NOT Q THEN NOT P

Here the helping condition stipulates that the rule applies only if a subformula of P and a subformula of Q appear in the conclusion of the argument as a whole; moreover, these subformulas must be negated in the conclusion if they are unnegated in the sentence to which the rule applied (or they must be unnegated in the conclusion if negated in the original sentence).

According to this model, subjects mentally check the rules in a fixed order until they find one that is relevant to the premise.
(Osherson's model operates only with single-premise arguments.) This first relevant rule is then applied to produce a single new sentence. Subjects then begin again



from the top of their internal list of rules, scanning to see if any apply to the new sentence. This process continues until either the conclusion is produced or no rule is relevant. In the first case the subjects will declare that the argument follows, in the second that it doesn't follow. Thus, rules like (3) and (4) should be understood to mean that if the last sentence in the proof is of the form shown at the top and if the helping condition holds, then the next sentence in the proof should be the one on the bottom.

Osherson's rules all operate in a forward direction, but the helping conditions check the conclusion as well as the last assertion. This sensitivity to the conclusion keeps the model's derivations from drifting in the wrong direction (see chapter 3). However, unlike GPS and PSYCOP, Osherson's model has no provision for subgoals (other than the conclusion itself). This means that it is impossible for the system to return to a previous point in the proof if it has applied a rule that branches away from the conclusion. Osherson's theory also has no place for suppositions; each sentence of a derivation either is the premise or follows from it by means of an inference rule.

The rigid structure of this procedure means that Osherson's model sometimes misses inferences that seem no more complex than ones it handles. For example, the model easily copes with the argument from IF P THEN Q to (IF P THEN (Q OR R)) OR (S AND T), but it fails with the formally similar one from IF P THEN Q to (IF P THEN (Q OR R)) OR (P AND T) (Rips 1983). To prove the first argument, the system transforms the premise to IF P THEN (Q OR R) and then transforms this last sentence into the conclusion. It is possible to prove the second argument in precisely the same way, but the model overlooks it. Because of the presence of P AND T in the conclusion, the model rewrites the premise as IF (P AND T) THEN Q, which throws it off the track of the correct proof.
This mistake is a fatal one, since there is no possibility of returning to a previous choice point.

To assess this model, Osherson asked his subjects, who were in upper-level grade school, junior high, or high school, to evaluate relatively simple arguments expressed in English. A typical argument (from Osherson 1975, table 15.2) reads as in (5).

(5) If Peter is ice skating, then Martha either goes to a movie or she visits a museum.
    If Martha does not go to a movie and she does not visit a museum, then Peter is not ice skating.
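The control structure described above (scan the rules in a fixed order, apply the first relevant one, start again from the top, never backtrack) can be sketched in a few lines. This is only an illustration under loud assumptions, not Osherson's model: it contains just rules (3) and (4), it encodes formulas as nested tuples, it lets a rule rewrite a subformula as well as a whole sentence, it simplifies the helping condition of rule (4), and it adds a step limit of our own. All function names are hypothetical.

```python
def subformulas(f):
    """Yield f and every formula nested inside it."""
    yield f
    if isinstance(f, tuple):
        for part in f[1:]:
            yield from subformulas(part)


def flip(f):
    """Strip one negation if present; otherwise add one."""
    if isinstance(f, tuple) and f[0] == "not":
        return f[1]
    return ("not", f)


def de_morgan(f, conclusion):
    # Rule (3): NOT (P OR Q) => (NOT P) AND (NOT Q).
    # Helping condition: neither P OR Q nor Q OR P appears in the conclusion.
    if isinstance(f, tuple) and f[0] == "not":
        inner = f[1]
        if isinstance(inner, tuple) and inner[0] == "or":
            _, p, q = inner
            seen = set(subformulas(conclusion))
            if ("or", p, q) not in seen and ("or", q, p) not in seen:
                return ("and", ("not", p), ("not", q))
    return None


def contrapose(f, conclusion):
    # Rule (4): IF P THEN Q => IF NOT Q THEN NOT P.
    # Simplified helping condition (an assumption): some subformula of P and
    # some subformula of Q occur in the conclusion with flipped negation.
    if isinstance(f, tuple) and f[0] == "if":
        _, p, q = f
        seen = set(subformulas(conclusion))
        if any(flip(s) in seen for s in subformulas(p)) and any(
            flip(s) in seen for s in subformulas(q)
        ):
            return ("if", ("not", q), ("not", p))
    return None


def rewrite_once(f, rule, conclusion):
    """Apply the rule at the outermost, leftmost position where it fits."""
    new = rule(f, conclusion)
    if new is not None:
        return new
    if isinstance(f, tuple):
        for i in range(1, len(f)):
            inner = rewrite_once(f[i], rule, conclusion)
            if inner is not None:
                return f[:i] + (inner,) + f[i + 1:]
    return None


RULES = [de_morgan, contrapose]  # fixed scanning order (illustrative)


def follows(premise, conclusion, max_steps=20):
    current = premise
    for _ in range(max_steps):
        if current == conclusion:
            return True
        for rule in RULES:
            new = rewrite_once(current, rule, conclusion)
            if new is not None:
                current = new  # commit to the first relevant rule; no backtracking
                break
        else:
            return False  # no rule is relevant: "doesn't follow"
    return current == conclusion


# Argument (5), with P = Peter skates, Q = movie, R = museum.
premise_5 = ("if", "P", ("or", "Q", "R"))
conclusion_5 = ("if", ("and", ("not", "Q"), ("not", "R")), ("not", "P"))
result_5 = follows(premise_5, conclusion_5)
result_converse = follows(("if", "P", "Q"), ("if", "Q", "P"))
```

On argument (5) the sketch first contraposes the premise by rule (4) and then applies rule (3) to the new antecedent, which matches the two-rule derivation attributed to the theory. Because each rule application is committed to at once, a wrong choice cannot be undone, which is the rigidity the text describes.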










The stimuli included multistep arguments such as (5), along with one-step arguments that corresponded to the individual deduction rules. For example, Osherson's rules (3) and (4) were tested by the one-step arguments (3') and (4'), respectively.

(3') It is not true that Bill mows the lawn or rakes the leaves.
     Bill does not mow the lawn and Bill does not rake the leaves.

(4') If Martha is the apartment manager, then the tenants are ...
     If the tenants are not ..., then Martha is not the apartment manager.

As we will see, these one-step arguments are crucial in testing the theory. For each argument, the subjects indicated either that the conclusion followed from the premises, that it didn't follow, or that they couldn't decide. They then rated the difficulty of all the arguments they had said were correct.

In order to evaluate his model, Osherson makes use of two predictions that he calls the "inventory" and the "additivity" requirements. The inventory prediction rests on the idea that a subject's correct acceptance of one of the multistep arguments must be due to his possessing all the rules needed to derive it. For this reason, we would expect the same subject to accept all the one-step arguments based on those rules. Conversely, if a subject incorrectly rejects a multistep argument, then he must be missing one or more of the rules that figure in its proof. So we should expect the subject to reject one or more of the one-step arguments that correspond to these rules. For instance, since a proof of argument (5) involves rules (3) and (4) according to Osherson's theory, subjects who judge (5) correct should also say that the two single-step arguments (3') and (4') are correct. But subjects who judge (5) to be incorrect should say that at least one of (3') and (4') is also incorrect.

Osherson's additivity requirement also makes use of the relationship between the multistep and the single-step arguments. It seems reasonable to suppose that the difficulty of deciding whether an argument is correct



should be related to the difficulty connected with the rules in its proof. The difficulty of (5), whose proof involves rules (3) and (4), should depend on the difficulty of the latter rules. So the judged difficulty of (5) should vary positively with that of the one-step arguments (3') and (4'). To test this relationship, Osherson simply correlated the difficulty rating of the multistep arguments with the sum of the difficulty ratings of the relevant single-step arguments.

At least at first sight, the inventory and additivity predictions seem quite reasonable; but the experiments Osherson conducted offered somewhat limited support for them. In his third volume (Osherson 1975), which is concerned with sentential reasoning, the percentage of inventory predictions that are true of the data ranges from 60 to 90 across experiments, and the additivity correlations vary between 0.57 and 0.83. In volume 4 (Osherson 1976), which is devoted to reasoning with quantifiers and modals, the inventory predictions vary from 61% to 77% correct, and the additivity correlations vary from 0.29 to 0.90.

In an epilogue to the final volume, Osherson himself regards the additivity and inventory predictions as "bad ideas." The basis of his criticism is that the predictions depend on the idea that evaluating a rule embodied in a single-step argument is equivalent to applying that rule in the context of a larger proof. But this needn't be the case. It may be easier for subjects to recognize a rule as applicable, for instance, if that rule applies to the premise of a problem than if it applies to an intermediate line in the proof. After all, the premise is actually written out for the subjects, whereas the intermediate lines are present, according to the theory, only in the subjects' memory. If that is the case, an inventory prediction may easily fail for the wrong reason.
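For readers who want the two requirements stated operationally, here is a small sketch. The function names and the ratings are hypothetical, invented purely for illustration; none of the numbers come from Osherson's data. The inventory check treats one subject and one multistep argument, and the additivity check is an ordinary Pearson correlation between multistep ratings and summed single-step ratings.

```python
from math import sqrt


def inventory_holds(accepts_multistep, accepts_single_steps):
    """Inventory prediction for one subject and one multistep argument.

    Accepting the multistep argument predicts accepting every related
    one-step argument; rejecting it predicts rejecting at least one.
    """
    if accepts_multistep:
        return all(accepts_single_steps)
    return not all(accepts_single_steps)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical difficulty ratings, invented for this sketch only.
multistep_difficulty = [3.0, 5.0, 4.0, 6.0]
single_step_difficulties = [[1.0, 2.0], [2.0, 3.5], [2.0, 1.5], [3.0, 3.5]]

# Additivity: correlate each multistep rating with the sum of the ratings
# of the one-step arguments whose rules appear in its proof.
summed = [sum(parts) for parts in single_step_difficulties]
additivity_r = pearson(multistep_difficulty, summed)
```

A subject who accepts (5) but rejects (4') would make `inventory_holds(True, [True, False])` come out false, which is the kind of violation the criticism in this passage is concerned with.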
Subjects might be very good at recognizing that rules (3) and (4) are appropriate when each of those rules is embodied in an argument of its own. However, the same subjects may fail to see that rule (3) is applicable in a multistep argument in which rule (4) also appears. If they reject the multistep argument on this basis, then they will have violated the inventory prediction.

This suggests that the inventory and additivity requirements may have underestimated the worth of Osherson's model, but there is also a possibility that the predictions give us too rosy a picture of the model's success. Suppose that part of the difficulty of a reasoning problem is simply due to parsing the premises and the conclusion. In fact, when subjects think aloud as they solve one of Osherson's problems, they often repeat the



premises over and over as if trying to comprehend them (Rips 1983). Of course, it is difficult to tell how much of this is due to difficulty in parsing the sentences and how much is part of the reasoning process itself; but assume that there is at least some component due to parsing alone. But notice that parsing difficulty for a multistep argument will probably be shared by some of the corresponding single-step arguments. In the earlier example, the premise of the multistep argument (5) has much the same syntactic form as the premise of the single-step argument (4'), since both are conditionals. Thus, if subjects' difficulty ratings in part reflect the difficulty of parsing, this will tend to inflate the additivity correlations for reasons that have little to do with inference.

There are some grounds, then, for thinking that Osherson's tests were too stringent, and others for thinking they were too lax. Unfortunately, there is no way to determine the relative seriousness of these problems, and this leaves us uncertain about the strength of Osherson's models. Nevertheless, Osherson's general ideas of mental rules and mental proof were groundbreaking, and they were certainly the main inspiration for the project reported in part II and for the work of Braine and his colleagues on similar natural-deduction systems.

Natural Logic According to Braine, Reiser, and Rumain

Braine's theory is an attempt to specify the natural-deduction schemas that underlie sentential reasoning; thus, it falls squarely within the tradition of mental rules. The schemas themselves exhibit a few changes from one presentation to the next (e.g., Braine 1978, 1990; Braine and O'Brien 1991; Braine and Rumain 1983); but we can concentrate here on the theory as Braine, Reiser, and Rumain (1984) set it out, since this appears to be its most complete statement. The rules of Braine et al.
include versions of what I have called AND Introduction, AND Elimination, Double Negation Elimination, Disjunctive Modus Ponens, Disjunctive Syllogism, Conjunctive Syllogism, Dilemma, IF Elimination, IF Introduction, and NOT Introduction. (See tables 4.1 and 4.2.) Braine et al. generalize some of these rules to operate with conjunctions and disjunctions of more than two sentences; for example, the AND Introduction rule operates on sentences P1, P2, ..., Pn to yield the sentence P1 AND P2 AND ... AND Pn. The multi-place connectives, however, do not play a role in the predictions for their experiments. There are also three technical rules for introducing suppositions and



recognizing contradictions. And finally, there are three rules that do not play a part in the theory of chapter 4. One of these allows sentences of the form P OR NOT P to appear at any point in a proof. The other two are a rule for distributing AND over OR and a variation on Dilemma, as shown here in (6) and (7).

(6) a. P AND (Q1 OR ... OR Qn)
       (P AND Q1) OR ... OR (P AND Qn)
    b. (P AND Q1) OR ... OR (P AND Qn)
       P AND (Q1 OR ... OR Qn)

(7) P1 OR ... OR Pn
    IF P1 THEN Q1
    ...
    IF Pn THEN Qn
    Q1 OR ... OR Qn

PSYCOP handles inferences like (7) with a combination of OR Elimination and IF Elimination, and it handles (6) by means of AND Introduction, AND Elimination, OR Introduction, and OR Elimination. Rules (6b) and (7) could be incorporated in PSYCOP as forward rules, which is roughly how they function in the theory of Braine et al. However, (6a) would create difficulties for PSYCOP as a forward rule, and we will see that it is also a trouble spot for Braine et al.

The theory proposes that people apply the deduction rules in a two-part procedure (Braine et al. 1984, table III). The first (direct) part applies a subset of the rules in a forward direction to the premises of an argument. (If the conclusion of the argument happens to be a conditional sentence, then the process treats the antecedent of the conditional as a premise and its consequent as the conclusion to be proved.) The rules included in the direct part are AND Elimination, Double Negation Elimination, IF Elimination, Disjunctive Modus Ponens, Conjunctive Syllogism, Disjunctive Syllogism, and Dilemma, plus (6) and (7). This step can also use AND Introduction if it enables one of the other direct rules. If the conclusion of the argument is among the sentences produced in this way, then the process stops with a "true" decision. If the conclusion of the argument contradicts one of these sentences, then the process stops with a "false" decision. Otherwise, the direct process is repeated, much as in the British Museum algorithm (chapter 3), until a decision is reached or it produces no new



sentences. During a repetition, the process cannot apply a rule to the same sentences it has used on a previous round, and it cannot produce a sentence that it has already deduced. If the direct process fails to reach a "true" or a "false" decision, then an indirect process begins, implementing strategies (such as NOT Introduction) that call for suppositions. Braine et al. believe that the direct process is common to all or most subjects, whereas the indirect process may require more problem solving and produce some individual differences. Although Braine et al. give examples of indirect strategies, they do not provide a complete description.

Even if we confine our examination to the direct process, however, the model of Braine et al. can't be quite correct as currently stated, since it leads to infinite loops. Consider, for example, any argument with premises that include P and Q OR R. Then the derivation shown in (8) is consistent with the direct schemas of Braine et al.


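(8) a. P                                                  Premise
    b. Q OR R                                             Premise
    c. P AND (Q OR R)                                     AND Intro. (from a, b)
    d. (P AND Q) OR (P AND R)                             Rule (6) (from c)
    e. (P AND (Q OR R)) AND ((P AND Q) OR (P AND R))      AND Intro. (from c, d)
    f. ((P AND (Q OR R)) AND (P AND Q)) OR
       ((P AND (Q OR R)) AND (P AND R))                   Rule (6) (from e)
    ...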
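The runaway character of derivation (8) can be illustrated with a short sketch. The code below is only an illustration under stated assumptions: the tuple encoding of sentences and the function names (show, distribute, runaway_derivation) are hypothetical and are not part of the Braine et al. procedure. It alternates AND Introduction with the binary case of rule (6a), starting from the premises P and Q OR R, and confirms at each step that the new sentence has not appeared earlier in the derivation.

```python
# Sentences are atoms (strings) or nested tuples (operator, left, right).
# This encoding and all function names are illustrative assumptions.

def show(sentence):
    """Render a sentence with full parenthesization."""
    if isinstance(sentence, str):
        return sentence
    op, left, right = sentence
    return f"({show(left)} {op} {show(right)})"

def distribute(sentence):
    """Rule (6a), binary case: X AND (Y OR Z) yields (X AND Y) OR (X AND Z)."""
    op, x, disjunction = sentence
    inner_op, y, z = disjunction
    assert op == "AND" and inner_op == "OR"
    return ("OR", ("AND", x, y), ("AND", x, z))

def runaway_derivation(cycles):
    """Extend derivation (8) by `cycles` rounds of AND Intro. followed by rule (6a)."""
    lines = ["P", ("OR", "Q", "R")]            # premises (8a) and (8b)
    left, right = lines
    for _ in range(cycles):
        conjunction = ("AND", left, right)      # AND Intro., permitted because it enables (6a)
        disjunction = distribute(conjunction)   # rule (6a)
        # The restriction against re-deriving a sentence never applies:
        # both sentences are new to the derivation.
        assert conjunction not in lines and disjunction not in lines
        lines += [conjunction, disjunction]
        left, right = conjunction, disjunction
    return lines

for label, sentence in zip("abcdef", runaway_derivation(2)):
    print(label, show(sentence))
```

Because every cycle produces two sentences that are strictly longer than any earlier line, increasing the number of cycles never triggers the "already deduced" restriction, which is the point of the example.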

As was noted above, the direct procedure limits the use of AND Introduction, but permits it in situations (such as (8c) and (8e)) where it is necessary for applying a further rule (in this case, the distribution rule (6a)). Furthermore, neither of these rules duplicates sentences in the derivation, nor do they apply more than once to the same sentence. Hence, there appears to be no way to block such an infinite derivation without either altering the direct reasoning process or excising one or more of the rules. In general, it appears that the direct reasoning procedure (as presented by Braine et al. in table III) exerts too little control on inference.

The main concern of Braine et al., however, is to provide evidence for the proposed inference rules, rather than the reasoning procedure. Toward



this end, they report several experiments that use the sum of rated difficulties of individual rules to predict the difficulties of arguments they can derive. These predictions (a modified version of Osherson's additivity requirement) are applied to response times, errors, and ratings over sets of problems whose proofs require from one to four steps. The model produced a correlation of 0.73 between the error rates and the sums of the weighted rules. In fitting response times and ratings of difficulty, Braine et al. found a fairly strong correlation with argument length (in number of words). After partialing out the length effect, the sum-of-rules measure correlated between 0.83 and 0.91 with ratings and 0.41 with response times. These correlations are vulnerable to the same uncertainties as Osherson's additivity idea, but they provide some support for the proposed rules over a fairly large database of arguments.

It is clear that PSYCOP's predictions for these data would be fairly similar to those of Braine et al., owing to the overlap in the systems' rules. This is no accident, since I canvassed the rules of Braine et al. in designing PSYCOP, as mentioned in chapter 4.² The distinctive feature of the theory of Braine et al. is the commitment to the rules in (6) and (7). The distinctive features of PSYCOP are the residual rules in tables 4.1, 6.2, and 6.3 (e.g., the DeMorgan rules), the use of subgoals to control inference, and the ability to perform inferences based on variables and names. Further experimental evidence might be helpful in deciding about the status of particular rules; however, global comparisons between the models are extremely difficult because of the lack of details in Braine et al.'s description of their indirect reasoning procedure.

Summary

PSYCOP has benefited from the lessons of the earlier efforts just reviewed.
As positive lessons from this prior work, PSYCOP inherited many of its inference rules, plus a methodological imperative to test the model over a variety of inference forms. The negative lessons were the need to improve the control structure and to handle variables and names. PSYCOP's ancestors were simply too inflexible in the manner in which they wielded their rules (a common failing among ancestors generally). The symptoms of inflexibility were a tendency to produce assertions that are obviously irrelevant to the proof or the contrary tendency to ignore obviously relevant ones. Lack of facility with variables also limited the types of inferences that these earlier models could handle. Osherson (1976) extended his



theory to one-place predicates and names, and Braine and Rumain (1983) suggested rules for certain operations on sets. Neither theory, however, handles multiple-place predicates or individual variables in a general way. I tried to make the case for variables in the last three chapters, and it may be that the pursuit of methods for handling variables will be the principal line of progress for future work in this tradition.

There remain a few substantive differences in the sets of rules that the natural-deduction models adopt. But the overlap in rule choice is more impressive than the disparities. Moreover, emphasizing rule distinctions is likely to seem a kind of narcissism of small differences when compared to the sorts of alternative theories that will be discussed in chapter 10.

The theories that we looked at in the last section are the only general rule-based accounts of deduction and are by far PSYCOP's closest relatives. Investigators have proposed more specialized rule models, however - mainly to explain performance on Wason's selection task. I discussed the selection task in connection with sentential reasoning in chapter 5, and I offered a tentative explanation within the PSYCOP framework for the typically poor performance on this problem. Recall that in this task subjects receive a conditional sentence, such as If there's a vowel on one side of the card, there's an even number on the other, and must decide which of a set of exemplars they must check (e.g., cards showing E, K, 4, or 7) in order to determine whether the conditional is true or false of the set as a whole. As was also noted, it is possible to improve performance on the task markedly by rephrasing the problem while retaining the IF ... THEN format. A clear example of such improvement occurs when the conditional is If a person is drinking beer, then the person must be over 19 and the instances to be checked are cards representing people drinking beer, people drinking coke, people 16 years of age, and people 22 years of age (Griggs and Cox 1982). In this guise, the conditional becomes a clear-cut regulation, and subjects usually spot the cards indicating possible rule violators (i.e., the beer drinker and the 16-year-old). Investigators have proposed new rule-based models in order to explain the benefit that this wording conveys.

A main difficulty in reviewing this research is keeping one's perspective. The effects of content in the selection task have produced several ongoing



controversies (e.g., Cheng and Holyoak (1985, 1989) vs. Cosmides (1989); Cheng and Holyoak (1985) vs. Jackson and Griggs (1990)), and these controversies have the effect of drawing attention away from questions about deductive reasoning to questions about the details of the selection task. This tendency seems to be amplified by properties of the selection task itself: In discussing some of their own results, Jackson and Griggs (1990, p. 371) remark that their experiments "mirror those observed throughout the past 20 years of research on the selection task in that a result can be changed dramatically by only a subtle change in problem presentation." This raises obvious questions about the suitability of the selection task as an object of intense scientific theorizing.

Furthermore, subjects' lackluster performance on the usual version of the task has tempted investigators, especially in other areas of cognitive science, to conclude that people are generally unable to reason to deductively correct conclusions. This neglects the fact that subjects solve many other deduction problems with near-perfect accuracy. Braine et al. (1984), for example, found less than 3% errors for one-step arguments involving AND Introduction, AND Elimination, Double Negation Elimination, IF Elimination, Disjunctive Modus Ponens, Dilemma, and Disjunctive Syllogism. We also noted minimal errors for all but one of the generalization problems in the response-time study of chapter 7 (see table 7.3). In looking at theories of content effects in the selection task, we must be careful to ask how likely it is that they will extend to other findings in the reasoning domain.

Schemas According to Cheng and Holyoak

Prompted by results like those from the selection problem on drinking, Cheng and Holyoak (1985, p.
395) proposed that people typically make inferences based on "a set of generalized, context-sensitive rules which, unlike purely syntactic rules, are defined in terms of classes of goals (such as taking desirable actions or making predictions about possible future events) and relationships to these goals (such as cause and effect or precondition and allowable action)." Cheng and Holyoak refer to these rule sets as "pragmatic reasoning schemas," where "pragmatic" refers to the rules' usefulness or goal-relatedness, rather than to the Gricean notion of pragmatics that we encountered in previous chapters. (See also Holland et al. 1986.) As an example of a pragmatic reasoning schema, Cheng and Holyoak offer the four production rules shown here in (9), which constitute the basis of a schema for dealing with permission.



(9) a. If the action is to be taken, then the precondition must be satisfied.
    b. If the action is not to be taken, then the precondition need not be satisfied.
    c. If the precondition is satisfied, then the action may be taken.
    d. If the precondition is not satisfied, then the action must not be taken.

To explain the selection task, Cheng and Holyoak assume that the usual versions (e.g., the one based on arbitrary pairings of letters and numbers) don't reliably evoke pragmatic schemas, and so performance is poor. However, the drinking-age problem (and similarly modified problems) does evoke the permission schema: Because of the match between (9a) and the conditional rule in the selection instructions (If a person is drinking beer ...), the pragmatic schema is activated, and the schema then makes the rest of the items in (9) available to the subjects. These rules point subjects to the need to check the cards representing beer drinking (the action is taken, so rule (9a) applies) and being underage (the precondition is not satisfied, so rule (9d) applies). The other two cards - representing the cola drinker and the 22-year-old - are covered by the antecedents of rules (9b) and (9c). But since the consequents of these rules state only that something may or may not be the case, the "cola" and "22" cards need not be checked. This choice is the conventionally correct one, so the permission schema yields the correct answer. Cheng and Holyoak suggest, however, that schemas based on other pragmatic conditions (e.g., schemas for causation or covariation) would not necessarily have the same facilitating effect. Thus, good performance on the drinking-age problem is due to the fact that the schema it triggers - the permission schema - happens to have testing rules that coincide with those of the material conditional (i.e., the IF ... THEN connective of CPL).
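The way the rules in (9) pick out cards, and the coincidence with the material conditional, can be made concrete with a small sketch. Everything in the code is an illustrative assumption rather than Cheng and Holyoak's own formulation: the dictionary encoding of (9a)-(9d), the card labels, and the function names are hypothetical. The first function selects the cards whose matching schema rule has a binding consequent ("must" or "must not"); the second selects the cards whose hidden side could falsify IF action THEN precondition read as a material conditional.

```python
# Each card shows one fact about a person; the other fact is hidden.
# The encoding, card labels, and function names are illustrative assumptions.
PERMISSION_SCHEMA = {
    "action taken":               ("must",     "precondition satisfied"),  # (9a)
    "action not taken":           ("need not", "precondition satisfied"),  # (9b)
    "precondition satisfied":     ("may",      "action taken"),            # (9c)
    "precondition not satisfied": ("must not", "action taken"),            # (9d)
}

FACTS = {
    "action taken":               ("action", True),
    "action not taken":           ("action", False),
    "precondition satisfied":     ("precondition", True),
    "precondition not satisfied": ("precondition", False),
}

def schema_choices(cards):
    """Cards whose matching schema rule has a binding ("must"/"must not") consequent."""
    return [card for card, shown in cards.items()
            if PERMISSION_SCHEMA[shown][0] in ("must", "must not")]

def material_conditional_choices(cards):
    """Cards whose hidden side could make IF action THEN precondition false."""
    chosen = []
    for card, shown in cards.items():
        variable, value = FACTS[shown]
        hidden_variable = "precondition" if variable == "action" else "action"
        for hidden_value in (True, False):
            world = {variable: value, hidden_variable: hidden_value}
            if world["action"] and not world["precondition"]:
                chosen.append(card)
                break
    return chosen

drinking_age = {
    "drinking beer": "action taken",
    "drinking coke": "action not taken",
    "22 years old":  "precondition satisfied",
    "16 years old":  "precondition not satisfied",
}

print(schema_choices(drinking_age))
print(material_conditional_choices(drinking_age))
```

On the drinking-age cards both functions select the beer drinker and the 16-year-old, which is the coincidence the text describes; a schema whose rules carried different modal forces would not be guaranteed to agree with the material conditional.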
In discussing the pragmatic schemas, it is helpful to distinguish two facets of their operations, especially since the notion of an "inference rule" seems to apply in different ways within the theory. Cheng and Holyoak refer to the items in (9) as production rules, and they believe these items are directly responsible for the selection-task choices. The pragmatic-schema theory, however, also presupposes a mechanism that supplies (9b)-(9d) whenever (9a) matches the conditional in the task, and this



relationship constitutes an inference of its own. For example, the theory assumes that people infer If the precondition is not satisfied, then the action must not be taken (= (9d)) from If the action is to be taken, then the precondition must be satisfied (= (9a)). In comparing the pragmatic-schema idea to other theories, we will usually be concerned with the latter schema-based inferences.

It might also be helpful to repeat a warning from chapter 2: Which answer is deductively correct depends on the analysis of the structure of the problem. Early research on the selection task assumed that the correct answer is given by the definition of the material conditional (chapter 6) applied to the selection rule. But part of what is at issue in current work is whether this analysis is appropriate for conditionals like the drinking regulation. I will use "correct answer," in regard to the selection task, as an abbreviation for the choice of the cards corresponding to the antecedent and the negated consequent, since that is how it is used in the literature. However, this is not meant to create prejudice against rival theories of the nature of the problem. Ditto for "better performance," "higher scores," and the like. I will take up the question of what constitutes a true "error" in detail in the final chapter.

To bolster their explanation, Cheng and Holyoak (1985; see also Cheng et al. 1986) report experiments in which permission contexts enhance selection performance. In one experiment, for example, subjects were told they were immigration officials checking a form in order to make sure the following rule applied: If the form says 'ENTERING' on one side, then the other side includes cholera among the list of diseases. Giving subjects a rationale for this rule (that the rule was to make sure passengers had been inoculated against the listed disease) boosted their performance from about 60% correct to 90% correct on the corresponding selection task.
The claim is that the jump in scores is due to the rationale suggesting the permission schema. A second experiment showed that the rule If one is to take action 'A', one must first satisfy precondition 'P' also produced an advantage over the standard selection task (61% vs. 19% correct) despite the absence of concrete descriptions for the action and the precondition.

The Status of Pragmatic Schemas

There is no doubt that contexts like the drinking-age and cholera problems promote much better performance in the selection task, but how convincing is the evidence for pragmatic schemas as Cheng and Holyoak define them? In discussing these content effects in chapter 5, I noted that



Griggs (1983) takes the position that the effects are due to the content's reminding subjects of specific incidents in which rule violations correspond to the right choice of cards. Subjects may not have experience with cholera inoculations, of course; however, they may well have had analogous experiences (e.g., other sorts of vaccinations required for entering a country, for going to school, or for going to summer camp), and these experiences could yield the correct answer (Pollard 1990). Cheng and Holyoak intended their abstract version of the permission rule (If one is to take action 'A', one must first satisfy precondition 'P') to address this concern. The theory is that this conditional is "totally devoid of concrete content" (Cheng and Holyoak 1985, p. 409) and should therefore not trigger retrieval of specific experiences. But it is open to a critic to claim that "satisfying preconditions" and "taking actions" have enough residual content to prompt retrieval of specific episodes (Cosmides 1989).³ Since the schemas themselves are supposed to apply on a "context-sensitive" basis (see the earlier quotation from Cheng and Holyoak 1985), it would seem there must be enough context to evoke the permission schema even in the abstract permission rule. But if this context is sufficient to evoke the schema, why shouldn't it also be able to evoke specific experiences involving preconditions and actions?⁴

Despite this uncertainty about the source of content effects in the selection task, there is no reason to doubt that people sometimes make inferences about permissions, obligations, causes, covariation, and other important relationships. It seems very likely, for example, that people can reason from It is obligatory that P given Q to It is permissible that P given Q, and from It is obligatory that (P AND Q) given R to It is obligatory that P given R (Lewis 1974). We can represent these examples as in (10).

(10) a. OBLIGATORY(P | Q)
        ------------------
        PERMISSIBLE(P | Q)

     b. OBLIGATORY(P AND Q | R)
        -----------------------
        OBLIGATORY(P | R)

Specific remembered experiences seem less helpful in these cases than in the selection task. In the latter, particular experiences can serve as a cue about cases that could potentially violate a rule. It is not clear, though, how a memory for a circumstance R in which P AND Q is obligatory would show that P must be obligatory in that same circumstance. Intuitively, the truth about the memory seems to be warranted by the inference, not vice versa.

For these reasons, it seems to me that Cheng and Holyoak are probably right about the existence of rules for permission and obligation, even if these rules aren't responsible for selection-task performance. What is in question, however, is the nature of these rules (Rips 1990a). It is striking that the very concepts that are supposed to constitute pragmatic schemas - permission, obligation, and causality - are the targets of well-studied systems of modal logic (e.g., see Lewis (1973b) on causality and Føllesdal and Hilpinen (1971), Lewis (1974), and von Wright (1971) on permission and obligation). Fitch (1966) gives a natural-deduction system for obligation, and Osherson (1976, chapter 11) proposes mental inference rules for permission and obligation that operate in much the same way as his rules for monadic quantifiers. This suggests that the schema-based inferences might be mental deduction rules defined over modal operators such as PERMISSIBLE and OBLIGATORY.

Cheng and Holyoak don't discuss either the logical or the psychological deduction systems for permission and obligation - deontic systems, as they are usually called. But they do give some characteristics of pragmatic schemas that they believe distinguish them from deduction rules. One point they mention (1985, p. 396) is that the schema in (9) contains "no context-free symbols such as p and q [as in rules of logic]." Instead, they continue, "the inference patterns include as components the concepts of possibility, necessity, an action to be taken, and a precondition to be satisfied." It is unlikely, however, that the lack of schematic letters is crucial here. The sentences in (9) refer to individual actions and preconditions; so we need to use individual variables or names to formulate them (not just sentence letters like p and q).
But within such a predicate-argument framework, we can rephrase a rule like (9a) as If x is an action to be taken and y is a precondition of x, then y must be satisfied, where the variables are "context free" in applying to any entity in the domain of discourse. In fact, if the items in (9) are production rules (as Cheng and Holyoak assert), then the introduction of variables or some similar device seems to be required in order to instantiate them to specific cases.

A more critical difference between pragmatic schemas and deduction rules is that, according to Cheng and Holyoak (1985, p. 397), "the rules attached to reasoning schemas are often useful heuristics rather than strictly valid inferences. ... Because reasoning schemas are not restricted to strictly valid rules, our approach is not equivalent to any proposed



formal or natural logic of the conditional." As an instance they note that (9c) does not follow validly from (9a): Since there may be more than one precondition, the statement If the action is to be taken, then the precondition must be satisfied needn't imply If the precondition is satisfied, then the action may be taken. But the pragmatic schema makes (9c) available whenever a conditional sentence matches (9a); so it looks as though the permission schema draws invalid inferences. However, there doesn't appear to be any strong experimental evidence for (9c) (or for (9b), which may also not be entailed by (9a)). Support for the permission schema comes from improved selection-task choices, but to explain the correct answers we don't need (9b) or (9c). These rules are supposed to tell us that there is no reason to choose the "action not taken" card and the "precondition satisfied" card. But subjects who avoid these cards may do so simply because they have no incentive to choose them. Not having a rule that covers these cards seems as reasonable an explanation as having a rule that says they are irrelevant. If this is true, then the Cheng-Holyoak findings are consistent with a valid deduction principle based on deontic logical operators - one that produces an analogue of (9d) from an analogue of (9a).

Of course, to explain the data we need to understand OBLIGATORY(P | Q) and PERMISSIBLE(P | Q) as expressions of duties and privileges. There is no reason to think that the broader senses of these terms would be successful in the selection experiments. But even with this restriction, there may be an advantage to representing the (9a)-(9d) relationship in terms of deontic operators, since deduction principles with these operators seem to be needed independently to capture inferences like (10a) and (10b).

The general point is that, when we look at them carefully, proposals based on "pragmatic schemas" may have little more to offer than theories based on deduction rules.
This coincides with the conclusion in chapter 8 above that we can usually capture inferences from schemas and similar structures by ordinary deduction machinery (Hayes 1979). As long as the deduction system can include modal operators for such concepts as obligation, the system can be exactly as sensitive to context as the proposed schemas. And to the extent that the schemas depend on production rules that match to propositions in memory, the schemas are exactly as "syntactic" as deduction rules. Thus, drawing distinctions between the "pragmatic" and the "syntactic" is far more confusing than illuminating in this context.
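To make the idea of deduction rules over modal operators slightly more concrete, the two inferences in (10) can be written as forward rules over structured sentences. The following is a minimal sketch under stated assumptions, not a rendering of any published deontic system: the tuple encoding (operator, content, condition), the function name deontic_closure, and the decision to apply (10b) to both conjuncts (the text states it only for the first) are all illustrative choices.

```python
# A deontic sentence is encoded as (operator, content, condition), so that
# ("OBLIGATORY", "P", "Q") stands for OBLIGATORY(P | Q). Conjunctive content
# is encoded as ("AND", left, right). All of this is an illustrative assumption.

def deontic_closure(sentences):
    """Apply rules (10a) and (10b) forward until no new sentence appears."""
    known = set(sentences)
    agenda = list(sentences)
    while agenda:
        operator, content, condition = agenda.pop()
        derived = []
        if operator == "OBLIGATORY":
            # (10a): OBLIGATORY(P | Q) yields PERMISSIBLE(P | Q)
            derived.append(("PERMISSIBLE", content, condition))
            # (10b): OBLIGATORY(P AND Q | R) yields OBLIGATORY(P | R);
            # treating the second conjunct the same way is an assumption.
            if isinstance(content, tuple) and content[0] == "AND":
                derived.extend(("OBLIGATORY", part, condition) for part in content[1:])
        for sentence in derived:
            if sentence not in known:
                known.add(sentence)
                agenda.append(sentence)
    return known

premise = ("OBLIGATORY", ("AND", "P", "Q"), "R")
for sentence in sorted(deontic_closure([premise]), key=str):
    print(sentence)
```

The rules match on the form of the sentence alone, yet their operators carry the deontic content, which is one way of seeing why the "pragmatic" versus "syntactic" contrast is hard to sustain. The procedure halts because (10b) only ever produces sentences with smaller content and (10a) produces nothing from a PERMISSIBLE sentence.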



Morals Concerning Standard Deduction Rules

In addition to championing pragmatic schemas, the Cheng-Holyoak position tends to disparage deduction rules like those in part II. Cheng et al. (1986) do not deny that "some people may in fact reason with this syntactic rule [modus ponens]," but they believe that such rules are not typically used if a pragmatic schema is available. These conclusions rest mainly on the results of training experiments that appear to show that instruction on the material conditional is not as effective as instruction on pragmatic schemas in promoting good performance on the selection task. Cheng et al. (1986, p. 298) express the key idea this way: "Since in our view the rule system is not used in natural contexts, people lack the requisite skills to interpret problems in terms of the material conditional, and hence would profit little from instruction in it." They don't discuss either the logical or the psychological systems that incorporate other types of conditionals,⁶ so it is conceivable that these conditionals would profit from instruction in a way that the material conditional doesn't. But let's consider the training results in their own terms and see what morals we can draw from them.

The training results of Cheng et al. (1986) are complex, but at a summary level they showed that neither a semester's logic course nor brief training on equivalences for if p then q (e.g., telling subjects that it can be reformulated as If not q, then not p, but not as If q, then p or If not p, then not q) has much effect on the selection task. We need to observe, however, that training does help on other deduction tests. For example, Cheng et al. found that their training procedure aided subjects in identifying the equivalence of conditionals and their contrapositives. In fact, this appears to be among the biggest effects reported in their paper; it boosted performance from 27% correct to 81% correct.
Conrad and I reported (Rips and Conrad 1983) that a quarter's course in elementary logic improved subjects' ability to evaluate propositional arguments like those in table 5.1, many of which contained conditionals and which the students had probably never encountered in a class or in a textbook.

Moreover, other forms of training do benefit selection scores. Cheng et al. found that when they told subjects how to use the reformulations to check whether an if ... then sentence had been violated, the subjects were then able to perform the selection task more accurately. Explaining the answers to an initial problem sometimes also aids in the solution of subsequent ones (Klaczynski, Gelfand, and Reese 1989). Furthermore, although Cheng et al. found no



advantage for a logic course, Lehman, Lempert, and Nisbett (1988) show that two years of graduate training in law, medicine, or psychology (but not chemistry) help students on a set of selection problems (see also Nisbett et al. 1987). Morris and Nisbett (1993) also report that graduate training in psychology (but not philosophy) improves performance on the same problems, while Lehman and Nisbett (1990) found that majors in humanities and natural sciences (but not social sciences or psychology) improve over their undergraduate years.

What can we conclude from this complicated group of results? The underlying logic of Cheng et al.'s training experiments can be paraphrased as in (11a), and Cheng et al. also seem to imply the associated principle shown in (11b).

(11) a. If X is a problem that subjects ordinarily find difficult, and
        If abstract training aids in the solution of X,
        Then subjects must have an intuitive basis for understanding the abstract principles underlying X's solution.
     b. If X is a problem that subjects ordinarily find difficult, and
        If abstract training does not aid in the solution of X,
        Then subjects do not have an intuitive basis for understanding the abstract principles underlying X's solution.

Along these lines, Nisbett et al. (1987, pp. 625 and 629) state: "Rules that are extensions of naturally induced ones can be taught by quite abstract means. This description does not apply to formal, deductive logical rules or to most other purely syntactic rule systems, however. ... We believe that abstract logical training by itself was ineffective [in Cheng et al. 1986] because the subjects had no preexisting logical rules corresponding to the conditional. (Or, more cautiously, any such rules are relatively weak and not likely to be applied in meaningful contexts)." The negative conclusions about logical rules clearly depend on (11b) or some closely analogous idea.
(Fong, Krantz, and Nisbett (1986) appeal to (11a) to support an intuitive basis for statistical principles like the law of large numbers.) But (11b) seems much less compelling than (11a). It is easy to imagine a person who has an intuitive understanding of an abstract principle in physics, and who appreciates abstract training on the principle, but who still fails to solve a difficult problem in which that principle applies. Perhaps the problem also depends on additional principles that weren't part of the lesson, or



perhaps the problem requires extensive calculation, or perhaps the formulation of the problem is somehow tricky or ambiguous. In view of the susceptibility of the selection task to "subtle changes in problem presentation" (Jackson and Griggs 1990), lack of transfer to this task seems doubtful evidence for the position that people have no strong subjective grasp of logical principles. At the very least, such a position seems to require showing that subjects fail to transfer to other sorts of problems.

The training evidence also tells us nothing about the ability of people to grasp logical principles for connectives other than if. There are genuine reasons to question whether people use material conditionals to interpret conditional sentences in natural language (see chapter 2), and the deduction theory of part II refuses to sanction inferences with IF that would follow from unrestricted material conditionals. However, as I have also noted, people are clearly able to recognize other principles (including AND Introduction, AND Elimination, and matching rules for names and variables) as deductively correct. For deduction rules like these, "pragmatic schema" explanations seem to fall apart; there simply doesn't seem to be anything about the inference from (say) P AND Q to P that would make it more pragmatically useful than the sorts of conditional rules that Cheng et al. criticize. Surely no one believes that to appreciate AND Elimination we need to tie it to permission, causality, or other "pragmatic" contexts. Thus, in light of the evidence on AND Elimination and similar inferences, pragmatic criteria can't be sufficient for determining whether something is an intuitively acceptable inference principle.

The upshot, at a specific level, is that the pragmatic-schema theory leaves us without a satisfactory account of why subjects usually fail on the standard selection task.
If the reason were simply the pragmatic uselessness of rules for conditionals, then equally useless rules (such as AND Elimination) should also be unavailable in arbitrary contexts, contrary to fact (Braine et al. 1984). More important, the pragmatic-schema theory lacks a general account of deductive inference, since it gives us no hints about how people reason with the full range of logical operators.

Social Contracts

Cosmides (1989) has proposed a variation on pragmatic schemas in which rules of social exchange, rather than general permissions and obligations,



determine successful performance on the selection task. According to this notion, the selection rules that produce the conventionally correct choices are of the form If you take the benefit, then you pay the cost - for example, If Calvin brings food to your table, then you must tip Calvin 15%. Rules of this sort evoke algorithms that are specialized for dealing with social exchanges, including algorithms that identify "cheating" in the exchange. Cheaters are those who take the benefit without bothering to pay the cost (e.g., getting the advantage of table service without tipping); so, in the context of the selection task, cheater detectors would help focus attention on the cards associated with taking the benefit (getting service) and not paying the cost (stiffing the waiter). Thus, the cheater detectors are responsible for the correct selections.

To handle the Griggs-Cox drinking-age problem or the Cheng-Holyoak cholera problem, the notion of social exchange has to be enlarged to more general social contracts of the type If you take the benefit, then you must meet the requirement (e.g., If you are drinking beer, then you must be over 19). Within a selection task that features a social contract, the cheater detector would prime the cards representing the person taking the benefit (a beer drinker) and the person not meeting the requirement (a 16-year-old), yielding the correct answer. Since no such procedure for identifying violations is connected with the rules in the standard selection task, performance should be correspondingly poor.

The difference between social exchanges and social contracts may be important in determining whether Cosmides' approach is justified. The theory comes with an evolutionary rationale that is supposed to motivate the existence of innate, modular cheater detectors.
Adaptive selection of these detectors, however, seems more plausible for social exchangesthan for social contracts, sinceindividual exchangesmust have existed long before full -blown social regulations in human history: " While social exchangewas a crucial adaptation for hunter-gatherers, pennission from 'institutional authorities' was not . . . " (Cosmides 1989, p. 255). Yet the contracts are what one needsto explain much of the data on content effects, as Cheng and Holyoak ( 1989) have pointed out. Cheng and Holyoak have other criticisms of the evolutionary approach, but in this overview I will mostly ignore this aspect(exceptfor a few comments at the end of this section) and concentrate instead on whether contracts are helpful in clarifying subjects' perfonnance. According to Cosmides ( 1989, p. 235), the social-contract theory is supposedto share with the pragmatic-schematheory both the idea that

Rule-Based Systems

people lack a "mental logic" and the idea that "in solving the selection task, people use rules of inference appropriate to the domain suggested by the problem." Where it differs is its specificity: "All social contract rules involve permission (or, more strictly, entitlement), but not all permission rules are social contract rules" (p. 236). This difference provides a possible route for testing the two theories: Some permissions are social contracts and some permissions are not; hence, if the contract theory is right, only the former should help subjects on the selection task. Cosmides' experiments are devoted to testing this hypothesis, as well as to testing the specific-memory view. From our perspective, however, social contracts share most of the advantages and disadvantages of pragmatic schemas.

Costs and Benefits

In discussing pragmatic schemas, we found some openings for the alternative idea that the advantages of schemas might be due instead to memory for specific incidents that highlight rule violations. Cosmides also criticizes Cheng and Holyoak's (1985) experiments for failing to eliminate this memory-cuing hypothesis. She believes that the earlier study failed to equate the permission versions and the control versions of the selection task for the total amount of content they contained; hence, any advantage for permissions could be set down to the sheer number of retrieval cues available. To overcome this potential confounding, Cosmides compared social contracts with what she calls "descriptions" (as well as with more standard selection tasks). Both conditions involved lengthy stories about fictional native peoples, and both centered on the same conditional sentence (e.g., If a man eats cassava root, then he must have a tattoo on his face). In the social-contract story, however, the conditional conveys a regulation (cassava root is a precious aphrodisiac that ought to be eaten only by tattooed men, all of whom are married); in the descriptive story, the conditional conveys a generalization (no costs or benefits attach to eating cassava root or to having a tattoo). In line with predictions, subjects made 75% correct selections for social contracts, versus 21% correct for descriptions.

It is difficult to tell, however, whether the experiment succeeds in its objective of ruling out possible effects of specific experiences. Although the social-contract and descriptive stories are roughly similar in complexity, they also differ in several respects that could cue differing numbers of specific incidents. (See the appendix to Cosmides 1989 for the texts of



the problems.) More important, the differences in the cover stories that changed the conditional from a contract to a description may well evoke recall of different types of incidents: ones involving assessing regulations versus assessing descriptions (Pollard 1990). These memories may, in turn, cause subjects to attend differentially to the rule-breaking cards if, as might be the case, the retrieved incidents contain more vivid information about regulation violators than about exceptions to descriptive generalizations. Cosmides points out quite correctly that explanations of this sort have to include a substantive theory of how subjects manage to map the relevant aspects of the selection task onto those of the remembered incident.7 But it is not clear why a defender of the specific-experience view couldn't develop a theory of this sort, perhaps along the lines of the failure-driven remindings of Schank (1982) and Schank, Collins, and Hunter (1986).8 I will argue in the following chapter that there are legitimate reasons for rejecting a theory of deduction based solely on specific experiences (or "availability"); the present point is that such a view is hard to eliminate within the confines of explanations for content effects. This spells trouble for the social-contract theory and for the pragmatic-schema theory, which are founded entirely on such evidence.

Cosmides' other main goal is to show that the social-contract theory better explains the data on content effects than pragmatic schemas. Cosmides and Cheng and Holyoak agree that social contracts are a subset of permissions; so the social-contract theory ought to predict no advantage for permissions that don't happen to be contracts, whereas the pragmatic-schema theory ought to predict an advantage for these same noncontract permissions. But although this prediction seems straightforward, there appears to be evidence on both sides of the issue.
In the social-contract corner, Cosmides attempted to construct pairs of stories that implied either social contracts or noncontract permissions. One pair concerned the conditional If a student is to be assigned to Grover High School, then that student must live in Grover City. The social-contract story described Grover High as a much better school than its alternatives and attributed this to the higher taxes that Grover City residents pay. The noncontract-permission story said nothing about the quality of schools or towns, but described the rule as one adopted by the Board of Education to ensure that all schools had the right number of teachers. Cosmides found 75% "correct" selection choices for the contract version, but 30% for the noncontract-permission version.


b. OBLIGATORY(Student x lives in Grover City | Student x is assigned to Grover High).




The same is true of Cheng and Holyoak's cholera problem when it is presented to subjects without a rationale. Cheng and Holyoak's rationale about inoculation and Cosmides' discussion of better schools for higher taxes emphasize that the story characters are incurring a serious obligation and encourage a stronger reading like that of (12b). Notice that when the conditional is interpreted as (12a) none of the selection cards can show that the rule has been broken (e.g., it is possible that students from other cities also have permission to attend Grover High). On the (12b) reading, however, both the "Grover High" card and the "Doesn't live in Grover City" card are relevant, and these cards correspond to the standard "correct" selection.

This way of looking at the content results isn't by any means a complete explanation (especially in view of the wording effects discussed in note 4), but it may help us understand the relative strengths and weaknesses of social contracts and pragmatic schemas. Social contracts may be effective because the notions of costs-in-exchange-for-benefits and requirements-in-exchange-for-privileges are direct forms of conditional obligation (in the sense of OBLIGATORY(P | Q)). However, there are other forms of obligation (e.g., precautions, and maybe threats) that social-contract theory seems to miss. Pragmatic schemas are better adapted to handling the full range of effects. But it may be that not every conditional that embodies a permission will produce a content effect, since not every such conditional implies OBLIGATORY(P | Q).

Morals Concerning Standard Deduction Rules

We need to consider one further aspect of Cosmides' experiments that bears on claims about mental-deduction rules. In addition to the social-contract stories and conditionals just discussed, some of her studies concerned "switched" conditionals of the form If you pay the cost, then you take the benefit.
For example, Cosmides presented the cassava-root story mentioned above, together with the conditional If a man has a tattoo on his face, then he eats cassava root (in place of If a man eats cassava root, then he must have a tattoo on his face). Despite the change in the conditional, individuals who violate the rule (within the context of the story) are still those who indulge in cassava without having a tattoo. Thus, if Cosmides' subjects were monitoring for cheaters, they should have continued to choose the "cassava" and "no tattoo" cards. In fact about 67% of them did, versus 4% for the matched generalization. Cosmides argues



that these results show that the social contracts were not simply stimulating subjects to reason logically. The "correct" logical answer for the "switched" problems would be the "tattoo" and "no cassava" cards, which correspond to the antecedent and the negation of the consequent in the switched conditional; but only 4% of subjects chose these items.

These findings are susceptible to the criticism that the lengthy background story convinced subjects that the intended meaning of the switched conditional was the same as before: OBLIGATORY(x has tattoo | x eats cassava root). (Johnson-Laird and Byrne (1991) and Manktelow and Over (1990) raise similar points.) There is some flexibility in how to represent sentences in natural language within a given context, and this means that people must use their judgment in mapping the text onto an underlying interpretation. There are many ways of conveying a conditional permission or obligation, including surface conjunctions (Get a tattoo and you can have cassava root) and surface disjunctions (Get a tattoo or you can't have cassava root), in addition to surface conditionals (G. Lakoff 1970; R. Lakoff 1971; Springston and Clark 1973), so it is not surprising to find subjects overriding a literal sentence in order to achieve consistency with background information. The social-contract theory, of course, requires an analogous procedure of recasting the stimulus material into a form to which the cost-benefit algorithms apply. According to the theory, "[an] interpretive component must then map all explicitly described elements in the situation to their social exchange equivalents (cost-benefit relationship, the entitlement relationship, and so on)," and "to do this, implicit inference procedures must fill in all necessary steps - even those that have not been explicitly stated" (Cosmides 1989, p. 230).
More recent research has also obtained flipped content effects (selection of the cards for the consequent and the negated antecedent of the target conditional) by giving subjects instructions regarding which of the parties to an obligation might be breaking it (Gigerenzer and Hug 1992; Manktelow and Over 1991). For example, Manktelow and Over told their subjects that a shop had given customers the promise If you spend more than 100, then you may take a free gift. When the instructions indicated that the shop might not have given customers what they were due and asked them to check the cards they would need to find out, the subjects tended to select the "more than 100" and "no free gift" cards. By contrast, when the instructions stated that the customers might have taken more than they were entitled to and asked the subjects to check, they
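The contrast between the literal reading of the switched conditional and the obligation reading can be made concrete with a small sketch. It assumes (as an illustration only, not as a model of subjects' actual processing) that a card is worth turning over whenever its visible face is compatible with a violation, that is, with the antecedent holding while the consequent fails. The names `Card` and `cards_to_turn` are hypothetical.

```python
from typing import List, NamedTuple


class Card(NamedTuple):
    label: str    # what the subject sees, e.g. "no tattoo"
    prop: str     # the proposition the visible face is about
    value: bool   # whether the visible face affirms that proposition


def cards_to_turn(cards: List[Card], antecedent: str, consequent: str) -> List[str]:
    """Return labels of cards that could reveal a case of antecedent-and-not-consequent.

    Illustrative assumption: the rule is violated only by a case in which the
    antecedent is true and the consequent is false.
    """
    chosen = []
    for card in cards:
        shows_antecedent = card.prop == antecedent and card.value
        shows_denied_consequent = card.prop == consequent and not card.value
        if shows_antecedent or shows_denied_consequent:
            chosen.append(card.label)
    return chosen


CARDS = [
    Card("cassava", "eats_cassava", True),
    Card("no cassava", "eats_cassava", False),
    Card("tattoo", "has_tattoo", True),
    Card("no tattoo", "has_tattoo", False),
]

# Literal reading of the switched sentence: If tattoo, then eats cassava.
literal = cards_to_turn(CARDS, antecedent="has_tattoo", consequent="eats_cassava")

# Obligation reading retained from the story:
# OBLIGATORY(x has tattoo | x eats cassava root).
deontic = cards_to_turn(CARDS, antecedent="eats_cassava", consequent="has_tattoo")

print(sorted(literal))  # ['no cassava', 'tattoo']
print(sorted(deontic))  # ['cassava', 'no tattoo']
```

On this sketch the same selection procedure yields either pattern of choices; what changes is only the representation handed to it, which is the point of the criticism in the text.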



tended to select the "free gift" and "less than 100" cards. These results seem easy to understand on the assumption that the conditional promise places obligations on both the shop and the customer. We could represent these as OBLIGATORY(shop gives free gift | shop receives > 100) and OBLIGATORY(customer spends > 100 | customer gets free gift), where the first spells out the onus on the shop and the second the requirements on the customer.10 Which of these obligations the subjects attended to naturally depended on which one they had been told to check. It therefore seems that neither Cosmides' "switched" contracts nor changed obligations pose any greater difficulty for deduction rules than did the original finding of Cheng and Holyoak.

Cosmides briefly considers the possibility of explaining her results using deontic logic, but she ends up arguing against this possibility. The first reason for this is the same as her case against pragmatic schemas: Deontic logics apply in all permission-obligation settings and hence should incorrectly predict facilitation on the selection task for noncontract permission rules. We have already noted, however, that subjects may have understood the noncontract problems as conditional permissions (as in (12a)) that had no implications for the choice of cards. But a bigger difficulty for this argument is that inferences about obligations and permissions do arise outside social contracts. The arguments in (10) are one instance; the precaution results of Cheng and Holyoak (1989) and Manktelow and Over (1990) are another. To capture these inferences we need a more general theory, and deontic inference rules might be one component of such an account.11

Cosmides' second reason for preferring contracts over deontic logics is that "there is an explanation of why the mind should contain social contract algorithms, while there is no explanation of why it should contain a deontic logic. . . .
Social contract theory is directly derived from what is known about the evolutionary biology of cooperation, and it is tightly constrained as a result." (1989, p. 233) However, evolutionary considerations don't by themselves make the case for "cheater detectors" over more general forms of inference. Sober (1981, p. 107) shows that evolution can favor standard deduction rules on grounds of their relative simplicity:

. . . let me suggest that deductive logics which are truth-functional will be informationally more fit than those which are not. In a truth-functional logic, the truth value of a sentence depends on just the truth values of its components, while in a non-truth-functional system, the valuation of a string will turn on the truth value



of its components, and on other considerations as well. The informational advantages of a truth-functional system will thus resemble the advantages noted before of an instruction which triggers healing processes when a wound occurs, without regard to the shape of the wound. No shape detector is needed; all that is necessary is that the organism be able to pick up on when a wound has occurred. . . . Truth functional logics are similarly austere; relatively little information about a sentence needs to be acquired before a computation can be undertaken.

Even if there is some adaptive pressure to recognize cheaters, there may well be an evolutionary advantage to accomplishing this through more general inference mechanisms. This would be the case if the more specific solutions required a greater drain on internal resources or otherwise impractical implementations. The complexity of such an algorithm seems an especially pressing problem when one considers the elaborate legal definitions of cheating on contracts. Similarly, the more restricted procedure might be at a disadvantage if people need more general principles for independent reasons. Of course, deontic deduction rules are not purely truth-functional either, since the truth of OBLIGATORY(P | Q) and that of PERMISSIBLE(P | Q) are not completely determined by the truth of P and Q. But I am not arguing that evolution selected deontic deduction rules over cheater detectors. The point is, rather, that evolutionary theory by itself provides no reasons for preferring more narrowly formulated inference algorithms over more general ones. (See Sober 1981 for an informed discussion of other evolutionary constraints on inference. Stich (1990, chapter 3) questions claims that particular inference strategies are necessarily under genetic control, and Lewontin (1990) provides reasons to doubt that there is any convincing evidence to back particular evolutionary theories of cognition.)
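The distinction between truth-functional and non-truth-functional operators can be illustrated with a short sketch. It assumes the material conditional for IF, and it uses a deliberately crude, hypothetical stand-in for OBLIGATORY(P | Q) (a lookup in a set of norms taken to be in force); neither is meant as an account of how such operators are mentally represented.

```python
from itertools import product

# Truth-functional connectives: the value of the compound depends only on
# the truth values of its two components.
CONNECTIVES = {
    "AND": lambda p, q: p and q,
    "OR": lambda p, q: p or q,
    "IF": lambda p, q: (not p) or q,  # assumption: material conditional
}


def truth_table(name):
    """Map every pair of component truth values to the compound's value."""
    fn = CONNECTIVES[name]
    return {(p, q): fn(p, q) for p, q in product([True, False], repeat=2)}


def obligatory(p_name, q_name, norms_in_force):
    """Hypothetical stand-in for OBLIGATORY(P | Q).

    Its value depends on which norms hold in the situation, not on whether
    P and Q happen to be true there.
    """
    return (p_name, q_name) in norms_in_force


# Two situations that agree on the facts but differ in their norms.
facts = {"has_tattoo": True, "eats_cassava": True}
norms_a = {("has_tattoo", "eats_cassava")}
norms_b = set()

if_value = CONNECTIVES["IF"](facts["eats_cassava"], facts["has_tattoo"])
obligation_in_a = obligatory("has_tattoo", "eats_cassava", norms_a)
obligation_in_b = obligatory("has_tattoo", "eats_cassava", norms_b)

print(truth_table("IF"))
print(if_value, obligation_in_a, obligation_in_b)
```

The conditional receives a single value once the facts are fixed, whereas the obligation operator returns different values for the two situations even though P and Q have the same truth values in both. That extra dependence is the sense in which the deontic operators fail to be truth-functional.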

Summary

The aims of the pragmatic-schema theory and the social-contract theory obviously differ from those of the deduction systems discussed in the first section of this chapter. Schemas and social contracts give us primarily accounts of content effects in the selection task. Although they generalize to other situations where schemas like (9) apply or to situations where people need to detect cheaters, they are useless in handling even simple inferences such as AND Elimination or instantiation. We have no trouble realizing that it follows from Everyone gaxes everybody that Someone



gaxes somebody, even if we haven't the faintest idea what gaxes means. What pragmatic schema or Darwinian algorithm could account for this? Moreover, the inferences that pragmatic schemas and Darwinian algorithms do promote are not necessarily deductively valid (Cheng and Holyoak 1985). A pluralistic conclusion would therefore be that deduction theories such as PSYCOP account for deductive reasoning whereas schema and contract theories account for certain nondeductive inferences (for example, ones for permission, causality, and covariation). Specific rules of the latter sort seem closer to practical human concerns in a historical or a contemporary context, and so it is easy to justify them within the framework of a general learning theory or of an evolutionary approach.

One might try for a more unified account by extending pragmatic schemas or evolutionary modules to encompass rules in deduction systems (general-purpose schemas or modules with IF Elimination, AND Introduction, and the rest). But general-purpose schemas or modules defeat the rationale behind these theories: Inferences of this sort aren't directly linked to practical goals and have no direct adaptive advantage; thus, one can no longer explain where they came from or why people don't employ them to solve the selection task. It is not clear that proponents of schemas or contracts are prepared to swallow these consequences.

Another route to a unified theory is to adapt the deduction systems to handle obligation, causality, necessity, and other modal concepts. These concepts engender inferences that don't reduce to the rules presented above in chapters 4-7. But it is interesting to contemplate the possibility of extending such systems in this direction, taking advantage of related research on modal logic.
This is the path that Osherson (1976) pursued in modeling children's evaluation of modal arguments, and our look at the recent literature on the selection task suggests that it might also accommodate those findings. However, this clearly entails more than just adding a few rules. Criticisms of "logical" or "syntactic" rules in this literature boil down to the observation that subjects' responses aren't predictable from the mere presence of the words "if . . . then." The obvious reply to the criticism is that deduction rules apply not to the surface form of the sentences in a reasoning problem, but to the mental representations of the sentences as people interpret them. But saying this doesn't solve the problem of how such interpretation occurs, and the problem becomes more pressing the greater the distance between the surface form and the representation to which the rules apply (Evans 1989). The same must



be true for rival proposals, of course, since the surface sentences don't explicitly display costs and benefits or permissions and obligations either. Nevertheless, theories like PSYCOP have a certain advantage here, since the underlying form they posit is closely related to Logical Form in current linguistic theories (e.g., Higginbotham 1987; May 1985). Research on language understanding within this framework provides a natural counterpart to the theory proposed here.



Psychological Theories: Ruleless Systems

In the world of mules,
There are no rules.
Ogden Nash

Rule-based theories such as those discussed in chapter 9 are certainly not the only approach to the psychology of deduction, and we can gain insight by comparing them with rival approaches. The natural-deduction rules that form the heart of PSYCOP and similar systems apply to most of the research in this area, as I have tried to demonstrate. But the very generality of rule-based theories has provoked criticisms from many investigators who believe that human reason proceeds by more concrete methods. It is hard for these researchers to take seriously the idea that ordinary people have access to rules that are sensitive only to the logical form of a sentence, rules that apply no matter what the sentence happens to be about. How could such principles be learned in the first place? How could they be consistent with the evidence of error and of context-sensitivity in reasoning?

One could very reasonably retort that these difficulties are no more problematic for deduction rules than they are for grammatical rules. Before the development of formal linguistics, it must have seemed mysterious how people could have grammatical principles sensitive only to the syntactic form of a sentence and how such principles could comport with evidence on learning and on speech errors. It seems fair to say, however, that these problems have well-defined answers within current linguistic theory, and it is unclear why they should pose any additional difficulty for deduction systems.1 Still, the persistence of these questions in the context of deduction suggests that we should take them seriously. We have already seen that even those who go along with rules in principle often opt for ones that are less abstract and more closely tailored to the task domain.

Those who reject rules as a basis for deduction have the opposite problem of explaining the seemingly systematic performance that subjects exhibit on many tasks.
One obvious approach is to explain such performance by reducing it to simple heuristics or "natural assessments" that arise automatically in the course of perception or comprehension (Tversky and Kahneman 1983). These might include assessments of similarity or of availability, or other rules of thumb. Part of the appeal of this kind of explanation is that it goes along with the impressive body of research on probabilistic inference (see, e.g., the papers collected in Kahneman et al. 1982). But although there is plenty of evidence that heuristics sometimes


Chapter 10

affect performance on deduction tasks, there have been few proponents of the idea that deduction is solely a matter of heuristics. Instead, the anti-rule forces have claimed that people solve deduction problems by manipulating "mental models" or other diagrammatic representations of the problem domain (Erickson 1974, 1978; Johnson-Laird 1983, 1989). Like heuristics, mental models are supposed to arise from perception or from comprehension; however, they differ in requiring much more active operations: typically, a search for counterexamples to a given conclusion.

Heuristic-Based Theories and "Content" Effects

There is a very general sense in which nearly all cognitive activity is "rule governed." Cognitive psychologists generally assume that people are able to perform intellectual tasks in virtue of mental programs or strategies that control lower-level activities, such as search or comparison operations. All contemporary psychological theories of deduction are rule based in this sense, since they assume that people have systematic internal routines for solving problems. The differences among the theories lie in the types of routines they postulate. In this context, those who criticize the use of rules in deduction have in mind rules of a special sort: those of inference systems that are sensitive to logical form. These rules apply to internal sentence tokens whenever they contain specified arrangements of logical connectives or quantifiers, and they produce further tokens whose form is similarly defined. PSYCOP's rules (chapter 6 above) are of exactly this type.

The nature of the alternative "ruleless" theories is not so clear. Although such theories try to avoid "logical" rules like PSYCOP's, the operations that take their place are not always easy to characterize. The simplest case of a theory without logical rules is one in which decisions about the correctness of an argument are made by consulting some property that is independent of its logical constants. For example, Morgan and Morton (1944) proposed that people ordinarily evaluate an argument in terms of how believable the conclusion is: If you already think that the conclusion is true, you will believe that the argument is correct. But radical theories of this sort have few partisans these days, since they have trouble accounting for the data on elementary inferences. The theory closest to this pure approach may be Pollard's (1982) proposal that deduction is a matter of availability.



Deduction and Availability

In the literature on decision making, "heuristics" has come to denote short-cut strategies that people use to assess the probability of an uncertain event. Tversky and Kahneman's well-known work (e.g., 1974) suggests that people's probability estimates do not obey the standard axioms of the probability calculus, and that instead they employ simple, accessible properties of events to provide a rough answer. One type of event might be deemed more probable than another if the first is more easily brought to mind (is more available than the second) or if the first event is more similar on average to other members of its class (is more representative, in Tversky and Kahneman's terminology).

Pollard (1982) has claimed that the availability heuristic can also explain many of the research results on deduction. The general idea is that subjects select responses that correspond to familiar information or familiar associations. In evaluating arguments, for example, subjects choose the conclusion that seems most prevalent, much as in the Morgan-Morton hypothesis. For this reason, the availability heuristic works best as an explanation of experiments that use problems about familiar situations, rather than with the more austere stimuli that we concentrated on in part II. Most of those who have conducted research on reasoning with sentential connectives and with quantifiers have selected their test items so as to avoid confoundings with subjects' preexperimental knowledge, and this means that correct and incorrect responses are usually equivalent in availability. There is no reason to suppose, for example, that the patterns of subjects' answers in the experiments presented in chapters 5 and 7 were due to differential familiarity. How could subjects have used availability to choose some of the arguments in table 5.1 over others, for example, when all of them were framed in similar sentences (e.g., If Judy is in Albany or Janice is in LA, then Janice is in LA)?
Thus, the availability hypothesis is at best extremely incomplete. Those who want to explain all reasoning as heuristic processing must pin their hope on the possibility that new heuristics will be discovered that can account for performance on problems with unfamiliar content. Despite this incompleteness, it is worth considering how availability might explain studies that look at effects of subjects' preexisting knowledge of the materials, especially since these are among the best-known experiments.


"Content" in Deduction

The basic results on familiar content in categorical syllogisms were apparent in an early study by Minna Wilkins (1928), who presented her subjects with problems having the same logical form but ranging in content from everyday activities and objects (Some of the boats on the river are sailboats) to scientific jargon and nonsense terms (Some Ichnogobs are Rasmania) to meaningless letters (Some x's are y's). Considering just the problems subjects actually attempted, Wilkins found a small benefit for everyday wording: 82% correct, versus 75% for jargon and 76% for letters. In addition, the subjects were slightly more accurate with everyday syllogisms whose conclusions were neutral (85%) than with ones whose conclusions were "misleading" (80%). Wilkins considered a conclusion misleading if it followed from the premises but was clearly false or if it didn't follow but was true. An example of one of Wilkins' misleading syllogisms (of the form No(G, H), No(F, G), ∴ No(F, H)) was given in chapter 1 and is repeated here as (1). Syllogism (2) is the counterpart with familiar wording and a neutral conclusion.

(1) No oranges are apples.
    No lemons are oranges.
    ----------------------
    No lemons are apples.

(2) None of the Smiths' money is invested in real estate.
    None of this money belongs to the Smiths.
    ----------------------
    None of this money is invested in real estate.
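That the shared form of (1) and (2) is invalid can be checked mechanically. The following is a minimal sketch under stated assumptions: categories are treated as sets over a small domain, each individual is characterized only by its membership in F, G, and H, and "All" is read without existential import. The helper names (`no`, `all_are`, `counterexample`) are hypothetical and are not part of any theory discussed here.

```python
from itertools import combinations, product

# Each kind of individual is a triple of memberships (in F, in G, in H).
KINDS = list(product([False, True], repeat=3))
F, G, H = 0, 1, 2


def no(a, b):
    """No(a, b): nothing in the model belongs to both a and b."""
    return lambda model: not any(kind[a] and kind[b] for kind in model)


def all_are(a, b):
    """All(a, b), read without existential import (an assumption)."""
    return lambda model: all(kind[b] for kind in model if kind[a])


def models():
    """Every non-empty collection of kinds of individuals."""
    for size in range(1, len(KINDS) + 1):
        yield from combinations(KINDS, size)


def counterexample(premises, conclusion):
    """Return a model with true premises and a false conclusion, else None."""
    for model in models():
        if all(p(model) for p in premises) and not conclusion(model):
            return model
    return None


# Form of syllogisms (1) and (2): No(G, H), No(F, G), therefore No(F, H).
wilkins_form = counterexample([no(G, H), no(F, G)], no(F, H))

# A valid control: All(G, H), All(F, G), therefore All(F, H).
valid_control = counterexample([all_are(G, H), all_are(F, G)], all_are(F, H))

print(wilkins_form is not None)  # True: a counterexample exists
print(valid_control is None)     # True: no counterexample exists
```

A single individual that is F and H but not G already makes both premises true and the conclusion false, which is why the truth of the conclusion in (1) owes nothing to the premises.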

Only 5% of subjects thought the conclusion of (2) followed from the premises, whereas 31% went along with (1).

Effects of Prior Likelihood

Since Wilkins' time, there has been a long history of attempts to replicate the finding that the prior likelihood or believability of the conclusion affects subjects' evaluation of syllogisms (see, e.g., Evans et al. 1983; Janis and Frick 1943; Lefford 1946; Morgan 1945; Morgan and Morton 1944; Oakhill and Johnson-Laird 1985; Oakhill et al. 1989; Revlin et al. 1980). As was noted in chapter 1, most of these studies confirmed the effect, though its magnitude was often quite small. Moreover, there is evidence that even when the effect is relatively large it does not fully account for subjects' choices. For example, Evans et al. (1983) varied independently the validity of the stimulus syllogisms and the



believability of their conclusions and found clear evidence of believability: Subjects accepted syllogisms with believable conclusions on 80% of trials and ones with less believable conclusions on only 33% across three experiments. Nevertheless, with believability controlled, subjects still accepted valid syllogisms more often than invalid ones (72% vs. 40%). There was also an interaction between these factors, with believability exerting a larger influence on invalid items.

Pollard (1982) takes the believability results as support for the availability heuristic, on the assumption that the more believable a conclusion the more available it is as a response alternative. There may be reason to wonder, however, if this equivalence isn't too simple. Although believable sentences might be easier to remember or produce than less believable ones, it is not so clear that believable sentences are necessarily more available when subjects must evaluate them in the context of syllogisms. In the choice between a believable conclusion such as Some addictive things are not cigarettes and a less believable one such as Some cigarettes are not addictive, it is not at all obvious that the first is more available. Couldn't the surprising quality of the second sentence make it the more available of the two? Yet the former conclusions are the ones that elicited more positive responses in the experiment of Evans et al. (1983) (see examples (15) and (16) in chapter 1 above). In other words, there is little reason to suppose that availability mediates the effect in question.

The important point of the experiment of Evans et al., however, is that no matter how we explain the believability effect, it cannot by itself account for the results on syllogisms with familiar wording. Unless a pure heuristics theory can explain the overall difference between valid and invalid familiar syllogisms, it is no more cogent than it was in the case of arguments with unfamiliar terms.

Effects of Elaboration

The framing of a conditional sentence can affect the conclusions subjects derive from it (see chapter 5). The generalization seems to be that if a conditional suggests a one-to-one relation between the domains associated with its antecedent and its consequent, subjects will sometimes treat the sentence as if it implicates a biconditional (Fillenbaum 1975, 1977; Legrenzi 1970; Li 1993; Marcus and Rips 1979; Markovits 1988; Rips and Marcus 1977; Staudenmayer 1975). In one of the experiments in the Rips-Marcus paper, for example, subjects inspected a pinball machine with several alleys for ball bearings to roll down and several



could be associated with more than one light. We then asked them to classify different contingencies (e.g., the ball rolls left and the green light flashes; the ball rolls right and the red light flashes) as logically consistent or inconsistent with an explicit conditional sentence such as If the ball rolls left then the green light flashes. When subjects thought the alleys and lights were paired, most of them (67%) evaluated the contingencies as if they were comparing each one to both the conditional and its converse. For example, they tended to judge as inconsistent with the above sentence a situation in which the ball rolls right and the green light flashes, but to judge as consistent a situation in which the ball rolls right and the red light flashes. Only 4% of the subjects exhibited this tendency when there was no one-to-one correlation.

The effect of content in these studies of conditional reasoning is different from that in the believability experiments. The present results don't depend on the prior likelihood of the conclusion, since in some of the experiments just cited the conclusion is completely neutral. For example, as was noted in the discussion of conditional syllogisms in chapter 5, subjects accept argument (3) below more readily than argument (4). However, it is not very credible that they do so because The ball rolls left is more likely on the basis of experience than The card has an A on the left.

(3) If the ball rolls left, the green light flashes.
    The green light flashes.
    ----------
    The ball rolls left.

(4) If the card has an A on the left, it has a 7 on the right.
    The card has a 7 on the right.
    ----------
    The card has an A on the left.

The difference is much more plausibly due to subjects' willingness to elaborate the information conveyed in the conditional. Preexperimental beliefs are coming into play, all right, but the relevant beliefs have to do with the usual relationship between antecedent and consequent in situations where the conditionals are appropriate.
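The contrast between the conditional and the biconditional reading of the pinball sentence can be made concrete with a small sketch. It is only an illustration under stated assumptions: it supposes a simplified machine with two alleys and two lights, so that "rolls right" is treated as "does not roll left" and "red flashes" as "green does not flash". The function names are hypothetical and are not taken from the experiments.

```python
from itertools import product

def conditional(left, green):
    # IF the ball rolls left THEN the green light flashes
    return (not left) or green

def biconditional(left, green):
    # The conditional together with its converse
    return left == green

def classify(reading):
    """Label each of the four contingencies under a given reading."""
    labels = {}
    for left, green in product([True, False], repeat=2):
        ball = "left" if left else "right"      # assumption: only two alleys
        light = "green" if green else "red"     # assumption: only two lights
        labels[(ball, light)] = "consistent" if reading(left, green) else "inconsistent"
    return labels

by_conditional = classify(conditional)
by_biconditional = classify(biconditional)

for contingency in by_conditional:
    print(contingency, by_conditional[contingency], by_biconditional[contingency])
```

On this sketch the two readings disagree only about the contingency in which the ball rolls right and the green light flashes, which is the case subjects tended to reject when alleys and lights were paired.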
In the light of knowledge of devices like our pinball machine with paired alleys and lights, the first premise in (3) conveys the notion that if the ball doesn't roll left then the green light will not flash. Similarly, on the basis of your knowledge of the



purpose of promises, you can conjecture that If you drink your milk, you can have a snack also means that if you don't drink your milk you will not have a snack (Fillenbaum 1977). In accord with this, the elaboration effects increase when instructions tell subjects to regard the task not as a matter of logic but rather as "a set of problems directed to your understanding ... when you are figuring out what is implicit or implied in sentences that you might encounter" (Fillenbaum 1975, p. 250).

If availability is to have any chance of explaining these results, we must presumably shift our focus from availability of the conclusion to the availability of the relationship implicit in the conditional. However, this relationship cannot be just a brute association between words in the antecedent and the consequent or an association between their referents. The association between balls rolling left and green lights flashing is certainly no stronger than the one between having an A on the left of a card and a 7 on the right. To account for the difference between (3) and (4), what must be available is some type of constraint specifying which combinations of antecedent and consequent values are normally possible in the type of situations that the problems refer to (cf. Barwise 1986). Once some of these combinations are specified, pinball machines place different restrictions on the rest of them than cards.

To describe these results as effects of availability, however, is more a play on words than a serious explanation. Obviously, these constraints must be "available" to the subjects if they are to influence the decisions: Subjects must be able to retrieve the information before they can use it. But this doesn't mean that subjects employ availability itself as the basis for their responses, which is what the availability heuristic is supposed to be.
According to Tversky and Kahneman's (1973) account, subjects make their probability decisions by assessing how available (how easily brought to mind) the to-be-judged event is. By contrast, nothing about these results appears to depend on how easily information can be found. Instead, what matters is the nature of the information obtained: the antecedent-consequent constraints themselves. Stretching the meaning of "availability" to cover these constraints throws no new light on the nature of these experiments.²

Content Effects in the Selection Task

The best-known content manipulations are the ones associated with Wason's selection task, reviewed in chapter 9 above (see also Evans 1989, chapter 4). Can we ascribe


Chapter 10

the improvement in performance on the selection task to an availability heuristic? If we can, then it is presumably because content makes the correct choices more available. Perhaps in the context of the drinking-age regulation and similar rules it is potential violators (for example, people who are drinking beer and people who are underage) that are the most obvious choices. However, as Pollard (1982) points out with respect to a related issue, it is not clear why violators should be more available than law-abiding citizens (adults and cola drinkers), who are undoubtedly more frequent. To make the availability idea work, one must suppose that availability depends not just on the frequency or familiarity of the exemplars but also on the manner in which they are considered. The nature of the problem must somehow suggest that the rule violators, and not the rule abiders, are relevant.

Although the correct cards may be in some sense the more available ones in the drinking problem and in similar facilitating contexts, availability could hardly be the sole cause of the results. Subjects must first recognize the problem as one in which it makes sense to consider exceptions or counterexamples to the rule before beer drinkers and underage patrons can become relevantly "available." As in the other examples of conditional and syllogistic reasoning, the availability heuristic must be supplemented by more complex processes in order to explain the nature of the content effect.

Can Availability Explain Deduction?

The availability heuristic is the only heuristic that has been claimed to be a general explanation of performance in deduction experiments. The attraction of the claim is that it could reduce a large number of separate findings, in both deductive and probabilistic reasoning, to a single psychologically simple principle: The more available a response, the more likely subjects are to make it. A closer look at this claim, however, suggests that it is unfounded. In the first place, there is no hope that availability can explain the results with unfamiliar problems, since experimenters customarily control this very factor. But second, even in the realm of problems with more familiar wording, availability rarely seems the crucial determinant. In order to make the availability heuristic work in these cases, we must either stretch the notion of "availability" to the point of triviality or hedge it with other principles that themselves call for deeper explanations.



Particularly in the case of conditional reasoning, we need to appeal to fairly abstract constraints (correlations between the antecedent and consequent domains, or circumstances that highlight counterexamples) before the availability heuristic makes sense. Moreover, in both categorical syllogisms and conditional problems there are large residual effects when the purported availability factors are held constant. Of course, there are many possible heuristics other than availability; so these difficulties with availability don't mean that heuristics can't capture the reasoning results. But most researchers seem agreed that an explanation (especially for unfamiliar problems) must lie in processes that are much more systematic than what the usual notion of a heuristic is able to capture.

Reasoning by Diagrams and Models

If the goal is to avoid logical rules like those of PSYCOP and yet account for the reliably correct intuitions that people have about some deduction problems, then it seems reasonable to look to diagrammatic representations such as Venn diagrams or Euler circles. There is a long history of proposals for logic diagrams or logic machines that automatically check the correctness of categorical syllogisms (Gardner 1958). Such devices have contributed little to logical theory but have served as helpful teaching tools in introducing students to the notions of sets and quantifiers. Diagrams of this sort appeal to students because they rely on simple perceptual-geometric properties, such as overlap and disjointness, to represent relations among more abstract entities.

Diagrams appeal to many psychologists for similar reasons. Those who believe that perception provides a more primitive representational medium than sentential structures may also feel that reasoning with diagrams is more "natural" than reasoning with sentences or formulas (see, e.g., Rumelhart 1989). This idea is especially tempting in the case of reasoning about information that is itself spatial. If you must deduce implicit information about a spatial array or a geographical area from explicitly stated relations, you may try to develop mental maps and read the required facts from them in something like the way you read route information from a physical map of terrain (see, e.g., Bower and Morrow 1990). Similarly, problems that require subjects to deduce order relations (e.g., If


Fred is taller than Mary and Charles is shorter than Mary, who is tallest?) also seem to call for some sort of internal array to represent the sequence. (See the research on linear syllogisms cited in chapter 1; also see Byrne and Johnson-Laird 1989.³)

There are questions about how literally we can take these claims about mental maps and diagrams (Pylyshyn 1984), but we can grant their existence for current purposes. The problem for us is to decide how plausible they are as explanations of more complex inferencing. Are they really competitors as theories for the sorts of deductive inferences that we studied in part II?

Diagrammatic Approaches

Many psychological diagram theories aim at explaining reasoning with categorical syllogisms. Erickson's (1974, 1978) theory, mentioned in chapter 1, illustrates the basic features of this approach. To determine whether a syllogism is correct, people are supposed to translate each premise into Euler diagrams that exemplify the possible set relations that the premise implies. In the example of figure 1.1, the premise All square blocks are green blocks appears as two separate Euler diagrams, one with a circle representing square blocks inside another representing green blocks. The second diagram contains two coincident circles representing the same two classes. Similarly, the premise Some big blocks are square blocks is represented in terms of the four diagrams that appear in the second row of figure 1.1. Erickson's model assumes that subjects never actually consider more than one possible diagram per premise, and that some of the errors they commit are therefore due to incomplete representations.

In order to determine the validity of the syllogism, subjects must combine these two premise representations. There are usually several ways in which this can be done (see the combined diagrams at the bottom of figure 1.1), and Erickson (1974) proposes several variant models that differ in the thoroughness of the combination process.
A correct procedure, of course, would be to combine the representations in all possible ways that are set-theoretically distinct. Finally, subjects must check whether the syllogism's conclusion holds in all combinations (in the case of a syllogism-evaluation task), or must generate a conclusion that holds in all of them (in the case of a production task). Erickson assumes that when more than one conclusion is possible on this basis, subjects' decisions are determined by biases like that of the atmosphere effect (see chapter 7). Thus, predictions about the difficulty of syllogisms depend on assumptions about which Euler diagram represents a premise, how thoroughly subjects combine the diagrams, and which conclusions they use to express the combinations. Of course, these predictions are supposed to apply to people without any formal knowledge of Euler circles. There is little doubt that people can form mental images of Euler circles if given sufficient training with them; however, the model is directed not at experts but at the intuitions of naive subjects.

It would be of interest to compare Erickson's predictions with those of our own model, but there are problems in doing this that stem from parameter estimation. Erickson's most tractable models are ones that combine the premise diagrams in all possible ways or that combine them in just one randomly chosen way. But, as he shows, these two models give an incorrect account of those syllogistic premises that have no valid conclusion (constituting the majority of problems). The random-combination model tends to underpredict the number of correct responses to these premise pairs, since single combinations always allow some incorrect categorical conclusion; but the complete-combination model overpredicts correct ("no valid conclusion") responses. It is therefore necessary to assume that subjects consider only some premise combinations and to specify which combinations these will be. Since the number of potential combinations is large, this means specifying a large number of parameters, one corresponding to the probability that each type of combination is considered. Because of this difficulty, it is not clear that this intermediate model can be fitted in a statistically meaningful way to the full set of syllogisms. Erickson himself has attempted to do so only for small subsets of problems. It may be possible to constrain the number of Euler combinations to make the theory easier to handle, but no such modification has yet been formulated.
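The first step of the Euler-circle approach, translating a premise into the set relations it permits, can be sketched in a few lines. This is a minimal illustration, not Erickson's model: it assumes the five standard Euler relations between two nonempty sets, stands each one in for by a small pair of example sets, and counts which relations make each of the four categorical sentence types true. All names are hypothetical.

```python
# One concrete pair of sets (X, Y) for each of the five Euler relations.
EULER_RELATIONS = {
    "X identical to Y": ({1, 2}, {1, 2}),
    "X inside Y": ({1}, {1, 2}),
    "Y inside X": ({1, 2}, {1}),
    "X overlaps Y": ({1, 2}, {2, 3}),
    "X disjoint from Y": ({1}, {2}),
}

# Truth conditions of the four categorical sentence types, stated over sets.
PREMISE_TYPES = {
    "All X are Y": lambda x, y: x <= y,
    "Some X are Y": lambda x, y: bool(x & y),
    "No X are Y": lambda x, y: not (x & y),
    "Some X are not Y": lambda x, y: bool(x - y),
}

# For each premise type, list the Euler diagrams in which it is true.
diagrams = {
    premise: [name for name, (x, y) in EULER_RELATIONS.items() if holds(x, y)]
    for premise, holds in PREMISE_TYPES.items()
}

for premise, names in diagrams.items():
    print(premise, len(names), names)
```

The counts agree with the description above: two diagrams for an All premise and four for a Some premise. A subject who considers only one diagram per premise is therefore working from an incomplete representation in most cases.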
Models and Mental Models

Johnson-Laird (1983) and his colleagues (Johnson-Laird and Bara 1984a; Johnson-Laird and Byrne 1991) have advocated a very similar approach to categorical syllogisms, using somewhat more compact diagrammatic representations. The model for syllogisms, however, is part of a more comprehensive theory of what Johnson-Laird calls "mental models." Before returning to syllogisms, let us get a grip on the intuition underlying this approach.



The idea of "mental models" rests on an analogy with model theory in logic. Traditional treatments of logic distinguish proof systems for a formal language from the semantics of the language, a distinction that we have already discussed in chapters 2 and 6. Proof theory concerns problems of deducibility: which sentences of the language can be derived from others. Model theory has to do with notions of semantic entailment and validity (see table 2.1). As we have seen, a sentence is semantically entailed by others if it is true in all models in which the others are true, and an argument is valid if its conclusion is semantically entailed by its premises.

In this book we have focused mainly on deducibility, since our principal objective has been to formulate a deduction system based on proof. However, several points about formal semantics are important in comparing logical models to mental models. One is that the models of a sentence (or of a set of sentences) need not be confined to the intended model, the one that the sentence seems to be about. In discussing the sentence IF Satellite(x) THEN Orbits(x, b_x) in chapter 6, we took the model to be one containing astronomical objects and defined the predicates Satellite and Orbits to coincide with their natural denotations. But we could also come up with a completely legitimate model M = (D, f) in which the same sentence is true but D (the model's domain) consists of the set of natural numbers and f (the interpretation function) assigns the predicates to the following sets: f(Satellite) = {x: x is odd} and f(Orbits) = {(x, y): y = x + 1}. The same conditional sentence will be true in this model and in an infinite number of other models as well. This also means that a sentence can be true in some model even though it is false.
For example, the sentence IF Person(x) THEN Child-of(x, b_x) (i.e., every person has a child) is false even though it is true in both of the models just constructed when f(Person) = f(Satellite) and f(Child-of) = f(Orbits), as defined above. Of course, any true sentence must be true in at least one model, namely the intended model for that sentence.

A related point is that model theory gives us no way to compute the validity of an argument. To determine whether an argument is valid, we need to consider the relation of the truth of the premises and the conclusion in all models, not just the intended one. Since there are generally an infinite number of models, there is obviously no possibility of checking them one by one. Part of the importance of model theory, as a branch of logic, is that it gives us a means of precisely characterizing correctness of arguments in a way that doesn't depend on computational considerations.
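The unintended number-theoretic model can be written down directly. The following is a hedged sketch: the domain is infinite, so the code can only spot-check the conditional on a finite sample of natural numbers rather than establish its truth, and the function `witness`, which supplies a value for the term depending on x, is an assumption introduced for the illustration.

```python
def satellite(x):
    # f(Satellite) = {x : x is odd}
    return x % 2 == 1

def orbits(x, y):
    # f(Orbits) = {(x, y) : y = x + 1}
    return y == x + 1

def witness(x):
    # Hypothetical choice of the object that x is related to.
    return x + 1

def conditional_holds(x):
    # IF Satellite(x) THEN Orbits(x, witness(x))
    return (not satellite(x)) or orbits(x, witness(x))

# A finite sample of the domain; this checks instances, it does not prove
# the sentence true over all natural numbers.
sample = range(1000)
true_on_sample = all(conditional_holds(x) for x in sample)
print(true_on_sample)
```

Reading the same two functions as the interpretations of Person and Child-of gives the case described above, in which a sentence that is false of actual people comes out true in this model.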



It is in proof theory that computation gets its due, since the deduction procedures of proof theory are always ones that we can carry out in a purely mechanical way. Thus, the comparison between the proof-theoretic and model-theoretic descriptions of a logical system (as in the results of chapter 6) gives us an idea of the extent to which the validity of an argument outstrips our ability to recognize it through mechanical means. This is true not only for the sorts of proof systems we considered in part II, but also (if Church's thesis is correct) for any procedure that is mechanically realizable.

If we restrict the domain of inferences, however, it is possible to give an algorithm that uses models to test arguments. A simple example is the truth-table method for testing propositional arguments in elementary logic (see, e.g., Bergmann et al. 1980 and Thomason 1970a). In this case, we can think of each line of the truth table as a model in which the function f assigns to each atomic sentence in the argument a value T (for true) or F (for false). Within each line, rules for AND, OR, NOT, and IF (similar to those in our definition of joint satisfaction in chapter 6) then assign T or F to each premise and to the conclusion on the basis of the truth of the atomic sentences. Separate lines represent different models, that is, different possible assignments of T or F to the atomic sentences. As a simple example, the truth table for the sentences p AND q and p OR q would have this form:

p   q   p AND q   p OR q
T   T      T         T
T   F      F         T
F   T      F         T
F   F      F         F
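The truth-table method for the two sentences p AND q and p OR q can also be spelled out as a short program. This is a minimal sketch, assuming sentences are encoded as Python functions of their atomic truth values; the helper names are hypothetical. It builds one row per assignment of T or F to the atomic sentences and counts an argument as valid when the conclusion is true in every row in which all the premises are true.

```python
from itertools import product

def truth_table(sentences, n_atoms):
    """Return one row per assignment: (atomic values, sentence values)."""
    rows = []
    for values in product([True, False], repeat=n_atoms):
        rows.append((values, tuple(s(*values) for s in sentences)))
    return rows

def semantically_entails(premises, conclusion, n_atoms):
    """True if the conclusion holds in every row where all premises hold."""
    for values in product([True, False], repeat=n_atoms):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

def p_and_q(p, q):
    return p and q

def p_or_q(p, q):
    return p or q

table = truth_table([p_and_q, p_or_q], n_atoms=2)
for atoms, values in table:
    print(atoms, values)

forward = semantically_entails([p_and_q], p_or_q, n_atoms=2)
backward = semantically_entails([p_or_q], p_and_q, n_atoms=2)
print(forward, backward)
```

With two atomic sentences the table has four rows. The number of rows doubles with each additional atomic sentence, so the test is always mechanical but can become very long.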





The table shows, for example, that p AND q has the value T only when both p and q have T. In the general case there will be 2^n truth-table lines in all, where n is the number of atomic sentence types in the argument. We can say that the premises semantically entail the conclusion (and that the argument is valid) if the conclusion gets the value T in every model (i.e., line of the table) in which all the premises have the value T. Thus, an argument with p AND q as premise and p OR q as conclusion would be valid, since whenever p AND q is true so is p OR q. Although the procedure for evaluating an argument in this way can be extremely lengthy if



the number of atomic sentence types is large, we can always carry out the test mechanically.

Euler circles also present a special case in which we can use models algorithmically to check arguments of a restricted type, namely arguments such as syllogisms containing only categorical sentences (i.e., IF P(x) THEN Q(x), P(b) AND Q(b), NOT (P(x) AND Q(x)), and P(b) AND NOT (Q(b))). The idea is to let the domain D be points in a plane and to let the interpretation function f assign predicates (P, Q, etc.) to regions that contain a set of these points. Since all that matters for satisfaction of these sentences are relations between regions, all interpretations that preserve these relations (inclusion, overlap, and nonoverlap) are equivalent, and we can represent the entire class of interpretations by the usual circles or ellipses.

Truth tables, Euler circles, and similar devices give us a decision procedure for arguments, but at the price of severely limiting the range of arguments we can test. For less constrained languages, model theory will give us a description of what it means for an argument to be valid, but not a method for recognizing valid and invalid ones.

The Mental-Models Hypothesis

With this background, let us look at a proposal for a psychologized version of model theory. Philip Johnson-Laird has described mental models for several types of arguments (see, e.g., Johnson-Laird 1983; Johnson-Laird, Byrne, and Tabossi 1989; Johnson-Laird and Byrne 1991; Johnson-Laird, Byrne, and Schaeken 1992); however, the theory for categorical syllogisms is a good place to begin studying them, since it provided the main theoretical and empirical starting point for Johnson-Laird's approach. Different versions of the syllogism model are presented in different publications (Johnson-Laird 1983; Johnson-Laird and Bara 1984a; Johnson-Laird and Byrne 1991; Johnson-Laird and Steedman 1978); we will use the most recent formulation (Johnson-Laird and Byrne 1991).
(For critiques of an earlier version of the syllogism theory, see Ford 1985 and Rips 1986.)

Mental Models for Categorical Syllogisms

Although the representation is a bit more abstract than Euler circles, the basic features of the "model" model are much the same as in Erickson's theory. As a first step, people are supposed to translate the individual syllogistic premises into diagrams



like those in figure 10.1. The idea is that these diagrams represent the terms of the premises as tokens standing for specific instances of the corresponding sets. For example, the premise All square blocks are green blocks would appear as a diagram in which there are tokens standing for individual square blocks (the xs in the top diagram in figure 10.1) and other tokens standing for individual green blocks (the ys). The former tokens are aligned horizontally with the latter, indicating that each square block is identical to some green block. The ellipsis at the bottom is a part of the model and means that additional rows of xs and ys can be added. However, the brackets around the xs stipulate that members of the set cannot occur elsewhere in the model (the xs are "exhausted" with respect to the ys, in mental-model terminology). Because the ys are not bracketed, further rows of the model could contain ys with either xs or non-xs (where the latter are symbolized ¬x). Similarly, No square blocks are green blocks would be represented by a diagram containing a sample of xs and a sample of ys segregated in different rows, as in the third diagram in figure 10.1.

Figure 10.1 Representations for categorical premises (All X are Y, Some X are Y, No X are Y, Some X are not Y), according to Johnson-Laird and Byrne (1991).



The brackets mean that no xs can appear with a y and no ys can appear with an x in further rows of the model.

In order to determine the validity of a syllogistic conclusion (or to produce a conclusion to a pair of premises), people must combine the diagrams for the individual premises to produce a representation for the premise pair as a whole. For example, consider how the model would deal with the sample syllogism that was used in earlier chapters (e.g., chapter 6, example (1)). For reasons that we will take up later, Johnson-Laird and Byrne's model applies most easily to situations where subjects must produce a conclusion from a pair of premises. Thus, suppose the task is to find a conclusion for (5) that will make the syllogism valid.

(5) All square blocks are green blocks.
    Some big blocks are square blocks.
    ----------
    ?

Figure 10.2 illustrates the mental-model approach to this problem in a form comparable to the Euler-circle representations of figure 1.1. The premise diagrams at the top are the same as those in figure 10.1, with ss representing individual square blocks, gs green blocks, and bs big blocks. The initial combined diagram is the one Johnson-Laird and Byrne (1991, table 6.1) give as the representation for the premises of syllogism (5). This combined diagram preserves the premise relationships in one possible way, with the ellipsis again allowing us to add further rows to the model. If you interpret the implied relation between the bs and the gs according to the convention about the meaning of the rows, you can see that in this combination some big blocks are green blocks. Before one can be sure that the conclusion is correct, however, one must check whether there are alternative combined diagrams that also represent the premises but in which the tentative conclusion is false. If such a counterexample exists, one must select a new tentative conclusion that is true of all combined diagrams found so far and then continue to search for further counterexamples.
If no counterexample exists, then the tentative conclusion must be a valid consequence of the premises. (When there is no categorical conclusion containing the end terms that holds in all models of the premises, one should respond that no valid conclusion is possible.) In the present example, further combinations can be formed by adding new b, s, and g tokens. Figure 10.2 shows possible ways of "fleshing out" the




Figure 10.2 A representation in terms of Johnson-Laird and Byrne's (1991) diagrams of the syllogism All square blocks are green blocks. Some big blocks are square blocks. Some big blocks are green blocks.








initial model that correspond to the 16 Euler circle combinations in figure 1.1. However, the same conclusion, Some big blocks are green blocks, remains true in these explicit models. Since there are no counterexamples to this conclusion, it is a correct completion for the premises in (5).

Predictions about the difficulty of syllogisms depend on the number of counterexamples. Syllogism (5) is an especially easy one, according to Johnson-Laird and Byrne, since there are no counterexamples to the conclusion reached in the initial model. Argument (6) presents more of a challenge.

(6) Some square blocks are not green blocks.
    All square blocks are big blocks.
    ----------
    ?

Figure 10.3 shows the premise models and the two combined models that Johnson-Laird and Byrne give for the premises in (6). In the first of the combined diagrams, it is true that some green blocks are not big and that some big blocks are not green blocks, so we have two potential conclusions to deal with. The second combined model provides a counterexample to one of them (Some green blocks are not big), since all the green blocks are now big; hence, the only categorical conclusion relating the big and green blocks is Some big blocks are not green blocks. Since there are no further diagrams of the premises that make this conclusion false, it must be the correct completion for (6). On Johnson-Laird and Byrne's account, the difficulty of solution is a function of the number of models that a person must form before reaching the correct conclusion. This implies that syllogism (5), which requires only one model, should be easier than syllogism (6), which requires two, and this prediction clearly holds in the data.

An Appraisal of Mental Models for Syllogisms

Johnson-Laird and Byrne (1991) report that "it is a striking fact that the rank order of difficulty of the problems is almost perfectly correlated with the predictions of the theory."
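The conclusions for syllogisms (5) and (6) can be checked against an exhaustive search over set-theoretic models. The sketch below is not Johnson-Laird and Byrne's procedure, which builds a few models and looks for counterexamples; it simply enumerates every distinct model. It assumes that a model is fixed by which of the eight kinds of block (square or not, green or not, big or not) occur in it, and that sets may be empty. The helper names are hypothetical.

```python
from itertools import combinations, product

# A kind of block is a triple of truth values: (square, green, big).
KINDS = list(product([True, False], repeat=3))
S, G, B = 0, 1, 2

def all_are(x, y):
    return lambda model: all(kind[y] for kind in model if kind[x])

def some_are(x, y):
    return lambda model: any(kind[x] and kind[y] for kind in model)

def some_are_not(x, y):
    return lambda model: any(kind[x] and not kind[y] for kind in model)

def models():
    """Every nonempty selection of kinds of block (255 in all)."""
    for size in range(1, len(KINDS) + 1):
        yield from combinations(KINDS, size)

def follows(premises, conclusion):
    """True if no model makes the premises true and the conclusion false."""
    return all(conclusion(m) for m in models() if all(p(m) for p in premises))

premises_5 = [all_are(S, G), some_are(B, S)]
premises_6 = [some_are_not(S, G), all_are(S, B)]

some_big_are_green = follows(premises_5, some_are(B, G))
some_big_are_not_green = follows(premises_6, some_are_not(B, G))
some_green_are_not_big = follows(premises_6, some_are_not(G, B))
print(some_big_are_green, some_big_are_not_green, some_green_are_not_big)
```

The search confirms Some big blocks are green blocks for (5) and Some big blocks are not green blocks for (6), and it finds counterexamples to Some green blocks are not big. What it cannot show is how a reasoner who builds only one or two models would know when to stop searching.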
The theory predicts a total of 205 different types of conclusions across the syllogistic premises, 159 of which actually appeared in the data of Johnson-Laird and Bara's (1984a) third experiment. (See chapter 7 above for a discussion of this data set.) These conclusions accounted for 87.3% of the response tokens that the subjects produced. In the opposite direction, there are 371 response types that are predicted not to appear; of these, 46 types did in fact occur, contrary to prediction. These make up the





Figure 10.3 A representation in terms of Johnson-Laird and Byrne's (1991) diagrams for the syllogism Some square blocks are not green blocks. All square blocks are big blocks. ?


remaining 12.7% of the responses.⁴ (Johnson-Laird and Bara explain some of these on the basis of a Gricean implicature from sentences of the form Some X are not Y to Some X are Y.) Thus, the model performs about as well as the one described in chapter 7. A closer examination of the theory and the data, however, suggests some problems concerning the logical adequacy of mental models and the relation between the theory and its predictions.

Notice, first, that Johnson-Laird and Byrne's mental models differ from standard logical models in several ways. Whereas standard models consist entirely of sets, mental models also possess such devices as negation signs, ellipses indicating the potential for further models, brackets representing subset relationships, and rows depicting equality among the tokens. The procedures for constructing and revising mental models and the procedures for generating conclusions from them must preserve the meanings of these symbols. For example, the procedures must govern the brackets so that if [a] b exists in a model it will not be fleshed out with a new row pairing a with ¬b. Newell (1990, p. 394) points out that, as these devices increase in scope, they "become indistinguishable from full propositional representations." It is easy to overlook these constraints, but they must exist in the mental-model apparatus in order to guarantee correct results.

Thus, mental models and the procedures defined over them have a lot in common with rule-based proof systems. Not only do both contain negation signs and other operators, both also employ routines for transforming strings of these symbols into other strings that preserve the truth of the premises. This means that it is at best extremely misleading to regard mental models as a "semantic" theory distinct in kind from "syntactic" approaches (Rips 1986). The strings in a mental model have a syntax as well, and are as much in need of semantic interpretation as any other representation. As Soames (1985, p.
221) puts it, "the 'mental models' hypothesis is, in effect, a proof procedure inspired by semantic ideas."

Along the same lines, mental models, like Euler circles, have built-in limitations when compared with formal models. For example, since predicates are represented by a small number of tokens, it will be impossible to represent correctly sentences with predicates or quantifiers that depend on an infinite number of objects. If k is the largest number of tokens that can appear in a mental model, then there is no apparent way to represent sentences like There are more than k natural numbers. Although Johnson-Laird (1983) mentions such sentences, his treatment suggests that mental



models are not the way people represent them. "Scaled-down" models with only a few tokens might go some way toward dealing with such sentences, but in general the information from them "can be grasped without having to construct a mental model containing the complete mapping. One way of thinking of the representation of the sentence is therefore ... a propositional representation that is set up but never actually used in a procedure to construct a mental model" (ibid., p. 443). Reasoning with such sentences, Johnson-Laird believes, "forces a distinction between naive or intuitive reasoning based on mental models and mathematical reasoning, which relies on other mechanisms" (p. 444). There is little elaboration on what these other mechanisms might be.

Further, Johnson-Laird and Byrne's syllogism theory is inexplicit in ways that make it difficult to understand the basis of their predictions. This is easiest to see in the case of syllogisms such as those in (7).

(7) a. All green blocks are square blocks.
       All square blocks are big blocks.
       ----------
       ?
    b. All green blocks are square blocks.
       Some square blocks are big blocks.
       ----------
       ?

Suppose you are a subject in an experiment and you have to produce a conclusion that follows from each of these syllogisms. According to mental-model theory, you must first produce an initial model of the combined premises. In the case of (7), it turns out that these initial models are identical (see Johnson-Laird and Byrne 1991, pp. 121 and 126):

(8) [[g] s] b
    [[g] s] b
    . . .

Now, from this representation you can read off the tentative conclusion All green blocks are big blocks. This is, of course, the correct conclusion for syllogism (7a), which is one of the easiest problems. Johnson-Laird and Byrne state that "there is no way of fleshing out the model to refute this conclusion: it is valid and depends on the construction of just one model" (ibid., p. 121). If you produced this same conclusion to (7b), however, you would be wrong. There is no categorical conclusion that follows from (7b), and the correct response should be that "nothing follows." It must be



possible to revise the initial model in (8) to show that All green blocks are big blocks isn't an entailment of (7b), and Johnson-Laird and Byrne suggest (9).

(9) [[g]  s]
    [[g]  s]
          s   b
              b

Syllogism (7b) is a multiple-model problem, since both (8) and (9) are required to show that there is no valid conclusion.

The troublesome aspect of this example is this: How does the reasoner know to stop with model (8) in the case of (7a) and to continue revising the model in the case of (7b)? Model (8) itself can't supply the answer, since this model is precisely the same for the two syllogisms. Nor is it true that there are no further models of the premises in (7a); the model in (10), for example, is also consistent with the same premises.

(10) [[g]  s]  b
     [[g]  s]  b
           s   b
               b
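The contrast among models (8), (9), and (10) can be checked mechanically. What follows is only a minimal sketch, not Johnson-Laird and Byrne's procedure: it treats a model as a list of blocks, each block being the set of properties it has, ignores the bracket notation, and assumes that All X are Y carries existential import. The function names and the encodings of the three displayed models are my own illustrative assumptions. The exhaustive search simply inspects every combination of block kinds, which is the step the theory leaves unexplained.

```python
from itertools import combinations, product

PROPERTIES = "gsb"  # g = green, s = square, b = big

# Every kind of block: a frozenset of the properties it has (8 kinds in all).
KINDS = [frozenset(p for p, has in zip(PROPERTIES, bits) if has)
         for bits in product([False, True], repeat=len(PROPERTIES))]


def all_are(model, x, y):
    """All X are Y, with existential import (an assumption): some X exists."""
    xs = [block for block in model if x in block]
    return bool(xs) and all(y in block for block in xs)


def some_are(model, x, y):
    """Some X are Y: at least one block has both properties."""
    return any(x in block and y in block for block in model)


def premises_7a(model):
    return all_are(model, "g", "s") and all_are(model, "s", "b")


def premises_7b(model):
    return all_are(model, "g", "s") and some_are(model, "s", "b")


def tentative_conclusion(model):
    """All green blocks are big blocks."""
    return all_are(model, "g", "b")


def counterexamples(premises):
    """Sets of block kinds in which the premises hold but the conclusion fails."""
    found = []
    for size in range(1, len(KINDS) + 1):
        for model in combinations(KINDS, size):
            if premises(model) and not tentative_conclusion(model):
                found.append(model)
    return found


# Hypothetical encodings of the displayed models (brackets ignored).
MODEL_8 = [frozenset("gsb"), frozenset("gsb")]
MODEL_9 = [frozenset("gs"), frozenset("gs"), frozenset("sb"), frozenset("b")]
MODEL_10 = [frozenset("gsb"), frozenset("gsb"), frozenset("sb"), frozenset("b")]

if __name__ == "__main__":
    print("counterexamples to (7a):", len(counterexamples(premises_7a)))
    print("counterexamples to (7b):", len(counterexamples(premises_7b)))
    for name, model in [("(8)", MODEL_8), ("(9)", MODEL_9), ("(10)", MODEL_10)]:
        print(name, premises_7a(model), premises_7b(model),
              tentative_conclusion(model))
```

Because these quantifiers are insensitive to how many blocks of a given kind there are, searching over sets of kinds covers every finite model. The search finds no counterexample for (7a) and at least one for (7b), but only by examining the candidate models one by one.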

Somehow reasoners need to determine that revised models like (10) will not refute the tentative conclusion whereas models like (9) will; but it is not clear how this is possible in mental-model theory unless one actually inspects these models. On one hand, suppose the reasoners can infer from the premises of (7a) and (8) alone that no further models will refute All green blocks are big and, likewise, can infer from the premises of (7b) and (8) alone that further models would refute this conclusion. Then they would appear to be engaging in a form of inference that goes far beyond what is provided in the mental-model approach. The ability to make such inferences is entirely unexplained. In fact, for someone who is able to reason in this way, there would be no point whatever in constructing (9), and both (7a) and (7b) would be one-model problems, contrary to what Johnson-Laird and Byrne assert. On the other hand, suppose the reasoners must actually construct models like (10) in order to determine that no further models of (7a) refute the conclusion, just as they must construct



(9) to reject the conclusion for (7b). Then it is very misleading to say that (7a) "depends on the construction of just one model." Moreover, it is not clear exactly how many models reasoners must consider before deciding that the tentative conclusion must be correct.

A similar gap in the theory has to do with forming the initial combined model. On Johnson-Laird and Byrne's account, syllogism (6) is a "two-model" problem, since one of the potential conclusions that holds in the initial model in figure 10.3 (Some green blocks are not big) can be refuted by the second. According to Johnson-Laird and Byrne, this predicts that Some green blocks are not big should be a common error, committed by subjects who fail to keep track of the second model. But this relationship is not symmetric. Suppose that, instead of starting with the former model, you start with the latter. In that case, the model is consistent with the conclusion All green blocks are big as well as with the correct conclusion, Some big blocks are not green. Thus, All green blocks are big should be a common error. (In fact, although a few subjects do respond erroneously with Some green blocks are not big, none of them say All green blocks are big, according to the data of Johnson-Laird and Bara (1984a).) Thus, the predicted response to a syllogism depends on the order in which you generate the models. But Johnson-Laird and Byrne give no details on how this ordering is determined. (See Rips 1986 for further discussion.)

Difficulties like these become especially acute when we try to apply the mental-model hypothesis to other experimental procedures involving syllogisms. Consider the problem of predicting the difficulty of syllogisms in the task we studied in chapter 7, where subjects see a complete syllogism and must assess its logical correctness. Suppose the syllogism is a valid one.
Then if subjects begin by mentally constructing a model in which the premises are true, as Johnson-Laird and Byrne hypothesize, the conclusion of the syllogism will also be true in that model. Moreover, further revisions of the model that preserve the truth of the premises will, of course, also allow the same conclusion, since by definition a conclusion that is semantically entailed by its premises is true in all models of the premises. But notice that what makes a syllogism a one-, two-, or three-model problem seems to be the number of models necessary to eliminate rival conclusions that are consistent with the initial model but that are not entailed by the premises. For example, syllogism (6) is a two-model problem because the model on the right side of figure 10.3 is necessary to eliminate the incorrect conclusion Some green blocks are not big. In the



case of evaluating syllogisms, however, there are no rival categorical conclusions that a subject need consider; the only relevant conclusion is the one explicitly stated. This seems to imply that all valid syllogisms are one-model problems in the argument-evaluation task. But these problems are far from equivalent in difficulty for subjects, varying from 35% to 90% correct in the Schank-Rips data and from 24% to 95% in the Dickstein data (see chapter 7). One factor that might explain these differences is the figure of the syllogism (i.e., the arrangement of its terms), since according to Johnson-Laird and Byrne's account it is easier to construct models in some figures than in others. But even if we hold figure constant, there is still an enormous range in the difficulty of valid syllogisms. For example, syllogisms of the second figure (i.e., of the form (Q1(H,G), Q2(F,G))) vary from 35% to 90% correct in our data and from 58% to 95% correct in Dickstein's. Without further assumptions, there seems to be no way to predict these differences within the mental-model framework.⁶

Extended Mental Models: Multiple Quantifiers

Johnson-Laird has tried in more recent work to address some of the limitations of mental models by attempting to show that mental models can handle all the arguments that rule-based theories can and, at the same time, to provide a better fit to the data. One extension is to a class of two-premise arguments that contain sentences with a single relation and two separately quantified terms. Arguments (11) and (12) are examples of this sort from table 5 of Johnson-Laird et al. 1989.

(11) None of the grocers are related to any of the violinists.
     Some of the violinists are related to all of the dancers.
     ---------------------------------------------------------
     None of the grocers are related to any of the dancers.

(12) None of the grocers are related to any of the violinists.
     All of the violinists are related to some of the dancers.
     ---------------------------------------------------------
     None of the grocers are related to some of the dancers.

In these arguments, related to is supposed to be understood "in the simple consanguineal sense that subjects naturally treat as being transitive and symmetric" (ibid., p. 665). Both of these arguments are deductively correct, according to Johnson-Laird et al., but subjects find it easier to generate a correct conclusion to the premises of (11) than to those of (12).



The mental models for these problems are somewhat similar to those for categorical syllogisms in containing tokens for individual members. Instead of relating the tokens by aligning them in separate rows, however, Johnson-Laird et al. display them in regions of the model, with all the individuals who are "related" to one another in the same region. For example, Johnson-Laird et al.'s model for Some of the violinists are related to all of the dancers is

| v v v d d | Ov Ov |,

where the vs represent violinists, the ds dancers, and the Ovs violinists who may or may not exist.⁷ The model for All of the violinists are related to some of the dancers is

| v d d | v d | v d d | Od |.

As in the case of the models for categorical syllogisms, subjects are supposed to combine the models for individual premises, formulate a conclusion from the combined model, and attempt to refute the conclusion by constructing further models of the premises in which the conclusion is false. Predictions again depend on the number of models that provide counterexamples. Johnson-Laird et al. don't tell us exactly how many models each syllogism requires, however; they simply divide them into one-model and multiple-model problems. Argument (11) calls for one model and argument (12) for multiple models, according to this account.

The experiments of Johnson-Laird et al. confirm that subjects have a more difficult time producing conclusions to their multiple-model premises. But Greene (1992) has raised a serious question about the adequacy of these results. The basic contrast in the data is between one-model and multiple-model problems with valid conclusions, but the conclusion that Johnson-Laird et al. count as correct for all the latter problems is the one they express as "None of the X are related to some of the Y," as in argument (12). For this response to be deductively right, however, it must be interpreted as if some has wide scope. That is, it must be understood to mean that there are some Y such that no X is related to them (i.e., (∃y) NOT ((∃x) Related(x,y)) = (∃y)(∀x) NOT (Related(x,y)), or, in our quantifier-free notation, NOT (Related(x,b))). Greene shows that few subjects accept this reading for the sentence in question. Moreover, when given diagrams that embody the intended relationship, no subject (out of 40) was able to formulate a sentence describing them that meant the same



as NOT (Related(x,b)). This is not because these subjects were producing a more obvious description that blocked the sentence of interest; only 9 of the 40 subjects managed to respond with any true description. These findings imply that the difficulty of these multiple-model problems has to do with how the sentences are interpreted or produced rather than with the number of models that the subjects must form during reasoning.

Greene's results tally nicely with an experiment of my own in which Stanford undergraduates received complete syllogisms (premises plus conclusions) similar to (11) and (12) above. Half of the subjects saw the valid one-model and the valid multiple-model problems used by Johnson-Laird et al. (The conclusions for these syllogisms were the ones Johnson-Laird et al. suggest are entailed by the premises.) These valid items appeared randomly mixed with an equal number of invalid ones formed by permuting the premises and the conclusions. The other half of the subjects saw the same problems, but with each sentence rephrased to make its intended meaning more explicit. Thus, problems (11) and (12) would have appeared as (11') and (12') in the revised versions.

(11') No grocer is related to any violinist.
      There's a violinist every dancer is related to.
      -----------------------------------------------
      No grocer is related to any dancer.

(12') No grocer is related to any violinist.
      Every violinist is related to a dancer (but not necessarily the same dancer).
      -----------------------------------------------
      There's a dancer who no grocer is related to.
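The role of the wide-scope reading can be illustrated with a small brute-force check. This is only a sketch under stated assumptions: the domain is limited to two grocers, two violinists, and two dancers, and a symmetric, transitive relation is encoded by assigning each person a hypothetical "family" label (or none), so that two people are related exactly when they share a label. All names in the code are illustrative. Within this small domain, the premises of (11') entail its conclusion, the premises of (12') entail the wide-scope conclusion, and the premises of (12') do not entail the stronger claim that no grocer is related to any dancer.

```python
from itertools import product

GROCERS = ["g1", "g2"]
VIOLINISTS = ["v1", "v2"]
DANCERS = ["d1", "d2"]
PEOPLE = GROCERS + VIOLINISTS + DANCERS

# A symmetric, transitive relation is modeled by family labels: two people
# are related iff they share a label; None means related to no one.
FAMILIES = [None] + list(range(len(PEOPLE)))


def worlds():
    """Every assignment of the six people to a family (or to no family)."""
    for labels in product(FAMILIES, repeat=len(PEOPLE)):
        yield dict(zip(PEOPLE, labels))


def related(world, x, y):
    return world[x] is not None and world[x] == world[y]


def no_grocer_related_to_any(world, group):
    return not any(related(world, g, m) for g in GROCERS for m in group)


def no_grocer_violinist(world):
    return no_grocer_related_to_any(world, VIOLINISTS)


def no_grocer_dancer(world):
    return no_grocer_related_to_any(world, DANCERS)


def some_violinist_related_to_all_dancers(world):
    return any(all(related(world, v, d) for d in DANCERS) for v in VIOLINISTS)


def every_violinist_related_to_a_dancer(world):
    return all(any(related(world, v, d) for d in DANCERS) for v in VIOLINISTS)


def some_dancer_no_grocer_related_to(world):
    """Wide-scope reading: there is a dancer to whom no grocer is related."""
    return any(not any(related(world, g, d) for g in GROCERS) for d in DANCERS)


def entails(premises, conclusion):
    """True if the conclusion holds in every world satisfying the premises."""
    return all(conclusion(w) for w in worlds() if all(p(w) for p in premises))


PREMISES_11 = [no_grocer_violinist, some_violinist_related_to_all_dancers]
PREMISES_12 = [no_grocer_violinist, every_violinist_related_to_a_dancer]

if __name__ == "__main__":
    print(entails(PREMISES_11, no_grocer_dancer))
    print(entails(PREMISES_12, some_dancer_no_grocer_related_to))
    print(entails(PREMISES_12, no_grocer_dancer))
```

A search over one small domain cannot establish validity in general, but it does exhibit the counterexample that separates the two readings: a grocer who shares a family with a dancer to whom no violinist is related.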

Subjects were to decide whether the conclusion of each syllogism followed from the premises.

The results of this study show that the difference between one- and multiple-model syllogisms depends strongly on the wording. With the wording of Johnson-Laird et al., the subjects correctly evaluated 86.1% of the valid one-model problems but only 41.6% of the valid multiple-model problems. With the revised wording, the difference was significantly reduced; the respective percentages were 83.3 and 72.2. Thus, clarifying the premises and conclusions largely eliminates the effect of number of models. The residual difference between one-model and multiple-model syllogisms is probably due to the inclusion of very simple one-model items



(e.g., All of the X are related to all of the Y; All of the Y are related to all of the Z; Therefore, all of the X are related to all of the Z), which can be handled by simple forward rules and for which performance was perfect in the experiment. The best test cases are the syllogisms in (11)-(12) and (11')-(12'), since these items control for most other factors, as Johnson-Laird et al. acknowledge. The results for these pairs showed a large difference with the original wording: 92.0% correct on (11) and 42.0% correct on (12). But with the revised wording, the difference vanished entirely: Subjects were correct on 75.0% of trials with both (11') and (12'). Like Greene's results, these suggest that the effect Johnson-Laird et al. attribute to the number of models subjects form during reasoning is more likely the result of difficulty subjects had in grasping the meaning of the premises and potential conclusions.

Extended Models: Propositional Arguments⁹

Johnson-Laird, Byrne, and Schaeken (1992) have also attempted to enlarge the mental-models hypothesis to cover reasoning with propositional connectives. However, although the mental models that they propose in this domain bear a surface resemblance to those for categorical syllogisms, they must be interpreted quite differently, in a way that closely approaches the entries in a truth table. For example, a sentence of the form p OR q would appear as

 p   ¬q
¬p    q
 p    q

where each row is a separate "model" and where the ps and qs represent propositions. The first row or model represents the possibility that p is true and q is false, the second model the possibility that p is false and q is true, and so on. The models represent all the separate contingencies in which the entire sentence is true. Thus, the three "models" correspond to three lines of the standard truth table that we looked at earlier, namely those lines that assign T to p OR q. (Johnson-Laird et al. also allow p OR q to be represented "implicitly" by just the first two lines above, until the task requires that subjects "flesh out" the representation by adding the third line.)

This choice of representation also brings out a point that I made earlier about the syntax of mental models. As Johnson-Laird et al. mention, one can turn any of their models for sentential reasoning into an



isomorphic sentence in standard logical notation simply by placing an AND between each of the elements in the rows and then joining the rows with OR. For example, the models for p OR q above can be expressed as (p AND (NOT q)) OR ((NOT p) AND q) OR (p AND q). Sentences like these, composed of disjunctions each of which is a conjunction of atomic sentences or negations of atomic sentences, are said to be in disjunctive normal form. Thus, any mental model can be translated into an isomorphic sentence in disjunctive normal form.

As in the earlier studies, Johnson-Laird et al. (1992) use the number of models (i.e., rows) to predict the difficulty of a deduction problem. For example, premises that require three models should be more difficult for subjects than premises that require only one or two. However, these modified truth tables lead to some dubious predictions. Consider the sentential arguments shown in (13) and (14).

(13) p AND q
     IF p THEN r
     IF q THEN r
     -----------
     r

(14) p OR q
     IF p THEN r
     IF q THEN r
     -----------
     r

These two arguments vary only in their first premise, and both are clearly valid. However, they differ in the number of models in their representation. The first premise of (13) is just a single line or model consisting of p and q. But the first premise of (14) would appear as the three lines in our earlier example (or as just the first two lines in the case of the "implicit" representation). The conditional premises don't reduce this difference. The full set of premises in (13) has the model (13'), whereas the full set of premises in (14) has the model (14').

(13')  p    q   r

(14')  p   ¬q   r
      ¬p    q   r
       p    q   r
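The correspondence between these "models" and truth-table rows can be made concrete with a short sketch. It assumes the fully explicit representation (every row in which the premises are true), encodes premises as hypothetical Python functions over a truth assignment, and uses illustrative names throughout; it is not the procedure of Johnson-Laird et al. The sketch counts the rows for the premises of (13) and (14), checks that r holds in every row, and rewrites a set of rows in disjunctive normal form.

```python
from itertools import product


def models(premises, atoms):
    """Truth-table rows (dicts atom -> bool) in which every premise is true."""
    rows = []
    for values in product([True, False], repeat=len(atoms)):
        row = dict(zip(atoms, values))
        if all(premise(row) for premise in premises):
            rows.append(row)
    return rows


def to_dnf(rows, atoms):
    """AND the elements within each row, then join the rows with OR."""
    def literal(atom, value):
        return atom if value else f"(NOT {atom})"
    return " OR ".join(
        "(" + " AND ".join(literal(a, row[a]) for a in atoms) + ")"
        for row in rows
    )


def implies(a, b):
    return (not a) or b


ATOMS = ("p", "q", "r")
CONDITIONALS = [
    lambda v: implies(v["p"], v["r"]),  # IF p THEN r
    lambda v: implies(v["q"], v["r"]),  # IF q THEN r
]
ARG_13 = [lambda v: v["p"] and v["q"]] + CONDITIONALS
ARG_14 = [lambda v: v["p"] or v["q"]] + CONDITIONALS            # inclusive OR
ARG_14_EXCLUSIVE = [lambda v: v["p"] != v["q"]] + CONDITIONALS  # exclusive OR

if __name__ == "__main__":
    p_or_q = models([lambda v: v["p"] or v["q"]], ("p", "q"))
    print(to_dnf(p_or_q, ("p", "q")))
    for name, premises in [("(13)", ARG_13), ("(14)", ARG_14),
                           ("(14), exclusive", ARG_14_EXCLUSIVE)]:
        rows = models(premises, ATOMS)
        print(name, len(rows), all(row["r"] for row in rows))
```

On this encoding the premises of (13) yield one row, those of (14) yield three (two if OR is exclusive), and r is true in every row in each case, so both arguments are valid even though they differ in number of models.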



We are assuming that the OR in (14) is inclusive; but even if we were to interpret it as an exclusive disjunction, (14) would still require more models than (13). The representation would change only by deleting the last line of (14'). Thus, on any plausible assumptions (implicit vs. explicit representation, inclusive vs. exclusive OR), the mental-model theory must predict that (13) is easier for subjects to evaluate than (14).

Similar predictions hold for the pairs of arguments that form the rows in table 10.1. The first row of the table simply repeats (13) and (14). The second row contains related arguments that switch the polarity of the atomic sentences in the premises from p to NOT p and from q to NOT q. These two arguments are also valid, of course, and their models are like (13') and (14') except that the positions of p and ¬p (q and ¬q) are reversed. The arguments in the third row of the table are logically equivalent to those of the second row and have exactly the same models. Thus, the argument on the left in all three rows is associated with just a single model, whereas the argument at the right is associated with more than one model. It follows from the mental-models theory that the arguments on the left should be easier than the corresponding ones on the right. However, although the arguments in rows 2 and 3 are logically equivalent, they differ in the scope of the negation. The arguments in row 2 have narrow scope (the negation is within the conjunction or disjunction), while those in row 3 have wide scope.

Table 10.1
Percentage of "necessarily true" responses for one-model and multiple-model problems (n = 37).

One model                                Multiple models
Argument                   % Correct     Argument                   % Correct

p AND q                                  p OR q
IF p, r                                  IF p, r
IF q, r                                  IF q, r
r                                        r

(NOT p) AND (NOT q)                      (NOT p) OR (NOT q)
IF NOT p, r                              IF NOT p, r
IF NOT q, r                              IF NOT q, r
r                                        r

NOT (p OR q)                             NOT (p AND q)
IF NOT p, r                              IF NOT p, r
IF NOT q, r                              IF NOT q, r
r                                        r

To test the prediction of the mental-models theory, I asked subjects to decide whether the conclusions of the arguments in table 10.1 were "necessarily true" or "not necessarily true" on the basis of their premises. There were three groups of subjects in this experiment; each received the two arguments in one of the rows of the table. Each group also received two filler arguments that were not valid. The arguments were phrased in terms of simple sentences about the location of people in places, as in the experiments discussed in chapter 5 (see Rips 1990b).

Table 10.1 shows the percentages of the subjects that responded that the conclusion was necessarily true for the six critical arguments. The overall accuracy for these problems was fairly good (80.6%), in accord with their intuitive simplicity. However, contrary to the predictions of the mental-model theory, there was no difference whatever in subjects' accuracy for (13) and (14) (as can be seen in the first row). The same is true for the arguments in row 3, which have wide-scope negatives. Only the arguments with narrow-scope negatives (row 2) produced a trend in the direction of greater accuracy for one-model problems, though this difference did not result in a significant effect for number of models. There are, however, obvious differences in difficulty among these six problems. Comparing the three rows, we find that subjects have more trouble with the arguments involving wide-scope negation (only 64.9% correct) than with arguments with narrow-scope or no negation (87.8% and 89.2% correct, respectively).

These results suggest that the need to keep track of multiple models was not the source of the subjects' difficulties. Subjects must follow up the consequences of just one model to evaluate argument (13), but two or three models for argument (14); yet there is no difference in how easy it is to recognize them as deductively correct.
This is not just a matter of a ceiling effect, since the same goes for the problems in row 3 of table 10.1, where performance is relatively poor. What makes for complexity in reasoning with these problems is not multiple models but wide-scope negatives.

These results are generally consistent with PSYCOP's approach to these arguments. PSYCOP deduces the conclusion of argument (13) by means of its Forward AND and IF Elimination rules, and it deduces the conclusion of (14) by means of Forward Dilemma (see table 4.1). Although more rules are necessary for the proof of (13), it is quite possible that the Dilemma rule is somewhat harder for subjects to apply than the AND and IF Elimination sequence; Dilemma requires coordination of three premises, whereas AND and IF Elimination require one or two. There is no



reason to think that one of these methods should be much easier than the other. The same is true for the arguments in the second row of table 10.1. However, both problems in the third row also require distributing the NOT via one of the forward DeMorgan rules. Since this extra distribution step requires extra work on PSYCOP's part, accuracy should decrease.

How Plausible Are Mental Diagrams and Models as Explanations of Reasoning?

If people use mental models for reasoning, these models are unlikely to be much like formal models. There are ordinarily an infinite number of formal models for any consistent set of sentences; thus, there is no hope that people could evaluate the deductive correctness of arguments by sorting through all formal models for a set of premises. To make a psychologized version of model theory plausible, you have to introduce some limitations in the nature of the models and in the kinds of arguments they can handle.

In the case of Erickson's Euler-circle theory the limitations are clear. Even if there were no psychological limits on generating and manipulating Euler circles, the theory would apply only to a restricted set of arguments. There is no way to represent arguments that contain five or more terms with Euler circles (Quine 1972), and this means that we can deal effectively with categorical syllogisms and not much more.

In the case of Johnson-Laird's mental models, the restriction to finite numbers of exemplars leads to problems with even fairly simple inferences. The difficulties in reasoning with sentences like There are more than k natural numbers, when k is the largest number of tokens that can appear in a mental model, have already been noted. Our survey of the mental-models literature points up another sort of restriction. Since arguments that depend on propositional connectives are handled by one sort of model and arguments with quantifiers by another, there is no apparent way to explain reasoning that depends on both. This includes the type of arguments that we studied in chapter 7 (e.g., Someone is rich and everyone pays taxes; therefore, someone is rich and pays taxes). The models for quantifiers and the models for propositions all contain similar-looking tokens, but they must be interpreted in completely different ways when we move from one type of task to another. The mental model

a   b

can mean that some artists are beekeepers, that one artist is a beekeeper, that Alfred is in Albany AND Ben is in Boston, that an aardvark is to the left of a baboon, and presumably many other things. Hence, distinct processes have to operate on this model in these varied contexts in order to enforce the hidden logical differences among them. We have also seen that the mental-models hypothesis depends on a large number of unspecified assumptions about how people form and revise mental models, and it runs into serious problems in accounting for the experimental results.

In view of these deficiencies, why should we believe the mental-model hypothesis? Johnson-Laird (1983, chapter 2) lists six problems about reasoning that he believes mental models can solve, but which he believes create difficulties for approaches based on mental inference rules. Perhaps these problems could tip the balance in favor of models, despite the difficulties we have scouted. The first three of these problems are addressed in part II of this book: Which logics are to be found in the mind, how is mental logic formulated, and why do people sometimes make fallacious inferences? The theory outlined in chapters 4 and 6 provides answers to the questions about the nature and the form of mental logic, as well as some reasons why deduction sometimes breaks down. The results in chapters 5 and 7 confirm these hypotheses. I have also offered an answer to the fourth of Johnson-Laird's problems: How can we explain the inferences people spontaneously draw from the potentially infinite number that follow from a set of premises? The solution is that the spontaneous inferences are ones that follow from forward rules, where the distinction between forward and backward rules is in turn explained by the self-constraining property of the former (as discussed in chapter 3). People can use forward rules spontaneously because the output of these rules will never lead to runaway inferences.
The remaining two points of Johnson-Laird deserve comment, however, since they might provide a reason to hang onto mental models despite their flaws. We will also consider a matter that Johnson-Laird takes up elsewhere concerning the ability of models to explain truth and falsity.

Models, Inference Rules, and Content Effects

One of Johnson-Laird's remaining points has to do with the content effects discussed in the first section of this chapter: Perhaps mental models are better able to deal with these effects than logical rules, which are tied to the logical form of sentences. But our examination of mental models



shows that their ability to take "content" into account is no different from that of systems based on logical rules. For example, there is nothing in Johnson-Laird and Byrne's theory of syllogistic reasoning that depends on what the tokens in their models (the little gs, ss, and bs in figures 10.2 and 10.3) stand for. The mental-model theory would work in exactly the same way if the terms of the syllogism were switched from green, square, and big blocks to grocers, squires, and barristers, or to any other triple.

Of course, you could change mental models to coincide with further information you have about these classes. For example, if you happen to know on the basis of experience that all green blocks are square, then you might alter the representation of the first premise in figure 10.2 (All square blocks are green blocks) by bracketing the gs to indicate that there are no green blocks that aren't square. This would, in turn, have a subsequent effect on the conclusions you draw from the syllogism. As Politzer and Braine (1991) have noted, however, proponents of mental models have offered no account of how such an interpretive process takes place beyond saying that it depends on "world knowledge." Moreover, even if one grants such a process, this method of dealing with content is also open to inference-rule approaches. Prior information about the terms of a syllogism can be added to the premises in the form of additional mental sentences, where this extra information is then open to manipulation by rules in the usual way. For example, the premises in figure 10.2 could be supplemented by a sentence stating that all green blocks are square, and the outcome would presumably be the same as in the case of mental models.

For these reasons, mental models do not seem especially well suited to explain the sorts of findings that I called elaboration effects above; and much the same is true of effects of believability.
There is nothing about mental models to explain why subjects sometimes think arguments with believable conclusions are more likely to be correct than ones with less believable conclusions. A proponent of mental models might stipulate that people don't work as hard at finding counterexamples when the tentative conclusions are more believable (see Oakhill et al. 1989). But proponents of inference rules could equally well assert that people slack off when searching for proofs for arguments with less believable conclusions. In short, mental models are no better equipped than inference rules to explain either elaboration or believability effects, though both kinds of theory are probably consistent with such effects under further assumptions.



Models, Inference Rules, and Truth

In Johnson-Laird's framework, mental models are fundamentally different from the more usual formalisms in cognitive psychology, such as mental sentences and mental networks. The idea is that, whereas mental sentences and mental networks are just syntactically arranged strings of symbols, mental models could go beyond syntax in order to determine the truth or falsity of sentences:

    Theories based solely on linguistic representations do not say anything about how words relate to the world. ... Until such relations are established, the question of whether a description is true or false cannot arise. Mental models are symbolic structures, and the relation of a model to the world cannot simply be read off from the model. So how is the truth or falsity of an assertion judged in relation to the world? The answer is that a discourse will be judged true if its mental model can be embedded in the model of the world. Thus, for example, you will judge my remark about the table being in front of the stove as true if it corresponds to your perception of the world, that is, a model based on the assertion can be embedded within a perceptual model of the situation. ... (Johnson-Laird 1989, pp. 473-474)

The first thing to notice about this argument is that the term "linguistic representation" is ambiguous, since it could refer to representations derived from specifically linguistic material (such as spoken or written sentences) or to any mental representation that itself has the structure of a sentence. If we understand "linguistic representation" in the first way, then there are obviously situations in which our judgments of whether something is true or false depend on something other than these linguistic representations. If I tell you that the table is in front of the stove, then you might judge that to be true on the basis of your perception of the table's being in front of the stove.
On this reading, a representation derived from perception isn't a linguistic representation; so we need something more than linguistic representations to make judgments about truth.

However, the difficulty with understanding "linguistic representation" in this way is that you then can't use the argument to show that you need mental models in addition to mental sentences in order to account for such judgments. To see why, let us go back to the experiment by Clark and Chase (1972) that we discussed in chapter 5. Recall that on Clark and Chase's theory the sentence The star is above the plus is represented as the mental sentence Above(star,plus). A picture of a star above a plus is also represented as a mental sentence of the same form. In order to judge the truth or falsity of a sentence with respect to a picture, one compares the



two mental sentences. No further entity, such as a mental model, is necessary, on this theory, to account for the true/false decision. Mental sentences are sufficient to account for these judgments because you can use mental sentences to represent the information derived from perception as well as the information derived from a spoken or written sentence.

Thus, in order for Johnson-Laird's argument to establish that mental models are necessary, "linguistic representation" must be understood more broadly. "Linguistic representation" has to mean something like a mental sentence or some similar structure. If mental sentences can't by themselves account for judgments of truth or falsity, then maybe we need some other mental entity, perhaps mental models. But in that case there is no reason at all to accept Johnson-Laird's premises. If a linguistic representation is just a mental sentence, then mental sentences can account for judgments of truth and falsity at least as well as mental models. That is exactly what we just noticed in the star-plus example, where we can judge truth or falsity by comparing mental sentences. In other words, neither reading of "linguistic representation" allows the argument to establish that mental sentences aren't perfectly sufficient for truth judgments.

There may be another way to take Johnson-Laird's argument, however. In the preceding discussion, mental models were supposed to account for the belief that The table is in front of the stove is true: One might believe that this is true because one can compare a representation of the sentence with a representation based on perception of a scene. For these purposes, it seems as though the representation of the scene could as easily be a mental sentence as a mental model. But perhaps what Johnson-Laird means is that we need mental models in order to establish whether the sentence is really true, not just whether we believe it is.
The sentence The table is in front of the stove is true because of the way the world is, not because of the way our representations are. To establish truth in this sense, a comparison of mental sentences clearly isn't sufficient. Maybe Johnson-Laird thinks mental models are necessary because they can put us in touch with the world in a way that mental sentences can't. This sort of interpretation is suggested in the passage quoted above by the assertion that linguistic representations do not say anything about how words relate to the world. But consider The table is in front of the stove. This sentence is true, Johnson-Laird suggests, if there is a mental model for it of the right sort.



As he says, mental models are themselves symbolic structures; so we can't simply read off the truth of the sentence from the model. However, the sentence will be true if its mental model can be "embedded in the model of the world." It is completely obvious, though, that if this "model of the world" is also a mental model, another symbolic structure, then we need to ask how it is that we know that this model of the world is true. Even if one's model of the world is some sort of image derived from perception, there is no guarantee that it is true, since it is notorious that perception is subject to all sorts of illusions. Thus, if the model of the world that Johnson-Laird has in mind is another mental representation, being embedded in such a model is no guarantee of truth. Could Johnson-Laird mean that a sentence is true if it has a mental model that can be embedded in a nonmental model of the world? If so, then we need some sort of explanation of what it means to embed a mental model in a nonmental one. Moreover, at this point it is no longer clear what the concept of a model is buying us. Instead of saying that The table is in front of the stove is true if its mental model can be embedded in a nonmental model of the world, couldn't we say more simply that the sentence is true if the table is in front of the stove? Johnson-Laird and Byrne (1991, p. 213) suggest that, although their computer simulations operate in a purely syntactic way, the models become semantic when they are embodied in people because of causal links between models and the world. However, if that is true for mental models, it could equally well be true for other forms of mental representation.
The mental-model hypothesis is at its weakest when it claims for itself representational powers that go beyond those of sentences.11

Mental Models, Inference Rules, and Acquisition

Johnson-Laird's final point is that there is no obvious way that mental inference rules can be learned, whereas mental models pose no such problem. It is possible to suppose, of course, that inference rules for specific connectives or quantifiers are learned along with the words to which they correspond. AND Elimination and AND Introduction are acquired, on this account, while children are mastering the meaning of and in their native language. This seems quite reasonable, in some respects, in view of the close connection between a connective's meaning and its use in performing inferences (Dummett 1973; Sundholm 1986); in some cases, exactly this sort of learning may occur. But, as Fodor (1975, 1981) has



pointed out, the child needs some logical base even to begin formulating hypotheses about the meaning of such a word. If the child acquires and by learning that "and means . . .", then he or she must already have a language that is logically rich enough to fill in the blank. This seems to entail that some logical principles are innate. Johnson-Laird (1983, p. 25) objects to this conclusion on the ground that "there is no direct evidence for it whatsoever, merely the lack of any alternative that proponents find convincing." Moreover, he argues that mental models avoid this problem, since there are no inference rules to acquire. Instead, "what children learn first are the truth conditions of expressions: they learn the contributions of connectives, quantifiers, and other such terms to these truth conditions" (p. 144). The innateness of basic logical abilities is not a bitter pill, however. These abilities are exactly the sort of things one might expect to be innate, if anything is, given the central role of reasoning in cognition and learning. (See Macnamara 1986 for a development of this point of view.) And, of course, the presence of innate logical abilities doesn't imply that people never make mistakes in deduction, or that they are incapable of improving their inference skills, any more than the presence of innate grammatical abilities implies that people never make grammatical errors, or that they can't improve their grammar through learning. Mistakes can come about through limits on rule availabilities or memory capacity; improvements can occur (in logic class, say) through the learning of short-cuts that compress long chains of inferences into single steps. What is important in this context, however, is that, even if we are dubious about the innateness of logical rules, Johnson-Laird's truth conditions are not going to be more palatable.
The truth conditions for the logical connective and are standardly given in the form in which they were stated in chapter 6: S1 AND S2 is satisfied (or is true) iff both S1 and S2 are satisfied (are true). Within this framework, we can think of S1 AND S2 as the expression to be learned and of the material on the right-hand side of the biconditional as part of the child's preexisting language. But this makes it completely obvious that any such learning is impossible unless the child already understands what it means for both S1 and S2 to be true, that is, unless the child already knows the meaning of some expression that is logically equivalent to AND. Recasting this in terms of mental models, rather than formal models, doesn't avoid this consequence. Then the truth condition is presumably that S1 AND S2 is true in a mental model


Chapter 10

iff both S1 and S2 are true in the mental model. And this formulation, too, transparently requires prior knowledge of and's meaning. Fodor's problem applies in spades to the learning of truth conditions, and it leads to exactly the same innateness hypothesis. The basic argument for innateness is the difficulty of conceiving a learning mechanism that doesn't already contain basic logical operations. Problems about acquisition of logical knowledge, like problems about effects of content on reasoning, are difficult, and any theory that could give a convincing explanation of them would certainly be welcome. But mental models come no closer to this ideal than do rule-based approaches. This is probably traceable to the fact that both types of theories are, at heart, methods for transforming configurations of (syntactically structured) symbols in ways that preserve truth. In order for this to be done in a general way (a way that avoids the need to restate the same inference pattern for every new domain), both theories must be quite abstract, and their abstractness is what makes content effects and acquisition thorny issues. To decide between the theories, we need to consider other properties, such as explicitness, generalizability, and empirical accuracy. The results in part II should help decide the issue with respect to these characteristics.
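The circularity in the truth condition for AND discussed above can be made concrete with a small sketch. The sentence encoding, the `satisfies` function, and the dictionary-style model are all hypothetical names introduced here for illustration; nothing of this kind appears in the original argument. The point is only that the clause for AND cannot be written without already using a conjunction in the language doing the defining.

```python
# A minimal sketch, assuming sentences are either atom names (strings)
# or tuples of the form ("AND", left, right), and a model is a dict
# mapping atom names to truth values. These encodings are illustrative
# assumptions, not part of the theory under discussion.

def satisfies(sentence, model):
    """Return True if `sentence` is true in `model`."""
    if isinstance(sentence, str):
        # Atomic sentence: look its truth value up in the model.
        return model[sentence]
    operator, left, right = sentence
    if operator == "AND":
        # The clause for AND is stated with the metalanguage's own `and`:
        # whoever evaluates this line must already understand conjunction.
        return satisfies(left, model) and satisfies(right, model)
    raise ValueError(f"unknown operator: {operator}")


both_true = satisfies(("AND", "S1", "S2"), {"S1": True, "S2": True})
one_false = satisfies(("AND", "S1", "S2"), {"S1": True, "S2": False})
```

Swapping the formal model for a "mental model" would change the data structure passed in as `model`, but not the need for the conjunction on the right-hand side of the clause.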


Perspectives on Reasoning Ability

Galileo, you are traveling the road to disaster. . . . A moment ago, when you were at the telescope, I saw you tied to the stake, and when you said you believed in proof, I smelt burning flesh.
Brecht, Galileo

We have been after a psychological theory of deductive reasoning that can be defended on the basis of experiments and that has advantages over rival accounts. I hope that part II successfully carried the experimental burden of the theory and that the earlier chapters of part III established some of its advantages. But we also need to look from a higher position at the theory's adequacy, since we want to ensure that its assumptions don't clash with global facts about cognition or with secure principles from philosophy or AI. We have already bumped against such high-level principles in considering the place of deduction among inference types, but it is worth gathering some of these general issues here to get an overview of the theory's commitments. I would like to prevent misreadings of the theory's claims, if I can, and to indicate what questions the theory currently leaves open. Most of the critiques of rule-based, psychological theories of deduction fall into two opposing camps. One camp consists mainly of psychologists to whom such theories seem to make people out to be far more logical or rational than they are in practice. If people really come equipped with inference rules, how come they have such trouble with seemingly trivial problems in deduction? The other camp consists mainly of philosophers who feel that these theories make people out to be much more irrational than is possible. This camp believes that, on a proper analysis of the very concepts of deduction and rationality, people can't be systematically in error about basic logical matters. These two camps have clearly conflicting positions, and the present theory's intermediate stance places it in the crossfire. To see whether the theory is defensible, I therefore need to consider both camps and find out how much of the theory remains intact. Let me start with the view that our theory is "too rational," reserving discussion of the "too irrational" position till the end.

Aren't You Making People Out to Be More Logical or Rational Than They Really Are?

It is obvious, at this late stage, that the PSYCOP theory allows certain types of errors, since much of the evidence that supports the theory comes


Chapter 11

from data on the percentage of "incorrect" responses from subjects. Yet the question of what qualifies as an error in reasoning isn't straightforward. In early research on the psychology of reasoning, experimenters counted as an error any deviation from a standard logical system, generally Scholastic doctrine in the case of syllogisms or classical sentential logic in the case of sentential arguments. However, if an "error" of this sort is something for which we can actually criticize a subject, rather than just a convenient label for classifying the subject's responses, then the logical systems themselves must be correct norms of appraisal. Some early researchers may have accepted the standard systems as normatively appropriate simply out of ignorance of rival ones, but the variety of contemporary logics raises questions about the correctness of Scholastic and of classical logic. Inferences that follow the pattern of Double Negation Elimination, for example, are valid in classical logic but not in intuitionistic logic (Dummett 1977; Fitch 1952; Fitting 1983; Heyting 1956). Similarly, the syllogism (All(G,H), All(G,F), ∴ Some(F,H)) is valid according to Scholastic logic but not according to classical logic (see the appendix to chapter 7 above). This means that attributing errors to subjects can be a delicate matter that depends on how we justify the logical systems themselves. Let us call justifications of this sort "system justifications," since they determine the entire system's correctness or incorrectness. Once we have fixed on a system that we can justify, there is a further question of how closely people's decisions conform to it. Whether people's reasoning is correct or not is most clear-cut if the justification of the system doesn't itself depend on facts about human reasoning. Things become more complex if the system's criteria are themselves based on people's intuitions about correct inferences.
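The disagreement between Scholastic and classical logic over the syllogism just mentioned can be checked mechanically on a small domain. The sketch below is illustrative only: it assumes that categories can be modeled as sets, that the classical reading lets All(G,H) hold vacuously when G is empty, and that the Scholastic reading can be approximated by requiring G to be nonempty (existential import). The function names are hypothetical.

```python
from itertools import combinations, product

def subsets(domain):
    """Return every subset of `domain` as a frozenset."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def all_are(xs, ys):
    """All(X, Y): every X is a Y (vacuously true when X is empty)."""
    return xs <= ys

def some_are(xs, ys):
    """Some(X, Y): at least one individual is both an X and a Y."""
    return bool(xs & ys)

def find_counterexample(domain, require_nonempty_g):
    """Search for sets G, F, H that make All(G,H) and All(G,F) true
    but Some(F,H) false. Return the triple, or None if none exists."""
    for g, f, h in product(subsets(domain), repeat=3):
        if require_nonempty_g and not g:
            continue  # assumed Scholastic reading: G must have members
        if all_are(g, h) and all_are(g, f) and not some_are(f, h):
            return g, f, h
    return None

domain = {"a", "b"}
classical = find_counterexample(domain, require_nonempty_g=False)
scholastic = find_counterexample(domain, require_nonempty_g=True)
```

On this two-element domain the classical search finds a countermodel in which G is empty, while the search with nonempty G finds none, which is the sense in which the two systems classify the same response differently.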
It is reasonable to hold, for example, that a correct deduction system is one that squares with primitive intuitions about what follows from what (Cohen 1981; Goodman 1965; Harman 1986; Peacocke 1987). In that case the correctness of a particular inference depends on how it fits with the deduction system, and the correctness of the deduction system depends on how it fits with primitive judgments. Theories of this sort place a limit on how badly people reason, since the primitive judgments partially define correct reasoning. However, these theories generally leave some room for errors. Distractions can keep people from tapping their primitive intuitions for simple arguments, and their intuitions may not extend directly to more complex argument forms.



The problem of errors in reasoning is also complicated by the different computational levels in PSYCOP. This model comes with a stock of inference rules (the rules of part II), which I take to constitute part of the architecture of human cognition. It is possible to raise questions about whether these rules are correct, much as one can raise questions about the correctness of classical logic. If I am right about empirical matters (that these rules really are embodied in the architecture), and if these rules are incorrect, then the inference rules could be committing people to a type of irremediable error. Human thinking would then contain a basic flaw that could never be entirely eradicated. I am assuming, however, a second level of deduction, in accord with the promotional/demotional approach introduced in chapter 2. When people face a concrete deduction problem (e.g., those in the experiments of part II), they make use of deduction rules that they can employ strategically and can emend or supplement. I have assumed that these rules resemble the built-in ones, since built-in structure governs the way people learn and deploy the lower-level rules. But there may be disparities between levels, just as there can be disparities between programs in PASCAL and in the assembly language to which the PASCAL programs are translated. Errors arise naturally with respect to these lower-level rules, since individuals' processing limits and biases, competition from other nondeductive strategies, uncertainties about the correctness of a rule, and assumptions about the appropriateness of a rule can all interfere with the way people use them.1 One must keep these distinct levels of deduction in mind when discussing whether PSYCOP is "too logical." Some of the questions about errors are best directed to the architectural level, whereas others are more appropriately directed to the strategic level.
It is much more difficult (though perhaps not impossible) to convict human deduction of mistakes in architecture than to convict it of mistakes in strategic use of rules. Let us consider some of the possible issues about errors in relation to this dual approach.

Aren't You Asserting the Discredited Doctrine of Psychologism (or that the Laws of Logic Are the Laws of Thought)?

Psychologism is the thesis that a logical or a mathematical system is a description or a generalization of how people reason. According to this thesis, whether Double Negation Elimination (for example) is deductively



correct can be decided only in terms of whether people agree (perhaps under ideal circumstances) to the inference from NOT NOT P to P. Psychologism appears to be the result of an expansionist tendency of some nineteenth-century philosophers and psychologists to try to explain all facets of intellectual life in empirical psychological terms. Not surprisingly, this tendency provoked a strong backlash by members of other disciplines, who saw these psychological explanations as essentially trivial. As a result of this backlash, it has come to be a truism that logical principles are not merely redescriptions of psychological ones: "If we could accept as good psychology the old idea that logic teaches the laws of thought, we should be far [...] our knowledge of the process of thinking. Recent students of logic and psychology agree that logic, or at least the formal logic which has come down from Aristotle, is not psychology to any great extent." (Woodworth 1938, p. 807) It may seem at first glance that theories like PSYCOP are throwbacks to the earlier psychologistic tradition, but that would reflect a misunderstanding of the aims of these approaches. Frege's (1893/1964, 1918/1977) well-known attack on psychologism emphasized the fact that truths of logic and arithmetic hold whether or not anyone has managed to discover them. Moreover, such truths are constant, in contrast with people's fluctuating psychological awareness of them. These subsistence and stability properties are fundamental to doing logic and math, and it is impossible to explain them in purely psychological terms. (See chapter 1 of Macnamara 1986 for a discussion of the psychologism/anti-psychologism controversy.) Logical and mathematical truths also seem to be necessary truths that hold across all possible worlds or states of affairs. By contrast, empirical facts about human psychology are only contingently true, on the assumption that in other possible worlds our psychological makeup would be different.
This tension between logical necessity and psychological contingency (see Stroud 1981) again suggests that the latter can't be entirely responsible for the former. If a logical truth depended solely on contingent psychological properties of ours, then in some other world in which that property failed to apply (in which we reasoned in ways that we don't now) the logical truth presumably would also have to fail. But since logical truths can't fail, logic can't be psychological at heart. It might be possible to defuse some of these antipsychologistic arguments. For example, Sober (1978) claims that Frege's contrast between logical stability and psychological variability was due to his taking introspectionist psychology as a guide. Current information-processing theories posit mental mechanisms that are stable enough to allow logical principles to be psychologically real. (Sober believes, however, that full-blown psychologism still falls prey to Frege's subsistence thesis.) One might also argue that any such psychologically real logical principles must be essential aspects of the human mental makeup and hence true in all possible worlds. Such a position would contend that, just as a correct chemical theory of a natural kind such as water identifies properties of water that are necessarily true (Kripke 1972; Putnam 1975), so a correct theory of psychology identifies properties of reasoning that are necessarily true. If this is correct, then there need be no conflict in the status of logical principles and their psychological embodiments across possible worlds. The theory of part II, however, doesn't depend on the outcome of the debate between psychologism and anti-psychologism. The theory needn't claim that logic exists only as a psychological object or activity, and thus it doesn't entail psychologism as defined above. It is consistent with the possibility that logics are wholly or partly mind-independent entities and with the possibility that a logic system is justified by non-psychological standards. A complete divorce between logic and psychology creates puzzles about how people could apprehend a logic system, but these are puzzles for anti-psychologism itself and not for the particular reasoning theory I have proposed. The PSYCOP theory simply means to be descriptive of the mental processes that people go through when they engage in deductive reasoning, and nothing in Frege's argument contradicts the possibility that this is right (Notturno 1982). What Frege (1918/1977, p. 2) denied is that these mental processes constitute logical truths: ". . .
an explanation of a mental process that ends in taking something to be true, can never take the place of proving what is taken to be true. But may not logical laws also have played a part in this mental process? I do not want to dispute this, but if it is a question of truth this possibility is not enough. For it is also possible that something non-logical played a part in the process and made it swerve from the truth." On Frege's view, logical laws might inform mental processes, though the two can't be identical. The robustness of Frege's logical truths suggests that, instead of trying to reduce logical principles to psychological ones, it would be more reasonable to move in the opposite direction. Perhaps if it were possible to show that Fregean laws did all the work required of a theory of mental



inference rules, we could dispense with these rules as superfluous. But this would probably be a mistake. Mental rules are supposed to be part of a causal account of the way people evaluate arguments, produce conclusions, and solve other types of deduction problems. How Fregean logical truths could accomplish these things is, to say the least, perplexing.

Aren't You Assuming That People Can't Possess "Incorrect" Deduction Rules?

PSYCOP clearly admits the possibility that people don't always reason correctly. But how deep do the errors go? In the simulations of part II, we assumed that subjects have a certain pool of inference rules at their disposal but sometimes fail to apply them correctly, which causes their judgments to deviate from what they are ideally capable of. But what about the inference rules themselves? All the rules posited in chapters 4 and 6 appear to be sound with respect to classical logic, so people who stuck to these rules couldn't reach classically invalid conclusions. Any errors that occurred in their reasoning would have to be errors of omission (essentially, failures to find some mental proof) or errors due to guessing or to interference from other factors. However, one might try to argue that people also make errors of commission, applying some unsound rules that get them into trouble. (Goldman (1986) raises the issue of unsound rules in discussing cognitive models of deduction, though he doesn't necessarily advocate including them.) As examples of possible errors of commission, consider the "formal fallacies" that appear in textbooks on informal logic and rhetoric. These books warn against reasoning in certain rule-like patterns that are unsound in standard logic systems. For instance, the fallacy of "denying the antecedent" is an unsound argument of the form IF P THEN Q; NOT P; therefore, NOT Q.
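That denying the antecedent is unsound in classical sentential logic, while modus ponens is not, can be confirmed by brute-force truth tables. The following is a minimal sketch, assuming the classical (material) reading of IF; the helper names `implies` and `is_valid` are hypothetical and are not part of PSYCOP, which works with proof rules rather than truth tables.

```python
from itertools import product

def implies(p, q):
    """Material conditional: IF p THEN q, on the classical reading."""
    return (not p) or q

def is_valid(premises, conclusion):
    """An argument over P and Q is classically valid if no assignment
    makes every premise true and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False  # found a countermodel
    return True

# IF P THEN Q; P; therefore Q.
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
)

# IF P THEN Q; NOT P; therefore NOT Q.
denying_the_antecedent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: not p],
    lambda p, q: not q,
)
```

The countermodel for the second argument is the assignment with P false and Q true: both premises hold, yet NOT Q fails.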
It would be easy enough to formulate a deduction rule along the lines of those in table 4.1 that would produce such a fallacy if faced with the appropriate task. There is nothing about the mechanics of the inference-rule system that would forbid it. Moreover, subjects sometimes affirm arguments of just this type (see chapter 5). So there seems to be some motivation for taking these unsound rules to be psychologically real. Along the same lines, it is possible to specify an "atmosphere" rule for syllogisms that would cause subjects to affirm arguments like (All(G,H), All(G,F), ∴ All(F,H)), as was noted in chapter 9. In the area of probabilistic reasoning, claims for the presence of fallacious rules are



common (see, e.g., Kahneman et al. 1982). For example, some individuals seem to operate under the "law of small numbers" (Tversky and Kahneman 1971) or "gambler's fallacy," giving higher odds to a coin's landing "heads" the longer the string of preceding flips that landed "tails." If people possess such normatively inappropriate rules for reasoning with uncertainty, it seems a short step to assuming that they have similarly inappropriate rules for reasoning deductively.

Strategic Errors. The possibility of incorrect deduction rules has different implications for the strategic levels and the architectural levels of the theory, because these levels vary in how easy it is to change them. At the strategic level, it seems reasonable to suppose that people could have incorrect routines that they mistakenly take to be correct deduction rules. We could imagine, for example, a devious instructor who persuaded his students to use a particular rule for solving deduction problems that didn't happen to be correct and that produced incorrect proofs of arguments. Nothing in the model would prohibit such a possibility, and from the standpoint of proponents of nonclassical logics that is what most logic teachers actually do. (See, for example, Anderson and Belnap's (1975) discussion of the "Official View" of IF.) Of course, the methods that people use in reasoning are probably not ordinarily ones they explicitly learned in class, and deliberate professorial brainwashing is unlikely to be the source of many incorrect rules. But if people can learn incorrect rules through direct training, it seems highly likely that they can also learn them in indirect ways, abstracting them from the speech or the writing of others. So why not take incorrect rules such as these to be the basis for some of the mistakes that people make in reasoning experiments? Whether to include Denying the Antecedent and similar principles among the deduction rules can't be decided without evidence.
PSYCOP excludes them because it seems possible to construct within this framework better explanations for the phenomena that they are supposed to explain. I have suggested that when subjects affirm an argument of the form of Denying the Antecedent they may do so on the basis of extra information that they import from background knowledge, using legitimate inductive inferences (see chapters 5 and 10). Since these inductive strategies are needed anyway, a Denying the Antecedent rule seems redundant. Similarly, we get a better account of the syllogism data on the assumption that people reason by applying the rules of chapter 6 to the



premises and their implicatures. The implicatures are again needed on independent grounds to explain other facts about communication and inference, so this assumption seems sensible. It was also assumed that subjects are more willing to guess with Some or Some-not conclusions than with All or No conclusions, since the former seem "safer" to them. This assumption, too, makes more sense of the data than an Atmosphere rule that automatically produces a particular (Some or Some-not) conclusion if either premise is particular and a negative conclusion if either premise is negative. Excluding incorrect deduction rules on the basis of explanatory inadequacy, however, leaves the door open to including them if the evidence tips in the opposite direction.

Architectural Errors. What about errors at the architectural level? If intuitionistic logic is correct, for example, then there must be something wrong with the rules we are currently assuming to be part of the architecture, since they include Negation Elimination, Double Negation Elimination, and other principles that aren't valid in intuitionistic logic. But the situation here is different from that at the strategic level. Although we can try out strategic rules, discarding those that get us into trouble, we are stuck with architectural rules, since by definition architectural properties are fixed parts of the cognitive system (Newell 1990; Pylyshyn 1984). There is no way for us to get outside the particular architectural rules with which we are endowed. And since it is through these rules that we learn about and make sense of the mental capacities of other creatures, the rules place limits on the extent to which we can identify reasoning that proceeds according to principles that are very different from our own. This is one of the lessons of the philosophical work on rationality by Davidson, Dennett, and others cited in the preface and in chapter 1.
The concept of a deduction system itself limits the scope of architectural errors, since deduction systems commit us to constraints on what can qualify as a rule. For example, as Belnap (1962) pointed out, potential rules for a new connective have to be consistent with old ones. The new rules can't sanction arguments not containing the connective that weren't previously derivable; otherwise, it is possible to have rules (e.g., the *-Introduction and *-Elimination rules mentioned in note 3 to chapter 2) that prove any sentence on the basis of any other. Systems that violate such constraints no longer seem to have much to do with deduction. Of course, the deduction-system hypothesis may itself turn out to be a false



account of the psychology of thinking. People may not operate according to deduction rules at all; we may operate solely on the basis of probabilistic or connectionist principles, contrary to the evidence presented in part II. (See Rips 1990a for a discussion of this point of view.) However, if the deduction-system hypothesis is correct, there are going to be bounds to the erroneous rules that can be part of people's mental architecture. Although we can live with an operating system that has an occasional bug, it has to meet some minimal functional standards in order to be an operating system at all.

Doesn't the Account of Performance Beg the Question about the Source of Subjects' Errors on Deduction Tasks?

A related criticism of the theory is that, although it may provide a place for reasoning errors, it doesn't explain them in a principled way. According to this objection, if errors are simply the result of an arbitrary failure to apply a rule or of a haphazard collection of interfering strategies, then it seems that we haven't made much progress in accounting for why people commit them. This is especially true, of course, if we appeal to these error-producing factors in an ad hoc way. Similarly, estimating parameters from the data to produce the error distributions, as was sometimes done in the experiments, may give us reason to think that the model is consistent with the results; but it doesn't seem to provide much insight, since it doesn't generate predictions or explanations about the difficulty of the problems (Johnson-Laird and Byrne 1991). A preliminary point to make in regard to this objection is that we need to be careful about adopting unrealistically simple ideas about the source of subjects' mistakes. The successful performance of any complex skill obviously depends on the coordination of a large number of internal components and external enabling conditions, and the failure of any of these factors can lead to errors.
For this reason, a unified account of errors seems extremely unlikely. This doesn't mean that errors are unimportant or irrelevant for studying mental abilities; on the contrary, much of cognitive psychology is premised on the idea that errors can reveal something of the structure of skills. There are even classes of errors that have intrinsic interest and deserve study on their own. However, it is unlikely that we will make much progress in studying reasoning unless we recognize that there are diverse sources of mistakes. Working-memory limitations, time limitations, attention limitations, response biases, interpretation difficulties,



Gricean pragmatic factors, number of required steps in the problem, and complexity of individual steps are all known to affect performance on a wide range of cognitive tasks, and it would be extremely surprising if they didn't affect reasoning performance as well. The study of errors in reasoning is important precisely because different types of errors pinpoint different aspects of reasoning. Thus, if having a principled account of reasoning errors means discovering a single factor that explains all of them, no such account is apt to be forthcoming.

Why Errors Happen. Our handling of errors could be unprincipled in another sense if it appealed to errors in order to hide defects in the deduction theory. Evidence that subjects' responses fail to conform to the predictions of the model could obviously be due to the model's deficiencies. If it is, then to explain these deviations by claiming them to be subjects' errors is to blame the subjects for the theorist's mistake. Ascribing errors to subjects in this way threatens to make the theory vacuous unless there are good independent grounds for these error tendencies. But I don't think it is fair to describe the use of errors in the experiments reported here as merely an ad hoc attempt to make the theory fit the data. In explaining some of the earlier results in this field, the analysis was necessarily ad hoc; there is no way to test the theory in this context short of redesigning the original studies. We did, in fact, redesign the categorical-syllogism experiments for this reason in order to eliminate what we thought were error tendencies (e.g., bias caused by response frequency) that obscured the syllogism findings. But there are so many different types of deduction experiments that one cannot do this in all cases, and for many of these studies we have to be content with plausibility arguments to the effect that the theory is consistent with the findings.
When we move from earlier research to the new studies of part II, however, we no longer have to rely on ad hoc methods. Most of these experiments derive predictions from the model on the basis of well-established assumptions about how errors come about. These studies employ the straightforward notions that, other things being equal, errors are more likely if the problem contains more required steps, if the complexity of each step is relatively great, or if the types of steps are more varied. It is hard to think of more fundamental assumptions than these in the cognitive psychology tradition. Using these assumptions in a predictive way to evaluate a theory is as clear-cut a methodology as cognitive

Perspectives on Reasoning Ability


psychology has to offer. The proof-following experiment in chapter 5 and the matching experiment in chapter 7 are nearly pure examples of this methodology.

Parameters and Model Fitting

Parameter estimation in theory testing introduces some further complications because the parameter values seem to many people to be devoid of predictive or explanatory power: The values appear to be derived entirely from the data rather than from the theory. Thus, fitting the model in this way, as we did in several of the studies in part II, again raises questions about whether our explanations aren't ad hoc. The reason for estimating parameters, however, is to provide a quantitative analysis of errors and other response measures. As long as we are content with qualitative predictions, we can rely solely on what the theory says about the relative number of steps and the relative complexity of the steps in a problem. But in order to obtain quantitative predictions, we need exact measures of how likely it is that a reasoning step will go awry or how long such a step will take. Although there are several ways of estimating these measures, none of them entirely eliminates free parameters. Moreover, alternative methods are also based in part on the data and are sometimes subject to confoundings, as was noted in chapter 9.²

Although it is certainly possible to abuse parameter estimation in model fitting, "free" parameters in models like that of part II are constrained by the theory. First, the values of these parameters depend in part on the structure of the equations in which they appear, and the equations are themselves based upon what the theory says about the relation between internal processes and external responses. This contrasts with the case of fitting an arbitrary polynomial to the data, in which the values of the coefficients aren't tied to any substantive theory.
An incorrect theory will, of course, come closer to fitting the data as the result of optimizing the values of its free parameters. Nevertheless, there are upper limits to how good the fit can be when the theory dictates the form of the equations, as one quickly learns from practical experience in model fitting. Second, the estimates gain credence when similar values appear from one experiment to the next, as they did in the studies reported in chapters 5 and 7. Stability of the estimates across different sets of subjects and different types of deduction problems suggests that the parameters are capturing permanent aspects of people's reasoning ability and not just
superficial features of the experiment. It is a remarkable fact that very similar values of the rule parameters show up when subjects are reasoning about sentential arguments, categorical syllogisms, and general multivariable arguments. Third, the theory assigns an interpretation to the parameters, and this too constrains the role they play in the explanation of reasoning errors. In general the theory predicts that more structurally complicated rules should be associated with lower parameter values in our studies, since the more complex the rule the more difficult it should be to use correctly. For example, rules that create and manipulate subdomains should be more difficult than rules that don't, other factors being equal. Parameter values that accord with this complexity ordering help confirm the theory. The experiments on syllogisms and multivariable arguments in chapter 7 support this prediction, since estimates are invariably lower for rules that create subdomains (e.g., NOT Introduction) than for rules that don't (e.g., AND Introduction or AND Elimination). (See tables 7.2 and 7.5.) Parameter values for the argument-evaluation experiment in chapter 5 (table 5.2) are not as clear-cut. For example, the Backward OR Introduction rule (which yields P OR Q from P) is fairly simple internally, but it has the lowest availability value in that experiment. This finding must be treated with caution, and my tentative explanation for it (that Gricean factors made subjects unwilling to go along with inferences of this type) is indeed ad hoc. However, deviations like these are no different from deviations from other predictions; parameter estimation plays no special role.

In short, PSYCOP has a well-motivated account of how and when subjects make errors in reasoning. These predictions arise naturally from the form of the theory - the type and the structure of the rules - together with some very general and widely accepted assumptions about performance breakdown.
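The qualitative claim in this passage (more rule applications, or less available rules, mean more errors) can be illustrated with a small numerical sketch. It assumes, purely for illustration, that each rule has an availability value between 0 and 1 and that rule applications succeed independently, so that the chance of completing a proof is the product of the availabilities of the rule tokens it needs. The multiplicative form and all of the numbers below are assumptions of this sketch, not the equations or estimates reported in part II.

```python
from math import prod

# Hypothetical availability values: the probability that a rule is applied
# correctly when it is needed. Illustrative numbers only; these are NOT the
# estimates from tables 5.2, 7.2, or 7.5.
AVAILABILITY = {
    "AND Elimination": 0.95,
    "IF Elimination": 0.90,
    "NOT Introduction": 0.40,  # subdomain-creating rule, assumed harder
}


def p_correct(rule_tokens, availability=AVAILABILITY):
    """Probability that every rule application in a proof succeeds,
    assuming (for illustration) that applications are independent."""
    return prod(availability[rule] for rule in rule_tokens)


short_proof = ["AND Elimination", "IF Elimination"]
long_proof = short_proof + ["IF Elimination", "AND Elimination"]  # more steps
hard_proof = short_proof + ["NOT Introduction"]                   # harder step

for name, proof in [("short", short_proof), ("long", long_proof), ("hard", hard_proof)]:
    print(f"{name}: {p_correct(proof):.3f}")
```

Under these assumptions the longer proof and the proof containing the more complex rule both come out less likely to succeed than the short proof, which is the ordering the text describes.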
In the studies reported here, we have mainly relied on the notion that a problem is error prone to the extent that it requires many rules (types or tokens) or more complicated rules in its solution. The data confirm this notion in most cases.

Aren't You Assuming a "Logicist" Approach That Has Already Proved Unwieldy in AI? (Aren't You Assuming That Everything Is a Matter of Deductive Reasoning?)

Aside from errors, a second source of uneasiness about linking reasoning with natural-deduction rules is the question of how such rules could be
responsible for the variety of inferences that people engage in. There are obviously lots of perfectly good inferences that aren't deductively valid. I infer from my previous beliefs that the french fries at the university cafeteria come from potatoes rather than from recycled newspaper, even though there is no valid argument from my evidence to my conclusion; it is not logically, or even causally, inconsistent with these beliefs that someone has discovered that paper pulp can serve as a cheap potato substitute. Furthermore, these nondeductive inferences occur continually in comprehension, planning, problem solving, and decision making. Giving deduction rules a crucial role in thinking - making them part of cognitive architecture - seems to deny the cognitive importance of other inference types and seems to leave us without any way of explaining them. This amounts to a kind of deduction chauvinism whose only excuse is the fact that logical theory has given us a neat set of rules to work with.

Logicism in AI

A related issue has been played out in artificial intelligence as a debate between "logicist" researchers (who employ proof-theoretic and model-theoretic methods) and others who favor non-logic-based approaches. (See, e.g., McDermott 1987 vs. Hayes 1987 and other papers in the same issue of Computational Intelligence, and Nilsson 1991 vs. Birnbaum 1991.) Of course, neither party to this controversy believes that the only interesting inferences are those that correspond to theorems of classical logic. Rather, the crux is the extent to which we can fruitfully capture the commonsense information needed for skills like planning or problem solving in a theory composed of explicit sentences and of a logic for unpacking the entailments of these sentences. There are some obvious advantages to a logicist research program, since representing commonsense beliefs as an explicit theory makes it much easier to determine the adequacy of these beliefs (e.g., whether they are consistent and cover all of the intended domain) and to use the knowledge in situations that the researcher didn't foresee. The purported disadvantage is that this strategy seems to exalt deduction at the expense of other, equally interesting forms of inference: ". . . logicists tend to ignore other sorts of reasoning that seem quite central to intelligent behavior - probabilistic reasoning, reasoning from examples or by analogy, and reasoning based on the formation of faulty but useful conjectures and their subsequent elaboration and debugging, to name a few. . . ." (Birnbaum 1991, p. 59)³ In psychology, Kahneman and Varey (1990) have recently expressed similar antilogic
views with respect to reasoning about counterfactual situations, and their viewpoint is probably widely shared. Logicists have some room to maneuver in this controversy. It is possible to claim, for purposes of AI or even cognitive psychology, that formalizing commonsense knowledge in the logicist manner is a necessary preliminary step. Formalizing makes clear the content of commonsense beliefs by specifying the entities and relations that it presupposes. Once this job has been completed and it is clear what the domains of the beliefs are, we can then turn to implementing the theory in a computer program or discovering how it happens to be implemented in humans. Thus, the logicist needn't claim that people actually represent knowledge in a logic-based theory, or that they use deduction to draw inferences from it. A logicist could even agree that other forms of representation and process are more practical for building computer programs. The formalizing step is a way of doing a kind of "natural-language metaphysics" by teasing out the basic components of our everyday talk and thought (Bach 1986; Hayes 1985; McCarthy 1977); simulation and experimentation can take over from there. Moreover, logicists can point, with some justification, to the excessive vagueness of alternative ways of developing theories of commonsense knowledge in both AI and psychology. But although this line of defense is congenial in many respects, it depends on a certain way of partitioning these domains of human knowledge. The logicist program allows us to focus on the information that the axioms make explicit, leaving it to the logic to flesh out the axioms' implicit (but derivable) consequences. As long as the logic itself is well understood, this strategy seems realistic, and the logicist can afford to ignore probabilistic reasoning, analogical reasoning, and the rest.
But in practice such proposals have had to incorporate powerful logics (e.g., the higher-order intensional logic of Montague (1973), or the nonmonotonic logics discussed in chapter 8) whose properties aren't as clear or as elegant as those of standard systems. Once the logics become complex, it is no longer clear what advantage the logicist approach has over other forms of theorizing that include nondeductive inferences in their artillery.

The Place of Nondeductive Reasoning

No matter how this debate turns out, it is clear that the present theory doesn't have the luxury of claiming that implementing reasoning is someone else's business. Our goal is describing the ways people reason, and the theory is incorrect if it ascribes to deduction inferences that people carry out by other means. It would be
a mistake, however, to view the theory as taking the position that all inference is deduction. There is a place in this framework for nondeductive inference at exactly the same level as the strategic use of deductive reasoning. For example, we found in our discussion of categorizing (chapter 8) that the belief that an instance belongs to a particular category is generally not deducible from available evidence. We were able to accommodate this form of nondeductive reasoning by implementing it indirectly, placing the relevant sentence (that the instance is in the category) in memory when supporting evidence becomes available. The supporting evidence needn't logically entail the belief in order to trigger this process. PSYCOP can include other forms of reasoning in the same or similar ways. That is, we can express procedures for probabilistic reasoning or analogical reasoning that search memory for relevant support and produce inferences when the support meets given conditions for strength and simplicity. Of course, working out the details of these reasoning theories isn't a simple matter. But since many prior psychological theories of probabilistic reasoning (see, e.g., Collins and Michalski 1989; Osherson et al. 1991) and analogical reasoning (see, e.g., Gentner 1983; Gick and Holyoak 1983) can be implemented using sentences in the language of chapter 6, they seem to be perfectly consistent with our framework. At the strategic level deductive and nondeductive reasoning are on a par, but at the architectural level the theory works solely according to the deduction rules of part II. This helps explain the theoretical and empirical evidence for the centrality of deduction that we encountered in chapter 1. It helps account for why elementary deductive inferences such as IF Elimination, AND Elimination, and instantiation are so hard to deny and yet so hard to justify in more basic terms.
But in making these assumptions, we are not attempting to reduce all forms of reasoning to deduction or to portray such reasoning as deduction in disguise. It is easy to see how we could use a computer language such as PROLOG, which employs a form of resolution theorem proving, to write a program that reasons in a very different manner - say, according to the Tversky-Kahneman heuristics. In exactly the same way, it is possible to use an architecture that runs on natural deduction to carry out nearly any sort of nondeductive inference. This means that a theory of the psychological principles that govern nondeductive inference can proceed along independent lines, and the principles themselves will have an independent scientific interest. If this is a case of deduction chauvinism, it is a very mild one.

Aren't You Making People Out to Be More Illogical or Irrational Than


Once one has admitted the possibility that people can make mistakes in reasoning, one is open to doubts and criticisms from the opposite point of view. As I have already acknowledged, there appear to be theoretical limits on ascribing illogical thinking to people. The aim of this scientific enterprise is to explain ordinary people's inferences, where part of what it means to explain them is to view these thought patterns as obeying certain psychological laws or generalizations. But the generalizations at work in cognitive psychology are ones that presuppose a healthy dose of rationality: that people have sufficient mental aptitude to be able to fashion strategies to achieve their goals, that they can select among the strategies those that are more likely to be successful on the basis of their beliefs, that they can recognize when a strategy is in fact successful, and so on. Creatures who behave in a thoroughly irrational manner don't conform to these cognitive platitudes and hence aren't easily explained from a cognitive point of view. Of course, creatures who do conform to these platitudes needn't have perfect rationality; for example, people's ability to achieve goals doesn't mean that they always reason optimally in attaining them. But, having glimpsed these underlying rational principles, one can easily become skeptical of any alleged instances of reasoning errors (see, e.g., Cohen 1981, 1986; Henle 1978). For example, with respect to errors in experiments on deduction, Cohen (1986, p. 153) claims that "it looks as though the data are inevitably inconclusive. To be sure that the subjects understood the exact question that the investigators wished to ask them, it would be necessary to impose on them such an apparatus of clarifications and instructions that they could no longer be regarded as untutored laymen and the experiment would then become just a test of their educational progress in the field of philosophical logic."
Whether PSYCOP is too illogical or irrational depends, of course, on how "reasoning errors" are defined. Many of the types of error tendencies that we considered earlier don't count as reasoning errors for purposes of this debate, even though these tendencies may lead to judgments that differ from those sanctioned by some logic system. Both parties to the debate agree that many factors can be responsible for such deviations. If Q follows from P according to some logical theory T but subjects fail to affirm that Q follows from P, that could be because (a) T isn't the
appropriate normative standard; (b) subjects interpret the natural-language sentences that are supposed to translate P and Q in some other way; (c) performance factors (e.g., memory or time limits) interfere with subjects' drawing the correct conclusion; (d) the instructions fail to convey to subjects that they should make their responses on the basis of the entailment or deducibility relation rather than on some other basis (e.g., the plausibility or assertibility of the conclusion); (e) response bias overwhelms the correct answer; or (f) the inference is suppressed by pragmatic factors (e.g., conversational implicatures). If Q does not follow from P according to T but subjects affirm that Q follows from P, that could be because (a)-(e) hold as above; (g) subjects are interpreting the task as one in which they should affirm the argument, provided only that P suggests Q, or P makes Q more likely, or P is inductive grounds for Q; (h) subjects treat the argument as an enthymeme that can be filled out by relevant world knowledge; (i) subjects ascribe their inability to draw the inference to performance factors and incorrectly guess that P entails Q; or (j) subjects are misled by a superficial similarity to some valid inference from P' to Q' into supposing that there is a valid inference from P to Q. Any of these factors might, in the right circumstances, provide a plausible reason for why subjects' judgments depart from what is prescribed by a logic system - and this is not an exhaustive list. The question about "reasoning errors" is whether PSYCOP posits mistakes that go beyond factors like (a)-(j) and implicates a "true" failure in people's thinking apart from lapses of interpretation, attention, or response. Needless to say, it isn't easy to identify such errors in an actual sample of data. Armed with factors (a)-(j) and others like them, a determined skeptic can usually explain away any instance of what seems at first to be a logical mistake.
(The opportunistic use of these factors is behind the earlier complaints about unprincipled accounts of errors.) To determine whether we are giving people too little credit for correct reasoning, we therefore need to get straight about what would count as a true reasoning error and whether it makes conceptual and empirical sense to postulate such errors.

Aren't You Assuming That People Are Programmed to Be Irrational?

Some of those who believe that theories like PSYCOP paint too grim a picture of human reasoning seem to suggest that these theories doom people to faulty inferences. For example, Rescher (1988, pp. 194-195)
writes that "recent psychological studies have sought to establish with experimental precision that people are generally inclined to reason in inappropriate ways. One investigation [Rips 1984], for example, concludes that people systematically commit the well-known fallacy of denying the antecedent. . . . But it is far from clear that an error is actually committed in the cases at issue. For, people often use 'If p, then q' in everyday discourse as abbreviation for 'if but only if p then q'. ('If you have a ticket they will let you board', 'If you pass the course, you will get four credits'.) What seems to be happening in the cases at issue is not misreasoning, but mere conclusion-jumping by tacitly supplying that 'missing' inverse." Similarly, Rescher remarks a bit later (p. 196), "to construe the data [from experiments on deductive and probabilistic reasoning] to mean that people are systematically programmed to fallacious processes of reasoning - rather than merely indicating that they are inclined to a variety of (occasionally questionable) substantive suppositions - is a very questionable step." Rescher doesn't deny that human irrationality is possible: "Irrationality is pervasive in human affairs. While all (normal) people are to be credited with the capacity to reason, they frequently do not exercise it well." (ibid., p. 198) What seems to be at issue is whether incorrect reasoning is a "systematically programmed" part of thinking rather than just a peccadillo.

One type of systematically programmed reasoning error, on this account, would be the incorrect rules discussed in the preceding section. Thus, if people had a Denying the Antecedent rule that operated alongside the rules of chapter 6, there would be a natural tendency for them to conclude NOT Q from premises of the form IF P THEN Q and NOT P, and this would constitute the kind of inherently fallacious thinking that Rescher deems questionable.
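The logical point behind Rescher's remark can be checked mechanically. The sketch below is a brute-force truth-table test over classical truth values (the helper names `implies` and `is_valid` are introduced here for illustration and are not part of PSYCOP). It compares modus ponens, denying the antecedent under the ordinary conditional, and the same inference when the first premise is read as "if but only if."

```python
from itertools import product


def implies(a, b):
    """Classical (material) conditional."""
    return (not a) or b


def is_valid(premises, conclusion):
    """True if the conclusion holds in every row of the truth table
    (over P and Q) in which all of the premises hold."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


# IF P THEN Q, P, therefore Q
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p], lambda p, q: q
)
# IF P THEN Q, NOT P, therefore NOT Q
denying_antecedent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: not p], lambda p, q: not q
)
# P if and only if Q, NOT P, therefore NOT Q
biconditional_reading = is_valid(
    [lambda p, q: p == q, lambda p, q: not p], lambda p, q: not q
)

print(modus_ponens, denying_antecedent, biconditional_reading)
```

Running this prints `True False True`: the inference is invalid for the plain conditional but valid once the conditional is interpreted as a biconditional, which is why such responses need not reflect a built-in fallacious rule.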
It is quite true that subjects sometimes affirm the correctness of arguments that are of the denying-the-antecedent type (see Marcus and Rips 1979, and other studies cited in chapter 5). And it is also true that psychologists have, at times, posited systematic error patterns of this general sort; the atmosphere rules for syllogisms would probably qualify, at least under some interpretations. However, it would be a misreading of the present theory (and of earlier versions, such as that in Rips 1984) to suppose that it builds in such error tendencies. As has already been noted, there is no Denying the Antecedent rule in PSYCOP or its ancestors, nor is there any other rule that is not sound in classical logic. Although the theory does not exclude incorrect rules on a priori grounds, a better account can often be provided through the kind of alternative
factors that Rescher mentions. For instance, the idea that a conditional sometimes suggests its converse is essentially what I used to explain the data on conditional syllogisms in chapter 5. As PSYCOP is currently constructed, any "systematically programmed" errors must be due to failures to apply its rules. For example, the probability that subjects correctly apply the NOT Introduction rule is fairly low, according to the model. But should we take this as evidence of a built-in prohibition against this type of inference (i.e., as a "true" reasoning error), or instead as a proclivity that often affects reasoning performance? I claimed earlier that the stability of the parameters indicates that these tendencies are themselves stable characteristics of reasoning, and this might incline someone to view them as preprogrammed. However, the point of making these rule failures probabilistic parts of the model is to acknowledge variability in the use of a rule over occasions in which it is relevant. Rules like NOT Introduction are relatively difficult for subjects to apply, at least partly as a result of their internal complexity, and this gives rise to stable estimates. But this added complexity doesn't mean that people are inherently incapable of applying such rules. It is doubtful, for this reason, that the errors predicted by the availability parameters are instances of programmed irrationality of the sort that Rescher warns against. In this respect rules like NOT Introduction, which the model claims are difficult for subjects to use, differ from rules like Conditional Transformation (table 4.3), which are not part of the model's repertoire at all. Thus, if it turned out that a normatively appropriate theory of reasoning included inferences that could be derived only via Conditional Transformation, then PSYCOP could be said to be systematically unable to reason in a normatively correct manner.
But convicting PSYCOP of this type of error would require evidence of the missing inferences.

Aren't You Assuming That People Should Interpret Connectives in a Truth-Functional Manner and Then Accusing Them of Errors When They Don't Do So?

Another way in which we could be making people out to be too illogical is by holding them to an inappropriate standard of logic. Cohen (1986) has raised this possibility in discussing the evidence regarding the rule of OR Introduction. We found in chapter 5 that subjects tend not to affirm arguments whose proofs rely on this rule, according to our proof system. And
beginning logic students tend to find this rule counterintuitive, at least in the way it is usually presented. But perhaps people's reluctance to use this rule merely indicates that they do not understand the English "or" as the OR of classical logic: "Rips is assuming that the layman always uses and understands the elementary connectives of natural language in a purely truth-functional sense, so that a proposition of the form 'p or q' is true if and only if p and q are not both false. And, if we are not prepared to take this notoriously controversial assumption for granted, we can just as well construe Rips's result as revealing individual differences in how people use and understand such connectives. . . ." (Cohen 1986, p. 151) Cohen's aim in suggesting this possibility is to ward off what he sees as a threat to human rationality. He construes the theory's account of arguments with OR as attributing to people a logical defect - as exemplifying the sort of built-in error that I have just discussed: "Very many papers have appeared which claim to have established that logically or statistically untutored adults are inclined to employ one or another of a variety of fallacious procedures in their reasonings or not to employ the correct procedures. The normal person's intuitions on these issues, it is claimed, tend to be persistently irrational." (Cohen 1986, p. 150) We have already seen in connection with Rescher's comments that this is a misunderstanding of the present theory. Nothing about the model implies that subjects are somehow "persistently irrational" in connection with OR Introduction or in connection with other rules that received low availability values in the studies discussed in part II. We interpreted the results on OR Introduction, in particular, as probably due to sensitivity to pragmatic factors such as the pointlessness of asserting P OR Q when the stronger statement (P) is known to be true.
Since Cohen himself uses the same kind of conversational factors to explain other findings in the literature on deductive reasoning (1986, p. 152), he presumably doesn't view them as irrational.⁴ Although Cohen aims his comments about irrationality at the wrong target, the point about truth-functional connectives is worth considering. Our account would simply be wrong if people reasoned solely with non-truth-functional connectives and PSYCOP solely with truth-functional ones. We need to be a bit careful in examining this issue, however, since it is not necessarily true that the model implies that the layman always uses and understands the elementary connectives of natural language in a purely truth-functional sense. Truth-functionality means that the truth of any sentence that PSYCOP handles is completely determined by the truth of
its constituent sentences. But since PSYCOP is a proof system and doesn't determine the truth of its working-memory sentences directly, only in a derivative sense could it qualify as truth-functional. One possibility along these lines would be to say that if * is a logical connective then the system should be able to prove S * T from the premises consisting of its component sentences (S, T) or their negations (NOT S, NOT T). Standard deduction systems for classical sentence logic are truth-functional in this sense (e.g., they can prove IF S THEN T from NOT S or T). But the results of chapter 4 show that the system is incomplete with respect to classical logic, and this means that the system will sometimes fail to recognize certain truth-functional relationships. Thus, if using sentence connectives in a truth-functional sense entails being able to derive compound sentences from their components, then PSYCOP isn't a purely truth-functional system.⁵ Whether or not the system is fully truth-functional, however, the rules of part II provide for only a single version of the connectives. It is easy to agree that natural language includes a wide variety of logical operators, not currently included in PSYCOP, that are worthy of psychological investigation. For example, there is little doubt that natural language contains ifs that differ from the IF of our system - perhaps along the lines of the intensional IF of chapter 2. We also discussed giving PSYCOP (non-truth-functional) rules for OBLIGATORY and PERMISSIBLE to explain recent results on the selection task. PSYCOP in its current form isn't supposed to be an exhaustive theory of the logical operators that humans can comprehend; the model is psychologically incomplete with respect to this wider field of operators. From this perspective, the possibility that some subjects understood the connectives in our problems in an alternative way is one of the many factors that, like (a)-(j) above, can affect their responses.
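The derivative sense of truth-functionality described above can be made concrete by listing, for each classical connective, the combinations of S, T, and their negations from which a truth-functional proof system would be expected to derive the compound. The following is a minimal sketch using classical truth tables only; the names `CONNECTIVES`, `literal`, and `premise_sets_for` are introduced here for illustration and do not correspond to anything in PSYCOP.

```python
from itertools import product

# Classical truth tables for three binary connectives.
CONNECTIVES = {
    "AND": lambda s, t: s and t,
    "OR": lambda s, t: s or t,
    "IF": lambda s, t: (not s) or t,  # IF S THEN T
}


def literal(name, value):
    """Render a sentence letter or its negation."""
    return name if value else f"NOT {name}"


def premise_sets_for(connective):
    """Pairs of literals over S and T that make the compound true, i.e. the
    premise sets from which a system that is truth-functional in the
    derivative sense should be able to prove the compound."""
    table = CONNECTIVES[connective]
    return [
        (literal("S", s), literal("T", t))
        for s, t in product([True, False], repeat=2)
        if table(s, t)
    ]


for name in CONNECTIVES:
    print(name, premise_sets_for(name))
```

For IF, the list contains every pair that includes NOT S or T, matching the text's example that standard systems can prove IF S THEN T from NOT S or from T. A system that fails to derive some compound from one of these premise sets is, to that extent, not truth-functional in this sense.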
What the theory does claim is that its rules for AND, OR, NOT, and IF are psychologically real, and that they are among the basic deductive processes in cognition. Thus, what would cast doubt on the theory is, not the presence of alternative connectives, but the absence of the specified ones.

Summary

One way to think about the theory developed here is as a merger of two main ideas about the human nature of deductive reasoning. One of these ideas is that reasoning involves the ability to make suppositions or assumptions - that is, to entertain propositions temporarily in order to trace their consequences. This idea comes from formal natural-deduction systems in logic, but it is clearly psychological at root. Nothing about deduction per se forces suppositions on us, since there are perfectly good deduction systems that do without them; we could start with a set of axioms and derive all the same theorems. But human styles of reasoning aren't like that, as both Gentzen and Jaskowski observed. We tend to assume propositions for the sake of the argument in order to focus our efforts in exploring what follows. The second of the key ideas is that reasoning includes subgoals. People are able to adopt on a temporary basis the desire to prove some proposition in order to achieve a further conclusion. This idea seems more mundane than the one about suppositions, since we are accustomed to the use of subgoals in cognitive and computer science: Even the simplest AI programs use subgoaling to reduce the amount of search. But, again, deduction itself doesn't require subgoals. Even natural-deduction systems, as logic textbooks formulate them, don't have subgoals. Instead instructors in elementary logic have to provide informal hints about strategies for applying the rules, generally in the form of advice about working backward from the conclusion to more easily achievable lemmas. If these subordinate conclusions don't pan out, we can abandon them for others that may prove more successful. Our theory gives this purposefulness a status equal to that of suppositions. Although I have borrowed suppositions from logic and subgoals from computer science, these concepts are closely interrelated. Suppositions are roughly like provisional beliefs, and subgoals roughly like provisional desires.
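The interplay of the two ideas just described can be sketched as a toy backward-chaining prover. This is a hedged illustration only, not PSYCOP's rule set: sentences are encoded as strings (atoms) or tuples such as `("if", "p", "q")` and `("and", "p", "q")`, an encoding assumed here for convenience, and only three rule types are included. Proving a conditional makes a supposition (the antecedent is temporarily added to the facts), and every recursive call to `prove` is a subgoal.

```python
def close_under_and(facts):
    """Forward AND Elimination: add the conjuncts of every conjunction."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for f in list(facts):
            if isinstance(f, tuple) and f[0] == "and":
                for part in f[1:]:
                    if part not in facts:
                        facts.add(part)
                        changed = True
    return frozenset(facts)


def prove(goal, facts, pending=frozenset()):
    """Try to derive `goal` from `facts` by working backward from the goal."""
    facts = close_under_and(facts)
    if goal in facts:
        return True
    if (goal, facts) in pending:  # already pursuing this subgoal: avoid loops
        return False
    pending = pending | {(goal, facts)}

    # Backward AND Introduction: prove each conjunct as a separate subgoal.
    if isinstance(goal, tuple) and goal[0] == "and":
        if all(prove(part, facts, pending) for part in goal[1:]):
            return True

    # Backward IF Introduction: SUPPOSE the antecedent, then adopt the
    # consequent as a subgoal within that supposition.
    if isinstance(goal, tuple) and goal[0] == "if":
        _, antecedent, consequent = goal
        if prove(consequent, facts | {antecedent}, pending):
            return True

    # Backward IF Elimination: to get Q from IF P THEN Q, set P as a subgoal.
    for f in facts:
        if isinstance(f, tuple) and f[0] == "if" and f[2] == goal:
            if prove(f[1], facts, pending):
                return True
    return False


premises = {("if", "p", "q"), ("if", "q", "r")}
print(prove(("if", "p", "r"), premises))  # supposes p, then chains back from r
print(prove("r", premises))               # no supposition available for p
```

With these premises the first call succeeds and the second fails, since r can be reached only under the supposition p. The `pending` set is a bookkeeping device of this sketch for abandoning subgoals that lead in circles.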
In something like the way beliefs and desiresabout external states guide external actions, provisional beliefsand desiresguide internal action in reasoning. According to the current theory, what gives human reasoning its characteristic tempo is the way these suppositions and subgoals coordinate: Can we show that somesentenceC follows from another sentence P? Well, C would follow if we can show that C' follows from P'; ' so let s assume P' for now and try to find out whether C' holds; and so on. From one perspective, this sequencehelps to simplify the problem at hand by lemma-izing it into manageableparts. But reasoning of this type also presupposessome fairly sophisticated cognitive apparatus for



keeping track of the nesting of suppositions within suppositions and sub-subgoals en route to subgoals. In the present theory, most of the responsibility for handling suppositions and subgoals devolves on the deduction rules. The rules in the model are a conservative choice of principles that seem psychologically (though not always logically) primitive. Research in this area should consider expanding this set of principles, and perhaps emending the current set, to achieve better coverage of the logical resources of natural language (see, e.g., Dowty 1993). But the present rules seem to provide a good starting point, both because they account for much of the data from the experiments reviewed here and because they are capable of supporting other cognitive tasks. A main innovation in the current model, with respect to previous deduction theories in psychology, is that the rules also handle variables and names in a general way. This means that the system achieves most of the power of predicate logic, and without having to include additional rules for quantifiers. This accords with the mathematical practice of omitting explicit quantifiers in equations and instead using distinct variables and constants, for ease of manipulation. Although this capability comes at the price of additional complexity in the rules, the payoff is a single model that applies to all, or nearly all, of the tasks that psychologists have studied under the heading of deductive reasoning. It seems surprising that, despite fairly wide agreement about the importance of variables and instantiation, no previous psychological theories of deduction have addressed this problem in a global way and no previous experiments have looked at the details of how people match variables and names. The ability to deal with variables and names greatly expands the utility of deductive reasoning.
Instantiating and generalizing variables allow the model to swap information in memory and thereby carry out many higher-level cognitive tasks within the same supposition/subgoal framework. Binding variables gives the model the power of a general symbol system, and this raises the possibility that deduction might serve as the basis for other higher cognitive tasks. This proposal will seem wildly improbable to many cognitive scientists, who are used to thinking of deduction as a special-purpose, error-prone process, but I hope that the examples in chapter 8 will help soften them up. A couple of examples, of course, are hardly sufficient to establish the proposal as adequate for the full range of higher cognitive tasks; we need a great deal of further experience with the system



in order to understand its strengths and weaknesses. However, the idea that cognition has deductive underpinnings shouldn't be any harder to swallow than the idea of a production system as a cognitive theory. In fact, we could get quite a close approximation to a production system by modifying the rule we called Conjunctive Modus Ponens and applying it to conditional sentences (i.e., "production rules") in a special partition of long-term memory. The examples in chapter 8 also make it clear that deduction doesn't replace other forms of reasoning in this theory but supports these forms by supplying mechanisms for keeping track of assumptions, exploring alternative cases, conditional branching, binding, and other necessities. It may help those who boggle at the deduction-system idea to recognize that these processes are essentially deductive ones. This perspective on deduction does not resolve the debate about the scope of human rationality, but it may help us to bring some of the complexities of this issue into focus. If the outlines of the model are correct, then reasoning takes place on at least two mental levels, perhaps each with its own distinctive forms of rationality. We would expect reasoning at the architectural level to be relatively free of interference from response bias or from conversational suggestion, and equally immune to facilitation in learning. Reasoning at the strategic level may have the opposite susceptibilities: Explicit training could enhance it, but disturbances from neighboring processes could have it at their mercy. The present theory provides an account of how errors can occur during reasoning, based on the length of a chain of reasoning and on the strength of its individual links, and it acknowledges further sources of errors. But at the same time, the theory doesn't mandate reasoning errors through unsound rules.
In this way the theory tries to steer between the extremes of asserting that people can't reason correctly and asserting that they can't but reason correctly. It attempts to portray human reasoning, instead, in both its intricacies and its frailties.
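The remark above about approximating a production system can be made concrete with a small sketch. It is only an illustration under stated assumptions: the rule format (a set of antecedents paired with one consequent), the function name forward_chain, and the example sentences are all hypothetical, and the loop is a generic forward-chaining step in the spirit of Conjunctive Modus Ponens rather than the model's actual rule.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical "production rule": IF every antecedent holds THEN add the consequent.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(rules: List[Rule], facts: Set[str]) -> Set[str]:
    """Repeat a Conjunctive-Modus-Ponens-style step until no new sentence is added."""
    memory = set(facts)  # working memory starts with the given assertions
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            # The rule fires only when all of its antecedents are already in memory.
            if antecedents <= memory and consequent not in memory:
                memory.add(consequent)
                changed = True
    return memory


# Illustrative rules, standing in for conditionals kept in a separate memory partition.
rules: List[Rule] = [
    (frozenset({"Calvin deposits 50 cents"}), "Calvin gets a coke"),
    (frozenset({"Calvin gets a coke", "Calvin is thirsty"}), "Calvin drinks the coke"),
]
result = forward_chain(rules, {"Calvin deposits 50 cents", "Calvin is thirsty"})
```

The second rule fires only after the first has added its consequent, which is the kind of chaining a production system relies on.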


Preface
1. Modus ponens means "affirming mode," since we are affirming the IF part of the original sentence in order to affirm the THEN part. Modus ponens contrasts with modus tollens ("denying mode"), which is the argument from IF so-and-so THEN such-and-such and NOT such-and-such to NOT so-and-so. These terms come from traditional logic. Of course, even earthlings will forgo a modus ponens inference under special circumstances, for example, when other information shows that the argument is elliptical. (See chapter 3, note 4, below.)
2. Cross-cultural research might throw light on these issues, but conclusions in this area have been inconsistent, at least on the surface. For example, on the basis of his research in rural Russia, Luria (1971, p. 271) suggested that "these facts indicate that the operation of reaching a logical conclusion from the syllogism is certainly not a universal character as one might have thought." However, Hamill (1990, p. 60) suggests that his own, more recent "syllogism tests were strikingly consistent across language boundaries. Consultants considered the same kinds of argument valid and invalid, and in no case did they draw conclusions that were invalid according to the rules of textbook logic." The safest conclusion might be that any differences in performance between cultures on tests of deductive reasoning are probably attributable to translation difficulties (Au 1983; Liu 1985) or to relative familiarity with Western test-taking conventions (e.g., restricting consideration to explicitly stated premises; see Scribner 1977).
3. In fact, the comparison between deduction and programming languages is more than just an analogy. There are now general-purpose languages that use deduction to carry out their activities. The best-known language of this sort is PROLOG (short for PROgramming in LOGic), which has received publicity because of its connection with the Japanese Fifth Generation computer project. For more on PROLOG, see Clocksin and Mellish 1981 and Sterling and Shapiro 1986. Kowalski 1979 contains a more general discussion of the idea of deduction as programming.

Chapter 1
1. For example, within the current Government-Binding framework, the logical form of a sentence is a level of grammatical representation, deriving transformationally from S-structure (surface structure) and revealing explicit scope relationships among quantifiers such as all, every, and some (see chapter 2 for the notion of scope). Rules that yield the semantic interpretation of the sentence apply directly to logical form (May 1985; Chierchia and McConnell-Ginet 1990). In Lexical-Functional grammar, translation rules map the f-structure (functional structure) of a sentence to a semantic structure that represents the scope of quantifiers; semantic structure, in turn, maps onto a formula of a specific logical system, intensional logic, that directly indicates scope for adverbials (e.g., necessarily), tense, and negation (Halvorsen 1983). This is also somewhat similar to Montague's (1973) approach, in which categories of a syntactic description of English are translated into those of an intensional logic. A distinct strategy is to try to downplay the notion of logical form as a separate grammatical level and to represent scope ambiguities in other ways. This is the approach offered in Situation Semantics and related proposals (see Barwise 1987a, Cooper 1983, and Fodor's (1987) critique).
2. A typical passage: "'Now look here, Ulysses,' shouted the sheriff, 'you're just trying to complicate my deduction! Come on, let's play checkers!'" (McCloskey 1943).
3. This use of "deductive reasoning" does not directly tie it to any formal system of logic. For example, we can count the mental step from Mike is a bachelor to Mike is unmarried as an instance of deductive reasoning, even though the corresponding argument is not sanctioned in any standard logic.



4. A possible qualification is that there might be some level of comprehension (perhaps the level at which individual sentences are initially understood) that is inference free. Or, perhaps, any inferences that affect this level are highly specialized and distinct from more general inference processes operating on a higher cognitive level (Fodor 1983). The centrality claim, however, does not contradict these possibilities. What examples like (6) show is that deduction is part of the global process of extracting the speaker's meaning from text or spoken discourse. This notion of (global) comprehension is common to ordinary and scientific usage; hence, the evidence that deduction is central to comprehension isn't due to our taking "comprehension" in some unnaturally broad sense. For a defense of the view that deduction is part of every instance of verbal communication, see Sperber and Wilson 1986; for a discussion of different notions of comprehension, see Rips 1992.
5. Examples of syllogism and number-series problems appear in the next paragraph. Analogies include four-term verbal items such as mayor : city :: captain : ?. A sample arithmetic word problem is "When duty on a certain commodity decreases 30%, its consumption increases 60%. By what per cent is the revenue decreased or increased?" For verbal classification problems, the test takers might be asked to sort a set of words into two categories that are exemplified by separate word lists. As an example, if category 1 includes lacerate, torture, bite, and pinch and category 2 includes suffer, ache, twinge, and writhe, classify the following items as belonging to category 1 or 2: wince, crucify, crush, smart, moan, and cut. These simple examples don't do justice to Thurstone's brilliance as a test designer. See Thurstone 1937 for a list of his actual test items.
6. Aristotle's original use of "syllogism" in the Prior Analytics was apparently not limited to two-premise arguments (Corcoran 1974).
In current writing, however, the two-premise format seems universal.
7. Among students of ancient logic, there is a debate on what Aristotle took to be the nature of a syllogism: whether he construed it as an argument, a proof, or a conditional. See Lear 1980 and Smiley 1973 for accounts of this controversy. There is also a disagreement about the total number of syllogisms and the number of deductively correct ones (Adams 1984), partly because Aristotle seems not to have considered as true syllogisms arguments with terms arranged as in (12). Neither of these historical controversies, however, has much impact on the claims psychologists have made about syllogistic reasoning, and we can safely ignore these issues in what follows.
8. See chapter 10 for an assessment of Johnson-Laird's (1983) claims about the extendability of his model of syllogisms.

Chapter 2
1. Pure axiomatic systems do not automatically allow the premises of an argument to be introduced as lines of the proof. Instead, an argument is deducible if the conditional sentence IF P1 AND P2 AND ... AND Pk THEN c is deducible from the axioms and modus ponens alone, where the Pi's are the premises of the argument and c is its conclusion. The more direct method of table 2.2 is justified by the Deduction Theorem (Mendelson 1964, p. 32), which establishes that if the sentence IF P1 AND P2 AND ... AND Pk THEN c is deducible from the axioms then c is deducible from the axioms together with P1, P2, ..., and Pk. This use of the Deduction Theorem is already a major step in the direction of natural deduction.
2. The second clause in this rule is logically redundant, since any argument that is deducible in the system is also deducible in a modified system in which this clause is omitted (see Jaskowski 1934 for a proof). However, adding the second clause tends to make derivations simpler and seems a reasonable part of the informal reductio strategy. For example, the proof in (6) below requires two additional lines if the second clause is not available. By contrast, a system containing the second clause without the first is much weaker (i.e., is able to prove fewer theorems), as Jaskowski also showed.



3. The relationship between the introduction and elimination rules for a connective is a prominent part of the literature on natural deduction in the philosophy of logic. This comes from attempts to use the structural characteristics of the proofs themselves as ways of defining the deductive correctness of arguments or defining the meaning of logical operators. (This project is an especially pressing one, of course, for those who regard with suspicion the semantic treatments of meaning and validity.) The issue can best be seen in terms of Prior's (1960) demonstration that it is possible to state introduction and elimination rules that define a bizarre connective (say, "•") that would allow any sentence to be deduced from any other. In terms of table 2.3, the rules for • would be as follows. • Introduction: (a) If a sentence P holds in a given domain, (b) then the sentence P • Q can be added to that domain, where Q is an arbitrary sentence (cf. OR Introduction in table 2.3). • Elimination: (a) If a sentence of the form P • Q holds in a given domain, (b) then the sentence Q can be added to that domain (cf. AND Elimination). Then from Tin is elastic it follows that Tin is elastic • Fish float (by • Introduction), and it follows in turn from the latter sentence that Fish float (by • Elimination). The upshot is that if the natural-deduction format is to define connectives or correct arguments, then some constraints have to be placed on potential rules (Belnap 1962). One possibility is to enforce "harmony" between the introduction and elimination rules so that the output of the latter doesn't go beyond the input to the former. See Dummett 1975 and Prawitz 1974 for attempts along these lines; see Sundholm 1986 for a review. In the psychological literature, Osherson (1977, 1978) has claimed that logical connectives are natural for humans only if they can be formulated within a system in which their introduction and elimination rules are subject to certain constraints (e.g., that the sentences from the proof mentioned in the conditions of an introduction rule be no more complex than the sentence produced by the rule). The approach to constraints on rules in the present book depends on global features of the system I will propose: I seek constraints that conform to general properties of human reasoning, such as avoiding infinitely long proofs and generating all intuitively correct inferences. The nature of these constraints will become apparent in the statement of the deduction rules in chapters 4 and 6.
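The way Prior's connective trivializes deduction can be traced mechanically. The following is a minimal sketch, assuming a domain is just a set of sentences and a compound sentence is a tuple; the label "TONK" (Prior's own name for the connective) and the function names are illustrative choices, not part of the rule system in table 2.3.

```python
from typing import Hashable, Set


def tonk_intro(domain: Set[Hashable], p: Hashable, q: Hashable) -> None:
    """Introduction rule: if P holds in the domain, add P-tonk-Q for an arbitrary Q."""
    if p not in domain:
        raise ValueError("premise does not hold in this domain")
    domain.add(("TONK", p, q))


def tonk_elim(domain: Set[Hashable], compound: Hashable) -> None:
    """Elimination rule: if P-tonk-Q holds in the domain, add Q."""
    if compound not in domain or not (isinstance(compound, tuple) and compound[0] == "TONK"):
        raise ValueError("no such tonk sentence in this domain")
    domain.add(compound[2])


def derive_anything(premise: str, target: str) -> Set[Hashable]:
    """Two rule applications take us from any premise to any target sentence."""
    domain: Set[Hashable] = {premise}
    tonk_intro(domain, premise, target)
    tonk_elim(domain, ("TONK", premise, target))
    return domain


domain = derive_anything("Tin is elastic", "Fish float")
```

Each rule looks harmless alone; the trouble is that the elimination rule extracts more than the introduction rule required, which is what harmony constraints are meant to exclude.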

4. Gazdar (1979) follows Grice in defense of the OR of table 2.3. However, he abandons the IF of table 2.3 in favor of the conditional analysis of Stalnaker (1968). Braine (1978) advocates representing if in terms of the deducibility relation that holds between the premises and the conclusion of an argument. In his system, though, both of these reduce to a relation similar to that of the IF of the table. This means that any argument with false premises and a true conclusion is deductively correct. For example,

Spaghetti grows on trees.
Cows are mammals.
is deductively correct in this system, a result that seems much more counterintuitive than (11). Braine and O'Brien (1991) address the problem of (11) in a different way by placing further restrictions on IF Introduction. On this theory, a supposition for IF Introduction can be made only if that assumption is consistent with all earlier suppositions that hold in its domain. So, for example, the derivation in (12) will be blocked because the supposition in (12b) contradicts that in (12a). But although this analysis makes (11) nondeducible, the somewhat similar argument in (13) still goes through.
5. Another puzzle connected with the IF rules has to do with the tendency of some people to accept as correct an inference from sentences of the form If P then Q to If Q then P (or the



logically equivalent If not P then not Q). For evidence, see the experiments on conditionals cited at the end of the first section of this chapter; see also Fillenbaum 1975, 1977. Such an inference is indeed correct when the original sentence is interpreted as a biconditional (that is, as IF AND ONLY IF P THEN Q), but not when the sentence is interpreted as IF P THEN Q according to the rules of table 2.3. A related problem is the inference from All x are y to All y are x, which many investigators have identified as a source of error in syllogistic reasoning. The logical form of All x are y contains a conditional on the standard theory (as we will see momentarily), so these inferences may well have the same underlying source. There is also a pragmatic theory that attempts to explain these tendencies. Geis and Zwicky (1971) claim that many conditionals "invite" such an inference, a phenomenon that they call "conditional perfection." For example, If you lean too far out that window, you'll fall certainly invites a hearer to suppose that if you don't lean too far then you will not fall. But, as Boer and Lycan (1973) point out, such inferences are warranted by the background information that we have about the subject matter (e.g., our knowledge about windows and falling) and may have little to do with conditionals per se. According to this account, conditional perfection is an inductive inference on the same footing as the inference from Fred is a secretary to Fred can type. Still, it can be said that subjects sometimes "perfect" a conditional in the absence of any supporting background information. Line e of table 1.1 contains an excellent example.
6. Some philosophers and linguists have also used quantifiers like most to motivate a change in representation from attaching quantifiers to single variables (in the way that FOR ALL attaches to x in (FOR ALL x) P(x)) to attaching them to phrases representing sets.
As Barwise and Cooper (1981), McCawley (1981), and Wiggins (1980) point out, a sentence like Most politicians whine can't be stated in terms of (MOST x)(Politician(x) • Whine(x)), where • is a sentential connective. This is because the original sentence doesn't mean that something is true of most x, but instead that something (i.e., whining) is true of most politicians. This suggests an alternative representation that contains a restricted or sortal quantifier (for example, (MOST: Politician x) Whine(x)), where x now ranges only over politicians. This also gives a somewhat better fit to the NP + VP shape of the English surface structure. However, see Wiggins (1980) for an argument against restricted quantifiers.
7. Harman's thesis that logic is not specially relevant to reasoning is based on the empirical assumption that people don't have the concept of logical implication and so can't recognize when one proposition logically implies another. He believes that people lack the concept of logical implication, in the sense that they don't distinguish purely logical implications from nonlogical ones. For example, they don't consistently divide implications like the following into logical and nonlogical types: (a) P or Q and not P imply Q; (b) A < B and B < C imply A < C; (c) X is Y's brother implies X is male; (d) X plays defensive tackle for the Philadelphia Eagles implies X weighs more than 150 pounds (Harman 1986, p. 17). However, being able to make a clear distinction between instances of one concept and instances of a complementary one is not a necessary condition for having the concept. People's inability to distinguish intermediate shades consistently as red or nonred does not entail that they have no concept of red.
Perhaps Harman means that people do not consistently identify even clear cases such as (a) as logical implications, but the truth of this claim is far from evident. As Harman notes in an appendix, it may be impossible to decide whether or not logic is specially relevant to reasoning without independent criteria of logic and of immediate psychological implication.
8. This should not be taken to imply that such a system necessarily represents IF Elimination in the sense of containing a description of the rule in mentalese (in the way that table 2.3 contains a description of IF Elimination in English). One could also have a rule of IF Elimination in virtue of having a mental mechanism that produces tokens of q as output, given tokens of p and IF p THEN q as input. In other words, rules can be hardwired in a system, as well as represented symbolically and interpreted, as Fodor (1985), Smith et al. (1992), and Stabler (1983) have pointed out. Promoted rules seem more likely to be hardwired, whereas demoted rules seem more likely to be interpreted.



Chapter 3
1. The Traveling Salesman problem supposes there is a finite set of cities with a known distance between each pair of them. The problem is to find the shortest tour that passes through each city once and then returns to the first. Cook's theorem demonstrated the NP-completeness of the problem of determining the satisfiability of an arbitrary sentence P in sentential logic, where P is said to be "satisfiable" if it is true in some possible state of affairs (see chapter 6). However, this theorem yields the NP-completeness of validity testing as a corollary. For if there were a decision procedure for validity testing that was not NP-complete, it could be used to establish satisfiability in a way that was also not NP-complete. This is because P is satisfiable if and only if NOT P is not valid. Hence, to test P for satisfiability we would merely need to test NOT P for validity. Since satisfiability testing is NP-complete, so is validity testing.
2. The literature in AI sometimes refers to theorem proving in the forward direction as "bottom-up" and theorem proving in the backward direction as "top-down." However, this terminology is somewhat confusing in the present context. Because arguments are conventionally written with the premises at the top and the conclusion at the bottom, it is odd to speak of the premise-to-conclusion direction as bottom-up and the conclusion-to-premise direction as top-down. For this reason, I will stick with the forward/backward terminology throughout this book.
3. The usual convention is to employ functions rather than subscripted temporary names. That is, instead of the name a_x in (4b), we would have f(x). These are merely notational variants, however, and the use of subscripted temporary names is more in keeping with the discussion in chapter 2.
4. Of course, further information about the conditional's antecedent or consequent may cause you to withhold belief in the conclusion of a Modus ponens argument. Additional facts may convince you that the conditional is elliptical, specifying only some of the conditions that are sufficient for its consequent (Byrne 1989), or they may lead you to doubt the conditional outright (Politzer and Braine 1991). The present point, however, is that if P and IF P THEN Q occur to us simultaneously (or in close succession) then Q occurs to us too. (See the discussion of promoted and demoted rules in chapter 2.)
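The equivalence used in note 1 of this chapter (P is satisfiable if and only if NOT P is not valid) can be checked by brute force on small sentences. The sketch below is an assumption-laden illustration: sentences are represented as Python functions over truth assignments, all names are hypothetical, and the exhaustive truth-table search is exponential in the number of atomic sentences, so it says nothing about efficient decision procedures.

```python
from itertools import product
from typing import Callable, Dict, Iterator, Sequence

Assignment = Dict[str, bool]
Sentence = Callable[[Assignment], bool]


def assignments(atoms: Sequence[str]) -> Iterator[Assignment]:
    """Enumerate every possible state of affairs over the given atomic sentences."""
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def is_valid(sentence: Sentence, atoms: Sequence[str]) -> bool:
    """Valid: true in every state of affairs."""
    return all(sentence(a) for a in assignments(atoms))


def is_satisfiable(sentence: Sentence, atoms: Sequence[str]) -> bool:
    """Satisfiable: true in at least one state of affairs."""
    return any(sentence(a) for a in assignments(atoms))


def satisfiable_via_validity(sentence: Sentence, atoms: Sequence[str]) -> bool:
    """Test P for satisfiability by testing NOT P for validity."""
    return not is_valid(lambda a: not sentence(a), atoms)


atoms = ["p", "q"]
contingent: Sentence = lambda a: a["p"] or a["q"]
contradiction: Sentence = lambda a: a["p"] and not a["p"]
```

For both sample sentences the direct test and the test routed through validity give the same answer.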

Chapter 4
1. An earlier version of this model, called ANDS (A Natural Deduction System), is described in Rips 1983, 1984. The new system deserves a new name, since it contains several improvements in its control structure and its memory representation.
2. Rules like IF Introduction deduce sentences on the basis of entire subdomains, as we have seen. For example, the conclusion of the proof in figure 4.1 depends on a derivation within a subdomain beginning with the supposition Betty is in Little Rock (see proof (2) below). In such cases, the deduction links run from the final sentence of the subdomains used by the rule to the sentence that the rule produces. For example, Backward IF Introduction produces the deduction link from Ellen is in Hammond AND Sandra is in Memphis to IF Betty is in Little Rock THEN (Ellen is in Hammond AND Sandra is in Memphis) in figure 4.1. The fact that the first of these sentences is part of a subdomain is indicated by dashed arrows.
3. This doesn't preclude the possibility that a person might forget a subgoal while retaining some of the assertions within the same proof. In that context, for example, there is nothing that would prevent forgetting of the subgoal Ellen is in Hammond AND Sandra is in Memphis? during the modus ponens step.
4. The Modus ponens rule in LT is discussed in chapter 3 above.



5. Another way to handle arguments (4)-(6) is to allow Double Negation Elimination to operate in a forward direction inside the conditional premise. A rule of this sort would immediately simplify the first premise of (4), for example, to IF Calvin deposits 50 cents THEN Calvin gets a coke, and Forward IF Elimination would apply directly to this new sentence. Although this seems a plausible hypothesis about (4)-(6), it doesn't solve the general problem, for we can find arguments analogous to these in which the antecedent of a conditional can't be deduced from the remaining premises by forward rules alone. The difficulty inherent in (4)-(6) is also connected with another. Since both the antecedent and the consequent of a conditional can be arbitrarily complex, it is possible to create valid arguments that IF Elimination can't handle, even with the new backward rules. For example, PSYCOP can't prove the following argument, given only the rules of this chapter:
IF (P AND Q) THEN (IF R THEN S)
P
Q
R
-----
S
The point is that, if the conditional's consequent is sufficiently complex, it will keep PSYCOP from noticing that the consequent (plus other assertions) entails the conclusion and so prevents the program from attempting to prove the antecedent. The argument above is only the simplest example. A solution to this difficulty would be a generalized backward IF Elimination rule that first determines (in a subdomain) whether the consequent entails the current subgoal, setting up the antecedent as a new subgoal if it does. This rule would be an interesting mix between the IF Elimination and IF Introduction strategies, but it is not clear to me that the added complexity of such a rule is warranted by the facts about human inference.
6. Leon Gross suggested many improvements to the halting proof. Proof HP-III in its current form in the appendix is due to him.
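That the argument in note 5 is classically valid, even though the rules of this chapter do not prove it, can be confirmed by an exhaustive truth-table check. This is a minimal sketch, assuming IF is read as the material conditional; the helper names are hypothetical, and the check is a semantic test, not a model of how PSYCOP searches for proofs.

```python
from itertools import product
from typing import Callable, List, Tuple

State = Tuple[bool, bool, bool, bool]  # truth values for P, Q, R, S


def implies(a: bool, b: bool) -> bool:
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises: List[Callable[[State], bool]],
             conclusion: Callable[[State], bool]) -> bool:
    """Valid if the conclusion is true in every state where all premises are true."""
    for state in product([True, False], repeat=4):
        if all(prem(state) for prem in premises) and not conclusion(state):
            return False
    return True


nested_conditional = lambda s: implies(s[0] and s[1], implies(s[2], s[3]))
premise_p = lambda s: s[0]
premise_q = lambda s: s[1]
premise_r = lambda s: s[2]
conclusion_s = lambda s: s[3]

full_argument_valid = is_valid(
    [nested_conditional, premise_p, premise_q, premise_r], conclusion_s)
# Dropping the premise R leaves S undetermined, so the weakened argument is invalid.
without_r_valid = is_valid([nested_conditional, premise_p, premise_q], conclusion_s)
```

The contrast between the two checks shows that every premise is doing work, which is why a prover must establish the antecedent P AND Q before the embedded conditional becomes usable.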

Chapter 5
1. For this reason, it is unwarranted to conclude from a low score on a given problem that subjects weren't engaged in reasoning about the problem. In this vein, Braine et al. (1984, p. 360) point to problem B in table 5.1 as one in which "the conclusion seems to follow transparently from the premises," and state that the subjects' score on this item (66.7% "necessarily true" responses) "suggests that the experiment often failed to engage the reasoning procedure of subjects." It is unclear what rationale Braine et al. are applying in judging that this argument "transparently" follows. The basic point, however, is that it is impossible to determine the extent to which subjects were engaged in reasoning from an absolute score without also knowing the response criterion the subjects were adopting. Fred Conrad and I (Rips and Conrad 1983) have replicated the experiment described here and found higher absolute scores (62% vs. 51% of subjects judged the classically valid arguments "necessarily true") but a very similar pattern of scores across problems.
2. Another method for estimating parameters is to include in the stimulus set an argument that turns on a single rule. The percentage of correct answers for that argument is then the estimate for that rule's availability (Braine et al. 1984; Osherson 1974, 1975, 1976). One difficulty with this procedure, for our theory, is that some of the rules, such as IF Introduction and NOT Introduction, can't be the sole rule used in the proof of an argument. Thus, the alternative method gives us no estimates for these items. Second, use of simple arguments to estimate parameters may sometimes spuriously inflate the fit of the theory. This is because the premise or the conclusion of a single-rule argument will often share the syntactic structure of the premise or conclusion of the argument that the investigator is trying to predict,



especially when the proof is short. Hence, any difficulties associated with comprehending the syntax will increase the correlation between the arguments' scores. See chapter 9 for further discussion.
3. This pattern of parameter values explains some observations about this experiment that Johnson-Laird, Byrne, and Schaeken (1992) have made. The low availabilities for OR Introduction and NOT Introduction mean that the model predicts a fairly low proportion of "necessarily true" responses on arguments that involve these rules and a fairly high proportion on arguments that don't. Johnson-Laird et al. also claim that the scores for the problems in table 5.1 depend on whether or not the arguments "maintain the semantic information of the premises." By the amount of "semantic information," Johnson-Laird et al. mean the percentage of possible states of affairs that the premises eliminate. A possible state of affairs is, in turn, determined by an assignment of truth or falsity to each of the atomic sentences that appear in the argument (Johnson-Laird 1983, p. 36; Johnson-Laird et al. 1992, p. 423). For example, consider the simple argument from the premise p AND q to the conclusion p. Since this argument contains just two atomic sentence types, the four possible states of affairs are one in which p is true and q is true, one in which p is true and q is false, one in which p is false and q is true, and one in which p is false and q is false. (Thus, the states of affairs are equivalent to the horizontal lines of a standard truth table for the argument.) In the sample argument, the premise rules out all but one of these states of affairs (the one in which p is true and q is true). However, the conclusion rules out only two states of affairs (the ones in which p is false).
Of course, in any valid argument the conclusion must be true in all states of affairs in which the premises are true (see chapter 2), and this means that the amount of semantic information conveyed by the premise must be greater than or equal to the amount conveyed by the conclusion. Johnson-Laird (1983) talks of an argument "maintaining semantic information" if the premises and conclusion have the same amount of semantic information, and "throwing away semantic information" if the conclusion contains less semantic information than the premises. Thus, the argument above throws away semantic information. According to Johnson-Laird et al. (1992, p. 428), "to throw away semantic information is to violate one of the fundamental principles of human deductive competence, and so we can predict that performance with these problems should be poorer." They then report a test comparing 16 arguments in table 5.1 that purportedly maintain semantic information with 16 arguments that don't. However, Johnson-Laird et al. apparently miscalculated the amount of semantic information in these arguments; according to the criterion of Johnson-Laird (1983), only three of the 32 arguments (C, O, and X) maintain semantic information. The percentage of "necessarily true" responses for these problems was 48.2%, nearly the same as the percentage for the entire problem set (50.6%). Instead, Johnson-Laird et al. seem to have decided which arguments maintained semantic information according to whether the conclusion contains an atomic sentence token that does not appear in the premises. (E.g., the conclusion of Argument A contains an r that does not appear in the premises.) Adding atomic sentences to the conclusion, however, is not the only way of reducing semantic information, as we have already seen with respect to the sample argument above. In short, contrary to Johnson-Laird et al., there is no evidence from these data that "throwing away semantic information" hurts subjects' performance.
What does seem to causedifficulty is the presenceof atomic sentencesin the conclusion that did not appear in the premisesand may thereforeseemirrelevant to those premises. This may be traced in turn to rules like OR Introduction (P; Therefore, P OR Q) that add sentencesin this way. 4. This reinforcesthe conclusion from chapter 3 that resolution theorem proving, tree proofs, and other methods that rely on reductio as the central part of their proof procedure are probably not faithful to the deductive strategiesof most human reasoners.
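The semantic-information measure described in note 3 can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the function names (`states`, `semantic_information`) and the use of Python lambdas to stand for sentences are hypothetical choices, not anything from Johnson-Laird's or PSYCOP's implementation. It counts the fraction of states of affairs that a sentence rules out, first for the sample argument (p AND q; therefore p) and then for OR Introduction (p; therefore p OR q).

```python
from itertools import product


def states(atoms):
    """Enumerate every assignment of truth values to the atomic sentences."""
    return [dict(zip(atoms, values))
            for values in product([True, False], repeat=len(atoms))]


def semantic_information(sentence, atoms):
    """Fraction of the possible states of affairs that the sentence rules out."""
    all_states = states(atoms)
    eliminated = [s for s in all_states if not sentence(s)]
    return len(eliminated) / len(all_states)


atoms = ["p", "q"]

# Sample argument from the note: p AND q; therefore p.
and_premise = semantic_information(lambda s: s["p"] and s["q"], atoms)
and_conclusion = semantic_information(lambda s: s["p"], atoms)

# OR Introduction: p; therefore p OR q.
or_premise = semantic_information(lambda s: s["p"], atoms)
or_conclusion = semantic_information(lambda s: s["p"] or s["q"], atoms)

# An argument "throws away" information when the conclusion rules out
# fewer states of affairs than the premise does.
and_throws_away = and_conclusion < and_premise
or_throws_away = or_conclusion < or_premise
```

On this count both arguments throw away semantic information, even though only OR Introduction adds a new atomic sentence to the conclusion, which is the distinction the note draws.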


Notes to pp. 160- 231

5. PSYCOP does not have any skill in producing or comprehending natural language. Its messages are limited to a few stock sentence frames that it fills with information relevant to the last processing cycle. The purpose is simply to give the user a signal about how it handled the input sentence. Similarly, its comprehension abilities are limited to the input propositions and to prompts ("suppose," "therefore," ". . . follows by Theorem 6," ". . . follows by means of OR Elimination," and so on).

6. Marcus (1982, experiment 2) also demonstrates that the embedding effect is not simply the result of the adverbial phrases ("Under that condition," "in that case") that introduce the embedded sentences in (7).

7. Of course, we still need to account for the second premise of (9a)-(9d), since it is unlikely that subjects have this sentence stored prior to the experiment. In the following chapter, however, we will see how this premise can be derived from general information (e.g., IF Above(x,y) THEN NOT Above(y,x)), which probably is part of subjects' knowledge base.

8. Further argument forms can be generated by substituting complex propositions for the antecedent and the consequent. For example, Evans (1977) explores the case in which the conditional can include a negative antecedent or a negative consequent.

9. This still leaves a puzzle as to why the 4-card is so frequently chosen in the selection task when converse interpretations seem to be relatively rare with similar materials in the conditional syllogism experiments. It is possible that the greater difficulty of the selection task encourages more converse interpretations. Alternatively, choice of the 4 card may reflect some more primitive strategy, such as matching the values named in the rule (Beattie and Baron 1988; Evans and Lynch 1973; Oaksford and Stenning 1992).

Chapter 6

1. A possible compromise is to distinguish between those conditional assertions to which the forward inference should apply and those to which it shouldn't. Hewitt (1969) incorporated this in his PLANNER in the form of what he termed "antecedent" and "consequent" theorems. It is possible to argue, however, that the appropriate direction for a conditional inference should be determined by the nature of the deduction environment rather than by intrinsic differences among conditionals. Moore (1982) suggests that the correct direction might be determined by the theorem prover itself as it reasons about its own inference abilities.

2. It may help to remember that the simplification we get from dealing with satisfaction rather than truth also occurs in CPL, though in slightly different ways. For instance, the usual semantics for CPL contains a clause like (b), specifying satisfaction of a conjunction in terms of satisfaction of the conjuncts. But, of course, it is not the case in CPL that (∃x)F(x) and (∃x)G(x) are true iff (∃x)(F(x) AND G(x)).
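The contrast in note 2 between satisfaction and truth can be checked on a tiny finite model. The following is a minimal sketch under assumed names: the two-element domain and the extensions chosen for F and G are hypothetical, picked only to give a counterexample. Relative to a fixed assignment of a value to x, a conjunction is satisfied exactly when both conjuncts are; but the two existential sentences can both be true while the existential conjunction is false.

```python
# A hypothetical model: two individuals, F true of one and G true of the other.
domain = ["a", "b"]
F = {"a"}
G = {"b"}


def satisfies_F(x):
    """Does the assignment of individual x to the variable satisfy F(x)?"""
    return x in F


def satisfies_G(x):
    """Does the assignment of individual x to the variable satisfy G(x)?"""
    return x in G


def satisfies_F_and_G(x):
    """Satisfaction of a conjunction is defined from satisfaction of its conjuncts."""
    return satisfies_F(x) and satisfies_G(x)


# Satisfaction clause: holds assignment by assignment.
clause_holds = all(
    satisfies_F_and_G(x) == (satisfies_F(x) and satisfies_G(x)) for x in domain
)

# Truth of the closed sentences: quantify over all assignments.
exists_F = any(satisfies_F(x) for x in domain)              # (∃x)F(x)
exists_G = any(satisfies_G(x) for x in domain)              # (∃x)G(x)
exists_F_and_G = any(satisfies_F_and_G(x) for x in domain)  # (∃x)(F(x) AND G(x))
```

In this model the two existential sentences are true and the existential conjunction is false, so the biconditional fails at the level of truth even though the satisfaction clause holds for every assignment.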

Chapter 7

1. To symbolize No F are G, we use NOT(F(x) AND G(x)) in preference to the logically equivalent IF F(x) THEN NOT(G(x)), since the former seems closer to English syntax and should therefore be a more natural representation for English speakers.

2. See Boolos 1984 and Johnson-Laird and Bara 1984b for a discussion of this point. Apuleius was apparently the first to advocate these subaltern entailments (see Horn 1989).

3. In Aristotelian logic, All and Some-not sentences are "contradictories," as are No and Some sentences. The negative relations between these pairs aren't immediate in the traditional
All-Some-No terminology or in the quantifier-free notation of the b sentences in (1)-(4). This is due, in the latter case, to the implicit scope of the variables, mentioned in the previous chapter. In the case of All, we have IF F(x) THEN G(x) in (1b), which is equivalent to (∀x)(IF F(x) THEN G(x)), and its negation is NOT((∀x)(IF F(x) THEN G(x))) = (∃x)NOT(IF F(x) THEN G(x)) = NOT(IF F(b) THEN G(b)) = F(b) AND NOT G(b). This last sentence is our representation of Some-not in (4b). The same type of expansion also suffices to show that No and Some are contradictories.

4. Dickstein (1978a) reports data from two samples of subjects, both consisting of Wellesley undergraduates. The percentages we cite are based on the second sample, which contained more responses per premise pair.

5. One useful feature of syllogisms is that their syntactic complexity is approximately the same across the different argument types. Thus, the response distributions in the appendix are not due to variations in number of premises, number of connectives, and so on. In chapter 5 I argued on statistical grounds that the parameter estimates were not due to surface complexity; the similarity of the present estimates to those of chapter 5 lends some further support to this conclusion.

6. The data cited here are from experiment 3 of Johnson-Laird and Bara (1984a), in which subjects had as much time as they needed to produce a response. Those authors also report a separate study in which the response had to be made under a 10-second deadline. Since these data are less appropriate for assessing specifically logical processing, we will not consider them here. The tables in Johnson-Laird and Bara 1984a include only categorical responses that contained the end terms of the syllogism, plus the "no conclusion" responses. Johnson-Laird and Bara (1984a, p. 51) describe the remaining items as "idiosyncratic conclusions, e.g. the inclusion of a middle term instead of an end term as in 'Some A are B'." The statistics below that refer to the percentage of reported responses are based on this set. On the other hand, Johnson-Laird and Bara do tabulate conclusions in which the order of terms is opposite that of traditional syllogisms. (E.g., given premises (All(G,H), All(F,G)), they count the conclusions All(H,F), Some(H,F), etc., as well as All(F,H), Some(F,H), and the like.) The statistics that follow also include such responses. This means, in particular, that the number of premise pairs that yield valid conclusions is different from that in table 7.1 or the appendix.

7. The premise pairs for which forward processing is sufficient are all ones that have valid conclusions when the implicatures are accepted; however, of the 16 pairs that require backward processing, only four have valid conclusions under the implicatures. This means that the difference just reported may be partly due to subjects' accepting the presuppositions of the premises and the conclusion. Implicatures, though, can't be responsible for the entire effect. If we confine our attention to those four pairs, we still find only 42.5% of subjects producing a valid conclusion.

8. One fact about syllogisms is that the conclusions produced by the forward rules are also ones that have the dominant quantifier of the premises (as table 7.1 reveals); so responses in class a above are a subset of those in class b. Moreover, several of the conclusions in class d are also in class b. This raises the issue of whether we can't describe the results in a simpler way: Perhaps subjects skip logical processing altogether, either giving a conclusion that has the dominant quantifier or responding that nothing follows. The latter description, however, doesn't do full justice to the results, since subjects are more likely to produce a conclusion with a dominant quantifier when that conclusion is deducible than when it is not. Fourteen premise pairs have dominant conclusions that are deducible, and subjects produced these conclusions on 62.8% of trials; of the remaining 50 syllogisms, subjects produced dominant conclusions on only 44.2%.

9. The only important shortfall in PSYCOP's ability to prove the arguments in these exercises concerned ones whose premises had embedded conditionals. This is connected with the difficulties in dealing with conditionals by Backward IF Elimination that were discussed in
previous chapters (see especially note 5 to chapter 4). For example, PSYCOP cannot prove the following argument from the rules of chapter 6:

IF F(x) THEN (IF G(y) THEN H(y)).
IF (F(z) AND G(z)) THEN (F(a) AND H(a)).

PSYCOP sensibly tries to prove this argument using IF Introduction, assuming the conclusion's antecedent and then attempting to prove its consequent. In order to establish H(a), however, PSYCOP needs to use the premise information. Backward IF Elimination seems like the right strategy at this point, but the rule doesn't apply, since H(y) is buried in the consequent of the premise's consequent. Of course, the proof would go through if we gave PSYCOP a generalized IF Elimination rule, such as the one discussed in chapter 4, that allowed it to work with these embedded conditionals; however, it is not clear whether such a rule would be psychologically plausible. McGee (1985) claims that IF Elimination sometimes gives results that are intuitively incorrect when applied to conditionals whose consequent is also a conditional. One possibility that McGee discusses (but doesn't necessarily endorse) is that people understand these embedded conditionals (ones of the form IF P THEN (IF Q THEN R)) as if they were logically equivalent conditionals with conjunctions in their antecedents (i.e., as IF (P AND Q) THEN R). This hypothesis both explains McGee's examples and allows PSYCOP to prove the arguments in question.

Chapter 8

1. I am assuming that the Talk-to links are one-way in order to keep the example simple. There are several ways to implement a version of the problem with two-way links. One is by substituting the following assertions for (1a) and (1b):

IF Talks-to(u,v) OR Talks-to(v,u) THEN Path(u,v)
IF (Talks-to(x,y) OR Talks-to(y,x)) AND Path(y,z) THEN Path(x,z)

A second simplification is that we are tacitly assuming that there are no closed circuits in the spy network (in fact, none of Hayes' problems contained circuits). Given just the information in (1a) and (1b), circuits will cause PSYCOP to loop endlessly.
To prevent this, PSYCOP could keep a list of the spies who have already carried the message, preventing any such spy from receiving it twice. See chapter 7 of Clocksin and Mellish 1981 for an example of this sort.

2. Later research on memory for partial orders confirms some of these same effects. In experiments of this type, subjects learn relations between adjacent items in an ordering (e.g., Tango is taller than China, Tango is taller than Tree, etc.) and then answer questions about the adjacent pairs (Is Tango taller than China?, Is Tree taller than Tango?) or nonadjacent ones (Is Tango taller than Larynx?, Is Larynx taller than China?). (They are not required to trace each link in the chain.) Subjects tend to answer these questions faster for the adjacent than the nonadjacent items and faster for parts of the structure that contain no competing pathways (Hayes-Roth and Hayes-Roth 1975; Moeser and Tarrant 1977; Warner and Griggs 1980). However, the difference between adjacent and nonadjacent pairs decreases when metric information is included with the partially ordered items (Moeser and Tarrant 1977), and the difference reverses for linearly ordered structures (Potts 1972; Scholz and Potts 1974). This disappearance or reversal is probably due to subjects' adopting a different representation from the pairwise relations that we have assumed in this example.

3. "Sufficiency" can have weaker or stronger interpretations. Much of the discussion about necessary and sufficient properties in the literature on categorization focuses on sufficiency in a modal sense in which the properties would have to guarantee category membership in all
possible worlds. It is usually easy to find sci-fi counterexamples to show that a proposed set of features is not modally sufficient for the category in question. (E.g., the possibility of a new type of mammal growing feathers would defeat feathers as a sufficient property for being a bird, if we take sufficiency in this strong way.) But (2) demands sufficiency in the weaker form that there is nothing (in the actual world) that satisfies the antecedent of the conditional that doesn't also satisfy the consequent. There could certainly be sufficient properties in this weaker sense, even though there are none in the modal sense.

One might argue that any useful classification procedure should be stronger than what is required by the conditional in (2). The procedure should be able to classify not only birds that happen to exist but also ones that could exist in some at least mildly counterfactual situations. Perhaps it should be able to classify all physically possible birds. Bear in mind, though, that what is at stake in this discussion is not a scientific theory of birds but a scientific theory of how people classify them. For the latter purpose, it is not clear to me exactly how strong the conditional should be.

4. And, of course, both (a) and (b) differ from (c) George is a bird if there are causal relations of a certain sort. Briefly put, (c) is a claim about the sufficiency of causal relations for category membership, (b) is a claim about people's belief in (c), and (a) is a claim about the sufficiency of the belief about causal relations for the belief about George's being a bird. Traditional arguments against sufficient properties for natural categories are directed against (c), and arguments against people's belief in sufficient properties are arguments against (b).

5. It is another question, and a much trickier one, whether we should identify our schemas or mini-theories with concepts of the same categories. There are good reasons, both philosophical (Putnam 1988) and psychological (Armstrong et al.
1983; Osherson and Smith 1981), for supposing that concepts and schemas are different creatures, although they are often run together in the psychological literature.

6. Rumelhart and Norman also mention as a fifth characteristic the fact that one schema can embed another; for example, the schema for body will include schemas for head, trunk, and limbs. It is not clear, however, whether this means that all of the information we have about heads is a physical part of our representation of bodies or rather that the representation of bodies can include a reference to heads. The latter notion, which seems the more plausible of the two, is surely within the compass of the logical forms we have been using.

7. Oaksford and Chater (1991) also note that the intractability (i.e., NP-completeness) of nonmonotonic logics makes them unlikely as cognitive mechanisms for inductive inference. They use the intractability of these logics, however, as part of a general argument against classical representational approaches in cognitive science. Oaksford and Chater may be right that inductive inference will eventually be the downfall of these approaches. But, as the categorization example shows, the classical view isn't committed to nonmonotonic logics as a way of obtaining nondeductive belief fixation. This remains true even when the classical view is specialized so that all conclusions sanctioned in the system are deductively valid ones, because these systems can adopt a belief in ways other than making it the conclusion of an argument.

8. There is an echo here of the debate between Geis and Zwicky (1971) and Boer and Lycan (1973) on invited inference, which I touched on in chapter 2. The inference from If you mow her lawn, Joan will pay you $5 to If you don't mow her lawn, Joan won't pay you $5 might be taken as a type of nonmonotonic inference based on minimizing the possible circumstances in which Joan pays you $5 to those explicitly stated in the premise (i.e., those in which you mow her lawn).
Obviously, if these circumstances coincide (if the paying-$5 cases are the same as the lawn-mowing cases, as this minimization predicts), then the conclusion follows. Thus, we can view Geis and Zwicky's "conditional perfection" as a special case of circumscription. The same goes for the tendency to perfect universals: to reason from All X are Y to All Y are X.

On one hand, it is clearly of interest that the same nonmonotonic mechanism might account both for these conditional and universal inferences and for the default inferences just discussed. On the other hand, perhaps this is merely a reflection of the fact that all these items are inductive and the fact that all inductive inferences require conceptual minimizing of some sort. We minimize the possibility that what is unforeseen or unlikely will turn out to be true.
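The reading of conditional perfection as minimization in note 8 can be illustrated with a toy calculation. This is a sketch under explicit assumptions, not a rendering of any circumscription formalism cited in the text: worlds are represented as hypothetical (mow, pay) pairs, the mowing facts are held fixed, and "minimizing" means preferring, for each value of mow, a world in which paying is false whenever the premise allows it.

```python
from itertools import product


def models_of_premise():
    """Worlds (mow, pay) that satisfy the premise IF mow THEN pay."""
    return [(mow, pay)
            for mow, pay in product([True, False], repeat=2)
            if (not mow) or pay]


def minimal_models(models):
    """For each value of mow, keep the world in which pay is minimal.

    False counts as smaller than True, so paying is assumed to occur
    only where the premise forces it.
    """
    best = {}
    for mow, pay in models:
        if mow not in best or (best[mow] and not pay):
            best[mow] = pay
    return sorted(best.items())


all_models = models_of_premise()
preferred = minimal_models(all_models)

# In every preferred world paying and mowing coincide, which is the
# "perfected" biconditional reading.
perfected = all(mow == pay for mow, pay in preferred)
```

Without the minimization step the world in which Joan pays although you did not mow remains among the models, so the invited inference does not follow deductively from the premise alone.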

Chapter 9

1. There are deontic logics that formalize permission and obligation and that are extensions of the basic CPL system (Føllesdal and Hilpinen 1971; Lewis 1974; von Wright 1971). These will be discussed below.

2. In order to handle some of these problems, PSYCOP needs a process that recognizes when the conclusion of an argument is inconsistent with the premises (as Braine et al. point out). In their experiments, subjects were required to determine whether the conclusion of an argument was "true" on the basis of the premises, "false" on the basis of the premises, or "indeterminate" (not just "follows" vs. "doesn't follow"). For example, the argument below should have a False response:

There is an F, and there's not an L.
There is an L.

PSYCOP can determine that an argument is false in a general way by proving that the negation of the conclusion follows from the premises. In some problems like this one, however, the negation of the conclusion is produced automatically in virtue of PSYCOP's forward rules, and PSYCOP could make an immediate "false" response if it detected the presence of such an assertion. Outcome measures in Braine et al.'s experiments suggest that subjects are able to do something like this. Braine et al. implement this through a separate inference rule; but since this monitoring seems to have a different status than the inference rules in chapters 4 and 6, it might be more appropriate as part of PSYCOP's response process.

3. In a later paper, Cheng and Holyoak (1989, p. 301) acknowledge that "looser versions of the memory-cuing view, which simply argue that remindings can influence reasoning, seem to us impossible to entirely rule out for any selection-task experiment conducted on adult subjects." But they note that "the critical weakness of looser formulations of the memory-cuing view . . . is that they are not predictive; rather they provide only post hoc explanations of when facilitation is or is not obtained." Although an explanation based on specific experiences is certainly post hoc in the present context, it is not clear that it can't be developed in a way that makes testable predictions. One might start from the hypothesis that context is effective to the extent that it reminds subjects of (independently described) experiences that clearly identify violations. The current point, however, is not to develop such a theory, but simply to agree that it "can't be entirely ruled out."

4. Jackson and Griggs (1990) report that seemingly simple changes in wording can disrupt performance on abstract permission rules.
For example, in their Experiment 4 they found only 15% correct responses when the problem was phrased as follows:

Your task is to decide which of the cards you need to turn over in order to find out whether or not a certain regulation is being followed. The regulation is "If one is to take action 'A,' then one must first satisfy precondition 'P.'" Turn over only those cards that you need to check to be sure.

Cheng and Holyoak's (1985) wording was as follows:

Suppose you are an authority checking whether or not people are obeying certain regulations. The regulations all have the general form, "If one is to take action 'A,' then one must first satisfy precondition 'P.'" In other words, in order to be permitted to do "A", one must first have fulfilled prerequisite "P." The cards below contain information on four people. . . . In order to check that a certain regulation is being followed, which of the cards below would you turn over? Turn over only those that you need to check to be sure.

Jackson and Griggs were able to replicate Cheng and Holyoak's effect if they repeated the exact wording, so presumably something about the pretense of the authority checking a regulation (Gigerenzer and Hug 1992) or the paraphrasing of the rule ("In other words . . .") is crucial to the effect (Cosmides 1989). The checking context alone is not sufficient, however (Pollard and Evans 1987).

5. This is, of course, not to deny that people sometimes reason with principles (heuristics, inductive rules, plausible rules) that aren't deductively correct. We saw an example of heuristic reasoning of this sort in the context of categorization in chapter 8. It probably goes without saying at this point that a deduction system such as PSYCOP can incorporate these heuristics in the same way that production systems do. In fact, this suggests a third possibility for squaring the evidence on content effects with a deduction system such as ours: We can allow PSYCOP to apply (9a)-(9d) in order to solve the selection task, handling them in just the way we do any other conditional sentences. This would be an easy project, since there are only four sentences to worry about, and it would also avoid the need to postulate special deduction rules based on deontic concepts. In the text I have chosen to focus on the possibility of deontic deduction rules, since there do appear to be valid, productive relationships among sentences containing PERMISSIBLE and OBLIGATORY.
Manktelow and Over (1991) also note the possibility of construing pragmatic schemas in terms of a deontic deduction theory. They claim, however, that such a theory is not sufficiently general, since it could not account for why people come to accept deontic conditionals (e.g., If you tidy your room, then I will allow you to go out to play) in the first place. But although deduction can't always explain the rationale for deontic conditionals, it can explain certain relations among them. The situation is the same as the case of ordinary deductive inferences: Although a theory of deduction can explain why Trump is rich AND Koch is famous follows from the separate sentences Trump is rich and Koch is famous, it can't necessarily explain why you believe Trump is rich (Koch is famous) in the first place.

6. See chapter 2 above.

7. See also note 3 above.

8. See also Reiser et al. 1985.

9. I am using "obligation" and "permission" in a way that differs from Cheng and Holyoak's usage but that is more in keeping with their usage in deontic logic. See the above references to work on conditional obligation and permission. For Cheng and Holyoak, permission implies that a mandatory action (i.e., precondition) ought to be performed before the permitted action can be carried out, and obligation implies that an action commits a person to performing a later mandatory action. My use of PERMISSIBLE and OBLIGATORY carries no temporal restrictions. Rather, PERMISSIBLE(P | Q) means that P can be the case given that Q is the case. OBLIGATORY(P | Q) means that P ought to be the case given that Q is the case. Thus, both permissions and obligations in Cheng and Holyoak's sense entail that some state of affairs is OBLIGATORY given some other state.

10. Another way to represent the dual obligation in this setup is to make the OBLIGATORY and PERMISSIBLE operators include an argument for the individual who is under the obligation or who is given permission.

11. Cosmides, following Manktelow and Over (1987), also notes disagreement among philosophers about exactly which arguments are deontically valid. But although there are differences
of opinion about the right set of rules or axioms for deontic logic (just as there are for sentential logic), systematic relations exist among the deontic systems, as Lewis (1974) demonstrates. Cosmides' comments seem to imply that these systems suffer from vagueness in their formulation ("similarities or differences [between deontic logic and social contract theory] will become clearer once deontic logic reaches the level of specificity that social contract theory has" (p. 233)). But just the opposite is the case; rival deontic systems emerged because of the need to formulate them at a detailed level. Moreover, the theoretical disagreements don't foreclose the possibility that some such system provides a good basis for human reasoning in this domain.

Chapter 10

1. Denise Cummins pointed this out (personal communication, 1992). Cohen (1981), Macnamara (1986), Osherson (1976), and Sober (1978) all develop similar analogies, as was noted in chapter 1.

(a) A is on the right of B.
C is on the left of B.
D is in front of C.
E is in front of B.
What is the relation between D and E?

(b) B is on the right of A.
C is on the left of B.
D is in front of C.
E is in front of B.
What is the relation between D and E?

According to these authors, there are two mental models that are consistent with (b), but only one that is consistent with (a). Since it is more difficult to keep track of two models than one, model theory predicts that (b) will lead to more errors than (a). Byrne and Johnson-Laird point out, however, that one can construct the same proof that D is to the left of E for both problems. (The first premise is irrelevant to the correct conclusion in both cases, and the remaining premises are identical; thus, any proof that will work with (a) will also work with (b).) Hence, they believe, rule-based theories are committed to predicting no difference
between these items. Because subjects in fact have more difficulty with (b) than with (a), Byrne and Johnson-Laird conclude that model theory is right and rule theories are wrong.

There are multiple sources of difficulty, however, in interpreting this experiment. First, the instructions specifically requested subjects to form spatial arrays in solving the problems. This obviously biased the subjects to use an imaginal strategy that favored the mental-model predictions (or placed a strong task demand on them to respond as if they were trying to image the arrays; see Pylyshyn 1984). Second, there is no reason at all to suppose that the final length of a derivation is the only pertinent factor in rule-based accounts of this type of reasoning. In any realistic deduction system, searching for a correct proof can be sidetracked by the presence of irrelevant information, such as the first premise of these problems. In fact, we have already seen an example of how PSYCOP's problem solving becomes less efficient in the presence of irrelevant information for somewhat similar tasks (see the simulation of Hayes' spy problems in chapter 8). Of course, if a deduction system were able to ignore the first premise, it would produce equivalent performance for (a) and (b); but exactly the same is true for model theory. Third, Byrne and Johnson-Laird's method of counting models in these problems violates a policy that Johnson-Laird established elsewhere (Johnson-Laird and Bara 1984a, p. 37). In the case of the mental-model theory for other types of inferences (which will be discussed momentarily), a problem requires more than one model only if additional models rule out potential conclusions that are consistent with the initial model. In the relational problems, however, both of the purported models for (b) lead to precisely the same conclusion with respect to the relation between D and E. Thus, according to the former method of counting models, both (a) and (b) would be one-model problems.

4. These statistics are based on predictions and data in tables 9-12 of Johnson-Laird and Bara 1984a. Johnson-Laird and Byrne (1991) are less explicit about their predictions, but they are probably close to those of the earlier version.

5. I am assuming that All green blocks are big can't be eliminated on the basis of the bracketing in the second model of figure 10.3. But even if it can, predictions for the models theory will still depend on the order in which subjects consider these two models. In that case, if subjects first construct the model on the right of the figure, then the only potential conclusion would be the correct one (Some big blocks are not green), and syllogism (6) would be a one-model, rather than a two-model, problem.
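The model counts at issue in note 3 for relational problems (a) and (b) can be checked by brute force. The sketch below makes several simplifying assumptions that are not in the text: a "model" is taken to be a left-to-right ordering of A, B, and C; D and E simply inherit the horizontal positions of C and B (since each is "in front of" one of them); and the function and variable names are hypothetical.

```python
from itertools import permutations


def models(constraints, items=("A", "B", "C")):
    """Return every left-to-right ordering consistent with (left, right) pairs."""
    consistent = []
    for order in permutations(items):
        position = {item: index for index, item in enumerate(order)}
        if all(position[left] < position[right] for left, right in constraints):
            consistent.append(order)
    return consistent


def d_e_relations(constraints):
    """Collect the D-E relation found in each model of the premises."""
    relations = set()
    for order in models(constraints):
        position = {item: index for index, item in enumerate(order)}
        # D is in front of C and E is in front of B, so they share those columns.
        if position["C"] < position["B"]:
            relations.add("D left of E")
        else:
            relations.add("D right of E")
    return relations


# Each pair is (item on the left, item on the right).
problem_a = [("B", "A"), ("C", "B")]  # A right of B; C left of B
problem_b = [("A", "B"), ("C", "B")]  # B right of A; C left of B

models_a = models(problem_a)
models_b = models(problem_b)
relations_a = d_e_relations(problem_a)
relations_b = d_e_relations(problem_b)
```

Under these assumptions problem (a) has a single ordering and problem (b) has two, yet every ordering yields the same relation between D and E, which is the observation behind the third objection in the note.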

6. Polk and Newell (1988; see also Newell 1990 and Polk et al. 1989) have proposed a theory that combines aspects of mental models and proposition-based reasoning in the Soar framework and which may improve on Johnson-Laird and Bara's approach. The details of the theory in the published reports are too fragmentary to allow a complete evaluation. However, there are several facts about the scheme that are worth noting. One is that Soar's basic representational format consists of object-attribute-value triples. Both the mental models and the categorical propositions that Soar uses in solving syllogisms are constructed from these triples. In particular, Soar's mental models are not diagrammatic entities with inherent spatial properties; they are lists of attributes and values such as (block-1 ^shape square ^color green). Second, use of mental models is extremely limited, at least in the single example that Polk and Newell provide. In fact, mental models seem restricted to determining the polarity of the conclusion (negative or positive); all other aspects of the conclusion (order of terms and quantifier) are determined entirely by the propositions. There seems to be no counterpart to Johnson-Laird's use of models to eliminate potential conclusions, which is the central component of his theory. Soar also allows new propositions to be produced from old ones. Finally, Soar makes errors because of simple biases in constructing the conclusion and because of conversion of premises. Number of models isn't the determining factor.

7. Johnson-Laird and Byrne (1991) revise this notation by eliminating the optional tokens (e.g., Ov) and instead using the bracket convention as in the syllogism models. I have used the notation of Johnson-Laird et al. (1989) here, since their presentation is more complete. The two systems appear to make identical predictions, however.

8. Johnson-Laird et al. (1989) included some multiple-model premises that have no valid conclusion of the form Q1 of the X are related to Q2 of the Y. The correct response for these problems was "nothing follows," and for this reason their difficulty may be due to subjects' bias against responses of this sort (Revlis 1975a).

9. Some of the material in this section is drawn from Rips 1990b.

10. As support for models over rules, Johnson-Laird and Byrne (1991) cite earlier work (Oakhill et al. 1989) purportedly showing that believability of a tentative conclusion can affect the reasoning process. The notion is that rule theories can explain content effects only through the initial interpretation of premises or through the censoring of conclusions. Content does not affect the actual reasoning process, only its input and output. However, content can influence reasoning according to the "model" model by increasing subjects' willingness to consider more than one model of the premises. The evidence that supports the latter idea comes from syllogistic premises like these:

(a) All of the Frenchmen are wine drinkers.
Some of the wine drinkers are gourmets.
?

(b) All of the Frenchmen are wine drinkers.
Some of the wine drinkers are Italians.
?

ha form as 7 w the b model ith in 8 and 9 and nei These have the same ( ( ( ) ( , ) problems . However for a the tenta conc tha are a valid conclusion , ( ) categorical syllogis are wh consistent with model 8 are believabl e . Some the Frenc , ( ) ( ) , of gour g . If unb b are not e . Some the French are Italia for , ( ( ) ) they g of syllogism the to in for alter tentative conclusions mo , encourage subjects persist search tha more to find model 9 and who receive b should be ( ) ( ) resp corr likely subjects for b tha for . indeed more correc no conclusion follows . Oakhill et al did ( ) respo .W a (jects )from the evide for men mo and to mention is that Johnson l hat aird Byrne neglect tentat conc affe su hill al . is . If the of Oak et contradict believa 'compare it shou also with no conclu on , pe syllogisms categor performanc aI . conc . Oak hill et on m odel that do have multiple syllogism catego in of the tentat conc are un such as c below which some , , ( ) problems as in conclus is believa and which the final correct ,suc (d),in prob conclusio are believa .with which both tentative and aremarried. (c) Someof the houseowners is a husband . Noneof the houseowners ? aremarried. (d) Someof the houseowners . Noneof the houseowners is a bachelor ?

According to model theory, problem (c) has unbelievable tentative conclusions (including None of the husbands are married) and a believable final conclusion (Some of the married people are not husbands). Problem (d) has believable tentative conclusions (such as None of the bachelors are married) and a believable final conclusion (Some of the married people are not bachelors). Since unbelievable tentative conclusions should motivate subjects to search for further models, Oakhill et al. predict that problems of type (c) should produce better performance than ones of type (d). In three experiments, however, Oakhill et al. found no significant difference between these problems; in two of three, the trend was in the direction opposite to the mental-model predictions.

The generate-and-test account of syllogisms that I outlined in chapter 7 is quite consistent with these data. For problems such as (a) and (b) that have no categorical conclusion, PSYCOP will end up either guessing or responding "no conclusion follows." If subjects are more likely to guess when the generated conclusion is believable, then they should obtain higher accuracy with (b) than with (a). Now consider problems such as (c) and (d) that do have valid categorical conclusions. Since PSYCOP's forward rules are not sufficient to produce the correct conclusions, the program should again generate a tentative conclusion: None of the husbands are married (or its converse) in the case of (c), and None of the bachelors are married (or its converse) in the case of (d). Neither of these conclusions is derivable, and most subjects will give up at this point, responding "no conclusion follows" or guessing. Guessing will occur more often for (d) than for (c), but this makes no difference in accuracy, since both "no conclusion follows" and the guess (None of the bachelors are married) are incorrect. Subjects' accuracy for (c) and (d) should therefore be low and approximately equal, which is what Oakhill et al. observed.
11. These considerations do not show that there is anything wrong with the idea that people possess mental representations that integrate information across sentences and from sentences and other sources. For example, [...]. Although such representations are sometimes called "mental models," "situation models," or "discourse models" (e.g., Just and Carpenter 1987; Kintsch 1988), their proponents do not claim that these models possess representational powers that are in principle different from those of standard representations (networks or mental sentences). The same is true in work in which "mental model" is used to mean a collection of beliefs about some scientific domain, such as electricity or temperature. (See the collection of papers in Gentner and Stevens 1983.) These theories differ clearly from Johnson-Laird's (1983, 1989) conception of mental models, since he believes these models can go beyond what standard representations can accomplish.

Notes to pp. 371-396



1. The difference between architectural and strategic deduction may be related to Grice's (1977) contrast between "flat" and "variable" concepts of rationality. The first is the concept involved in the notion that people are rational beings, whereas the second allows some people to be more rational than others.
2. Many investigators fail to realize that parameter fitting is also part of the more standard statistical techniques, such as analysis of variance (ANOVA) and regression, that they use for hypothesis testing. For example, it comes as a surprise to many psychologists, including ones who have taken graduate-level statistics courses and who use statistics on a daily basis, that ANOVA involves parameter estimation for effect sizes. This goes unnoticed because different estimation techniques lead to the same choice of parameters unless the design is a complex one (e.g., involves an unbalanced number of observations).
3. I use "logicist" here, as Birnbaum and other AI researchers do, to mean a cognitive scientist who uses methods of formal logic as one of his or her main research tools. For connections between logicism of this sort and the older logicist program in the foundation of mathematics, see Thomason 1991.
4. What may have led to confusion about OR Introduction is the claim in an earlier paper (Rips 1984) that there are individual differences in subjects' use of this rule and that such differences might constitute variations in reasoning competence. I was using competence in this context to refer to stable aspects of subjects' deduction abilities, as opposed to such temporary performance factors as inattention and motivation. Stable differences in



susceptibility to pragmatic influences, such as ones that affect OR Introduction, would be differences in competence on this approach. However, the term competence may have suggested that subjects who consistently refrained from applying OR Introduction were somehow incompetent in their reasoning. This, of course, was not the intent of the argument. Differences in reasoning between two individuals, even differences in their reasoning competence, needn't imply that either is thinking irrationally. (An analogy: Native English speakers who haven't learned Chinese, and who thus have no competence in Chinese, aren't thereby defective in their linguistic abilities.)
5. It is possible for someone to claim that PSYCOP's incompleteness with respect to classical logic is itself evidence that the theory attributes to people a kind of systematic irrationality. No one supposes, however, that the incompleteness of arithmetic demonstrates the irrationality of finite axiom systems in classical logic. Thus, a critic would need some additional arguments to show why PSYCOP's incompleteness is especially culpable.
Another argument along these lines concerns the distinction between architectural and strategic levels. If the necessary deduction rules are built into the architecture, it may seem as though replicating these rules for strategic purposes is just introducing an unnecessary source of error. Once a deduction problem has been entered in memory, why doesn't the system simply deal with it at the architectural level? Our original motive for separating these two modes of deduction, however, was to keep the system from committing itself to conclusions that it was then powerless to retract. We need to be able to override some uses of deduction, and one way to do this is to separate the overridable uses from the overriding ones. Although this may seem unnecessary duplication and a possible source of mistakes, its purpose is to eliminate much more serious catastrophes. (There might also be a question about how the system could keep the architectural rules from applying to the overridable cases. One approach would be to tag sentences for architectural use; another would be to store them in a separate location, such as the production-rule memory of Anderson (1976, 1983).)
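The tagging approach mentioned in the parenthetical remark above can be illustrated with a minimal Python sketch. This is not PSYCOP's implementation: the names `Conditional`, `Memory`, `architectural_step`, and `strategic_step` are hypothetical, sentences are reduced to atoms and simple conditionals, and modus ponens stands in for the full rule set. The sketch assumes only that architectural conclusions are committed to memory while strategic conclusions remain tentative and can be blocked by retracting a premise.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Conditional:
    """A sentence of the form IF antecedent THEN consequent (hypothetical encoding)."""
    antecedent: str
    consequent: str


@dataclass
class Memory:
    # Sentences tagged for architectural use: conclusions from these are committed.
    architectural: set = field(default_factory=set)
    # Sentences left to strategic deduction: conclusions from these stay overridable.
    strategic: set = field(default_factory=set)


def modus_ponens(sentences):
    """Close the sentences under modus ponens and return only the newly derived atoms."""
    known = set(sentences)
    derived = set()
    changed = True
    while changed:
        changed = False
        for s in list(known):
            if (isinstance(s, Conditional)
                    and s.antecedent in known
                    and s.consequent not in known):
                known.add(s.consequent)
                derived.add(s.consequent)
                changed = True
    return derived


def architectural_step(memory):
    """Apply the rule to tagged sentences only, and commit the results to memory."""
    new = modus_ponens(memory.architectural)
    memory.architectural |= new
    return new


def strategic_step(memory, retracted=frozenset()):
    """Derive tentative conclusions from all sentences except retracted ones.

    Nothing is stored, so a later retraction leaves no committed conclusion behind.
    """
    premises = (memory.architectural | memory.strategic) - set(retracted)
    return modus_ponens(premises)


mem = Memory(architectural={Conditional("p", "q"), "p"},
             strategic={Conditional("q", "r")})
print(architectural_step(mem))  # {'q'} is committed
print(strategic_step(mem))      # {'r'} is tentative only
print(strategic_step(mem, retracted={Conditional("q", "r")}))  # set()
```

Keeping the two sets apart is what lets the strategic conclusion `r` be overridden without touching anything the architectural rules have already committed.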


Adams, E. (1965) The logic of conditionals. Inquiry 8: 166-197.
Adams, M. J. (1984) Aristotle's logic. In G. H. Bower (ed.), Psychology of Learning and Motivation, volume 18. Academic Press.
Adkins, D. C., and Lyerly, S. B. (1952) Factor Analysis of Reasoning Tests. University of North Carolina Press.
Ajjanagadde, V., and Shastri, L. (1989) Efficient inference with multiplace predicates and variables in a connectionist system. Program of the 11th Annual Conference of the Cognitive Science Society. Erlbaum.
Anderson, A. R., and Belnap, N. D., Jr. (1975) Entailment: The Logic of Relevance and Necessity, volume 1. Princeton University Press.
Anderson, J. R. (1976) Language, Memory, and Thought. Erlbaum.
Anderson, J. R. (1983) The Architecture of Cognition. Harvard University Press.
Anderson, J. R. (1990) The Adaptive Character of Thought. Erlbaum.
Anderson, J. R., Greeno, J. G., Kline, P. J., and Neves, D. M. (1981) Acquisition of problem-solving skill. In J. R. Anderson (ed.), Cognitive Skills and Their Acquisition. Erlbaum.
Armstrong, S. L., Gleitman, L. R., and Gleitman, H. (1983) What some concepts might not be. Cognition 13: 263-308.
Au, T. K.-F. (1983) Chinese and English counterfactuals: The Sapir-Whorf hypothesis revisited. Cognition 15: 155-187.
Bach, E. (1986) Natural language metaphysics. In R. Barcan Marcus, G. J. W. Dorn, and P. Weingartner (eds.), Logic, Methodology, and Philosophy of Science, volume 7. North-Holland.

Barwise, J. (1979) On branching quantifiers in English. Journal of Philosophical Logic 8: 47-80.
Barwise, J. (1986) The situation in logic-II: Conditionals and conditional information. In E. C. Traugott, C. A. Ferguson, and J. S. Reilly (eds.), On Conditionals. Cambridge University Press.
Barwise, J. (1987a) Unburdening the language of thought. Mind and Language 2: 82-96.
Barwise, J. (1987b) Noun phrases, generalized quantifiers, and anaphora. In P. Gärdenfors (ed.), Generalized Quantifiers: Linguistic and Logical Approaches. Reidel.
Barwise, J., and Cooper, R. (1981) Generalized quantifiers and natural language. Linguistics and Philosophy 4: 159-219.
Beattie, J., and Baron, J. (1988) Confirmation and matching bias in hypothesis testing. Quarterly Journal of Experimental Psychology 40A: 269-289.
Begg, I., and Denny, J. (1969) Empirical reconciliation of atmosphere and conversion interpretations of syllogistic reasoning. Journal of Experimental Psychology 81: 351-354.
Begg, I., and Harris, G. (1982) On the interpretation of syllogisms. Journal of Verbal Learning and Verbal Behavior 21: 595-620.
Belnap, N. D. (1962) Tonk, plonk, and plink. Analysis 22: 130-134.
Benacerraf, P. (1967) God, the devil, and Gödel. Monist 51: 9-32.
Bergmann, M., Moor, J., and Nelson, J. (1980) The Logic Book. Random House.
Beth, E. W. (1955) Semantic entailment and formal derivability. Mededelingen van de Koninklijke Nederlandse Akademie van Wetenschappen 18: 309-342.
Birnbaum, L. (1991) Rigor mortis: A response to Nilsson's "Logic and artificial intelligence." Artificial Intelligence 47: 57-77.
Bledsoe, W. W. (1977) Non-resolution theorem proving. Artificial Intelligence 9: 1-35.



Bledsoe, W. W., and Bruell, P. (1974) A man-machine theorem-proving system. Artificial Intelligence 5: 51-72.
Bledsoe, W. W., Boyer, R. S., and Henneman, W. H. (1972) Computer proofs of limit theorems. Artificial Intelligence 3: 27-60.
Boer, S. E., and Lycan, W. G. (1973) Invited inferences and other unwelcome guests. Papers in Linguistics 6: 483-505.
Boolos, G. (1984) On 'Syllogistic inference.' Cognition 17: 181-182.
Boolos, G. S., and Jeffrey, R. C. (1974) Computability and Logic. Cambridge University Press.
Borkowski, L., and Słupecki, J. (1958) A logical system based on rules and its application in teaching mathematical logic. Studia Logica 7: 71-106.

Bower, G. H., and Morrow, D. G. (1990) Mental models in narrative comprehension. Science 247: 44-48.
Brachman, R. J., and Schmolze, J. G. (1985) An overview of the KL-ONE knowledge representation system. Cognitive Science 9: 171-216.
Braine, M. D. S. (1978) On the relation between the natural logic of reasoning and standard logic. Psychological Review 85: 1-21.
Braine, M. D. S. (1990) The "natural logic" approach to reasoning. In W. F. Overton (ed.), Reasoning, Necessity, and Logic: Developmental Perspectives. Erlbaum.
Braine, M. D. S., and O'Brien, D. P. (1991) A theory of If: A lexical entry, reasoning program, and pragmatic principles. Psychological Review 98: 182-203.
Braine, M. D. S., and Rumain, B. (1983) Logical reasoning. In P. H. Mussen (ed.), Handbook of Child Psychology, volume 3. Wiley.
Braine, M. D. S., Reiser, B. J., and Rumain, B. (1984) Some empirical justification for a theory of natural propositional reasoning. In G. H. Bower (ed.), Psychology of Learning and Motivation, volume 18. Academic Press.
Brooks, L. (1978) Nonanalytic concept formation and memory for instances. In E. Rosch and B. B. Lloyd (eds.), Cognition and Categorization. Erlbaum.
Burt, C. (1919) The development of reasoning in children-I. Journal of Experimental Pedagogy 5: 68-77.
Byrne, R. M. J. (1989) Suppressing valid inferences with conditionals. Cognition 31: 61-83.
Byrne, R. M. J., and Johnson-Laird, P. N. (1989) Spatial reasoning. Journal of Memory and Language 28: 564-575.
Carey, S. (1985) Conceptual Change in Childhood. MIT Press.
Carpenter, P. A., and Just, M. A. (1975) Sentence comprehension: A psycholinguistic processing model of verification. Psychological Review 82: 45-73.
Carroll, J. B. (1989) Factor analysis since Spearman: Where do we stand? What do we know? In R. Kanfer, P. L. Ackerman, and R. Cudeck (eds.), Abilities, Motivation, and Methodology. Erlbaum.



Cheng, P. W., and Holyoak, K. J. (1985) Pragmatic reasoning schemas. Cognitive Psychology 17: 391-416.
Cheng, P. W., and Holyoak, K. J. (1989) On the natural selection of reasoning theories. Cognition 33: 285-313.
Cheng, P. W., Holyoak, K. J., Nisbett, R. E., and Oliver, L. M. (1986) Pragmatic versus syntactic approaches to training deductive reasoning. Cognitive Psychology 18: 293-328.
Cherniak, C. (1986) Minimal Rationality. MIT Press.
Chierchia, G., and McConnell-Ginet, S. (1990) Meaning and Grammar: An Introduction to Semantics. MIT Press.
Chomsky, N. (1957) Syntactic Structures. Mouton.
Chomsky, N. (1965) Aspects of the Theory of Syntax. MIT Press.
Church, A. (1936a) A note on the Entscheidungsproblem. Journal of Symbolic Logic 1: 40-41.
Church, A. (1936b) An unsolvable problem of elementary number theory. American Journal of Mathematics 58: 345-363.
Clark, H. H. (1969) Linguistic processes in deductive reasoning. Psychological Review 76: 387-404.
Clark, H. H., and Chase, W. G. (1972) On the process of comparing sentences against pictures. Cognitive Psychology 3: 472-517.
Clement, C. A., and Falmagne, R. J. (1986) Logical reasoning, world knowledge, and mental imagery: Interconnections in cognitive processes. Memory and Cognition 14: 299-307.
Clocksin, W. F., and Mellish, C. S. (1981) Programming in PROLOG. Springer-Verlag.
Cohen, L. J. (1981) Can human irrationality be experimentally demonstrated? Behavioral and Brain Sciences 4: 317-370.
Cohen, L. J. (1986) The Dialogue of Reason. Clarendon.
Collins, A. M., and Loftus, E. F. (1975) A spreading-activation theory of semantic processing. Psychological Review 82: 407-428.
Collins, A., and Michalski, R. (1989) The logic of plausible reasoning: A core theory. Cognitive Science 13: 1-50.
Cook, S. A. (1971) The complexity of theorem-proving procedures. In Proceedings of the Third Annual ACM Symposium on the Theory of Computing.
Cooper, R. (1983) Quantification and Syntactic Theory. Reidel.
Copi, I. M. (1954) Symbolic Logic. Macmillan.
Copi, I. M. (1973) Symbolic Logic, fourth edition. Macmillan.
Corcoran, J. (1974) Aristotle's natural deduction system. In J. Corcoran (ed.), Ancient Logic and Its Modern Interpretations. Reidel.
Cosmides, L. (1989) The logic of social exchange: Has natural selection shaped how humans reason? Cognition 31: 187-276.
Cummins, D. D., Lubart, T., Alksnis, O., and Rist, R. (1991) Conditional reasoning and causation. Memory and Cognition 19: 274-282.
Davidson, D. (1970) Mental events. In L. Foster and J. W. Swanson (eds.), Experience and Theory. University of Massachusetts Press.
Davis, M., and Putnam, H. (1960) A computing procedure for quantification theory. Journal of the Association for Computing Machinery 7: 201-215.
DeJong, G. (1988) An introduction to explanation-based learning. In H. E. Shrobe (ed.), Exploring Artificial Intelligence. Morgan Kaufmann.



de Kleer, J. (1986) An assumption-based TMS. Artificial Intelligence 28: 127-162.
Dennett, D. C. (1971) Intentional systems. Journal of Philosophy 68: 87-106.
Dennett, D. C. (1978) Brainstorms. Bradford Books.
Dennett, D. C. (1981) True believers: The intentional strategy and why it works. In A. F. Heath (ed.), Scientific Explanation. Clarendon.
DeSoto, C. B., London, M., and Handel, S. (1965) Social reasoning and spatial paralogic. Journal of Personality and Social Psychology 2: 513-521.
Dickstein, L. S. (1978a) The effect of figure on syllogistic reasoning. Memory and Cognition 6: 76-83.
Dickstein, L. S. (1978b) Error processes in syllogistic reasoning. Memory and Cognition 6: 537-543.
Dowty, D. R. (1993) Categorial grammar, reasoning, and cognition. Paper presented at the 29th regional meeting of the Chicago Linguistics Society.
Doyle, J. (1979) A truth maintenance system. Artificial Intelligence 12: 231-272.
Dummett, M. (1973) The justification of deduction. Proceedings of the British Academy 59: 201-231.
Dummett, M. (1975) The philosophical basis of intuitionistic logic. In H. E. Rose and J. Shepherdson (eds.), Logic Colloquium '73. North-Holland.
Dummett, M. (1977) Elements of Intuitionism. Clarendon.

Erickson, J. R. (1974) A set analysis theory of behavior in formal syllogistic reasoning tasks. In R. L. Solso (ed.), Theories in Cognitive Psychology. Erlbaum.
Erickson, J. R. (1978) Research on syllogistic reasoning. In R. Revlin and R. E. Mayer (eds.), Human Reasoning. Winston.

Ericsson, K. A., and Simon, H. A. (1984) Protocol Analysis: Verbal Reports as Data. MIT Press.
Evans, J. St. B. T. (1977) Linguistic factors in reasoning. Quarterly Journal of Experimental Psychology 29: 297-306.
Evans, J. St. B. T. (1982) The Psychology of Deductive Reasoning. Routledge & Kegan Paul.
Evans, J. St. B. T. (1989) Bias in Human Reasoning. Erlbaum.
Evans, J. St. B. T., and Lynch, J. S. (1973) Matching bias in the selection task. British Journal of Psychology 64: 391-397.
Evans, J. St. B. T., Barston, J. L., and Pollard, P. (1983) On the conflict between logic and belief in syllogistic reasoning. Memory and Cognition 11: 295-306.
Fillenbaum, S. (1975) If: Some uses. Psychological Research 37: 245-250.
Fillenbaum, S. (1977) Mind your p's and q's: The role of content and context in some uses of and, or, and if. In G. H. Bower (ed.), Psychology of Learning and Motivation, volume 11. Academic Press.
Fine, K. (1985a) Natural deduction and arbitrary objects. Journal of Philosophical Logic 14: 57-107.
Fine, K. (1985b) Reasoning with Arbitrary Objects. Blackwell.
Fitch, F. B. (1952) Symbolic Logic: An Introduction. Ronald.



Fitch, F. B. (1966) Natural deduction rules for obligation. American Philosophical Quarterly 3: 27-38.

Fitch, F. B. (1973) Natural deduction rules for English. Philosophical Studies 24: 89-104.
Fitting, M. (1983) Proof Methods for Modal and Intuitionistic Logics. Reidel.
Fodor, J. A. (1975) The Language of Thought. Crowell.
Fodor, J. A. (1981) Representations. MIT Press.
Fodor, J. A. (1983) The Modularity of Mind. MIT Press.
Fodor, J. A. (1985) Fodor's guide to mental representation: The intelligent auntie's vade-mecum. Mind 94: 76-100.
Fodor, J. A. (1987) A situated grandmother? Mind and Language 2: 64-81.
Fodor, J. A., and Pylyshyn, Z. (1988) Connectionism and cognitive architecture. Cognition 28: 3-71.
Føllesdal, D., and Hilpinen, R. (1971) Deontic logic: An introduction. In R. Hilpinen (ed.), Deontic Logic: Introductory and Systematic Readings. Reidel.
Fong, G. T., Krantz, D. H., and Nisbett, R. E. (1986) The effects of statistical training on thinking about everyday problems. Cognitive Psychology 18: 253-292.
Forbus, K. D., and de Kleer, J. (1993) Building Problem Solvers. MIT Press.
Ford, M. (1985) Review of Mental Models. Language 61: 897-903.
Frege, G. (1964) The Basic Laws of Arithmetic. University of California Press. (Original work published 1893.)
Frege, G. (1977) Thoughts. In P. T. Geach (ed.), Logical Investigations. Yale University Press. (Original work published 1918.)
Galotti, K. M. (1989) Approaches to studying formal and everyday reasoning. Psychological Bulletin 105: 331-351.
Gardner, M. (1958) Logic Machines and Diagrams. McGraw-Hill.
Garey, M. R., and Johnson, D. S. (1979) Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman.
Gazdar, G. (1979) Pragmatics: Implicature, Presupposition, and Logical Form. Academic Press.

Geis, M. L., and Zwicky, A. M. (1971) On invited inferences. Linguistic Inquiry 2: 561-566.
Gelman, S. A., and Markman, E. (1986) Categories and induction in young children. Cognition 23: 183-209.
Genesereth, M. R., and Nilsson, N. J. (1987) Logical Foundations of Artificial Intelligence. Morgan Kaufmann.
Gentner, D. (1983) Structure-mapping: A theoretical framework for analogy. Cognitive Science 7: 155-170.
Gentner, D., and Stevens, A. L. (1983) Mental Models. Erlbaum.
Gentzen, G. (1969) Investigations into logical deduction. In M. E. Szabo (ed.), The Collected Papers of Gerhard Gentzen. (Originally published as Untersuchungen über das logische Schliessen, Mathematische Zeitschrift 39 (1935): 176-210, 405-431.)
Gick, M., and Holyoak, K. J. (1983) Schema induction and analogical transfer. Cognitive Psychology 15: 1-38.
Gigerenzer, G., and Hug, K. (1992) Domain specific reasoning: Social contracts, cheating, and perspective change. Cognition 43: 127-171.
Gilhooly, K. J., Logie, R. H., Wetherick, N. E., and Wynn, V. (1993) Working memory and strategies in syllogistic-reasoning task. Memory and Cognition 21: 115-124.



Ginsberg, M. L. (1987) Introduction. In M. L. Ginsberg (ed.), Readings in Nonmonotonic Reasoning. Morgan Kaufmann.
Goldman, A. (1986) Epistemology and Cognition. Harvard University Press.
Goodman, N. (1965) Fact, Fiction, and Forecast. Second edition. Bobbs-Merrill.
Grandy, R. E. (1977) Advanced Logic for Applications. Reidel.
Green, C. (1969) Theorem proving by resolution as a basis for question-answering systems. In D. Michie and B. Meltzer (eds.), Machine Intelligence 4. Edinburgh University Press.
Green, C. (1980) The application of theorem-proving to question-answering systems. Garland. (Doctoral dissertation, Stanford University, 1969.)
Green, R. F., Guilford, J. P., Christensen, P. R., and Comrey, A. L. (1953) A factor-analytic study of reasoning abilities. Psychometrika 18: 135-180.
Greene, S. B. (1992) Multiple explanations for multiply-quantified sentences: Are multiple models necessary? Psychological Review 99: 184-187.
Grice, H. P. (1977) "Some Aspects of Reason." Immanuel Kant Lectures, Stanford University.
Grice, H. P. (1978) Studies in the Way of Words. Harvard University Press.
Griggs, R. A. (1983) The role of problem content in the selection task and THOG problem. In J. St. B. T. Evans (ed.), Thinking and Reasoning: Psychological Approaches. Routledge.
Griggs, R. A., and Cox, J. R. (1982) The elusive thematic-materials effect in Wason's selection task. British Journal of Psychology 73: 407-420.
Guttenplan, S. D., and Tamny, M. (1971) Logic: A Comprehensive Introduction. Basic Books.
Guyote, M. J., and Sternberg, R. J. (1981) A transitive-chain theory of syllogistic reasoning. Cognitive Psychology 13: 461-525.
Haack, S. (1974) Deviant Logic. Cambridge University Press.
Haack, S. (1976) The justification of deduction. Mind 85: 112-119.
Halvorsen, P.-K. (1983) Semantics for lexical-functional grammar. Linguistic Inquiry 14: 567-615.
Hamill, J. F. (1990) Ethno-Logic: The Anthropology of Human Reasoning. University of Illinois Press.
Harman, G. (1986) Change in View: Principles of Reasoning. MIT Press.
Harman, H. H. (1976) Modern Factor Analysis. Third edition. University of Chicago Press.
Hastie, R., Schroeder, C., and Weber, R. (1990) Creating complex social conjunction categories from simple categories. Bulletin of the Psychonomic Society 28: 242-247.
Haviland, S. E. (1974) Nondeductive Strategies in Reasoning. Doctoral dissertation, Stanford University.
Hayes, J. R. (1965) Problem typology and the solution process. Journal of Verbal Learning and Verbal Behavior 4: 371-379.
Hayes, J. R. (1966) Memory, goals, and problem solving. In B. Kleinmuntz (ed.), Problem Solving: Research, Method, and Theory. Wiley.
Hayes, P. J. (1979) The logic of frames. In D. Metzing (ed.), Frame Conceptions and Text Understanding. Walter de Gruyter.
Hayes, P. J. (1985) The second naive physics manifesto. In J. R. Hobbs and R. C. Moore (eds.), Formal Theories of the Commonsense World. Ablex.
Hayes, P. J. (1987) A critique of pure treason. Computational Intelligence 3: 179-185.

Hayes-Roth, B., and Hayes-Roth, F. (1975) Plasticity in memorial networks. Journal of Verbal Learning and Verbal Behavior 14: 506-522.



Henle, M. (1962) On the relation between logic and thinking. Psychological Review 69: 366-378.
Henle, M. (1978) Foreword. In R. Revlin and R. E. Mayer (eds.), Human Reasoning. Winston.
Hewitt, C. (1969) PLANNER: A language for proving theorems in robots. In Proceedings of the International Joint Conference on Artificial Intelligence.
Heyting, A. (1956) Intuitionism: An Introduction. North-Holland.
Higginbotham, J. (1987) On semantics. In E. le Pore (ed.), New Directions in Semantics. Academic Press.
Hintikka, J. (1974) Quantifiers vs. quantification theory. Linguistic Inquiry 5: 153-177.
Hintzman, D. L. (1986) "Schema abstraction" in a multiple-trace memory model. Psychological Review 93: 411-428.
Hitch, G. J., and Baddeley, A. D. (1976) Verbal reasoning and working memory. Quarterly Journal of Experimental Psychology 28: 603-621.
Holland, J. H., Holyoak, K. J., Nisbett, R. E., and Thagard, P. R. (1986) Induction: Processes of Inference, Learning, and Discovery. MIT Press.
Horn, L. R. (1973) Greek Grice: A brief survey of proto-conversational rules in the history of logic. In C. Corum, T. C. Smith-Stark, and A. Weiser (eds.), Papers from the Ninth Regional Meeting of the Chicago Linguistic Society. Chicago Linguistic Society.
Horn, L. R. (1989) A Natural History of Negation. University of Chicago Press.
Huttenlocher, J. (1968) Constructing spatial images: A strategy in reasoning. Psychological Review 75: 550-560.
Israel, D. J. (1980) What's wrong with non-monotonic logics? In Proceedings of the First Annual National Conference on Artificial Intelligence. Morgan Kaufmann.
Jackson, S. L., and Griggs, R. A. (1990) The elusive pragmatic reasoning schemas effect. Quarterly Journal of Experimental Psychology 42A: 353-373.

James, W. (1890) Principles of Psychology, volume 2. Dover.
Janis, I. L., and Frick, F. (1943) The relationship between attitudes toward conclusions and errors in judging logical validity of syllogisms. Journal of Experimental Psychology 33: 73-77.
Jaskowski, S. (1934) On the rules of suppositions in formal logic. Studia Logica 1: 5-32.
Jeffrey, R. C. (1967) Formal Logic: Its Scope and Limits. McGraw-Hill.
Johnson-Laird, P. N. (1975) Models of deduction. In R. J. Falmagne (ed.), Reasoning: Representation and Process in Children and Adults. Erlbaum.
Johnson-Laird, P. N. (1983) Mental Models. Harvard University Press.
Johnson-Laird, P. N. (1989) Mental models. In M. I. Posner (ed.), Foundations of Cognitive Science. MIT Press.
Johnson-Laird, P. N., and Bara, B. G. (1984a) Syllogistic inference. Cognition 16: 1-61.
Johnson-Laird, P. N., and Bara, B. G. (1984b) Logical expertise as a cause of error. Cognition 17: 183-184.
Johnson-Laird, P. N., and Byrne, R. M. J. (1991) Deduction. Erlbaum.
Johnson-Laird, P. N., Byrne, R. M. J., and Schaeken, W. (1992) Propositional reasoning by model. Psychological Review 99: 418-439.
Johnson-Laird, P. N., Byrne, R. M. J., and Tabossi, P. (1989) Reasoning by model: The case of multiple quantification. Psychological Review 96: 658-673.
Johnson-Laird, P. N., and Steedman, M. J. (1978) The psychology of syllogisms. Cognitive Psychology 10: 64-99.



Johnson-Laird, P. N., and Tagart, J. (1969) How implication is understood. American Journal of Psychology 82: 367-373.
Johnson-Laird, P. N., and Wason, P. C. (1970) Insight into a logical relation. Quarterly Journal of Experimental Psychology 22: 49-61.
Just, M. A., and Carpenter, P. A. (1987) The Psychology of Reading and Language Comprehension. Allyn and Bacon.
Kahneman, D., Slovic, P., and Tversky, A. (eds.) (1982) Judgment under Uncertainty: Heuristics and Biases. Cambridge University Press.
Kahneman, D., and Varey, C. A. (1990 and counterfactuals : The loser