NATURAL LANGUAGE UNDERSTANDING

JAMES ALLEN
UNIVERSITY OF ROCHESTER

The Benjamin/Cummings Publishing Company, Inc.
Menlo Park, California • Reading, Massachusetts
Don Mills, Ontario • Wokingham, U.K. • Amsterdam
Sydney • Singapore • Tokyo • Madrid • Bogota
Santiago • San Juan

Petitioner Apple Inc. - Exhibit 1014, p. 2
Preface

I started work on this book over four years ago for a course I teach at the University of Rochester on natural language understanding. At that time, there was no source for such a course except for collections of research papers, with each paper using a different notation. So much time was spent learning the notations that there was little time to appreciate the general principles underlying the field. Today this situation has changed little. While books on natural language understanding are now available, they either cover only one aspect of the area (say, syntactic processing) or they cover the entire field but not in enough depth.
As a result the primary goal in writing this book was to produce a comprehensive, in-depth description of the entire area of natural language understanding. To do this in a single book, I had to eliminate all the complexity not inherent in the process of understanding language. Most of all, the number of different notations introduced had to be kept to a minimum. This was accomplished by developing adaptations of a few select representations in the area that then could be used to present the range of problems and solutions from a wide range of sources. In addition, I made a deliberate effort not to assume a great deal of programming expertise. The essence of most algorithms in the literature is fairly simple and becomes complex only when the details of a particular implementation are considered. As a result, any reader having some familiarity with the basic notions of programming can obtain an understanding of the fundamental techniques without becoming lost in programming details. There is enough detail in this book, however, that a sophisticated programmer can take the abstract algorithms and produce useful working systems.
The book is intended both as a textbook and as a general reference for researchers in artificial intelligence and other areas related to language. As a text, it is suitable for an advanced undergraduate or beginning graduate level course in computer science or a graduate level course in linguistics. As a reference text, it is suitable for many advanced courses concerning language, for any research and development projects involving natural language processing, or for any individual interested in learning more about the area.
The Organization

Natural language understanding requires background in many different areas--most importantly, programming, linguistics, and logic. Very few people have all this background, however, so the book introduces whatever material is needed. In particular, there is an appendix on the first-order predicate calculus and another on basic programming techniques. Background material from linguistics is introduced as needed throughout the book. This should make the material accessible to all. The more background a reader has, however, the more he or she should be able to get from the book.
The book is organized by problem area, rather than by technique. As a result, much of the duplication found in a set of papers on a particular area is eliminated. It also makes the comparison of different techniques much easier. For example, rather than have a chapter on augmented transition networks, then another on context-free grammars, and another on logic-based grammars, the book is organized by linguistic problem. There is a chapter on basic context-free parsing that shows that the parsing techniques underlying all three of these representations are essentially the same. Then there is a chapter on simple augmented systems, again showing the similarities between each of the approaches. Finally, there is a chapter on how long-distance dependencies are handled.
While three different representations are presented for syntactic processing, the remainder of the book develops a single set of representations and translates all work in the area into these formalisms. In particular, the representations used are based on semantic networks, logical form, Horn clause logic systems, and frame-based systems.
How to Use This Book

To cover the entire book in detail would require a two-semester sequence in natural language processing. However, the book is organized so that it is easy to custom tailor a one-semester course concentrating on the topics that the instructor wants to cover. In particular, it is divided into three main parts: syntactic processing, semantic processing, and the use of context and general world knowledge. Each of these parts contains some introductory chapters that provide enough background that the material in the following parts can be understood. I have taught two different courses from the manuscript. One was a syntax and semantics course, with an emphasis on syntax, which covered Chapters 1-6 and the basic issues in semantics, consisting of Chapters 7 and 8 and selections from Chapter 9. Another was a course emphasizing the contextual aspects of language, which included Chapter 1, Chapters 7 and 8, and then Chapters 11-16 on world knowledge and discourse structure. Many other organizations are possible, either as a first course in the area or as a second--say, following a course that covers mainly syntactic aspects. The following chart outlines the main dependencies between the chapters.
[Chart: the main dependencies among the chapters of Part I (Syntax), Part II (Semantics), and Part III (Context and World Knowledge).]
Part IV, Response Generation, contains two chapters. Chapter 16 examines question-answering techniques for database queries and relies only on the work in Part II. It includes two case studies of actual natural language interfaces to databases. Chapter 17 examines the issues in natural language generation and draws on the basic material from all of the first three parts.
References

The book contains extensive pointers into the literature so that the particular details of any one approach can easily be found. Each chapter also contains boxes, which contain optional material examining various issues in detail. Some boxes give additional background material from linguistics or philosophy, while other boxes examine some particular computational work that is related to the chapter. With the references the emphasis has been to identify the most readily available sources rather than the original papers in which ideas appeared. Thus I've cited journal articles, books, and papers in the major conferences in AI. I have tried to avoid technical reports and unpublished notes that are hard to find ten years after the fact. In general, most work in the area is published in technical reports several years before appearing elsewhere, so the dates of the references are often not a good indicator of when the work was actually done. While I've tried to be comprehensive and give credit for all the ideas in the book, I am sure that I've omitted key papers that I will regret immensely when I remember them. To these authors, please accept my apologies.
Acknowledgements

I first want to thank the Computer Science Department at the University of Rochester for their support over the last four years. I was given without question all the time and extra resources I needed to complete this project. I also want to thank Peggy Meeker for the preparation of many different drafts that were actually rewrites and her endless patience and assistance during the final revision and preparation of camera-ready copy. I do not have much of an eye for consistency of notation and style, and much of the continuity of this book is due to her corrections. Thanks also to Gail Cassell for her help in preparing the final draft.

Parts of this book would not have been possible without the support of the Office of Naval Research and the National Science Foundation, who have supported my natural language research for the last eight years. Also, the production of the book itself was greatly aided by the Xerox Corporation, who provided the facilities for producing the camera-ready copy through their University Grants Program.
I also thank the reviewers of the earlier drafts who provided many excellent comments on the presentation and content of the book. Eugene Charniak, Michal Ephratt, Ray Perrault, Bonnie Webber, Natalie Dehn, and Elaine Rich all made specific comments that helped the overall organization of the book, and Glenn Blank, Robin Clark, David Evans, Jim Hendler, Graeme Hirst, and Mike Swain all made detailed comments chapter by chapter that contributed much to the book's overall coherence and organization. This level of feedback would not have been manageable without my editor, Alan Apt, who actively participated in the reviewing process and who provided me with much valuable feedback and support during the final revisions.

Finally, I want to thank my wife, Judith Hook, and Jeffrey, Daniel, and Michael for keeping life interesting when I was trying my antisocial, tedious best to be totally obsessed with the task of completing this book.

Rochester, New York
July 1987
Brief Contents

Chapter 1   Introduction to Natural Language Understanding  1

PART I  Syntactic Processing
Chapter 2   Linguistic Background: An Outline of English  24
Chapter 3   Basic Parsing Techniques  40
Chapter 4   Features and Augmented Grammars  80
Chapter 5   Grammars for Natural Language: Handling Movement  122
Chapter 6   Toward Deterministic Parsing  160

PART II  Semantic Interpretation
Chapter 7   Semantics and a Logical Form  192
Chapter 8   Semantic Interpretation  222
Chapter 9   Strategies for Semantic Interpretation  244
Chapter 10  Issues in Semantic Interpretation  282

PART III  Context and World Knowledge
Chapter 11  Knowledge Representation  314
Chapter 12  Reference  334
Chapter 13  Using World Knowledge About Actions  366
Chapter 14  Discourse Structure  396
Chapter 15  Belief Models and Speech Acts  432

PART IV  Response Generation
Chapter 16  Question-Answering Systems  468
Chapter 17  Natural Language Generation  490

Appendix A  Logic and Rules of Inference  514
Appendix B  Symbolic Computation  530

Bibliography  544
Index  560
Contents

Chapter 1  Introduction to Natural Language Understanding  1
1.1  What is Natural Language Understanding?  1
1.2  Evaluating Computational Models of Language  2
1.3  Knowledge and Language  6
1.4  Representations and Ambiguity  8
1.5  A Warning about Names in Representations  16
1.6  The Organization of Actual Systems  17

PART I  SYNTACTIC PROCESSING  23

Chapter 2  Linguistic Background: An Outline of English  24
2.1  Words  25
2.2  The Elements of Simple Noun Phrases  26
2.3  The Elements of Simple Sentences  30
2.4  Prepositional Phrases  34
2.5  Embedded Sentences  35
2.6  Complements  36
2.7  Adjective Phrases  37

Chapter 3  Basic Parsing Techniques  40
3.1  Grammars and Sentence Structure  41
3.2  What Makes a Good Grammar  47
3.3  Top-Down Parsing Methods  50
3.4  Bottom-Up Parsing Methods  60
3.5  Mixed-Mode Methods  65

Chapter 4  Features and Augmented Grammars  80
4.1  Augmented Transition Networks  81
4.2  Useful Feature Systems  89
4.3  A Sample ATN Grammar for Assertions  95
4.4  Verb Complements and Presetting Registers  99
4.5  Augmenting Chart Parsers  102
4.6  Augmenting Logic Grammars  106
4.7  Generalized Feature Manipulation  110

Chapter 5  Grammars for Natural Language: Handling Movement  122
5.1  Local Movement  124
5.2  Wh-Questions and the Hold Mechanism  130
5.3  Relative Clauses  136
5.4  Using a Hold List in the Mixed-Mode Parser  139
5.5  Handling Movement in Logic Grammars  143
5.6  Slashed Categories: An Alternative to Hold Lists  145
5.7  A Comparison of the Methods Using Constraints  147

Chapter 6  Toward Deterministic Parsing  160
6.1  Human Preferences in Parsing  161
6.2  Shift-Reduce Parsers  166
6.3  Shift-Reduce Parsers and Ambiguity  171
6.4  Lookahead in Parsers  176
6.5  The Marcus Parser  178

PART II  SEMANTIC INTERPRETATION  191

Chapter 7  Semantics and a Logical Form  192
7.1  Why Derive a Logical Form?  193
7.2  Types and Features  194
7.3  Selectional Restrictions  197
7.4  Case Relations  198
7.5  The Structure of Verbs  204
7.6  Semantic Networks  206
7.7  The Logical Form  212

Chapter 8  Semantic Interpretation  222
8.1  The Basic Operations for Semantic Interpretation  223
8.2  The Interpretation Algorithm  227
8.3  An Example: Assigning Case Roles  230
8.4  Embedded and Nonembedded Sentences  234
8.5  Rule Hierarchies  236

Chapter 9  Strategies for Semantic Interpretation  244
9.1  A Sample Domain  245
9.2  Semantic Grammars  250
9.3  A Simple Interleaved Syntactic and Semantic Analyzer  253
9.4  Semantic Interpretation Based on Preferences  256
9.5  Rule-by-Rule Semantic Interpretation Based on the λ-Calculus  260
9.6  Rule-by-Rule Interpretation Using Variables  269
9.7  Semantically Directed Parsing Techniques  270

Chapter 10  Issues in Semantic Interpretation  282
10.1  Scoping Phenomena  283
10.2  Modifiers and Noun Phrases  290
10.3  Adjective Phrases  294
10.4  Noun-Noun Modifiers  300
10.5  Lexical Ambiguity  301
10.6  Tense and Aspect  306

PART III  CONTEXT AND WORLD KNOWLEDGE  313

Chapter 11  Knowledge Representation  314
11.1  The Major Issues in Knowledge Representation  315
11.2  Logic-Based Representations  320
11.3  Frame-Based Systems  321
11.4  Representing Actions  324

Chapter 12  Reference  334
12.1  Simple Reference  336
12.2  Simple Anaphoric Reference  339
12.3  Extending the History List Mechanism  345
12.4  One-Anaphora and VP-Anaphora  351
12.5  Other Problems in Reference  355
12.6  Ellipsis  357

Chapter 13  Using World Knowledge About Actions  366
13.1  Mapping to the Final Representation  367
13.2  Understanding Stereotypical Courses of Action: Scripts  369
13.3  Plan-Based Analysis  378
13.4  Linguistic Structures and Plan Tracking  385
13.5  Complex Goals and Plans  389

Chapter 14  Discourse Structure  396
14.1  Segmentation  398
14.2  The Attentional State and Cue Phrases  401
14.3  The Tense Filter  404
14.4  The Reference Filter  406
14.5  Analyzing a Story Using Cue Phrases, Tense, and Reference  411
14.6  The Plan-Based Filter  416
14.7  Explicit Discourse Relations  419

Chapter 15  Belief Models and Speech Acts  432
15.1  Beliefs, Speech Acts, and Language  433
15.2  Simple Belief Models  435
15.3  Speech Acts and Plans  443
15.4  Speech Act Theory and Indirect Speech Acts  450
15.5  Speech Acts and Discourse Structure  454
15.6  Formal Models of Belief and Knowledge  457

PART IV  RESPONSE GENERATION  467

Chapter 16  Question-Answering Systems  468
16.1  Procedural Semantics  469
16.2  LUNAR: A Case Study  475
16.3  Logic-Based Question Answering  478
16.4  Providing Useful Answers  483
16.5  Responses to Questions in More Complex Domains  485

Chapter 17  Natural Language Generation  490
17.1  Deciding What to Say  492
17.2  Selecting Words and Phrases  496
17.3  Producing Sentences Directly from a Representation  500
17.4  Grammars and Generation  504

APPENDIX A  Logic and Rules of Inference  514
A.1  Logic and Natural Language  515
A.2  Semantics  521
A.3  A Semantics for FOPC: Set-Theoretic Models  525

APPENDIX B  Symbolic Computation  530
B.1  Symbolic Data Structures  531
B.2  Matching  535
B.3  Horn Clause Theorem Proving  536
B.4  The Unification Algorithm  539

BIBLIOGRAPHY  544
INDEX  560
Chapter 1

Introduction to Natural Language Understanding

1.1  What is Natural Language Understanding?
1.2  Evaluating Computational Models of Language
1.3  Knowledge and Language
1.4  Representations and Ambiguity
       Syntax: Representing Sentence Structure
       The Logical Form
       The Final Meaning Representation
1.5  A Warning about Names in Representations
1.6  The Organization of Actual Systems

1.1 What is Natural Language Understanding?

This book describes the basic techniques that are used in building computer models of natural language production and comprehension. In particular, the book describes the work in the interdisciplinary field called computational linguistics that arises from research in artificial intelligence (AI). There are two primary motivations for this type of research. First, the technological motivation is to build intelligent computer systems, such as natural language interfaces to databases, automatic machine-translation systems, text analysis systems, speech understanding systems, or computer-aided instruction systems. Second, the linguistic, or cognitive science, motivation is to gain a better understanding of how humans communicate by using natural language. This second motivation is not unique to computational linguistics, but is shared with theoretical linguistics and psycholinguistics.
The tools that the work in computational linguistics uses are those of artificial intelligence: algorithms, data structures, formal models for representing knowledge, models of reasoning processes, and so on. The goal of this computational approach is to specify a theory of language comprehension and production to such a level of detail that a person could write a computer program that can understand and produce natural language. Typical subareas of the field include the specification of parsing algorithms and the study of their computational properties, the construction of knowledge representation formalisms that can support the semantic analysis of sentences, and the modeling of reasoning processes that account for the way that context affects the interpretation of sentences.
While the goals of the computational approach overlap those of theoretical linguistics and psycholinguistics, the tools that are used in each of these fields differ markedly. Before you examine the computational methods, consider briefly the related disciplines.
Theoretical linguists are primarily interested in producing a structural description of natural language. They usually do not consider the details of the way that actual sentences are processed (the parsing process) or the way that actual sentences might be generated from structural descriptions. A major constraint on linguistic theories is that the theories should hold true in general across different languages; thus, theoretical linguists attempt to characterize the general organizing principles that underlie all human languages, and spend less time and effort examining any particular language. The goal of theoretical linguists is a formal specification of linguistic structure, both in the form of constructive rules that define the range of possible structures and in the form of constraints on the possible allowable structures.

Psycholinguists, like the computational linguists, are interested in the way that people actually produce and comprehend natural language. A linguistic theory is only useful to the extent that it explains actual behavior. As a result, psycholinguists are interested in both the representations of linguistic structures and the processes by which a person can produce such structures from actual sentences. The primary tool that is used is experimentation--that is, actual measurements made on people as they produce and understand language, including how much time a person needs to read each word in a sentence, how much time a person needs to decide whether a given item is a legal word or not, what types of errors people make as they perform various linguistic tasks, and so on. Experimental data is used to attempt to validate or reject specific hypotheses about language, which are often taken from the theories that linguists or computational linguists propose.
As mentioned earlier, there can be two underlying motivations for building a computational theory. The technological goal is simply to build better computers, and any solution that works would be acceptable. The cognitive goal is to build a computational analog of the human-language-processing mechanism; such a theory would be acceptable only after it had been verified by experiment. By necessity, this book takes a middle ground between these two criteria. On one hand, this reflects a belief that natural language is so complex that an ad hoc approach that is not supported by a well-specified underlying theory is not possible. Thus, the technological goal cannot be realized without using sophisticated underlying theories that are on the level being developed by theoretical linguists. On the other hand, the present state of knowledge about natural language processing is so preliminary that attempting to build a cognitively correct model is not feasible. Before researchers can begin a project to build such a model, there would have to be simultaneous major advances in both computational linguistics and the experimental techniques used by psycholinguistics. As a result, this book will assume that the current goal of computational linguistics is to produce a comprehensive, computational theory of language understanding and production that is well-defined and linguistically motivated. Constructing such a computational theory would be an important first step in producing a cognitively correct theory, as a computational theory could serve as a starting point for experimental testing and lead to modified theories that account for the experimental evidence.
1.2 Evaluating Computational Models of Language

One obvious method to evaluate a computational theory would be to run the program and see how well it performs. For instance, if the program is meant to answer questions about a database of facts, you might ask it questions to see how good it is at producing the correct answers. If the system is designed to participate in simple conversations on a certain topic, then you might run it and try conversing with it. While ultimately this method of evaluation may be the best test of a theory, at the present stage of the field, it can be misleading. For example, given that most systems only exist in complete working order for a brief time, you usually do not get to interact with the program yourself but must depend on transcripts of others' interactions. These transcripts can easily mislead you: the underlying theory of a program that seems to produce impressive results may prove to be much less interesting and important than another system that seems limited in its abilities. You can attribute much more intelligence to the program than is actually present.

Consider a concrete example--the ELIZA program that was developed in the mid-1960s at MIT. This program must be one of the most popular artificial intelligence programs of all time, and versions of it exist for most machines, including most micros. ELIZA was never claimed to embody a theory of language comprehension and production, and yet its behavior seems impressive. The system plays the role of a therapist and, to obtain the best results, the user should correspondingly play the role of a patient. Figure 1.1 presents a transcript of this system in operation.

User:  Men are all alike.
ELIZA: In what way?
User:  They're always bugging us about something or other.
ELIZA: Can you think of a specific example?
User:  Well, my boyfriend made me come here.
ELIZA: Your boyfriend made you come here.
User:  He says I'm depressed much of the time.
ELIZA: I am sorry to hear you are depressed.
User:  It's true, I am unhappy.
ELIZA: Do you think coming here will help you not to be unhappy?
User:  I need some help; that much seems certain.
ELIZA: What would it mean to you if you got some help?
User:  Perhaps I could learn to get along with my mother.
ELIZA: Tell me more about your family.
User:  My mother takes care of me.
ELIZA: Who else in your family takes care of you?
User:  My father.
ELIZA: Your father.
User:  You are like my father in some ways.
ELIZA: What resemblance do you see?

Figure 1.1 A dialogue with ELIZA

Here is a simple description of the basic mechanism of ELIZA. There is a database of particular words that are called keywords. For each keyword, the system stores an integer, a pattern to match against the input, and a specification of the output. The algorithm is as follows: given a sentence S, find a keyword in S whose pattern matches S. If there is more than one keyword, pick the one with the highest integer value. Use the output specification that is associated with this keyword to generate the next sentence. If there are no keywords, generate an innocuous continuation statement, such as Tell me more or Go on.
Figure 1.2 shows a fragment of a database of keywords. In this database, a pattern consists of words and variables. The prefix "?" before a letter indicates a variable, which can match any sequence of words. For example, the pattern

?X are you ?Y

would match the sentence Why are you looking at me?, where the variable ?X matches Why and ?Y matches looking at me. The output specification may also use the same variables. In this case, ELIZA inserts the words that match the variables in the input into the output after making some minor changes in the pronouns (for example, ELIZA would replace me with you). Thus, for the pattern given earlier, the output specification

Would you prefer it if I weren't ?Y?

would generate a response Would you prefer it if I weren't looking at you? When the database lists multiple output specifications for a given pattern, ELIZA selects a different one each time that a keyword rule is used, thereby preventing unnatural repetition in the conversation. Using these rules, you can see the way that ELIZA generated the first two exchanges in the conversation in Figure 1.1. ELIZA generated the first response from the first output of the keyword alike, while generating the second response from the first output of the keyword always.

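The mechanism just described is simple enough to sketch directly. The following program is a modern Python sketch of my own, not the original ELIZA: the keywords, ranks, patterns, and outputs follow Figure 1.2, but the class and function names, the fallback reply, and the small pronoun table are illustrative assumptions. It implements rank-based keyword selection, pattern matching with ?-variables, rotation among a rule's outputs, and the minor pronoun changes.

```python
import re

# Keyword database in the style of Figure 1.2.
RULES = [
    # (keyword, rank, pattern, output specifications)
    ("alike", 10, "?X", ["In what way?",
                         "What resemblance do you see?"]),
    ("are", 3, "?X are you ?Y", ["Would you prefer it if I weren't ?Y?"]),
    ("are", 3, "?X are ?Y", ["What if they were not ?Y?"]),
    ("always", 5, "?X", ["Can you think of a specific example?",
                         "When?",
                         "Really, always?"]),
    ("what", 2, "?X", ["Why do you ask?",
                       "Does that interest you?"]),
]

# Minor pronoun changes made when matched words are copied into the output.
PRONOUNS = {"i": "you", "me": "you", "my": "your",
            "am": "are", "you": "I", "your": "my"}

def compile_pattern(pattern):
    """Turn a pattern such as '?X are you ?Y' into a regular expression
    in which each ?-variable matches any sequence of words."""
    parts = [r"(?P<%s>.*)" % tok[1:] if tok.startswith("?") else re.escape(tok)
             for tok in pattern.split()]
    return re.compile(r"^\s*" + r"\s*".join(parts) + r"\s*$", re.IGNORECASE)

def swap_pronouns(text):
    return " ".join(PRONOUNS.get(word.lower(), word) for word in text.split())

class Eliza:
    def __init__(self, rules=RULES):
        self.rules = rules
        self.uses = [0] * len(rules)  # rotate among each rule's outputs

    def respond(self, sentence):
        sentence = sentence.strip().rstrip(" .!?")
        words = sentence.lower().split()
        # Collect the rules whose keyword occurs in the sentence and
        # whose pattern matches the sentence.
        candidates = []
        for i, (keyword, rank, pattern, outputs) in enumerate(self.rules):
            m = compile_pattern(pattern).match(sentence)
            if keyword in words and m:
                candidates.append((rank, i, m, outputs))
        if not candidates:
            return "Tell me more."  # innocuous continuation statement
        # Pick the matching keyword with the highest rank.
        rank, i, m, outputs = max(candidates, key=lambda c: c[0])
        reply = outputs[self.uses[i] % len(outputs)]
        self.uses[i] += 1
        # Insert the pronoun-swapped matched words into the output.
        for var, value in m.groupdict().items():
            reply = reply.replace("?" + var, swap_pronouns(value))
        return reply
```

With these rules the sketch reproduces the opening exchanges of Figure 1.1: Men are all alike. selects the rank-10 keyword alike and yields In what way?, while Why are you looking at me? matches the pattern ?X are you ?Y and yields Would you prefer it if I weren't looking at you?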
This description covers all of the essential points of the program! You will probably agree that this program does not understand the conversation that it is participating in. Rather, it is a simple collection of tricks. Given this, why does ELIZA appear to function so well? There are several reasons. Perhaps the most important reason is that, when people hear or read a sequence of words that they understand as a sentence, they attribute meaning to the sentence and assume that the person (or machine) that produced the sentence actually intended that meaning. People are extremely good at distinguishing word meanings and interpreting sentences to fit the context. As a result, ELIZA appears to be intelligent because you use your own intelligence to make sense of what it says.

Word     Rank   Pattern          Outputs

alike    10     ?X               In what way?
                                 What resemblance do you see?

are      3      ?X are you ?Y    Would you prefer it if I weren't ?Y?

are      3      ?X are ?Y        What if they were not ?Y?

always   5      ?X               Can you think of a specific example?
                                 When?
                                 Really, always?

what     2      ?X               Why do you ask?
                                 Does that interest you?

Figure 1.2 Sample data from ELIZA

There are other crucial characteristics of the conversational setting that also aid in sustaining the illusion of intelligence. For instance, the system does not need any world knowledge because it never has to make a claim, support an argument, or answer a question. Rather, ELIZA simply asks a series of questions. Except in a patient-therapist situation, this characteristic would be unacceptable. ELIZA evades all direct questions by responding with another question, such as Why do you ask? Thus, there is no way to force the program to say something concrete about any topic.

However, even in such a restricted situation, it is relatively easy to show that the program does not understand. For example, since ELIZA has no knowledge about the structure of language, it accepts gibberish just as readily as valid sentences. Thus, if you entered Green the adzabak are the a ran four, ELIZA would respond with something like What if they were not the a ran four? Furthermore, as a conversation progresses, it becomes obvious that the program does not retain any of the content in the conversation. It begins to ask questions that are inappropriate in light of the earlier exchanges, and its responses in general begin to show a lack of any focus. Of course, if you are not able to play with the program and must depend only on transcripts of conversations by others, you would have no way of detecting these flaws unless they were explicitly mentioned.

Thus, in addition to studying examples of system performance, computational linguistics needs a method to evaluate work. In general, there must be some underlying theory that researchers can describe precisely enough so that they can test new examples against the theory. In addition, generalizations about language in theoretical linguistics should be reflected in the computational theory.

1.3 Knowledge and Language

A language-comprehension program must have considerable knowledge about the structure of the language itself, including what the words are, how to combine the words into sentences, what the words mean, how these word meanings contribute to the sentence meaning, and so on. However, a program cannot completely simulate linguistic behavior without first taking into account an important aspect of what makes humans intelligent--their general world knowledge and their reasoning ability. For example, to answer questions or to participate in a conversation, a person not only must know a lot about the structure of the language being used, but also must know about the world in general and the conversational setting in particular. Thus, a natural language system would need methods of encoding and using this knowledge in ways that will produce the appropriate behavior. Furthermore, the knowledge of the current situation (or context) plays a crucial role in determining how the system interprets a particular sentence. This factor comes so naturally to people that researchers overlook it.

The different forms of knowledge have traditionally been defined as follows:

• Phonetic and phonological knowledge concerns how words are realized as sounds. While this type of knowledge is an important concern for automatic speech-understanding systems, there is not the space to examine these issues in this book.

• Morphological knowledge concerns how words are constructed out of more basic meaning units called morphemes. For example, you can construct the word friendly from a root form friend and the suffix -ly.

• Syntactic knowledge concerns how words can be put together to form sentences that look correct in the language. This form of knowledge identifies how one word relates to another (for example, whether one word modifies another, or is unrelated).

• Semantic knowledge concerns what words mean and how these meanings combine in sentences to form sentence meanings.

• Pragmatic knowledge concerns how sentences are used in different contexts and how context affects the interpretation of the sentence.

• World knowledge includes the general knowledge about the structure of the world that language users must have in order to, for example,