Proceedings of the IJCAI'97 workshop on "Intelligent Multimodal Systems".
August 24th, Nagoya, Japan. http://www.miv.t.u-tokyo.ac.jp/ijcai97-IMS/
Towards "intelligent" cooperation between modalities.
The example of a system enabling multimodal interaction with a map

Jean-Claude MARTIN
LIMSI-CNRS, BP 133, 91403 Orsay Cedex, France
martin@limsi.fr
Abstract

In this paper we propose a coherent approach for studying and implementing multimodal interfaces. This approach is based on six basic "types of cooperation" between modalities: transfer, equivalence, specialization, redundancy, complementarity and concurrency. Definitions and examples of these types of cooperation are given in the paper.
We have used this approach to develop both theoretical tools (a framework and formal notations) and software tools (a language for specifying multimodal input, and a module integrating events detected on several modalities).
These tools have been applied to the development of a prototype enabling a user to interact with a geographic map by combining speech recognition, pointing gestures with a mouse, and a keyboard. We explain the underlying software architecture and give details on how the multimodal module provides "multimodal recognition scores".
Finally, we describe what we believe "intelligent" multimodal systems should be, and how our approach based on the types of cooperation between modalities could be used in this direction.
1. Introduction
2. Theoretical tools
3. The CARTOON prototype
4. The specification language
5. The multimodal module
6. Conclusion and perspectives
7. References
1. Introduction

The development of multimodal systems addresses several issues [Maybury 1994]: content selection ("what to say"), modality allocation ("which modality to say it in"), modality realization ("how to say it in that modality") and modality combination. Our work deals with the "modality combination" issue. A multimodal interface developer has to know how to combine modalities and why this combination may improve the interaction. Although several multimodal interfaces have already been developed [CMC 1995; IMMI 1995], there is still a lack of coherent theoretical and software tools.

In the first part of this paper, we propose a theoretical framework for analyzing modality combinations. The second part details two software tools based on this framework: a specification language and a multimodal module using Guided Propagation Networks. Illustrative examples are taken from a prototype enabling multimodal interrogation of a geographic map developed by [Goncalves et al. 1997].
2. Theoretical tools

A system should use multimodality only if it helps in achieving usability criteria and requirement specifications such as:

- improving recognition in a noisy (audio, visual or tactile) environment,
- enabling fast interaction,
- being intuitive or easy to learn,
- adapting to several environments, users or user behaviors,
- enabling the user to easily link presented information to more global contextual knowledge,
- translating information from one modality to another modality...

These usability criteria may depend on the application to be developed. From a multimodal point of view, they can be seen as "goals of cooperation" between modalities. How can modalities cooperate and be combined to achieve each of these goals? We propose six basic "types of cooperation" between modalities: transfer, specialization, equivalence, redundancy, complementarity and concurrency. In this section, we define each of them and give examples of how they may help in reaching usability criteria (figure 1).
In our definitions, a modality is considered as a process receiving and producing chunks of information. More examples of types of cooperation can be found in [Martin et al. in press].
[Figure 1: a matrix crossing goals of cooperation (rows, including intuitiveness or faster learning, fast interaction, and recognition and understanding) with the six types of cooperation (columns).]

Figure 1. The framework proposed in this paper for studying and designing multimodal interfaces. Six "types of cooperation" between modalities (horizontal axis) may be involved in several "goals of cooperation" (vertical axis). For instance (red box), it has been shown that with redundant displayed text and vocal output, a user learned faster how to use a graphical interface [Wang et al. 1993].
2.1. Equivalence

When several modalities cooperate by equivalence, this means that a chunk of information may be processed, as an alternative, by either of them.

In COMIT, a multimodal interface that we have developed, the user can create a graphical interface (windows, buttons, scrollbars) interactively by combining speech, mouse and keyboard. For instance, the user may either utter or type "create a scrollbar" to create a new scrollbar.

The EDWARD system [Huls and Bos 1995] is applied to hierarchical file system management. It allows the user to choose at any time during the interaction the style that suits best at that moment (mouse or natural language). Experimental tests have shown that subjects tended to choose the mouse for selecting an object with a long name. Yet, when the object was difficult to locate on the screen, subjects preferred typing.

Equivalence also enables adaptation to the user by customization: the user may be allowed to select the modalities he prefers [Hare et al. 1995]. The formation of accurate mental models of a multimodal system seems dependent upon the implementation of such options over which the user has control [Sims and Hedberg 1995].

Thus, equivalence means alternative. It is clear that differences between the modalities, either cognitive or technical, have to be considered.
2.2. Specialization

When modalities cooperate by specialization, this means that a specific kind of information is always processed by the same modality.

Specialization is not always absolute and may be defined more precisely: one should distinguish data-relative specialization and modality-relative specialization. In several systems, sounds are somehow specialized in error notification (forbidden commands are signaled with a beep). It is a modality-relative specialization if sounds are not used to convey any other type of information, and a data-relative specialization if errors only produce sounds and no graphics or text. When there is a one-to-one relation between a set of information and a modality, we will speak of an absolute specialization.

Specialization may help the user to interpret the events produced by the computer (to link them to the global contextual knowledge). This means that the choice of a given modality adds semantic information and hence helps the interpretation process.

When a modality is specialized, this specialization should respect the specificity of the modality, including the information it is good at representing. For instance, in reference interpretation, the designation gesture aims at selecting a specific area and the verbal channel provides a frame for the interpretation of the reference: categorical information, constraints on the number of objects selected [Bellalem and Romary 1995].
In an experimental study [Bressolle et al. 1995] aiming at understanding the cooperative cognitive strategies used by air traffic controllers, non-verbal resources were revealed to be a specific vector of communication for some types of information which are not verbally expressed, such as the urgency of a situation. Intuitive specialization of a modality may go against its technical specificities. In the Wizard of Oz experiment dealing with a tourist application described in [Siroux et al. 1995], despite the low recognition rate of town names, the users did not use the tactile screen to select a town but used speech instead.
2.3. Redundancy

If several modalities cooperate by redundancy, this means that the same information is processed by these modalities.

In COMIT, if the user types "quit" on the keyboard or utters "quit", the system asks for a confirmation. But if the user both types and utters "quit", the system interprets this redundancy so as to avoid a confirmation dialogue, thus enabling a faster interaction by reducing the number of actions the user has to perform.
Regarding intuitiveness, redundancy has been observed in the Wizard of Oz study described in [Siroux et al. 1995]: sometimes the user selected a town both by speech and by a touch on the tactile screen.

Regarding learnability of interfaces, it has been observed that a redundant multimodal output involving both the visual display of a text and speech restitution of the same text enabled faster graphical interface learning [Dowell et al. 1995]. Redundancy between visual and vocal text with verbatim reinforcement was also tested in [Huls and Bos 1995], with natural language descriptions of the objects the user manipulates and the actions he performs. Although speech coerced the subjects into reading the typed descriptions, the subjects made more errors and were slower than with the visual text output only.
2.4. Complementarity

When several modalities cooperate by complementarity, it means that different chunks of information are processed by each modality but have to be merged. The first systems enabling the "put that there" command for the manipulation of graphical objects are described in [Carbonnel 1970; Bolt 1980]. In COMIT, if the user wants to create a radio button, he may type its name on the keyboard and select its position with the mouse. These two chunks of information have to be merged to create the button with the right name at the right position. This complementarity may enable a faster interaction since the two modalities can be used simultaneously and convey shorter messages, which are moreover better recognized than long messages.

In [Huls and Bos 1995], experiments have shown that the use of complementary input, such as "Is this a report?" while pointing at a file, increases with the user's experience.

Complementarity may also improve interpretation, as in [Santana and Pineda 1995] where a graphical output is sufficient for an expert but needs to be completed by a textual output for novice users. An important issue concerning complementarity is the criterion used to merge chunks of information in different modalities. The most classical approaches are to merge them because they are temporally coincident, temporally sequential or spatially linked. Regarding intuitiveness, complementary behaviors were observed in [Siroux et al. 1995]. Two types of behavior featured complementarity. In the "sequential" behavior, which was rare, the user would for example utter "what are the campsites at" and then select a town with the tactile screen. In the "synergistic" behavior, the user would utter "Are there any campsites here?" and select a town with the tactile screen while pronouncing "here". Regarding the output from the computer, it was observed in the experiment described in [Hare et al. 1995] that spatial linking of related information encourages the user's awareness of causal and cognitive links. Yet, when having to retrieve complementary chunks of information from different media, users' behavior tended to be biased towards sequential search, avoiding synergistic use of several modalities.

Modalities cooperating by complementarity may be specialized in different types of information. In the example of a graphical editor, the name of an object may always be specified with speech while its position is specified with the mouse. But modalities cooperating by complementarity may also be equivalent for different types of information. As a matter of fact, the user could also select an object with the mouse and its new position with speech ("in the upper right corner"). Nevertheless, the complementary use of specialized modalities gives the advantages of specialization: speech recognition is improved since the vocabulary and syntax are simpler than a complete linguistic description.
2.5. Transfer

When several modalities cooperate by transfer, this means that a chunk of information produced by one modality is used by another modality.

Transfer is commonly used in hypermedia interfaces, where a mouse click provokes the display of an image. In information retrieval applications, the user may express a request in one modality (speech) and get relevant information in another modality (video) [Foote et al. 1995]. Output information may not only be retrieved but also produced from scratch. Several systems generate graphical descriptions of a scene from a linguistic description [O'Nuallain and Smith 1994]. Natural language instructions can also be used to create animated simulations of virtual human agents carrying out tasks [Webber 1995]. Similarly, the visual description of a scene can be used to generate a linguistic description [Jackendoff 1987] or a multimodal description [André and Rist 1995]. Let's say that all these previous examples involved transfer for a goal of translation.

Transfer may also be involved in other goals such as improving recognition: mouse click detection may be transferred to a speech modality in order to ease the recognition of predictable words (here, that...) as in the GERBAL system [Salisbury et al. 1990].
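One way to read this kind of transfer is that a detected click temporarily re-weights the speech recognizer's word hypotheses towards deictic words. The sketch below only illustrates that idea under an assumed word list and an arbitrary boost factor; it is not the mechanism of the GERBAL system.

```python
from typing import Dict

DEICTIC_WORDS = {"here", "that", "this"}   # hypothetical list of predictable words

def boost_deictics(word_scores: Dict[str, float],
                   click_detected: bool,
                   boost: float = 1.5) -> Dict[str, float]:
    """Transfer a mouse-click cue to the speech modality by raising the
    scores of deictic word hypotheses when a click has just been detected."""
    if not click_detected:
        return dict(word_scores)
    return {w: s * boost if w in DEICTIC_WORDS else s
            for w, s in word_scores.items()}

# A click makes the deictic hypothesis "here" overtake the competing "year".
scores = {"here": 0.40, "year": 0.45}
print(boost_deictics(scores, click_detected=True))   # {'here': 0.6..., 'year': 0.45}
```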
2.6. Concurrency

Finally, when several modalities cooperate by concurrency, it means that different chunks of information are processed by several modalities at the same time but must not be merged. This may enable a faster interaction since several modalities are used in parallel.
2.7. Formal notations

To define these types of cooperation more precisely, we propose logical formal notations. They aim at stating explicitly the parameters of each type of cooperation and the relation between these parameters which is subsumed by the type of cooperation. We consider the case of input modalities (human towards computer). These formal notations have helped us in defining a specification language for implementing multimodal interfaces (next section).

We define a modality as a process receiving and producing chunks of information. A modality M is formally defined by:

- E(M): the set of chunks of information received by M
- S(M): the set of chunks of information produced by M
Two modalities M1 and M2 cooperate by transfer when a chunk of information produced by M1 can be used by M2 after translation by a transfer operator tr, which is a parameter of the cooperation.

transfer(M1, M2, tr):
    tr(S(M1)) ⊆ E(M2)
An input modality M cooperates by specialization with a set of input modalities {Mi} in the production of a set I of chunks of information if M produces I (and only I) and no modality in {Mi} produces I.

specialization(M, I, {Mi}):
    I = S(M) ∧ ∀Mi, I ⊄ S(Mi)
Two input modalities M1 and M2 cooperate by equivalence for the production of a set I of chunks of information when each element i of I can be produced either by M1 or by M2. An operator eq controls which modality will be used and may take into account the user's preferences, environmental features, the information to be transmitted...

equivalence(M1, M2, I, eq):
    ∀i ∈ I, ∃e1 ∈ E(M1), ∃e2 ∈ E(M2), i = eq((M1, e1), (M2, e2))
Two input modalities M1 and M2 cooperate by redundancy for the production of a set I of chunks of information when each element i of I can be produced by an operator re merging a couple (s1, s2) produced respectively by M1 and M2. The operator re will merge (s1, s2) if their redundant attribute has the same value and a criterion crit is true. A chunk of information has several attributes. For instance, a chunk of information sent by a speech recognizer has the following attributes: time of detection, label of the recognized word, recognition score. The redundant attribute of two modalities plays a role in deciding whether two chunks of information produced by these modalities are redundant or complementary.

redundancy(M1, M2, I, redundant_attribute, crit):
    ∀i ∈ I, ∃s1 ∈ S(M1), ∃s2 ∈ S(M2),
    redundant_attribute(s1) = redundant_attribute(s2) ∧
    i = re(s1, s2, crit)
Two input modalities M1 and M2 cooperate by complementarity for the production of a set I of chunks of information when each element i of I can be produced by an operator co merging a couple (s1, s2) produced respectively by M1 and M2. The operator co will merge (s1, s2) if their redundant attribute does not have the same value and a criterion crit is true:
complementarity(M1, M2, I, redundant_attribute, crit):
    ∀i ∈ I, ∃s1 ∈ S(M1), ∃s2 ∈ S(M2),
    redundant_attribute(s1) ≠ redundant_attribute(s2) ∧
    i = co(s1, s2, crit)
In the next sections, we introduce a specification language based on these formal notations. This language has been used for the implementation of a multimodal prototype: CARTOON.
3. The CARTOON prototype

We have implemented CARTOON (CARTography and cOOperatioN between modalities), a multimodal interface to a cartographic application developed by [Goncalves et al. 1997] enabling the manipulation of streets, the computation of the shortest itinerary... Multimodal interrogation of maps seems to be a promising application for multimodal systems [Cheyer and Julia 1995; Siroux et al. 1995] as more and more tourist information is available on the Internet. Figure 2 shows a screen dump during a multimodal interaction in CARTOON. A map is displayed on the screen. The user may combine speech utterances and pointing gestures with the mouse. For instance, the user may utter (translated from French) "I want to go from here to here". Then the system computes the shortest itinerary and the streets to be taken are displayed in red. The following combinations are possible with CARTOON:
Where is the police station?
Show me the hospital
I want to go from here to the hospital
I am in front of the police station. How can I go here?
What is the name of this building?
What is this?
Show me how to go from here to here
Figure 2. Example of a multimodal interaction with the CARTOON prototype. The events detected on the three modalities (speech, mouse, keyboard) are displayed in the lower window as a function of time. In this case, the detected speech events were: "I_want_to_go", "here", "here". Two mouse clicks were detected. The system integrated these events as a request and displays the shortest itinerary.
In the current version, there is no linguistic analysis preliminary to the multimodal fusion. Events produced by the speech recognition system (a Vecsys Datavox) are either words ("here") or sequences of words ("I_want_to_go"). There are 38 such possible speech events. Each speech event is characterized by: the recognized word, the time of utterance and the recognition score.

The pointing gesture events are characterized by an (x, y) position and the time of detection.
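These attributes map directly onto simple record types. The following dataclasses are only a sketch of how the events reaching the multimodal module could be represented; the field names, times, scores and coordinates are ours, not those of the Vecsys interface or the Modality Server.

```python
from dataclasses import dataclass

@dataclass
class SpeechEvent:
    word: str      # recognized word or word sequence, e.g. "I_want_to_go"
    time: float    # time of utterance
    score: float   # recognition score returned by the recognizer

@dataclass
class PointingEvent:
    x: int         # position of the mouse click on the map
    y: int
    time: float    # time of detection

# A "from here to here" request (figure 2) would then arrive roughly as:
events = [
    SpeechEvent("I_want_to_go", time=0.0, score=0.92),
    SpeechEvent("here", time=0.8, score=0.85),
    PointingEvent(x=120, y=260, time=0.9),
    SpeechEvent("here", time=1.9, score=0.88),
    PointingEvent(x=410, y=75, time=2.0),
]
```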
The overall hardware and software architecture is described in figure 3.
[Figure 3: architecture diagram. Labels legible in the original include "Silicon Graphics", "EMUX", "TYCOON", "Server", and the developer names X. Briffault and M.R. Goncalves.]
Figure 3. Hardware and software architecture. Events detected on the speech, mouse and keyboard modalities (left-hand side) are time-stamped coherently by a Modality Server [Bourdot et al. 1995]. The events are then integrated in our multimodal module TYCOON (in the middle), which merges them and sends messages to the cartography and itinerary application (right-hand side).
4. The specification language

The combinations of modalities used in CARTOON are described in a specification language that is based on our formal notations. In this section, we explain parts of the specification file used for CARTOON.

Firstly, the modalities used are specified (the Objects modality is activated when a graphical object such as a building is mouse-clicked):

modality Speech Keyboard Mouse Objects

Then, these modalities are connected to the multimodal module:
link Speech   Multimodal
link Mouse    Multimodal
link Keyboard Multimodal
link Objects  Multimodal
The events to be detected on each modality are also specified (38 speech items):

event Speech where_is
             show_me
             I_am
             I_want_to_go
For each command of the cartographic application, the possible combinations of modalities are specified. Here is the example of the command NameOf. A variable V3 is defined as the beginning of a sequence:

start_sequence Multimodal V3
It may be activated by one event among several (the word "name" typed on the keyboard, or the speech items "what_is_the_name_of" or "what_is_that"):

equivalence Multimodal V3
            Keyboard   name
            Speech     what_is_the_name_of
            Speech     what_is_that
This V3 variable is linked sequentially to a second variable V4:

complementarity_sequence Multimodal V3 V4
V4 may only be activated by a mouse event:

specialization Multimodal V4 Mouse *
V4 is bound to a parameter of an application module which is involved in the execution process:

bind_application Parameter1NameOf V4
V4 is the last variable of the sequence:
end_sequence Multimodal V4 NameOf
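Putting these fragments together, the NameOf command is a two-step sequence: an equivalence set (a keyboard word or one of two speech items) activates V3, and a specialized mouse event activates V4, whose position is bound to the application parameter. The Python sketch below shows one possible way to represent and match such a sequence against an event stream; it is our own illustration, not the TYCOON parser.

```python
from typing import List, Tuple

# Sequence for the NameOf command: each step lists the (modality, label)
# pairs that may activate it; "*" stands for any label (specialization).
NAME_OF = [
    [("Keyboard", "name"),
     ("Speech", "what_is_the_name_of"),
     ("Speech", "what_is_that")],   # V3: equivalence
    [("Mouse", "*")],               # V4: specialization
]

def match_name_of(events: List[Tuple[str, str]]) -> bool:
    """Return True when the event stream activates V3 then V4 in order."""
    step = 0
    for modality, label in events:
        if any(m == modality and l in ("*", label) for m, l in NAME_OF[step]):
            step += 1
            if step == len(NAME_OF):
                return True  # end_sequence reached: the NameOf command fires
    return False

print(match_name_of([("Speech", "what_is_that"), ("Mouse", "click")]))  # True
```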
5. The multimodal module

The multimodal module used in CARTOON is based on Guided Propagation [Béroule 1985] (figure 4). Such networks comprise elementary processing units: event detectors and multimodal units. Event detectors (square units) selectively respond to events at the moment they occur in the environment. When activated by an event, these event detectors send a signal to the multimodal units (circle units) to which they are connected. The connections between the units are built from the specification file described in the previous section.
Figure 4: the multimodal module uses Guided Propagation Networks. Left-hand side: a network integrating events detected on three modalities is composed of event detectors (square units) and multimodal units (circle units). Right-hand side: three properties of these networks enable multimodal recognition scores (see text).
The activity level of a detector at the end of a multimodal command pathway corresponds to the way an occurrence of this command matches its internal representation. This "matching score" accounts for the degree of distortion undergone by the reference multimodal command, including noisy, missing or inverted components. Initially applied to robust parsing [Westerlund et al. 1994], this feature has been adapted to multimodality [Veldman 1995]. This quantified matching score results from three properties of GPN (figure 4, right-hand side), illustrated by the sketch after this list:

- A: the amplitude of the signal emitted by a speech detector is proportional to the recognition score provided by the speech recognizer
- B: a multimodal unit can be activated even if some expected events are missing (in this case, the amplitude of the signal emitted by this variable is lower than the maximum)
- C: the bigger the temporal distortion between two events, the weaker their summation (or note of temporal proximity), because of the decreasing shape of the signals
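The sketch below gives a toy numeric rendering of how these three properties could combine into a single multimodal recognition score. The exponential decay, the alignment on the first detected event and the normalization are illustrative choices of ours; the actual network dynamics are those of Guided Propagation [Béroule 1985].

```python
import math
from typing import List, Optional, Tuple

def multimodal_score(events: List[Optional[Tuple[float, float]]],
                     tau: float = 1.0) -> float:
    """Combine per-modality evidence into one matching score.

    Each expected event is either None (missing, property B) or a pair
    (recognition_score, detection_time). Property A: the contribution is
    proportional to the recognition score. Property C: contributions of
    temporally distant events decay before being summed.
    """
    detected = [e for e in events if e is not None]
    if not detected:
        return 0.0
    t_ref = detected[0][1]              # align on the first detected event
    total = 0.0
    for score, t in detected:
        total += score * math.exp(-abs(t - t_ref) / tau)   # temporal decay
    return total / len(events)          # missing events lower the score

# "what_is_that" (score 0.9) followed 0.5 s later by a mouse click (score 1.0):
print(multimodal_score([(0.9, 0.0), (1.0, 0.5)]))   # about 0.75
# the same command with the click missing:
print(multimodal_score([(0.9, 0.0), None]))         # 0.45
```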
6. Conclusion and perspectives

In this paper, we have described some theoretical and software tools that we have developed. We explained how we used them for implementing a multimodal interface to a cartography application. The main features of our work are the typology of types of cooperation that we propose and the capacity of our multimodal module to provide multimodal recognition scores.
We plan to improve the CARTOON system in the following directions:

- carry out user studies to test the advantages of multimodal recognition scores and to evaluate the types of cooperation that are used by the user
- develop linguistic and semantic representations (which are currently missing in our work): we plan to connect our multimodal module to the linguistic tools developed by [Briffault et al. 1997] and test several possibilities of interaction, such as early dropping of linguistic hypotheses due to multimodal results
- extend the gesture modality to circling and trajectory gestures on a tactile screen
More generally, what should an "intelligent" multimodal system be? We propose hereafter some answers to this question. It should:

- recognize several input modalities (speech, hand and body gesture, gaze)
- generate contextual output modalities (speech, displayed text and graphics) depending on the user's profile, behavior and environment
- be intuitive to use
- integrate multi-user dialogues mediated by the computer
- manipulate semantic representations
- find out dynamically the most important goal of cooperation between modalities depending on the user and environmental features
- dynamically select (these three questions have to be tackled together):
  - the information to be transmitted
  - the modalities to be used (and hence the media)
  - the types of cooperation between modalities to be used
Acknowledgments

The author would like to thank Marie-Rose Goncalves and Xavier Briffault for the cartographic application they have developed, which is used within the CARTOON project.
References

[André and Rist 1995] André, E. and Rist, T. Generating coherent presentations employing textual and visual material. Artificial Intelligence Review, 9 (2-3), 147-165.

[Bellalem and Romary 1995] Bellalem, N. and Romary, L. Reference interpretation in a multimodal environment combining speech and gesture. In [IMMI 1995].

[Béroule 1985] Béroule, D. (1985). A model of Adaptative Dynamic Associative Memory for speech processing. Thesis, 31 May, Univ. Orsay. 185 p. In French.

[Bolt 1980] Bolt, R.A. "Put-That-There": Voice and Gesture at the Graphics Interface. Computer Graphics 14 (3): 262-270.

[Bourdot et al. 1995] Bourdot, P., Krus, M., Gherbi, R. Management of non-standard devices for multimodal user interfaces under UNIX/X11. In [CMC 1995].

[Bressolle et al. 1995] Bressolle, M.C., Pavard, B., Leroux, M. The role of multimodal communication in cooperation and intention recognition: the case of air traffic control. In [CMC 1995].

[Briffault et al. 1997] http://www.limsi.fr/Individu/xavier/index.html http://www.limsi.fr/Individu/vap/index.html

[Carbonnel 1970] Carbonnel, J.R. Mixed-Initiative Man-Computer Dialogues. Bolt, Beranek and Newman (BBN) Report N 1971, Cambridge, MA.

[Cheyer and Julia 1995] Cheyer, A. and Julia, L. Multimodal maps: an agent-based approach. In [CMC 1995].

[CMC 1995] Proceedings of the International Conference on Cooperative Multimodal Communication (CMC'95). Bunt, H., Beun, R.J. and Borghuis, T. (Eds.). Eindhoven, May 24-26.

[Dowell et al. 1995] Dowell, J.; Shmueli, Y.; and Salter, I. Applying a cognitive model of the user to the design of a multimodal speech interface. In [IMMI 1995].

[Foote et al. 1995] Foote, J.T.; Brown, M.G.; Jones, G.J.F.; Sparck Jones, K.; and Young, S.J. Video mail retrieval by voice: towards intelligent retrieval and browsing of multimedia documents. In [IMMI 1995].

[Goncalves et al. 1997] http://www.limsi.fr/Individu/goncalve/index.html http://www.limsi.fr/Individu/xavier/index.html

[Hare et al. 1995] Hare, M.; Doubleday, A.; Bennett, I.; and Ryan, M. Intelligent presentation of information retrieved from heterogeneous multimedia databases. In [IMMI 1995].

[Huls and Bos 1995] Huls, C. and Bos, E. Studies into full integration of language and action. In [CMC 1995].

[IMMI 1995] Pre-Proceedings of the First International Workshop on Intelligence and Multimodality in Multimedia Interfaces: Research and Applications. Edited by John Lee. University of Edinburgh, Scotland, July 13-14.

[Jackendoff 1987] Jackendoff, R. On beyond zebra: the relation between linguistic and visual information. Cognition 26 (2): 89-114.

[Martin et al. in press] Martin, J.C., Veldman, R. and Béroule, D. Developing Multimodal Interfaces: A Theoretical Framework and Guided Propagation Networks. Book following the [CMC 1995] workshop. Bunt, H. (Ed.).

[Maybury 1994] Maybury, M. Introduction. In Intelligent Multimedia Interfaces. AAAI Press. Cambridge, Mass.

[O'Nuallain and Smith 1994] O'Nuallain, S. and Smith, A.G. An investigation into the common semantics of language and vision. Artificial Intelligence Review 8 (2-3): 113-122.

[Salisbury et al. 1990] Salisbury, M.W.; Hendrickson, J.H.; Lammers, T.L.; Fu, C.; and Moody, S.A. Talk and draw: bundling speech and graphics. IEEE Computer, 23 (8): 59-65.

[Santana and Pineda 1995] Santana, S. and Pineda, L.A. Producing coordinated natural language and graphical explanations in the context of a geometric problem-solving task. In [IMMI 1995].

[Sims and Hedberg 1995] Sims, R. and Hedberg, J. Dimensions of learner control: a reappraisal of interactive multimedia instruction. In [IMMI 1995].

[Siroux et al. 1995] Siroux, J., Guyomard, M., Multon, F., Remondeau, C. Modeling and processing of the oral and tactile activities in the Georal tactile system. In [CMC 1995].

[Veldman 1995] Veldman, R. Experiments on robust parsing in a multimodal Guided Propagation Network. ERASMUS Report. LIMSI.

[Wang et al. 1993] Wang, E.; Shahnavaz, H.; Hedman, L.; Papadopoulos, K.; and Watkinson. A usability evaluation of text and speech redundant help messages on a reader interface. In G. Salvendy and M. Smith (Eds.), Human-Computer Interaction: Software and Hardware Interfaces, pp. 724-729.

[Webber 1995] Webber, B. Instructing Animated Agents: Viewing Language in Behavioural Terms. In [CMC 1995].

[Westerlund et al. 1994] Westerlund, P., Béroule, D. and Roques, M. Experiments of robust parsing using a Guided Propagation Network. Proc. of the International Conf. on New Methods in Language Processing (NEMLAP), Sept. 14-16, Manchester.