Toyoda

US006452348B1

(10) Patent No.: US 6,452,348 B1
(45) Date of Patent: Sep. 17, 2002
`
`(54) ROBOT CONTROL DEVICE, ROBOT
`CONTROL METHOD AND STORAGE
`MEDIUM
`
`(75)
`
`Inventor: Takashi Toyoda, Tokyo (JP)
`
`(73) Assignee: Sony Corporation, Tokyo (JP)
`
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 09/724,988
`
(22) Filed: Nov. 28, 2000
`
`(30)
`
`Foreign Application Priority Data
`
`Nov. 30, 1999
`
`(JP) ........................................... 11-340466
`
(51) Int. Cl.7 .................................................. H02K 7/14
(52) U.S. Cl. ........................... 318/3; 318/632; 700/259; 434/308
(58) Field of Search ..................... 318/3, 632; 446/268, 279, 280, 298, 299, 330; 381/110; 700/258, 259; 901/46, 47; 434/308; 463/35, 39
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`4,717,364 A * 1/1988 Furukawa ................... 446/175
`5,493,185 A * 2/1996 Mohr et al. .................... 318/3
`5,832,189 A * 11/1998 Tow ............................ 901/47
`6,160,986 A * 12/2000 Gabai et al. ................ 434/308
`
`* cited by examiner
`
Primary Examiner-Khanh Dang
`(74) Attorney, Agent, or Firm-Frommer Lawrence &
`Haug LLP; William S. Frommer; Gordon Kessler
`
`(57)
`
`ABSTRACT
`
A robot control device for controlling a robot having substantial entertainment value is disclosed. A sensor signal processor recognizes the voice of a user, sets an association between the voice recognition result and an action of the robot, and registers the association in a behavior association table of a behavior association table memory. A behavior decision unit decides which action the robot is to take based on the behavior association table.
`
`8 Claims, 9 Drawing Sheets
`
`10
`
`3
`
`8
`
`40} 60
`
`50
`
`4C SC
`6C
`
`~
`
`1
`
`IROBOT 2010
`Shenzhen Zhiyi Technology v. iRobot
`IPR2017-02061
`
`
`
[Sheet 1 of 9. FIG. 1: external perspective view of one embodiment of the robot, with x, y, and z axes; reference numerals include the tail 1, torso unit 2, head unit 3, thighs 4A-4D, heels 5A-5D, foot units 6A-6D, camera 8, and pressure sensor 10.]
`
`
[Sheet 2 of 9. FIG. 2: block diagram of the internal construction of the robot: the camera 8, pressure sensor 10, and microphone feed the controller 11, which drives actuators (motors) 7_1 . . . 7_N monitored by rotary encoders 12_1 . . . 12_N.]
[FIG. 3: hardware construction of the controller 11: CPU 20, program memory 21, RAM 22, non-volatile memory 23, interface circuit (I/F) 24, and motor driver 25 interconnected on a common bus; inputs from the microphone, camera, pressure sensor, and rotary encoders; output to the motors.]
`
`
`
[Sheet 3 of 9. FIG. 4: functional block diagram of the controller 11: the microphone 9, CCD camera 8, and pressure sensor 10 feed a sensor signal processor 31, which also receives the rotary encoder outputs; the processor feeds an emotion/instinct model unit 32 and a behavior decision unit 33 containing a behavior model memory 33A and a behavior association table memory 33B; the behavior decision unit feeds a posture transition unit 34, which feeds a data control unit 35 driving the motors 7_1 . . . 7_N.]
`
`
`
[Sheet 4 of 9. FIG. 5: stochastic automaton serving as the behavioral model; nodes NODE_0 through NODE_M connected by arcs ARC_0 through ARC_M.]

FIG. 6: behavior association table (reconstructed from the drawing residue and the description in the specification; each entry is a correlation score):

  WORD          WALKING FORWARD   LOOKING UP   BITING   HAND-SHAKING   . . .
  HEY                  10               0         20           0       . . .
  COME HERE            60               0          0           0       . . .
  SHAKE HANDS           0              20          0          70       . . .
`
`
`
[Sheet 5 of 9. FIG. 7: construction of the voice recognition module: voice data enters a feature parameter extractor 41; its output goes to a matching unit 42, which produces the voice recognition result while referencing an acoustic model memory 43, a dictionary memory 44, and a grammar memory 45.]
`
`
`
[Sheet 6 of 9.] FIG. 8: voice recognition process (flow diagram):

  S1: extract feature parameter
  S2: perform matching
  S3: unknown word? if NO, go to S4; if YES, go to S5
  S4: output word as a result of voice recognition; END
  S5: output phonological information; END
`
`
`
[Sheet 7 of 9.] FIG. 9: behavior learning process, first embodiment (flow diagram):

  S11: receive voice recognition result
  S12: unknown word? if YES, go to S13; if NO, go to S14
  S13: register unknown word in table
  S14: decide and perform action
  S15: voice recognition result? if YES, go to S17; if NO, go to S16
  S16: time up? if NO, return to S15; if YES, END
  S17: perform assessment
  S18: based on assessment, modify score of behavior responsive to word in accordance with voice recognition result; END
`
`
`
[Sheet 8 of 9.] FIG. 10: behavior learning process, second embodiment (flow diagram):

  S21: decide and perform action
  S22: voice recognition result? if YES, go to S24; if NO, go to S23
  S23: time up? if NO, return to S22; if YES, END
  S24: unknown word? if YES, go to S25; if NO, go to S26
  S25: register unknown word in table
  S26: increase score of behavior responsive to word in accordance with voice recognition result; END
`
`
`
[Sheet 9 of 9.] FIG. 11: behavior learning process, third embodiment (flow diagram):

  S31: enable posture setting
  S32: posture modified? if YES, go to S34; if NO, go to S33
  S33: time up? if NO, return to S32; if YES, go to S40
  S34: register action in response to modified posture, in table and behavior model
  S35: voice recognition result? if YES, go to S37; if NO, go to S36
  S36: time up? if NO, return to S35; if YES, go to S40
  S37: unknown word? if YES, go to S38; if NO, go to S39
  S38: register unknown word in table
  S39: increase score of behavior responsive to word in accordance with voice recognition result
  S40: disable posture setting; END
`
`
`
`ROBOT CONTROL DEVICE, ROBOT
`CONTROL METHOD AND STORAGE
`MEDIUM
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`The present invention relates to a robot control device, a
`robot control method, and a storage medium, and, more
`particularly, to a robot control device and a robot control
`method for controlling a robot with which an individual
`enjoys a training process like training an actual pet, such as
`a dog or cat, and to a storage medium for storing a software
`program for the robot control method.
`2. Description of the Related Art
A number of (stuffed) toy robots are commercially available which act in response to the pressing of a touch switch or to a voice of an individual having an intensity above a predetermined level. In the context of the present invention, the toy robots include stuffed toy robots.
In such conventional robots, the relationship between the pressing of the touch switch or the input of the voice and the action (behavior) of the robot is fixed, and a user cannot modify the behavior of the robot to the user's preference. The robot merely repeats the same actions, and the user may grow tired of the toy. The user thus cannot enjoy a learning process of the robot in the same way as a dog or cat may learn tricks.
`
SUMMARY OF THE INVENTION

Accordingly, it is an object of the present invention to provide a robot which offers substantial entertainment value.
An aspect of the present invention relates to a robot control device for controlling the action of a robot, and includes a voice recognition unit for recognizing a voice, a control unit for controlling a drive unit that drives the robot for action, and a setting unit for setting an association between the voice recognition result provided by the voice recognition unit and the behavior of the robot.
The control unit may decide an action for the robot to take, and controls the drive unit to drive the robot to perform the decided action, wherein the setting unit sets an association between the decided action and the voice recognition result immediately subsequent to the decided action taken by the robot.
The robot control device preferably includes an assessment unit for assessing a voice recognition result obtained subsequent to the first voice recognition result provided by the voice recognition unit, wherein the control unit controls the drive unit to drive the robot to perform a predetermined action in response to the first voice recognition result, and wherein the setting unit sets an association between the predetermined action and the first voice recognition result in accordance with the assessment result of the next voice recognition result.
The setting unit preferably registers an association between the voice recognition result and the action of the robot in an association table that associates a word, which the voice recognition unit receives for voice recognition, with the action of the robot.
When the voice recognition result provided by the voice recognition unit indicates that the word is an unknown one, the setting unit preferably registers the unknown word in the association table, and preferably registers an association between the registered unknown word and the action of the robot.
Preferably, the robot control device further includes a posture detector for detecting a posture of the robot, wherein the setting unit sets an association between the voice recognition result of the voice recognition unit and an action which the robot needs to take to reach the posture detected by the posture detector.
Preferably, the control unit controls the drive unit in accordance with the association set between the action of the robot and the voice recognition result of the voice recognition unit.
Another aspect of the present invention relates to a robot control method for controlling the action of a robot, and includes a voice recognition step of recognizing a voice, a control step of controlling a drive unit that drives the robot for action, and a setting step of setting an association between the voice recognition result provided in the voice recognition step and the action of the robot.
Yet another aspect of the present invention relates to a storage medium for storing a computer-executable code for controlling the action of a robot, and the computer-executable code performs a voice recognition step of recognizing a voice, a control step of controlling a drive unit that drives the robot for action, and a setting step of setting an association between the voice recognition result provided in the voice recognition step and the action of the robot.
In accordance with the present invention, the drive unit is controlled to drive the robot for action while the voice is being recognized, and an association is set between the voice recognition result and the behavior of the robot.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an external perspective view showing one embodiment of the robot of the present invention;
FIG. 2 is a block diagram showing the internal construction of the robot;
FIG. 3 is a block diagram showing the hardware construction of a controller;
FIG. 4 is a functional block diagram of the functions performed when the controller executes programs;
FIG. 5 shows a stochastic automaton as a behavioral model;
FIG. 6 shows a behavior association table;
FIG. 7 is a block diagram showing the construction of a voice recognition module that performs voice recognition in a sensor input processor;
FIG. 8 is a flow diagram illustrating the operation of the voice recognition module;
FIG. 9 is a flow diagram illustrating a first embodiment of the behavior learning process of a behavior decision unit;
FIG. 10 is a flow diagram illustrating a second embodiment of the behavior learning process of the behavior decision unit; and
FIG. 11 is a flow diagram illustrating a third embodiment of the behavior learning process of the behavior decision unit.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 is an external perspective view showing one embodiment of a robot of the present invention, and FIG. 2 shows an electrical construction of the robot.
In this embodiment, the robot models a dog. A head unit 3 is connected to a torso unit 2 at the forward end thereof,
`3
`and foot units 6A and 6B are respectively composed of
`thighs 4A-4D and heels 5A-5D, and are respectively con(cid:173)
`nected to the torso unit 2 on the side walls at the front and
`the back thereof. A tail 1 is connected to the back end of the
`torso unit 2.
`. 7 N as actuators are respectively
`Motors 71 , 7 2 ,
`.
`.
`arranged at the joints between the tail 1 and the torso unit 2,
`between the head unit 3 and the torso unit 2, between each
`of the thighs 4A-4D and the torso unit 2, and between the
`thighs 4A-4D and the respective heels 5A-5D. With the
`motors 7 1 , 7 2 , . . . 7 N turning, the tail 1 and the head unit 3
`are rotated about each of the three axes, i.e., the x, y, and z
`axes, the thighs 4A-4D are rotated about each of the two
`axes, i.e., the x and y axes, the heels 5A-5D are rotated
`about the single axis, i.e., the x axis. In this way, the robot 15
`takes a variety of actions.
`The head unit 3 contains a CCD (Charge-Coupled
`Device) camera 8, a microphone 9, and a pressure sensor 10
`at the predetermined positions thereof. The torso unit 2
`houses a controller 11. The CCD camera 8 picks up a picture 20
`of the surroundings of the robot, including the user. The
`microphone 9 picks up ambient sounds including the voice
`of the user. The pressure sensor 10 detects pressure applied
`the head unit 3 by the user or other objects. The controller
`11 thus receives the image of the surroundings taken by the 25
`CCD camera 8, the ambient sound picked up by the micro(cid:173)
`phone 9, pressure applied on the head unit 3 by the user, as
`image data, sound data, and pressure data, respectively.
`. 12N are respectively
`Rotary encoders 121 , 122 ,
`arranged for the motors 71 , 7 2 , . . . 7 N at the respective 30
`articulation points. The rotary encoders 121 , 122 , . . . 12N
`respectively detect the angles of rotation of the rotary shafts
`of the respective motors 71 , 7 2 , . . . 7 N· The angles of rotation
`detected by the rotary encoders 121 , 122 , . . . , 12N are fed
`to the controller 11 as detected angle data.
`The controller 11 determines the posture thereof and the
`situation surrounding the robot based on the image data from
`the CCD camera 8, the sound data from the microphone 9,
`the pressure data from the pressure sensor 10, and the angle
`data from the rotary encoders 121 , 122 ,
`. 12N. The
`.
`.
`controller 11 decides a subsequent action to take next in
`accordance with a preinstalled control program. Based on
`the decision, any of the motors 7 1 , 7 2 , . . . 7 N is driven as
`required.
`The robot thus acts in a self-controlled fashion by moving
`the tail 1, the torso unit 2, and the foot units 6A-6D to a
`desired state.
`FIG. 3 shows the construction of the controller 11 of FIG.
`
`.
`
`.
`
`US 6,452,348 Bl
`
`4
`through 12N, and sends the data to the CPU 20. Under the
`control of the CPU 20, the motor driver 25 feeds, to the
`motors 7 1 through 7 N, drive signals to drive these motors.
`The CPU 20 in the controller 11 controls the robot in
`5 accordance with a functional block diagram shown in FIG.
`4, by executing the control program stored in the program
`memory 21.
`FIG. 4 thus illustrates the function of the controller 11.
`A sensor signal processor 31 recognizes external stimu-
`10 lation acting on the robot or the surroundings of the robot,
`and feeds these data of the external stimulation and the
`surroundings to an emotion/instinct model unit 32 and a
`behavior decision unit 33.
`The emotion/instinct model unit 32 manages an emotion
`model and an instinct model respectively expressing the
`state of the emotion and the instinct of the robot. In response
`to the output from the sensor signal processor 31, and the
`output of the behavior decision unit 33, the emotion/instinct
`model unit 32 modifies parameters defining the emotion
`model and the instinct model, thereby updating the state of
`the emotion and the instinct of the robot.
`The behavior decision unit 33 contains a behavior model
`memory 33A and a behavior association table memory 33B,
`and decides a next behavior to be taken by the robot based
`on the content of the memory, the output of the sensor signal
`processor 31, and the emotion model and the instinct model
`managed by the emotion/instinct model unit 32. The behav(cid:173)
`ior decision unit 33 then feeds the information of the
`behavior (hereinafter referred to as behavior information) to
`a posture transition unit 34.
`In order to cause the robot to behave in accordance with
`the behavior information supplied by the behavior decision
`unit 33, the posture transition unit 34 calculates control data,
`such as angles of rotation and rotational speeds of the motors
`7 1 through 7 N and outputs the control data to a data control
`unit 35.
`The data control unit 35 drives the motors 7 1 through 7 N
`in response to the control data coming from the posture
`40 transition unit 34.
`The sensor signal processor 31 in the controller 11 thus
`constructed recognizes a particular external state, a particu(cid:173)
`lar action taken by the user, and an instruction given by the
`user based on the image data supplied by the camera 8, the
`45 voice data provided by the microphone 9, and the pressure
`data output by the pressure sensor 10. The recognition result
`is then output to the emotion/instinct model unit 32 and the
`behavior decision unit 33.
`The sensor signal processor 31 performs image recogni-
`50 tion based on the image data provided by the camera 8. For
`example, the sensor signal processor 31 recognizes that there
`is a pole or a wall, and then feeds the recognition result to
`the emotion/instinct model unit 32 and the behavior decision
`unit 33. The sensor signal processor 31 performs voice
`55 recognition by processing the voice data from the pressure
`sensor 10. For example, when the pressure sensor 10 detects
`a pressure of short duration of time at a level higher than a
`predetermined threshold, the sensor signal processor 31
`recognizes that the robot is being "beaten or chastised".
`60 When the pressure sensor detects a pressure of long duration
`of time at a level lower than a predetermined threshold, the
`sensor signal processor 31 recognizes as being "stroked or
`praised". The sensor signal processor 31 then feeds the
`recognition result to the emotion/instinct model unit 32 and
`65 the behavior decision unit 33.
`The emotion/instinct model unit 32 manages the emotion
`m model expressing emotional states, such as such as "joy",
`
`35
`
`2.
`
`The controller 11 includes a CPU (Central Processing
`Unit) 20, program memory 21, RAM (Random Access
`Memory) 22, non-volatile memory 23, interface circuit (llF)
`24, and motor driver 25. All of these components are
`interconnected via a bus 26.
`The CPU 20 controls the behavior of the robot by execut(cid:173)
`ing a control program stored in the program memory 21. The
`program memory 21 is an EEPROM (Electrically Erasable
`Programmable Read Only Memory), and stores the control
`program executed by the CPU 20 and required data. The
`RAM 22 temporarily stores data needed by the CPU 20 in
`operation. The non-volatile memory 23, as will be discussed
`later, stores an emotion/instinct model, a behavioral model,
`a behavior association table, etc, which must be retained
`throughout power interruptions. The interface circuit 24
`receives data supplied by the CCD camera 8, the micro(cid:173)
`phone 9, the pressure sensor 10, and the rotary encoders 121
`
`12
`
`
`
`US 6,452,348 Bl
`
`5
`"sadness", "anger", etc., and the instinct model expressing
`"appetite", "sleepiness", "exercise", etc.
`The emotion model and the instinct model express the
`states of the emotion and instinct of the robot by integer
`numbers ranging from zero to 100, for example. The
`emotion/instinct model unit 32 updates the values of the
`emotion model and instinct model in response to the output
`of the sensor signal processor 31, and the output of the
`behavior decision unit 33 with a time elapse taken into
`considered. The emotion/instinct model unit 32 feeds the 10
`values of the updated emotion model and instinct model (the
`states of the emotion and the instinct of the robot) to the
`behavior decision unit 33.
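As a rough illustration of the paragraph above (not an implementation from the patent), the emotion/instinct parameters can be modeled as integers clamped to the 0-100 range; the update rules and decay step here are assumptions:

```python
class EmotionInstinctModel:
    """Illustrative sketch of the emotion/instinct model unit 32.

    Each parameter is an integer in [0, 100]; the concrete update
    rules and decay step are assumptions, not taken from the patent.
    """

    def __init__(self):
        self.emotions = {"joy": 50, "sadness": 50, "anger": 50}
        self.instincts = {"appetite": 50, "sleepiness": 50, "exercise": 50}

    @staticmethod
    def _clamp(v):
        return max(0, min(100, v))

    def apply(self, deltas):
        """Apply recognition/behavior-driven changes, e.g. {"anger": -10}."""
        for name, d in deltas.items():
            store = self.emotions if name in self.emotions else self.instincts
            store[name] = self._clamp(store[name] + d)

    def tick(self, decay=1):
        """Drift every parameter toward the neutral value 50 as time elapses."""
        for store in (self.emotions, self.instincts):
            for name, v in store.items():
                store[name] = self._clamp(v - decay if v > 50 else v + decay)
```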
`The states of the emotion and instinct of the robot change
`in response to the output of the behavior decision unit 33 as
`discussed below, for example.
The behavior decision unit 33 supplies the emotion/instinct model unit 32 with the behavior information of the behavior the robot took in the past or is currently taking (for example, "the robot looked away or is looking away").
Now, when the robot, already in anger, is stimulated by the user, the robot may take an action of "looking away" in response. In this case, the behavior decision unit 33 supplies the emotion/instinct model unit 32 with the behavior information of "looking away".
Generally speaking, an action of expressing discontent in anger, such as the action of looking away, may somewhat calm down the anger. The emotion/instinct model unit 32 then decreases the value of the emotion model representing "anger" (down to a smaller degree of anger) when the behavior information of "looking away" is received from the behavior decision unit 33.
`The behavior decision unit 33 decides a next action to take
`based on the recognition result of the sensor signal processor
`31, the output of the emotion/instinct model unit 32, elapsed
`time, the memory content of the behavior model memory
`33A, and the memory content of the behavior association
`table memory 33B. The behavior decision unit 33 then feeds
`the behavior information, representing the action, to the
`emotion/instinct model unit 32 and the posture transition
`unit 34.
`The behavior model memory 33A stores a behavioral
`model that defines the behavior of the robot. The behavior
`association table memory 33B stores an association table
`that associates the voice recognition result of the voice input
`to the microphone 9 with the behavior of the robot.
The behavioral model is formed of a stochastic automaton shown in FIG. 5. In the stochastic automaton shown here, a behavior is expressed by any node (state) among NODE_0 through NODE_M, and a transition of behavior is expressed by an arc ARC_m1 representing a transition from a node NODE_m0 to another node NODE_m1 (which may be the original node itself) (m0, m1 = 0, 1, . . . , M).
The arc ARC_m1, representing the transition from the node NODE_m0 to the node NODE_m1, has a transition probability P_m1, and the probability of node transition, namely, the transition of behavior, is determined, in principle, based on the corresponding transition probability.
Referring to FIG. 5, for simplicity, the stochastic automaton having (M+1) nodes includes arcs ARC_0 through ARC_M respectively extending from node NODE_0 to the other nodes NODE_0 through NODE_M.
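A minimal sketch of such a stochastic automaton, assuming transition probabilities are stored per node as a mapping to successor nodes (the representation and the example nodes are illustrative):

```python
import random

class StochasticAutomaton:
    """Illustrative behavioral model: nodes NODE_0..NODE_M with
    probabilistic arcs, as in FIG. 5. Representation is an assumption."""

    def __init__(self, transitions):
        # transitions: {node: {successor_node: transition probability}}
        self.transitions = transitions

    def step(self, node):
        """Choose the next behavior node according to arc probabilities."""
        successors = self.transitions[node]
        nodes, probs = zip(*successors.items())
        return random.choices(nodes, weights=probs)[0]

# Example: from "standing", mostly keep standing, sometimes walk or sit.
model = StochasticAutomaton({
    "standing": {"standing": 0.6, "walking": 0.3, "sitting": 0.1},
    "walking": {"standing": 0.5, "walking": 0.5},
    "sitting": {"standing": 1.0},
})
print(model.step("standing"))
```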
As shown in FIG. 6, the behavior association table registers the association between each word obtained as a result of voice recognition and an action to be taken by the robot. The table shown in FIG. 6 lists, as a correlation score of an integer number, the association between a voice recognition result and a behavior. Specifically, the integer number representing the degree of association between the voice recognition result and the behavior is the correlation score. When a voice recognition result is obtained, the robot changes the probability or the degree of frequency of a behavior depending on the correlation score.
When the voice recognition result is "Hey" in the behavior association table in FIG. 6, the degrees of frequency of the actions of "walking forward" and "biting" (each having a non-zero correlation score) taken by the robot are respectively increased by their correlation scores of 10 and 20. When the voice recognition result is "come over here", the degree of frequency of the action of "walking forward" (having a non-zero correlation score) taken by the robot is increased by its correlation score of 60. When the voice recognition result is "Shake hands", the degree of frequency of the action of "looking up" (having a non-zero correlation score) taken by the robot is increased by its correlation score of 20, and at the same time, the degree of frequency of the action of "shaking hands" is increased by its correlation score of 70.
The behavior decision unit 33, in principle, determines which node to transition to from a node corresponding to a current behavior in the stochastic automaton serving as the behavioral model (see FIG. 5), based on the values of the emotion model and instinct model of the emotion/instinct model unit 32, elapsed time, and the recognition result of the sensor signals provided by the sensor signal processor 31, besides the transition probability set for the arc extending from the current node. The behavior decision unit 33 then supplies the emotion/instinct model unit 32 and posture transition unit 34 with the behavior information representing the behavior corresponding to the node subsequent to the node transition (also referred to as a post-node-transition action).
Depending on the values of the emotion model and instinct model, the behavior decision unit 33 transitions to a different node even if the sensor signal processor 31 outputs the same external recognition results.
Specifically, now, the output of the sensor signal processor 31 indicates that the palm of a hand is stretched out in front of the robot. When the emotion model of "anger" indicates that the robot is "not angry" and when the instinct model of "appetite" indicates that the robot is not hungry, the behavior decision unit 33 decides to drive the robot to shake hands as a post-node-transition action, in response to the stretched palm.
Similarly, the output of the sensor signal processor 31 now indicates that the palm of the hand is stretched out in front of the robot. When the emotion model of "anger" indicates that the robot is "not angry" but the instinct model of "appetite" indicates that the robot is hungry, the behavior decision unit 33 decides to lick at the palm of the hand as a post-node-transition action.
Again, the output of the sensor signal processor 31 now indicates that the palm of the hand is stretched out in front of the robot. When the emotion model of "anger" indicates that the robot is "angry", the behavior decision unit 33 decides to drive the robot to abruptly look away, as a post-node-transition action, regardless of the value of the instinct model of "appetite".
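The three palm-of-hand cases above amount to a small decision rule over the two model values; in sketch form (the numeric thresholds are invented):

```python
def react_to_palm(anger, appetite, angry_above=70, hungry_above=70):
    """Post-node-transition action for a stretched-out palm, following
    the three cases in the text; the thresholds are assumptions."""
    if anger > angry_above:
        return "look away"       # angry: ignore the palm entirely
    if appetite > hungry_above:
        return "lick palm"       # not angry, but hungry
    return "shake hands"         # not angry and not hungry
```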
When the recognition result of the sensor output provided by the sensor signal processor 31 indicates that the voice is the user's own voice, the behavior decision unit 33 determines which node to transition to from the node for the
current behavior, based on the correlation scores of the behaviors indicated by the voice recognition result, registered in the behavior association table (see FIG. 6) in the behavior association table memory 33B. The behavior decision unit 33 then supplies the emotion/instinct model unit 32 and the posture transition unit 34 with the behavior information indicating the behavior (post-node-transition action) corresponding to the decided node. In this way, the robot behaves differently depending on the correlation scores of the behaviors in accordance with the voice recognition result.
Upon receiving a predetermined trigger, the behavior decision unit 33 transitions to a node in the behavior model, thereby deciding a post-node-transition action to take. Specifically, the behavior decision unit 33 decides a post-node-transition action to take when a predetermined time has elapsed since the robot started the current action, when the sensor signal processor 31 outputs a particular recognition result such as a voice recognition result, or when the value of the emotion model or the instinct model of the emotion/instinct model unit 32 rises above a predetermined threshold.
Based on the behavior information provided by the behavior decision unit 33, the posture transition unit 34 generates posture transition information for transitioning from a current posture to a next posture, and outputs the posture transition information to the data control unit 35. Specifically, the posture transition unit 34 recognizes the current posture based on the outputs from the rotary encoders 12_1 through 12_N, calculates the angles of rotation and rotational speeds of the motors 7_1 through 7_N for the robot to take an action (a post-node-transition action) corresponding to the behavior information from the behavior decision unit 33, and then outputs these as the posture transition information to the data control unit 35.
The data control unit 35 generates drive signals for driving the motors 7_1 through 7_N in accordance with the posture transition information from the posture transition unit 34, and supplies the motors 7_1 through 7_N with the drive signals. The robot thus takes a post-node-transition action accordingly.
`FIG. 7 is a functional block diagram of a portion of the
`sensor signal processor 31 shown in FIG. 4, which is
`hereinafter referred to as a voice recognition module and
`performs voice recognition in response to voice data from
`the microphone 9.
`The voice recognition module recognizes a voice input to
`the microphone 9 using a continuous HMM (Hidden
`Markov Model), and outputs voice recognition results.
A feature parameter extractor 41 receives the voice data from the microphone 9. The feature parameter extractor 41 performs MFCC (Mel Frequency Cepstrum Coefficient) analysis on the voice data input thereto on a frame-by-frame basis, and outputs the MFCC analysis result to a matching unit 42 as a feature parameter (feature vector). As feature parameters, the feature parameter extractor 41 may further extract a linear prediction coefficient, a cepstrum coefficient, a line spectrum pair, and power in every predetermined frequency band (the output of a filter bank).
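For illustration, frame-by-frame MFCC extraction of the kind performed by the feature parameter extractor 41 might be sketched with the librosa library (the patent names no toolkit; librosa and the 13-coefficient choice are assumptions):

```python
import numpy as np
import librosa  # assumed third-party library; not named in the patent

def extract_feature_parameters(voice_data: np.ndarray, sample_rate: int):
    """Return a time series of MFCC feature vectors (one per frame),
    analogous to the output of feature parameter extractor 41."""
    mfcc = librosa.feature.mfcc(y=voice_data, sr=sample_rate, n_mfcc=13)
    return mfcc.T  # shape: (num_frames, 13)
```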
Using the feature parameters from the feature parameter extractor 41, the matching unit 42 recognizes the voice input to the microphone 9 based on the continuous HMM method while referencing an acoustic model memory 43, a dictionary memory 44, and a grammar memory 45 as necessary.
The acoustic model memory 43 stores an acoustic model that represents acoustic features such as phonemes and
syllables in a voice to be recognized. Since voice recognition is here carried out using the continuous HMM method, an HMM is employed. The dictionary memory 44 stores a dictionary of words which contains information on the pronunciation (phonological information) of each word to be recognized. The grammar memory 45 stores a grammar which describes how each word registered in the dictionary of the dictionary memory 44 is chained. The grammar may be a context-free grammar, or a rule based on word chain probability (N-gram).
The matching unit 42 produces an acoustic model of a word (a word model) by connecting acoustic models stored in the acoustic model memory 43 through referencing the dictionary in the dictionary memory 44. The matching unit 42 further connects several word models by referencing the grammar stored in the grammar memory 45, and processes the connected word models through the continuous HMM method based on the feature parameters, thereby recognizing the voice input to the microphone 9. Specifically, the matching unit 42 detects the word model having the highest score (likelihood) given the time-series feature parameters output by the feature parameter extractor 41, and outputs a word (a word chain) corresponding to that word model. The voice recognition result of the matching unit 42 is thus output to the emotion/instinct model unit 32 and the behavior decision unit 33 as the output of the sensor signal processor 31.
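To make the matching step concrete, here is a toy sketch of selecting the highest-likelihood word model; real continuous-HMM scoring is abstracted behind per-word scoring callables, so every name and score here is illustrative:

```python
def match(features, word_models):
    """Return the word whose model scores the feature sequence highest,
    as the matching unit 42 does; word_models maps each dictionary
    word to a callable returning a log-likelihood (a stand-in for
    continuous-HMM scoring)."""
    return max(word_models, key=lambda w: word_models[w](features))

# Toy usage: fake scorers standing in for HMM likelihood computations.
models = {
    "hey": lambda f: -12.0,
    "shake hands": lambda f: -7.5,
}
print(match([0.1, 0.2], models))  # -> "shake hands"
```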