JP 2005-202076 A 2005.7.28

(19) Japan Patent Office (JP)
(12) Japanese Unexamined Patent Application Publication (A)
(11) Japanese Unexamined Patent Application Publication Number 2005-202076 (P2005-202076A)
(43) Publication date: July 28, 2005 (2005.7.28)

(51) Int.Cl.7: G10L 13/08; A63H 11/00; B25J 13/00; G10L 13/00
FI: G10L 3/00 H; A63H 11/00 Z; B25J 13/00 Z; G10L 3/00 Q
Theme codes (reference): 2C150 3C007 5D045

Request for examination: Not yet requested    Number of claims: 24    OL (Total of 19 pages)

(21) Application number: Japanese Patent Application 2004-7306 (P2004-7306)
(22) Date of application: January 14, 2004 (2004.1.14)

(Japan Patent Office Note: the following is a registered trademark)
1. Bluetooth

(71) Applicant: 000002185, Sony Corporation, 6-7-35 Kita Shinagawa, Shinagawa-ku, Tokyo-to
(74) Agent: 100082740, Patent attorney Keiki Tanabe
(72) Inventor: Hideki Shimomura, c/o Sony Corporation, 6-7-35 Kita Shinagawa, Shinagawa-ku, Tokyo-to

F terms (reference):
2C150 CA01 DA04 DF02 DF21 EB01 EE02 EF11 EF16
3C007 AS36 BS27 CS08 JS03 KS03 KS11 KS31 KS36 KS39 KX02 MT14 WA03 WA13 WB18 WC07 WC30
5D045 AA07 AA08 AB11
[54] [Title of the Invention] Speech control device and method, and robot device
`
`[57] [Abstract]
`[Problem]
`Depending on the distance from the device or robot device to
`the conversation partner, the conversation partner may have
`difficulty hearing the speech of the device or robot device.
`[Means for Solving]
`By changing the speech style of the device or robot device
`when interacting with the user as necessary depending on the
`distance between the device or robot device and the user, the
`device or robot device can always speak in a style that is easy
`for the user to hear, allowing for smooth interaction with the
`user, thus realizing a speech control device and method, and a
`robot device that can improve entertainment value.
`[Representative Drawing] Fig. 13
`
`
`
`
`
`
[FIG. 13: Speech change processing procedure (flowchart, routine RT3, steps SP20 to SP31). Recoverable box labels: Start; get reference volume from volume memory (SP20); change speech parameters based on distance information (SP21); decision: "Does the volume exceed the limits?"; on one branch, change the speed parameters and pause length parameters based on distance information; on the other, change to the speed parameter and pause length parameter at maximum distance; waveform generation process based on the above parameters; Finish.]

FIG. 13 Speech change processing procedure
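
Editor's note: by way of illustration only, the following is a minimal Python sketch of one plausible reading of the FIG. 13 flow, in which the stored reference volume is scaled with distance and, when the resulting volume would exceed its limit, the routine falls back to the speed and pause-length parameters defined for the maximum distance. All constants, formulas, and names (VOLUME_LIMIT, speed_at, pause_at, and so on) are assumptions for demonstration; the patent does not specify them.

```python
# Illustrative sketch of the FIG. 13 speech change procedure (RT3).
# Scaling constants and parameter names are assumptions, not from the patent.

VOLUME_LIMIT = 10.0   # assumed upper limit on speech volume
MAX_DISTANCE = 3.0    # assumed distance (m) at which parameters saturate


def speed_at(distance_m: float) -> float:
    # Assumed mapping: speak somewhat more slowly at greater distance.
    return max(0.6, 1.0 - 0.1 * distance_m)


def pause_at(distance_m: float) -> float:
    # Assumed mapping: lengthen inter-phrase pauses with distance.
    return 0.2 + 0.1 * distance_m


def change_speech_parameters(reference_volume: float, distance_m: float) -> dict:
    """Derive volume, speed, and pause-length parameters from distance."""
    # SP20/SP21: start from the user's stored reference volume and
    # scale it with distance (louder when the partner is farther away).
    volume = reference_volume * (1.0 + distance_m)

    if volume > VOLUME_LIMIT:
        # Volume alone cannot be raised further: clamp it and fall back
        # to the parameters for maximum distance (slower speech, longer
        # pauses between phrases).
        volume = VOLUME_LIMIT
        speed = speed_at(MAX_DISTANCE)
        pause = pause_at(MAX_DISTANCE)
    else:
        # Otherwise adapt speed and pause length to the actual distance.
        speed = speed_at(distance_m)
        pause = pause_at(distance_m)

    # These parameters would feed the waveform generation step.
    return {"volume": volume, "speed": speed, "pause_s": pause}


print(change_speech_parameters(reference_volume=3.0, distance_m=2.5))
```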
`
`
`
`
`
`
`(2)
`
`
`JP 2005-202076 A 2005.7.28
`
`[Scope of Patent Claims]
`[Claim 1]
`A speech control device that controls the speech of a device having a dialogue function with a user during dialogue
between the device and the user, characterized by comprising:
`a distance detection means for detecting the distance between the device and the user who is the conversation partner;
`and
`a speech form change means for changing the speech form of the device during conversation with the user as
`necessary, according to the distance between the device and the user detected by the distance detection means.
`
`[Claim 2]
`The speech control device according to Claim 1,
`characterized in that the speech form change means
`changes the speech volume as the speech form.
`[Claim 3]
`The speech control device according to Claim 1,
`characterized in that the speech form change means
`changes the speech speed as the speech form.
`[Claim 4]
`The speech control device according to Claim 1,
` characterized in that the speech form change means
` changes the speech intonation as the speech form.
`[Claim 5]
`The speech control device according to Claim 1,
` characterized in that the speech form change means
`changes the space between phrases as the speech form.
`[Claim 6]
`The speech control device according to Claim 5, characterized by comprising:
`a character string output means for outputting a character string corresponding to a content to be spoken by the
`device;
` a speech means for generating a speech signal of synthetic voice corresponding to the character string;
and a speaker for outputting speech based on the speech signal,
wherein the speech form change means transforms the character string
output from the character string output means
as a method of changing the space between the phrases.
`[Claim 7]
`The speech control device according to Claim 2, further comprising:
` a user identification means for identifying the user who is the conversation partner;
` a memory means for storing a reference volume for each of the users;
` and a reference volume change means for changing the reference volume of the user stored in the memory means in
`response to a request from the user during conversation,
wherein the speech form change means changes the speech volume
based on the reference volume of the user who is the conversation partner
stored in the memory means.
`[Claim 8]
`The speech control device according to Claim 2,
wherein the speech form change means changes another speech form by a maximum amount when the changed
speech volume exceeds a predetermined threshold value.
`[Claim 9]
A speech control method for controlling the speech of a device having a function for interacting with a user during a
conversation between the device and the user, the method comprising:
a first step of detecting a distance between the device and the user who is a conversation partner;
and a second step of changing the speech form of the device during a conversation with the user as necessary,
depending on the detected distance between the device and the user.
`
`
`
`
`
`(3)
`
`
`JP 2005-202076 A 2005.7.28
`
` [Claim 10]
`The speech control method according to Claim 9,
`characterized in that the second step
`changes the speech volume as the speech form.
`[Claim 11]
`The speech control method according to Claim 9,
`characterized in that the second step
`changes the speech speed as the speech form.
`[Claim 12]
`The speech control method according to Claim 9,
`characterized in that the second step
`changes the speech intonation as the speech form.
`[Claim 13]
`The speech control method according to Claim 9,
`characterized in that the second step
`changes the space between phrases as the speech form.
`[Claim 14]
`The speech control method according to Claim 13,
`characterized in that the second step includes
`a character string output step of outputting a character string corresponding to a content to be spoken by the device,
` a speech generation step of generating a speech signal of synthetic voice corresponding to the character string,
`and a speech output step of outputting speech based on the speech signal,
`wherein in the speech generation step, the output character string is transformed
`as a method of changing the intonation of the speech.
`[Claim 15]
`The speech control method according to Claim 10,
` further comprising: a storing step of storing a reference volume for each user;
`a user identifying step of identifying the user who is a conversation partner;
` and a reference volume changing step of changing the stored reference volume of the user
`in response to a request from the user during the conversation,
`wherein in the second step, the speech volume is changed
`based on the stored reference volume of the user who is a conversation partner.
`[Claim 16]
`The speech control method according to Claim 10,
wherein the second step changes another speech form by a maximum amount when the changed speech volume
exceeds a predetermined threshold value.
`[Claim 17]
`A robot device having a function of interacting with a user, comprising:
`a distance detection means for detecting a distance to the user who is a conversation partner;
`and a speech form change means for changing an utterance form during a conversation with the user as necessary,
`according to the distance to the user detected by the distance detection means.
`
`
`
`
`
`
`
`(4)
`
`
`JP 2005-202076 A 2005.7.28
`
`[Claim 18]
`The robot device according to Claim 17,
`characterized in that the speech form change means
`changes the speech volume as the speech form.
`[Claim 19]
`The robot device according to Claim 17,
`characterized in that the speech form change means
`changes the speech speed as the speech form.
`[Claim 20]
`The robot device according to Claim 17,
` characterized in that the speech form change means
` changes the speech intonation as the speech form.
`[Claim 21]
`The robot device according to Claim 20, characterized by comprising:
`a character string output means for outputting a character string corresponding to a content to be spoken by the
`device;
` a speech means for generating a speech signal of synthetic voice corresponding to the character string;
and a speaker for outputting speech based on the speech signal,
wherein the speech form change means transforms the character string
output from the character string output means
as a method of changing the speech intonation.
`[Claim 22]
`The robot device according to Claim 17,
` characterized in that the speech form change means
`changes the space between phrases as the speech form.
`[Claim 23]
`The robot device according to Claim 18, further comprising:
` a user identification means for identifying the user who is the conversation partner;
` a memory means for storing a reference volume for each of the users;
` and a reference volume change means for changing the reference volume of the user stored in the memory means in
`response to a request from the user during conversation,
wherein the speech form change means changes the speech volume
based on the reference volume of the user who is the conversation partner
stored in the memory means.
`[Claim 24]
`The robot device according to Claim 18,
wherein the speech form change means changes another speech form by a maximum amount when the changed
speech volume exceeds a predetermined threshold value.
`[Detailed Description of the Present Invention]
`[Technical Field]
`[0001]
The present invention relates to a speech control device and method, and a robot device, which are suitable for
application to, for example, an entertainment robot.
`[Background art]
`[0002]
`In recent years, many companies and research institutions such as universities have been developing humanoid robots.
`Such humanoid robots are equipped with external sensors such as Charge Coupled Device (CCD) cameras, microphones,
`and touch sensors, and internal sensors such as battery sensors and acceleration sensors, and are designed to recognize
`the external and internal conditions based on the output of these external sensors and internal sensors, and to act
`autonomously based on the recognition results (see, for example, Non-Patent Literature 1).
`
`
`
`
`
`(5)
`
`
`JP 2005-202076 A 2005.7.28
`
`
`[0003]
In recent years, many entertainment robots have also been equipped with speech recognition and dialogue control
functions to enable simple daily conversations with users.
`[Non-Patent Literature 1] Japanese Patent Application 2003-270835
`[Disclosure of the Invention]
`[Problem to be Solved by the Invention]
`[0004]
However, conventional robots equipped with such speech recognition and dialogue control functions are constructed to
always converse with the user at a constant, preset speech volume, regardless of the physical distance between the robot
and the user.
`[0005]
For this reason, depending on the setting of the speech volume, a volume appropriate for a user close to the robot may
be too low for a user some distance away from the robot to hear the speech content, or conversely, a volume appropriate
for a user some distance away from the robot may be too loud for a user close to the robot.
`[0006]
One method for solving such problems is to make it possible to freely change the speech volume of the entertainment
robot by operating a switch. However, this method has the problem of impairing the naturalness of the interaction
between the user and the robot, and moreover, having to set the speech volume each time is extremely inconvenient for
the user.
`[0007]
`Also, when the surrounding environment, such as a room with a lot of reverberation, is taken into consideration,
`simply increasing the robot's speech volume does not necessarily make it easier for users who are far from the robot to
`hear the robot's speech.
`[0008]
`Such a situation in which the content of a robot's speech is difficult for the user to hear is a factor that hinders smooth
`and natural dialogue between the user and the robot, and impairs the entertainment value of the robot having a dialogue
`control function, and some kind of solution is desired.
`[0009]
`The present invention has been made in consideration of the above points, and aims to propose a speech control
`device and method, and a robot device, that can improve the entertainment value of a robot having a dialogue control
`function.
`[Means for Solving the Problem]
`[0010]
In order to solve such a problem, the present invention provides, in a speech control device that controls the speech of
a device during a dialogue between the device and a user, a speech form change means that changes the speech form of
the device during a dialogue with the user as necessary according to the distance between the device and the user.
`[0011]
`As a result, the speech control device can always speak in a speech form that is easy for the user to hear, and can have
`a smooth dialogue with the user.
`[0012]
`Also, in the present invention, in the speech control method that controls the speech of the device during a dialogue
`between the device and a user, the speech form of the device during a dialogue with the user is changed as necessary
`according to the distance between the device and the user.
`
`
`
`
`
`(6)
`
`
`JP 2005-202076 A 2005.7.28
`
`[0013]
`As a result, according to this speech control method, the robot device can always speak in a speech form that is easy
`for the user to hear, and therefore a smooth dialogue with the user can be carried out.
`[0014]
`Furthermore, in the present invention, the robot device is provided with a speech form changing means for changing
`the speech form when interacting with the user as necessary, depending on the distance to the user.
`[0015]
`As a result, the robot device can always speak in a speech form that is easy for the user to hear, and can thus have a
`smooth dialogue with the user.
`[Effect of the Invention]
`[0016]
`According to the present invention, in a speech control device and method for controlling the speech of a device
`during a dialogue between the device and a user, the speech form of the device during dialogue with the user is changed
`as necessary according to the distance between the device and the user, so that the device can always speak in a speech
`form that is easy for the user to hear, allowing for smooth dialogue with the user, and thus improving entertainment
`value.
`[0017]
`In addition, according to the present invention, a robot device is provided with a speech style change means for
`changing the speech style when interacting with a user as necessary depending on the distance to the user, so that the
`robot device can always speak in a style that is easy for the user to hear, allowing for smooth interaction with the user
`and thus realizing a robot device that can improve entertainment value.
`[Best Embodiments of the Invention]
`[0018]
`One embodiment of the present invention will be described in detail below with reference to the drawings.
`[0019]
`(1) Constitution of the robot according to this embodiment
`(1-1) Hardware constitution of robot 1
`In FIG. 1, 1 indicates the robot according to this embodiment as a whole, which is configured by attaching a head unit
`4 to the upper part of a body unit 2 via a neck joint 3, attaching arm units 5A and 5B to the upper left and right parts of
`the body unit 2 via shoulder joints 4A and 4B, respectively, and attaching a pair of leg units 7A and 7B to the lower part
`of the body unit 2 via hip joints 6A and 6B, respectively.
`[0020]
`FIG. 2 schematically illustrates a functional constitution of this robot 1. As shown in FIG. 2, the robot 1 is configured
`with a control unit 10 for controlling overall operation and other data processing, an input/output unit 11, a drive unit 12,
`and a power supply unit 13.
`[0021]
`The input/output unit 11 includes, as input units, a pair of CCD (Charge Coupled Device) cameras 20 corresponding
`to the robot 1's eyes, a pair of microphones 21 corresponding to the ears, touch sensors 22 arranged on the head, hands,
`soles of the feet, etc., to detect physical actions from the user, contact between the hands and external objects, and the
`soles of the feet touching the ground, as well as various other sensors corresponding to the five senses.
`[0022]
The input/output unit 11 is also equipped with, as output units, a speaker 23 equivalent to the mouth of the robot 1,
and LEDs (eye lamps) 24 that form facial expressions through combinations and timings of lighting up and going out.
These output units can express feedback from the robot 1 to the user in forms other than mechanical motion patterns
using the legs, such as voice and flashing lights.
`
`
`
`
`
`(7)
`
`
`JP 2005-202076 A 2005.7.28
`
`[0023]
`The drive unit 12 is a functional block that realizes the body movement of the robot 1 according to a predetermined
`motion pattern commanded by the control unit 10, and is the object to be controlled by behavior control. The drive unit
12 is a functional module for realizing the degrees of freedom at each joint of the robot 1, and is composed of multiple
drive units 25₁ to 25ₙ provided for each axis, such as roll, pitch, and yaw, at each joint. Each drive unit 25₁ to 25ₙ
consists of a combination of a motor 26₁ to 26ₙ that performs rotational motion around a given axis, an encoder 27₁ to
27ₙ that detects the rotational position of the motor, and a driver 28₁ to 28ₙ that adaptively controls the rotational
position and rotational speed of the motor based on the output of the encoder.
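
Editor's note: as an illustration of this feedback arrangement, here is a minimal Python sketch of a driver adjusting its motor from encoder feedback. The proportional-control form and the gain value are assumptions; the patent only says the control is adaptive and based on the encoder output.

```python
# Minimal sketch of a driver (28_i) controlling a motor (26_i) from the
# reading of its encoder (27_i). Gains and the control law are assumed.

def driver_step(target_angle: float, encoder_angle: float,
                gain: float = 0.5) -> float:
    """Return a motor velocity command from the encoder error."""
    error = target_angle - encoder_angle   # encoder reading vs. target
    return gain * error                    # command sent to the motor


# Toy simulation of one joint axis converging on a commanded angle.
angle = 0.0
for _ in range(20):
    command = driver_step(target_angle=30.0, encoder_angle=angle)
    angle += command                       # motor integrates the command
print(round(angle, 2))                     # approaches 30.0
```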
`[0024]
`The power supply unit 13 is literally a functional module that supplies power to each electric circuit in the robot 1.
`The robot 1 according to this embodiment is an autonomous type that uses a battery, and the power supply unit 13 is
composed of a charging battery 29 and a charge/discharge control unit 30 that manages the charge/discharge state of the
charging battery 29.
`[0025]
`The charging battery 29 is composed, for example, in the form of a "battery pack" in which multiple lithium-ion
`secondary battery cells are packaged in a cartridge format.
`[0026]
In addition, the charge/discharge control unit 30 measures the terminal voltage, charge/discharge current, and ambient
temperature of the charging battery 29 to grasp the remaining capacity of the battery 29 and determine the start and end
times of charging. The charging start and end times determined by the charge/discharge control unit 30 are notified to
the control unit 10 and serve as triggers for the robot 1 to start and end its charging operation.
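
Editor's note: the following is a minimal sketch of the trigger logic this paragraph describes, assuming simple remaining-capacity thresholds; the thresholds and the function name are illustrative assumptions, not values from the patent.

```python
# Sketch of the charge/discharge trigger in [0024]-[0026]: the unit
# estimates remaining capacity and notifies the control unit when
# charging should start or end. Thresholds are assumed.

START_CHARGE_BELOW = 0.20   # assumed remaining-capacity fraction
STOP_CHARGE_ABOVE = 0.95


def charge_trigger(remaining_capacity: float, charging: bool) -> str | None:
    """Return 'start'/'end' when a trigger should be sent, else None."""
    if not charging and remaining_capacity < START_CHARGE_BELOW:
        return "start"   # tell control unit 10 to begin charging behavior
    if charging and remaining_capacity > STOP_CHARGE_ABOVE:
        return "end"     # tell control unit 10 to end charging behavior
    return None


print(charge_trigger(0.15, charging=False))  # -> 'start'
print(charge_trigger(0.97, charging=True))   # -> 'end'
```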
`[0027]
The control unit 10 corresponds to the “brain” of the robot 1 and is mounted, for example, in the head unit 4 or the
body unit 2. As shown in FIG. 3, the control unit 10 is configured with a CPU (Central Processing Unit) 31 as the main
controller, which is connected to memory, other circuit components, and peripheral devices via a bus 37. The bus 37 is a
common signal transmission path including a data bus, an address bus, a control bus, and the like. Each device on the
bus 37 is assigned a unique address (a memory address or an I/O address), which the CPU 31 uses to communicate with
it.
`[0028]
RAM (Random Access Memory) 32 is a writable memory composed of volatile memory such as DRAM (Dynamic
`RAM), and is used to load program code executed by the CPU 31 and to temporarily store work data by the executed
`program.
`[0029]
`Read Only Memory (ROM) 33 is a read-only memory that permanently stores programs and data. The program code
`stored in the ROM 33 includes a self-diagnostic test program that is executed when the robot 1 is powered on, and a
`control program that specifies the operation of the robot 1.
`[0030]
`The control program for the robot 1 includes a "sensor input/recognition processing program" that processes input
`from various sensors such as the CCD camera 20 and microphone 21 and recognizes them as symbols, a "behavior
`control program" that controls the behavior of the robot 1 based on the sensor input and a predetermined behavior
`control model while managing memory operations such as short-term memory, and a "drive control program" that
controls the drive of each joint motor and the voice output of the speaker 23 according to the behavior control model.
`[0031]
`The non-volatile memory 34 is composed of a memory element that can be electrically erased and rewritten, such as
`an EEPROM (Electrically Erasable and Programmable ROM), and is used to store data that needs to be updated in a
`non-volatile manner. Examples of data that needs to be updated include secret keys, other security information, and
`device control programs that need to be installed after shipment.
`
`
`
`
`
`(8)
`
`
`JP 2005-202076 A 2005.7.28
`
`
`[0032]
The interface 35 is an apparatus for interconnecting with equipment outside the control unit 10 and enabling data
exchange. The interface 35, for example, inputs and outputs data between the CCD camera 20, microphone 21, and
speaker 23 in the input/output unit 11. The interface 35 also inputs and outputs data and commands between the drivers
28₁ to 28ₙ in the drive unit 12.
`[0033]
`Also, the interface 35 may be provided with a general-purpose interface for connecting peripheral devices of a
`computer, such as a serial interface such as RS (Recommended Standard)-232C, a parallel interface such as IEEE
`(Institute of Electrical and Electronics Engineers) 1284, a USB (Universal Serial Bus) interface, an i-Link (IEEE 1394)
`interface, a SCSI (Small Computer System Interface) interface, a memory card interface (card slot) that accepts a PC
`card or a memory stick, etc., so that programs and data can be transferred between the computer and a locally connected
`external device. In addition, as another example of the interface 35, an infrared communication (IrDA) interface may be
`provided to perform wireless communication with external devices.
`[0034]
`Furthermore, the control unit 10 includes a wireless communication interface 36 and a network interface card (NIC)
`38, and can communicate data with various external host computers via short-distance wireless data communication
such as Bluetooth, a wireless network such as IEEE 802.11b, or a high-bandwidth network such as the Internet.
`[0035]
`By using such data communication between the robot 1 and the host computer, it is possible to use remote computer
`resources to calculate the complex motion control of the robot 1 and to remotely control it.
`[0036]
`(1-2) Software constitution of robot 1
`FIG. 4 schematically illustrates a functional constitution of a behavior control system 40 of a robot 1 configured by a
`group of control programs stored in a ROM.
`[0037]
This behavior control system 40 is implemented using object-oriented programming. In this case, each piece of
software is handled as a module called an "object" that integrates data with the processing procedures for that data. In
addition, the objects can exchange data and invoke one another by means of inter-object communication using message
passing and shared memory.
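
Editor's note: purely as an illustration of such inter-object message passing, here is a minimal Python sketch; the MessageBus class and topic names are assumed stand-ins for whatever mechanism the implementation actually uses.

```python
# Sketch of object-to-object message communication per [0037]: objects
# publish data, and subscribing objects receive it. All names assumed.

class MessageBus:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        for handler in self._subscribers.get(topic, []):
            handler(payload)


bus = MessageBus()
# e.g. the short-term memory object consumes recognition results:
bus.subscribe("speech_recognition", lambda r: print("STM stored:", r))
bus.publish("speech_recognition", {"text": "hello", "speaker_id": 7})
```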
`[0038]
`Here, the behavior control system 40 includes an image recognition unit 41, a speech recognition unit 42, and a
`contact recognition unit 43 for recognizing an external environment based on sensor output from various sensors such
`as a CCD camera 20 (FIG. 2), a microphone 21 (FIG. 2), and a touch sensor 22 (FIG. 2).
`[0039]
The image recognition unit 41 performs image recognition processing and feature extraction, such as facial
recognition and color recognition, based on the image signal S1 provided by the CCD camera 20. The image recognition
unit 41 then sends the various image recognition results, such as facial recognition information (the person's unique face
ID (identifier) and the position and size of the facial image region) and color recognition information (the position, size,
and features of each color region), to the short-term memory unit 44 together with the image signal S1. The image
recognition unit 41 also detects the distance to the imaged object by means of a so-called stereo vision method based on
the image signal S1 from the CCD camera 20, and sends the detection results to the short-term memory unit 44.
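
Editor's note: as an illustration, here is a minimal sketch of the kind of record the image recognition unit 41 might hand to the short-term memory unit 44, combining a face ID, the facial image region, and a stereo-vision distance estimate. The field names and the stereo parameters are assumptions, not taken from the patent; only the depth = f·B/disparity relation is the standard stereo range equation.

```python
# Hypothetical record passed from image recognition (unit 41) to
# short-term memory (unit 44). Field names are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class ImageRecognitionResult:
    face_id: int                          # person's unique face ID
    region: tuple[int, int, int, int]     # x, y, width, height of face region
    distance_m: float                     # stereo-vision distance estimate


def stereo_distance(focal_px: float, baseline_m: float,
                    disparity_px: float) -> float:
    """Classic stereo range equation: depth = f * B / disparity."""
    return focal_px * baseline_m / disparity_px


result = ImageRecognitionResult(
    face_id=7,
    region=(120, 80, 64, 64),
    distance_m=stereo_distance(focal_px=700.0, baseline_m=0.06,
                               disparity_px=21.0),
)
print(result)   # distance_m = 2.0
```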
`
`
`
`
`
`
`
`
`
`
`
`(9)
`
`JP 2005-202076 A 2005.7.28
`
`[0040]
`
The speech recognition unit 42 performs various sound-related recognition processes, such as speech recognition,
speaker recognition, and sound source direction recognition, based on the speech signal S2 provided by the microphone
21. The speech recognition unit 42 then sends various speech recognition results, such as character string information of
the recognized words (the speech recognition result), speaker ID information unique to the speaker (the result of speaker
recognition processing based on acoustic features), and sound source direction information (the sound source direction
recognition result), to the short-term memory unit 44. In addition, the speech recognition unit 42 sends the speech signal
S2 to the short-term memory unit 44 along with these various speech recognition results.
`
`[0041]
`
Furthermore, the contact recognition unit 43 recognizes physical contact with the outside, such as "being stroked,"
"being hit," "grasping an object," and "the sole of the foot touching the ground," based on the pressure detection signals
S3 provided by the touch sensors 22 arranged on the upper part of the head unit 4 (FIG. 1), the hands at the tips of the
arm units 5A and 5B (FIG. 1), and the soles at the bottoms of the leg units 7A and 7B (FIG. 1), and sends these contact
recognition results to the short-term memory unit 44. The contact recognition unit 43 also sends the pressure detection
signal S3 from each touch sensor 22 to the short-term memory unit 44 in conjunction with these contact recognition
results.
`
`[0042]
`
The short-term memory unit 44 is an object that holds information about the external environment of the robot 1 for a
relatively short period of time, and receives and briefly stores the various image recognition results and image signal S1
provided by the image recognition unit 41, the various speech recognition results and speech signal S2 provided by the
speech recognition unit 42, and the various contact recognition results and pressure detection signals S3 provided by the
contact recognition unit 43.
`
`[0043]
`
In addition, the short-term memory unit 44 uses the received image recognition results, speech recognition results, and
contact recognition results, together with the image signal S1, speech signal S2, and each pressure detection signal S3, to
coordinate the facial image region, person ID, speaker ID, character string information, and the like, and thereby
generates target information and event information about where a person currently is, whose words were spoken, and
what kind of interaction has taken place with that person, and sends this information to the behavior selection control
unit 45.
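
Editor's note: the following is a minimal sketch of this coordination step, fusing a face result and a speech result into target and event information. All field names are illustrative assumptions.

```python
# Sketch of the fusion in [0043]: short-term memory (unit 44) ties the
# face region, person ID, speaker ID, and recognized string together.

def coordinate(face_result: dict, speech_result: dict) -> tuple[dict, dict]:
    target_info = {
        "person_id": face_result["person_id"],
        "position": face_result["region"],        # where the person is now
    }
    event_info = {
        "speaker_id": speech_result["speaker_id"],
        "utterance": speech_result["text"],       # what was said
        "addressed_person": face_result["person_id"],
    }
    return target_info, event_info


target, event = coordinate(
    {"person_id": 7, "region": (120, 80, 64, 64)},
    {"speaker_id": 7, "text": "good morning"},
)
print(target, event)
```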
`
`[0044]
`
Based on the target information and event information provided by the short-term memory unit 44 and the contents
stored in the short-term memory unit 44, the behavior selection control unit 45 determines the next behavior of the robot
1 from among a plurality of previously prepared behaviors: a behavior selected according to the current situation and the
previous behavior of the robot 1 (situation-dependent behavior), a reflexive behavior in response to an external stimulus
(reflexive behavior), or an action based on a relatively long-term action plan responding to a given situation or a
command from the user (deliberative behavior). The behavior selection control unit 45 then notifies the output
management unit 46 of the behavior determined in this way.
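
Editor's note: as an illustration of arbitration among these behavior classes, here is a minimal sketch using a fixed priority ordering (reflexive over situation-dependent over deliberative). That ordering is an assumption; the patent does not specify how competing behaviors are ranked.

```python
# Sketch of behavior arbitration per [0044]-[0045]. Priorities assumed.

PRIORITY = {"reflexive": 0, "situation_dependent": 1, "deliberative": 2}


def select_behavior(candidates: list[tuple[str, str]]) -> str:
    """Pick the highest-priority (kind, action) candidate."""
    kind, action = min(candidates, key=lambda c: PRIORITY[c[0]])
    return action


print(select_behavior([
    ("deliberative", "walk to charger"),
    ("reflexive", "brace for impact"),
]))  # -> 'brace for impact'
```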
`
`[0045]
`
In response to the notification from the behavior selection control unit 45, the output management unit 46 performs
arbitration processing when multiple behaviors, such as situation-dependent behaviors and reflexive behaviors, compete,
and processing to synchronize movement, speech, and the blinking of the LED 24, while driving the motors 26₁ to 26ₙ of
the corresponding drive units 25₁ to 25ₙ and driving the LED 24 to blink in a predetermined pattern.
`
`[0046]
`
In addition, if the behavior selection control unit 45 decides to interact with the user as the next action, it thereafter
continuously monitors the results of the speech recognition unit 42's recognition of the user's speech, which are stored
sequentially in the short-term memory unit 44, and sequentially determines the content to be spoken by the robot 1 based
on these speech recognition results. The behavior selection control unit 45 then reads the necessary character string from
the speech string database 47 stored in advance in the ROM 33 (FIG. 3) based on this determination result, and sends it
to the output management unit 46.
`
`
`
`(10)
`
`
`JP 2005-202076 A 2005.7.28
`
`[0047]
At this time, the output management unit 46 sends the character string provided by the behavior selection control unit
45 to the speech synthesis unit 48, and the speech synthesis unit 48 generates a speech signal S4 of synthetic speech
based on the supplied character string and sends it to the speaker 23 (FIG. 2). As a result, speech based on this speech
signal S4 is output from the speaker 23.
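
Editor's note: this output path can be sketched in a few lines of Python. The synthesis call below is a dummy placeholder, not the actual synthesizer; only the unit numbering (45, 46, 48, speaker 23) follows the text above.

```python
# Sketch of the speech output path in [0046]-[0047]: character string ->
# output management (unit 46) -> speech synthesis (unit 48) -> speaker 23.

def speech_synthesis(text: str) -> bytes:
    # Stand-in for waveform generation (speech signal S4).
    return f"<waveform for: {text}>".encode()


def output_management(text: str, play) -> None:
    signal_s4 = speech_synthesis(text)   # unit 48
    play(signal_s4)                      # speaker 23


output_management("hello", play=lambda s: print("speaker:", s.decode()))
```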
`[0048]
In this way, the robot 1 can act autonomously based on external conditions and the like recognized from the sensor
outputs of the various sensors such as the CCD camera 20, microphone 21, and touch sensor 22.
`[0049]
`(2) Speech control function in robot 1
`Next, the speech control function installed in this robot 1 will be described.
`[0050]
`The robot 1 is equipped with a speech control function that controls speech forms such as speech volume, speech
`speed, intonation, and intervals between phrases according to the physical distance to the conversation partner. In
`addition, the robot 1 is equipped with a speech control function that changes the speech volume to be used as a standard
`(hereinafter referred to as the reference volume) for each user in response to a request from the user, taking into
`consideration the large individual differences in the sense of speech forms, especially speech volume.
`[0051]
`In practice, in the case of the robot 1, as a means for performing speech control that changes the reference volume for
`each user, the behavior control system 40 is provided with a reference volume memory unit 49 for storing and holding
`the reference volume for each user. Incidentally, this reference volume memory unit 49 is configured from the non-
`volatile memory 34 (FIG. 3).
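
Editor's note: the per-user reference volume store can be sketched as follows; as paragraph [0052] below notes, the initial setting value is "3". The class and method names are illustrative assumptions, with a dict standing in for the non-volatile reference volume memory unit 49.

```python
# Sketch of the reference volume store of [0051]-[0052]: person IDs map
# to a reference volume (initial value 3), changeable on user request.

DEFAULT_REFERENCE_VOLUME = 3


class ReferenceVolumeMemory:
    def __init__(self):
        self._volumes: dict[int, int] = {}

    def register(self, person_id: int) -> None:
        # New user detected: store the initial setting value "3".
        self._volumes.setdefault(person_id, DEFAULT_REFERENCE_VOLUME)

    def get(self, person_id: int) -> int:
        return self._volumes.get(person_id, DEFAULT_REFERENCE_VOLUME)

    def change(self, person_id: int, new_volume: int) -> None:
        # Invoked by the reference volume change means on user request.
        self._volumes[person_id] = new_volume


mem = ReferenceVolumeMemory()
mem.register(7)
mem.change(7, 5)
print(mem.get(7))  # -> 5
```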
`[0052]
Then, each time the short-term memory unit 44 detects a new user based on the image recognition result of the image
recognition unit 41, the behavior selection control unit 45 associates the person ID of the user with the
`reference volume for that user (the initial setting value is "3") as shown in FIG. 5, and stores the person ID and the
`reference volume in the reference volume memory unit 49. Meanwhile, during subsequent conversations with the user,
`the reference volume for the user stored and held in the reference volume memory unit 49 is c