`(12) Patent Application Publication (10) Pub. No.: US 2009/0279714 A1
`KIM et al.
`(43) Pub. Date: Nov. 12, 2009
`
`US 20090279714A1
`
`(54) APPARATUS AND METHOD FOR
`LOCALIZING SOUND SOURCE IN ROBOT
`
`(75) Inventors: Hyun-Soo KIM, Yongin-si (KR);
`Song-Suk Yook, Seoul (KR);
`Young-Kyu Cho, Seoul (KR);
`Woo-Jin Choi, Seoul (KR)
`Correspondence Address:
`THE FARRELL LAW FIRM, LLP
`290 Broadhollow Road, Suite 210E
`Melville, NY 11747 (US)
`(73) Assignees: SAMSUNG ELECTRONICS
`CO., LTD., Suwon-si (KR);
`KOREA UNIVERSITY
`RESEARCH AND BUSINESS
`FOUNDATION, Seoul (KR)
`
`(21) Appl. No.: 12/436,434
`
`(22) Filed: May 6, 2009
`
`(30) Foreign Application Priority Data
`
`May 6, 2008 (KR) ........................ 10-2008-0041786
`
`Publication Classification
`
`(51) Int. Cl.
`H04R 3/00 (2006.01)
`(52) U.S. Cl. ............................................... 381/92; 901/1
`
`(57) ABSTRACT
`An apparatus and method for localizing a sound source in a
`robot are provided. The apparatus includes a microphone unit
`implemented by one or more microphones, which picks up a
`sound from a three-dimensional space. The apparatus also
`includes a sound source localizer for determining a position
`of the sound source in accordance with Time-Difference Of
`Arrivals (TDOAs) and a highest power of the sound picked up
`by the microphone unit. Thus, the robot can rapidly and
`accurately localize the sound source in the three-dimensional
`space with minimum dead space, using a minimum number of
`microphones.
`
`[Representative drawing not reproduced; surviving label: DEAD SPACE]
`
`Page 1 of 15
`
`GOOGLE EXHIBIT 1012
`
`
`
`[Sheet 1 of 8: FIG. 1. Drawing not reproduced; surviving label: SOURCE.]
`
`
`
`[Sheet 2 of 8: FIG. 3. Block diagram, drawing not reproduced. Blocks: robot 100; microphone unit 110; sound source localizer 120; controller 130; camera 140; drive motor 150.]
`
`
`
`[Sheet 3 of 8: FIG. 4A. Drawing not reproduced; surviving reference numeral: 111.]
`
`
`
`[Sheet 4 of 8: drawing not reproduced.]
`
`
`
`[Sheet 5 of 8: FIG. 6. Block diagram, drawing not reproduced. Blocks: sound source localizer 120; first algorithm processor 121; second algorithm processor 122; sound source position determiner 123.]
`
`
`
`[Sheet 6 of 8: FIG. 7. Drawing not reproduced.]
`
`
`
`[Sheet 7 of 8: FIG. 8. Flowchart, drawing not reproduced. Steps:]
`S100: Dispose four microphones at corners of imaginary tetrahedron
`S110: Determine sound source direction using GCC-PHAT algorithm
`S120: Determine sound source position in three-dimensional space using SRP-PHAT algorithm
`S130: Direct robot's view toward sound source position
`
`
`
`[Sheet 8 of 8: FIG. 9. Flowchart, drawing not reproduced. Steps:]
`START
`S111: Are directions calculated from three pairs of microphones same?
`If not the same: Determine as sound source direction two of three directions calculated from three pairs of microphones, then perform SRP-PHAT algorithm on three-dimensional space in two corresponding directions
`S112 (if the same): Determine corresponding direction as sound source direction
`S121: Perform SRP-PHAT algorithm on three-dimensional space in corresponding direction
`Determine three-dimensional coordinates of point of highest power as sound source position
`
`
`
`
`APPARATUS AND METHOD FOR
`LOCALIZING SOUND SOURCE IN ROBOT
`
`PRIORITY
`[0001] This application claims priority under 35 U.S.C.
`§119(a) to an application entitled “APPARATUS AND
`METHOD FOR LOCALIZING SOUND SOURCE IN
`ROBOT” filed in the Korean Intellectual Property Office on
`May 6, 2008 and assigned Serial No. 2008-0041786, the
`contents of which are incorporated herein by reference.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`[0002] The present invention relates generally to an
`apparatus and method for localizing a sound source in a robot, and
`more particularly, to an apparatus and method for enabling a
`miniaturized robot to rapidly and exactly localize a sound
`source in three-dimensional space with minimum dead space
`and using a minimum number of microphones.
`[0003] 2. Description of the Related Art
`[0004] Utility robots that act as partners to human beings
`and assist in daily life, including various human activities
`outside of the home, are currently being developed. Unlike
`industrial robots, utility robots are built like human beings,
`move like human beings in human living environments, and
`thus are referred to as humanoid robots (herein referred to as
`“robots”).
`[0005] In general, a robot walks with two legs (or moves
`using two wheels) and has a plurality of joints and drive
`motors, which drive the joints, to move its hands, arms, neck,
`legs, etc., like human beings. For example, 41 joint drive
`motors are installed in Hubo, a humanoid robot developed by
`Korea Advanced Institute of Science and Technology
`(KAIST) in December 2004, and drive respective joints.
`[0006] Drive motors of a robot are generally separately
`controlled. To control the drive motors, a plurality of motor
`drivers, each of which controls at least one of the drive motors,
`are installed in the robot and controlled by a control computer
`installed inside or outside of the robot.
`[0007] As robots are developed to be more humanlike,
`technology has also been developed that enables users to
`communicate with the robots, for example, to issue verbal orders.
`[0008] If a robot looks away from a user while the user is
`communicating with the robot, the user may not feel satisfied
`with the communication. Thus, the robot needs to localize the
`user, i.e., the sound source, in order to look in the direction of
`the user.
`[0009] In general, sound source localization methods are
`classified into the following types:
`[0010] 1) Methods of localizing a sound source by
`maximizing steered power of a beamformer, 2) Methods of
`localizing a sound source on the basis of high-resolution spectrum
`estimation, and 3) Methods of localizing a sound source using
`difference in sound arrival times at a plurality of sensors, i.e.,
`Time-Difference Of Arrivals (TDOAs) between sensors.
`[0011] A representative method of localizing a sound
`source by maximizing steered power of a beamformer is a
`Steered Response Power (SRP) algorithm, which is described
`in detail in “A High-Accuracy, Low-Latency Technique for
`Talker Localization in Reverberant Environments Using
`Microphone Arrays” written by J. Dibiase and published in
`2000.
`[0012] A representative method of localizing a sound
`source on the basis of high-resolution spectrum estimation is
`a Multiple Signal Classification (MUSIC) algorithm, which
`is described in detail in “Adaptive Eigenvalue Decomposition
`Algorithm for Passive Acoustic Source Localization” written
`by J. Benesty and published in 2000.
`[0013] A representative method of localizing a sound
`source using TDOAs between sensors is a Generalized
`Cross-Correlation (GCC) algorithm, which is described in detail in
`“The Generalized Correlation Method for Estimation of Time
`Delay” written by C. H. Knapp and G. C. Carter and
`published in 1976.
`[0014] As one of the various algorithms for localizing a
`sound source, a GCC-Phase Transform (PHAT) algorithm,
`which is a GCC algorithm employing a PHAT filter, involves
`a relatively small amount of computation, making it
`possible to localize a sound source in real time. An
`SRP-PHAT algorithm, which is an SRP algorithm employing a
`PHAT filter, is a grid search method of dividing a whole space
`into blocks and searching each block for a sound source.
`However, the SRP-PHAT algorithm involves a large amount
`of computation. Thus, the SRP-PHAT algorithm is difficult to
`use in real time but has better sound source localization
`performance than the GCC-PHAT algorithm.
`[0015] The PHAT filter is described in detail in “Use of The
`Crosspower-Spectrum Phase in Acoustic Event Location”
`written by M. Omologo and P. Svaizer and published in 1997.
`[0016] FIG. 1 illustrates a microphone array for localizing
`a sound source in three-dimensional space using the
`GCC-PHAT algorithm. As illustrated in FIG. 1, to localize a sound
`source in a three-dimensional space using the GCC-PHAT
`algorithm, at least eight microphones 10 must be arranged in
`the form of a cube, that is, at the corners of the cube.
`[0017] More specifically, to localize a sound source in a
`three-dimensional space using the GCC-PHAT algorithm, the
`position of the sound source must be searched for in all
`directions (up, down, forward, backward, left and right) from
`the robot. Thus, the sound source is localized using TDOAs
`between the microphones 10 diagonally disposed in each
`square surface of the cube.
`[0018] In a method of localizing a sound source in a
`three-dimensional space using the SRP-PHAT algorithm, the
`positions of the microphones 10 are not restricted.
`[0019] As mentioned above, the SRP-PHAT algorithm
`divides the whole space in all directions from the robot into
`blocks, searches each block for a sound source, and thus
`involves a larger amount of computation than the GCC-PHAT
`algorithm. Thus, the SRP-PHAT algorithm is difficult to use
`to localize a sound source in real time but has excellent sound
`source localization performance in a three-dimensional
`space.
`[0020] The general GCC-PHAT algorithm using the eight
`microphones 10 as illustrated in FIG. 1 can accurately
`localize a sound source in a three-dimensional space. However,
`since eight or more microphones are necessary, it is difficult
`to use the general GCC-PHAT algorithm in a miniaturized
`robot, such as a mini robot.
`[0021] In order to apply the GCC-PHAT algorithm using
`the minimum number of microphones, four microphones 10
`may be disposed in a plane as illustrated in FIG. 2. However,
`when the four microphones 10 are disposed in a rectangular
`form, a sound source to the front, back, left or right can be
`localized but a sound source disposed above or below cannot.
`For a mini robot, this drawback is not a serious problem
`because of its small height. But the larger the robot and the
`higher the position of the microphones 10, the greater the dead
`space in which a sound source cannot be localized.
`[0022] The method of localizing a sound source using the
`SRP-PHAT algorithm does not limit the positions of
`microphones and has better performance than the method using the
`GCC-PHAT algorithm. But the method using the SRP-PHAT
`algorithm involves too much computation to process in a
`real-time system, and thus, it is difficult to apply the method
`to a miniaturized robot.
`[0023] The sound source localization method of a
`miniaturized robot must be able to minimize the number of
`microphones used, minimize a dead space in sound source direction
`estimation, and rapidly and accurately localize the sound
`source in three-dimensional space.
`
`SUMMARY OF THE INVENTION
`[0024] The present invention has been made to address at
`least the above problems and/or disadvantages and to provide
`at least the advantages described below. Accordingly, an
`aspect of the present invention provides an apparatus and
`method of a robot for localizing a sound source in
`three-dimensional space using a minimum number of microphones.
`[0025] Another aspect of the present invention provides a
`hybrid sound source localization apparatus and method of a
`robot rapidly determining the direction of a sound source
`using a Generalized Cross-Correlation (GCC)-Phase
`Transform (PHAT) algorithm and accurately localizing the sound
`source in the sound source direction using a Steered Response
`Power (SRP)-PHAT algorithm.
`[0026] An additional aspect of the present invention
`provides a sound source localization apparatus and method of a
`robot appropriately disposing and installing a plurality of,
`e.g., four, microphones for localizing a sound source and
`minimizing a dead space in which a sound source cannot be
`localized.
`[0027] According to one aspect of the present invention, an
`apparatus is provided for localizing a sound source in a robot.
`The apparatus comprises a microphone unit implemented by
`one or more microphones, which picks up sound from a
`three-dimensional space. The apparatus also comprises a
`sound source localizer for determining a position of the sound
`source in accordance with Time-Difference Of Arrivals
`(TDOAs) and a highest power of the sound picked up by the
`microphone unit.
`[0028] In the microphone unit, four microphones may be
`disposed at corners of an imaginary tetrahedron.
`[0029] The sound source localizer may determine a
`direction of the sound source using a first algorithm in accordance
`with the TDOAs between the microphones, and may
`determine one of three directions from the robot as the direction of
`the sound source using a GCC-PHAT algorithm in
`accordance with the TDOAs of respective pairs of the
`microphones.
`[0030] The sound source localizer may determine two
`directions calculated from three pairs of the microphones as
`the direction of the sound source when the directions
`calculated in accordance with the TDOAs of the three pairs of the
`microphones are not the same, and may determine the
`position of the sound source in the three-dimensional space in the
`direction of the sound source using a second algorithm when
`the direction of the sound source is determined.
`[0031] The sound source localizer may determine as the
`position of the sound source a point of highest power in the
`three-dimensional space in the direction of the sound source
`using an SRP-PHAT algorithm.
`[0032] The sound source localizer may include a first
`algorithm processor for determining a direction of the sound
`source according to the TDOAs between the microphones
`using a GCC-PHAT algorithm. The sound source localizer
`may also include a second algorithm processor for
`determining a point of highest power in the three-dimensional space in
`the direction of the sound source determined by the first
`algorithm processor using an SRP-PHAT algorithm. The
`sound source localizer may further include a sound source
`position determiner for determining as the position of the
`sound source three-dimensional coordinates of the point
`determined by the second algorithm processor to have highest
`power.
`[0033] The robot may include a camera for taking an image
`in a view direction of the robot, a plurality of drive motors for
`providing driving power to move the robot, and a controller
`for controlling the drive motors to direct the camera toward
`the three-dimensional coordinates determined by the sound
`source position determiner.
`[0034] According to another aspect of the present invention,
`an apparatus is provided for localizing a sound source in a
`robot. The apparatus comprises a microphone unit
`implemented by four microphones disposed at corners of an
`imaginary tetrahedron and picking up a sound from a
`three-dimensional space. The apparatus also comprises a sound source
`localizer for determining a direction of the sound source
`according to TDOAs of the sound picked up from respective
`pairs of the four microphones of the microphone unit, and
`determining as a position of the sound source a point of
`highest power in the three-dimensional space in the direction
`of the sound source.
`[0035] According to a further aspect of the present
`invention, a method of localizing a sound source in a robot is
`provided. A sound is picked up through four microphones
`disposed at corners of an imaginary tetrahedron at the robot.
`The direction of a sound source is determined in accordance
`with TDOAs of the sound between the four microphones
`using a first algorithm. The position of the sound source is
`determined in three-dimensional space in the direction of the
`sound source using a second algorithm.
`[0036] Determining the direction of the sound source may
`include determining whether directions calculated according
`to the TDOAs between the four microphones using a
`GCC-PHAT algorithm are the same. When the calculated directions
`are the same, determining the direction of the sound source
`may also include determining a direction from among three
`directions divided according to a position of the robot as the
`direction of the sound source. When the calculated directions
`are not the same, determining the direction of the sound
`source may further include determining two directions
`calculated according to the TDOAs between the microphones as
`the direction of the sound source.
`[0037] Determining the position of the sound source may
`include determining as the position of the sound source
`three-dimensional coordinates of a point of highest power in
`three-dimensional space in the determined one or two directions of
`the sound source using an SRP-PHAT algorithm.
`[0038] When the position of the sound source in
`three-dimensional space is determined, a drive motor may be
`controlled to direct a view of the robot toward the position of the
`sound source.
`[0039] According to an additional aspect of the present
`invention, a method of localizing a sound source in a robot is
`provided. Sound is picked up, at the robot, through four
`microphones disposed at corners of an imaginary
`tetrahedron. It is determined whether directions calculated
`according to TDOAs between the four microphones using a
`GCC-PHAT algorithm are the same. When the directions are the
`same, a direction from among three directions divided
`according to a position of the robot is determined as a
`direction of the sound source. Three-dimensional coordinates of a
`point of highest power in a three-dimensional space in the
`determined sound source direction are determined as the
`position of the sound source using an SRP-PHAT algorithm.
`When the directions calculated according to the TDOAs
`between the microphones are not the same, two directions
`calculated according to the TDOAs between the microphones
`are determined as the direction of the sound source.
`Three-dimensional coordinates of a point of highest power in the
`three-dimensional space in the determined sound source
`directions are determined as the position of the sound source
`using the SRP-PHAT algorithm.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`[0040] The above and other aspects, features and
`advantages of the present invention will be more apparent from the
`following detailed description when taken in conjunction
`with the accompanying drawings, in which:
`[0041] FIG. 1 is a diagram illustrating a microphone array
`for localizing a sound source in three-dimensional space
`using a Generalized Cross-Correlation (GCC)-Phase
`Transform (PHAT) algorithm;
`[0042] FIG. 2 is a diagram illustrating four microphones
`disposed in a plane;
`[0043] FIG. 3 is a block diagram illustrating an apparatus
`for localizing a sound source in a robot according to an
`embodiment of the present invention;
`[0044] FIGS. 4A and 4B illustrate a microphone array of a
`microphone unit according to an embodiment of the present
`invention;
`[0045] FIGS. 5A and 5B illustrate dead space in which a
`robot cannot localize a sound source;
`[0046] FIG. 6 is a block diagram illustrating a sound source
`localizer according to an embodiment of the present
`invention;
`[0047] FIG. 7 is a diagram of a microphone array
`illustrating a method of determining the position of a sound source
`according to an embodiment of the present invention;
`[0048] FIG. 8 is a flowchart illustrating a method of
`localizing a sound source in a robot according to an embodiment of
`the present invention; and
`[0049] FIG. 9 is a flowchart illustrating a method of
`determining the direction and position of a sound source in a robot
`according to an embodiment of the present invention.
`
`DETAILED DESCRIPTION OF PREFERRED
`EMBODIMENTS
`[0050] Preferred embodiments of the present invention are
`described in detail with reference to the accompanying
`drawings. The same or similar components may be designated by
`the same or similar reference numerals although they are
`illustrated in different drawings. Detailed descriptions of
`constructions or processes known in the art may be omitted to
`avoid obscuring the subject matter of the present invention.
`[0051] FIG. 3 is a block diagram illustrating an apparatus
`for localizing a sound source in a robot according to an
`embodiment of the present invention.
`[0052] Referring to FIG. 3, a robot 100 according to an
`embodiment of the present invention includes a microphone
`unit 110, which is implemented by a plurality of, e.g., four,
`microphones 111, a sound source localizer 120, which
`localizes a sound source in three-dimensional space, a camera 140,
`which takes an image in the view direction of the robot 100,
`a plurality of drive motors 150, which provide driving power
`for moving the robot 100 itself and the view direction, hands,
`etc., of the robot 100, and a controller 130, which controls the
`drive motors 150 to direct the view of the robot 100 toward the
`position of the sound source in three-dimensional space, i.e.,
`three-dimensional coordinates localized by the sound source
`localizer 120.
`[0053] When the sound source localizer 120 determines the
`position of a sound source, the controller 130 controls the
`drive motors 150 to direct the view of the robot 100 toward the
`position of the sound source, which is presumed to be a user.
`[0054] The drive motors 150 provide driving power to
`change joint angles of the robot 100, and the robot 100 moves
`using the driving power provided by the drive motors 150.
`[0055] The microphone unit 110 may be implemented by,
`for example, the four microphones 111 disposed at corners of
`an imaginary tetrahedron.
`[0056] FIGS. 4A and 4B illustrate a microphone array of a
`microphone unit according to an embodiment of the present
`invention.
`[0057] As illustrated in FIGS. 4A and 4B, the microphones
`111-1, 111-2, 111-3 and 111-4 of the microphone unit 110,
`according to an embodiment of the present invention, are
`disposed at the corners of an imaginary regular tetrahedron,
`respectively, and neither the distances nor the distance ratios
`between the microphones 111-1, 111-2, 111-3 and 111-4 are
`limited.
`[0058] When the four microphones 111-1, 111-2, 111-3
`and 111-4 are disposed in the form of a regular tetrahedron as
`illustrated in FIGS. 4A and 4B, there are direct paths from a
`sound source in three-dimensional space to three or more of
`the microphones 111-1, 111-2, 111-3 and 111-4 so that the
`sound source can be localized. In comparison with the
`rectangular array of the microphones 10 shown in FIG. 2, dead
`space in which a sound source cannot be localized is
`remarkably reduced.
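
As a rough illustration of the array geometry described above, the following sketch places four microphones at the corners of a regular tetrahedron and enumerates the six microphone pairs. The coordinate frame, the 0.1 m edge length, and the names `regular_tetrahedron`, `mics`, `pairs` and `edge_lengths` are assumptions made for illustration; the application does not fix any of them.

```python
import itertools
import math


def regular_tetrahedron(edge):
    """Return four (x, y, z) corners of a regular tetrahedron centred on the origin.

    Alternating corners of a cube with half-side s are mutually 2*sqrt(2)*s apart,
    so s is chosen to give the requested edge length (in metres).
    """
    s = edge / (2 * math.sqrt(2))
    return [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]


# Hypothetical 0.1 m edge; the application leaves the distances unrestricted.
mics = regular_tetrahedron(0.1)

# Four microphones give six unordered pairs.
pairs = list(itertools.combinations(range(4), 2))
edge_lengths = [math.dist(mics[i], mics[j]) for i, j in pairs]
```

In this particular layout every pair has the same spacing, so one TDOA search range would apply to all six pairs.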
`[0059] FIGS. 5A and 5B illustrate dead space in which a
`robot cannot localize a sound source. FIGS. 5A and 5B
`illustrate example cases in which a microphone unit is
`implemented in the head of the robot 100.
`[0060] FIG. 5A illustrates a dead space formed when the
`four microphones 10 are disposed in a rectangular form, and
`FIG. 5B illustrates a dead space formed when the four
`microphones 111-1, 111-2, 111-3 and 111-4 are disposed in the
`form of a regular tetrahedron. When the four microphones
`111-1, 111-2, 111-3 and 111-4 are disposed in the form of a
`regular tetrahedron, a sound source that is above or below can
`be localized, and thus dead space is remarkably reduced.
`[0061] When sound is picked up through the microphone
`unit 110, the sound source localizer 120 determines the
`direction of the sound source using a Generalized
`Cross-Correlation (GCC)-Phase Transform (PHAT) algorithm, and the
`position of the sound source in three-dimensional space in the
`determined sound source direction using a Steered Response
`Power (SRP)-PHAT algorithm.
`[0062] More specifically, the sound source localizer 120
`determines a rough direction of the sound source, i.e., the
`sound source direction, using the GCC-PHAT algorithm,
`divides three-dimensional space not in all directions but only
`toward the sound source from the robot 100 into blocks, and
`determines the position of the sound source using the
`SRP-PHAT algorithm.
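
The idea of dividing only the space toward the sound source into blocks can be sketched as a filter applied to the candidate blocks before the fine search. This is a minimal sketch under stated assumptions: the three coarse directions are modelled as equal azimuth sectors 120 degrees apart (x forward, y left), and `SECTOR_AZIMUTH_DEG`, `in_sector` and `restrict_grid` are hypothetical names, none of which come from the application.

```python
import math

# Assumed layout: three equal sectors, 120 degrees apart, measured as azimuth
# about the robot's vertical axis (x forward, y left).
SECTOR_AZIMUTH_DEG = {"forward": 0.0, "left": 120.0, "right": -120.0}


def in_sector(q, sector, half_width_deg=60.0):
    """True when point q = (x, y, z) lies within the azimuth range of the named sector."""
    azimuth = math.degrees(math.atan2(q[1], q[0]))
    # Wrap the angular difference into [-180, 180) before comparing.
    diff = (azimuth - SECTOR_AZIMUTH_DEG[sector] + 180.0) % 360.0 - 180.0
    return abs(diff) <= half_width_deg


def restrict_grid(grid, sectors):
    """Keep only the candidate blocks that fall inside one of the given sectors."""
    return [q for q in grid if any(in_sector(q, s) for s in sectors)]


grid = [(1.0, 0.0, 0.5), (-1.0, 1.8, 0.0), (-1.0, -1.8, 0.0), (0.0, 1.0, 0.0)]
forward_only = restrict_grid(grid, ["forward"])
left_or_right = restrict_grid(grid, ["left", "right"])
```

Because the filter uses azimuth only, blocks above and below the robot stay in the search, which matches the stated goal of localizing sources in all three dimensions.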
`[0063] In addition, the sound source localizer 120 provides
`the three-dimensional coordinates of the determined sound
`source position to the controller 130 so that the controller 130
`directs the view of the robot 100 toward the sound source
`position.
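
One way a controller could turn the three-dimensional coordinates into a view direction is to convert them to pan and tilt angles. The sketch below is illustrative only: the axis convention (x forward, y left, z up, origin at the camera) and the function name `pan_tilt` are assumptions, and the application does not describe how the controller 130 computes its motor commands.

```python
import math


def pan_tilt(position):
    """Convert (x, y, z) coordinates into pan and tilt angles in degrees.

    Assumed convention: x forward, y left, z up, origin at the camera.
    """
    x, y, z = position
    pan = math.degrees(math.atan2(y, x))                  # rotation about the vertical axis
    tilt = math.degrees(math.atan2(z, math.hypot(x, y)))  # elevation above the horizontal
    return pan, tilt


pan, tilt = pan_tilt((1.0, 1.0, 0.0))
```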
`[0064] FIG. 6 is a block diagram of a sound source localizer
`according to an embodiment of the present invention.
`[0065] Referring to FIG. 6, the sound source localizer 120
`according to an embodiment of the present invention includes
`a first algorithm processor 121, a second algorithm processor
`122 and a sound source position determiner 123.
`[0066] When sound is picked up through the microphone
`unit 110, the first algorithm processor 121 determines a sound
`source direction using a first algorithm, that is, the
`GCC-PHAT algorithm, on the basis of Time-Difference Of Arrivals
`(TDOAs) between the microphones 111-1, 111-2, 111-3 and
`111-4.
`[0067] The first algorithm processor 121 may calculate the
`TDOAs between the microphones 111 using the following
`Equation (1):
`
`[0068] Equation (1) denotes a cross-correlation when a
`TDOA between two of the microphones 111-1, 111-2, 111-3
`and 111-4 is τ, and the value of τ at which the cross-correlation
`is maximized may be taken as the TDOA of the sound source.
`[0069] A time relationship is converted into a frequency
`relationship according to a PHAT filter, and the TDOA that
`maximizes the cross-correlation (the maximum TDOA) is calculated.
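
For orientation, the GCC-PHAT cross-correlation between two microphone signals is commonly written in the following textbook form, where X_1 and X_2 are the spectra of the two signals and the denominator is the PHAT weighting. This is the standard formulation from the literature, given here as an assumption about what the cross-correlation looks like; the notation of Equation (1) as printed in the application may differ.

```latex
R_{12}(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty}
  \frac{X_1(\omega)\, X_2^{*}(\omega)}{\left| X_1(\omega)\, X_2^{*}(\omega) \right|}\,
  e^{j\omega\tau}\, d\omega
```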
`[0070] Then, using the maximum TDOA, a sound source
`direction is determined by the following Equation (2):
`
`$$\hat{\tau}_{12} = \underset{\tau \in D}{\arg\max}\; R_{12}(\tau) \qquad (2)$$
`
`[0071] In Equation (2), D is a variable denoting the possible
`TDOAs according to the physical distance between two of the
`microphones 111-1, 111-2, 111-3 and 111-4. Thus, the
`distance between the microphones 111-1, 111-2, 111-3 and
`111-4 does not need to be limited.
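
A short worked bound may help show how the search range D follows from the microphone spacing. The spacing d = 0.1 m, the speed of sound c of about 343 m/s and the 16 kHz sampling rate below are assumed example values, not figures from the application.

```latex
D = \{\, \tau : |\tau| \le d / c \,\}, \qquad
\frac{d}{c} = \frac{0.1\ \text{m}}{343\ \text{m/s}} \approx 0.29\ \text{ms}
\approx 4.7\ \text{samples at } 16\ \text{kHz}
```

Under these assumptions only a handful of lags per pair need to be examined, which is consistent with the small computational cost attributed to the GCC-PHAT step.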
`[0072] The first algorithm processor 121 determines the
`sound source direction using Equation (1) and Equation (2).
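
The TDOA estimation for one microphone pair can be sketched as follows. This is a minimal, dependency-free illustration rather than the application's implementation: the function names `dft` and `gcc_phat_tdoa`, the sign convention (a positive result means the first signal arrives later), and the synthetic test signal are all assumptions, and a practical system would normally use an FFT library instead of the naive transform shown here.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, used only to keep the sketch dependency-free."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_tdoa(sig1, sig2, fs, max_tau):
    """Return the lag (seconds) by which sig1 trails sig2, searched over |tau| <= max_tau."""
    n = len(sig1) + len(sig2)  # zero-pad so the circular correlation does not wrap
    spec1 = dft(list(sig1) + [0.0] * (n - len(sig1)))
    spec2 = dft(list(sig2) + [0.0] * (n - len(sig2)))
    # PHAT weighting: keep only the phase of the cross-power spectrum.
    cross = [a * b.conjugate() for a, b in zip(spec1, spec2)]
    phat = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    corr = [v.real for v in dft(phat, inverse=True)]
    # Restrict the search to physically possible lags (the set D).
    max_lag = min(int(max_tau * fs), n // 2 - 1)
    best = max(range(-max_lag, max_lag + 1), key=lambda lag: corr[lag % n])
    return best / fs


# Synthetic check: sig1 is sig2 delayed by 3 samples.
rng = random.Random(0)
sig2 = [rng.uniform(-1.0, 1.0) for _ in range(20)] + [0.0] * 12
sig1 = [0.0] * 3 + sig2[:-3]
tau = gcc_phat_tdoa(sig1, sig2, fs=16000, max_tau=0.1 / 343.0)
```

The sign of the estimated lag indicates which microphone of the pair is closer to the source, which is the information a coarse direction decision can be built on.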
`[0073] When the first algorithm processor 121 determines
`the sound source direction, the second algorithm processor
`122 determines a sound source position in three-dimensional
`space in the sound source direction using a second algorithm,
`that is, the SRP-PHAT algorithm.
`[0074] To determine the sound source position, the second
`algorithm processor 122 divides three-dimensional space into
`blocks and calculates block-specific powers using the
`following Equation (3):
`
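`$$P(q) = \sum_{k=1}^{N} \sum_{l=k+1}^{N} \int_{-\infty}^{\infty} \frac{X_k(\omega)\, X_l^{*}(\omega)}{\left| X_k(\omega)\, X_l^{*}(\omega) \right|}\, e^{j\omega(\Delta_l - \Delta_k)}\, d\omega \qquad (3)$$
`
`$$\hat{q} = \underset{q}{\arg\max}\; P(q) \qquad (4)$$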
`
`[0075] Powers of all the blocks in three-dimensional space
`are calculated by Equation (3) for calculating steered power
`of a beamformer at a point q, and a point at which the highest
`power is obtained, as expressed by Equation (4), is
`determined as the sound source position.
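
The block-wise power search can be sketched as a loop over candidate points that sums, for every microphone pair, the GCC-PHAT value at the lag that point would produce. This relies on the well-known equivalence between steered response power and a sum of pairwise generalized cross-correlations; it is a sketch under stated assumptions, not the application's implementation. The pairwise correlations are assumed to be available already as lag-to-value mappings, and `expected_lag`, `srp_phat_search`, the array geometry, the sampling rate and the idealised correlations are all hypothetical.

```python
import itertools
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def expected_lag(q, mic_a, mic_b, fs):
    """Whole-sample lag by which mic_a hears a source at q later than mic_b."""
    delay = (math.dist(q, mic_a) - math.dist(q, mic_b)) / SPEED_OF_SOUND
    return round(delay * fs)


def srp_phat_search(grid, mics, gcc, fs):
    """Return (point, power) for the candidate point with the highest steered power.

    gcc[(i, j)] is assumed to map an integer lag to the GCC-PHAT value of
    microphone pair (i, j), with i < j; missing lags count as zero.
    """
    best_point, best_power = None, -math.inf
    for q in grid:
        power = sum(
            gcc[(i, j)].get(expected_lag(q, mics[i], mics[j], fs), 0.0)
            for i, j in itertools.combinations(range(len(mics)), 2)
        )
        if power > best_power:
            best_point, best_power = q, power
    return best_point, best_power


# Hypothetical set-up: tetrahedral array, 48 kHz sampling, four candidate blocks.
s = 0.1
mics = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
fs = 48000
grid = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 0.0)]
source = (0.0, 1.0, 0.0)

# Idealised correlations: a unit peak at the lag the true source would produce.
gcc = {
    (i, j): {expected_lag(source, mics[i], mics[j], fs): 1.0}
    for i, j in itertools.combinations(range(4), 2)
}
point, power = srp_phat_search(grid, mics, gcc, fs)
```

The cost of the loop grows with the number of candidate points times the number of pairs, which is why shrinking the grid to the coarse direction pays off.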
`[0076] When the first algorithm processor 121 determines
`the sound source direction and the second algorithm
`processor 122 determines the sound source position, the sound
`source position determiner 123 transfers three-dimensional
`coordinates of the sound source to the controller 130.
`[0077] A method for the sound source localizer 120 to
`determine the position of a sound source will be described in
`detail below.
`[0078] FIG. 7 is a diagram of a microphone array
`illustrating a method of determining the position of a sound source
`according to an embodiment of the present invention.
`[0079] Referring to FIG. 7, when sound is picked up
`through the microphone unit 110, the first algorithm
`processor 121 of the sound source localizer 120 calculates TDOAs
`between the microphones 111-1, 111-2, 111-3 and 111-4
`using the GCC-PHAT algorithm and determines the direction
`of the sound source.
`[0080] For example, the sound may be generated in front of
`a regular tetrahedron formed by the microphones 111-1,
`111-2, 111-3 and 111-4 of the microphone unit 110. In this case,
`the sound source directions calculated from microphone pairs
`a, b and d using the GCC-PHAT algorithm are all forward.
`[0081] Meanwhile, when sound is generated on the left, all
`the sound source directions calculated from microphone pairs
`a, b and f are left, and when sound is generated on the right, all
`the sound source directions calculated from microphone pairs
`a, c and e are right.
`[0082] Thus, the first algorithm processor 121 may
`determine as the sound source direction one of the three directions,
`i.e., forward, left and right of the robot 100, on the basis of
`TDOAs between the microphones 111-1, 111-2, 111-3 and
`111-4.
`[0083] Then, the second algorithm processor 122
`determines the position of the sound source in three-dimensional
`space in the determined sound source direction. More specifically,
`the second algorithm processor 122 executes the SRP-PHAT
`algorithm using three of the microphones 111-1, 111-2, 111-3
`and 111-4 having direct paths to the sound source direction
`among the four microphones 111-1, 111-2, 111-3 and 111-4
`disposed in the form of a regular tetrahedron.
`[0084] For example, when the sound source direction is
`determined to be forward, the second algorithm processor
`122 localizes the sound source position using microphones
`(1), (2) and (4) on the basis of the SRP-PHAT algorithm. When
`the sound source direction is determined to be left, the second
`algorithm processor 122 localizes the sound source position
`using microphones (2), (3) and (4) on the basis of the
`SRP-PHAT algorithm, and when the sound source direction is
`determined to be right, the second algorithm processor 122
`localizes the sound source position using microphones (1), (3)
`and (4) on the basis of the SRP-PHAT algorithm.
`[0085] Since the sound source localizer 120 determines the
`sound source direction using the GCC-PHAT algorithm and
`determines the sound source position in three-dimensional
`space in the sound source direction using the SRP-PHAT
`algorithm, it is possible to have the advantages of both the
`GCC-PHAT algorithm and the SRP-PHAT algorithm, that is,
`the ability to determine a sound source direction in real time
`and the ability to accurately localize a sound source. The
`sound source localizer 120 can rapidly and accurately
`determine a sound source position in three-dimensional space.
`[0086] However, when a sound source is on the x-, y- or
`z-axis shown in FIG. 7, the first algorithm processor 121
`cannot determine one of the three directions as the sound
`source direction.
`[0087] More specifically, when sound is generated on the
`x-axis, sound source directions calculated from the
`microphone pairs b and d using the GCC-PHAT algorithm are all
`forward. However, a TDOA of the microphone pair c is
`0, and thus a sound source direction calculated from the
`microphone pair c is not forward.
`[0088] In addition, sound source directions calculated from
`the microphone pairs a and e using the GCC-PHAT algorithm
`are right, but a sound source direction calculated from the
`microphone pair c is not right. In other words, the sound
`source directions calculated from three microphone pairs
`using the GCC-PHAT algorithm are not all the same, and thus
`no single one of the three directions can be determined as the
`sound source direction.
`[0089] Consequently, when the sound source directions
`calculated from three microphone pairs using the GCC-PHAT
`algorithm are not all the same, the first algorithm processor 121
`determines as the sound source direction two of the three
`directions calculated from the three microphone pairs, and the
`second algorithm processor 122 determines the sound source
`position in three-dimensional space in the two of the three
`directions using the SRP-PHAT algorithm.
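
The decision between searching one direction and searching two can be sketched as a small voting routine. This is only an illustration under stated assumptions: each direction is taken to have three associated microphone pairs, each pair contributes a yes/no vote, and when no direction is unanimous the two directions with the most votes are kept. The function name `candidate_directions`, the vote representation and the tie-break rule are hypothetical; the application does not spell out how the two directions are selected.

```python
def candidate_directions(votes):
    """Pick the direction(s) to hand to the fine search.

    votes maps a direction name to three booleans, one per associated
    microphone pair, saying whether that pair's TDOA points that way.
    """
    unanimous = [d for d, v in votes.items() if all(v)]
    if unanimous:
        return unanimous[:1]  # all three pairs agree: search one direction
    # Otherwise keep the two directions with the most supporting pairs
    # (an assumed tie-break; the application does not spell one out).
    ranked = sorted(votes, key=lambda d: sum(votes[d]), reverse=True)
    return ranked[:2]


# Source on the x-axis: the pair with a TDOA of 0 supports neither direction.
on_axis = {
    "forward": [True, False, True],
    "left": [False, False, False],
    "right": [True, False, True],
}
# Source clearly in front: all three forward pairs agree.
in_front = {
    "forward": [True, True, True],
    "left": [True, True, False],
    "right": [True, False, False],
}
two_dirs = candidate_directions(on_axis)
one_dir = candidate_directions(in_front)
```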
`[0090] As described above, even if sound source directions
`calculated from three microphone pairs are not the same, the
`SRP-PHAT algorithm is executed not on all the directions but
`on only two of the three directions. Thus, it is possible to
`determine the position of a sound source faster than a
`conventional method of determining the position of a sound
`source using only the SRP-PHAT algorithm.
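
As a rough, back-of-the-envelope illustration of the saving, assume that the search cost C is proportional to the number of blocks examined and that the three directions split the space into sectors of equal size. Both are simplifying assumptions, not statements from the application.

```latex
\frac{C_{\text{one direction}}}{C_{\text{full search}}} \approx \frac{1}{3},
\qquad
\frac{C_{\text{two directions}}}{C_{\text{full search}}} \approx \frac{2}{3}
```

Under these assumptions, even the fallback case of two directions examines about a third fewer blocks than a search over the whole space.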
`[0091] FIG. 8 is a flowchart illustrating a method of
`localizing a sound source in