(12) Patent Application Publication
(10) Pub. No.: US 2005/0047611 A1
(43) Pub. Date: Mar. 3, 2005
Mao

US 20050047611A1

(54) AUDIO INPUT SYSTEM
(76) Inventor: Xiadong Mao, Foster City, CA (US)
Correspondence Address:
MARTINE PENILLA & GENCARELLA, LLP
710 LAKEWAY DRIVE
SUITE 200
SUNNYVALE, CA 94085 (US)
(21) Appl. No.: 10/650,409
(22) Filed: Aug. 27, 2003

Publication Classification

(51) Int. Cl.: H04R 3/00; H04B 15/00
(52) U.S. Cl.: 381/94.7; 381/92

(57) ABSTRACT

A method for reducing noise associated with an audio signal received
through a microphone sensor array is provided. The method initiates with
enhancing a target signal component of the audio signal through a first
filter. Simultaneously, the target signal component is blocked by a
second filter. Then, the output of the first filter and the output of the
second filter are combined in a manner to reduce noise without distorting
the target signal. Next, an acoustic set-up associated with the audio
signal is periodically monitored. Then, a value of the first filter and a
value of the second filter are both calibrated based upon the acoustic
set-up. A system capable of isolating a target audio signal from multiple
noise sources, a video game controller, and an integrated circuit
configured to isolate a target audio signal are included.

`
`
[Sheet 1 of 9: drawing not reproduced. Surviving reference numeral: 110.]

[Sheet 2 of 9: drawing not reproduced. Surviving reference numerals: 118,
120, 122a, 124, 128b. Surviving labels: "Source signal"; "Adaptive noise
cancellation"; "Adaptive Beamforming"; "Acoustic Echo Cancellation";
"Output Clean Source Signal".]

[Sheet 5 of 9: FIG. 5, drawing not reproduced. Surviving label: "Data
mining".]

[Sheet 7 of 9: FIGS. 7A, 7B and 7C, drawings not reproduced. Each panel
has a horizontal axis labeled "time". Surviving reference numerals: 192a,
192b and 192c (FIG. 7A); 194 (FIG. 7B).]

[Sheet 9 of 9: FIG. 9, flow chart. Operations as labeled in the drawing:]

210  Enhance a target signal associated with a listening direction
     through a first filter
212  Block the target signal through a second filter
214  Combine the output of the first filter and the output of the second
     filter in a manner to reduce noise without distorting the signal
216  Periodically monitor an acoustic set-up associated with the audio
     signal
218  Calibrate both the first filter and the second filter based upon
     the acoustic set-up
     Done

AUDIO INPUT SYSTEM

BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] This invention relates generally to audio processing and more
particularly to a microphone array system capable of tracking an audio
signal from a particular source while filtering out signals from other
competing or interfering sources.
[0003] 2. Description of the Related Art
[0004] Voice input systems are typically designed as a microphone worn
near the mouth of the speaker, where the microphone is tethered to a
headset. Since this imposes a physical restraint on the user, i.e.,
having to wear the headset, users will typically use the headset only
for substantial dictation and rely on keyboard typing for relatively
brief input and computer commands in order to avoid wearing the headset.
[0005] Video game consoles have become a commonplace item in the home.
Video game manufacturers are constantly striving to provide a more
realistic experience for the user and to expand the limits of gaming,
e.g., through on-line applications. For example, when a player
communicates with additional players in a room in which a number of
noises are being generated, or when users send and receive audio signals
while playing on-line games against each other, background noises and
noise from the game itself interfere with the communication. This
interference has so far prevented clear and effective player-to-player
communication in real time. These same obstacles have prevented the
player from providing voice commands that are delivered to the video
game console. Here again, the background noise, game noise and room
reverberations all interfere with the audio signal from the player.
[0006] As users are not so inclined to wear a headset, one alternative
to the headset is the use of microphone arrays in order to capture the
sound. However, shortcomings of the microphone arrays currently on the
market are the inability to track a sound from a moving source and/or
the inability to separate the source sound from the reverberation and
environmental sounds from the general area being monitored.
Additionally, with respect to a video game application, a user will move
around relative to the fixed positions of the game console and the
display monitor. Where a user is stationary, the microphone array may be
able to be "factory set" to focus on audio signals emanating from a
particular location or region. For example, inside an automobile, the
microphone array may be configured to focus around the driver's seat
region for a cellular phone application. However, this type of
microphone array is not suitable for a video game application. That is,
a microphone array on the monitor or game console would not be able to
track a moving user, since the user may be mobile, i.e., not stationary,
during a video game. Furthermore, in a video game application, a
microphone array on the game controller is also moving relative to the
user. Consequently, for a portable microphone array, e.g., one affixed
to the game controller, the source positioning poses a major challenge
to higher fidelity sound capturing in selective spatial volumes.
[0007] Another issue with the microphone arrays and associated systems
is the inability to adapt to high noise environments. For example, where
multiple sources are contributing to an audio signal, the current
systems available for consumer devices are unable to efficiently filter
the signal from a selected source. It should be appreciated that the
inability to efficiently filter the signal in a high noise environment
only exacerbates the source positioning issues mentioned above. Yet
another shortcoming of the microphone array systems is the lack of
bandwidth for a processor to handle the input signals from each
microphone of the array and track a moving user.
[0008] As a result, there is a need to solve the problems of the prior
art to provide a microphone array that is capable of capturing an audio
signal from a user when the user and/or the device to which the array is
affixed are capable of changing position. There is also a need to design
the system for robustness in a high noise environment where the system
is configured to provide the bandwidth for multiple microphones sending
input signals to be processed.

SUMMARY OF THE INVENTION
[0009] Broadly speaking, the present invention fills these needs by
providing a method and apparatus that defines a microphone array
framework capable of identifying a source signal irrespective of the
movement of the microphone array or the origination of the source
signal. It should be appreciated that the present invention can be
implemented in numerous ways, including as a method, a system, a
computer readable medium or a device. Several inventive embodiments of
the present invention are described below.
[0010] In one embodiment, a method for processing an audio signal
received through a microphone array is provided. The method initiates
with receiving a signal. Then, adaptive beam-forming is applied to the
signal to yield an enhanced source component of the signal. Inverse
beam-forming is also applied to the signal to yield an enhanced noise
component of the signal. Then, the enhanced source component and the
enhanced noise component are combined to produce a noise reduced signal.
[0011] In another embodiment, a method for reducing noise associated
with an audio signal received through a microphone sensor array is
provided. The method initiates with enhancing a target signal component
of the audio signal through a first filter. Simultaneously, the target
signal component is blocked by a second filter. Then, the output of the
first filter and the output of the second filter are combined in a
manner to reduce noise without distorting the target signal. Next, an
acoustic set-up associated with the audio signal is periodically
monitored. Then, a value of the first filter and a value of the second
filter are both calibrated based upon the acoustic set-up.
[0012] In yet another embodiment, a computer readable medium having
program instructions for processing an audio signal received through a
microphone array is provided. The computer readable medium includes
program instructions for receiving a signal and program instructions for
applying adaptive beam-forming to the signal to yield an enhanced source
component of the signal. Program instructions for applying inverse
beam-forming to the signal to yield an enhanced noise component of the
signal are included. Program instructions for combining the enhanced
source component and the enhanced noise component to produce a noise
reduced signal are provided.
`
[0013] In still yet another embodiment, a computer readable medium
having program instructions for reducing noise associated with an audio
signal is provided. The computer readable medium includes program
instructions for enhancing a target signal associated with a listening
direction through a first filter and program instructions for blocking
the target signal through a second filter. Program instructions for
combining an output of the first filter and an output of the second
filter in a manner to reduce noise without distorting the target signal
are provided. Program instructions for periodically monitoring an
acoustic set-up associated with the audio signal are included. Program
instructions for calibrating both the first filter and the second filter
based upon the acoustic set-up are provided.
[0014] In another embodiment, a system capable of isolating a target
audio signal from multiple noise sources is provided. The system
includes a portable consumer device configured to move independently
from a user. A computing device is included. The computing device
includes logic configured to enhance the target audio signal without
constraining movement of the portable consumer device. A microphone
array affixed to the portable consumer device is provided. The
microphone array is configured to capture audio signals, wherein a
listening direction associated with the microphone array is controlled
through the logic configured to enhance the target audio signal.
[0015] In yet another embodiment, a video game controller is provided.
The video game controller includes a microphone array affixed to the
video game controller. The microphone array is configured to detect an
audio signal that includes a target audio signal and noise. The video
game controller includes circuitry configured to process the audio
signal. Filtering and enhancing logic configured to filter the noise and
enhance the target audio signal as a position of the video game
controller and a position of a source of the target audio signal change
is provided. Here, the filtering of the noise is achieved through a
plurality of filter-and-sum operations.
[0016] An integrated circuit is provided. The integrated circuit
includes circuitry configured to receive an audio signal from a
microphone array in a multiple noise source environment. Circuitry
configured to enhance a listening direction signal is included.
Circuitry configured to block the listening direction signal, i.e.,
enhance a non-listening direction signal, and circuitry configured to
combine the enhanced listening direction signal and the enhanced
non-listening direction signal to yield a noise reduced signal are
included. Circuitry configured to adjust a listening direction according
to filters computed through an adaptive array calibration scheme is
included.
[0017] Other aspects and advantages of the invention will become
apparent from the following detailed description, taken in conjunction
with the accompanying drawings, illustrating by way of example the
principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS
[0018] The present invention will be readily understood by the following
detailed description in conjunction with the accompanying drawings, and
like reference numerals designate like structural elements.
[0019] FIGS. 1A and 1B are exemplary microphone sensor array placements
on a video game controller in accordance with one embodiment of the
invention.
[0020] FIG. 2 is a simplified high-level schematic diagram illustrating
a robust voice input system in accordance with one embodiment of the
invention.
[0021] FIG. 3 is a simplified schematic diagram illustrating an acoustic
echo cancellation scheme in accordance with one embodiment of the
invention.
[0022] FIG. 4 is a simplified schematic diagram illustrating an array
beam-forming module configured to suppress a signal not coming from a
listening direction in accordance with one embodiment of the invention.
[0023] FIG. 5 is a high level schematic diagram illustrating a blind
source separation scheme for separating the noise and source signal
components of an audio signal in accordance with one embodiment of the
invention.
[0024] FIG. 6 is a schematic diagram illustrating a microphone array
framework that incorporates adaptive noise cancellation in accordance
with one embodiment of the invention.
[0025] FIGS. 7A through 7C graphically represent the processing scheme
illustrated through the framework of FIG. 6 in accordance with one
embodiment of the invention.
[0026] FIG. 8 is a simplified schematic diagram illustrating a portable
consumer device configured to track a source signal in a noisy
environment in accordance with one embodiment of the invention.
[0027] FIG. 9 is a flow chart diagram illustrating the method operations
for reducing noise associated with an audio signal in accordance with
one embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0028] An invention is described for a system, apparatus and method for
an audio input system configured to isolate a source audio signal from a
noisy environment in real time through an economic and efficient scheme.
It will be obvious, however, to one skilled in the art, that the present
invention may be practiced without some or all of these specific
details. In other instances, well known process operations have not been
described in detail in order not to unnecessarily obscure the present
invention.
[0029] The embodiments of the present invention provide a system and
method for an audio input system associated with a portable consumer
device through a microphone array. The voice input system is capable of
isolating a target audio signal from multiple noise signals.
Additionally, there are no constraints on the movement of the portable
consumer device, which has the microphone array affixed thereto. The
microphone array framework includes four main modules in one embodiment
of the invention. The first module is an acoustic echo cancellation
(AEC) module. The AEC module is configured to cancel portable consumer
device generated noises. For example, where the portable consumer device
is a video game controller, the noises associated with video game play,
i.e., music, explosions, voices, etc., are all known. Thus, a filter
applied to the signal from each of the microphone sensors of the
microphone array may remove these known device generated noises. In
another embodiment, the AEC module is optional and may not be included
with the modules described below. Further details on acoustic echo
cancellation may be found in “Frequency-Domain and Multirate Adaptive
Filtering” by John J. Shynk, IEEE Signal Processing Magazine, pp. 14-37,
January 1992. This article is incorporated by reference for all
purposes.
[0030] A second module includes a separation filter. In one embodiment,
the separation filter includes a signal passing filter and a signal
blocking filter. In this module, array beam-forming is performed to
suppress a signal not coming from an identified listening direction.
Both the signal passing filter and the blocking filter are finite
impulse response (FIR) filters that are generated through an adaptive
array calibration module. The adaptive array calibration module, the
third module, is configured to run in the background. The adaptive array
calibration module is further configured to separate interference or
noise from a source signal, where the noise and the source signal are
captured by the microphone sensors of the sensor array. Through the
adaptive array calibration module, as will be explained in more detail
below, a user may freely move around in 3-dimensional space with six
degrees of freedom during audio recording. Additionally, with reference
to a video game application, the microphone array framework discussed
herein may be used in a loud gaming environment with background noises
which may include television audio signals, high fidelity music, voices
of other players, ambient noise, etc. As discussed below, the signal
passing filter is used by a filter-and-sum beam-former to enhance the
source signal. The signal blocking filter effectively blocks the source
signal and generates interferences or noise, which is later used to
generate a noise reduced signal in combination with the output of the
signal passing filter.
[0031] A fourth module, the adaptive noise cancellation module, takes
the interferences from the signal blocking filter for subtraction from
the beam-forming output, i.e., the signal passing filter output. It
should be appreciated that adaptive noise cancellation (ANC) may be
analogized to AEC with the exception that the noise templates for ANC
are generated from the signal blocking filter of the microphone sensor
array, instead of a video game console's output. In one embodiment, in
order to maximize noise cancellation while minimizing target signal
distortion, the interferences used as noise templates should exclude
source signal leakage, which is handled by the signal blocking filter.
Additionally, the use of ANC as described herein enables the attainment
of high interference-reduction performance with a relatively small
number of microphones arranged in a compact region.
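The interplay of the signal passing filter, the signal blocking filter and the noise subtraction can be illustrated with a minimal two-microphone sketch. It is not the implementation described in this application: the function names (`enhance`, `block`, `combine`), the test signals, and the use of a fixed sum and difference in place of calibrated FIR filters are all assumptions made for illustration, and a single least-squares weight stands in for an adaptive noise canceller.

```python
import math

def enhance(x1, x2):
    """Signal-passing path: averaging keeps a target common to both channels."""
    return [(a + b) / 2 for a, b in zip(x1, x2)]

def block(x1, x2):
    """Signal-blocking path: the difference cancels that common target."""
    return [(a - b) / 2 for a, b in zip(x1, x2)]

def combine(enhanced, noise_ref):
    """Subtract the noise reference, scaled by a least-squares weight."""
    energy = sum(r * r for r in noise_ref)
    weight = sum(e * r for e, r in zip(enhanced, noise_ref)) / energy if energy else 0.0
    return [e - weight * r for e, r in zip(enhanced, noise_ref)]

# Hypothetical test signals: the target reaches both microphones identically,
# while the interference is half as strong at the second microphone.
N = 200
target = [math.sin(2 * math.pi * 5 * k / N) for k in range(N)]
noise = [math.sin(2 * math.pi * 13 * k / N) for k in range(N)]
mic1 = [s + n for s, n in zip(target, noise)]
mic2 = [s + 0.5 * n for s, n in zip(target, noise)]

cleaned = combine(enhance(mic1, mic2), block(mic1, mic2))
residual = max(abs(c - s) for c, s in zip(cleaned, target))
```

Because the blocked path contains no target component in this idealized case, subtracting it removes the interference without distorting the target, which is the property the paragraph above asks of the noise templates.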
[0032] FIGS. 1A and 1B are exemplary microphone sensor array placements
on a video game controller in accordance with one embodiment of the
invention. FIG. 1A illustrates microphone sensors 112-1, 112-2, 112-3
and 112-4 oriented in an equally spaced straight line array geometry on
video game controller 110. In one embodiment, the microphone sensors
112-1 through 112-4 are spaced approximately 2.5 cm apart. However, it
should be appreciated that microphone sensors 112-1 through 112-4 may be
placed at any suitable distance apart from each other on video game
controller 110. Additionally, video game controller 110 is illustrated
as a SONY PLAYSTATION 2 Video Game Controller; however, video game
controller 110 may be any suitable video game controller.
[0033] FIG. 1B illustrates an 8 sensor, equally spaced rectangle array
geometry for microphone sensors 112-1 through 112-8 on video game
controller 110. It will be apparent to one skilled in the art that the
number of sensors used on video game controller 110 may be any suitable
number of sensors. Furthermore, the audio sampling rate and the
available mounting area on the game controller may place limitations on
the configuration of the microphone sensor array. In one embodiment, the
arrayed geometry includes four to twelve sensors forming a convex
geometry, e.g., a rectangle. The convex geometry is capable of providing
not only the sound source direction (two-dimensional) tracking that the
straight line array provides, but is also capable of providing accurate
sound location detection in three-dimensional space. As will be
explained further below, the added dimension will assist the noise
reduction software to achieve three-dimensional spatial volume based
arrayed beam-forming. While the embodiments described herein refer
typically to a straight line array system, it will be apparent to one
skilled in the art that the embodiments described herein may be extended
to any number of sensors as well as any suitable array geometry set-up.
Moreover, the embodiments described herein refer to a video game
controller having the microphone array affixed thereto. However, the
embodiments described below may be extended to any suitable portable
consumer device utilizing a voice input system.
[0034] In one embodiment, an exemplary four-sensor based microphone
array may be configured to have the following characteristics:
[0035] 1. An audio sampling rate that is 16 kHz;
[0036] 2. A geometry that is an equally spaced straight-line array, with
a spacing of one-half wavelength at the highest frequency of interest,
e.g., 2.0 cm between each of the microphone sensors. The frequency range
is about 120 Hz to about 8 kHz;
[0037] 3. The hardware for the four-sensor based microphone array may
also include a sequential analog-to-digital converter with 64 kHz
sampling rate; and
[0038] 4. The microphone sensor may be a general purpose
omni-directional sensor.
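The figures in the list above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the speed of sound (343 m/s) is an assumption not stated in this application, and reading the 64 kHz sequential converter as four channels multiplexed at 16 kHz each is likewise an interpretation. The function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C

def half_wavelength_cm(max_freq_hz, c=SPEED_OF_SOUND_M_S):
    """Half the wavelength, in centimeters, at the highest frequency of interest."""
    return 100.0 * c / (2.0 * max_freq_hz)

def sequential_adc_rate_hz(num_sensors, per_channel_rate_hz):
    """Aggregate rate of one converter that samples the sensors in turn."""
    return num_sensors * per_channel_rate_hz

# A 16 kHz sampling rate can represent content up to 8 kHz (Nyquist limit).
highest_freq_hz = 16000 / 2
spacing_cm = half_wavelength_cm(highest_freq_hz)
adc_rate_hz = sequential_adc_rate_hz(4, 16000)
```

Under these assumptions the half-wavelength spacing at 8 kHz comes out slightly above 2 cm, in line with the 2.0 cm example, and four channels at 16 kHz account for the 64 kHz converter rate.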
[0039] It should be appreciated that the microphone sensor array affixed
to a video game controller may move freely in 3-D space with six degrees
of freedom during audio recording. Furthermore, as mentioned above, the
microphone sensor array may be used in extremely loud gaming
environments which include multiple background noises, e.g., television
audio signals, high-fidelity music signals, voices of other players,
ambient noises, etc. Thus, the memory bandwidth and computational power
available through a video game console in communication with the video
game controller makes it possible for the console to be used as a
general purpose processor to serve even the most sophisticated real-time
signal processing applications. It should be further appreciated that
the above configuration is exemplary and not meant to be limiting, as
any suitable geometry, sampling rate, number of microphones, type of
sensor, etc., may be used.
`
[0040] FIG. 2 is a simplified high-level schematic diagram illustrating
a robust voice input system in accordance with one embodiment of the
invention. Video game controller 110 includes microphone sensors 112-1
through 112-4. Here, video game controller 110 may be located in
high-noise environment 116. High-noise environment 116 includes
background noise 118, reverberation noise 120, acoustic echoes 126
emanating from speakers 122a and 122b, and source signal 128a. Source
signal 128a may be a voice of a user playing the video game in one
embodiment. Thus, source signal 128a may be contaminated by sounds
generated from the game console or video game application, such as
music, explosions, car racing, etc. In addition, background noise, e.g.,
music, stereo, television, high-fidelity surround sound, etc., may also
be contaminating source signal 128a. Additionally, environmental ambient
noises, e.g., air conditioning, fans, people moving, doors slamming,
outdoor activities, video game controller input noises, etc., will also
add to the contamination of source signal 128a, as well as voices from
other game players and room acoustic reverberation.
[0041] The output of the microphone sensors 112-1 through 112-4 is
processed through module 124 in order to isolate the source signal and
provide output source signal 128b, which may be used as a voice command
for a computing device or as communication between users. Module 124
includes an acoustic echo cancellation module, an adaptive beam-forming
module, and an adaptive noise cancellation module. Additionally, an
array calibration module is running in the background as described
below. As illustrated, module 124 is included in video game console 130.
As will be explained in more detail below, the components of module 124
are tailored for a portable consumer device to enhance a voice signal in
a noisy environment without posing any constraints on a controller's
position, orientation, or movement. As mentioned above, acoustic echo
cancellation reduces noise generated from the console's sound output,
while adaptive beam-forming suppresses signals not coming from a
listening direction, where the listening direction is updated through an
adaptive array calibration scheme. The adaptive noise cancellation
module is configured to subtract interferences from the beam-forming
output through templates generated by a signal filter and a blocking
filter associated with the microphone sensor array.
[0042] FIG. 3 is a simplified schematic diagram illustrating an acoustic
echo cancellation scheme in accordance with one embodiment of the
invention. As mentioned above, AEC cancels noises generated by the video
game console, i.e., a game being played by a user. It should be
appreciated that the audio signal being played on the console may be
intercepted in either analog or digital format. The intercepted signal
is a noise template that may be subtracted from a signal captured by the
microphone sensor array on video game controller 110. Here, audio source
signal 128 and acoustic echoes 126 are captured through the microphone
sensor array. It should be appreciated that acoustic echoes 126 are
generated from audio signals emanating from the video game console or
video game application. Filter 134 generates a template that effectively
cancels acoustic echoes 126, thereby resulting in a signal substantially
representing audio source signal 128. It should be appreciated that the
AEC may be referred to as pre-processing. In essence, in a noisy
environment where the noise includes acoustic echoes generated from the
video game console, or any other suitable consumer device generating
native audible signals, the acoustic echo cancellation scheme
effectively removes these audio signals while not impacting the source
signal.
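Subtracting a known console signal from the microphone capture can be sketched with a normalized least-mean-squares (NLMS) adaptive filter. This is one common way to realize echo cancellation and is offered as an assumption, since the application does not specify the algorithm behind filter 134. The function `nlms_echo_cancel`, its parameters, and the two-tap echo path are hypothetical.

```python
import random

def nlms_echo_cancel(mic, reference, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of `reference` from `mic` (NLMS)."""
    weights = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples, newest first (zero before start).
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        echo_estimate = sum(w * xi for w, xi in zip(weights, x))
        error = mic[n] - echo_estimate
        power = sum(xi * xi for xi in x) + eps
        weights = [w + mu * error * xi / power for w, xi in zip(weights, x)]
        residual.append(error)
    return residual, weights

rng = random.Random(0)
game_audio = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
# Hypothetical echo path: direct sound plus one delayed reflection.
echo = [0.6 * game_audio[n] + (0.3 * game_audio[n - 1] if n else 0.0)
        for n in range(2000)]
# Echo-only capture, so the residual should shrink toward zero as the
# filter learns the echo path.
residual, weights = nlms_echo_cancel(echo, game_audio)
```

In a capture that also contains the player's voice, the residual is what would carry that voice forward to the later modules; practical cancellers usually slow or pause adaptation while the voice is present.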
[0043] FIG. 4 is a simplified schematic diagram illustrating an array
beam-forming module configured to suppress a signal not coming from a
listening direction in accordance with one embodiment of the invention.
In one embodiment, the beam-forming is based on filter-and-sum
beam-forming. The finite impulse response (FIR) filters, also referred
to as signal passing filters, are generated through an array calibration
process which is adaptive. Thus, the beam-forming is essentially an
adaptive beam-former that can track and steer the beam, i.e., listening
direction, toward a source signal 128 without physical movement of the
sensor array. It will be apparent to one skilled in the art that
beam-forming, which refers to methods that can have signals from a focal
direction enhanced, may be thought of as a process to algorithmically
(not physically) steer microphone sensors 112-1 through 112-m towards a
desired target signal. The direction that the sensors 112-1 through
112-m look at may be referred to as the beam-forming direction or
listening direction, which may either be fixed or adaptive at run time.
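A filter-and-sum beam-former of the kind named above can be sketched as one FIR filter per sensor channel followed by a sum. In the sketch below the filters are hand-picked pure delays with equal gain, chosen only so that the result is easy to follow; the calibrated filters this application describes are not reproduced here, and all names (`fir`, `filter_and_sum`, `delayed`) are hypothetical.

```python
def fir(signal, coeffs):
    """Causal FIR filter: y[n] = sum over k of coeffs[k] * signal[n - k]."""
    return [
        sum(c * signal[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
        for n in range(len(signal))
    ]

def filter_and_sum(channels, filters):
    """Filter each sensor channel with its own FIR filter, then sum."""
    filtered = [fir(ch, h) for ch, h in zip(channels, filters)]
    return [sum(samples) for samples in zip(*filtered)]

def delayed(signal, d):
    """Copy of `signal` delayed by d samples (zero padded at the start)."""
    return [0.0] * d + signal[: len(signal) - d]

# Hypothetical wavefront reaching three sensors 0, 1 and 2 samples late.
burst = [0.0] * 16
burst[3:6] = [1.0, -2.0, 1.0]
channels = [delayed(burst, d) for d in (0, 1, 2)]

# Each filter delays its channel so all three line up, with gain 1/3 each.
third = 1.0 / 3.0
filters = [[0.0, 0.0, third], [0.0, third, 0.0], [third, 0.0, 0.0]]
steered = filter_and_sum(channels, filters)
```

Changing the filter coefficients changes the direction for which the channels line up, which is the sense in which the array is steered algorithmically rather than physically.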
[0044] The fundamental idea behind beam-forming is that the sound
signals from a desired source reach the array of microphone sensors with
different time delays. Because the geometric placement of the array is
pre-calibrated, the path-length difference between the sound source and
the sensor array is a known parameter. Therefore, a process referred to
as cross-correlation is used to time-align signals from different
sensors. The time-aligned signals from the various sensors are weighted
according to the beam-forming direction. The weighted signals are then
filtered in terms of a sensor-specific noise-cancellation setup, i.e.,
each sensor is associated with a filter, referred to as a matched filter
F1 . . . FM, 142-1 through 142-M, which are included in
signal-passing-filter 160. The filtered signals from each sensor are
then summed together through module 172 to generate output Z(ω,θ). It
should be appreciated that the above-described process may be referred
to as auto-correlation. Furthermore, as the signals that do not lie
along the beam-forming direction remain misaligned along the time axes,
these signals become attenuated by the averaging. As is common with an
array-based capturing system, the overall performance of the microphone
array to capture sound from a desired spatial direction (using straight
line geometry placement) or spatial volumes (using convex geometry array
placement) depends on the ability to locate and track the sound source.
However, in an environment with complicated reverberation noise, e.g., a
videogame environment, it is practically infeasible to build a general
sound location tracking system without integrating the environmental
specific parameters.
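The time-alignment step can be illustrated with a brute-force cross-correlation search over integer lags, followed by averaging of the aligned channels. This is a simplified sketch under stated assumptions: the delays are whole samples, the weighting and matched filtering described above are omitted, and the names `best_lag`, `align_and_average` and `delayed` are hypothetical.

```python
def best_lag(ref, sig, max_lag):
    """Lag (in samples) at which `sig` best matches `ref` by cross-correlation."""
    def xcorr(lag):
        return sum(ref[n] * sig[n + lag]
                   for n in range(len(ref)) if 0 <= n + lag < len(sig))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

def align_and_average(channels, max_lag):
    """Time-align every channel to the first one, then average."""
    ref = channels[0]
    lags = [best_lag(ref, ch, max_lag) for ch in channels]
    out = []
    for n in range(len(ref)):
        vals = [ch[n + lag] for ch, lag in zip(channels, lags)
                if 0 <= n + lag < len(ch)]
        out.append(sum(vals) / len(vals))
    return out, lags

def delayed(signal, d):
    """Copy of `signal` delayed by d samples (zero padded at the start)."""
    return [0.0] * d + signal[: len(signal) - d]

# Hypothetical source burst reaching three sensors 0, 2 and 4 samples late.
burst = [0.0] * 32
burst[5:9] = [1.0, 3.0, -2.0, 1.0]
channels = [delayed(burst, d) for d in (0, 2, 4)]
aligned, lags = align_and_average(channels, max_lag=6)
```

Sound arriving from other directions would have different inter-sensor delays, so it would stay misaligned after this step and be attenuated by the averaging, as the paragraph above notes.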
[0045] Still referring to FIG. 4, the adaptive beam-forming may be
alternatively explained as a two-part process. In a first part, the
broadside noise is assumed to be in a far field. That is, the distance
from source 128 to microphone centers 112-1 through 112-M is large
enough so that it is initially assumed that source 128 is located on a
normal to each of the microphone sensors. For example, with reference to
microphone sensor 112-m, the source would be located along normal 136.
Thus, the broadside noise is enhanced by applying a filter referred to
as F1 herein. Next, a signal passing filter that is calibrated
periodically is configured to determine a factor, referred to as F2,
that allows the microphone sensor array to adapt to movement. The
determination of F2 is explained further with reference to the adaptive
array calibration module. In one embodiment, the signal passing filter
is calibrated every 100 milliseconds. Thus, every 100 milliseconds the
signal passing filter is applied to the fixed beam-forming. In one
embodiment, matched filters 142-1 through 142-M supply a steering
factor, F2, for each microphone, thereby adjusting the listening
direction as illustrated by lines 138-1 through 138-M. Considering a
sinusoidal far-field plane wave propagating towards the sensors at an
incidence angle of θ in FIG. 4, the time-delay for the wave to travel a
distance of d between two adjacent sensors is given by d_m cos θ.
Further details on fixed beam-forming may be found in the article
entitled “Beamforming: A Versatile Approach to Spatial Filtering” by
Barry D. Van Veen and Kevin M. Buckley, IEEE ASSP MAGAZINE, April 1988.
This article is incorporated by reference for all purposes.
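The far-field delay relation can be turned into per-sensor steering delays with a short calculation. The sketch below makes several assumptions that this application does not state: the d cos θ term is treated as a path-length difference that is divided by the speed of sound (taken as 343 m/s) to obtain seconds, θ is measured from the array axis so that a broadside arrival corresponds to 90 degrees, and the example uses the 2.0 cm spacing and 16 kHz rate of the exemplary array. The function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C

def path_difference_m(spacing_m, theta_rad):
    """Extra distance d*cos(theta) the plane wave travels to the next sensor."""
    return spacing_m * math.cos(theta_rad)

def steering_delays_samples(num_sensors, spacing_m, theta_rad, sample_rate_hz,
                            c=SPEED_OF_SOUND_M_S):
    """Delay of each sensor relative to sensor 0, in (fractional) samples."""
    tau_s = path_difference_m(spacing_m, theta_rad) / c
    return [m * tau_s * sample_rate_hz for m in range(num_sensors)]

# Wave travelling along the array axis (theta = 0) and broadside (theta = 90 deg).
endfire = steering_delays_samples(4, 0.02, 0.0, 16000)
broadside = steering_delays_samples(4, 0.02, math.pi / 2, 16000)
```

Under these assumptions the delay between adjacent sensors is a fraction of one sample even in the worst case, which is one reason steering is typically carried out with FIR filters rather than whole-sample shifts.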
[0046] FIG. 5 is a high level schematic diagram illustrating a blind
source separation scheme for separating the noise and source signal
components of an audio signal in accordance with one embodiment of the
invention. It should be appreciated that explicit knowledge of the
source signal and the noise within the audio signal is not available.
However, it is known that the characteristics of the source signal and
the noise are different. For example, a first speaker's audio signal may
be distinguished from a second speaker's audio signal because their
voices a