(12) Patent Application Publication
(10) Pub. No.: US 2004/0161121 A1
(43) Pub. Date: Aug. 19, 2004
Choi et al.

(54) ADAPTIVE BEAMFORMING METHOD AND APPARATUS USING FEEDBACK STRUCTURE

(75) Inventors: Changkyu Choi, Seoul (KR); Jaywoo Kim, Gyeonggi-do (KR); Donggeon Kong, Busan-si (KR)

Correspondence Address:
STAAS & HALSEY LLP
SUITE 700
1201 NEW YORK AVENUE, N.W.
WASHINGTON, DC 20005 (US)

(73) Assignee: Samsung Electronics Co., Ltd., Suwon-si (KR)

(21) Appl. No.: 10/757,994
(22) Filed: Jan. 16, 2004

(30) Foreign Application Priority Data
Jan. 17, 2003 (KR) ........ 2003-3258

Publication Classification
(51) Int. Cl.: H04R 3/00; G10L 21/02
(52) U.S. Cl.: 381/92; 704/226

(57) ABSTRACT
`
An adaptive beamforming apparatus and method includes a fixed beamformer that compensates for time delays of M noise-containing speech signals input via a microphone array having M microphones (M is an integer greater than or equal to 2) and generates a sum signal of the M compensated noise-containing speech signals, and a multi-channel signal separator that extracts pure noise components from the M compensated noise-containing speech signals using M adaptive blocking filters that are connected to M adaptive canceling filters in a feedback structure and extracts pure speech components from the sum signal using the M adaptive canceling filters that are connected to the M adaptive blocking filters in the feedback structure.
`
[Drawing sheets 1 to 5 (figure content not recoverable from text extraction): Sheet 1, FIG. 1 (PRIOR ART); Sheet 2, FIG. 2 (PRIOR ART) and FIG. 3; Sheets 3 and 4, block diagrams that each include a TIME DELAY ESTIMATOR block; Sheet 5, FIG. 6.]
`
`ADAPTIVE BEAMFORMING METHOD AND
`APPARATUS USING FEEDBACK STRUCTURE
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
[0001] This application claims the priority of Korean Patent Application No. 2003-3258, filed on Jan. 17, 2003, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
`
`BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The present invention relates to an adaptive beamformer, and more particularly, to a method and apparatus for adaptive beamforming using a feedback structure.
[0004] 2. Description of the Related Art
[0005] Mobile robots have applications in health-related fields, security, home networking, entertainment, and so forth, and are the focus of increasing interest. Interaction between people and mobile robots is necessary when operating the mobile robots. Like people, a mobile robot with a vision system has to recognize people and surroundings, find the position of a person talking in the vicinity of the mobile robot, and understand what the person is saying.
[0006] A voice input system of the mobile robot is indispensable for interaction between man and robot and is an important factor affecting autonomous mobility. Important factors affecting the voice input system of a mobile robot in an indoor environment are noise, reverberation, and distance. There are a variety of noise sources and reverberation due to walls or other objects in the indoor environment. Low frequency components of a voice are more attenuated than high frequency components with respect to distance. Accordingly, for proper interaction between a person and an autonomous mobile robot within a house, a voice input system has to enable the robot to recognize the person's voice at a distance of several meters.
[0007] Such a voice input system generally uses a microphone array comprising at least two microphones to improve voice detection and recognition. In order to remove noise components contained in a speech signal input via the microphone array, a single channel speech enhancement method, an adaptive acoustic noise canceling method, a blind signal separation method, and a generalized sidelobe canceling method are employed.
[0008] The single channel speech enhancement method, disclosed in "Spectral Enhancement Based on Global Soft Decision" (IEEE Signal Processing Letters, Vol. 7, No. 5, pp. 108-110, 2000) by Nam-Soo Kim and Joon-Hyuk Chang, uses one microphone and ensures high performance only when statistical characteristics of noise do not vary with time, as with stationary background noise. The adaptive acoustic noise canceling method, disclosed in "Adaptive Noise Canceling: Principles and Applications" (Proceedings of IEEE, Vol. 63, No. 12, pp. 1692-1716, 1975) by B. Widrow et al., uses two microphones. Here, one of the two microphones is a reference microphone for receiving only noise. Thus, if noise alone cannot be received, or if noise received by the reference microphone contains other noise components, the performance of the adaptive acoustic noise canceling method drops sharply. Also, the blind signal separation method is difficult to use in real environments and to implement in real-time systems.
[0009] FIG. 1 is a block diagram of a conventional adaptive beamformer using the generalized sidelobe canceling method. The conventional adaptive beamformer includes a fixed beamformer (FBF) 11, an adaptive blocking matrix (ABM) 13, and an adaptive multi-input canceller (AMC) 15. The generalized sidelobe canceling method is described in more detail in "A Robust Adaptive Beamformer For Microphone Arrays With A Blocking Matrix Using Constrained Adaptive Filters" (IEEE Trans. Signal Processing, Vol. 47, No. 10, pp. 2677-2684, 1999) by O. Hoshuyama et al.
[0010] Referring to FIG. 1, the FBF 11 uses a delay-and-sum beamformer. In other words, the FBF 11 obtains the correlation of signals x_m(k), where m is an integer between 1 and M, input via microphones and calculates time delays among the signals input via the microphones. Thereafter, the FBF 11 compensates the signals input via the microphones by the calculated time delays, and then adds the signals in order to output a signal b(k) having an improved signal-to-noise ratio (SNR). The ABM 13 passes the signal b(k) output from the FBF 11 through adaptive blocking filters (ABFs) and subtracts the result from each of the signals whose time delays are compensated for, in order to maximize noise components. The AMC 15 filters signals z_m(k), where m is an integer between 1 and M, output from the ABM 13 through adaptive canceling filters (ACFs), and then adds the filtered signals, thereby generating noise components via M microphones. Thereafter, a signal output from the AMC 15 is subtracted from the signal b(k), which is delayed for a predetermined period of time D, to obtain a signal y(k) in which noise components are cancelled.
[0011] The operations of the ABM 13 and the AMC 15 shown in FIG. 1 will be described in more detail with reference to FIG. 2. The operations of the ABM 13 and the AMC 15 are the same as in the adaptive acoustic noise canceling method.
[0012] Referring to FIG. 2, the size of the symbols S+N, S, and N denotes the relative magnitude of speech and noise signals at specific locations, and the left and right symbols separated by a slash (/) denote "to-be" and "as-is" states, respectively.
[0013] An ABF 21 adaptively filters the signal b(k) output from the FBF 11 according to the signal output from a first subtractor 23 so that a characteristic of speech components of the filtered signal output from the ABF 21 is the same as that of speech components of a microphone signal x'_m(k) that is delayed for a predetermined period of time. The first subtractor 23 subtracts the signal output from the ABF 21 from the microphone signal x'_m(k), where m is an integer between 1 and M, to obtain and output a signal z_m(k), which is generated by canceling speech components S from the microphone signal x'_m(k).
[0014] An ACF 25 adaptively filters the signal z_m(k) output from the first subtractor 23 according to the signal output from a second subtractor 27 so that a characteristic of noise components of the filtered signal output from the ACF 25 is the same as that of noise components of the signal b(k). The second subtractor 27 subtracts the signal output from the ACF 25 from the signal b(k) and outputs a signal y(k), which is generated by canceling noise components N from the signal b(k).
[0015] However, the above-described generalized sidelobe canceling method has the following drawbacks. The delay-and-sum beamformer of the FBF 11 has to generate the signal b(k) with a very high SNR so that only pure noise signals are input to the AMC 15. However, because the delay-and-sum beamformer outputs a signal whose SNR is not very high, the overall performance drops. As a result, since the ABM 13 outputs a noise signal containing a speech signal, the AMC 15, using the output of the ABM 13, regards speech components contained in the signal output from the ABM 13 as noise and cancels them. Therefore, the adaptive beamformer finally outputs a speech signal containing noise components. Also, because filters used in the generalized sidelobe canceling method have a feedforward connection structure, finite impulse response (FIR) filters are employed. When such FIR filters are used in the feedforward connection structure, 1000 or more filter taps are needed in a room reverberation environment. In addition, in a case where the ABF 21 and the ACF 25 are not properly trained, the performance of the adaptive beamformer may deteriorate. Thus, speech presence intervals and speech absence intervals are necessary for training the ABF 21 and the ACF 25. However, these training intervals are generally unavailable in practice. Moreover, because adaptation of the ABM 13 and the AMC 15 has to be alternately performed, a voice activity detector (VAD) is needed. In other words, for adaptation of the ABF 21, a speech component is a desired signal and a noise component is an undesired signal. On the contrary, for adaptation of the ACF 25, a noise component is a desired signal and a speech component is an undesired signal.
`
`SUMMARY OF THE INVENTION
[0016] The present invention provides a method of adaptive beamforming using a feedback structure capable of almost completely canceling noise components contained in a wideband speech signal input from a microphone array comprising at least two microphones.
[0017] The present invention also provides an adaptive beamforming apparatus including a feedback structure to cancel noise components contained in wideband speech signals input from a microphone array.
[0018] Additional aspects and/or advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
[0019] According to an aspect of the present invention, there is provided an adaptive beamforming method including: compensating for time delays of M noise-containing speech signals input via a microphone array having M microphones (M is an integer greater than or equal to 2), and generating a sum signal of the M compensated noise-containing speech signals; and extracting pure noise components from the M compensated noise-containing speech signals using M adaptive blocking filters that are connected to M adaptive canceling filters in a feedback structure and extracting pure speech components from the sum signal using the M adaptive canceling filters that are connected to the M adaptive blocking filters in the feedback structure.
`
[0020] According to another aspect of the present invention, there is also provided an adaptive beamforming apparatus including: a fixed beamformer that compensates for time delays of M noise-containing speech signals input via a microphone array having M microphones (M is an integer greater than or equal to 2), and generates a sum signal of the M compensated noise-containing speech signals; and a multi-channel signal separator that extracts pure noise components from the M compensated noise-containing speech signals using M adaptive blocking filters that are connected to M adaptive canceling filters in a feedback structure and extracts pure speech components from the sum signal using the M adaptive canceling filters that are connected to the M adaptive blocking filters in the feedback structure.
[0021] In an aspect of the present invention, the multi-channel signal separator includes: a first filter that filters a noise-removed sum signal through the M adaptive blocking filters; a first subtractor that subtracts signals output from the M adaptive blocking filters from the M compensated noise-containing speech signals using M subtractors; a second filter that filters M subtraction results of the first subtractor through the M adaptive canceling filters; a second subtractor that subtracts signals output from the M adaptive canceling filters from the sum signal using M subtractors, and inputs M subtraction results to the M adaptive blocking filters as the noise-removed sum signal; and a second adder that adds signals output from the M subtractors of the second subtractor.
[0022] In an aspect of the present invention, the multi-channel signal separator includes: a first filter that filters a noise-removed sum signal through the M adaptive blocking filters; a first subtractor that subtracts signals output from the M adaptive blocking filters from the M compensated noise-containing speech signals using M subtractors; a second filter that filters signals output from the M subtractors of the first subtractor through the M adaptive canceling filters; a second adder that adds signals output from the M adaptive canceling filters of the second filter; and a second subtractor that subtracts signals output from the second adder from the signals output from the fixed beamformer and inputs M subtraction results to the M adaptive blocking filters as the noise-removed sum signal.
`BRIEF DESCRIPTION OF THE DRAWINGS
[0023] These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
[0024] FIG. 1 is a block diagram of a conventional adaptive beamformer;
[0025] FIG. 2 is a circuit diagram for explaining a feedforward structure used in the conventional adaptive beamformer shown in FIG. 1;
[0026] FIG. 3 is a circuit diagram for explaining a feedback structure according to an embodiment of the present invention;
[0027] FIG. 4 is a block diagram of an adaptive beamformer according to an embodiment of the present invention;
[0028] FIG. 5 is a block diagram of an adaptive beamformer according to another embodiment of the present invention; and
[0029] FIG. 6 illustrates an experimental environment used to compare an adaptive beamformer according to the present invention and the conventional adaptive beamformer shown in FIG. 1.
`DETAILED DESCRIPTION OF THE
`EMBODIMENTS
[0030] Reference will now be made in detail to the embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below to explain the present invention by referring to the figures.
[0031] Hereinafter, embodiments of the present invention will be described in detail with reference to the attached drawings. Meanwhile, the term "speech" as used hereinafter implicitly includes any target signal necessary for using the present invention.
[0032] FIG. 3 is a circuit diagram for explaining a feedback structure according to an embodiment of the present invention. The feedback structure includes an adaptive blocking filter (ABF) 31, a first subtractor 33, an adaptive canceling filter (ACF) 35, and a second subtractor 37.
[0033] Referring to FIG. 3, the ABF 31 adaptively filters a signal y(k) output from the second subtractor 37 according to a signal output from the first subtractor 33 so that a characteristic of speech components of the filtered signal output from the ABF 31 is the same as that of speech components of a microphone signal x'_m(k), where m is an integer between 1 and M, that is delayed for a predetermined period of time. The first subtractor 33 subtracts a signal output from the ABF 31 from a signal x_m(k-D_m), i.e., x'_m(k), obtained by delaying a signal x_m(k) input to an m-th microphone among M microphones, where M is an integer greater than or equal to 2, for a predetermined period of time D_m. As a result, the first subtractor 33 outputs only a pure noise signal N contained in the signal x'_m(k).
[0034] The ACF 35 adaptively filters a signal z_m(k) output from the first subtractor 33 according to a signal output from the second subtractor 37 so that a characteristic of noise components of the filtered signal output from the ACF 35 is the same as that of noise components of the signal b(k) output from the FBF 11 shown in FIG. 1. The second subtractor 37 subtracts the signal output from the ACF 35 from the signal b(k). Thus, the second subtractor 37 outputs only a pure speech signal S derived from the signal b(k) in which noise components are cancelled.
[0035] FIG. 4 is a block diagram of an adaptive beamformer according to an embodiment of the present invention. The adaptive beamformer includes a fixed beamformer (FBF) 410 and a multi-channel signal separator 430. The FBF 410 includes a microphone array 411 having M microphones 411a, 411b, and 411c, a time delay estimator 413, a delayer 415 having M delay devices 415a, 415b and 415c, and a first adder 417. The multi-channel signal separator 430 includes a first filter 431 having M ABFs 431a and 431b, a first subtractor 433 having M subtractors 433a and 433b, a second filter 435 having M ACFs 435a and 435b, a second subtractor 437 having M subtractors 437a and 437b, and a second adder 439.
[0036] Referring to FIG. 4, in the FBF 410, the microphone array 411 receives speech signals x_1(k), x_2(k), and x_M(k) via the M microphones 411a, 411b and 411c. The time delay estimator 413 obtains the correlation of the speech signals x_1(k), x_2(k) and x_M(k) and calculates time delays D_1, D_2 and D_M of the speech signals x_1(k), x_2(k) and x_M(k). The M delay devices 415a, 415b and 415c of the delayer 415 respectively delay the speech signals x_1(k), x_2(k) and x_M(k) by the time delays D_1, D_2 and D_M calculated by the time delay estimator 413, and output speech signals x'_1(k), x'_2(k) and x'_M(k). Here, the time delay estimator 413 may calculate time delays of speech signals using various methods besides the calculation of the correlation.
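As an illustration of the correlation-based estimation attributed to the time delay estimator 413, the following Python sketch picks the lag that maximizes the cross-correlation between a reference channel and another channel. The function name `estimate_lag`, the integer-sample lag search, and the `max_lag` bound are assumptions made for this example; the text does not specify how the correlation is evaluated.

```python
def estimate_lag(ref, sig, max_lag):
    """Return the integer lag in [-max_lag, max_lag] that maximizes
    sum_k ref[k] * sig[k + lag] (hypothetical helper, not from the source)."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for k in range(len(ref)):
            j = k + lag
            if 0 <= j < len(sig):  # samples outside the record count as zero
                score += ref[k] * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# A short pulse and a copy of it that arrives two samples later.
ref = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
sig = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
lag = estimate_lag(ref, sig, max_lag=3)
```

A positive lag here means `sig` trails `ref`, so the earlier channel is the one that would need to be delayed before summation.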
[0037] The first adder 417 adds the speech signals x'_1(k), x'_2(k) and x'_M(k) and outputs a signal b(k). The signal b(k) output from the first adder 417 can be represented as in Equation 1.

$$b(k) = \sum_{m=1}^{M} x'_m(k) \qquad (1)$$
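The delay compensation and summation of Equation 1 can be sketched as follows. This is a minimal illustration that assumes integer-sample delays and treats samples before the start of the record as zero; the name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(signals, delays):
    """Delay channel m by delays[m] samples and sum the results,
    i.e. b(k) = sum_m x_m(k - D_m), with zeros before the record starts."""
    n = len(signals[0])
    b = [0.0] * n
    for x, d in zip(signals, delays):
        for k in range(n):
            if 0 <= k - d < n:
                b[k] += x[k - d]
    return b


# The same pulse reaches microphone 2 one sample earlier than microphone 1,
# so microphone 2 is delayed by one sample before the channels are added.
x1 = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
x2 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
b = delay_and_sum([x1, x2], delays=[0, 1])
```

After compensation the pulses line up, so the target adds coherently while uncorrelated noise would not.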
`
[0038] In the multi-channel signal separator 430, the M ABFs 431a and 431b adaptively filter signals output from the M subtractors 437a and 437b of the second subtractor 437 according to signals output from the M subtractors 433a and 433b of the first subtractor 433, so that a characteristic of speech components of the filtered signals output from the M ABFs 431a and 431b is the same as that of speech components of a microphone signal x'_m(k) that is delayed for a predetermined period of time.
[0039] The M subtractors 433a and 433b of the first subtractor 433 respectively subtract the signals output from the M ABFs 431a and 431b from the speech signals x'_1(k) and x'_M(k), and respectively output signals u_1(k) and u_M(k) to the M ACFs 435a and 435b. When a coefficient vector of the m-th ABF of the first filter 431 is $\mathbf{h}_m(k)$ and the number of taps is L, the signal u_m(k) output from the subtractors 433a and 433b of the first subtractor 433 can be represented as in Equation 2.

$$u_m(k) = x'_m(k) - \mathbf{h}_m^T(k)\,\mathbf{w}_m(k) \qquad (2)$$

[0040] wherein $\mathbf{h}_m(k)$ and $\mathbf{w}_m(k)$ can be represented as in Equations 3 and 4, respectively.

$$\mathbf{h}_m(k) = [h_{m,1}(k), h_{m,2}(k), \ldots, h_{m,L}(k)]^T \qquad (3)$$

[0041] wherein $h_{m,l}(k)$ is an l-th coefficient of $\mathbf{h}_m(k)$.

$$\mathbf{w}_m(k) = [w_m(k-1), w_m(k-2), \ldots, w_m(k-L)]^T \qquad (4)$$

[0042] wherein $\mathbf{w}_m(k)$ denotes a vector collecting L past values of w_m(k), and L denotes the number of filter taps of the M ABFs 431a and 431b.
[0043] The M ACFs 435a and 435b of the second filter 435 adaptively filter the signals u_1(k) and u_M(k) output from the M subtractors 433a and 433b of the first subtractor 433 according to signals output from the M subtractors 437a and 437b of the second subtractor 437, so that a characteristic of noise components of the filtered signals output from the M ACFs 435a and 435b is the same as that of noise components of the signal b(k) output from the FBF 410.
[0044] The M subtractors 437a and 437b of the second subtractor 437 respectively subtract the signals output from the M ACFs 435a and 435b of the second filter 435 from the signal b(k) output from the FBF 410, and output w_1(k) and w_M(k) to the second adder 439. When a coefficient vector of the m-th ACF of the second filter 435 is $\mathbf{g}_m(k)$ and the number of taps is N, the signal w_m(k) output from the M subtractors 437a and 437b of the second subtractor 437 can be represented as in Equation 5.

$$w_m(k) = b(k) - \mathbf{g}_m^T(k)\,\mathbf{u}_m(k) \qquad (5)$$

[0045] wherein $\mathbf{g}_m(k)$ and $\mathbf{u}_m(k)$ can be represented as in Equations 6 and 7, respectively.

$$\mathbf{g}_m(k) = [g_{m,1}(k), g_{m,2}(k), \ldots, g_{m,N}(k)]^T \qquad (6)$$

[0046] wherein $g_{m,n}(k)$ denotes an n-th coefficient of $\mathbf{g}_m(k)$.

$$\mathbf{u}_m(k) = [u_m(k-1), u_m(k-2), \ldots, u_m(k-N)]^T \qquad (7)$$

[0047] wherein $\mathbf{u}_m(k)$ denotes a vector collecting N past values of u_m(k), and N denotes the number of filter taps of the M ACFs 435a and 435b.
[0048] The second adder 439 adds w_1(k) and w_M(k) output from the M subtractors 437a and 437b of the second subtractor 437 and outputs a signal y(k) in which noise components are cancelled. The signal y(k) output from the second adder 439 can be represented as in Equation 8.

$$y(k) = \sum_{m=1}^{M} w_m(k) \qquad (8)$$
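The signal flow of the FIG. 4 separator (first subtractors, second subtractors, and second adder) can be sketched per sample as below. This is an illustrative forward pass only: the coefficients are held fixed, adaptation is omitted, each subtractor output is assumed to be the delayed input minus the ABF output or b(k) minus the ACF output as the text describes, and the names `separate_fig4` and `past` are hypothetical.

```python
def past(sig, k, i):
    """Value sig(k - i), taken as zero before the start of the record."""
    return sig[k - i] if k - i >= 0 else 0.0


def separate_fig4(xd, h, g):
    """xd: M delay-compensated channels; h[m], g[m]: fixed ABF / ACF taps."""
    M, K = len(xd), len(xd[0])
    b = [sum(xd[m][k] for m in range(M)) for k in range(K)]  # first adder
    u = [[0.0] * K for _ in range(M)]  # first-subtractor outputs
    w = [[0.0] * K for _ in range(M)]  # second-subtractor outputs
    y = [0.0] * K
    for k in range(K):
        for m in range(M):
            # ABF m filters past values of w_m; the result is removed from x'_m.
            abf_out = sum(c * past(w[m], k, l + 1) for l, c in enumerate(h[m]))
            u[m][k] = xd[m][k] - abf_out
            # ACF m filters past values of u_m; the result is removed from b.
            acf_out = sum(c * past(u[m], k, n + 1) for n, c in enumerate(g[m]))
            w[m][k] = b[k] - acf_out
        y[k] = sum(w[m][k] for m in range(M))  # second adder
    return y, u


ones = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
y, u = separate_fig4(ones, h=[[0.5], [0.5]], g=[[0.0], [0.0]])
```

Because both filters read only past samples, the loop between the two subtractors can be evaluated sample by sample without solving an implicit equation.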
`
[0049] FIG. 5 is a block diagram of an adaptive beamformer according to another embodiment of the present invention. Referring to FIG. 5, the adaptive beamformer includes an FBF 510 and a multi-channel signal separator 530. The FBF 510 includes a microphone array 511 having M microphones 511a, 511b and 511c, a time delay estimator 513, a delayer 515 having M delay devices 515a, 515b and 515c, and a first adder 517. The multi-channel signal separator 530 includes a first filter 531 having M ABFs 531a, 531b, and 531c, a first subtractor 533 having M subtractors 533a, 533b and 533c, a second filter 535 having M ACFs 535a, 535b and 535c, a second adder 537, and a second subtractor 539. Here, the structure and operation of the FBF 510 are the same as those of the FBF 410 shown in FIG. 4, and thus will not be described herein; only the multi-channel signal separator 530 will be described.
[0050] Referring to FIG. 5, in the multi-channel signal separator 530, the M ABFs 531a, 531b and 531c of the first filter 531 adaptively filter a signal y(k) output from the second subtractor 539 according to signals output from the M subtractors 533a, 533b and 533c of the first subtractor 533, so that a characteristic of speech components of the filtered signals output from the M ABFs 531a, 531b and 531c is the same as that of speech components of a microphone signal x'_m(k) that is delayed for a predetermined period of time.
[0051] The M subtractors 533a, 533b and 533c of the first subtractor 533 respectively subtract the signals output from the ABFs 531a, 531b and 531c from microphone signals x'_1(k), x'_2(k) and x'_M(k) delayed for a predetermined period of time and output signals z_1(k), z_2(k) and z_M(k) to the M ACFs 535a, 535b and 535c of the second filter 535. When a coefficient vector of the m-th ABF of the first filter 531 is $\mathbf{h}_m(k)$ and the number of taps is L, the signal z_m(k) output from the M subtractors 533a, 533b and 533c of the first subtractor 533 can be represented as in Equation 9.

$$z_m(k) = x'_m(k) - \mathbf{h}_m^T(k)\,\mathbf{y}(k) \qquad (9)$$

[0052] wherein $\mathbf{h}_m(k)$ and $\mathbf{y}(k)$ can be represented as in Equations 10 and 11, respectively.

$$\mathbf{h}_m(k) = [h_{m,1}(k), h_{m,2}(k), \ldots, h_{m,L}(k)]^T \qquad (10)$$

[0053] wherein $h_{m,l}(k)$ denotes an l-th coefficient of $\mathbf{h}_m(k)$.

$$\mathbf{y}(k) = [y(k-1), y(k-2), \ldots, y(k-L)]^T \qquad (11)$$

[0054] wherein $\mathbf{y}(k)$ denotes a vector collecting L past values of y(k), and L denotes the number of filter taps of the M ABFs 531a, 531b and 531c.
[0055] The M ACFs 535a, 535b and 535c of the second filter 535 adaptively filter the signals z_1(k), z_2(k) and z_M(k) output from the M subtractors 533a, 533b and 533c of the first subtractor 533 according to a signal output from the second subtractor 539, so that a characteristic of noise components of a signal v(k) output from the second adder 537 is the same as that of noise components of the signal b(k) output from the FBF 510.
[0056] The second adder 537 adds the signals output from the M ACFs 535a, 535b and 535c. When a coefficient vector of the m-th ACF of the second filter 535 is $\mathbf{g}_m(k)$ and the number of taps is N, a signal v(k) output from the second adder 537 can be represented as in Equation 12.

$$v(k) = \sum_{m=1}^{M} \mathbf{g}_m^T(k)\,\mathbf{z}_m(k) \qquad (12)$$

[0057] wherein $\mathbf{g}_m(k)$ and $\mathbf{z}_m(k)$ can be represented as in Equations 13 and 14, respectively.

$$\mathbf{g}_m(k) = [g_{m,1}(k), g_{m,2}(k), \ldots, g_{m,N}(k)]^T \qquad (13)$$

[0058] wherein $g_{m,n}(k)$ denotes an n-th coefficient of $\mathbf{g}_m(k)$.

$$\mathbf{z}_m(k) = [z_m(k-1), z_m(k-2), \ldots, z_m(k-N)]^T \qquad (14)$$

[0059] wherein $\mathbf{z}_m(k)$ denotes a vector collecting N past values of z_m(k), and N denotes the number of filter taps of the M ACFs 535a, 535b and 535c.
[0060] The second subtractor 539 subtracts the signal v(k) output from the second adder 537 from the signal b(k) output from the FBF 510 and outputs the signal y(k). The signal y(k) output from the second subtractor 539 can be represented as in Equation 15.

$$y(k) = b(k) - v(k) \qquad (15)$$
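The FIG. 5 variant differs from FIG. 4 in that a single output y(k) is fed back to all M ABFs and the ACF outputs are summed before one subtraction from b(k). The sketch below follows that description with fixed coefficients and no adaptation; it assumes each filter reads only past samples, and the names `separate_fig5` and `past` are hypothetical.

```python
def past(sig, k, i):
    """Value sig(k - i), taken as zero before the start of the record."""
    return sig[k - i] if k - i >= 0 else 0.0


def separate_fig5(xd, h, g):
    """xd: M delay-compensated channels; h[m], g[m]: fixed ABF / ACF taps."""
    M, K = len(xd), len(xd[0])
    b = [sum(xd[m][k] for m in range(M)) for k in range(K)]  # first adder 517
    z = [[0.0] * K for _ in range(M)]  # outputs of the M first subtractors
    y = [0.0] * K
    for k in range(K):
        for m in range(M):
            # Each ABF filters past values of the shared output y.
            abf_out = sum(c * past(y, k, l + 1) for l, c in enumerate(h[m]))
            z[m][k] = xd[m][k] - abf_out
        # Second adder: sum of all ACF outputs, each over past values of z_m.
        v = sum(c * past(z[m], k, n + 1)
                for m in range(M) for n, c in enumerate(g[m]))
        y[k] = b[k] - v  # second subtractor 539
    return y, z


ones = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
y, z = separate_fig5(ones, h=[[0.0], [0.0]], g=[[0.5], [0.5]])
```

With all coefficients set to zero the function simply returns b(k), which is a convenient sanity check on the wiring.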
`
[0061] In the above-described embodiments, the M ABFs 431a and 431b of the first filter 431, the M ABFs 531a, 531b and 531c of the first filter 531, the M ACFs 435a and 435b of the second filter 435, and the M ACFs 535a, 535b and 535c of the second filter 535, illustrated in FIGS. 4 and 5, respectively, may be FIR filters. In view of its own inputs and outputs, each of the filters is an FIR filter. However, the multi-channel signal separators 430 and 530 may be regarded as infinite impulse response (IIR) filters in view of their inputs, i.e., the signal b(k) output from the FBFs 410 and 510 and the microphone signals x'_1(k), x'_2(k) and x'_M(k) delayed for a predetermined period of time, and their outputs, i.e., the signal y(k) output from the second adder 439 shown in FIG. 4 and the second subtractor 539 shown in FIG. 5. This is because the M ABFs 431a and 431b and the M ABFs 531a, 531b and 531c of the first filters 431 and 531 and the M ACFs 435a and 435b and the M ACFs 535a, 535b and 535c of the second filters 435 and 535 have a feedback connection structure.
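The point that FIR filters connected in a feedback loop behave as an IIR system can be checked with a toy case. The sketch below assumes a single channel (so that b(k) equals the input), one ABF tap `a`, and one ACF tap `c`, all held fixed; these simplifications and the name `feedback_pair` are illustrative assumptions, not part of the described embodiments.

```python
def feedback_pair(x, a, c):
    """One-tap ABF (a) and one-tap ACF (c) wired in a feedback loop.

    z(k) = x(k) - a * y(k-1)   (first subtractor)
    y(k) = x(k) - c * z(k-1)   (second subtractor, with b(k) = x(k))
    """
    y, z = [], []
    for k in range(len(x)):
        y_prev = y[k - 1] if k >= 1 else 0.0
        z_prev = z[k - 1] if k >= 1 else 0.0
        z.append(x[k] - a * y_prev)
        y.append(x[k] - c * z_prev)
    return y


impulse = [1.0] + [0.0] * 7
response = feedback_pair(impulse, a=0.5, c=0.5)
```

Substituting the first line into the second gives y(k) = x(k) - c x(k-1) + a c y(k-2), a recursion on past outputs, so the impulse response decays geometrically instead of ending after a fixed number of taps.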
[0062] Coefficients of the FIR filters are updated by the information maximization algorithm proposed by Anthony J. Bell. The information maximization algorithm is a statistical learning rule well known in the field of independent component analysis, by which non-Gaussian data structures of latent sources are found from sensor array observations on the assumption that the latent sources are statistically independent. Because the information maximization algorithm does not need a voice activity detector (VAD), coefficients of ABFs and ACFs can be automatically adapted without knowledge of the desired and undesired signal levels.
[0063] According to the information maximization algorithm, coefficients of the M ABFs 431a and 431b and the M ACFs 435a and 435b are updated as in Equations 16 and 17.

[0064] wherein α and β denote step sizes for learning rules and SGN(·) is a sign function which is +1 if an input is greater than zero and -1 if the input is less than zero.
[0065] According to the information maximization algorithm, coefficients of the M ABFs 531a, 531b and 531c and the M ACFs 535a, 535b and 535c are updated as in Equations 18 and 19.

[0066] wherein α and β denote step sizes for learning rules and SGN(·) is a sign function which is +1 if an input is greater than zero and -1 if the input is less than zero. The sign function SGN(·) could be replaced by any kind of saturation function, such as a sigmoid function or a tanh(·) function.
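The sign function and its possible replacement by a saturation function can be illustrated as follows. The update shown is only a generic sign-type coefficient step assumed for this example (each tap moves by step size times the saturated error times a past input); it is not a reproduction of Equations 16 to 19, and the names `sgn` and `update_taps` are hypothetical.

```python
import math


def sgn(v):
    """+1 for v > 0, -1 for v < 0. Returning 0 at v == 0 is an assumption,
    since the text leaves that case unspecified."""
    return (v > 0) - (v < 0)


def update_taps(coeffs, step, error, past_inputs, sat=sgn):
    """Hypothetical sign-type update: tap_i += step * sat(error) * input_i."""
    return [c + step * sat(error) * p for c, p in zip(coeffs, past_inputs)]


taps = [0.0, 0.0]
hard = update_taps(taps, step=0.1, error=-3.0, past_inputs=[1.0, -2.0])
# Swapping in tanh gives a smaller step when the error is small.
soft = update_taps(taps, step=0.1, error=0.5, past_inputs=[1.0, -2.0],
                   sat=math.tanh)
```

With the hard sign only the polarity of the error matters, whereas a saturation function such as tanh also scales the step for small errors.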
[0067] In addition, coefficients of the M ABFs 431a and 431b, the M ABFs 531a, 531b and 531c, the M ACFs 435a and 435b, and the M ACFs 535a, 535b and 535c can be updated using any kind of statistical learning algorithm, such as a least square algorithm or its variant, a normalized least square algorithm.
[0068] As described above, when the M ABFs 431a and 431b and the M ACFs 435a and 435b, and the M ABFs 531a, 531b and 531c and the M ACFs 535a, 535b and 535c, are FIR filters connected in a feedback structure, and the number of microphones of each of the microphone arrays 411 and 511 is 8, the number of filter taps of the adaptive beamformer shown in FIG. 4 or 5 is 8×(128+128)=2048, which is far fewer than the 8×(512+128)=5120 filter taps of the conventional adaptive beamformer shown in FIG. 1.
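The tap-count comparison is simple arithmetic and can be reproduced directly. The helper name `total_taps` is hypothetical, and since the text does not say which of the two per-channel filter lengths belongs to the ABF and which to the ACF, the sketch just sums them.

```python
def total_taps(num_mics, taps_per_channel):
    """Total filter taps: one filter of each listed length per microphone."""
    return num_mics * sum(taps_per_channel)


feedback_taps = total_taps(8, (128, 128))     # structure of FIG. 4 or 5
feedforward_taps = total_taps(8, (512, 128))  # conventional structure of FIG. 1
ratio = feedback_taps / feedforward_taps
```

Under these figures the feedback structure uses 40% of the taps of the conventional one.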
[0069] FIG. 6 illustrates an experimental environment used for comparing an adaptive beamformer according to the present invention and the conventional adaptive beamformer shown in FIG. 1. A circular microphone array having a diameter of 30 cm was located in the center of a room having a length of 6.5 m, a width of 4.1 m, and a height of 3.5 m. Eight microphones were installed on the circular microphone array, each equidistant from its adjacent microphones. The heights of the microphone array, a target speaker, and a noise speaker were all 0.79 m from the floor. Target sources were speech waves of 40 words pronounced by four male speakers, and noise sources were a fan and music.
[0070] The results of an objective evaluation of the performance of the two adaptive beamformers in the above-described experimental environment, e.g., a comparison of SNRs, are shown in Table 1 (all values are in dB).

TABLE 1

            Raw Signal    Prior Art (GSC)    Present Invention
FAN         9.0           19.5               27.5
MUSIC       6.9           15.5               24.9
ΔFAN        X             10.5               18.5
ΔMUSIC      X             8.6                18.0
`
[0071] As can be seen in Table 1, the SNR in a beamforming method according to the present invention is roughly double the SNR in a beamforming method according to the prior art.
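The Δ rows of Table 1 are the differences between each beamformer's output SNR and the raw-signal SNR, which can be verified from the table values. The dictionary and function names below are illustrative only.

```python
# SNR values in dB, copied from Table 1.
raw = {"FAN": 9.0, "MUSIC": 6.9}
prior_art = {"FAN": 19.5, "MUSIC": 15.5}
present = {"FAN": 27.5, "MUSIC": 24.9}


def snr_gain(output, baseline):
    """SNR improvement over the raw signal in dB, rounded to one decimal."""
    return {noise: round(output[noise] - baseline[noise], 1)
            for noise in baseline}


gain_prior = snr_gain(prior_art, raw)
gain_present = snr_gain(present, raw)
```

The computed gains match the ΔFAN and ΔMUSIC rows, and it is these dB improvements (18.5 against 10.5, and 18.0 against 8.6) that differ by a factor of roughly two.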
[0072] For a subjective evaluation in the experimental environment, e.g., an AB preference test, ten people listened to outputs of a beamformer according to the prior art and a beamformer according to the present invention and were then asked to choose one of the following sentences for evaluation: "A is much better than B", "A is better than B", "A and B are the same", "A is worse than B", and "A is much worse than B". A test program randomly determined which of the two beamformers would output signal A. Two points were given for "much better", one point for "better", and no points for "the same", and the results were then summed. The subjective evaluation compared 40 words for fan noise and another 40 words for music noise, and the results of the comparison are shown in Table 2.
`
TABLE 2

            Prior Art (GSC)    Present Invention
FAN         78                 517
MUSIC       140                284
`
[0073] As can be seen in Table 2, the outputs of the beamformer according to the present invention are superior to the outputs of the beamformer according to the prior art.
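The point scheme of the AB preference test can be sketched as a small tally. The sketch assumes that a "worse" answer is recorded as a "better" answer for the other beamformer, which the text does not state explicitly; the names `POINTS` and `tally` and the sample responses are hypothetical.

```python
POINTS = {"much better": 2, "better": 1, "same": 0}


def tally(responses):
    """Sum points per system from (preferred_system, rating) pairs.

    Assumption: "A is worse than B" is entered as ("B's system", "better").
    """
    scores = {"prior art": 0, "present invention": 0}
    for system, rating in responses:
        scores[system] += POINTS[rating]
    return scores


# Three made-up answers, only to show the bookkeeping.
sample = [
    ("present invention", "much better"),
    ("prior art", "better"),
    ("present invention", "same"),
]
scores = tally(sample)

# If every one of the ten listeners rated every one of the 40 words,
# the highest score a system could reach per noise type would be:
max_score = 10 * 40 * POINTS["much better"]
```

Under that reading, the Table 2 entries are sums over all listeners and words for each noise type.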
`0074 As described above, according to the present
`invention, by connecting ABFs and