[Front-page drawing: TV 100, communication channel 101, loudspeakers 102, amplifiers 106, acquisition module 107, DSP 108.]

`Amazon Ex. 1005
`IPR Petition - US RE47,049
`Amazon Ex. 1005, Page 1 of 23
`
`(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
`
`(19) World Intellectual Property Organization
`International Bureau
`
`(43) International Publication Date
`10 April 2008 (10.04.2008)
`
`PCT
`
`(51) International Patent Classification:
`
`Not classified
`
`(21) International Application Number:
`PCT/RS2007/000017
`
`(22) International Filing Date:
`19 September 2007 (19.09.2007)
`
(25) Filing Language: English

(26) Publication Language: English
`
`(10) International Publication Number
`WO 2008/041878 A2
`(81) Designated States (unless otherwise indicated, for every
`kind of national protection available): AE, AG, AL, AM,
`AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, CA, CH,
`CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG,
`ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL,
`IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK,
`LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW,
`MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PG, PH, PL,
`PT, RO, RS, RU, SC, SD, SE, SG, SK, SL, SM, SV, SY,
`TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA,
`ZM, ZW
`
(30) Priority Data:
P-2006/0551    4 October 2006 (04.10.2006)    RS
`
(71) Applicant: MICRONAS NIT [RS/RS]; Fruskogorska
11a, 21000 Novi Sad (RS).

(72) Inventors; and
(75) Inventors/Applicants (for US only): SARIC, Zoran
[RS/RS]; Vukasovica 65/7, 11000 Novi Beograd (RS).
JOVICIC, Slobodan [RS/RS]; Visnjicki venac 67,
11000 Beograd (RS). KOVACEVIC, Vladimir [RS/RS];
Radnicka 35A, 21000 Novi Sad (RS). TESLIC, Nikola
[RS/RS]; Bul. Cara Lazara 29, 21000 Novi Sad (RS).
KUKOLJ, Dragan [RS/RS]; Narodnog fronta 31, 21000
Novi Sad (RS).

(84) Designated States (unless otherwise indicated, for every
kind of regional protection available): ARIPO (BW, GH,
GM, KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM,
ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI,
FR, GB, GR, HU, IE, IS, IT, LT, LU, LV, MC, MT, NL, PL,
PT, RO, SE, SI, SK, TR), OAPI (BF, BJ, CF, CG, CI, CM,
GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).

Declaration under Rule 4.17:
- of inventorship (Rule 4.17(iv))

Published:
- without international search report and to be republished
upon receipt of that report
`
`(54) Title: SYSTEM AND PROCEDURE OF FREE SPEECH COMMUNICATION USING A MICROPHONE ARRAY
`
(57) Abstract: The invention relates to a system and procedure for hands-free voice communication in
video-phone or teleconference applications using a microphone array. Its main purpose is to make a
quality recording of a speaker in a room at larger distances, in the presence of noise, acoustic echo
produced by the far-end speaker and the TV program, room reverberation, and movement of the speaker in
the room. The system contains: a digital TV receiver and a digital camera for picture reproduction and
capture, respectively; stereo loudspeakers and a microphone array for sound reproduction and recording,
respectively; an amplifier and an acquisition module for audio signals; and a DSP for acoustic signal
processing. The microphone signals are processed in the frequency domain. The procedure comprises:
suppression of the acoustic echo of two signals (the far-end speaker signal and the stereo TV signal);
acoustic spatial filtering of the near-end speaker with respect to noise sources and room
reverberation, based on an adaptive directivity characteristic of the microphone array; localization
of the speaker in the horizontal plane; suppression of all residual noise; and adaptive gain control
of the transmitted signal.
`
`
`SYSTEM AND PROCEDURE OF FREE SPEECH COMMUNICATION USING A
`MICROPHONE ARRAY
`
`Technical Field
`
The invention belongs to the field of acoustic signal processing, more precisely to methods of
acoustic echo cancellation, localization and selection of an active speaker in a reverberant
acoustic environment, and noise suppression by means of a microphone array.
`
`Background Art
`
`Hands-free full-duplex speech communication systems are used in many existing
`applications, such as: video-phone systems, teleconference systems, room and car hands-
`free systems, human-machine interface using voice, etc.
`
Use of hands-free speech communication systems implies that the talker's position in the acoustic
environment is not specified, with variable distances from the system's microphones and
loudspeakers. Hands-free speech communication under such unknown conditions gives rise to a
number of technical problems that must be solved in order to preserve good quality of the
speech communication.
`
The basic problem is acoustic echo, generated by partial transmission of acoustic energy from the
loudspeaker to the microphone, so that the far-end speaker hears his own voice as a disturbance.
Conventionally, echo cancellation is done by an adaptive filter that estimates the transfer
function of the acoustic echo path between loudspeaker and microphone, so that its output is
approximately the same as the acoustic echo signal. Subtracting these two signals cancels the
acoustic echo. However, echo cancellation cannot be perfect because of system non-linearity and
the non-stationarity of the acoustic environment. As a result, a residual echo signal remains.
A basic requirement also remains: the recorded near-end speech signal should not be degraded by
the echo suppression process.
`
In the acoustic environment, acoustic disturbances of different nature and origin may appear.
These disturbances can be stationary or non-stationary (for example, computer noise or car
noise), and they come from many different sources located at different positions in the room or
space where the speaker is.
`
Besides that, in closed rooms (such as work rooms, halls and automobile cabins) the effect of
reverberation appears as a consequence of multiple reflections of acoustic waves from walls and
obstacles. Since the acoustic environment contains sources of disturbances besides the speaker,
the desired signal (coming from the speaker) must be separated from the disturbances in order to
make its recording possible. Conventionally, this problem may be solved by using a microphone
array with a number of microphones arranged in a line at a minimum mutual distance. With
appropriate processing of the microphone array signals, direction-dependent sensitivity of the
microphone system may be achieved. Such microphone systems
have a narrow directivity characteristic, sufficient to record only the actual speaker in the
acoustic environment, while the signals of spatially separated noise sources are suppressed,
thereby providing a higher signal-to-disturbance ratio. The gain depends on: the directivity of
the microphone array (width of the main lobe), the side-lobe level, the separability of speech
sources and noise sources (sources that are too close are difficult to separate), reverberation
time, non-stationary acoustic sources, etc.
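
The direction-dependent sensitivity described above can be illustrated with a minimal sketch of a conventional delay-and-sum line array. The five microphones follow the embodiment described later in the text; the 5 cm spacing, the 2 kHz test frequency, the speed of sound and the function name `delay_and_sum_response` are assumptions made only for this illustration.

```python
import cmath
import math

def delay_and_sum_response(theta_deg, steer_deg=0.0, n_mics=5,
                           spacing_m=0.05, freq_hz=2000.0, c=343.0):
    """Normalized magnitude response of a uniform line array.

    Angles are azimuths measured from broadside. spacing_m and freq_hz are
    illustrative assumptions, not values taken from the text.
    """
    k = 2.0 * math.pi * freq_hz / c          # wavenumber at the test frequency
    diff = math.sin(math.radians(theta_deg)) - math.sin(math.radians(steer_deg))
    # Phase-aligned sum over the microphones; equals n_mics in the steered direction.
    total = sum(cmath.exp(1j * k * m * spacing_m * diff) for m in range(n_mics))
    return abs(total) / n_mics

# Sensitivity of an array steered to broadside for a few arrival angles.
for angle in (-60, -30, 0, 30, 60):
    print(angle, round(delay_and_sum_response(angle), 3))
```

The response is 1 in the steered direction and smaller elsewhere; how quickly it falls off (the main-lobe width) depends on the spacing and on frequency, which is one reason the achievable gain varies as the paragraph notes.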
`
Determining the speaker direction in the acoustic environment and steering the directivity of the
microphone array toward it is an important problem in hands-free communication systems.
Procedures for determining the speaker direction are very sensitive to disturbances present in
the environment, especially to a non-stationary speaker (one who moves within the environment)
and to several speakers talking simultaneously in the environment (cocktail party effect).
Determining the direction of the actual speaker relative to the microphone array in the
horizontal plane (the azimuth) is a very important step in video-phone and teleconferencing
systems, because the speaker coordinates are needed to control the movable camera in the system.
`
During speech recording in an acoustic environment, additive stationary or non-stationary noise
always appears, as does residual noise from the processing of acoustic signals. They degrade the
quality of the recorded speech signal. If they are intense enough, they may even reduce the
intelligibility of the speech. There are many algorithms for noise reduction (NR), optimized for
specific noise types. The common requirement for all of them is to improve the signal-to-noise
ratio while avoiding distortion of the speech signal and reduction of its intelligibility.
`
Variable environmental conditions and the variable distance between the speaker and the
microphone array require automatic gain control (AGC), which keeps the speaker's voice level
constant and more comfortable for the listener at the far end of the communication channel.
Automatic gain control in full-duplex systems requires additional information from the near-end
speech activity detector, the far-end speech activity detector and the acoustic echo canceller.
`
The technical problems mentioned above, which arise in a "hands-free" communication system for
full-duplex speech transmission used in video-phone and/or teleconference systems, are very
complex. They demand an integral and optimal solution approach, taking into account real-time
operation on a commercial digital signal processor (DSP) platform.
`
Quality speech recording in the presence of acoustic noise and room reverberation is a complex
problem. When the spectrum of the useful speech signal overlaps the spectrum of the noise
present, single-channel processing cannot significantly improve the speech signal quality. With
the development of digital signal processing and the availability of sufficiently powerful DSPs,
the way is open for multi-microphone processing of acoustic signals. The benefit of a microphone
array over single-channel processing is the ability to adapt its spatial receiving
characteristic (directivity characteristic) to the current arrangement of the chosen speaker and
the noise sources in the room. In this way, maximum suppression of the noise present is achieved
while the speaker is emphasized. The main problems in the use of microphone arrays are
(M.S. Brandstein, D.B. Ward (Eds.), Microphone Arrays: Signal Processing Techniques and
Applications, Springer, Berlin 2001; Y. Huang, J. Benesty, Audio signal processing for next
generation multimedia communication systems, Kluwer Academic Publ.; 2004): determining the exact
location of the chosen speaker, determining the exact number and positions of the noise sources
present in the room, multiple reflections of the useful source and of the noise from the room
walls, and non-stationarity of the acoustic noise sources and the chosen speaker.
`
When the microphone array is used in video-phone or teleconference systems in full-duplex
operation, the number of possible problems grows. The biggest problem is the presence of
acoustic echo, followed by the need for automatic gain control (AGC) of the transmitting part of
the system, as well as the possible presence of system instability, called microphony. An
additional problem, addressed in this patent, is the presence of the TV program signal, which
appears as an additional acoustic echo at the input of the microphone array.
`
The large number of problems mentioned has given rise to many different kinds of solutions,
which have been patented and which solve some of the problems individually or a few of them
integrally. For example: U.S. published patent application 2006/0153360 A1, filed September 2nd
2005, entitled "Speech signal processing with combined noise reduction and echo compensation",
gives an integral solution for echo reduction and noise reduction; U.S. patent 7,035,415 B2,
filed May 15th 2001, entitled "Method and device for acoustic echo cancellation combined with
adaptive beamforming", gives an integral solution for echo reduction and forming of a
directional microphone array characteristic; EP published patent application 1 633 121 A1, filed
September 3rd 2004, entitled "Speech signal processing with combined adaptive noise reduction
and adaptive echo compensation", gives an integral solution for residual echo reduction and
noise reduction; EP published patent application 1 571 875 A2, filed February 23rd 2005,
entitled "A system and method for beamforming using a microphone array", gives a solution only
for forming a directional microphone array characteristic; EP published patent application
1 581 026 A1, filed March 17th 2004, entitled "Method for detecting and reducing noise from a
microphone array", gives a solution only for noise reduction in a microphone array; and EP
published patent application 1 286 175 A2, filed August 1st 2002, entitled "Robust talker
localization in reverberant environment", gives a solution only for talker localization in a
reverberant room.
`
The integral solution to all the mentioned problems realized in this patent combines the
positive characteristics of the individual signal processing solutions. The problems are solved
integrally in the frequency domain, which optimizes the use of computing resources and gives a
real-time solution, securing the quality of hands-free speech communication in video-phone
and/or teleconference systems.
`
`Disclosure of the Invention
`
The subject of this patent is a hands-free speech communication system for video-phone or
teleconference applications, which uses a microphone array and complex acoustic signal
processing to secure better quality and clarity of the speech signal in a complex acoustic
environment, and in which many of the previously mentioned shortcomings are eliminated,
separately or integrally.
`
`
The system that is the subject of this patent transmits speech, and digital television is used
as the transmission medium. A microphone array and loudspeakers, which are integral components
of the TV receiver, are used for recording and reproduction of the speech signal, respectively.
In video-phone or teleconference applications, a digital camera and the digital TV receiver are
used for picture capture and reproduction, respectively.
`
The essence of the invention is the specific processing of the speech signal recorded in the
acoustic environment of the room where the speaker and the system are located. To record a
speaker who is at some distance (a few meters) from the TV receiver, the system uses a
microphone array of N microphones. The microphone array records all signals present in the
room: the useful signal, as a direct wave travelling from the talker to the microphones, and
various noise signals. The noise signals are: acoustic echo as a direct wave from the
loudspeaker emitting the interlocutor's voice from the far end of the communication channel;
acoustic echo as a direct sound wave from the loudspeakers emitting the stereo TV program;
direct waves from one or more noise sources or other sound sources audible in the room; and
reflected waves (room echo) of all these sources, including the speaker, which together make up
the room reverberation. It should be emphasized that noise sources in the room can be stationary
or, as is frequently the case, non-stationary, both in their characteristics and in their
location in the room (moving sound sources).
`
Different kinds of noise require different techniques for their elimination. The essence of this
invention is an optimally designed algorithm that eliminates all noise as far as possible and
secures the best quality of the speech signal transmitted to the interlocutor at the far end of
the communication channel.
`
The microphone signals from the microphone array are processed in digital form in the DSP,
entirely in the frequency domain. This domain offers certain advantages in processing speed and
number of computing operations, which is very important for real-time operation of the DSP. For
acoustic echo cancellation, all loudspeaker signals must also be fed into the DSP.
`
The DSP runs several complex algorithms: an acoustic echo cancellation (AEC) algorithm; a
microphone array signal processing algorithm for adaptive beamforming (ABF) of the directivity
characteristic; an algorithm for estimating the direction of arrival (DOA) of the useful signal,
in other words for localizing the speaker in the room; an algorithm for reduction of stationary
noise, non-stationary noise and residual echo (NR, noise reduction); and an algorithm for
automatic gain control (AGC), which compensates for different distances of the speaker from the
microphone array. Besides these basic algorithms, the DSP runs some additional ones, such as: a
voice activity detector (VAD) at the near end, a VAD at the far end, a double-talk detector
(DTD) for both sides, additional post-filtering (PF) for noise reduction, etc. The aim of these
algorithms is maximum reduction of all noise present with minimum degradation of the speech
signal, thereby securing maximum quality of the transmitted speech signal.
`
A specific aspect of the invention is adaptive acoustic echo cancellation using an adaptive
filter that models the transfer characteristic of the acoustic path from the loudspeaker to the
microphone. The transfer characteristic is complex, since it covers the transmission paths from 2
(stereo) loudspeakers to the N microphones in the microphone array, and each microphone signal
is filtered by its own adaptive filter. The operation of the adaptive filters is controlled by
the speech activity detectors on both sides.
`
The next specific part of the invention is the adaptive directivity characteristic of the
microphone array, which secures spatial filtering and directional separation in the room with
the speaker, so that the useful signal is boosted to the maximum relative to the other,
interfering signals. The directivity characteristic of the microphone array is accomplished by
adaptive weighting and summing of the microphone signals, which secures a stable directivity
index in the frequency domain in a reverberant acoustic environment.
`
Determining the direction of arrival of the speaker's direct acoustic wave is the next specific
feature of the invention. This function of the hands-free speech communication system is
necessary for controlling the directivity characteristic of the microphone array in azimuth, and
it can also be used for controlling and steering the video camera. It uses the microphone
signals after acoustic echo cancellation. The direction of arrival of the speaker's direct
acoustic wave is estimated from the cross-correlations of the microphone signals and their phase
transforms. This function is directly controlled by the speech activity detector.
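
Cross-correlation with a phase transform is commonly known as GCC-PHAT. The following is a minimal sketch of that general technique for a single microphone pair, not the procedure of the patent itself. The sampling rate, the 10 cm microphone distance, the circularly delayed test signal and all function names are assumptions chosen for the example, and the naive DFT is used only to keep the block self-contained.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(n^2) DFT, adequate for a short illustrative frame."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def gcc_phat_delay(a, b):
    """Delay of b relative to a, in samples (positive: b lags a)."""
    n = len(a)
    cross = [bk * ak.conjugate() for ak, bk in zip(dft(a), dft(b))]
    phat = [c / (abs(c) + 1e-12) for c in cross]      # keep the phase, drop the magnitude
    corr = [v.real for v in dft(phat, inverse=True)]
    lag = max(range(n), key=lambda i: corr[i])
    return lag if lag <= n // 2 else lag - n           # map circular lag to a signed delay

def azimuth_deg(delay_samples, fs_hz, mic_distance_m, c=343.0):
    """Far-field azimuth from broadside for one microphone pair (assumed geometry)."""
    s = c * delay_samples / fs_hz / mic_distance_m
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

random.seed(1)
frame = [random.uniform(-1.0, 1.0) for _ in range(64)]
delayed = frame[-3:] + frame[:-3]                      # circular delay of 3 samples
d = gcc_phat_delay(frame, delayed)
print(d, round(azimuth_deg(d, fs_hz=16000.0, mic_distance_m=0.1), 1))
```

In a real system the estimate would be computed only on frames that the speech activity detector marks as near-end speech, in line with the control described above.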
`
The following specific feature of the invention is the process of adaptive suppression of
stationary and non-stationary noise. The process is realized by a non-linear noise suppressor
based on noise estimation, organized in several sub-bands. Two noise estimates are used,
securing an optimal suppression result with respect to the speech signal characteristics. This
is done for safety reasons, in the sense that the adaptive noise reduction process should not
degrade the quality of the speech signal. The filtering process is completed by an adaptive
Wiener post-filter.
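
As a rough illustration of how a Wiener-type gain can limit speech degradation, the sketch below computes one suppression gain per sub-band from a noisy-signal power estimate and a noise power estimate. The power-subtraction SNR estimate, the gain floor of 0.1 and the name `wiener_gains` are assumptions for this example; the text does not specify the estimator used.

```python
def wiener_gains(noisy_power, noise_power, floor=0.1):
    """Per-band suppression gains from noisy-signal and noise power estimates."""
    gains = []
    for p, n in zip(noisy_power, noise_power):
        n = max(n, 1e-12)                    # guard against division by zero
        snr = max(p - n, 0.0) / n            # crude SNR estimate by power subtraction
        # Wiener gain snr / (1 + snr), limited from below so that no band is
        # removed completely (a simple guard against speech distortion).
        gains.append(max(snr / (1.0 + snr), floor))
    return gains

# Three hypothetical sub-bands: strong speech, noise only, below the noise estimate.
bands = wiener_gains([10.0, 1.0, 0.5], [1.0, 1.0, 1.0])
print([round(g, 2) for g in bands])
```

Bands dominated by speech pass almost unchanged, while bands at or below the noise estimate are attenuated only down to the floor.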
`
A specific aspect of the invention is automatic gain control of the speech signal before
transmission to the far-end interlocutor. This feature is an important connecting element of the
hands-free speech communication system. The system compensates for differences in speech signal
intensity, which on the one hand is an individual speech characteristic and on the other hand
depends on the position of the speaker, nearer to or farther from the microphone array. The
solution distinguishes speaker activity from pauses in the useful signal, residual echo,
acoustic noise and the far-end speech signal, for which it uses several pieces of information
previously detected in the system. The analysis of possible scenarios has to be reliable;
otherwise the useful speech signal may be attenuated, which is a negative effect.
`
The special feature of this invention is the improvement of each of the mentioned specifics, as
well as the improvement in integrating all the algorithms into one unit whose operation is
stable and of high quality. The algorithm procedures are optimized by sharing resources.
`
These and other aspects, specifics and benefits of the invention will become more evident after
review of the detailed description of the invention, the patent claims and the accompanying
figures.
`
`
`Brief Description of the Drawings
`
Figure 1 - shows the elements of the hands-free video-phone communication system using a
microphone array and digital television.

Figure 2 - shows the environmental conditions in which the hands-free video-phone communication
system using a microphone array is applied.

Figure 3 - shows a block diagram of the audio signal processing subsystem within the hands-free
video-phone communication system; it contains a microphone array with adaptive directivity
characteristic (SD-BF), a block for speaker localization in the room (DOA), a block for echo
cancellation (AEC), a block for noise reduction (NR) and a block for automatic gain control (AGC).
`
`Figure 4 - shows the block diagram of acoustic echo canceling (AEC).
`
`Figure 5 - shows the block diagram of adaptive determination of near-end speaker direction
`in horizontal plane (DOA-azimuth).
`
`Figure 6 - shows the block diagram of spatial filtering (SD-BF).
`
`Figure 7 - represents the block diagram of noise reduction (NR).
`
`Figure 8 - represents the block diagram of automatic gain control (AGC).
`
`Best Mode for Carrying Out of the Invention
`
This invention presents a system and method of acoustic signal processing for hands-free speech
communication using a microphone array.
`
Figure 1 represents the system elements of hands-free video-phone communication using a
microphone array and digital television. The digital television 100, which the user normally
uses for ordinary TV watching, serves in the hands-free video-phone communication system as a
video terminal and as an audio terminal for communication with another speaker. Namely, when a
call arrives over the communication channel 101 and the connection with the other speaker is
made, the TV 100 is used as a multimedia interface: the speaker listens to the far-end
interlocutor over the loudspeakers 102 and watches him on one part 105 of the screen of the
TV 100. At the same time, at the other end of the communication channel (far-end side), the
other speaker, on a similar TV receiver, also sees his interlocutor at the near-end side, by
means of camera 104 and microphone array 103. Camera 104 is movable and is controlled by
coordinates obtained by processing the microphone signals from microphone array 103.
`
Analog signals from the microphones in microphone array 103 are amplified by the amplifier 106
and, together with the stereo signals of the loudspeakers 102, are fed to the acquisition module
107, which digitizes them and sends them to the DSP 108 for further processing. The processed
speech signal of the near-end speaker is sent from the DSP over the communication channel 101
to the speaker at the far end. Acoustic signal processing in the DSP 108 also yields the spatial
coordinates of the speaker's location in the room with the hands-free communication system.
With them, the DSP 108 controls the steering of the camera 104, directing it at the active
speaker. In that
way, hands-free audio and video communication between two speakers by means of a digital
television system is fully assured.
`
Figure 2 schematically shows the environmental conditions of hands-free video-phone
communication using a microphone array; it shows only the part of the system related to
acoustic signal processing. The room 201 contains the installed hands-free video-phone
communication system, the speaker 202 and a noise source 203, which is normally present in
every acoustic environment. Over the loudspeakers 102 of the stereo audio system of the digital
television, the speaker 202 listens to the incoming speech signal of his far-end interlocutor
204, mostly as a mono signal. The microphone array (made of N microphones) records the sound in
the room 201. After complex processing of the microphone signals in the block 207, the speech
signal of the speaker 202 is transmitted by the block 208 to the far-end speaker as a mono
signal.
`
The acoustic conditions in the room 201 during speech communication are very complex. In the
case of hands-free video-phone communication in the room 201, three sound sources are present:
the stereo loudspeakers 102, which emit the far-end speaker's voice and the TV program, the
speaker 202, and at least one noise source 203. The room may contain more noise sources:
computer noise, air-conditioner noise, street noise, noise from neighbors, building vibrations,
another speaker or even several speakers, music, etc.
`
Therefore, the acoustic picture of the room is very complex. The microphone array 103, as a
sensor system, records all sounds in the room: the direct sound waves from each sound source
and, at the same time, all sound reflections. For example, one direct wave 209 arrives at the
microphone array 103 from the loudspeaker 102, followed by many reflected waves, of which only
one wave 210 is shown in Figure 2; the speaker 202 sends a direct wave 211 and, among others,
two reflected waves 212a and 212b; the noise source 203 sends one direct wave 213 and, among
others, one reflected wave 214.
`
Of all the sounds recorded by the microphone array, only the direct wave 211 from the speaker
202 is useful; all the other waves are treated as disturbances. The biggest disturbance is the
acoustic echo 209, which comes from the loudspeaker 102. All other reflections together produce
the room reverberation. The task of the audio signal processing block 207 is to cancel the
acoustic echo signal, to select the useful signal 211 from the other signals, to suppress the
reverberation signals, and to suppress the signals of the direct noise sources, of which there
can be more than one. A special task of the block 207 is to track the acoustic scene of the room
and its non-stationarity, which depends on the movement or position of the speaker and on
whether the noise sources move or change. These issues are described in more detail in the
following text.
`
Figure 3 shows a schematic diagram of the complete audio signal processing procedure in the
hands-free video-phone communication system using a microphone array. All signals of the
microphones 103, M1 to M5, as well as the stereo signals of the loudspeakers 102, Sp-L and Sp-R,
are digitized in the acquisition block 107 (Figure 1) and converted into the frequency domain by
a fast Fourier transform (FFT) 301, giving the signals x1 to x7. It should be emphasized that
the microphone array contains 5 microphones in this embodiment, but if additional microphones
are needed, they can be installed as required by the
application. The block 302 suppresses the acoustic echo in all microphone signals (x1 to x5),
using the signals x6 and x7 as references. The echo-suppressed signals S_AEC1 to S_AEC5 are used
in the block 304 to determine the direction of arrival (DOA) of the sound wave in the horizontal
plane (azimuth θa) from the actual speaker. In this way, tracking of the active speaker is
possible. Given the azimuth angle θa, the weighting coefficients of the signals x1 to x5 are
optimized in the block 303 in order to form a horizontal directivity characteristic of the
microphone array with its receiving maximum in the azimuth direction θa. The receiving
characteristic formed in the block 303 is superdirective, which means that its directivity index
is larger than that of the characteristic obtained by delay compensation and summation of the
microphone signals.
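
The difference between a superdirective characteristic and plain delay-and-sum can be illustrated with a textbook design: minimum-variance weights computed against a spherically diffuse noise field, with diagonal loading for robustness. This is a sketch of that general technique and is not confirmed to be the design used in block 303. The 5 cm spacing, the 1 kHz frequency, the loading value and all function names are assumptions for the example.

```python
import cmath
import math

def solve(a, b):
    """Solve a x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def steering(theta_deg, n, kd):
    """Far-field steering vector of a uniform line array (azimuth from broadside)."""
    s = math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * kd * i * s) for i in range(n)]

def diffuse_coherence(n, kd):
    """Coherence matrix of spherically diffuse noise: sin(x)/x of the scaled distance."""
    def sinc(x):
        return 1.0 if x == 0 else math.sin(x) / x
    return [[sinc(kd * abs(i - j)) for j in range(n)] for i in range(n)]

def superdirective_weights(d, gamma, loading=1e-2):
    """w = (G + loading*I)^-1 d / (d^H (G + loading*I)^-1 d), so that w^H d = 1."""
    n = len(d)
    loaded = [[gamma[i][j] + (loading if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    v = solve(loaded, d)
    denom = sum(di.conjugate() * vi for di, vi in zip(d, v))
    return [vi / denom for vi in v]

def directivity_index_db(w, d, gamma):
    """Directivity index: response to the look direction over response to diffuse noise."""
    n = len(w)
    num = abs(sum(wi.conjugate() * di for wi, di in zip(w, d))) ** 2
    den = sum(w[i].conjugate() * gamma[i][j] * w[j]
              for i in range(n) for j in range(n)).real
    return 10.0 * math.log10(num / den)

n_mics = 5
kd = 2.0 * math.pi * 1000.0 / 343.0 * 0.05     # wavenumber times assumed spacing
d = steering(0.0, n_mics, kd)
gamma = diffuse_coherence(n_mics, kd)
das = [di / n_mics for di in d]                 # delay-and-sum weights
sd = superdirective_weights(d, gamma)
print(round(directivity_index_db(das, d, gamma), 2),
      round(directivity_index_db(sd, d, gamma), 2))
```

Both weight sets pass the look direction with unit gain; the loaded minimum-variance weights pick up no more diffuse noise than delay-and-sum, so their directivity index is at least as large. A larger loading value moves the design back toward delay-and-sum and makes it less sensitive to microphone mismatch.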
`
Block 303 compensates for the different delays of the acoustic signal from the speaker to the
individual microphones. By controlling these delays with the DOA signal (θa) from the block 304,
the directivity of the microphone array is steered in azimuth. The directivity characteristic of
the microphone array, SD-BF (Superdirective Beamformer), is formed in the block 303. The main
lobe of this characteristic is narrow and directed toward the desired target, and the side lobes
are considerably lower. This gives the microphone array spatial filtering, that is, separation
of the noise sources in the horizontal plane. Such a directivity characteristic is very
important for reducing unwanted noise and the effect of room reverberation by separating them
from the useful signal. The directivity characteristic is formed by weighting the microphone
signals and summing them into a single-channel output signal.
`
The output signal of the block 303 still contains both the speech signal and a noise signal,
which consists of the residual signal after acoustic echo cancellation, the suppressed ambient
noise and the reduced reverberation noise. This signal goes to the noise reduction block NR 305,
where additional reduction of the noise signal is performed. The reduction process is adaptive,
on account of the non-stationarity of the noise signal. An important requirement in the
realization of the NR block is that the noise reduction process should not affect the quality
of the speech signal.
`
The final signal processing block of the hands-free speech communication system for video-phone
or teleconference use is the block 306 for automatic gain control (AGC) of the speech signal.
This block uses several pieces of information from the system that are important for
identifying the possible states of the speech signal and deciding where its amplitude needs to
be corrected in a suitable manner. In this way, an almost constant level of the transmitted
speech signal can be secured, independently of the distance between the actual speaker and the
microphone array, which assures better quality at the opposite side of the communication
channel.
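
A minimal sketch of a gated gain control may help to show why the AGC needs information from other parts of the system: the gain is adapted toward a target level only while near-end speech is detected and is frozen otherwise, so pauses, residual echo and far-end speech do not pull the gain up or down. The target level, adaptation rate, gain limit and the name `agc_step` are assumptions for this example, not values from the text.

```python
def agc_step(gain, frame_rms, near_end_active,
             target_rms=0.1, rate=0.1, max_gain=10.0):
    """Return the updated gain for one frame."""
    if not near_end_active or frame_rms <= 0.0:
        return gain                              # freeze outside near-end speech
    desired = min(target_rms / frame_rms, max_gain)
    return gain + rate * (desired - gain)        # smooth move toward the desired gain

# A distant (quiet) talker for 50 frames, then 50 frames without near-end speech.
gain = 1.0
for rms, active in [(0.02, True)] * 50 + [(0.3, False)] * 50:
    gain = agc_step(gain, rms, active)
print(round(gain, 2))
```

In this sketch the `near_end_active` flag stands in for the combined decision of the near-end and far-end activity detectors and the echo canceller state mentioned in the text.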
`
At the system output, the result of the signal processing is transformed from the frequency
domain to the time domain by an inverse FFT in the block 307. The estimated near-end speech
signal (S) is sent through the channel to the distant speaker.
`
Figure 4 represents the block diagram of acoustic echo cancellation (AEC) 302, which contains
two main blocks: block 401, which contains 5 adaptive NLMS (Normalized
Least Mean Square) algorithms, and block 402, whose main function is double-talk detection
(DTD), that is, detection of simultaneous speech activity of the near-end speaker and the
far-end speaker.

The NLMS algorithms, NLMS1 to NLMS5, process the microphone signals x1 to x5 and deliver the
signals S_AEC1 to S_AEC5 to the blocks 303, 304 and 306 (Figure 3).
`
The function of the NLMS algorithm is to cancel the echo present in each microphone signal.
This function requires the reference signals from the loudspeakers 102 and the control signal
from the DTD detector 402. The NLMS algorithms model the transfer functions of the acoustic
paths from each loudspeaker 102 to each microphone 103: for example, NLMS1 models the transfer
function h_L1 from loudspeaker Sp-L to microphone M1 and h_R1 from loudspeaker Sp-R to
microphone M1, etc.
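
The idea of one NLMS filter modelling both loudspeaker paths to a microphone can be sketched for a single frequency bin. The block below is an illustrative two-reference complex NLMS update, with an `adapt` flag standing in for the control signal of the double-talk detector; the step size, the regularization constant, the random test references and the path values in `h_true` are assumptions for the example.

```python
import random

def nlms_bin_step(w, refs, mic, adapt=True, mu=0.5, eps=1e-8):
    """One NLMS step for one frequency bin.

    w    : current complex weights [left, right] (estimates of the two echo paths)
    refs : loudspeaker spectra [Sp-L, Sp-R] in this bin
    mic  : microphone spectrum in this bin
    Returns (echo-cancelled value, updated weights).
    """
    echo = sum(wi * ri for wi, ri in zip(w, refs))    # echo replica
    err = mic - echo                                  # microphone minus replica
    if adapt:                                         # frozen during double talk
        power = sum(abs(r) ** 2 for r in refs) + eps
        w = [wi + mu * err * r.conjugate() / power for wi, r in zip(w, refs)]
    return err, w

random.seed(2)
h_true = [0.5 + 0.2j, -0.3 + 0.1j]    # hypothetical paths Sp-L -> M1 and Sp-R -> M1
w = [0j, 0j]
for _ in range(500):
    refs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
    mic = sum(h * r for h, r in zip(h_true, refs))    # echo only, no near-end speech
    err, w = nlms_bin_step(w, refs, mic)
print(abs(err))
```

The test references here are statistically independent; real stereo signals are usually correlated, which makes the two paths harder to identify separately and is not addressed in this sketch.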
`
The loudspeaker signal passed through the NLMS filter gives a replica of the signal that reaches
the microphone over the acoustic path, and subtracting these two signals cancels the echo at
the output of the NLMS algorithm. To get maximum quality of echo reduction, similar to the
case o
