Session ThAA Noise Mitigation, Speech Enhancement II
`5/24/22, 12:40 PM
`The Wayback Machine - https://web.archive.org/web/19991021233509/http://www.wcl2.ee.upatras.gr:80/eurt…
`
`Session ThAA Noise Mitigation, Speech
`Enhancement II
`Chairperson Bayya Yegnanarayana IIT MADRAS, India
`
`NOISY SPEECH ENHANCEMENT BY FUSION OF AUDITORY
`AND VISUAL INFORMATION: A STUDY OF VOWEL
`TRANSITIONS
`
`Authors: L. Girin, G. Feng & J.L. Schwartz
`
`Institut de la Communication Parlée, UPRESA 5009 INPG/ENSERG/Université Stendhal B.P. 25, 38040
`GRENOBLE CEDEX 09, FRANCE E-mail : girin@icp.grenet.fr
`
`Volume 5 pages 2555 - 2558
`
`ABSTRACT
`
`This paper deals with a noisy speech enhancement technique based on the fusion of auditory and visual
`information. We first present the global structure of the system, and then we focus on the tool we used to fuse
`both sources of information. The whole noise reduction system is implemented in the context of vowel
`transitions corrupted with white noise. A complete evaluation of the system in this context is presented,
`including distance measures, Gaussian classification scores, and a perceptual test. The results are very promising.
`
`SPECTRAL SUBTRACTION USING A NON-CRITICALLY
`DECIMATED DISCRETE WAVELET TRANSFORM
`
`Authors: Andreas Engelsberg and Thomas Gulzow
`
`Institute for Network and System Theory, Technical Department, Kiel University, Kaiserstrasse 2, D-
`24143 Kiel / Germany, E-mail: ae@techfak.uni-kiel.de and tg@techfak.uni-kiel.de
`
`Volume 5 pages 2559 - 2562
`
`ABSTRACT
`
`The method of spectral subtraction has become very popular in speech enhancement. It is performed by
`modifying the spectral amplitudes of the disturbed signal. The spectral analysis of the signal is usually done by a
`Discrete Fourier Transformation (DFT). We propose a spectral transformation with nonuniform bandwidth to
`take into account the characteristics of the human ear. The spectral analysis and synthesis are performed by a
`non-critically decimated discrete wavelet transform. Critical subsampling is omitted to avoid errors due to
`aliasing. A significant drawback of spectral-subtraction methods is the tonal residual noise in speech pauses,
`which sounds unnatural. Applying the proposed wavelet transform reduces the residual noise and yields a
`subjectively more comfortable sound.
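
For orientation, the conventional magnitude-domain rule that this abstract starts from can be sketched as follows. This is a minimal sketch of plain spectral subtraction on per-bin magnitudes, not the authors' wavelet-based variant; the function name `spectral_subtract`, the `floor` parameter, and its default value are assumptions made for illustration.

```python
def spectral_subtract(noisy_mag, noise_mag, floor=0.05):
    """Subtract a noise magnitude estimate from each spectral bin.

    noisy_mag: magnitudes of the disturbed signal, one value per bin.
    noise_mag: noise magnitude estimate per bin (e.g. averaged over pauses).
    floor:     assumed spectral floor, as a fraction of the noisy magnitude,
               that keeps every result non-negative.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("both spectra need the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        # Plain subtraction can go negative, so clamp to the floor.
        cleaned.append(max(y - n, floor * y))
    return cleaned


# Hypothetical three-bin example; the second bin is dominated by noise.
cleaned = spectral_subtract([1.0, 0.2, 0.5], [0.1, 0.3, 0.1])
```

Bins where the noise estimate exceeds the observed magnitude are the ones that get clamped; isolated surviving bins of this kind are a common source of the tonal residual noise mentioned in the abstract.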
`
`BAYESIAN AFFINE TRANSFORMATION OF HMM
`PARAMETERS FOR INSTANTANEOUS AND SUPERVISED
`ADAPTATION IN TELEPHONE SPEECH RECOGNITION
`
`Authors: Jen-Tzung Chien (a), Hsiao-Chuan Wang (a) and Chin-Hui Lee (b)
`
`(a) Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan (b)
`Multimedia Communications Research Lab, Bell Laboratories, Murray Hill, USA
`chien@speech.ee.nthu.edu.tw hcwang@ee.nthu.edu.tw chl@research.bell-labs.com
`
`Volume 5 pages 2563 - 2566
`
`ABSTRACT
`
`This paper proposes a Bayesian affine transformation of hidden Markov model (HMM) parameters for reducing
`the acoustic mismatch problem in telephone speech recognition. Our purpose is to transform the existing HMM
`parameters into a new version matched to the specific telephone environment, using an affine function, so as to
`improve the recognition rate. Maximum a posteriori (MAP) estimation, which merges the prior statistics into the
`transformation, is applied for estimating the transformation parameters. Experiments demonstrate that the
`proposed Bayesian affine transformation is effective for instantaneous adaptation and supervised adaptation in
`telephone speech recognition. Model transformation using MAP estimation performs better than that using
`maximum-likelihood (ML) estimation.
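
The difference between ML and MAP estimation can be illustrated on a much simpler problem than the affine HMM transformation of the paper. The sketch below estimates a single scalar bias under a Gaussian prior; the function names, the known-variance assumption, and all numeric values are hypothetical and chosen only for illustration.

```python
def ml_bias(diffs):
    """Maximum-likelihood estimate: the plain sample mean."""
    return sum(diffs) / len(diffs)


def map_bias(diffs, noise_var, prior_mean, prior_var):
    """MAP estimate of a scalar bias under a Gaussian prior.

    diffs:      observed feature-minus-model differences (hypothetical data).
    noise_var:  assumed known variance of each observation.
    prior_mean, prior_var: assumed Gaussian prior on the bias.
    """
    precision = 1.0 / prior_var + len(diffs) / noise_var
    weighted = prior_mean / prior_var + sum(diffs) / noise_var
    return weighted / precision


few = [2.0]            # a single adaptation frame
many = [2.0] * 1000    # plenty of adaptation data
map_few = map_bias(few, 1.0, 0.0, 1.0)
map_many = map_bias(many, 1.0, 0.0, 1.0)
```

With a single frame the MAP estimate is pulled halfway towards the prior mean, while with many frames it approaches the ML estimate. This is the usual argument for why prior statistics matter most when adaptation data are scarce.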
`
`INTEGRATED BIAS REMOVAL TECHNIQUES FOR ROBUST
`SPEECH RECOGNITION
`
`Authors: Craig Lawrence and Mazin Rahim (1)
`
`University of Maryland, College Park, MD 20742 (1) AT&T Labs-Research, Murray Hill, NJ 07974
`
`Volume 5 pages 2567 - 2570
`
`ABSTRACT
`
`In this paper, we present a family of maximum likelihood (ML) techniques that aim at reducing an acoustic
`mismatch between the training and testing conditions of hidden Markov model (HMM)-based automatic
`speech recognition (ASR) systems. We propose a codebook-based stochastic matching (CBSM) approach for
`bias removal both at the feature level and at the model level. CBSM associates each bias with an ensemble of
`HMM mixture components that share similar acoustic characteristics. It is integrated with hierarchical signal
`bias removal (HSBR) and further extended to accommodate N-best candidates. Experimental results on
`connected digits, recorded over a cellular network, show that the proposed system reduces both the word and
`string error rates by about 36% and 31%, respectively, over a baseline system not incorporating bias removal.
`
`ACOUSTIC FRONT ENDS FOR SPEAKER-INDEPENDENT DIGIT
`RECOGNITION IN CAR ENVIRONMENTS
`
`Authors: D. Langmann, A. Fischer, F. Wuppermann, R. Haeb-Umbach, T. Eisele
`
`Philips GmbH Forschungslaboratorien Aachen P.O. Box 50 01 45 D-52085 Aachen Germany Email:
`{langmann,afischer,wupper,haeb,eisele}@pfa.research.philips.com
`
`Volume 5 pages 2571 - 2574
`
`ABSTRACT
`
`This paper describes speaker-independent speech recognition experiments concerning acoustic front end
`processing on a speech database that was recorded in 3 different cars. We investigate different feature analysis
`approaches (mel-filter bank, mel-cepstrum, perceptually linear predictive coding) and present results with noise
`compensation techniques based on spectral subtraction. Although the methods employed lead to a considerable
`reduction in error rate, the error analysis shows that low signal-to-noise ratios are still a problem.
`
`SIGNAL BIAS REMOVAL USING THE MULTI-PATH
`STOCHASTIC EQUALIZATION TECHNIQUE
`
`Authors: Lionel Delphin-Poulat and Chafic Mokbel
`
`FT.CNET/DIH/RCP 2 av. Pierre Marzin, 22307 Lannion cedex, France. Tel. +33 2 96 05 13 47 FAX: +33 2
`96 05 35 30 e-mail : delphinp@lannion.cnet.fr
`
`Volume 5 pages 2575 - 2578
`
`ABSTRACT
`
`We propose using Hidden Markov Models (HMMs) associated with the cepstrum coefficients as a speech signal
`model in order to perform equalization or noise removal. The MUlti-path Stochastic Equalization (MUSE)
`framework allows one to process data at the frame level: it is an on-line adaptation of the model. More precisely,
`we apply this technique to perform bias removal in the cepstral domain in order to increase the robustness of
`automatic speech recognizers. Recognition experiments on two databases recorded on both PSN and GSM
`networks show the efficiency of the proposed method.
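
The reason a channel mismatch appears as a bias in the cepstral domain can be written out in a few lines. This is the standard textbook argument, not a derivation taken from the paper; the symbols $x$ (clean speech), $h$ (channel impulse response), $y$ (observed signal) and $c$ (cepstrum) are notation assumed here, and the relation holds only approximately when the channel response is short compared with the analysis frame.

```latex
\begin{align}
y(t) &= x(t) * h(t) \\
|Y(\omega)| &= |X(\omega)|\,|H(\omega)| \\
\log|Y(\omega)| &= \log|X(\omega)| + \log|H(\omega)| \\
c_y &= c_x + c_h
\end{align}
```

Removing an estimate of $c_h$ from every frame is therefore an equalization step, which is the sense in which bias removal is used above.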
`
`SUBBAND ECHO CANCELLATION IN AUTOMATIC SPEECH
`DIALOG SYSTEMS
`
`Authors: Andrej Miksic and Bogomir Horvat
`
`Laboratory for Digital Signal Processing Faculty of Electrical Engineering and Computer Science
`University of Maribor, Smetanova 17, 2000 Maribor, Slovenia Tel. +386 62 221112, E-mail:
`andrej.miksic@uni-mb.si
`
`Volume 5 pages 2579 - 2582
`
`ABSTRACT
`
`Echo cancellation has been most widely studied for hands-free telephony and for cancelling line echoes in
`telephone central offices. The problem of echo cancelling in speech dialog systems is similar; however, it has
`some specific requirements. In this contribution, a subband echo cancellation structure is proposed which can be
`integrated in the feature extraction part of a recognizer. An NLMS gradient-based adaptation is performed in
`frequency subbands that can either be derived directly from FFT analysis of the input speech signal, or by using a
`proposed reduced-subband approach where the number of subbands is reduced in order to lessen the aliasing
`effect of the FFT. A double-talk detector is proposed based on the estimated error function for decision on
`stopping the adaptation. Finally, a new approach of combining echo cancellation and noise reduction is
`proposed.
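
The NLMS update at the core of such a canceller can be sketched in a few lines. The example below is a fullband, time-domain simplification rather than the subband structure proposed in the paper, and it omits the double-talk detector; the function name, the step size `mu`, the regularizer `eps`, and the synthetic echo path are all assumptions made for illustration.

```python
import random


def nlms_echo_canceller(far_end, microphone, taps=4, mu=0.5, eps=1e-8):
    """Adapt an FIR echo-path estimate with the NLMS rule.

    Returns the final weights and the echo-cancelled (error) signal.
    """
    weights = [0.0] * taps
    history = [0.0] * taps          # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, microphone):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate
        # Normalizing by the input power makes the step size scale-free.
        power = sum(h * h for h in history) + eps
        step = mu * error / power
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(error)
    return weights, residual


# Synthetic check with a hypothetical echo path and no near-end speech.
rng = random.Random(0)
echo_path = [0.5, -0.3, 0.1, 0.0]
far_end = [rng.gauss(0.0, 1.0) for _ in range(2000)]
microphone = []
for n in range(len(far_end)):
    microphone.append(sum(echo_path[k] * far_end[n - k]
                          for k in range(len(echo_path)) if n - k >= 0))
weights, residual = nlms_echo_canceller(far_end, microphone)
```

In this noise-free setting the weights converge to the simulated echo path. With near-end speech present the same update would diverge, which is why adaptation is normally frozen during double talk.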
`
`Speech Enhancement via Energy Separation
`
`Authors: Hesham Tolba and Douglas O'Shaughnessy
`
`Institut National de la Recherche Scientifique, INRS-Telecommunications, Quebec, Canada. E-mail:
`tolba@inrs-telecom.uquebec.ca and dougo@inrs-telecom.uquebec.ca.
`
`Volume 5 pages 2583 - 2586
`
`ABSTRACT
`
`This work presents a novel technique to enhance speech signals in the presence of interfering noise. In this
`paper, the amplitude and frequency modulation (AM-FM) model [7] and a multi-band analysis scheme [5] are
`applied to extract the speech signal parameters. The enhancement process is performed using a time-warping
`function B(n) that is used to warp the speech signal. B(n) is extracted from the speech signal using the Smoothed
`Energy Operator Separation Algorithm (SEOSA) [4]. This warping is capable of increasing the SNR of the high
`frequency harmonics of a voiced signal by forcing the quasiperiodic nature of the voiced component to be
`more periodic, and consequently is useful for extracting more robust parameters of the signal in the presence of
`noise.
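
Energy separation algorithms are generally built on the discrete Teager-Kaiser energy operator, which can be sketched as below. The sketch shows only that operator on a synthetic tone; the smoothing used in SEOSA and the construction of the warping function B(n) are not reproduced, and the amplitude and frequency values are illustrative assumptions.

```python
import math


def teager_energy(signal):
    """Discrete Teager-Kaiser operator: x[n]^2 - x[n-1] * x[n+1]."""
    return [signal[n] ** 2 - signal[n - 1] * signal[n + 1]
            for n in range(1, len(signal) - 1)]


amplitude, omega = 2.0, 0.3   # illustrative values; omega in radians per sample
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)

# For a pure tone the operator output is constant: A^2 * sin^2(omega).
expected = (amplitude * math.sin(omega)) ** 2
```

Because the output depends jointly on amplitude and frequency, applying the operator to a signal and to its difference signal lets the two be separated, which is the idea behind energy separation.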
`
`A Method of Signal Extraction from Noisy Signal
`
`Authors: Masashi UNOKI and Masato AKAGI
`
`unoki@jaist.ac.jp akagi@jaist.ac.jp School of Information Science, Japan Advanced Institute of Science
`and Technology 1-1 Asahidai, Tatsunokuchi, Ishikawa 923-12, Japan
`
`Volume 5 pages 2587 - 2590
`
`ABSTRACT
`
`This paper presents a method of extracting the desired signal from a noise-added signal as a model of acoustic
`source segregation. Using physical constraints related to the four regularities proposed by Bregman, the
`proposed method can solve the problem of segregating two acoustic sources. Two simulations were carried out
`using the following signals: (a) a noise-added AM complex tone and (b) a noisy synthetic vowel. It was shown
`that the proposed method can extract the desired AM complex tone from a noise-added AM complex tone in
`which signal and noise exist in the same frequency region. The SD was reduced by an average of about 20 dB. It
`was also shown that the proposed method can extract a speech signal from noisy speech.
`
`MULTI-CHANNEL NOISE REDUCTION USING WAVELET
`FILTER BANK
`
`Authors: SIKA Jiri - DAVIDEK Vratislav
`
`Faculty of Electrical Engineering Czech Technical University Prague, Czech Republic. Tel. +420 2
`24352291, FAX: +420 2 24310784 , E-mail: sika@feld.cvut.cz
`
`Volume 5 pages 2591 - 2594
`
`ABSTRACT
`
`This paper deals with the problem of estimation of a speech signal corrupted by an additive noise when
`observations from two microphones are available. The basic method for noise reduction using the coherence
`function is modified by using wavelets. Both observations are split by a filter bank into five narrow bands
`covering the whole bandwidth used (0 to 4 kHz). The coherence functions are then computed for each band and
`the output speech estimate is reconstructed.
`
`SPEECH SIGNAL DETECTION IN NOISY ENVIRONMENT
`USING A LOCAL ENTROPIC CRITERION
`
`Authors: I. Abdallah, S. Montrésor and M. Baudry
`
`Laboratoire d'Informatique de l'Université du Maine Email : imad@lium.univ-lemans.fr
`
`Volume 5 pages 2595 - 2598
`
`ABSTRACT
`
`This paper describes an original method for speech/non-speech detection in adverse conditions. Firstly, we
`define a time-dependent function called Local Entropic Criterion [1] based on Shannon's entropy [2]. Then we
`present the detection algorithm and show that, at signal-to-noise ratios (SNR) above 5 dB, it offers a
`segmentation comparable to the one obtained in clean conditions. Finally, we describe how, at very low SNR
`(< 0 dB), it permits the detection of speech units masked by noise.
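
The abstract does not define the Local Entropic Criterion itself, so the sketch below shows only its stated building block: the Shannon entropy of one analysis window, here computed from an amplitude histogram. The histogram approach, the function name, the bin count and the amplitude range are all assumptions, and the result should not be read as the authors' criterion.

```python
import math


def window_entropy(samples, bins=8, lo=-1.0, hi=1.0):
    """Shannon entropy (in bits) of the amplitude histogram of one window."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        index = int((s - lo) / width)
        index = min(max(index, 0), bins - 1)   # clamp out-of-range samples
        counts[index] += 1
    total = len(samples)
    entropy = 0.0
    for c in counts:
        if c:
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


flat = window_entropy([0.0] * 10)                        # one occupied bin
spread = window_entropy([-0.9, -0.4, 0.1, 0.6], bins=4)  # four equally likely bins
```

Evaluating such a measure on a sliding window gives a time-dependent function whose changes can be thresholded to segment a recording.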
`
`A New Algorithm for Robust Speech Recognition: The Delta Vector
`Taylor Series Approach
`
`Authors: Pedro J. Moreno and Brian Eberman
`
`email: pjm@crl.dec.com, bse@crl.dec.com Digital Equipment Corporation Cambridge Research
`Laboratory
`
`Volume 5 pages 2599 - 2602
`
`ABSTRACT
`
`In this paper we present a new model-based compensation technique called Delta Vector Taylor Series (DVTS).
`This new technique is an extension and improvement over the Vector Taylor Series (VTS) approach [7] that
`addresses several of its limitations. In particular, we present a new statistical representation for the distribution
`of clean speech feature vectors based on a weighted vector codebook. This change to the underlying probability
`density function (PDF) allows us to produce more accurate and stable solutions for our algorithm. The algorithm
`is also presented in an EM-MAP framework where some of the environmental parameters are treated as random
`variables with known PDFs. Finally, we explore a new compensation approach based on the use of convex
`hulls. We evaluate our algorithm in a phonetic classification task on the TIMIT [5] database and also on a
`small-vocabulary speech recognition database. In both databases, artificial and natural noise is injected at several
`signal-to-noise ratios (SNRs). The algorithm achieves matched performance at all SNRs above 10 dB.
`
`ROBUST ENHANCEMENT OF REVERBERANT SPEECH USING
`ITERATIVE NOISE REMOVAL
`
`Authors: David Cole (d.cole@qut.edu.au) Miles Moody (m.moody@qut.edu.au) Sridha
`Sridharan (s.sridharan@qut.edu.au)
`
`Speech Research Lab, Signal Processing Research Centre School of Electrical and Electronic Systems
`Engineering Queensland University of Technology GPO Box 2434 Brisbane, Australia
`
`Volume 5 pages 2603 - 2606
`
`ABSTRACT
`
`We suggest a new technique for the enhancement of single channel reverberant speech. Previous methods have
`used either waveform deconvolution or modulation envelope deconvolution. Waveform deconvolution requires
`calculation of an inverse room response, and is impractical due to variation with source or receiver movement.
`Modulation envelope deconvolution has been claimed to be position independent, but our research indicates that
`envelope restoration in fact degrades intelligibility of the speech. Our method uses the observation that the
`smoothed segmental spectral magnitude of the room response is less variable with position. This is used to
`estimate the reverberant component of the signal, which is removed iteratively using conventional noise
`reduction algorithms. The enhanced output is not perceptibly affected by positional changes.
`
`A NETWORK SPEECH ECHO CANCELLER WITH COMFORT
`NOISE
`
`Authors: D.J. Jones*, S.D. Watson*, K.G. Evans*, B.M.G. Cheetham* and R.A. Reeves#.
`
`*Department of Electrical Engineering, The University of Liverpool, Liverpool, L69 3BX, UK. #BT
`Laboratories, Martlesham Heath, Ipswich, IP5 3RE. Tel: +44 (0)151 708-7724 E-mail: davej@liv.ac.uk
`
`Volume 5 pages 2607 - 2610
`
`ABSTRACT
`
`This paper describes a proposed comfort noise system for a network echo canceller. In this system, any residual
`echo is suppressed using a single threshold centre-clipper, but instead of transmitting silence to the far-end of the
`network, a synthetic version of the background sounds is sent. This masks any 'noise modulation' or 'noise
`pumping' that may otherwise occur. The background sounds are characterised using linear prediction. Periods
`when only background sounds are present are identified by a modified GSM Voice Activity Detector (VAD).
`Informal listening tests have shown that this 'synthetic background' is preferable to the transmission of silence or
`pseudo-random noise that is not spectrally shaped to match the original background.
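
The substitution step described here can be sketched as a centre-clipper whose clipped samples are filled with comfort noise. In this sketch white noise stands in for the LPC-shaped synthetic background, and the VAD is left out entirely; the function name, threshold and sample values are hypothetical.

```python
import random


def suppress_residual_echo(residual, threshold, comfort_noise):
    """Centre-clip low-level residual echo and substitute comfort noise.

    Samples whose magnitude is below `threshold` are treated as residual
    echo and replaced by the matching comfort-noise sample; others pass.
    """
    output = []
    for sample, noise in zip(residual, comfort_noise):
        output.append(noise if abs(sample) < threshold else sample)
    return output


rng = random.Random(1)
residual = [0.002, -0.5, 0.001, 0.3]
# White noise is a placeholder for the spectrally shaped background.
comfort = [rng.gauss(0.0, 0.01) for _ in residual]
cleaned = suppress_residual_echo(residual, threshold=0.01, comfort_noise=comfort)
```

Sending noise instead of zeros is what avoids the audible switching between background and silence that the abstract calls noise pumping.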
`
`A NEW METRIC FOR SELECTING SUB-BAND PROCESSING IN
`ADAPTIVE SPEECH ENHANCEMENT SYSTEMS
`
`Authors: Amir Hussain, Douglas R. Campbell and Thomas J. Moir
`
`Department of Electronic Engineering and Physics, University of Paisley, High St., Paisley PA1 2BE,
`Scotland U.K. Corresponding author's email: huss_ee0@paisley.ac.uk
`
`Volume 5 pages 2611 - 2614
`
`ABSTRACT
`
`A multi-microphone adaptive speech enhancement system employing diverse sub-band processing is presented.
`A new robust metric is developed, which is capable of real-time implementation, in order to automatically select
`the best form of processing within each sub-band. It is based on an adaptively estimated inter-channel
`Magnitude Squared Coherence (MSC) relationship, which is used to detect the level of correlation between in-
`band signals from multiple sensors during noise-alone periods in intermittent speech. This paper reports recent
`results of comparative experiments with simulated anechoic data extended to include simulated reverberant data.
`The results demonstrate that the method is capable of significantly outperforming conventional noise
`cancellation schemes.
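
The Magnitude Squared Coherence between two channels can be estimated by averaging cross- and auto-spectra over frames, as sketched below. This uses simple block averaging rather than the adaptive estimate described in the abstract, and a plain DFT to stay dependency-free; the frame length, frame count and function names are assumptions.

```python
import cmath
import math
import random


def dft(frame):
    """Plain O(N^2) DFT, adequate for short illustrative frames."""
    size = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                for n in range(size))
            for k in range(size)]


def magnitude_squared_coherence(frames_a, frames_b, eps=1e-12):
    """Estimate MSC per frequency bin by averaging spectra over frames."""
    bins = len(frames_a[0])
    cross = [0j] * bins
    power_a = [0.0] * bins
    power_b = [0.0] * bins
    for frame_a, frame_b in zip(frames_a, frames_b):
        spec_a, spec_b = dft(frame_a), dft(frame_b)
        for k in range(bins):
            cross[k] += spec_a[k] * spec_b[k].conjugate()
            power_a[k] += abs(spec_a[k]) ** 2
            power_b[k] += abs(spec_b[k]) ** 2
    return [abs(cross[k]) ** 2 / (power_a[k] * power_b[k] + eps)
            for k in range(bins)]


rng = random.Random(0)
noise_1 = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(32)]
noise_2 = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(32)]
same = magnitude_squared_coherence(noise_1, noise_1)        # fully correlated
different = magnitude_squared_coherence(noise_1, noise_2)   # independent noise
```

Values near one indicate correlated noise in a band and values near zero indicate uncorrelated noise, which is the kind of distinction a per-band selection metric can act on.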
`
`ESTIMATION OF LPC CEPSTRUM VECTOR OF SPEECH
`CONTAMINATED BY ADDITIVE NOISE AND ITS APPLICATION
`TO SPEECH ENHANCEMENT
`
`Authors: Hidefumi KOBATAKE and Hideta SUZUKI
`
`Graduate School of Bio-Applications and Systems Engineering Tokyo University of Agriculture and
`Technology Koganei, Tokyo 184, JAPAN Tel. +81 423 88 7147, FAX: +81 423 85 5395, E-mail:
`kobatake@cc.tuat.ac.jp
`
`Volume 5 pages 2615 - 2618
`
`ABSTRACT
`
`This paper presents a new method for speech enhancement. It is well known that Wiener filtering is effective in
`reducing additive noise, and the proposed method is based on it. This paper focuses on the design of the Wiener
`filter, with emphasis on the recovery of the original formant characteristics and the smooth transition of the
`speech spectrum. A method is given for transforming the LPC cepstrum vector extracted from noisy speech so as
`to reduce noise effects, which yields an estimate of the LPC cepstrum vector of the original speech. Sharpening
`formant peaks and eliminating false spectral peaks are necessary for high-quality speech restoration, and both are
`realized by the proposed method. Noise reduction experiments show the effectiveness of the proposed method.
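
For reference, the textbook non-causal Wiener gain for additive noise uncorrelated with the speech is given below. The notation is assumed here rather than taken from the paper: $P_s$ and $P_n$ are the speech and noise power spectra, $Y$ is the noisy spectrum and $\hat{S}$ the enhanced spectrum.

```latex
H(\omega) = \frac{P_s(\omega)}{P_s(\omega) + P_n(\omega)},
\qquad
\hat{S}(\omega) = H(\omega)\, Y(\omega)
```

Since $P_s$ is unknown in practice, the quality of the filter rests on how well the clean speech spectrum is estimated, which is presumably where the estimated LPC cepstrum vector enters.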
`
`MULTI-BAND AND ADAPTATION APPROACHES TO ROBUST
`SPEECH RECOGNITION
`
`Authors: Sangita Tibrewala (1) and Hynek Hermansky (1),(2)
`
`(1) Oregon Graduate Institute of Science and Technology, Portland, Oregon, USA. (2) International
`Computer Science Institute, Berkeley, California, USA. Email: {sangita,hynek}@ee.ogi.edu
`
`Volume 5 pages 2619 - 2622
`
`ABSTRACT
`
`In this paper we present two approaches to deal with degradation of automatic speech recognizers due to
`acoustic mismatch between training and testing environments. The first approach is based on the multi-band approach
`to automatic speech recognition (ASR). This approach is shown to be inherently robust to frequency selective
`degradation. In the second approach, we present a conceptually simple unsupervised feature adaptation
`technique, based on recursive estimation of means and variances of the cepstral parameters to compensate for
`the noise effects. Both techniques yield significant reduction in error rates.
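
Recursive estimation of cepstral means and variances can be sketched with exponentially weighted running statistics. The update rule, the forgetting factor `alpha`, the initialization and the function name below are assumptions for illustration and are not taken from the paper.

```python
import math


def recursive_normalize(frames, alpha=0.95, eps=1e-8):
    """Normalize each cepstral coefficient with a running mean and variance.

    frames: list of equal-length feature vectors, one per frame.
    alpha:  assumed forgetting factor of the exponential averages.
    """
    dim = len(frames[0])
    mean = list(frames[0])          # initialized from the first frame
    var = [1.0] * dim               # assumed unit starting variance
    normalized = []
    for frame in frames:
        out = []
        for i, c in enumerate(frame):
            mean[i] = alpha * mean[i] + (1.0 - alpha) * c
            var[i] = alpha * var[i] + (1.0 - alpha) * (c - mean[i]) ** 2
            out.append((c - mean[i]) / math.sqrt(var[i] + eps))
        normalized.append(out)
    return normalized


# The same hypothetical feature track with and without a constant offset.
clean = [[math.sin(0.3 * t)] for t in range(50)]
shifted = [[frame[0] + 4.0] for frame in clean]
norm_clean = recursive_normalize(clean)
norm_shifted = recursive_normalize(shifted)
```

The two normalized tracks coincide up to rounding, showing that a constant cepstral offset, such as one introduced by a fixed channel, is removed without any supervision.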
`
`NON-QUADRATIC CRITERION ALGORITHMS FOR SPEECH
`ENHANCEMENT
`
`Authors: Enrique Masgrau, Eduardo Lleida, Luis Vicente
`
`Communication Technologies Group (GTC). Department of Electronic Engineering & Communications
`Centro Politecnico Superior. C/Maria de Luna 3, 50015-Zaragoza. Spain Universidad de Zaragoza Tel:
`+34-976-761930, FAX: +34-976-762111, E-mail: masgrau@posta.unizar.es
`
`Volume 5 pages 2623 - 2626
`
`ABSTRACT
`
`A new algorithm for speech enhancement based on the iterative Wiener filtering method due to Lim-Oppenheim
`[1] is presented. We propose the use of a generalized non-quadratic cost function in addition to the classical
`MSE term (quadratic term). The proposed cost function includes two signal-error cross-correlation terms and an
`L2 norm term of the filter weights. The signal-error cross-correlation terms reduce both the residual noise and
`the signal distortion in the enhanced speech. The L2 norm term of the filter weights reduces the overall gain of
`the filter, decreasing the weight noise variance and removing the side lobe of the filter response. Two solutions
`to the new cost function are presented: the classical non-causal type (ideal Wiener), working in the frequency
`domain; and a causal, finite-length type, working in the time domain. In both cases, as in Lim's algorithm, the
`filter output of each iteration is used as the "noiseless" speech signal for the following one. Simulation results demonstrate the
`effectiveness of these algorithms.
`
`https://web.archive.org/web/19991021233509/http://www.wcl2.ee.upatras.gr:80/eurthaa.html
`