(10) Patent No.: US 6,415,034 B1
(45) Date of Patent: *Jul. 2, 2002
Hietanen

(54) EARPHONE UNIT AND A TERMINAL DEVICE

(75) Inventor: Jarmo Hietanen, Tampere (FI)

(73) Assignee: Nokia Mobile Phones Ltd., Espoo (FI)

(*) Notice: This patent issued on a continued prosecution
    application filed under 37 CFR 1.53(d), and is subject to the
    twenty year patent term provisions of 35 U.S.C. 154(a)(2).

    Subject to any disclaimer, the term of this patent is extended
    or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 08/906,371

(22) Filed: Aug. 4, 1997

(30) Foreign Application Priority Data

     Aug. 13, 1996 (FI) .......... 963173

(51) Int. Cl.7 .......... A61F 11/06; G10K 11/16; H04R 25/00
(52) U.S. Cl. .......... 381/71.6; 381/326; 381/151
(58) Field of Search .......... 381/370, 375, 381/380, 71.6, 23.1,
     60, 317, 318, 320, 326, 71.1, 72, 74, 92, 94.1, 151, 66, 372;
     379/430; 181/127, 128

(56) References Cited

U.S. PATENT DOCUMENTS

3,995,113 A      11/1976   Tami ................. 179/1 P
4,975,967 A *    12/1990   Rasmussen ............ 381/380
5,099,519 A       3/1992   Guan ................. 381/183
5,285,165 A       2/1994   Renfors et al. ....... 328/167
5,298,692 A *     3/1994   Ikeda et al. ......... 381/380
5,313,661 A       5/1994   Malmi et al. ......... 455/232.1
5,343,523 A       8/1994   Bartlett et al. ...... 379/430
5,406,635 A       4/1995   Jarvinen ............. 381/94
5,426,719 A       6/1995   Franks et al. ........ 395/2.37
5,692,059 A *    11/1997   Kruger ............... 381/151
5,732,143 A *     3/1998   Andrea et al. ........ 381/71.6
5,748,725 A *     5/1998   Kubo ................. 381/71.1
5,790,684 A *     8/1998   Niino et al. ......... 381/326
5,909,498 A *     6/1999   Smith ................ 381/380
5,933,506 A *     8/1999   Aoki et al. .......... 381/151

FOREIGN PATENT DOCUMENTS

EP    0637187 A1     2/1995
GB    2226931        7/1990
GB    2281004        2/1995
WO    WO 94/06255    3/1994

OTHER PUBLICATIONS

Journal of Sound And Vibration, 1994, vol. 174, pp. 617-639,
“Simultaneous Piezoelectric Sensing/Actuation: Analysis And
Application to Controlled Structures”, Anderson et al.
Advanced Engineering Mathematics, sixth edition, pp. 271 & 272,
“Convolution. Integral Equations”, Erwin Kreyszig.

* cited by examiner
`
`Primary Examiner—Xu Mei
`(74) Attorney, Agent, or Firm—Perman & Green, LLP
`
(57) ABSTRACT
`
`The scope of the present invention is an earphone unit (11)
`to be mounted either on external ear (18) or in auditory tube
`(10), in which unit both a speech registering microphone
`(13) and a speech reproducing ear capsule (12) have been
`placed. The earphone unit (11) is suitable for use in con-
`nection with various terminal devices, in particular with
`mobile stations. When a user’s speech is registered, the ear
`capsule signal (12') containing disturbances is canceled
`utilizing methods based upon determining the transfer func-
`tion between the ear capsule (12) and the microphone (13).
`A separate error microphone (14) is used for eliminating
`external sources of disturbances (17), such as noise. In order
`to improve the quality of speech and prevent problems
`caused by double-talk, signals (15', 12', 17') are processed
`digitally utilizing e.g. band limitation and prediction of
`missing bands.
`
`13 Claims, 7 Drawing Sheets
`
`
`
`
`
`
`
`APPLE ET AL. 1004
`
`
`
`
[Drawing Sheets 1 to 7 (U.S. Patent, Jul. 2, 2002, US 6,415,034 B1): figures not reproduced. Labels surviving extraction: a frequency axis "f (kHz)" marked at 3.4; FIG. 11B (reference 106) and FIG. 12 on Sheet 6; FIG. 14 with the block INDICATOR "DOUBLE TALK" on Sheet 7.]
`
EARPHONE UNIT AND A TERMINAL DEVICE
`
`FIELD OF THE INVENTION
`
The present invention relates to an earphone unit mounted
in the auditory tube (also called auditory canal) or on the ear,
which unit comprises voice reproduction means for converting
an electric signal into an acoustic sound signal and for
forwarding the sound signal into the user’s ear, and speech
detection means for detecting the speech of the user of the
earphone unit from the same auditory tube. The earphone unit
is suitable for use in connection with a terminal device,
especially in connection with a mobile station. In addition,
the invention relates to a terminal device incorporating or
having a separate earphone unit and to a method of
reproduction and detection of sound.

BACKGROUND OF THE INVENTION
`
Traditional headsets equipped with a microphone have an
earpiece either for both ears or for one ear only, from which
earpiece a separate microphone bar generally protrudes,
extending to the mouth or to the side of the mouth. The earpiece
is of a type to be mounted either on the ear or in the auditory
tube. The microphone used is air connected, either a pressure
or a pressure gradient microphone. The required amplifiers
and other electronics are typically placed in a separate
device. If a wireless system is concerned, it is possible to
place some of the required electronics in connection with the
earpiece device, and the rest in a separate transceiver unit. It
is also possible to integrate the transceiver unit in the
earpiece device.

Patent publication U.S. Pat. No. 5,343,523 describes an
earphone solution designed for pilots and telephone
operators, in which earpieces are mounted on the ears and a
separate microphone suspended from a bar is mounted in
front of the mouth. In addition, a separate error microphone
has been arranged in connection with the earpieces; using it,
some of the environmental noise detected by the user can be
cancelled and the intelligibility of speech thereby improved.

Alternative solutions have been developed for occasions
in which a separate microphone suspended from a bar
cannot be used. Detection of speech through soft tissue is
known e.g. from throat microphones used in tank headgear.
On the other hand, detection of speech through the auditory
tube has been presented in patent publication U.S. Pat.
No. 5,099,519. According to said publication, the advantages
of speech detection through the auditory tube are the small
size of the earpiece and the suitability of the device for noisy
environments. A microphone closing the auditory tube also
acts as an elementary hearing protector.

Patent publication U.S. Pat. No. 5,426,719 presents a
device which acts as a combined hearing protector and
means of communication. In said patent publication, as
well as in the above mentioned U.S. Pat. No. 5,099,519,
the microphone is placed in one earpiece and the ear capsule
in the other earpiece. This means that a device according to
either of the two publications requires using both ears, which
makes the device bulky and limits its field of use.
Patent publication WO 94/06255 presents an ear microphone
unit for placement in one ear only. The unit is mounted in a
holder for placement in the outer ear. For use in full duplex
ear communication the holder further has a sound generator.
Between the sound generator and the microphone is mounted
a vibration absorbing unit. Also the sound generator is
embedded in a thin layer of attenuation foam.

Another device for two-way acoustic communication
through one ear is described in patent publication U.S. Pat.
No. 3,995,113. This device is based on an electro-acoustic
mutual transducing device which is adapted to be inserted into
the auditory canal and which can function both as a speaker
and as a microphone. It forms an ear-plug type
transmitting-receiving device. The device additionally includes
means for reducing the mechanical impedance of the vibrating
system and means for eliminating the noise resulting from said
impedance reducing means.
`
SUMMARY OF THE INVENTION
`
Now an improved earphone unit has been invented, which
unit facilitates placing a microphone and an ear capsule
in the same auditory tube or on the same ear and which has
means for eliminating sounds produced into the auditory
tube by the ear capsule from sounds detected by the
microphone. This improves the detection of the user’s speech,
which is registered via the auditory tube, especially when the
user speaks at the same time as sound is reproduced by the
ear capsule. In telephones, such as mobile phones, this is
needed especially in double talk situations, i.e. when both
the near end and far end speakers speak simultaneously. It is
also possible to install in the earphone unit a separate error
microphone for elimination of external disturbances. For the
microphones and ear capsules it is possible to use any means
of conversion known to a person skilled in the art that
convert acoustic energy into electric form (microphone), and
electric energy into acoustic form (ear capsule,
loudspeaker). The invention presents a new solution for
determining the acoustic coupling of a microphone and a
loudspeaker and for optimizing voice quality using digital
signal processing.

The earphone unit according to the invention is suitable
for use in occasions in which environmental noise prevents
the use of a conventional microphone placed in front of the
mouth. Likewise, the small size of the earphone unit
according to the invention enables using the device in
occasions in which small size is an advantage e.g. due to
inconspicuousness. In this way the earphone unit according
to the invention is particularly suitable for use e.g. in
connection with a mobile station or a radio telephone while
moving in public places. The use of the earphone unit is not
limited to wireless mobile stations; it is equally possible
to use the earphone unit in connection with other
terminal devices. One preferable field of use is to connect
the earphone unit to a traditional telephone or other
wire-connected telecommunication terminal device. It is equally
possible to use the earphone unit according to the invention
in connection with various interactive computer programs,
radio tape recorders and dictating machines. It is also
possible to integrate the earphone unit as a part of a terminal
device as presented in the embodiments below.
When an attempt is made to detect from the auditory tube
speech of very low sound pressure level while sound is
simultaneously fed at a relatively high sound pressure level
into the same ear using the ear capsule, problems arise if
analogue summing units and amplifiers equipped with fixed
adjustments are used. In this system the auditory tube is an
important acoustic component, because it has an effect upon
both the user’s speech and the sound produced by the ear
capsule. Because the auditory tube of each person is unique,
the transfer function between the microphone and the ear
capsule is individual. In addition to this the transfer function
is different each time the earphone unit is set into place,
because the ear capsule may be set e.g. at a different depth.
If the setting of the earphone unit is not completely
successful, the acoustic leakage of the ear capsule may be
beyond control, which can disturb the operation of the
device. An acoustic leakage means e.g. a situation in which
environmental noise leaks past an ear capsule placed in the
auditory tube into the auditory tube. If an earphone unit
according to the invention consisting of a microphone and
an ear capsule is placed in a separate device outside the
auditory tube, it is particularly important to have the acoustic
leakage under control.

In order to separate the sound components produced by
various sources of noise, which are disturbing and unnecessary
from the point of view of the intelligibility and clearness of
the user’s speech, and to remove them from the signal detected
by the microphone in such a way that essentially just the user’s
voice remains, the transfer functions between the various
components of the system must be known. Because the
transfer function between the microphone capsule and the
ear capsule is not constant, the transfer function must be
monitored. Monitoring of the transfer function can be carried
out e.g. through measurements based on noise. In order
to improve voice quality and the intelligibility of speech, it
is possible to divide the detection and reproduction of
speech into various frequency bands which are processed
digitally.
`It is characteristic of the ear-connectable earphone unit
`and the terminal device arrangement according to the inven-
`tion that it comprises means for eliminating sounds pro-
`duced into the auditory tube by said sound reproduction
`means from sounds detected by said speech detection
`means.
`
`It is characteristic of the terminal device according to the
`invention that said sound reproduction means and said
`speech detection means have been arranged in the terminal
`device close to each other in a manner for connecting both
`simultaneously to one and the same ear of a user, and the
`terminal device further comprising means for eliminating
`sounds produced into the auditory tube by said sound
`reproduction means from sounds detected by said speech
`detection means.
`
It is characteristic of the method according to the invention
that disturbance caused in the ear by the first sound
signal is subtracted from said second sound signal.

BRIEF DESCRIPTION OF THE DRAWINGS
`
`The invention is described in detail in the following with
`reference to enclosed figures, of which
`FIG. 1 presents both the components of the earphone unit
`according to the invention and its location in the auditory
`tube,
`FIGS. 2A and 2B present various ways of placing, in
`relation to each other, the microphones and the ear capsule
`used in the earphone unit according to the invention,
`FIG. 2C presents the realization of the earphone unit
`according to the invention utilizing a dynamic ear capsule,
FIG. 3 presents as a block diagram separating the sounds
produced by the ear capsule and sounds produced by external
noise from a detected microphone signal,
FIG. 4 presents as a block diagram the components and
connections of an earphone unit according to the invention,
FIG. 5 presents the digital shift register equipped with
feed-back used for forming an MLS-signal,
FIG. 6 presents as a block diagram determining the
transfer function between a microphone and an ear capsule,
FIG. 7 presents the band limiting frequencies used in an
embodiment according to the invention,
FIG. 8 presents a microphone signal detected in the auditory
tube, in the frequency domain,
FIG. 9 presents a band-limited microphone signal detected
in the auditory tube, in the frequency domain,
FIG. 10 presents a band-limited microphone signal detected
in the auditory tube, in the frequency domain, in which the
missing frequency bands have been predicted,
FIGS. 11A and 11B present a mobile station according to
the invention,
FIGS. 12 and 13 present mobile station arrangements
according to the invention, and
FIG. 14 presents the blocks of digital signal processing
carried out in the earphone unit according to the invention.

DETAILED DESCRIPTION
`
In the following the invention is explained based upon an
embodiment. FIG. 1 presents earphone unit 11 according to
the invention, which makes it possible to place microphone
capsule 13 and ear capsule 12 in the same auditory tube 10.
Error microphone 14 is located on the outer surface of
earphone unit 11. Earphone unit 11 has been given such a
form that intrusion of external noise 17' into auditory tube 10
is prevented as efficiently as possible. External noise
17' consists of e.g. noise produced by working machinery
and speech of persons nearby. The source of noise is
represented in FIG. 1 by block 17, and the sound advancing
from source of noise 17 directly to error microphone 14 is
presented with reference 17". The advantage of earphone
unit 11 is its small size and its suitability for noisy
environments.

Microphone capsule 13 and ear capsule 12 can be physically
located in relation to each other in a number of ways.
FIGS. 2A and 2B present alternative placings of microphone
capsule 13, error microphone 14 and ear capsule 12, and
FIG. 2C presents the use of dynamic ear capsule 150 as
both microphone capsule 13 and ear capsule 12. In FIG. 2A
microphone capsule 13 has as an example been placed in
front of ear capsule 12 close to acoustic axis 142. It is
possible to integrate microphone capsule 13 in the body of
ear capsule 12, or it can be mounted using supports 141.
Arrow 12' presents sound emitted by ear capsule 12.

FIG. 2B presents a solution in which ear capsule 12 has
been installed in the end of earphone unit 11 facing auditory
tube 10. Ear capsule 12 is integrated in the body of
earphone unit 11 e.g. using supports 144. Slots or apertures
145 have been arranged between the housing of earphone
unit 11 and supports 144 to the otherwise closed microphone
chamber in which microphone capsule 13 has been placed.
Microphone capsule 13 is integrated in the body of earphone
unit 11 or fixed solidly on e.g. supports 146. Space 148 has
been arranged behind microphone chamber 147 for electric
components required by earphone unit 11, such as processor
34, amplifiers and A/D and D/A-converters (FIG. 4). Error
microphone 14, which has an acoustic connection to noise
17" arriving from the source of noise 17, has been placed in
space 149 in the end of earphone unit 11 opposite to ear
capsule 12.

FIG. 2C presents an embodiment of earphone unit 11, in
which separate ear capsule 12 and microphone capsule 13
have been replaced with dynamic ear capsule 150 which is
capable of acting simultaneously as a sound reproducing and
receiving component. Instead of dynamic ear capsule 150 it
is possible to use e.g. piezoelectric converters, which have
been described in more detail in the publication Anderson, E. H.
and Hagood, N. W. 1994 Simultaneous piezoelectric
sensing/actuation: analysis and applications to controlled
structures, Journal of Sound and Vibration, vol. 174,
617-639. Integrating ear capsule 12 and microphone capsule 13
advantageously reduces the space needed by earphone unit 11.
Such a construction is also simpler in its mechanical
realization. In earphone unit 11 according to the invention it
is also possible to use other ways of placing and realizing
microphones 13 and 14 and ear capsule 12.
Human speech is generated in the larynx 20 (FIG. 1)
in the upper end of the windpipe, in which the vocal cords
15 are situated. From the vocal cords 15 the speech is
transferred through the Eustachian tube connecting the
throat and the middle ear to the eardrum 16. Also connected
to the eardrum 16 are the auditory ossicles (not shown in the
figure) in the middle ear, over which the sound is forwarded
into the inner ear (not shown in the figure) where the sensing
of sound takes place. The vibrations of the eardrum 16 relay
the speech through the auditory tube 10 to the microphone
capsule 13 in the auditory tube 10 end of earphone unit 11.
When speech is transferred to the user of earphone unit 11
over ear capsule 12, this speech is sensed by the eardrum 16.

In FIG. 3, block 24 illustrates sound signals received by
microphone capsule 13. They consist of three components:
speech signal 15' originating in the vocal cords, ear capsule
signal 12' reproduced by ear capsule 12 in the auditory tube
10 and noise signal 17' caused by external sources of noise
17. In order to detect the desired speech signal 15'
in the auditory tube 10 in the best possible way, signals 12'
and 17', which are disturbing from the point of view of
speech signal 15', are eliminated e.g. in two
different stages. In the first stage ear capsule signal 12'
generated by ear capsule 12 in the auditory tube 10 is
removed in block 24. Because the original electric signal
behind ear capsule signal 12' is known, it can be subtracted
from the signal received by microphone capsule 13 using
subtractor 25, provided that the transfer function between ear
capsule 12 and microphone capsule 13 is known. Because
the transfer function between error microphone 14 and
microphone capsule 13 is essentially constant, noise signal
17' can be subtracted in second stage 25 using subtractor 27
with a method which is explained later.
The transfer function between ear capsule 12 and microphone
capsule 13 is determined e.g. using a so-called MLS
(Maximum Length Sequence) signal. In this method a
known MLS-signal is fed into the auditory tube 10 with ear
capsule 12, and the response caused by the signal is measured
with microphone capsule 13. This measuring is preferably
executed at such discrete moments when no other information
is transferred to the user over ear capsule 12. In
principle it is possible to use any sound signal as the known
measuring signal, but from the user’s point of view it is
pleasant to use e.g. a noise-like MLS-signal formed using a
generator 50 (FIG. 5) which generates binary, seemingly
random sequences (pseudo-random sequence generator), and
which is realized digitally in processor 34 (FIG. 4)
in earphone unit 11. FIG. 5 presents the realization of
generator 50 using an n-stage shift register. With suitably
selected feed-backs 51 and 52, output 53 of the generator
consists of binary sequences repeated identically at certain
intervals. The sequences are fed to D/A-converter 33 (FIG. 4),
and from there further to amplifier 32 and ear capsule 12. The
repetition rate of the sequences depends on the number
of stages n of the generator and on the choice of feed-backs
51 and 52. The longest possible sequence available using
n-stage generator 50 has the length of $2^n - 1$ bits. For example,
a 64-stage generator can produce a sequence which repeats
identically only after about 600,000 years when a 1 MHz
clock frequency is used. It is known to a person skilled
in the art that such long sequences are generally used to
simulate real random noise.
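
The feedback shift register described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name `mls`, the `taps` parameter (standing in for feed-backs 51 and 52) and the 4-stage example are assumptions chosen so that the result is easy to check by hand. The sketch also recomputes the repeat time of a 64-stage generator clocked at 1 MHz.

```python
def mls(n_stages, taps, length=None):
    """Bits from an n-stage feedback shift register (pseudo-random generator).

    taps: 1-based stage numbers XOR-ed together and fed back to stage 1
    (an illustrative stand-in for feed-backs 51 and 52).
    """
    state = [1] * n_stages            # any non-zero start state works
    period = 2 ** n_stages - 1        # longest possible sequence length
    bits = []
    for _ in range(period if length is None else length):
        bits.append(state[-1])        # output taken from the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]   # shift one stage onward
    return bits


seq = mls(4, taps=(4, 3))             # one full period of 15 bits

# Repeat time of a 64-stage generator clocked at 1 MHz, in years.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years_64 = (2 ** 64 - 1) / 1e6 / SECONDS_PER_YEAR
```

With taps (4, 3) the register visits all 15 non-zero states before repeating, so `seq` holds one maximum-length period. `years_64` evaluates to roughly 585,000 years, the same order of magnitude as the figure quoted in the text; reaching that period in practice assumes a feedback choice that actually yields a maximum-length sequence.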
`
FIG. 6 presents determining the transfer function. Ear
capsule 12 is used to feed a known signal f(t) into the
auditory tube 10 and the signal is detected using microphone
capsule 13. Processor 34 saves the supplied signal f(t) in
memory 37. In auditory tube 10 signal f(t) is transformed
by the effect of impulse response h(t) (ref. 56) into the form
h(t)*f(t). Through microphone capsule 13 and amplifier 30
signal h(t)*f(t) is directed to A/D-converter 31 and saved in
memory 37. Signal h(t)*f(t) is the convolution of the supplied
signal f(t) and the system impulse response h(t) (ref. 56).
Convolution has been described e.g. in Erwin Kreyszig’s
book Advanced Engineering Mathematics, sixth edition,
page 271 (Convolution theorem). The system impulse
response h(t) is determined by calculating the
cross-correlation, known to persons skilled in the art, of the
supplied signal f(t) and the received signal h(t)*f(t). Impulse
response h(t) in the time domain can be converted into the
frequency domain e.g. using FFT (Fast Fourier Transform)
58, resulting in system transfer function H(ω).
A relatively low signal to noise ratio (SNR) is sufficient
for a successful measurement. In addition to increasing the
SNR, the accuracy of the impulse response can be
improved through averaging. In favorable conditions the
user will not notice the determining of the impulse response
at all.
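
The measurement of FIG. 6 can be illustrated with a small noise-free simulation. The sketch below is an assumption-laden toy, not the patent's code: the helper names, the mapping of the binary sequence to +1/-1, the 5-stage generator, the three-tap "ear" response `h_true` and the use of circular (periodic) convolution are all chosen for illustration, and a plain DFT stands in for FFT 58.

```python
import cmath


def mls_pm1(n_stages, taps):
    """One period of a maximum length sequence mapped to +1.0 / -1.0."""
    state = [1] * n_stages
    seq = []
    for _ in range(2 ** n_stages - 1):
        seq.append(1.0 if state[-1] else -1.0)
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return seq


def circular_convolve(h, f):
    """Response h(t)*f(t) of the system to the periodically repeated f."""
    n = len(f)
    return [sum(h[j] * f[(i - j) % n] for j in range(len(h))) for i in range(n)]


def estimate_impulse_response(f, g):
    """Recover h from excitation f and response g by cross-correlation."""
    n = len(f)
    r = [sum(f[i] * g[(i + k) % n] for i in range(n)) for k in range(n)]
    # For a +/-1 MLS the circular autocorrelation is n at lag 0 and -1 at
    # every other lag, so r[k] = (n + 1) * h[k] - sum(h) and sum(r) = sum(h).
    offset = sum(r)
    return [(rk + offset) / (n + 1) for rk in r]


def dft(x):
    """Plain discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


f = mls_pm1(5, taps=(5, 3))          # known excitation f(t), 31 samples
h_true = [0.5, 0.3, -0.2]            # hypothetical ear impulse response
g = circular_convolve(h_true, f)     # what the microphone would record
h_est = estimate_impulse_response(f, g)
H = dft(h_est)                       # transfer function samples H(w)
```

In this idealized setting `h_est` reproduces `h_true` in its first three samples and is zero elsewhere. With measurement noise the estimate would only be approximate, which is where the averaging mentioned in the text comes in.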
`
A microphone signal contains the following sound components:

$$m(t) = x(t) + y(t) + z(t) \tag{1}$$

in which

m(t) is the sound signal received by microphone capsule 13,
x(t) is desired speech signal 15',
y(t) is ear capsule signal 12' detected by microphone capsule 13,
z(t) is external noise signal 17' detected by microphone capsule 13.

Because the speech signal x(t) transferred by eardrum 16 is
to be solved, the share of ear capsule 12 and of
external noise 17 must be subtracted from the microphone
signal. In this case equation (1) can be rewritten in the form:

$$x(t) = m(t) - y(t) - z(t) \tag{2}$$

Sound component y(t) detected by microphone capsule 13
can be written, utilizing the original known electric signal
y'(t) supplied to the ear capsule and the determined impulse
response h(t), as follows:

$$y(t) = h(t) * y'(t) \tag{3}$$

By substituting equation (3) into equation (2) it is obtained:

$$x(t) = m(t) - h(t) * y'(t) - z(t) \tag{4}$$
`
Error microphone 14 is used to compensate for external
signal z(t). Error microphone 14 measures external noise
z'(t), which is used as a reference signal. When the external
noise reaches microphone capsule 13 it is transformed in a
way determined by acoustic transfer function K(ω) between
the microphones. Transfer function K(ω) and its time-domain
equivalent k(t) can most preferably be determined in the
manufacturing stage of earphone unit 11, because the coupling
between microphones 13 and 14 is constant due to the
construction of earphone unit 11. In this case z(t) can be
written, using reference signal z'(t) and impulse response
k(t) between the microphones, as follows:

$$z(t) = k(t) * z'(t) \tag{5}$$

Substituting equation (5) into equation (4) gives the expression
according to which microphone signal m(t) is processed so that
the desired user’s speech signal can be detected:

$$x(t) = m(t) - h(t) * y'(t) - k(t) * z'(t) \tag{6}$$
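
Equation (6) translates directly into sample-by-sample processing. The following is a minimal sketch under stated assumptions: the function names, the short FIR impulse responses `h` and `k` and the synthetic test signals are hypothetical, and both impulse responses are taken as already known (h from the MLS measurement, k from manufacturing).

```python
def convolve(h, s):
    """Causal FIR filtering: out[n] = sum over j of h[j] * s[n - j]."""
    return [sum(h[j] * s[n - j] for j in range(len(h)) if n - j >= 0)
            for n in range(len(s))]


def recover_speech(m, y_ref, z_ref, h, k):
    """Equation (6): x(t) = m(t) - h(t)*y'(t) - k(t)*z'(t)."""
    y = convolve(h, y_ref)            # ear capsule share, equation (3)
    z = convolve(k, z_ref)            # external noise share, equation (5)
    return [mi - yi - zi for mi, yi, zi in zip(m, y, z)]


# Synthetic check: build m(t) from known parts, then recover x(t).
x_true = [0.1, -0.2, 0.05, 0.0, 0.3]      # user's speech
y_ref = [1.0, 0.0, -1.0, 0.5, 0.2]        # electric signal y'(t) to ear capsule
z_ref = [0.3, 0.3, -0.1, 0.0, 0.4]        # error microphone signal z'(t)
h = [0.5, 0.25]                            # hypothetical ear-canal response
k = [0.8, -0.1]                            # hypothetical microphone coupling
m = [a + b + c for a, b, c in zip(x_true, convolve(h, y_ref), convolve(k, z_ref))]
x_est = recover_speech(m, y_ref, z_ref, h, k)
```

Here `x_est` matches `x_true` up to rounding because h and k are exact; in a real device any error in h or k leaves a residue in the recovered speech.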
`
A filter is required for compensating external signal z(t),
which filter realizes impulse response k(t). The filter can be
constructed using discrete components, but preferably it is
realized digitally in processor 34. Even traditional adaptive
echo canceling algorithms can be used for estimating signals
y(t) and z(t).

The acoustic coupling between microphone capsule 13
and error microphone 14 can also be determined during the
operation of the device. This can be carried out by comparing
the microphone signals m(t) and z'(t). When signal y'(t)
is 0 and a moment is found when the user of the device
is not speaking, x(t) is also 0. In this case the remaining m(t)
is essentially the convolution k(t)*z'(t). Transfer function K(ω)
can then be determined simply as a ratio in the frequency
domain:

$$M(\omega)/Z'(\omega) = K(\omega) Z'(\omega)/Z'(\omega) = K(\omega) \tag{7}$$

Finally, the transfer function can be converted into the
time-domain impulse response k(t) using the inverse Fourier
transform. This operation can be used e.g. for determining
the acoustic leak of earphone unit 11 or as a help to speech
synthesis e.g. when editing a user’s speech.

When detected in the auditory tube 10, human speech is
somewhat distorted, because typically high frequencies are
more attenuated in the auditory tube 10.

By comparing, in an environment with little or preferably no
noise at all, the differences between speech signals from
microphone capsule 13 detecting speech in the auditory tube
10 and speech signals received by external error microphone
14, it is possible to determine the transfer function applied
to the speech signal by the auditory tube, utilizing e.g. the
above described method. Based upon the determined transfer
function it is possible to realize in processor 34 a filter
which can be used for compensating the distortion in the
speech signal caused by the auditory tube. In this case a
better voice quality is obtained.

In an environment with little noise external error microphone
14 can even be used instead of main microphone 13. It is
possible to realize the choice between microphones 13 and
14 e.g. by comparing the amplitude levels of the microphone
signals. In addition to this the microphone signals can be
analyzed e.g. using a speech detector (VAD, Voice-Activity
Detection) and further through correlation calculation, with
which one can confirm that signal z'(t) arriving in error
microphone 14 has sufficient resemblance to the processed
signal x(t). These actions can be used for preventing
noise of nearby machinery or other corresponding sources of
noise and speech of nearby persons from passing on after the
processor. When error microphone 14 is used instead of
microphone capsule 13 it is possible to obtain better voice
quality in conditions with little noise.

FIG. 4 presents in more detail the internal construction of
earphone unit 11. The signals from microphone capsule 13
and error microphone 14 are amplified in amplifiers 30 and
36 after which they are directed through A/D-converters 31
and 35 to processor 34. When a speech signal or the MLS-signal
from generator 50 is transferred to the user’s auditory tube
10, it is transferred through D/A-converter 33 and amplifier
32 to ear capsule 12. Program codes executed by
processor 34 are stored in memory 37, which is used by
processor 34 also for storing e.g. the interim data required
for determining impulse response h(t). Controller 38, which
typically is a microprocessor, the required A/D- and
D/A-converters 39 and processor 34 with memory 37 convert
both the incoming and outgoing speech into the form
required by transfer path 40. Transfer of speech in both
directions can be carried out in either analogue or digital
form to either external terminal device 121 (FIG. 13) or
device 100, 110 (FIGS. 11A, 11B and 12) built in connection
with earphone unit 11. The required A/D- and D/A-conversions
are executed with converter 39. Also the power
supply to earphone unit 11 can be carried out over transfer
path 40. If earphone unit 11 has been designed for wireless
operation, the required means of transmitting and receiving
111, 113 (FIG. 12A) and the power supply (e.g. a battery, not
shown in the figure) are placed e.g. in the ear-mounted part.

If both the user of earphone unit 11 and his speaking
partner are talking simultaneously, a so-called “double-talk”
situation occurs. In the traditional “double-talk” detection of
`
mobile telephones speech detectors are used both in the
channel which transfers speech from the user to the mobile
communication network (up-link) and in the channel which
receives speech from the mobile communication network
(down-link). When the speech detectors of both channels
indicate speech, the training of the adaptive echo canceller
is temporarily interrupted and its settings are saved. This
state can be continued as long as the situation is stable,
after which the attenuating of the microphone channel is
started. Interrupting the training of the echo canceller is
possible because the eventual error is, at least in the
beginning, lower than the up-link and down-link
signals. In the case of earphone unit 11 the traditional
detection of “double talk” cannot be applied without problems,
because even the smallest error in determining impulse response
h(t) will produce an error which is of the same order as
original signal x(t). In principle the problems arising could
be avoided by giving priority to information transferred in
one of the directions, but this solution is not attractive from
the user’s point of view. In this case users would experience
interruptions or high attenuation in speech transfer. A better
solution is achieved by striving for as good a separation as
possible of the signals transferred in different directions.
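
The traditional double-talk handling described above, which the text goes on to reject for the earphone unit, can be sketched as simple control logic. Everything here is an illustrative assumption: the energy threshold detector, the class name `EchoCancellerControl` and the frame contents are hypothetical, and the later attenuation of the microphone channel is not modeled.

```python
def is_speech(frame, threshold=1e-3):
    """Crude energy-based speech detector; real VADs are more elaborate."""
    energy = sum(s * s for s in frame) / len(frame)
    return energy > threshold


class EchoCancellerControl:
    """Tracks whether the adaptive echo canceller may keep training."""

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)
        self.saved = None
        self.training = True

    def update(self, uplink_frame, downlink_frame):
        double_talk = is_speech(uplink_frame) and is_speech(downlink_frame)
        if double_talk and self.training:
            self.saved = list(self.coefficients)   # save current settings
            self.training = False                  # interrupt training
        elif not double_talk:
            self.training = True                   # resume when talk is one-way
        return double_talk


quiet = [0.0] * 160
talk = [0.1, -0.1] * 80
ctrl = EchoCancellerControl([0.2, 0.1])
flags = [ctrl.update(talk, quiet), ctrl.update(talk, talk), ctrl.update(quiet, talk)]
```

Only the middle frame pair is flagged as double talk; the coefficients in force at that moment are saved, and training resumes once a single channel is active again.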
FIG. 14 presents an embodiment in which microphone
signal 13" and ear capsule signal 12" transferred in different
directions are separated from each other using band-pass
filters 132, 133, 134 and 137. The band-pass filters divide the
speech band into sub-bands (references 61-68, FIGS. 7-10),
in which case ear capsule 12 can be run on part of the
sub-bands and the signal from microphone capsule 13 is
correspondingly forwarded only on sub-bands which remain
free. FIG. 7 presents an example of sub-bands, in which
the speech signal is transferred in both directions on three
different frequency bands. In telephone systems the speech
band is typically 300 to 3400 Hz. Out of the signal from
microphone capsule 13, frequency bands 300 to 700 Hz,
1.3 to 1.9 kHz and 2.4 to 3.0 kHz, or sub-bands 62,
64 and 66, are in this case utilized directly. The signal
reproduced by ear capsule 12 correspondingly contains
frequency bands 700 Hz to 1.3 kHz, 1.9 to 2.4 kHz and
3.0 to 3.4 kHz, or sub-bands 63, 65 and 67. In traditional
mobile telephone communication frequency bands below
300 Hz (reference 61) and higher than 3.4 kHz (reference 68)
are not used. The number of sub-bands is not limited in
principle: the more sub-bands the frequency range in use is
divided into, the better the voice quality obtained. As a
counterweight to this the required processing capacity
increases.
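
The band allocation of the FIG. 7 example can be written down as a small lookup table. This is only a sketch: the band edges and reference numbers come from the text above, while the names `SUB_BANDS` and `band_user` and the half-open treatment of the band edges are assumptions.

```python
# (sub-band reference, lower edge in Hz, upper edge in Hz, user of the band)
SUB_BANDS = [
    (62, 300, 700, "microphone"),
    (63, 700, 1300, "ear capsule"),
    (64, 1300, 1900, "microphone"),
    (65, 1900, 2400, "ear capsule"),
    (66, 2400, 3000, "microphone"),
    (67, 3000, 3400, "ear capsule"),
]


def band_user(freq_hz):
    """Return (reference, user) for a frequency, or None outside 300-3400 Hz."""
    for ref, low, high, user in SUB_BANDS:
        if low <= freq_hz < high:      # half-open edges are an assumption
            return ref, user
    return None                        # sub-bands 61 and 68 are not used
```

For example, `band_user(500)` falls in sub-band 62 (microphone direction) and `band_user(1000)` in sub-band 63 (ear capsule direction), so the two directions never occupy the same frequencies.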
`
`The above described utilizing of sub-bands needsprefer