United States Patent [19]
McAuliffe et al.

US005870705A
[11] Patent Number: 5,870,705
[45] Date of Patent: Feb. 9, 1999

[54] METHOD OF SETTING INPUT LEVELS IN A
     VOICE RECOGNITION SYSTEM
[75] Inventors: Garrett McAuliffe, Kirkland; Leonard Zuvela,
     Mukilteo, both of Wash.
[73] Assignee: Microsoft Corporation, Redmond, Wash.
[21] Appl. No.: 327,543
[22] Filed: Oct. 21, 1994
[51] Int. Cl. ......... G10L 9/00
[52] U.S. Cl. ......... 704/225; 704/275; 704/200;
     381/106; 381/107; 381/108
[58] Field of Search .. 395/2.34, 2.84, 395/2;
     381/68.4, 28, 107, 106, 108

[56] References Cited

U.S. PATENT DOCUMENTS
4,250,637   2/1981  Scott.
4,292,469   9/1981  Scott et al.
4,297,527  10/1981  Pate ................ 381/107
4,354,064  10/1982  Scott.
4,383,135   5/1983  Scott et al.
4,455,676   6/1984  Kaneda .............. 381/106
4,468,204   8/1984  Scott et al.
4,495,384   1/1985  Scott et al.
4,610,023   9/1986  Noso et al. ......... 395/2.84
4,672,667   6/1987  Scott et al.
4,776,016  10/1988  Hansen .............. 395/2.84
4,777,649  10/1988  Carlson et al. ...... 381/26
4,783,803  11/1988  Baker et al.
4,829,576   5/1989  Porter.
4,829,578   5/1989  Roberts.
4,837,831   6/1989  Gillick et al.
4,866,778   9/1989  Baker.
[entry illegible] ....................... 381/31
4,914,703   4/1990  Gillick.
4,969,193  11/1990  Scott et al.
5,025,471   6/1991  [name illegible]
5,027,406   6/1991  Roberts et al.
5,208,866   5/1993  Kato et al. ......... 381/107
5,267,322  11/1993  Smith et al. ........ 381/107
5,345,538   9/1994  Narayannan et al. ... 395/2.84
5,363,147  11/1994  Joseph et al. ....... 381/108

Primary Examiner: David R. Hudspeth
Assistant Examiner: Vijay B. Chawan
Attorney, Agent, or Firm: Ratner & Prestia
[57] ABSTRACT
A computer implemented voice recognition method and system for
adjusting an input level to adjust the input signal amplitude level of
spoken words to enhance voice recognition. A user is prompted with a
word to speak into a microphone. The spoken word is converted into an
analog electrical signal having an input signal amplitude level. A
sound card then converts this analog signal to a digital stream of
data. This input signal amplitude level is compared to a reference
amplitude level. An adjustment to an input volume control is made with
respect to the comparison to adjust the input signal amplitude level
to approach the reference amplitude level. The invention also uses an
iterative process for a set number of iterations to make the
adjustment for the input signal amplitude level to approach the
reference amplitude level.

28 Claims, 4 Drawing Sheets
`
[Representative drawing: flow chart with blocks “Prompt User for
Word,” “Generate Input Signal Amplitude Level,” “Word Detected,” and
“Compare Input Signal Amplitude Level to Preselected Signal Amplitude
Level,” and branches labeled “Start Over” and “Acceptable.”]
`Page 1
`
`AMAZON 1017
`Amazon v. SpeakWare
`IPR2019-00999
`
`

`

U.S. Patent    Feb. 9, 1999    Sheet 1 of 4    5,870,705

[Flow chart with blocks “Prompt User for Word” (12), “Generate Input
Signal Amplitude Level” (14), “Word Detected,” and “Compare Input
Signal Amplitude Level to Preselected Signal Amplitude Level,” and
branches labeled “No,” “Start Over,” and “Acceptable.”]
`
`

`

U.S. Patent    Feb. 9, 1999    Sheet 2 of 4    5,870,705
`
`

`

U.S. Patent    Feb. 9, 1999    Sheet 3 of 4    5,870,705

FIG. 3
Prior Art
`
`

`

U.S. Patent    Feb. 9, 1999    Sheet 4 of 4    5,870,705
`
`

`

`1
`METHOD OF SETTING INPUT LEVELS IN A
`VOICE RECOGNITION SYSTEM
`
`5,870,705
`
`15
`
`2
`Newton, Mass. is an example of a voice recognition engine,
`which can be run on a personal computer.
`Because different userS Speak at different Sound levels, as
`well as the difference in background sound levels, both of
`which can effect the reception of a userS Speech by a voice
`recognition System, it is likely that in many situations, a
`Voice recognition engine or System may not function at an
`optimal level. The audio input may not be within the
`acceptable range for the Voice recognition engine being
`used.
`Computers equipped for voice recognition may typically
`have a Sound card in addition to an input device Such as a
`microphone. A Sound card typically includes a coder/
`decoder or CODEC. The Microsoft Sound System Sound
`Card uses the Analog Devices AD 1848 Parallel-Port Sound
`Port Stereo CODEC. Among other functions, the CODEC
`contains an input volume control which can be used to adjust
`the amplitude level of an analog input signal from the
`microphone. The CODEC also converts an analog signal
`(representative of a voice input) into a digital signal. The
`digital signal can then be transmitted from the Sound card
`through the computer bus for processing (Such as pattern
`matching) by the computer.
`One widely Sold operating System program which helps
`control a computer is WINDOWSTM version 3.1
`(“WINDOWS") of Microsoft Corporation. Among other
`features, WINDOWS provides a graphical user interface
`allowing the user the option of using a pointing device Such
`as a mouse, to control the operation of the computer without
`the need to memorize text commands usually required in
`DOS based applications. WINDOWS also provides appli
`cation programmerS with tools So that applications have a
`common look in Structure as well as execution of common
`operations. A WINDOWS application programmer is thus
`provided with a variety of tools to assist in controlling
`various computer functions as well as designing “user
`friendly' applications.
`A Software program written for WINDOWS operation
`uses dynamic link libraries (DLLS) which contain a plurality
`of application programming interfaces (APIs). Examples of
`Such DLLs are USER.EXE, KRNL386.EXE, and GDI.EXE
`which contain the core functionality APIs that make up
`Microsoft Windows 3.1. Although each of these three DLLs
`has the .EXE extension (usually representing an executable
`application), each is a DLL. The APIs are used to carry out
`various WINDOWS functions. For example, if a software
`program requires a dialog box displayed on a computer
`monitor to prompt a user for a command or data entry, the
`Software program would make a call to the DialogBox API
`which brings up a dialog box on the computer monitor. The
`contents of the dialog box are local to or associated with the
`particular application which made the call. Another example
`of a WINDOWS API is the SetWindowLong API. This API
`asSociates data with a particular window, allowing a user
`who has Switched applications to return to the point in the
`original application where processing had been taking place
`prior to the switch to the other application. WINDOWS
`operation and WINDOWS programming, including the use
`of DLLs and APIs are well known by those skilled in the art.
`The Microsoft WINDOWS Software Development Kit,
`Guide to Programming, Volumes 1-3, 1992, is incorporated
`by reference herein. It is available and used by WINDOWS
`programmerS and provides reference information for many
`of the DLLs and APIs which are available to WINDOWS
`programmerS.
`WINDOWS, while providing ease of use for running
`applications, may serve as a platform for a voice recognition
`
`FIELD OF THE INVENTION
`This invention relates to a method and System for adjust
`ing audio input volume for a System which uses voice
`recognition.
`BACKGROUND OF THE INVENTION
`Voice recognition is the process by which spoken words
`are interpreted and “understood” by a computer. Voice
`recognition Systems thus become another means for entering
`data and controlling a computer, to the function of a key
`board or a pointing device (e.g., mouse).
`In a typical Voice recognition System, a user Speaks into
`an input device Such as a microphone, which converts the
`audible Sound waves of Voice into an analog electrical
`Signal. This analog electrical Signal has a characteristic
`waveform defined by several factors including the volume at
`which the words are spoken. The volume component of the
`spoken word translates into the amplitude of the waveform.
`Voice recognition involves pattern matching to compare
`the electrical Signal associated with a spoken word against a
`reference Signal associated with a "known word. A
`25
`“known word is Stored in a computer by a user. In a typical
`System, the user Speaks a word into a microphone and the
`electrical Signal of this spoken word is associated with a
`typed word. Instead of a typed word, the word can also be
`called up from a database, for example. After a word is
`“known,” Voice recognition can take place.
`Thus, if the electrical signal of a spoken word matches the
`waveform of the reference signal of the “known” word,
`within an acceptable range of error, the System “recognizes'
`the spoken word as the “known” word (which has previously
`been associated with the reference signal). A Software appli
`cation which uses voice recognition could then use the Voice
`input for entering data or controlling a Software application
`(similar to the way a keyboard would be used). For example,
`in a word processor or dictation System using Voice recog
`nition text could be audibly entered into the body of a
`document via a microphone instead of typing the words into
`the text on a keyboard.
`Digital Signal processing can be used to provide an
`accurate comparison between the waveform of the Voice
`audio input and that of the reference Signal. Digital signal
`processing requires that the waveform of the Voice audio
`input, as well as the waveform of the reference Signal are
`represented as digital Signals. Having a Sufficient amplitude
`level for the Voice audio input provides a better Signal for
`conversion to a digital Signal and thus a better reference
`Signal for voice recognition. If the amplitude level is too low,
`there may not be enough range in the electrical Signal of
`either the reference Signal or the Spoken word to provide a
`high enough level of confidence that the electrical signal of
`a spoken word matches that of the “known” word. If the
`amplitude level is too high, certain attributes of the electrical
`Signals may be "clipped.” This, too, may lower the confi
`dence level of the pattern matching. In more extreme cases,
`the electrical Signals may be too low or too high, resulting
`in no match. The sufficiency of the amplitude level is
`determined for a particular voice recognition “engine'. The
`Voice recognition engine is Software or hardware which
`carries out the interpretation and analysis of the Voice audio
`input (or its digital representative) to determine whether a
`match has occurred and the confidence level of the match.
`The Dragon Recognizer by Dragon Systems, Inc. of
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`
`

`

`3
`system. WINDOWS lacks, however, a system for adjusting
`input levels to optimize Voice recognition for a particular
`user at given location.
A speech detection recognition apparatus for use with background noise
of varying levels is described in U.S. Pat. No. 4,829,578, to Roberts.
The apparatus compares the amplitude of an audio signal during
successive time periods with certain speech detection thresholds and
generates an indication of whether the signal contains speech. The
amplitude of the audio signal is altered relative to speech detection
thresholds as a function of background noise signals which are
detected to improve speech detection.
Roberts and other systems which relate to speech detection do not
address adjusting the input amplitude level to assist in and improve
voice recognition. Still further, there is a lack of a system to make
systematic adjustments to input amplitude levels by sampling a user's
speech and analyzing it in a controlled fashion and then adjusting an
input device based on that sampling and analysis.
SUMMARY OF THE INVENTION
There is provided, in accordance with the present invention, a voice
recognition method and system for adjusting an input volume control of
an input device. This, in turn, adjusts an input signal amplitude
level of a word spoken into the input device. The user is prompted
with a word to speak into an input device, such as a microphone
connected to a sound card. The input device converts the spoken word
into an electrical signal with an amplitude level (“input signal
amplitude level”) relative to the volume at which the words were
spoken. The input signal amplitude level is compared against a
preselected reference signal amplitude level. The preselected
reference signal amplitude level is set to a level to enhance voice
recognition. An input volume control of the input device is then
adjusted to cause the input signal amplitude level to approach the
preselected reference signal amplitude level in a predetermined
manner.
In a preferred embodiment of the present invention, a step of
determining if a word was spoken is performed prior to comparing the
spoken word to a reference. The steps of prompting the user for a
word, generating an input signal amplitude level for the spoken word,
determining if a word was spoken, comparing the input signal amplitude
level of the spoken word with respect to the preselected reference
signal amplitude level and adjusting an input volume control with
respect to the comparison are repeated nine times. During each of the
nine iterations, the user is prompted to speak a different word than
had been previously prompted.
BRIEF DESCRIPTION OF THE FIGURES
The invention will now be described by way of non-limiting example,
with reference to the attached drawings in which:
FIG. 1 is a flow diagram showing the method which operates in
accordance with the present invention;
FIG. 2 is a flow chart showing the operation of process block 24 shown
in FIG. 1;
FIG. 3 shows a personal computer and associated peripheral devices
used in operating the system and performing the method in accordance
with the present invention; and
FIG. 4 shows a screen display in accordance with the present
invention.
DETAILED DESCRIPTION OF THE
INVENTION
There is shown in FIG. 3 an example computer system 50 for carrying
out process 10 shown in the flow charts of FIGS. 1 and 2. Computer
system 50 is comprised of a personal computer 42 having several
peripheral devices including monitor 44, keyboard 46, mouse 48
(resting on mouse pad 60), microphone 52, sound card 54 (including a
CODEC 56) plugged inside of computer 42 and a speaker 58. The present
invention is not limited to the configuration for computer system 50
shown in FIG. 3. Other configurations which can operate the present
method and system will be understood by those skilled in the art.
Microphone 52 is an input device for generating a representative input
signal amplitude level as microphone 52 converts acoustic energy into
an analog electrical signal including audio signal information.
Microphone 52 is shown connected to sound card 54 in FIG. 3. The
combination of microphone 52 and sound card 54 can also be viewed as
an input device for generating an input signal amplitude level. A
device having a similar function, such as a microphone containing
circuitry to carry out the functions of a sound card, including an
input volume control, could also serve as an input device.
Other input devices, such as an optical storage device or magnetic
storage device, could also serve as an input device containing
prerecorded “audio” information. In such a system, a digital
representation of the audio information is stored on the respective
storage device. In such a system, the stored digital audio information
could also be used directly in determining an input signal amplitude
level.
In a preferred embodiment of the present invention, a microphone is
used and thus operates on a real time signal, not a prerecorded
signal.
As previously mentioned, microphone 52 is connected to sound card 54.
Sound card 54 handles the interface between audio input and output
(I/O) and the computer. It also converts analog signals (i.e., audio
waveforms) into a stream of digital data. An example of sound card 54
is the Microsoft WINDOWS Sound System model #206-151v200, which
contains a CODEC 56. An example for CODEC 56 is the Analog Devices
AD1848 Parallel-Port SoundPort Stereo CODEC (“AD1848 CODEC”). The
operation of the AD1848 CODEC is described in the Analog Devices
Specification REV 0 for the AD1848 CODEC. Sound card 54 and CODEC 56
and the Windows Sound System software allow adjustments to the volume
or amplitude level of an input signal to enhance voice recognition.
Other sound cards or devices which handle the interface between audio
input/output (I/O) and computer 42 can be used in the system of the
present invention.
A monitor 44 is used in the present invention to display a visual
prompt to a user with words and messages. Although in a preferred
embodiment the user is provided with visual prompts on monitor 44, a
user could also be prompted audibly through a speaker 58 or through
another output device such as a serial or parallel connected printer
(not shown).
In a preferred embodiment, the input signal amplitude level adjustment
of the present invention is used in a WINDOWS voice recognition
application, such as Voice Pilot version 2.0 which is included with
the Microsoft WINDOWS Sound System software (version 2.0) (the
reference manual for which is incorporated herein by reference). With
this software, spoken words can be used to execute commands, such as
resizing a window (i.e., using the spoken words “minimize” or
“maximize,” respectively). As a WINDOWS application, certain
application programming interfaces (APIs) are used to access functions
supported by the WINDOWS operating system.
`
`
`

`

`5,870,705
`
`1O
`
`15
`
`25
`
`35
`
`S
For example, the mixer API included with the MSMIXMGR.DLL, which is a
part of the WINDOWS Sound System device driver kit, controls sound
card 54 which is plugged into personal computer 42. The mixer API
controls a particular sound card 54 plugged into personal computer 42
through a mixer driver written for the particular sound card being
used. The WINDOWS Sound System mixer driver is one example of such a
driver. The mixer driver allows communication through, and control of,
the CODEC 56 located on sound card 54. The mixer API can control
several functions, including the input and output volume of sound card
54. Accordingly, it is through the mixer API that input volume
adjustments to sound card 54 are made based on the comparison of an
input signal amplitude level and a preselected reference signal
amplitude level, as discussed below.
Another API used by WINDOWS is the WaveInOpen API, which is a part of
the MMSystem.DLL. It is used to access sound card 54 to input sound
via microphone 52. The WaveInOpen API calls the Wave driver which in
turn is used to “push” data into and out of sound card 54. Sound files
used by WINDOWS are formatted in the WAV file format.
Voice Pilot also contains a voice recognition engine known as the
Dragon Recognizer by Dragon Systems, Inc. of Newton, Mass. This voice
recognition engine is called/operated through a DLL of Microsoft Sound
System called VLAYER.DLL. This DLL contains several APIs which allow
function calls to the Dragon Recognizer and is used for polling the
Dragon Recognizer to make the comparison between the input signal
amplitude level and the preselected reference signal amplitude level.
Other voice recognition engines could also be used in voice
recognition systems in accordance with the present invention. If a
voice recognition engine other than the Dragon Recognizer is used, the
APIs in VLAYER.DLL would be changed accordingly to accommodate a
different voice recognition engine. As previously noted, the
preselected reference signal amplitude level is already entered into
the Dragon Recognizer. Some of the APIs and their associated functions
used in VLAYER.DLL are:
1) SetVoiceWindow: associates a window with the Dragon Recognizer;
2) InitVoiceRecognizer: initializes the Dragon Recognizer to the input
   hardware (sound card);
3) Recognize: determines whether a word was spoken; and
4) GetUttMeasure: determines whether the input signal amplitude is too
   high, too low, or within an acceptable range as determined by the
   Dragon Recognizer (by comparison to the reference signal amplitude
   level).
Other DLLs and APIs for WINDOWS, the WINDOWS Sound System, and the
Dragon Recognizer will be understood by those skilled in the art.
There is shown in FIG. 1 a flow chart of a process 10 of the present
invention which is carried out on computer system 50. In block 12, a
user is prompted for a word to speak into an input means such as
microphone 52 or the combination of microphone 52 and sound card 54
shown in FIG. 3. The prompt to the user occurs by way of a dialog box
(smaller window) shown in FIG. 4, with text identifying the word to be
spoken by the user. In block 14, an input signal amplitude level is
determined. The input signal amplitude level is one component of the
waveform which results from the conversion of the spoken word (audio
signal) into an analog electrical signal by microphone 52. The present
invention adjusts the audio input volume, thereby adjusting the input
signal amplitude level, to enhance voice recognition.
`
Although the input waveform has been previously described as including
an input signal amplitude level, it is also comprised of a plurality
of amplitude levels which result from normal speech. This is a result
of both how the human voice operates and how language is communicated
with the human voice. As there is no one amplitude level “value,” an
algorithm is necessary to either generate a single value (such as an
average of all amplitude levels sampled if digital signal processing
is being used) or analyze the series of amplitude values which
comprise the waveform of the spoken word. In the preferred embodiment
of the present invention, it is not critical how the comparison of the
input signal amplitude level is made, as the preferred embodiment of
the present invention is used to enhance the input signal amplitude
level for the voice recognition engine being used, in this case, the
Dragon Recognizer. The generation of an algorithm to accomplish this
comparison would be understood by those skilled in the art, such as
described in Principles of Digital Audio, Ken Pohlmann, Sams, 1989
(2nd edition).
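As one illustration of reducing a sampled waveform to a single amplitude value, the following Python sketch averages the absolute sample values and, alternatively, takes the peak. It is only a sketch under stated assumptions: the function names, the 16-bit full-scale value of 32768, and the choice to express the result as a percentage are illustrative and do not come from the patent, which leaves the algorithm open.

```python
from typing import Sequence


def mean_amplitude_level(samples: Sequence[int], full_scale: int = 32768) -> float:
    """Average absolute sample value, as a percentage of full scale.

    Assumes signed integer PCM samples; full_scale=32768 corresponds to
    16-bit audio (an assumption, not a value taken from the patent).
    """
    if not samples:
        return 0.0
    total = sum(abs(s) for s in samples)
    return 100.0 * total / (len(samples) * full_scale)


def peak_amplitude_level(samples: Sequence[int], full_scale: int = 32768) -> float:
    """Largest absolute sample value, as a percentage of full scale."""
    if not samples:
        return 0.0
    return 100.0 * max(abs(s) for s in samples) / full_scale
```

An average reflects the overall loudness of the word, while a peak value is more sensitive to clipping; which of the two suits a given recognition engine is a design choice.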
In block 16 it is determined whether or not a user has spoken a word
as prompted in block 12. This determination is made on the basis of a
preselected threshold input signal amplitude level being detected, not
whether a word match has occurred. The prompt to the user in block 12
is to have the user speak and then generate a waveform which has an
amplitude level which can be detected and analyzed by the present
invention. Certain input devices 52, such as a microphone, can be
adjusted or built with varying sensitivity to help isolate or pick up
a user's voice.
The preselected threshold input signal amplitude level can be set to a
value which accounts for any normal background noise in a typical home
or office setting, taking into account the sensitivity and directional
characteristics of the microphone or other input device being used.
Whether block 16 determines whether a word has been spoken or a
preselected threshold input signal amplitude level has been detected,
the description which follows refers to block 16 as detecting whether
or not a word has been detected or spoken.
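The threshold test of block 16 can be sketched as follows. This is a minimal Python illustration, not the detection logic of any particular recognition engine: the function name `word_detected`, the per-frame `levels` input, and the `min_frames` requirement (a simple guard against single noise spikes) are all assumptions added for the example.

```python
from typing import Iterable


def word_detected(levels: Iterable[float], threshold: float, min_frames: int = 3) -> bool:
    """Return True if the amplitude level exceeds the threshold for at
    least `min_frames` consecutive frames.

    `levels` is a sequence of per-frame amplitude levels; `threshold`
    plays the role of the preselected threshold input signal amplitude
    level. `min_frames` is a hypothetical parameter for this sketch.
    """
    run = 0
    for level in levels:
        # Extend the run of loud frames, or reset it on a quiet frame.
        run = run + 1 if level > threshold else 0
        if run >= min_frames:
            return True
    return False
```

Setting `threshold` above the expected background noise level corresponds to the adjustment for a typical home or office setting described above.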
If no word is detected in block 16, the user is prompted in block 18
to acknowledge whether or not a word has actually been spoken. In a
preferred embodiment, the user is prompted via a WINDOWS dialog box
displayed on a computer monitor. The WINDOWS dialog box asks the user
to acknowledge whether a word has been spoken by selecting the
appropriate button (“yes” or “no” or “help”) in the dialog box. As in
many WINDOWS applications, the selection of the appropriate button can
be made using the keyboard or a pointing device such as a mouse. Also
in a preferred embodiment, a timer counts approximately five seconds
from when the user is first prompted in block 12 before the user is
prompted in block 18 to acknowledge whether a word has been spoken. If
a user acknowledges that a word has not been spoken, control returns
to block 12 and the user is prompted to speak the particular word into
the microphone. If the user acknowledges that a word has been spoken,
control drops down to block 22 and the volume level of the input means
is adjusted upwardly. The adjustment will always be upward in such a
situation as the input volume of input device 52 (such as microphone
52 in FIG. 3) was so low that the system could not even detect that
the prompted word had been spoken.
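The branching among blocks 16, 18, 12 and 22 described above can be summarized in a small decision function. This Python sketch is illustrative only: the names `NextStep` and `after_detection` are hypothetical, and the “help” button and the five-second timer are not modeled.

```python
from enum import Enum
from typing import Optional


class NextStep(Enum):
    COMPARE = "block 24"       # word detected: compare and adjust
    REPROMPT = "block 12"      # user says no word was spoken
    RAISE_VOLUME = "block 22"  # user spoke, but input was too quiet to detect


def after_detection(word_detected: bool,
                    user_says_spoken: Optional[bool] = None) -> NextStep:
    """Choose the next block after the detection test of block 16.

    `user_says_spoken` is the user's answer to the block 18 dialog and is
    only consulted when no word was detected.
    """
    if word_detected:
        return NextStep.COMPARE
    if user_says_spoken:
        # The word was spoken but not detected, so the adjustment is upward.
        return NextStep.RAISE_VOLUME
    return NextStep.REPROMPT
```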
If a word is detected in block 16, processing continues into block 24.
Block 24 is comprised of two steps shown in blocks 20 and 22,
respectively.
In block 20, the input signal amplitude level is compared to a
preselected reference signal amplitude level. The preselected
reference signal amplitude level is a desired amplitude level for
input signals which is programmed into the system. In one embodiment,
the voice recognition engine is preprogrammed with the desired input
level, in order to enhance voice recognition. The voice recognition
engine also includes a tool which can be queried or polled with an
input signal amplitude level to determine whether or not the input
signal amplitude level is acceptable (i.e., is within an acceptable
range of the preselected reference signal amplitude level). In an
additional embodiment, a voice recognition engine may not be polled to
make the comparison between the input signal amplitude level and the
preselected signal amplitude level. In such a case, an algorithm would
have to be generated to determine whether the input signal amplitude
level is within an acceptable range of the preselected signal
amplitude level.
Once the comparison is made in block 20, processing continues in block
22 where the input volume control, or input level, of the input means
is adjusted relative to the comparison made in block 20. If the
comparison of the input signal amplitude level to the preselected
reference signal amplitude level made in block 20 determines that the
input signal amplitude level is below the preselected signal amplitude
level, the input volume control is adjusted to amplify or increase the
amplitude of the electrical signal generated from the audio input
signal. If the comparison in block 20 determines that the input signal
amplitude level is higher than the preselected reference signal
amplitude level, the input volume control is adjusted to reduce or
lower the amplitude of the electrical signal generated from the audio
input signal. If the comparison determines that the input signal
amplitude level is within an accepted range of the preselected
reference signal amplitude level, then no adjustment to the input
volume control is made. In a preferred embodiment, an acceptable range
for the comparison is within two percent of the preselected reference
signal amplitude level. The acceptable range may be adjusted,
depending upon the requirements of a particular voice recognition
system and/or voice recognition application. Once the input means
volume has been adjusted in block 22, processing continues to block
26. In block 26, it is determined whether or not the user is required
to be prompted for additional words. This takes place if multiple
passes or iterations are used to initialize the system.
In a preferred embodiment, an iterative process is used whereby the
user is prompted to speak nine different words, one following the
prompt associated with each iteration, with a comparison to the
preselected reference signal amplitude level (the same reference level
for each iteration/word) and corresponding adjustments to the input
means volume made after each word is spoken. Nine iterations is an
arbitrary value which has provided satisfactory results. A fewer or
greater number of iterations can be used, depending upon the voice
recognition system and/or voice recognition application for which the
present invention is being used. Although the user is prompted for
nine different words, the present invention is not “recognizing”
(matching) the words; instead it is detecting and analyzing the input
signal amplitude levels.
The iterative process of the present invention begins by prompting the
user with the first word in block 12, as previously described.
Processing continues as previously described until block 22 is
reached. In this iterative method, the input volume control of the
input means is initially set to a 50 percent level, halfway between
the maximum volume setting and the minimum volume setting of the input
means. The subsequent adjustments to the input volume control vary for
each of the nine words as shown in Table 1:

TABLE 1

Word No.    Percent Volume is Adjusted
[table values not legible in source]
Table 1 shows the percent upward or downward that the volume control
of the input means can be adjusted for each word of the nine for which
the user is prompted to speak. The iterations shown in Table 1 are
chosen to achieve a result within an acceptable range of two percent
between the input signal amplitude level and the preselected reference
signal amplitude level. This assumes that the user is speaking at a
fairly consistent level for each of the nine words for which the user
is prompted. Each iteration may result in a corresponding adjustment
made in CODEC 56 which contains the input volume control, in a
preferred embodiment.
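The comparison of block 20 and the adjustment of block 22 can be sketched in Python as below. The sketch makes several assumptions that are not stated in the patent: levels and volume are expressed on a 0 to 100 scale, the two percent range is interpreted as relative to the reference level, and the function names and the `step` argument (the per-word adjustment that Table 1 would supply) are hypothetical.

```python
def compare_level(input_level: float, reference_level: float,
                  tolerance: float = 0.02) -> int:
    """Block 20 sketch: return +1 to raise the volume, -1 to lower it,
    or 0 if the input level is within the acceptable range.

    `tolerance` is a fraction of the reference level (0.02 = two percent);
    treating the range as relative is an assumption of this sketch.
    """
    if abs(input_level - reference_level) <= tolerance * reference_level:
        return 0
    return 1 if input_level < reference_level else -1


def adjust_volume(volume: float, direction: int, step: float,
                  lo: float = 0.0, hi: float = 100.0) -> float:
    """Block 22 sketch: move the volume by `step` in `direction`,
    clamped to the range of the volume control."""
    return min(hi, max(lo, volume + direction * step))
```

For example, with an input level of 50 and a reference of 38, `compare_level` returns -1, and `adjust_volume(50, -1, 10)` lowers the setting to 40.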
The operation of the iterative process of the present invention can be
understood by the following example. Because the volume control of the
input means is initially set to 50 percent for the first word spoken,
correspondingly, the input signal amplitude level of the first word
spoken by the user is given a value of 50 percent. When compared to
the preselected reference signal amplitude level, the preselected
reference signal amplitude level is thus given a value relative to the
50 percent level assigned to the input signal amplitude level of the
first word. For this example, assume that the preselected reference
signal amplitude level is at 38 percent, relative to the 50 percent
value assigned to the input signal amplitude level.
On pass one through process 10, the comparison carried out in block 20
would indicate that the input signal amplitude level of the first word
is higher than the preselected reference signal amplitude level
(50>38). Looking at Table 1, the input volume control would be
adjusted downward 10 percent, adjusting the input signal amplitude
level for the first spoken word to a value of 40. Since this is the
first of nine iterations, block 26 would determine that there are more
words to prompt the user and return process control to block 12. This
determination can be made by initializing an iteration or word counter
with the first word prompted and incrementing the counter with each
additional prompt. When the counter equals the total of the words to
be prompted, it is complete.
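The counter-driven loop described above can be sketched as follows. This Python example is a simplified model under stated assumptions, not the patented implementation: `measure` is a hypothetical callback standing in for prompting the user and measuring the resulting input signal amplitude level at the current volume, `steps` is a caller-supplied list of per-word adjustment sizes (the actual Table 1 values are not reproduced here), and the two percent range is treated as relative to the reference level.

```python
from typing import Callable, List, Sequence, Tuple


def calibrate(measure: Callable[[float], float],
              steps: Sequence[float],
              reference: float,
              start_volume: float = 50.0,
              tolerance: float = 0.02) -> Tuple[float, List[float]]:
    """Run one comparison and adjustment per entry in `steps`.

    Returns the final volume and the volume after each iteration.
    """
    volume = start_volume
    history: List[float] = []
    count = 0  # word counter, incremented once per prompt
    while count < len(steps):
        level = measure(volume)  # blocks 12-16: prompt and measure
        # Block 20: compare against the reference level.
        if abs(level - reference) <= tolerance * reference:
            direction = 0
        elif level < reference:
            direction = 1
        else:
            direction = -1
        # Block 22: adjust the volume control, clamped to 0..100.
        volume = min(100.0, max(0.0, volume + direction * steps[count]))
        history.append(volume)
        count += 1  # block 26: stop once every word has been prompted
    return volume, history
```

Calling `calibrate(lambda v: v, [10, 10, 5, 2, 1, 1, 1, 1, 1], reference=38.0)` models a user whose measured level simply tracks the volume setting, as in the worked example; the step list is purely illustrative. The first pass then lowers the volume from 50 to 40, matching the first adjustment described in the text.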
The user would be prompted with a second word which would pass through
system 10 for comparison in block 20. Block 20 would again provide a
signal that the input signal amplitude level (now for the second word)
is greater than the preselected signal amplitude level (40>38).
Looking at Table 1, for the second word, the input volume control
would be adjusted down another 10 percent adjusting the input signal
amplitude l
