(12) Patent Application Publication (10) Pub. No.: US 2007/0086764 A1
Konicek (43) Pub. Date: Apr. 19, 2007

(54) USER-FRIENDLIER INTERFACES FOR A CAMERA

(76) Inventor: Jeffrey C. Konicek, Tolono, IL (US)

Correspondence Address:
DOUGLAS W. RUDY
LAW OFFICES OF DOUGLAS W. RUDY, LLC
401 N. MICHIGAN AVENUE
SUITE 1200
CHICAGO, IL 60611 (US)

(21) Appl. No.: 11/163,391

(22) Filed: Oct. 17, 2005

Publication Classification

(51) Int. Cl.
G03B 17/00 (2006.01)
(52) U.S. Cl. ................................................................ 396/56
(57) ABSTRACT

A system and method is disclosed for enabling user friendly
interaction with a camera system. Specifically, the inventive
system and method has several aspects to improve the
interaction with a camera system, including voice recognition,
gaze tracking, touch sensitive inputs and others. The
voice recognition unit is operable for, among other things,
receiving multiple different voice commands, recognizing
the vocal commands, associating the different voice commands
to one camera command and controlling at least some
aspect of the digital camera operation in response to these
voice commands. The gaze tracking unit is operable for,
among other things, determining the location on the viewfinder
image that the user is gazing upon. One aspect of the
touch sensitive inputs provides that the touch sensitive pad
is mouse-like and is operable for, among other things,
receiving user touch inputs to control at least some aspect of
the camera operation. Another aspect of the disclosed invention
provides for gesture recognition to be used to interface
with and control the camera system.
`
[Front-page drawing: functional block diagram of the camera system, showing the camera controller coupled to a voice recognition unit, camera CCD, viewfinder, microphones 72a-72c, viewfinder sensors, buttons, wink detector, touch pad device, remote light sensor, storage media R/W, LCD display, other camera control, AF motor, and zoom motor]
`
`
`
[Sheet 1 of 4, FIG. 1: front and rear perspective views of the camera system]
`
`
`
[Sheet 2 of 4, FIG. 2: automatic front/rear microphone selection circuitry feeding the voice recognition unit; FIG. 4: wink detector at the shutter button or touch pad on the camera top]
`
`
`
[Sheet 3 of 4, FIG. 3: functional block diagram of the camera system]
`
`
`
`
`
`
`
`
`
`
[Sheet 4 of 4, FIG. 5a: illustrative touchpad overlay with cutouts, round digit pattern with center activation pattern; FIG. 5b: camera or cell phone joystick pattern]
`
`
`
`
USER-FRIENDLIER INTERFACES FOR A CAMERA

BACKGROUND OF THE INVENTION
`
[0001] Digitally-based and film-based cameras abound
and are extremely flexible and convenient. One use for a
camera is in the taking of self portraits. Typically, the user
frames the shot and places the camera in a mode whereby,
when the shutter button is depressed, the camera waits a
predetermined time so that the user may incorporate himself
back into the shot before the camera actually takes the
picture. This is cumbersome and leads to nontrivial problems.
Sometimes the predetermined delay time is not long
enough; other times, it may be too long. For participants
who are in place and ready to have their picture taken,
especially children, waiting with a smile on their face for the
picture to be snapped by the camera can seem endless even
if it is just a few seconds long. Additionally, many who
might like to be included in a shot cannot be, because
they have to take the picture and it is simply too much
trouble to set up a shutter-delayed photograph.
`
[0002] Voice recognition techniques are well known in the
art and have been applied to cameras, see for example, U.S.
Pat. Nos. 4,951,079, 6,021,278 and 6,101,338 which are
herein incorporated by reference. It is currently possible to
have fairly large vocabularies of uttered words recognized
by electronic devices. Speech recognition devices can be of
a type whereby they are trained to recognize a specific
person's vocalizations, so called speaker dependent recognition,
or can be of a type which recognizes spoken words
without regard to who speaks them, so called speaker
independent recognition. Prior art voice operated cameras
have several defects remedied or improved upon by various
aspects of the present invention more fully disclosed below.
One such problem is that in self portrait mode, the camera
may snap the picture while the user is uttering the command.
Another defect is that the microphone coupled to the voice
recognition unit is usually mounted on the back of the
camera. This placement is non-optimal when the user is in
front of the camera, as when taking a self portrait. Still
another problem with prior art voice activated cameras is
that they associate one vocalization or utterance to one
camera operation. Thus, the user must remember which
command word is to be spoken for which camera operation.
This is overly constraining, unnatural, and significantly
reduces the utility of adding voice recognition to the camera.
`
[0003] One prior art implementation of voice recognition
allows for menu driven prompts to help guide the user
through the task of remembering which command to speak
for which camera function. This method, however, requires
that the user be looking at the camera's dedicated LCD
display for the menu. One aspect of the present invention
provides for the menus to be displayed in the electronic view
finder of the camera and be manipulated with both voice and
gaze. Another aspect of the present invention incorporates
touchpad technology, which is typically used in laptop
computers and well known in the art, as the camera input
device for at least some functions.
`
SUMMARY OF THE INVENTION

[0004] A self-contained camera system, according to various
aspects of the present invention, includes voice recognition
wherein multiple different vocalizations can be recognized
and wherein some such recognized vocalizations
can be associated with the same camera command. Another
aspect of the invention provides for multiple microphones
disposed on or in the camera system body and operable
so that the user can be anywhere around the camera system
and be heard by the camera system equally well. According
to other aspects of the present invention, the camera system
viewfinder includes gaze tracking ability and, in exemplary
preferred embodiments, gaze tracking is used alone or in
combination with other aspects of the invention to, for
example, manipulate menus, improve picture taking speed,
or improve the auto focus capability of the camera. Other
aspects of the present invention, such as the addition of
touchpad technology and gesture recognition, provide for an
improved and more natural user interface to the camera
system.
`
[0005] Thus, it is an object of the invention to provide an
improved self-portrait mode for a camera system. It is
further an object of the invention to provide an improved
user interface for a camera system. It is yet a further object
of the invention to make a camera system more user friendly
with a more natural and intuitive user interface. It is still a
further object of the invention to broaden the capabilities of
the camera system. It is further an object of the invention to
more easily allow a user to compose a shot to be taken by
the camera system. It is still further an object of the
invention to improve image quality of pictures taken by the
camera system. It is yet another object of the invention to
improve the speed of picture taking by the camera system.
`
BRIEF DESCRIPTION OF THE DRAWINGS

[0006] FIG. 1 is an exemplary perspective view of the
front and rear (back) of the camera system according to
various aspects of the present invention.

[0007] FIG. 2 is a functional representation of automatic
microphone selection circuitry that may be used in various
aspects of the present invention.

[0008] FIG. 3 shows an exemplary functional block diagram
of an inventive camera system implementing various
aspects of the present invention.

[0009] FIG. 4 shows an exemplary embodiment of a wink
detector according to various aspects of the present invention.

[0010] FIG. 5 shows exemplary touchpad overlays with
cutouts according to various aspects of the present invention.
`
DESCRIPTION OF PREFERRED EXEMPLARY EMBODIMENTS
`
[0011] One aspect of the present invention solves several
of the problems of the prior art voice recognition cameras in
that this aspect provides for more than one microphone to be
the source to the recognition unit. With reference to FIG. 1,
this aspect of the present invention provides for at least two
microphones to be used, one microphone, 10b, placed on the
back of the camera and one microphone, 10a, placed on the
front, either of which can receive voice commands. In a first
preferred embodiment of this aspect of the invention, a
detection device determines which microphone is to be used
as the input to the recognition unit based upon the strength
of the voice signal or sound level received by each of the
microphones. In another preferred embodiment, the outputs
of the microphones are combined as the input to the voice
recognition unit. In still another embodiment, the user can
select which microphone is used as the input to the voice
recognition unit, for example, by a switch or by selection
through a camera menu.
`
[0012] Automatic microphone selection is preferred and,
with reference to FIG. 2, microphones 10a and 10b are each
amplified by amplifiers 20 and 22 respectively. Diode 24,
capacitor 28 and resistor 32 form a simple energy detector
and filter for microphone 10a. The output of this detector/
filter is applied to one side of a comparator, 36. Similarly,
diode 26, capacitor 30, and resistor 34 form the other energy
detector associated with microphone 10b. The output of this
filter/detector combination is also applied to comparator 36.
Thus, the output of this comparator selects which amplified
microphone output is passed to the voice recognition unit
through multiplexer 38, based on which amplified microphone
output contains the greatest energy.
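The comparator selection described above can also be mirrored in software. The following is a hypothetical sketch, not the patented circuit: the function and variable names are illustrative, and the louder microphone is chosen by comparing windowed signal energy.

```python
def rms_energy(samples):
    """Root-mean-square energy of one microphone's sample window."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def select_microphone(front_samples, back_samples):
    """Return which microphone's signal carries more energy,
    mirroring the comparator/multiplexer selection of FIG. 2."""
    if rms_energy(front_samples) >= rms_energy(back_samples):
        return "front"
    return "back"
```

The selected label would then gate which amplified stream is handed to the voice recognition unit, analogous to multiplexer 38.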
`
[0013] In yet another novel embodiment of this aspect of
the invention, the multiple microphones are preferably associated
with multiple voice recognition units or, alternatively,
with different voice recognition algorithms well known in the
art. The outputs of these multiple voice recognition units or
different voice recognition algorithms are then coupled to
the camera controller (FIG. 3 element 40). The camera
controller preferably selects one of these outputs as being
the camera controller's voice recognition input. Alternatively,
the camera controller accepts the outputs of all the
voice recognition units or algorithms and preferably uses a
voting scheme to determine the most likely recognized
command. This would obviously improve recognition rates
and this aspect of the invention is contemplated to have
utility beyond camera systems including, by way of example
and not limitation, consumer computer devices such as PCs
and laptops; portable electronic devices such as cell phones,
PDAs, iPods, etc.; entertainment devices such as TVs,
video recorders, etc.; and other areas.
`
[0014] To illustrate this embodiment using the example of
the camera system having microphones on its frontside and
backside given above, each of these microphones is coupled
to a voice recognition unit. When an utterance is received,
each voice recognition unit recognizes the utterance. The
camera controller then selects which voice recognition unit's
recognition to accept. This is preferably based on the energy
received by each microphone, using circuitry similar to FIG.
2. Alternatively, the selection of which voice recognition unit
to use would be a static selection. Additionally, both recognizers'
recognition would be considered by the camera
controller with conflicting results resolved by voting or
using ancillary information (such as microphone energy
content).
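The voting scheme with an energy tie-break might look like the following sketch. All names are illustrative assumptions; the specification does not prescribe this exact logic.

```python
from collections import Counter

def resolve_command(recognized, energies=None):
    """Pick the most likely command from several recognizers' outputs.

    Majority vote over the recognized strings; a tie is broken by
    trusting the recognizer fed by the loudest microphone when the
    ancillary per-microphone energies are supplied.
    """
    votes = Counter(recognized)
    ranked = votes.most_common()
    best_count = ranked[0][1]
    tied = [cmd for cmd, n in ranked if n == best_count]
    if len(tied) == 1 or energies is None:
        return tied[0]
    # Tie-break by ancillary information: highest microphone energy.
    loudest = max(range(len(recognized)), key=lambda i: energies[i])
    candidate = recognized[loudest]
    return candidate if candidate in tied else tied[0]
```

For example, if two recognizers report "snap" and one reports "shoot", the controller accepts "snap"; a two-way tie defers to the louder microphone.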
`
[0015] An embodiment using multiple algorithms preferably
has one voice recognition algorithm associated with the
frontside microphone and a different voice recognition
algorithm associated with the backside microphone. Preferably,
the voice recognition algorithm associated with the
frontside microphone is adapted to recognize vocalizations
uttered from afar (owing to this microphone probably being
used in self-portraits), while the voice recognition algorithm
associated with the backside microphone is optimal for
closely uttered vocalizations. Selection of which algorithm
is to be used as the camera controller input is preferably as
above. Alternatively, as above, the selection would be by
static selection, or both applied to the camera controller and
a voting scheme used to resolve discrepancies. While the
above example contemplates using different voice recognition
algorithms, there is no reason this must be so. The same
algorithms could also be used, in which case this example
functions the same as multiple voice recognition units.
`
[0016] It is further contemplated in another aspect of the
invention that the voice recognition subsystem be used in
conjunction with the photograph storing hardware and software.
In a preferred use of this aspect of the invention, the
user utters names to be assigned to the photographs during
storage and, later, utters them again for recall of the stored
image. Thus, according to this aspect of the present invention,
a stored photograph can be recalled for display simply
by uttering the associated name of the photograph. The name
association is preferably by direct association, that is, the
name stored with the picture. In a second preferred embodiment,
the photograph storage media contains a secondary
file managed by the camera system and which associates the
given (i.e., uttered) name with the default file name assigned
by the camera system's storage hardware and/or software to
the photograph when the photograph is stored on the storage
media. According to the second embodiment, when a photograph
is to be vocally recalled for viewing, the camera
system first recognizes the utterance (in this case, the name)
which will be used to identify the picture to be recalled. The
camera system then scans the association file for the name
which was uttered and recognized. Next, the camera system
determines the default name which was given to the photograph
during storage and associated with the user-given
name (which was uttered and recognized) in the association
file. The camera system then recalls and displays the photograph
by this associated default name.
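The secondary association file of the second embodiment amounts to a lookup from the spoken name to the default file name assigned at storage time. This is an illustrative model under assumed names, not the patented implementation.

```python
class PhotoNameIndex:
    """Sketch of the secondary association file: maps a spoken,
    recognized name to the default file name the storage
    subsystem assigned to the photograph."""

    def __init__(self):
        self._by_spoken_name = {}

    def store(self, spoken_name, default_file_name):
        # Record the association when the photograph is stored.
        self._by_spoken_name[spoken_name.lower()] = default_file_name

    def recall(self, recognized_name):
        """Return the default file name to load for display, or None
        if no photograph was given that name."""
        return self._by_spoken_name.get(recognized_name.lower())
```

Recall then proceeds exactly as the paragraph describes: recognize the utterance, look it up, and display the file found under the associated default name.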
`
[0017] In another preferred embodiment, the voice recognition
subsystem of the improved camera system recognizes
at least some vocalized letters of the alphabet and/or numbers
so that the user may assign names to pictures simply by
spelling the name by vocalizing letters and/or numbers.
Another aspect of the invention provides that stored photographs
be categorized on the storage media through use of
voice-recognized utterances being used to reference and/or
create category labels and that, additionally, the recognizer
subsystem preferably recognize key words for manipulating
the stored pictures. For instance, according to this aspect of
the invention, the inventive camera system would recognize
the word "move" to mean that a picture is to be moved to or
from a specific category. More specifically, "move, Christmas"
would indicate that the currently referenced photograph
is to be moved to the Christmas folder. An alternative
example is "John move new year's", indicating that the
picture named John (either directly named or by association,
depending on embodiment) be moved to the folder named
"New Year's". It is further contemplated that the folder
names may be used for picture delineation as well. For
instance, the picture "John" in the Christmas folder is not the
same as the picture "John" in the Birthday folder, and the
former may be referenced by "Christmas, John" while the
latter is referenced by "Birthday, John".
`
[0018] Another aspect of the present invention provides
that the voice recognition camera system be capable of
associating more than one vocal utterance or sound with a
single command. The different utterances are contemplated
to be different words, sounds or the same word under
demonstrably different conditions. As an example, the voice
recognition camera system of this aspect of the present
invention allows the inventive camera system to understand,
for example, any of "shoot", "snap", "cheese", and a whistle
to indicate to the camera system that a picture is to be taken.
In another example, perhaps the phrase and word "watch the
birdie" and "click" instruct the camera to take the picture. It
is further envisioned that the user select command words
from a predetermined list of the camera command words and
that he then select which words correspond to which command.
It is alternatively envisioned that the association of
multiple recognizable words to camera commands may also
be predetermined or preassigned. In another alternate
embodiment, the inventive camera system allows the user to
teach the camera system which words to recognize and also
inform the camera system as to which recognized words to
associate with which camera commands. There are obviously
other embodiments for associating recognized vocalizations
to camera commands and the foregoing embodiments
are simply preferred examples.
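The many-utterances-to-one-command association described above amounts to a many-to-one lookup table. A hypothetical sketch follows; the command names and vocabulary are illustrative, not taken from the specification.

```python
# Illustrative many-to-one mapping of recognized utterances to
# camera commands; the vocabulary could equally be user-taught.
UTTERANCE_TO_COMMAND = {
    "shoot": "TAKE_PICTURE",
    "snap": "TAKE_PICTURE",
    "cheese": "TAKE_PICTURE",
    "watch the birdie": "TAKE_PICTURE",
    "click": "TAKE_PICTURE",
    "zoom in": "ZOOM_IN",
}

def dispatch(utterance):
    """Map a recognized utterance to a camera command, or None
    when the utterance is not in the command vocabulary."""
    return UTTERANCE_TO_COMMAND.get(utterance.lower())
```

User teaching, as in the alternate embodiment, would simply add or reassign entries in this table at run time.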
[0019] In another embodiment of this aspect of the present
invention, the user has his uttered commands recognized
under demonstrably different conditions and recognized as
being different utterances. For instance, according to this
aspect of the invention, the voice operated camera system
operates so that it understands commands vocalized close to
the camera (as if the user is taking the picture in traditional
fashion with the camera back to his face) and significantly
farther away (as if the user is taking a self portrait picture
and is part of the shot, and thus has to vocalize loudly to the
front of the camera). For this illustration, in a preferred
embodiment the user teaches the words to the camera under
the different conditions anticipated. For example, the user
would teach the camera system by speaking the word "snap"
close to the camera and inform the camera that this is a
picture taking command, and would then stand far from the
camera and say "snap", thus teaching another utterance, and
instruct the camera that this is also a picture taking command.
These two different utterances of the same word
under different conditions would be stored and recognized as
different utterances. This aspect of the invention contemplates
that the words vocalized and/or taught need not be the
same word and, as illustrated above, different words would
also be considered different utterances as well.
`
[0020] Since voice recognition is not always 100 percent
accurate, another aspect of the present invention contemplates
that the camera system or a remote device, or both,
preferably provide an indication that a voice command was
or was not understood. Thus, using the self portrait example
above, if the user vocalizes the command to take a picture
but the camera system does not properly recognize the
vocalization as being something it understands, the camera
system would beep, or light an LED, etc., to indicate its
misrecognition. Because of the relatively small number of
anticipated camera commands, and allowing for multiple
vocalizations to command the same action, it is expected
that the recognition rates will be quite high and fairly
tolerant of extraneous noise without necessarily resorting to
the use of a highly directional or closely coupled (to the
user's mouth) microphone, though the use of such devices is
within the scope of the invention.
`
[0021] It is anticipated that the user of the inventive
camera system may be too far away from the camera system
for the camera system to recognize and understand the user's
vocalizations. Thus, another aspect of the invention provides
that the camera is equipped with a small laser sensor (FIG.
1 element 18) or other optically sensitive device such that
when a light of a given frequency or intensity, or having a
given pulse sequence encoded within it, is sensed by the
camera system equipped with the optically sensitive device,
the camera system immediately, or shortly thereafter (to give
the user time to put the light emitting device down or
otherwise hide it, for example), takes a picture. The light
emitting device is preferably a laser pointer or similar, stored
within the camera housing when not needed so as to not be
lost when not in use. Additionally, the light emitting device's
power source would preferably be recharged by the camera
system's power source when so stored. In another embodiment,
it is also contemplated that the light emitting device
may be housed in a remotely coupled display, which is
disclosed below. The light emitting device preferably
includes further electronics to regulate the emitted light
intensity or to encode a predetermined pulse sequence
(on-off pulses for example) or otherwise onto the emitted
light, all of which techniques are well known in the art,
which the camera system of this aspect of the present
invention would receive and recognize by methods well
known in the art.
`
[0022] Another aspect of the present invention provides
for there being a predetermined delay introduced between
recognizing a voice command and the camera actually
implementing the command. This aspect of the invention
allows time, for example, for the user to close his mouth or
for others in a self-portrait shot to settle down quickly before
the picture is actually taken. In a first preferred embodiment
of this aspect of the invention, the delay is implemented
unconditionally for at least the picture taking command. In
a second preferred embodiment of this aspect of the invention,
the delay introduced is dependent upon from where the
command came relative to the camera system. For instance,
if the camera system recognized the command as coming
from the frontside microphone, delay is used, but if the
command comes from the backside microphone, then no
delay is implemented. The simple energy detection circuitry
of FIG. 2, described above, is easily adapted for this function.
In an alternative embodiment, implementation of the delay
is dependent upon the location of the microphone due to the
orientation of the flip-up or swivel LCD display when the
microphone is attached to the LCD display (FIG. 1, element
120). For example, if the microphone in the display subhousing
is oriented forward relative to the camera body then
delay is implemented; if the microphone is not oriented
forward then no delay is introduced. Determining the orientation
of this microphone relative to the camera body is
known in the art and would typically be done with switches
or other sensor devices. Another preferred embodiment of
this aspect of the invention implements the delay for only
certain commands, such as the command to take a picture.
In yet another preferred embodiment, whether the delay is
implemented at all is selectable by the user.
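The conditional delay logic of this aspect can be sketched as follows. The two-second figure, the command and microphone labels, and the user-selectable switch are illustrative assumptions, not values from the specification.

```python
def shutter_delay_s(command, source_mic, delay_enabled=True):
    """Predetermined delay (seconds) between recognizing a command
    and executing it.

    Illustrative policy: delay only the picture-taking command, and
    only when it arrived via the front microphone (the self-portrait
    case), with the whole behavior user-selectable.
    """
    if not delay_enabled:
        return 0.0
    if command == "TAKE_PICTURE" and source_mic == "front":
        # Give the speaker time to close his mouth and the
        # group time to settle before the exposure.
        return 2.0
    return 0.0
```

The `source_mic` argument would come from the same energy-detection selection used for microphone routing.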
[0023] Another aspect of the present invention provides
that the camera LCD display (FIG. 1, element 14) employs
touch sensitive technology. This technology is well known
in the computer art and can be any of resistive, capacitive,
RF, etc., touch technology. This aspect of the present invention
allows the user to interact with menus, features and
functions displayed on the LCD display directly rather than
through ancillary buttons or cursor control. For those
embodiments of touch technology requiring use of a stylus,
it is further contemplated that the camera body house the
stylus for easy access by the user.
`
[0024] According to another aspect of the present invention,
it is envisioned that the current dedicated LCD display
(FIG. 1, element 14) incorporated on a digital camera be
made to be removable and be extendable from the camera by
cable, wireless, optical, etc. interconnection with the camera.
In one embodiment, this remote LCD would be wire-coupled
to receive display information from the digital
camera through a pluggable port. In another embodiment,
the remote LCD would be wirelessly coupled to the digital
camera through any of several technologies well understood
in the art including, by way of example only, Bluetooth,
WIFI (802.11 a/b/g/n), wireless USB, FM, optical, etc. In
another embodiment of this aspect of the invention, the
remotely coupled display would serve the dual purpose of
being a remote input terminal to the camera system in
addition to being a dedicated display for the camera system.
Preferably, as mentioned earlier, the display is touch sensitive
using any of the touch sensitive technology well understood
in the art such as the resistive, capacitive, RF, etc.,
methods mentioned above. Touch commands input by the
user would be coupled back to the camera system as needed.
It is also contemplated that the remote display house the
stylus if one is required.
`
[0025] In another preferred embodiment, the remotely
coupled display has buttons on it to control the camera
system. In another embodiment, the remotely coupled display
contains the microphone for receiving the voice commands
of the user, digitizing the received voice, analyzing
and recognizing the vocalization locally and sending a
command to the camera system. In another preferred
embodiment, the remotely coupled display containing the
microphone simply digitizes the vocalization received by the
microphone and transmits the digitized vocalization to the
camera system for recognition of the vocalization by the
camera system itself. In all embodiments of the wireless
remote display, it is preferred that the display contain its own
power source, separate from the power source of the camera.
It is also contemplated that the display's separate power
source may be coupled to the camera's power source when
the display is 'docked' to the camera so that both may share
power sources or so that the camera's power source may
recharge the display's power source.
`
[0026] According to another aspect of the present invention,
the electronic view finder (EVF) typically used on
modern digital cameras includes a gaze tracking capability
which is well known in the art, see for example U.S. Pat. No.
6,758,563 to Levola which is herein incorporated by reference.
In this aspect of the present invention, menus typically
used for user interface to the camera are electronically
superimposed in the image in the EVF. The gaze tracker
subsystem is operable for determining the area or approximate
location of the viewfinder image at which the user is
gazing. Thus, by the user looking at different areas of the
EVF image, the gaze tracker subsystem informs the camera
system so that a mouse-like pointer or cursor is moved by
the camera system to the area of the EVF image indicated by
the gaze tracking device to be the area the user is viewing.
Preferably, the user then speaks a command to indicate his
selection of the item pointed to by the pointer image.
Alternatively, the user may indicate through other methods
that this is his selection, such as staring at a position in the
image for a minimum predetermined time or pressing a
button, etc. As an example, the EVF displays icons for flash,
shutter speed, camera mode, etc. (alone or superimposed on
the normal viewfinder image). By gazing at an icon, a small
compositely rendered arrow, cursor, etc., in the EVF image
is caused by the gaze tracker subsystem to move to point to
the icon at which the user is determined to be gazing by the
gaze tracking subsystem, for instance, the camera mode icon
as an example here. Preferably, the user then utters a
command which is recognized by the camera system as
indicating his desire to select that icon, for example, "yes"
or "open".
[0027] Alternatively, the icon is selected by the user
gazing at the icon for some predetermined amount of time.
When the icon is selected by whatever method, the EVF
image shows a drop down menu of available camera modes,
for example, portrait, landscape, fireworks, etc. The user,
preferably, then utters the proper command word from the
list, or he may optionally gaze down the list at the mode he
desires, whereupon the gaze tracker subsystem directs that
the pointer or cursor in the EVF image move to the word
and, preferably highlighting it, indicate that this is what the
camera system thinks the user wants to do. The user, preferably,
then utters a command indicating his acceptance or
rejection of that mode in this example, such as 'yes' or 'no'.
If the command uttered indicates acceptance, the camera
system implements the command; if the command indicates
rejection of the selected command, the camera system
preferably moves the pointer to a neighboring command. To
leave a menu, the user may utter 'end' to return to the menu
above or 'home' to indicate the home menu. Preferably, the
user can also manipulate the pointer position by uttering
commands such as "up", "down", "left" and "right" to
indicate relative cursor movement. In this way, the user
interacts with the camera in the most natural of ways,
through sight and sound cooperatively. While the above
example used the preferred combination of gaze and voice
recognition, it is contemplated that gaze tracking be combined
with other input methods such as pushing buttons (like
a mouse click), or touch input disclosed below, or gesture
recognition disclosed below, etc., as examples.
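The relative cursor commands ("up", "down", etc.) amount to clamped index arithmetic over the displayed menu. A minimal illustrative sketch, with names that are assumptions rather than anything from the specification:

```python
def move_selection(index, command, menu_length):
    """Move the highlighted menu entry by one step for a recognized
    relative command ('up'/'down'), clamped to the menu bounds;
    unrecognized commands leave the selection unchanged."""
    step = {"up": -1, "down": 1}.get(command, 0)
    return max(0, min(menu_length - 1, index + step))
```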
[0028] Another application of this aspect of the invention
uses gaze tracking to assist the auto focus (AF) capability of
the prior art camera. AF generally has two modes: one mode
uses the entire image, center weighted, to determine focus;
another mode allows different areas of the image to have
greater weight in determining focus. In the second mode, the
user typically pre-selects the area of the framed image that
he wishes to be over-weighted by the AF capability. This is
cumbersome in that the user must predict where he wants the
weighting to be ahead of time. Thus, this embodiment of this
aspect of the invention provides that the gaze tracker subsystem
inform the AF capability of the camera system as to
the location of the image at which the user is gazing and that the
AF capability use this information to weight this area of the
image when determining focus. It is contemplated that the
AF system may only provide for discrete areas of the image
to be so weighted and, in this case, preferably, the AF
capability selects the discrete area of the image closest to
that being gazed upon.
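Selecting the discrete AF area closest to the gazed-upon point is a nearest-neighbor choice over the zone centers. This sketch assumes normalized image coordinates and illustrative names; it is not the patented AF implementation.

```python
def nearest_af_zone(gaze_xy, zone_centers):
    """Return the index of the discrete AF zone whose center is
    closest to the gaze point (squared Euclidean distance)."""
    gx, gy = gaze_xy
    return min(
        range(len(zone_centers)),
        key=lambda i: (zone_centers[i][0] - gx) ** 2
                    + (zone_centers[i][1] - gy) ** 2,
    )
```

The returned zone index would then be the area over-weighted when the camera determines focus.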
`
[0029] Another embodiment of this aspect of the invention
uses the gaze tracker to enable the flash of the camera
system. Flash is commonly used to "fill" dimly lit photographic
scenes but sometimes this is not warranted. Other
times, it is desired to have "fill" flash because the area of the
scene desired is dark but the rest of the scene is quite bright
(taking a picture in shade for example) and the camera does
not automatically provide "fill" flash because the overall
image is bright enough. Typically, the amount of "fill" flash
the camera will give is determined by the camera measuring
the brightness of the scene. The inventive camera system
with gaze tracking is used to enhance the prior art method of
determining the desire and amount of "fill" flash in that the
inventive camera system gives more weight, in determining
the scene brightness, to the area of the scene indicated by the
gaze tracker as being gazed upon.
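The gaze-weighted brightness measurement might be modeled as a weighted mean over metering regions, with extra weight on the gazed-upon region. The weight value and names here are illustrative assumptions.

```python
def weighted_scene_brightness(region_brightness, gazed_region,
                              gaze_weight=3.0):
    """Mean scene brightness with extra weight on the region the
    gaze tracker reports as being gazed upon; the result would
    feed the fill-flash decision."""
    total = weight_sum = 0.0
    for i, brightness in enumerate(region_brightness):
        w = gaze_weight if i == gazed_region else 1.0
        total += w * brightness
        weight_sum += w
    return total / weight_sum
```

In the shade example above, gazing at the dark region pulls the measured brightness down, so the camera is more likely to decide fill flash is warranted.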
[0030] Another aspect of the present invention adds touchpad
technology to