(12) United States Patent
Cohen et al.

US006681031B2

(10) Patent No.: US 6,681,031 B2
(45) Date of Patent: *Jan. 20, 2004
(54) GESTURE-CONTROLLED INTERFACES FOR SELF-SERVICE MACHINES AND OTHER APPLICATIONS
(75) Inventors: Charles J. Cohen, Ann Arbor, MI (US); Glenn Beach, Ypsilanti, MI (US); Brook Cavell, Ypsilanti, MI (US); Gene Foulk, Ann Arbor, MI (US); Charles J. Jacobus, Ann Arbor, MI (US); Jay Obermark, Ann Arbor, MI (US); George Paul, Ypsilanti, MI (US)

(73) Assignee: Cybernet Systems Corporation, Ann Arbor, MI (US)
(*) Notice: This patent issued on a continued prosecution application filed under 37 CFR 1.53(d), and is subject to the twenty-year patent term provisions of 35 U.S.C. 154(a)(2).

Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 09/371,460

(22) Filed: Aug. 10, 1999

(65) Prior Publication Data

US 2003/0138130 A1   Jul. 24, 2003
Related U.S. Application Data

(60) Provisional application No. 60/096,126, filed on Aug. 10, 1998.

(51) Int. Cl.7 ................................ G06K 9/00
(52) U.S. Cl. ............. 382/103; 382/209; 701/45; 345/473; 345/474
(58) Field of Search ......... 382/103, 107, 168, 153, 154, 117, 118, 170, 181, 190, 209, 219, 276; 701/45; 348/169, 170, 171, 172
(56) References Cited

U.S. PATENT DOCUMENTS

5,047,952 A     9/1991   Kramer et al. ............. 364/513.5
5,423,554 A     6/1995   Davis ..................... 273/437
5,454,043 A *   9/1995   Freeman ................... 382/168
5,481,454 A     1/1996   Inoue et al. .............. 364/419

(List continued on next page.)
OTHER PUBLICATIONS

C. Cohen, G. Beach, G. Paul, J. Obermark, G. Foulk, "Issues of Controlling Public Kiosks and other Self Service Machines using Gesture Recognition," Oct. 1998.

(List continued on next page.)
Primary Examiner-Jayanti K. Patel
Assistant Examiner-Abolfazl Tabatabai
(74) Attorney, Agent, or Firm-Gifford, Krass, Groh, Sprinkle, Anderson & Citkowski, PC
(57) ABSTRACT

A gesture recognition interface for use in controlling self-service machines and other devices is disclosed. A gesture is defined as motions and kinematic poses generated by humans, animals, or machines. Specific body features are tracked, and static and motion gestures are interpreted. Motion gestures are defined as a family of parametrically delimited oscillatory motions, modeled as a linear-in-parameters dynamic system with added geometric constraints to allow for real-time recognition using a small amount of memory and processing time. A linear least squares method is preferably used to determine the parameters which represent each gesture. Feature position measurement is used in conjunction with a bank of predictor bins seeded with the gesture parameters, and the system determines which bin best fits the observed motion. Recognizing static pose gestures is preferably performed by localizing the body/object from the rest of the image, describing that object, and identifying that description. The disclosure details methods for gesture recognition, as well as the overall architecture for using gesture recognition to control devices, including self-service machines.
17 Claims, 19 Drawing Sheets
[Front-page figure: Gesture Recognition System Flow Chart. Gesture Generation → Vision System → Gesture Recognition → Translator → Multimedia Interface / Device Control / Virtual World Interaction.]
U.S. PATENT DOCUMENTS
5,544,050 A     8/1996   Abe et al. ................ 364/419
5,563,988 A    10/1996   Maes et al. ............... 395/121
5,570,301 A    10/1996   Barrus .................... 364/559
5,581,276 A    12/1996   Cipolla et al. ............ 345/156
5,594,469 A     1/1997   Freeman ................... 345/158
5,612,719 A     3/1997   Beernink et al. ........... 345/173
5,652,849 A     7/1997   Conway et al. ............. 395/327
5,659,764 A     8/1997   Sakiyama et al. ........... 395/753
5,668,573 A     9/1997   Favot et al. .............. 345/156
5,670,987 A     9/1997   Doi et al. ................ 345/156
5,699,441 A    12/1997   Sagawa et al. ............. 382/100
5,710,833 A     1/1998   Moghaddam et al. .......... 382/228
5,714,698 A     2/1998   Tokioka et al. ............ 73/865.4
5,732,227 A     3/1998   Kuzunuki et al. ........... 395/333
5,757,360 A     5/1998   Nitta et al. .............. 345/156
5,759,044 A     6/1998   Redmond ................... 434/307 R
5,767,842 A     6/1998   Korth ..................... 345/168
5,798,758 A     8/1998   Harada et al. ............. 345/339
5,801,704 A     9/1998   Oohara et al. ............. 345/358
5,813,406 A     9/1998   Kramer et al. ............. 128/782
5,828,779 A    10/1998   Maggioni .................. 382/165
5,864,808 A     1/1999   Ando et al. ............... 704/251
5,864,848 A     1/1999   Horvitz et al. ............ 707/6
5,875,257 A *   2/1999   Marrin et al. ............. 382/107
5,880,411 A     3/1999   Gillespie et al. .......... 178/18.01
5,887,069 A     3/1999   Sakou et al. .............. 382/100
5,889,236 A     3/1999   Gillespie et al. .......... 178/18.01
5,889,523 A     3/1999   Wilcox et al. ............. 345/357
5,898,434 A     4/1999   Small et al. .............. 345/348
5,901,246 A     5/1999   Hoffberg et al. ........... 382/209
5,903,229 A     5/1999   Kishi ..................... 341/20
5,907,328 A     5/1999   Brush II et al. ........... 345/358
5,907,852 A     5/1999   Yamada .................... 707/541
5,917,490 A     6/1999   Kuzunuki et al. ........... 345/351
5,990,865 A *  11/1999   Gard ...................... 345/156
6,035,053 A *   3/2000   Yoshioka et al. ........... 382/104
6,137,908 A *  10/2000   Rhee ...................... 382/187
6,272,231 B1 *  8/2001   Maurer et al. ............. 382/103
6,301,370 B1 * 10/2001   Steffens et al. ........... 382/103
6,335,977 B1 *  1/2002   Kage ...................... 382/107
OTHER PUBLICATIONS

L. Conway, C. Cohen, "Video Mirroring and Iconic Gestures: Enhancing Basic Videophones to Provide Visual Coaching and Visual Control," (no date available).

C. Cohen, L. Conway, D. Koditschek, G. Roston, "Dynamic System Representation of Basic and Non-Linear in Parameters Oscillatory Motion Gestures," Oct. 1997.

C. Cohen, L. Conway, D. Koditschek, "Dynamic System Representation, Generation, and Recognition of Basic Oscillatory Motion Gestures," Oct. 1996.

C. Cohen, G. Beach, B. Cavell, G. Foulk, J. Obermark, G. Paul, "The Control of Self Service Machines Using Gesture Recognition," Aug. 1999.

United States Air Force Instruction, "Aircraft Cockpit and Formation Flight Signals," May 1994.

U.S. Army Field Manual No. 21-60, Washington, D.C., Sep. 30, 1987.

Arnold, V.I., "Ordinary Differential Equations," MIT Press, 1978.

Cohen, C., "Dynamical System Representation, Generation and Recognition of Basic Oscillatory Motion Gestures and Applications for the Control of Actuated Mechanisms," Ph.D. Dissertation, Univ. of Michigan, 1996.

Frank, D., "HUD Expands Kiosk Program," Federal Computer Week, Mar. 8, 1999.

Hager, G., Chang, W., Morse, A., "Robot Feedback Control Based on Stereo Vision: Towards Calibration-Free Hand-Eye Coordination," IEEE Int. Conf. Robotics and Automation, San Diego, CA, May 1994.

Hauptmann, A., "Speech and Gestures for Graphic Image Manipulation," Computer Human Interaction 1989 Proc., pp. 241-245, May 1989.

Hirsch, M., Smale, S., "Differential Equations, Dynamical Systems and Linear Algebra," Academic Press, Orlando, FL, 1974.

Kanade, T., "Computer Recognition of Human Faces," Birkhauser Verlag, Basel and Stuttgart, 1977.

Karon, P., "Beating an Electronic Pathway to Government with Online Kiosks," Los Angeles Times, Aug. 25, 1996.

Link-Belt Construction Equipment Co., "Operating Safety: Cranes & Excavators," 1987.

Turk, M., Pentland, A., "Eigenfaces for Recognition," Journal of Cognitive Neuroscience, 3, 1, 71-86, 1991.

Narendra, K., Balakrishnan, J., "Improving Transient Response of Adaptive Control Systems Using Multiple Models and Switching," IEEE Trans. on Automatic Control, 39:1861-1866, Sep. 1994.

Rizzi, A., Whitcomb, L., Koditschek, D., "Distributed Real-Time Control of a Spatial Robot Juggler," IEEE Computer, 25(5), May 1992.

Wolf, C., Morrel-Samuels, P., "The Use of Hand-Drawn Gestures for Text Editing," Int. Journ. of Man-Machine Studies, vol. 27, pp. 91-102, 1987.

Wolf, C., Rhyne, J., "A Taxonomic Approach to Understanding Direct Manipulation," Jour. of the Human Factors Society 31st Annual Meeting, pp. 576-580.

Yuille, A., "Deformable Templates for Face Recognition," Journ. of Cognitive Neuroscience, 3, 1, 59-70, 1991.

* cited by examiner
[Sheet 1 of 19]
Figure 1: Gesture Recognition System. [A user gestures at a kiosk; gesture generation drives a multimedia interface.]
Figure 2: Gesture Recognition System Flow Chart. [Gesture Generation → Vision System → Gesture Recognition → Translator → Multimedia Interface / Device Control / Virtual World Interaction.]
[Sheet 2 of 19]
Figure 3: Signal Flow Diagram of the Gesture Recognition System. [G: Gesture Creation → S: Sensor Module (x, y position, velocity, image data) → I: Identification Module (identified gesture) → T: Transformation Module (transformed command) → R: Controlled System (system response).]
[Sheet 3 of 19]
Figure 4: Example gestures, shown in two dimensions. [x-pos vs. y-pos panels: large slow lines (xymin-xymax, xmaxymin-xminymax), and clockwise and counter-clockwise large slow circles.]
Figure 5: Three Example Gestures. [Slow: large slow circle; medium: large fast circle; fast: small fast circle.]
[Sheet 4 of 19]
Figure 6: An Example 24-Gesture Lexicon. [x-y plots of clockwise and counter-clockwise circles, diagonal lines, x-lines, and y-lines, each in large/small and slow/fast variants.]
[Sheet 5 of 19]
Figure 7: Slow Down Gesture.
Figure 8: Prepare to Move Gesture.
[Sheet 6 of 19]
Figure 9: Attention Gesture.
[Sheet 7 of 19]
Figure 10: Stop Gesture.
Figure 11: Right or Left Turn Gestures.
Figure 12: "Okay" Gesture.
[Sheet 8 of 19]
Figure 13: Freeze Gesture.
Figure 14: Plots of a Human-Created One-Dimensional X-Line Oscillating Motion. [Panels: the x-line gesture performed in two-dimensional space; a time history of the gesture; its two-dimensional phase-space trajectory.]
Figure 15: Possible Lines Associated with x(t,p) = p0 + p1t and Their Equivalent Representation in the p Parameter Space.
[Sheet 9 of 19]
Figure 16: Parameter Fitting: We Require a Rule for θ to Bring the Error to Zero.
Figure 17: Plots of Different (xi, yi) Data Points that Result in a Different Best-Fitting θ Line.
[Sheet 10 of 19]
Figure 18: The Recursive Linear Least Squares Method for Updating θ with Each Additional (xi, yi) Data Point.
Figure 19: An Exaggerated Representation of the Residual Error Measurement. [Phase plane of x-position vs. x-velocity: the actual next state is compared with states computed from the slow, medium, and fast prediction bins; the fast bin's residual error is marked.]
[Sheet 11 of 19]
Figure 20: An Algorithm for Determining the Specific Gesture Model. [Flowchart: plot lexicon gestures in the phase plane; "guess" appropriate models to match the plots; for each model, determine parameters for each gesture in the lexicon; test the models using a total residual error calculation; if the model with the lowest total residual error has small enough residuals, select it, otherwise repeat.]
[Sheet 12 of 19]
Figure 21: The Worst-Case Residual Ratios for Each Gesture Model; the Lower the Ratio, the Better the Model. [Bar chart of worst residual error ratio (0.2 to 1.0) by model type: Linear with Offset Component, Van der Pol, Van der Pol with Drift Component, Higher-Order Terms, Velocity Damping.]
[Sheet 13 of 19]
Figure 22: Two Perpendicular Oscillatory Line Motions Combined into a Circular Gesture. [Panels show the x-axis and y-axis portions of the gesture, each portion's position as a function of time, and the two-dimensional phase-space trajectories of the x-line and y-line gestures.]
[Sheet 14 of 19]
Figure 23: Bounding Box Around Hand.
Figure 24: Descriptions from Bounding Box.
[Sheet 15 of 19]
Figure 25: The Example Gestures. [Slow: large slow circle; medium: large fast circle; fast: small fast circle.]
Figure 26: Schematic of the Hand Tracking System Hardware. [A color camera feeds the tracking hardware.]
[Sheet 16 of 19]
Figure 27: Flowchart of the CTS. [Capture New Image → Find Difference Image → Compute Moving Center → Compute Static Center → Display Target Center, with a loop back to image capture.]
[Sheet 17 of 19]
Figure 28: Graphical User Interface of the CTS. [Screenshot; legible controls include Box Row Size, Box Size, and color/motion tracking settings.]
[Sheet 18 of 19]
Figure 29: Target Center from Difference Image. [Image 1; Image 2; Image 2 - Image 1; difference image with color-filtered target center.]
Figure 30: Color Matching Technique.
Figure 31: Identification Module. [Dynamic and static gestures feed a "Which Gesture?" decision that drives the screen display.]
[Sheet 19 of 19]
Figure 32: Simplified Diagram of the Dynamic Gesture Prediction Module. [Geometric information from the sensor module yields a minimum residual and bin number; after thresholding, the module outputs a specific overall gesture number or null information.]
GESTURE-CONTROLLED INTERFACES FOR SELF-SERVICE MACHINES AND OTHER APPLICATIONS

REFERENCE TO RELATED APPLICATIONS

This application claims priority of U.S. provisional patent application Ser. No. 60/096,126, filed Aug. 10, 1998, the entire contents of which are incorporated herein by reference.
STATEMENT

This invention was made with Government support under contracts NAS9-98068 (awarded by NASA), DASW01-98-M-0791 (awarded by the U.S. Army), and F29601-98-C-0096 (awarded by the U.S. Air Force). The Government has certain rights in this invention.
FIELD OF THE INVENTION

This invention relates to person-machine interfaces and, in particular, to gesture-controlled interfaces for self-service machines and other applications.
BACKGROUND OF THE INVENTION

Gesture recognition has many advantages over other input means, such as the keyboard, mouse, speech recognition, and touch screen. The keyboard is a very open-ended input device and assumes that the user has at least a basic typing proficiency. The keyboard and mouse both contain moving parts; therefore, extended use will lead to decreased performance as the device wears down. The keyboard, mouse, and touch screen all need direct physical contact between the user and the input device, which could cause the system performance to degrade as these contacts are exposed to the environment. Furthermore, there is the potential for abuse and damage from vandalism to any tactile interface which is exposed to the public.

Tactile interfaces can also lead to hygiene problems, in that the system may become unsanitary or unattractive to users, or performance may suffer. These effects would greatly diminish the usefulness of systems designed to target a wide range of users, such as advertising kiosks open to the general public. This cleanliness issue is very important for the touch screen, where the input device and the display are the same device. Therefore, when the input device is soiled, the effectiveness of the input and display decreases. Speech recognition is very limited in a noisy environment, such as sports arenas, convention halls, or even city streets. Speech recognition is also of limited use in situations where silence is crucial, such as certain military missions or library card catalog rooms.

Gesture recognition systems do not suffer from the problems listed above. There are no moving parts, so device wear is not an issue. Cameras, used to detect features for gesture recognition, can easily be built to withstand the elements and stress, and can also be made very small and used in a wider variety of locations. In a gesture system, there is no direct contact between the user and the device, so there is no hygiene problem. The gesture system requires no sound to be made or detected, so background noise level is not a factor. A gesture recognition system can control a number of devices through the implementation of a set of intuitive gestures. The gestures recognized by the system would be designed to be those that seem natural to users, thereby decreasing the learning time required. The system can also provide users with symbol pictures of useful gestures, similar to those normally used in American Sign Language books.
Simple tests can then be used to determine what gestures are truly intuitive for any given application.

For certain types of devices, gesture inputs are the more practical and intuitive choice. For example, when controlling a mobile robot, basic commands such as "come here", "go there", "increase speed", and "decrease speed" would be most efficiently expressed in the form of gestures. Certain environments gain a practical benefit from using gestures. For example, certain military operations have situations where keyboards would be awkward to carry, or where silence is essential to mission success. In such situations, gestures might be the most effective and safe form of input.

A system using gesture recognition would be ideal as an input device for self-service machines (SSMs) such as public information kiosks and ticket dispensers. SSMs are rugged and secure cases, approximately the size of a phone booth, that contain a number of computer peripheral technologies to collect and dispense information and services. A typical SSM system includes a processor, input device(s) (including those listed above), and a video display. Many SSMs also contain a magnetic card reader, an image/document scanner, and a printer/form dispenser. The SSM system may or may not be connected to a host system or even the Internet.

The purpose of SSMs is to provide information without the traditional constraints of traveling to the source of information and being frustrated by limited manned office hours, or to dispense objects. One SSM can host several different applications providing access to a number of information/service providers. Eventually, SSMs could be the solution for providing access to the information contained on the World Wide Web to the majority of a population which currently has no means of accessing the Internet.

SSMs are based on PC technology and have a great deal of flexibility in gathering and providing information. In the next two years SSMs can be expected to follow the technology and price trends of PCs. As processors become faster and storage becomes cheaper, the capabilities of SSMs will also increase.

Currently SSMs are being used by corporations, governments, and colleges. Corporations use them for many purposes, such as displaying advertising (e.g. previews for a new movie), selling products (e.g. movie tickets and refreshments), and providing in-store directories. SSMs are deployed performing a variety of functions for federal, state, and municipal governments. These include providing motor vehicle registration, gift registries, employment information, near-real-time traffic data, information about available services, and tourism/special event information. Colleges use SSMs to display information about courses and campus life, including maps of the campus.
SUMMARY OF THE INVENTION

The subject invention resides in gesture recognition methods and apparatus. In the preferred embodiment, a gesture recognition system according to the invention is engineered for device control, and not as a human communication language. That is, the apparatus preferably recognizes commands for the expressed purpose of controlling a device such as a self-service machine, regardless of whether the gestures originated from a live or inanimate source. The system preferably not only recognizes static symbols, but dynamic gestures as well, since motion gestures are typically able to convey more information.

In terms of apparatus, a system according to the invention is preferably modular, and includes a gesture generator,
sensing system, modules for identification and transformation into a command, and a device response unit. At a high level, the flow of the system is as follows. Within the field of view of one or more standard video cameras, a gesture is made by a person or device. During the gesture-making process, a video image is captured, producing image data along with timing information. As the image data is produced, a feature-tracking algorithm is implemented which outputs position and time information. This position information is processed by static and dynamic gesture recognition algorithms. When the gesture is recognized, a command message corresponding to that gesture type is sent to the device to be controlled, which then performs the appropriate response.
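For illustration only, the following is a minimal sketch of this capture-track-recognize-dispatch loop. Everything here is hypothetical scaffolding (names such as control_loop and Recognizer do not appear in the patent); it only fixes the order of operations just described.

    import time
    from typing import Callable, Optional, Sequence, Tuple

    Position = Tuple[float, float]                 # (x, y) image-plane coordinates
    Recognizer = Callable[[Sequence[Position], float], Optional[str]]

    def control_loop(frames: Sequence[Sequence[Position]],
                     recognizers: Sequence[Recognizer],
                     send_command: Callable[[str], None]) -> None:
        """Feed tracked feature positions to the recognizers; dispatch on a match."""
        for positions in frames:                   # feature-tracker output per frame
            t = time.time()                        # timing information
            for recognize in recognizers:          # static, then dynamic
                gesture = recognize(positions, t)
                if gesture is not None:
                    send_command(gesture)          # command message to the device
                    break

    # Tiny demo: a recognizer that always reports "STOP" for a single feature.
    always_stop: Recognizer = lambda pos, t: "STOP" if len(pos) == 1 else None
    control_loop([[(0.5, 0.5)]], [always_stop], print)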
The system only searches for static gestures when the motion is very slow (i.e., the norm of the x-, y-, and z-velocities is below a threshold amount). When this occurs, the system continually identifies a static gesture or outputs that no gesture was found. Static gestures are represented as geometric templates for commonly used commands such as Halt, Left/Right Turn, "OK," and Freeze. Language gestures, such as those of American Sign Language, can also be recognized. A file of recognized gestures, which lists named gestures along with their vector descriptions, is loaded during the initialization of the system. Static gesture recognition is then performed by identifying each new description. A simple nearest-neighbor metric is preferably used to choose an identification. In recognizing static human hand gestures, the image of the hand is preferably localized from the rest of the image to permit identification and classification. The edges of the image are preferably found with a Sobel operator. A box which tightly encloses the hand is also located to assist in the identification.
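A hedged sketch of the nearest-neighbor step follows: templates and observations are assumed to reduce to fixed-length vector descriptions, and the slow-motion gate uses an arbitrary threshold. Neither the vector contents nor the threshold value come from the patent.

    import math
    from typing import Dict, Optional, Sequence

    def is_slow(velocity: Sequence[float], threshold: float = 0.05) -> bool:
        """Static search runs only while the velocity norm is below a threshold."""
        return math.hypot(*velocity) < threshold

    def identify_static(description: Sequence[float],
                        templates: Dict[str, Sequence[float]]) -> Optional[str]:
        """Nearest-neighbor match of a vector description against named templates."""
        if not templates:
            return None
        name, _template = min(templates.items(),
                              key=lambda item: math.dist(description, item[1]))
        return name

    templates = {"Halt": [1.0, 0.0, 0.3], "OK": [0.2, 0.9, 0.5]}
    if is_slow([0.01, 0.02]):
        print(identify_static([0.9, 0.1, 0.35], templates))   # -> Halt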
Dynamic (circular and skew) gestures are preferably treated as one-dimensional oscillatory motions. Recognition of higher-dimensional motions is achieved by independently recognizing multiple, simultaneously created one-dimensional motions. A circle, for example, is created by combining repeating motions in two dimensions that have the same magnitude and frequency of oscillation, but wherein the individual motions are ninety degrees out of phase. A diagonal line is another example. Distinct circular gestures are defined in terms of their frequency rate: slow, medium, and fast.
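As a worked illustration of that composition (amplitude and frequency values are arbitrary): two equal oscillations that are ninety degrees out of phase trace a circle, while the same two oscillations in phase trace a diagonal line.

    import math

    A, f, samples = 1.0, 0.5, 8        # arbitrary amplitude, frequency, sample count

    for k in range(samples):
        t = k / (samples * f)
        x = A * math.cos(2 * math.pi * f * t)
        y_circle = A * math.sin(2 * math.pi * f * t)   # 90 degrees out of phase: circle
        y_diag = A * math.cos(2 * math.pi * f * t)     # in phase: diagonal line
        print(f"t={t:4.2f}  circle=({x:+.2f}, {y_circle:+.2f})"
              f"  diagonal=({x:+.2f}, {y_diag:+.2f})")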
Additional dynamic gestures are derived by varying phase relationships. During the analysis of a particular gesture, the x and y minimum and maximum image-plane positions are computed. The z position is computed if the system is set up for three dimensions. If the x and y motions are out of phase, as in a circle, then when x or y is at a minimum or maximum, the velocity along the other axis is large. The direction (clockwiseness in two dimensions) of the motion is determined by looking at the sign of this velocity component. Similarly, if the x and y motions are in phase, then at these extremum points both velocities are small. Using clockwise and counter-clockwise circles, diagonal lines, one-dimensional lines, and small and large circles and lines, a twenty-four gesture lexicon was developed, as described herein. A similar method is used when the gesture is performed in three dimensions.
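One way to code that extremum test is sketched below. The finite-difference velocities, the assumption of motion centered at the origin, and the tolerance are choices made here for illustration, not details specified by the patent.

    import math
    from typing import List

    def classify_phase(xs: List[float], ys: List[float], tol: float = 0.1) -> str:
        """At an x-extremum: a small y-velocity means in-phase (line-like) motion;
        a large one means a circle, whose direction follows from the velocity sign."""
        i = max(range(1, len(xs) - 1), key=lambda j: abs(xs[j]))  # strongest extremum
        at_max = xs[i] > 0                 # x-maximum vs. x-minimum (origin-centered)
        y_vel = ys[i + 1] - ys[i - 1]      # finite-difference y-velocity
        if abs(y_vel) < tol:
            return "in-phase (line)"
        ccw = (y_vel > 0) == at_max        # sign convention for standard image axes
        return "counter-clockwise circle" if ccw else "clockwise circle"

    ts = [2 * math.pi * k / 8 for k in range(8)]
    print(classify_phase([math.cos(t) for t in ts],
                         [-math.sin(t) for t in ts]))   # -> clockwise circle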
An important aspect of the invention is the use of parameterization and predictor bins to determine a gesture's future position and velocity based upon its current state. The bin predictions are compared to the next position and velocity of each gesture, and the difference between the bin's prediction and the next gesture state is defined as the residual error. According to the invention, a bin predicting the future state of a gesture it represents will exhibit a smaller residual error than a bin predicting the future state of a gesture that it does not represent. For simple dynamic gesture applications, a linear-with-offset-component model is preferably used to discriminate between gestures; for more complex gestures, a variation of a velocity damping model is used.
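The following sketch shows the bin-selection idea under stated assumptions: each bin holds parameters θ for one gesture, predicts the next state with an assumed linear-with-offset form (acceleration = θ1·x + θ2, integrated by a simple Euler step), and accumulates squared residuals; the smallest total identifies the gesture. In the patent's framing the parameters would come from a (recursive) linear least squares fit rather than being set by hand as they are here.

    from typing import Dict, List, Tuple

    State = Tuple[float, float]            # (position, velocity) along one axis

    def predict(state: State, theta: Tuple[float, float], dt: float) -> State:
        """One Euler step of an assumed linear-with-offset model."""
        x, v = state
        accel = theta[0] * x + theta[1]
        return (x + v * dt, v + accel * dt)

    def best_bin(trajectory: List[State],
                 bins: Dict[str, Tuple[float, float]], dt: float) -> str:
        """Accumulate each bin's squared residual error; the smallest total wins."""
        residual = {name: 0.0 for name in bins}
        for current, actual_next in zip(trajectory, trajectory[1:]):
            for name, theta in bins.items():
                px, pv = predict(current, theta, dt)
                residual[name] += (px - actual_next[0]) ** 2 \
                                  + (pv - actual_next[1]) ** 2
        return min(residual, key=residual.get)

    # Demo: data generated by the "slow" oscillator is matched to the "slow" bin.
    dt = 0.05
    bins = {"slow": (-1.0, 0.0), "fast": (-9.0, 0.0)}
    trajectory, state = [], (1.0, 0.0)
    for _ in range(100):
        trajectory.append(state)
        state = predict(state, bins["slow"], dt)
    print(best_bin(trajectory, bins, dt))   # -> slow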
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a drawing of a gesture recognition system according to the invention;
FIG. 2 is a gesture recognition system flow chart;
FIG. 3 is a signal flow diagram of a gesture recognition system according to the invention;
FIG. 4 is a drawing which shows example gestures in two dimensions;
FIG. 5 shows three example gestures;
FIG. 6 is an example of a 24-gesture lexicon according to the invention;
FIG. 7 depicts a Slow-Down gesture;
FIG. 8 depicts a Move gesture;
FIG. 9 depicts an Attention gesture;
FIG. 10 depicts a Stop gesture;
FIG. 11 shows Right/Left Turn gestures;
FIG. 12 shows an "Okay" gesture;
FIG. 13 shows a Freeze gesture;
FIG. 14 provides three plots of a human-created one-dimensional X-line oscillating motion;
FIG. 15 shows possible lines associated with x(t,p) = p0 + p1t and their equivalent representation in the p-parameter space;
FIG. 16 illustrates parameter fitting, wherein a rule is used for θ to bring the error to zero;
FIG. 17 plots different (xi, yi) data points resulting in a different best-fitting θ line;
FIG. 18 depicts a recursive linear least squares method for updating θ with subsequent (xi, yi) data points;
FIG. 19 is an exaggerated representation of a residual error measurement;
FIG. 20 illustrates an algorithm for determining a specific gesture model according to the invention;
FIG. 21 is a plot which shows worst-case residual ratios for each gesture model, wherein the lower the ratio, the better the model;
FIG. 22 illustrates how two perpendicular oscillatory line motions may be combined into a circular gesture;
FIG. 23 shows how a bounding box may be placed around a hand associated with a gesture;
FIG. 24 provides descriptions from the bounding box of FIG. 23;
FIG. 25 shows example gestures;
FIG. 26 is a schematic of hand-tracking system hardware according to the invention;
FIG. 27 is a flowchart of a color tracking system (CTS) according to the invention;
FIG. 28 depicts a preferred graphical user interface of the CTS;
FIG. 29 illustrates the application of target-center-from-difference-image techniques;
FIG. 30 illustrates a color matching technique;
FIG. 31 is a representation of an identification module; and
FIG. 32 is a simplified diagram of a dynamic gesture prediction module according to the invention.
DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 presents a system overview of a gesture-controlled self-service machine system according to the invention. FIG. 2 shows a flow-chart representation of how a vision system views the gesture created, with the image data sent to the gesture recognition module, translated into a response, and then used to control an SSM, including the display of data, a virtual environment, and devices. The gesture recognition system takes the feature positions of the moving body parts (two- or three-dimensional space coordinates, plus a time stamp) as the input, as quickly as the vision system can output the data, and outputs what gesture (if any) was recognized, again at the same rate as the vision system outputs data.

The specific components of the gesture recognition system are detailed in FIG. 3, and these include five modules:

G: Gesture Generation
S: Sensing (vision)
I: Identification Module
T: Transformation
R: Response
At a high level, the flow of the system is as follows. Within the field of view of one or more standard video cameras, a gesture is made by a person or device. During the gesture-making process, a video capture card is capturing images, producing image data along with timing information. As the image data is produced, it is run through a feature-tracking algorithm which outputs position and time information. This position information is processed by static and dynamic gesture recognition algorithms. When the gesture is recognized, a command message corresponding to that gesture type is sent to the device to be controlled, which then performs the appropriate response.

...simultaneously in two or three dimensions. A circle is such a motion, created by combining repeating motions in two dimensions that have the same magnitude and frequency of oscillation, but with the individual motions ninety degrees out of phase. A "diagonal" line is another such motion. We have defined three distinct circular gestures in terms of their frequency rates: slow, medium, and fast. An example set of such gestures is shown in FIG. 4. These gestures can also be performed in three dimensions, and such more complex motions can be identified by this system.

The dynamic gestures are represented by a second-order equation, one for each axis:
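The equation itself is cut off at this point in the available text. As an assumption for orientation only (not the patent's verbatim formula), the "linear with offset component" model named in the summary and in FIG. 21 suggests a linear-in-parameters, second-order form along each axis, for example:

    % Assumed form; \theta_1 and \theta_2 are gesture parameters fit by
    % (recursive) linear least squares, one pair per axis.
    \ddot{x} = \theta_1 x + \theta_2, \qquad \ddot{y} = \theta_3 y + \theta_4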
