`
`
`
`
`
`
`
`Exhibit C
`
`United States Patent No. 7,715,476
`
`
`
`
`
`
`
`
`
`Case 2:20-cv-02640-NGG-SIL Document 1-6 Filed 06/14/20 Page 2 of 25 PageID #: 52
`
US007715476B2
`
(12) United States Patent
Edwards et al.

(10) Patent No.: US 7,715,476 B2
(45) Date of Patent: May 11, 2010
`
`(54) SYSTEM, METHOD AND ARTICLE OF
`MANUFACTURE FOR TRACKING A HEAD
`OF A CAMERA-GENERATED IMAGE OF A
`PERSON
`
`(76)
`
`Inventors: Jeffrey L. Edwards, 1072 Tanland Dr.,
`#203, Palo Alto, CA (US) 94306;
`Katerina H. Nguyen, 3341 Alma St.,
`Palo Alto, CA (US) 94306
`
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1206 days.
`
`(21) Appl. No.: 11/112,433
`
(22) Filed: Apr. 21, 2005
`
`(65)
`
`Prior Publication Data
`
`US 2005/0185054 Al
`
`Aug. 25, 2005
`
`Related U.S. Application Data
`
`(63) Continuation of application No. 10/353,858, filed on
`Jan. 28, 2003, now Pat. No. 6,909,455, which is a
`continuation of application No. 09/364,859, filed on
`Jul. 30, 1999, now Pat. No. 6,545,706.
`
`(51)
`
`Int. Cl.
`H04N 7118
`(2006.01)
`(52) U.S. Cl. ............................. 375/240.08; 375/240.16
`( 58) Field of Classification Search ............................... .
`375/240.08-240.17
`See application file for complete search history.
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4,843,568 A
`5,148,477 A
`
`6/ 1989 Krueger et al.
`9/1992 Neely et al.
`
`5,384,912 A
`5,454,043 A
`5,469,536 A
`5,534,917 A
`5,548,659 A
`5,570,113 A
`5,581,276 A
`5,623,587 A
`5,631,697 A
`5,767,867 A
`5,781,198 A
`5,790,124 A
`
`1/1995 Ogrinc et al.
`9/1995 Freeman
`11/1995 Blank
`7/1996 Mac Dougall
`8/1996 Okamoto
`10/1996 Zetts
`12/1996 Cipolla et al.
`4/1997 Bulman
`5/1997 Nishimura et al.
`6/1998 Hu
`7/1998 Korn
`8/1998 Fischer et al.
`
`(Continued)
`
`OTHER PUBLICATIONS
`
`Crow, F. C., "Summed-Area Tables for Texture Mapping," Computer
`Graphics, vol. 18(3), 207-212, Jul. 1984.
`
`(Continued)
`
`Primary Examiner-Andy S Rao
`(74) Attorney, Agent, or Firm-Workman Nydegger
`
`(57)
`
`ABSTRACT
`
A system, method and article of manufacture are provided for tracking a head portion of a person image in video images. Upon receiving video images, a first head tracking operation is executed for generating a first confidence value. Such first confidence value is representative of a confidence that a head portion of a person image in the video images is correctly located. Also executed is a second head tracking operation for generating a second confidence value representative of a confidence that the head portion of the person image in the video images is correctly located. The first confidence value and the second confidence value are then outputted. Subsequently, the depiction of the head portion of the person image in the video images is based on the first confidence value and the second confidence value.
`
`32 Claims, 15 Drawing Sheets
`
`120
`
`110
`
`116
`
`114
`
`-------,--11 s
`
`Network (135) 134
`
`CPU
`
`1/0
`adapter
`
`Communication
`adapter
`
`124
`
`122
`
`136
`
`138
`
`User
`interface
`adapter
`
`Display
`adapter
`
`□-
`
`133
`
`132
`
`126
`
`128
`
`
`
`
`U.S. PATENT DOCUMENTS
`
`5,802,220 A
`6,154,559 A
`6,301,370 Bl
`
`9/ 1998 Black et al.
`11/2000 Beardsley
`10/2001 Steffens et al.
`
`OTHER PUBLICATIONS
`
Huang, Chu-Lin, Wu, Ming-Shan, "A Model-based Complex Background Gesture Recognition System," IEEE International Conference on Systems, Man and Cybernetics, vol. 1, pp. 93-98, Oct. 1996.
Aggarwal, J. K., Cai, Q., "Human Motion Analysis: A Review," IEEE Nonrigid and Articulated Motion Workshop Proceedings, pp. 90-102, 1997.
Cortes, C., Vapnik, V., "Support-Vector Networks," Machine Learning, vol. 20, pp. 273-297, 1995.
GMD Digital Media Lab: The Virtual Studio; http://viswiz.gmd.de/DML/vst/vst.html.
Swain, M. J., Ballard, D. H., "Indexing Via Color Histograms," Third International Conference on Computer Vision, pp. 390-393, Dec. 1990.
Review: Game Boy Camera, Jul. 15, 1998, http://www.gameweek.com/reviews/july15/gbc.html.
Barbie Photo Designer w/Digital Camera, Box, http://www.actioncd.com/ktkt0126.asp.
`
`
`
[Sheet 1 of 15, Figure 1: exemplary hardware configuration: CPU 110 on system bus 112, with RAM 114, ROM 116, an I/O adapter 118 connecting disk storage 120, a user interface adapter 122 connecting keyboard 124, mouse 126, speaker 128, microphone 132, and camera 133, a communication adapter 134 to network 135, and a display adapter 136 to display 138.]
`
`
`
[Sheet 2 of 15, Figure 2: video images feed a background subtraction head tracker 200 and a free form head tracker 202, both of which report to a mediator 204.]
`
`
`
[Sheet 3 of 15, Figure 3: the background subtraction head tracker: get foreground 300 using a background model 302, scene parser 304, find head for each person 306, then output head confidence.]
`
`
`
[Sheet 4 of 15, Figures 4 and 4A: scene parsing: receive subtracted image 400, filter to create mass distribution 402, threshold elimination 410, pick best mass as person, update person(s) data based upon frame differencing and stored history, store 418; Figure 4A shows a mass distribution curve with peaks.]
`
`
`
[Sheet 5 of 15, Figure 5: head finding 306: generate Y histogram 500, search for head/torso separation based on the histogram 504, search for head top 506, search for left/right sides 508, change the bounding box to be consistent with history if its size differs, determine confidence of the head bounding box 514, and update history if confidence is above threshold 516.]
`
`
`
[Sheet 6 of 15, Figure 5A: a y-axis histogram 502 generated from a person image 501.]
`
`
`
[Sheet 7 of 15, Figure 6: the free form head tracker: a head capture module 600 containing head motion detection 606, a skin operation 604, and a head verifier 608; and a head tracker module 602 containing a motion follower 610, a color follower 612, and a head verifier 616, with feedback 618 between the modules.]
`
`
`
[Sheet 8 of 15, Figures 7 and 7A-7D: skin detection 608: extract flesh map 702, median filter 706, form regions 708, fill holes 710, extract regions 712, combine regions 714, generate hypothesis 716, evaluate hypothesis 718; Figures 7A-7D show a person image, the raw flesh map, and the flesh map after hole filling and region combination.]
`
`
`
[Sheet 9 of 15, Figure 8: generate hypothesis 716: generate a score for each region 800, compute scores for each possible combination of regions 802, pick the combination with the best score 804.]
`
`
`
[Sheet 10 of 15, Figure 9: motion detection 606: generate motion map 900, convert into summed-area table 902, generate X, Y histograms 904, determine number of intersecting objects 906, determine head for each object 908, then output head bounding box confidence.]
`
`
`
[Sheet 11 of 15, Figure 10: color follower: at capture time, extract an image sub-window 1000 from the verified head rectangle and form a color look-up table 1006; at tracking time, set up a search grid 1008 from the previous head rectangle, perform the search 1016, smooth the resulting map 1018, and find the best head estimate 1020.]
`
`
`
[Sheet 12 of 15, Figures 10A-10C: the image sub-window, the color histogram formed from it, and the search grid generated from a previous verified head rectangle.]
`
`
[Sheet 13 of 15, Figure 11: perform search 1016: get a search grid point 1110, generate a 3-D histogram for each point of the search grid 1112, compare the 3-D histogram to the color model 1114, generate a score based on the comparison 1116, and repeat until all grid points are done.]
`
`
`
[Sheet 14 of 15, Figure 12: flesh map extraction 702: generate an R, G map 1200, find the best-fit oval 1202, fill in the oval 1206.]
`
`
`
[Sheet 15 of 15, Figure 13: motion follower 610: get history 1301, determine search region 1300, compute smoothed motion histograms 1302, look for the head shape 1308, and output the head bounding box and confidence 1310.]
`
`
`
`SYSTEM, METHOD AND ARTICLE OF
`MANUFACTURE FOR TRACKING A HEAD
`OF A CAMERA-GENERATED IMAGE OF A
`PERSON
`
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 10/353,858, entitled SYSTEM, METHOD AND ARTICLE OF MANUFACTURE FOR TRACKING A HEAD OF A CAMERA-GENERATED IMAGE OF A PERSON, filed Jan. 28, 2003, now U.S. Pat. No. 6,909,455, which is incorporated herein by reference for all purposes, which is a continuation of U.S. patent application Ser. No. 09/364,859, entitled SYSTEM, METHOD AND ARTICLE OF MANUFACTURE FOR TRACKING A HEAD OF A CAMERA-GENERATED IMAGE OF A PERSON, filed Jul. 30, 1999, now U.S. Pat. No. 6,545,706, which is incorporated herein by reference for all purposes.

This application is related to a U.S. patent application filed Jul. 30, 1999 with the title "SYSTEM, METHOD AND ARTICLE OF MANUFACTURE FOR DETECTING COLLISIONS BETWEEN VIDEO IMAGES GENERATED BY A CAMERA AND AN OBJECT DEPICTED ON A DISPLAY" and Katerina H. Nguyen listed as inventor; a U.S. patent application filed Oct. 15, 1997 under Ser. No. 08/951,083 with the title "A SYSTEM AND METHOD FOR PROVIDING A JOINT FOR AN ANIMATABLE CHARACTER FOR DISPLAY VIA A COMPUTER SYSTEM"; and a U.S. patent application filed Jul. 30, 1999 with the title "WEB BASED VIDEO ENHANCEMENT APPARATUS, METHOD, AND ARTICLE OF MANUFACTURE" and Subutai Ahmad and Jonathan Cohen listed as inventors, all of which are incorporated herein by reference in their entirety.

BACKGROUND OF THE INVENTION

1. The Field of the Invention

The present invention relates to displaying video images generated by a camera on a display, and more particularly to tracking a head portion of a person image in camera-generated video images.

2. The Relevant Art

It is common for personal computers to be equipped with a camera for receiving video images as input. Conventionally, such a camera is directed toward a user of the personal computer so as to allow the user to view himself or herself on a display of the personal computer during use. To this end, the user is permitted to view real-time images that can be used for various purposes.

One purpose for use of a personal computer-mounted camera is to display an interaction between camera-generated video images and objects generated by the personal computer and depicted on the associated display. In order to afford this interaction, a current position of the user image must be identified. This includes identifying a current position of the body parts of the user image, including the head. Identification of an exact current location of the user image and his or her body parts is critical for affording accurate and realistic interaction with objects in the virtual computer-generated environment. In particular, it is important to track a head portion of the user image since this specific body part is often the focus of the most attention.

Many difficulties arise, however, during the process of identifying the current position of the head portion of the user image. It is often very difficult to discern the head portion when relying on a single technique. For example, when identifying the location of a head portion using shape, color, motion, etc., portions of the background image and the remaining body parts of the user image may be confused with the head. For example, a flesh coloring of a hand may be mistaken for features of the head.
SUMMARY OF THE INVENTION

A system, method and article of manufacture are provided for tracking a head portion of a person image in video images. Upon receiving video images, a first head tracking operation is executed for generating a first confidence value. Such first confidence value is representative of a confidence that a head portion of a person image in the video images is correctly located. Also executed is a second head tracking operation for generating a second confidence value representative of a confidence that the head portion of the person image in the video images is correctly located. The first confidence value and the second confidence value are then outputted. Subsequently, the depiction of the head portion of the person image in the video images is based on the first confidence value and the second confidence value.

In one embodiment of the present invention, the first head tracking operation begins with subtracting a background image from the video images in order to extract the person image. Further, a mass-distribution histogram may be generated that represents the extracted person image. A point of separation is then identified between a torso portion of the person image and the head portion of the person image.

Next, the first head tracking operation continues by identifying a top of the head portion of the person image. This may be accomplished by performing a search upwardly from the point of separation between the torso portion and the head portion of the person image. Subsequently, sides of the head portion of the person image are also identified. As an option, the first head tracking operation may track the head portion of the person image in the video images using previous video images including the head portion of the person image.

In one embodiment, the second head tracking operation may begin by identifying an initial location of the head portion of the person image in the video images. Thereafter, a current location of the head portion of the person image may be tracked starting at the initial location. As an option, the initial location of the head portion of the person image may be identified upon each instance that the second confidence value falls below a predetermined amount. By this feature, the tracking is "restarted" when the confidence is low that the head is being tracked correctly. This ensures improved accuracy during tracking.

As an option, the initial location of the head portion of the person image may be identified based on the detection of a skin color in the video images. This may be accomplished by extracting a flesh map; filtering the flesh map; identifying distinct regions of flesh color on the flesh map; ranking the regions of flesh color on the flesh map; and selecting at least one of the regions of flesh color as the initial location of the head portion of the person image based on the ranking. During such procedure, holes in the regions of flesh color on the flesh map may be filled. Further, the regions of flesh color on the flesh map may be combined upon meeting predetermined criteria.
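The flesh-map initialization steps above (extract, identify regions, rank, select) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the color rule, 4-connectivity flood fill, and size-based ranking are assumptions, and the median filtering and hole-filling steps are omitted.

```python
# Sketch of flesh-map initialization: classify pixels as flesh with a
# crude color rule, group flesh pixels into connected regions, and rank
# the regions by size to pick an initial head candidate.

def is_flesh(r, g, b):
    # Crude illustrative rule: reddish pixels with moderate green/blue.
    return r > 95 and g > 40 and b > 20 and r > g > b

def flesh_regions(pixels):
    """Group flesh pixels of an RGB image into 4-connected regions."""
    h, w = len(pixels), len(pixels[0])
    seen, regions = set(), []
    for y in range(h):
        for x in range(w):
            if (y, x) in seen or not is_flesh(*pixels[y][x]):
                continue
            stack, region = [(y, x)], []
            seen.add((y, x))
            while stack:  # iterative flood fill
                cy, cx = stack.pop()
                region.append((cy, cx))
                for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                    if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in seen \
                            and is_flesh(*pixels[ny][nx]):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            regions.append(region)
    return regions

def initial_head_location(pixels):
    """Rank regions by pixel count and return the largest one."""
    regions = flesh_regions(pixels)
    return max(regions, key=len) if regions else None

skin, bg = (200, 120, 80), (0, 0, 255)
img = [[bg, skin, skin],
       [bg, skin, skin],
       [skin, bg, bg]]
head = initial_head_location(img)
print(len(head))  # the 2x2 flesh block outranks the lone flesh pixel
```

In a fuller version, the ranking score could also weigh region shape and position, and small regions would be merged or discarded before selection.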
In a similar manner, the current location of the head portion of the person image may be tracked based on the detection of a skin color in the video images. Such technique includes extracting a sub-window of the head portion of the person image in the video images; forming a color model based on the sub-window; searching the video images for a color similar to the color model; and estimating the current location of the head portion of the person image based on the search.
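The color-model tracking steps above can be sketched as a quantized color histogram compared by histogram intersection. The bin count and the scoring function are illustrative choices for this sketch, not details from the patent.

```python
# Sketch of color-model tracking: build a color histogram from the head
# sub-window, then score candidate windows by histogram intersection and
# keep the most similar one.

from collections import Counter

def color_model(window, bins=8):
    """Histogram of coarsely quantized colors in a window of (r, g, b)."""
    step = 256 // bins
    return Counter((r // step, g // step, b // step)
                   for row in window for (r, g, b) in row)

def similarity(model, candidate_model):
    """Histogram intersection: higher means more similar colors."""
    return sum(min(n, candidate_model[c]) for c, n in model.items())

def best_match(model, candidates):
    """Return the index of the candidate window most similar to the model."""
    scores = [similarity(model, color_model(c)) for c in candidates]
    return scores.index(max(scores))

skin, wall = (200, 130, 90), (40, 40, 200)
head  = [[skin, skin], [skin, skin]]   # sub-window around the verified head
cand1 = [[wall, wall], [wall, wall]]   # background-colored candidate
cand2 = [[skin, wall], [skin, skin]]   # mostly skin-colored candidate
print(best_match(color_model(head), [cand1, cand2]))  # 1
```

In practice the candidates would be the windows of a search grid around the previous head rectangle, and the winning window's position would become the new head estimate.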
In one embodiment, the module that identifies the initial location of the head portion of the person image and the module that identifies the current location of the head portion of the person image may work together. In particular, while tracking the current location of the head portion of the person image, a flesh map may be obtained. Thereafter, the flesh map may be used during subsequent identification of an initial location of the head portion of the person image when the associated confidence level drops below the predetermined amount.

Similar to using the skin color, the initial location of the head portion of the person image may also be identified based on the detection of motion in the video images. Such identification is achieved by creating a motion distribution map from the video images; generating a histogram based on the motion distribution map; identifying areas of motion using the histogram; and selecting at least one of the areas of motion as being the initial location of the head portion of the person image.

Similarly, the current location of the head portion of the person image may be tracked based on the detection of motion in the video images. This may be accomplished by determining a search window based on a previous location of the head portion of the person image; creating a motion distribution map within the search window; generating a histogram based on the motion distribution map; identifying areas of motion using the histogram; and selecting at least one of the areas of motion as being the current location of the head portion of the person image.
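The motion-based tracking steps above can be sketched roughly as follows. The frame-difference threshold and the column-histogram peak rule are simplifying assumptions for this illustration.

```python
# Sketch of motion-based tracking: frame-difference two grayscale frames
# inside a search window around the previous head location, project the
# motion map into a column histogram, and pick the column with the most
# motion as the new horizontal head estimate.

def motion_map(prev, curr, window, threshold=20):
    """Binary motion map within (x0, y0, x1, y1) of two grayscale frames."""
    x0, y0, x1, y1 = window
    return [[1 if abs(curr[y][x] - prev[y][x]) > threshold else 0
             for x in range(x0, x1 + 1)]
            for y in range(y0, y1 + 1)]

def peak_column(mmap):
    """Index (within the window) of the column with the most motion."""
    hist = [sum(col) for col in zip(*mmap)]
    return hist.index(max(hist))

prev = [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
curr = [[0, 0, 90, 0],   # something moved in column 2
        [0, 0, 90, 0],
        [0, 0, 0, 0]]
window = (0, 0, 3, 2)    # here the whole frame, as (x0, y0, x1, y1)
print(peak_column(motion_map(prev, curr, window)))  # 2
```

A row histogram handled the same way would give the vertical estimate, and the summed-area table of FIG. 9 is one way to evaluate many such window sums cheaply.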
These and other aspects and advantages of the present invention will become more apparent when the Description below is read in conjunction with the accompanying Drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`4
`FIG. 7 shows a flow chart for a process of the present
`invention associated with the skin detection operation 604 of
`FIG. 6;
`FIG. 7A illustrates a person image of the video images, as
`inputted into the extract flesh map operation 702 of FIG. 7;
`FIG. 7B illustrates a raw flesh map, as outputted from the
`extract flesh map operation 702 of FIG. 7;
`FIG. 7C illustrates a flesh map, as outputted from the fill
`holes operation 710 of FIG. 7;
`FIG. 7D illustrates a flesh map, as outputted from the
`combine regions operation 714 of FIG. 7;
`FIG. 8 illustrates a flow chart for a process of the present
`invention associated with the generate hypothesis operation
`716 of FIG. 7;
`FIG. 9 shows a flow chart for a process of the present
`invention associated with the motion detection operation 606
`of FIG. 6;
`FIG. 10 shows a flow chart for a process of the present
`invention associated with the color follower operation 604 of
`20 FIG. 6;
`FIG. l0A illustrates a sub-window of the present invention
`associated with operation 1000 of FIG. 10;
`FIG. lOB shows an RGB histogram of the present inven(cid:173)
`tion outputted for each pixel within the image sub-window of
`25 FIG. l0B as a result of operation 1006 of FIG. 10;
`FIG. l0C is an illustration of a previous verified head
`rectangle and a search grid generated therefrom in operation
`1009 of FIG. 10;
`FIG. 11 shows a flow chart for a process of the present
`30 invention associated with the perform search operation 1016
`ofFIG.10;
`FIG. llA shows the search grid and the areas involved with
`the process of FIG. 11;
`FIG. 12 illustrates a flow chart for a process of the present
`35 invention associated with a feedback process between the
`color follower operation 612 and the skin detection operation
`604 of FIG. 6; and
`FIG. 13 shows a flow chart for a process of the present
`invention associated with the motion follower operation 610
`40 of FIG. 6.
`
`The present invention will be readily understood by the
`following detailed description in conjunction with the accom(cid:173)
`panying drawings, with like reference numerals designating
`like elements.
`FIG. 1 is a schematic diagram illustrating an exemplary 45
`hardware implementation in accordance with one embodi(cid:173)
`ment of the present invention;
`FIG. 2 illustrates a flowchart of a process for tracking a
`head portion of a person image in camera-generated video
`images in accordance with one embodiment of the present 50
`invention;
`FIG. 3 shows a flow chart for a first head tracking operation
`that tracks a head portion of a person image in camera-gen(cid:173)
`erated video images using background subtraction in accor(cid:173)
`dance with one embodiment of the present invention;
`FIG. 4 illustrates a flow chart for a process of the present
`invention which carries out the scene parsing operation 304 of
`FIG. 3;
`FIG. 5 illustrates a flow chart for a process of the present
`invention which carries out operation 306 of FIG. 3;
`FIG. SA is an illustration of a y-axis histogram generated in
`operation 500 shown in FIG. 5.
`FIG. 6 shows a flow chart for a second head tracking
`operation that tracks a head portion of a person image in
`camera-generated video images using capture and tracker
`routines in accordance with one embodiment of the present
`invention;
`
`DETAILED DESCRIPTION OF THE PREFERRED
`EMBODIMENTS
`
The present invention affords a technique for tracking a head portion of a person image in camera-generated video images. This is accomplished using at least two head tracking operations that each track the head portion of the person image in camera-generated video images. In addition, each head tracking operation further generates a confidence value that is indicative of a certainty that the head portion of the person image is being tracked correctly. This information may be used by an associated application for depicting an interaction between the head and a virtual computer-generated environment.
FIG. 1 shows an exemplary hardware configuration in accordance with one embodiment of the present invention wherein a central processing unit 110, such as a microprocessor, and a number of other units are interconnected via a system bus 112. The hardware configuration shown in FIG. 1 includes Random Access Memory (RAM) 114, Read Only Memory (ROM) 116, an I/O adapter 118 for connecting peripheral devices such as disk storage units 120 to the bus 112, a user interface adapter 122 for connecting a keyboard 124, a mouse 126, a speaker 128, a microphone 132, a camera 133 and/or other user interface devices to the bus 112, a communication adapter 134 for connecting the hardware configuration to a communication network 135 (e.g., a data processing network) and a display adapter 136 for connecting the bus 112 to a display device 138.
The hardware configuration typically has resident thereon an operating system such as the Microsoft Windows NT or Windows 98/2000 Operating System (OS), the IBM OS/2 operating system, the MAC OS, or the UNIX operating system. Those skilled in the art will appreciate that the present invention may also be implemented on platforms and operating systems other than those mentioned. For example, a game system such as a SONY PLAYSTATION or the like may be employed. Yet another example includes an application specific integrated circuit (ASIC) or any other type of hardware logic that is capable of executing the processes of the present invention. Further, in one embodiment, the various processes employed by the present invention may be implemented using the C++ programming language or the like.
FIG. 2 illustrates a flowchart of a process for tracking a head portion of a person image in camera-generated video images in accordance with one embodiment of the present invention. As shown, upon receiving video images generated by a camera, a first head tracking operation 200 is executed for generating a first confidence value. It should be noted that the video images may be generated by the camera at any time and not necessarily immediately before being received by the head tracking operation. Further, the video images may be partly computer enhanced or completely computer generated per the desires of the user.

The first confidence value generated by the first head tracking operation is representative of a confidence that a head portion of a person image in the camera-generated video images is located. Also executed is a second head tracking operation 202 for generating a second confidence value representative of a confidence that the head portion of the person image in the camera-generated video images is located.

The first confidence value and the second confidence value may then be made available for use by various applications in operation 204. Such applications may decide whether the head portion of the person image has moved based on the confidence values. Logic such as an AND operation, an OR operation, or any other more sophisticated logic may be employed to decide whether the results of the first head tracking operation and/or the second head tracking operation are indicative of true head movement.
`For example, if at least one of the head tracking operations
`indicates a high confidence of head movement, it may be
`decided to assume that the head has moved. On the other
`hand, if both head tracking operations indicate a medium
`confidence of movement, it may be assumed with similar
`certainty that the head has moved. If it is decided to assume
`that the head has moved, an interaction may be shown
`between the video images generated by the camera and the
`virtual computer-generated environment.
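The mediation logic described above can be sketched as simple threshold tests over the two confidence values. The thresholds and the exact OR/AND combination rule below are illustrative assumptions, not values taken from the patent.

```python
# Sketch of the mediator: two trackers each report a confidence in
# [0.0, 1.0]; threshold logic decides whether to treat the head as
# reliably located.

HIGH = 0.8    # assumed "high confidence" threshold
MEDIUM = 0.5  # assumed "medium confidence" threshold

def head_located(conf_a: float, conf_b: float) -> bool:
    """Combine two tracker confidences into a single decision."""
    # OR-style rule: one tracker is highly confident on its own.
    if conf_a >= HIGH or conf_b >= HIGH:
        return True
    # AND-style rule: both trackers agree with at least medium confidence.
    return conf_a >= MEDIUM and conf_b >= MEDIUM

print(head_located(0.9, 0.2))  # one high-confidence tracker suffices
print(head_located(0.6, 0.6))  # two medium-confidence trackers agree
print(head_located(0.6, 0.2))  # neither rule is satisfied
```

An application could swap in any other combination rule here, e.g. a weighted sum of the two confidences against a single threshold.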
FIG. 3 shows a flow chart for a process associated with the first head tracking operation 200. In use, the first head tracking operation 200 tracks a head portion of a person image in camera-generated video images using background subtraction. As shown, in operation 300, the first head tracking operation begins by obtaining a foreground by subtracting a background image from the video images generated by the camera. This may be accomplished by first storing the background image, or model 302, without the presence of the person image. Then, a difference may be found between a current image and the background image. More information on the background model and background subtraction may be found in a patent application entitled "METHOD AND APPARATUS FOR MODEL-BASED COMPOSITING" filed Oct. 15, 1997 under application Ser. No. 08/951,089, which is incorporated herein by reference in its entirety.
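The background-subtraction step just described can be sketched as a per-pixel difference against the stored background model. The grayscale pixel representation and the fixed difference threshold are simplifying assumptions for this sketch.

```python
# Sketch of background subtraction: compare each pixel of the current
# frame against a stored background model and mark pixels that differ
# by more than a threshold as foreground.

THRESHOLD = 25  # assumed intensity difference that counts as "foreground"

def foreground_mask(background, frame):
    """Return a binary mask: 1 where the frame differs from the background."""
    return [
        [1 if abs(f - b) > THRESHOLD else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

background = [[10, 10, 10],
              [10, 10, 10]]
frame      = [[10, 200, 10],   # a bright object has entered the scene
              [10, 190, 10]]

print(foreground_mask(background, frame))  # [[0, 1, 0], [0, 1, 0]]
```

A production implementation would also update the background model over time to absorb lighting changes; that refinement is omitted here.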
Next, in operation 304, a "scene parsing" process is carried out which identifies a location and a number of person images in the video images. This is accomplished by utilizing a person image, or foreground mask(s), that is generated by the background subtraction carried out in operation 300 of FIG. 3. Additional information will be set forth regarding the "scene parsing" process with reference to FIG. 4. Finally, the head portion is found for each person image in operation 306, which will be set forth in greater detail with reference to FIG. 5.

FIG. 4 illustrates a flow chart for a process of the present invention which carries out the scene parsing operation 304 of FIG. 3. As shown, in operation 400, the subtracted image, or foreground mask(s), is first received as a result of the background subtraction operation 300 of FIG. 3. Next, in operation 402, the foreground mask(s) is filtered using a conventional median filter to create a mass distribution map.

FIG. 4A is an illustration of a mass distribution 404 used in the scene parsing process of FIG. 4. As shown, the mass distribution 404 indicates a number of pixels, or a pixel density, along the horizontal axis of the display that do not represent the background image. In the mass distribution 404 of FIG. 4A, a curve 406 of the mass distribution 404 has a plurality of peaks 408 which represent high concentrations of pixels along the horizontal axis that do not correspond to the background image and, possibly, represent a person image or other objects.

With continuing reference to FIG. 4, in operation 410, portions of the mass distribution 404 are eliminated if they do not surpass a predetermined threshold. This ensures that small peaks 408 of the curve 406 of the mass distribution 404 having a low probability of being a person image are eliminated. Next, it is then determined whether a previous mass distribution 404, or history, is available in memory. Note decision 412.

If a history is available, the location and number of person images in the video images are identified based on a frame difference between the peaks 408 of a previous mass distribution and the peaks 408 of the current mass distribution 404, as indicated in operation 414.
On the other hand, if the history is not available in decision 412, the peaks 408 of the current mass distribution 404 are considered person images in operation 416. In any case, the location and number of person images that are assumed based on the peaks 408 of the mass distribution 404 are stored in operation 418. Further information regarding scene parsing and locating person images in the video images may be found in a U.S. patent application filed Jul. 30, 1999 with the title "SYSTEM, METHOD AND ARTICLE OF MANUFACTURE FOR DETECTING COLLISIONS BETWEEN VIDEO IMAGES GENERATED BY A CAMERA AND AN OBJECT DEPICTED ON A DISPLAY" which is incorporated herein by reference in its entirety. Once the person image(s) have been located in the video images generated by the camera, it is then required that the head portion of each person image be located.
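The scene-parsing idea above can be sketched as follows, assuming a binary foreground mask: project the mask onto the horizontal axis to form the mass distribution, suppress columns below a threshold, and read each surviving run of columns as one person image. The threshold value and run-based peak grouping are illustrative assumptions; the median filtering and history-based frame differencing are omitted.

```python
# Sketch of scene parsing: column sums of the foreground mask form the
# mass distribution; thresholded runs of columns become person candidates.

def mass_distribution(mask):
    """Count foreground pixels in each column of a binary mask."""
    return [sum(col) for col in zip(*mask)]

def find_persons(mask, threshold=1):
    """Return (start, end) column ranges of mass-distribution peaks."""
    dist = mass_distribution(mask)
    persons, start = [], None
    for x, m in enumerate(dist):
        if m > threshold and start is None:
            start = x                       # a peak begins
        elif m <= threshold and start is not None:
            persons.append((start, x - 1))  # the peak ends
            start = None
    if start is not None:
        persons.append((start, len(dist) - 1))
    return persons

mask = [[0, 1, 1, 0, 0, 1, 1],
        [0, 1, 1, 0, 0, 1, 1],
        [0, 1, 1, 0, 0, 1, 1]]
print(find_persons(mask))  # [(1, 2), (5, 6)] -> two person candidates
```

With a stored history, the same peak list from the previous frame would be differenced against the current one to keep person identities stable between frames.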
FIG. 5 illustrates a flow chart for a process of the present invention which carries out operation 306 of FIG. 3. Such process starts in operation 500 by generating a mass-distribution histogram that represents the ext