(12) Patent Application Publication (10) Pub. No.: US 2002/0131641 A1
Luo et al.                          (43) Pub. Date: Sep. 19, 2002
(54) SYSTEM AND METHOD FOR DETERMINING IMAGE SIMILARITY

(76) Inventors: Jiebo Luo, Rochester, NY (US); Wei Zhu, Cambridge, MA (US); George E. Sotak, Mendon, NY (US); Robert T. Gray, Rochester, NY (US); Rajiv Mehrotra, Rochester, NY (US)

Correspondence Address:
Thomas H. Close
Patent Legal Staff
Eastman Kodak Company
343 State Street
Rochester, NY 14650-2201 (US)

(21) Appl. No.: 09/798,604

(22) Filed: Mar. 2, 2001

Related U.S. Application Data

(60) Provisional application No. 60/263,960, filed on Jan. 24, 2001.

Publication Classification

(51) Int. Cl.7 .............. G06K 9/00; G06K 9/54; G06K 9/68; G06K 9/60
(52) U.S. Cl. ............. 382/218; 382/165; 382/305

(57) ABSTRACT
A system and method for determining image similarity. The method includes the steps of automatically providing perceptually significant features of the main subject or background of a first image, automatically providing perceptually significant features of the main subject or background of a second image, automatically comparing the perceptually significant features of the main subject or the background of the first image to those of the main subject or the background of the second image, and providing an output in response thereto. In the illustrative implementation, the features are provided at a number of belief levels, where the number of belief levels is preferably greater than two. The perceptually significant features include color, texture and/or shape. In the preferred embodiment, the main subject is indicated by a continuously valued belief map. The belief values of the main subject are determined by segmenting the image into regions of homogeneous color and texture, computing at least one structural feature and at least one semantic feature for each region, and computing a belief value for all the pixels in the region using a Bayes net to combine the features. In an illustrative application, the inventive method is implemented in an image retrieval system. In this implementation, the inventive method automatically stores perceptually significant features of the main subject or background of a plurality of first images in a database to facilitate retrieval of a target image in response to an input or query image. Features corresponding to each of the plurality of stored images are automatically and sequentially compared to similar features of the query image. Consequently, the present invention provides an automatic system and method for controlling the feature extraction, representation, and feature-based similarity retrieval strategies of a content-based image archival and retrieval system based on an analysis of the main subject and background derived from a continuously valued main subject belief map.
`
[Abstract drawing (duplicate of FIG. 1): input image → image segmentation → feature extraction → belief computation → main subject belief map (MSD)]

Syte - Visual Conception Ltd. Ex. 1010 p. 1
Patent Application Publication  Sep. 19, 2002  Sheet 1 of 8    US 2002/0131641 A1

Figure 1. [Block diagram of the automatic main subject detection system 10: input image 12 → image segmentation 14 → feature extraction 16 → belief computation 18 → main subject belief map (MSD) 22]
Figure 2. [Simplified block diagram 21 of the image feature extraction scheme: image 10 and its belief map B(I) 22 feed image feature extraction; the resulting perceptually significant features are stored in database 40 together with an index structure]
Figure 3. [Flow diagram 51 for identifying perceptually significant colors: input image and belief level image 50 → compute a coherent histogram of the belief level image for each belief level 52 → analyze the coherent histogram to identify perceptually significant colors 54 → combine all coherent histograms to represent the belief level image in terms of the identified perceptually significant colors 56]
Figure 4. [Flow diagram of the alternative method: input image and belief level image 60 → apply the first two steps of FIG. 3 to obtain the initial set of perceptually significant colors of the belief level image for each belief level (52, 54) → extract regions composed of the pixels of colors belonging to the initial set of perceptually significant colors 62 → analyze the regions to determine the final set of perceptually significant colors 64 → represent the belief level image in terms of its perceptually significant colors 66]
Figure 5. [Flow diagram 71 for identifying perceptually significant textures: input image and belief level image 70 → detect color transitions 72 → identify all frequently occurring color transitions 74 → analyze the texture property of the frequently occurring color transitions 76 → represent the belief level image in terms of its perceptually significant textures 78]
Figure 6. [Simplified block diagram 80 of the image retrieval scheme: query image 82 and belief map 84 → query image feature extraction 90 → database and index structure search & feature comparison 92 → retrieved images R1 … Rm (94, 96)]
Figure 7. [Simplified block diagram 80: query image Q 82 and belief map B(Q) 84 → feature extraction 91 → retrieved images R1 … Rm]
Figure 8. [A series of belief level representations, panels (a)-(d) and two further panels, each plotted against belief levels, illustrating numerous options for image retrieval]
`
`
`
SYSTEM AND METHOD FOR DETERMINING
IMAGE SIMILARITY
`
FIELD OF THE INVENTION

0001. The present invention relates to systems and methods for processing images. More specifically, the present invention relates to systems and methods for effecting automatic image retrieval.
`
BACKGROUND OF THE INVENTION

0002. 1. Description of the Related Art

0003. Image-based document retrieval is required for a variety of consumer, commercial and government applications. Originally, images were retrieved manually. However, as image databases became larger, automated image retrieval systems were developed to accelerate the search and retrieval process.
0004. One conventional automated approach involves the association of certain keywords with each image in a database. Images are then retrieved by a keyword search. However, this system suffers from the time-intensive process of keyword input for large databases. In addition, the approach is highly dependent on the somewhat subjective manual assignment of keywords for each image and for the search itself. Finally, there is a limit to how adequately an image can be described to allow for effective searching.
0005. Another approach is automatic content-based image retrieval (CBIR). This system involves an analysis of each stored image with respect to its content (in terms of color, texture, shape, etc.). For example, the color content is stored in a histogram. In the search and retrieval process, the histogram from a query image is compared to the stored histogram data to find a best match. However, this system does not take into account the spatial distribution of the color data.
0006. The most often used approach to searching a database to select/retrieve images similar to a query is to compare the query image with the images in the database using their feature-based representations by means of distance functions. (See U.S. Pat. No. 5,579,471, entitled "Image Query System and Method," issued Nov. 26, 1996 to R. J. Barber et al.; U.S. Pat. No. 5,852,823, entitled "Automatic Image Classification and Retrieval System From Database Using Query-By-Example Paradigm," issued Dec. 22, 1998 to J. S. De Bonet; "Color Indexing," published in Intl. Journal of Computer Vision, by M. J. Swain and D. H. Ballard, Vol. 7, No. 1, 1991, pp. 11-32; and "Comparing Images Using Color Coherence Vectors," published by G. Pass, et al., in Proceedings ACM Multimedia Conf., 1996.)
0007. These techniques represent an image in terms of its depictive features, such as color or texture. Given a query image Q, its feature-based representation is compared against the representation of every image I in the database to compute the similarity of Q and I. The images in the database are then ranked in decreasing order of their similarity with respect to the query image to form the response to the query. A key shortcoming of these techniques is that no distinction is made between perceptually significant and insignificant image features in the image representation and matching schemes.
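As a concrete illustration of the distance-function retrieval described above, the following Python sketch ranks a toy database of normalized color histograms by L1 distance to a query histogram. The function names and the four-bin histograms are hypothetical, not taken from the patent or the cited references.

```python
import numpy as np

def l1_distance(h1, h2):
    """L1 distance between two normalized feature histograms."""
    return float(np.abs(h1 - h2).sum())

def rank_by_similarity(query_hist, database):
    """Rank database images in decreasing order of similarity to the query.

    `database` maps image id -> normalized histogram. A smaller L1
    distance means greater similarity, so it ranks earlier.
    """
    scored = [(img_id, l1_distance(query_hist, h)) for img_id, h in database.items()]
    return [img_id for img_id, _ in sorted(scored, key=lambda t: t[1])]

# Toy 4-bin color histograms (already normalized to sum to 1).
q = np.array([0.5, 0.3, 0.1, 0.1])
db = {
    "A": np.array([0.5, 0.3, 0.1, 0.1]),  # identical to the query
    "B": np.array([0.1, 0.1, 0.3, 0.5]),  # very different
    "C": np.array([0.4, 0.3, 0.2, 0.1]),  # close
}
print(rank_by_similarity(q, db))  # → ['A', 'C', 'B']
```

Note that this sketch compares whole-image features only, which is exactly the shortcoming the patent goes on to address: nothing here distinguishes perceptually significant features, or main subject from background.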
`
0008. In general, a human observer determines the content-based similarity of two images primarily on the basis of the perceptually significant contents of the image and not the finer details. By mimicking this behavior, a similarity retrieval system might produce results that are in closer agreement with the human interpretation of similarity. However, this fact has not been exploited by any of the above-mentioned techniques.
0009. In a copending U.S. patent application entitled "Perceptually Significant Feature-based Image Archival and Retrieval," U.S. Ser. No. filed Apr. 14, 1999 by Wei Zhu and Rajiv Mehrotra, the teachings of which are incorporated herein by reference, Zhu et al. attempt to overcome the above-mentioned shortcoming by representing an image in terms of its perceptually significant features. Thus, the similarity of two images becomes a function of the similarity of their perceptually significant features.
0010. However, in this approach, image features are extracted from the properties of the entire image. There is no flexibility in computing image features or comparing image similarities based on main subject or background regions. As a result, more targeted searches, such as finding images with main subjects similar to the query but with dissimilar backgrounds, cannot be performed.
0011. Recently, U.S. Pat. No. 6,038,365, entitled "Image Retrieval-Oriented Processing Apparatus Which Generates and Displays Search Image Data That Is Used As Index," was issued to T. Yamagami on Mar. 14, 2000. An image processing apparatus according to this invention includes a designating unit for designating an image area to be used as a retrieval image from a recorded image recorded on a recording medium, a storing unit for storing image area data representing the image area designated by the designating unit in connection with the corresponding recorded image, and a displaying unit for displaying, as the retrieval image, an image of the image area on the basis of the corresponding image area data stored in the storing unit.
0012. Further, an image processing apparatus according to Yamagami's invention includes a designating unit for designating an image area from an original image constituting a screen as a retrieval image, a storing unit for storing the retrieval image designated by the designating unit in connection with the corresponding original image, a displaying unit for displaying the retrieval image designated by the designating unit, an instructing unit for instructing the retrieval image displayed by the displaying unit, and a display control unit for displaying, on the displaying unit, the original image corresponding to the retrieval image instructed by the instructing unit.
0013. Hence, Yamagami appears to disclose the use of a selected area of an image for image retrieval. However, the selection is done manually using a designating unit. Further, the use of the selected area is motivated by an image reduction problem that makes characters too small to read. Since image data can generally be recognized only when a human being looks at it, when image data are reproduced, a list of a plurality of reduced images may be displayed so that the user can check the contents of image files, using the reduced images themselves as the retrieval images. However, in retrieval display of reduced images, since an entire image is simply reduced to, for example, one eighth in both its longitudinal and lateral dimensions, the reduced image may be too small to be recognized easily, making the use of that reduced image as a retrieval image impossible.
0014. Consequently, Yamagami does not teach an automatic, general-purpose image retrieval apparatus. Nor is Yamagami's invention built upon an automatic scene-content analysis scheme. Accordingly, a need remains in the art for a more accurate system or method for automatically retrieving images from a database.
`
SUMMARY OF THE INVENTION

0015. The need in the art is addressed by the system and method for determining image similarity of the present invention. The inventive method includes the steps of automatically providing perceptually significant features of the main subject or background of a first image, automatically providing perceptually significant features of the main subject or background of a second image, automatically comparing the perceptually significant features of the main subject or the background of the first image to those of the main subject or the background of the second image, and providing an output in response thereto.
0016. In the illustrative implementation, the features are provided at a number of belief levels, where the number of belief levels is preferably greater than two. In the illustrative embodiment, the step of automatically providing perceptually significant features of the main subject or background of the first image includes the steps of automatically identifying the main subject or background of the first image and identifying perceptually significant features of the main subject or the background of the first image. Further, the step of automatically providing perceptually significant features of the main subject or background of the second image includes the steps of automatically identifying the main subject or background of the second image and identifying perceptually significant features of the main subject or the background of the second image.
0017. The perceptually significant features may include color, texture and/or shape. In the preferred embodiment, the main subject is indicated by a continuously valued belief map. The belief values of the main subject are determined by segmenting the image into regions of homogeneous color and texture, computing at least one structural feature and at least one semantic feature for each region, and computing a belief value for all the pixels in the region using a Bayes net to combine the features.
0018. In an illustrative application, the inventive method is implemented in an image retrieval system. In this implementation, the inventive method automatically stores perceptually significant features of the main subject or background of a plurality of first images in a database to facilitate retrieval of a target image in response to an input or query image. Features corresponding to each of the plurality of stored images are automatically and sequentially compared to similar features of the query image. Consequently, the present invention provides an automatic system and method for controlling the feature extraction, representation, and feature-based similarity retrieval strategies of a content-based image archival and retrieval system based on an analysis of the main subject and background derived from a continuously valued main subject belief map.
`
BRIEF DESCRIPTION OF THE DRAWINGS

0019. FIG. 1 is a block diagram of an illustrative embodiment of an automatic main subject detection system.

0020. FIG. 2 is a simplified block diagram of a general scheme for image feature extraction in accordance with the teachings of the present invention.

0021. FIG. 3 is a flow diagram showing an illustrative embodiment of a method for identifying perceptually significant colors of a belief level image in accordance with the teachings of the present invention.

0022. FIG. 4 is a flow diagram showing an illustrative alternative embodiment of a method for identifying perceptually significant colors of a belief level image in accordance with the teachings of the present invention.

0023. FIG. 5 is a flow diagram of an illustrative method for identifying perceptually significant textures in accordance with the teachings of the present invention.

0024. FIG. 6 and FIG. 7 are simplified block diagrams of a general scheme for image retrieval implemented in accordance with the teachings of the present invention.

0025. FIG. 8 is a diagram showing a series of belief level representations illustrative of numerous options for image retrieval in accordance with the teachings of the present invention.
`
DESCRIPTION OF THE INVENTION

0026. Illustrative embodiments and exemplary applications will now be described with reference to the accompanying drawings to disclose the advantageous teachings of the present invention.

0027. While the present invention is described herein with reference to illustrative embodiments for particular applications, it should be understood that the invention is not limited thereto. Those having ordinary skill in the art and access to the teachings provided herein will recognize additional modifications, applications, and embodiments within the scope thereof and additional fields in which the present invention would be of significant utility.
0028. As discussed more fully below, the present invention automatically determines image similarity according to an analysis of the main subject in the scene. A system for detecting main subjects (i.e., main subject detection or "MSD") in a consumer-type photographic image from the perspective of a third-party observer is described in copending U.S. patent application Ser. No. 09/223,860, filed Dec. 31, 1998, by J. Luo et al. and entitled "METHOD FOR AUTOMATIC DETERMINATION OF MAIN SUBJECTS IN PHOTOGRAPHIC IMAGES" (Atty. Docket No. 78783), the teachings of which are incorporated herein by reference.
0029. Main subject detection provides a measure of saliency or relative importance for different regions that are associated with different subjects in an image. Main subject detection enables a discriminative treatment of the scene content for a number of applications related to consumer photographic images, including automatic content-based image retrieval.
0030. Conventional wisdom in the field of computer vision, which reflects how a human observer would perform such tasks as main subject detection and cropping, calls for a problem-solving path via object recognition and scene-content determination according to the semantic meaning of recognized objects. However, generic object recognition remains a largely unsolved problem despite decades of effort from academia and industry.
0031. The MSD system is built upon mostly low-level vision features, with semantic information integrated whenever available. This MSD system has a number of components, including region segmentation, feature extraction, and probabilistic reasoning. In particular, a large number of features are extracted for each segmented region in the image to represent a wide variety of visual saliency properties, which are then input into a tunable, extensible probability network to generate a belief map containing a continuum of values.
0032. Using MSD, regions that belong to the main subject are generally differentiated from the background clutter in the image. Thus, selective retrieval according to similar main subjects or similar backgrounds becomes possible. It even becomes possible to perform selective retrieval according to dissimilar main subjects or dissimilar backgrounds.
0033. Automatic subject-based image indexing is a nontrivial operation that would be considered impossible for unconstrained images, which do not necessarily contain a uniform background, without a certain amount of scene understanding and scene-content differentiation. In the absence of automatic subject/background segmentation, conventional systems either have to rely on a manually created mask to outline where the main subject is, or have no capability of subject-based image retrieval at all. The manual procedure is laborious and therefore not feasible for commercial mass processing for consumers.
0034. FIG. 1 is a block diagram illustrative of an embodiment of an automatic main subject detection system implemented in accordance with the teachings of the above-referenced application filed by Luo et al. In accordance with the system 10' of Luo et al., an input image 12' is first segmented into a few regions of homogeneous properties (e.g., color and texture) in an image segmentation process step 14'. Next, the regions are evaluated for their saliency in terms of two independent but complementary types of features, structural features and semantic features, in a feature extraction process step 16'. For example, recognition of human skin or faces is semantic, while determination of what stands out from the background clutter is categorized as structural. For structural features, a set of low-level vision features and a set of geometric features are extracted. For semantic features, key subject matters frequently seen in photographic pictures are detected. In a belief computation process step 18', evidence of both types of features is integrated using a Bayes net-based reasoning engine to yield the final belief map 22' of the main subject. For a reference on Bayes nets, see J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, San Francisco, Calif., 1988.
0035. One structural feature is centrality. In terms of location, the main subject tends to be located near the center instead of the periphery of the image, though not necessarily right at the center of the image. In fact, professional photographers tend to position the main subject at the horizontal gold-partition positions ("rule of thirds").
0036. It is recognized that the centroid of a region alone may not be sufficient to indicate the location of the region without any indication of its size and shape. A centrality measure is defined by computing the integral of a probability density function (PDF) over the area of a given region. The PDF is derived from ground truth data, in which the main subject regions are manually outlined and marked by a value of 1 and the background regions are marked by a value of 0, by summing up the ground truth maps over the entire training set. In essence, the PDF represents the distribution of main subjects in terms of location.
0037. In accordance with the present teachings, a centrality measure is devised such that every pixel of a given region, not just the centroid, contributes to the centrality measure of the region to a varying degree depending on its location. The centrality measure is defined as:

centrality = (1/N_R) · Σ_{(x, y) ∈ R} PDF_MSDLocation(x, y)    (1)
`
0038. where (x, y) denotes a pixel in the region R and N_R is the number of pixels in region R. If the orientation is unknown, the PDF is symmetric about the center of the image in both the vertical and horizontal directions, which results in an orientation-independent centrality measure. If the orientation is known, the PDF is symmetric about the center of the image in the horizontal direction but not in the vertical direction, which results in an orientation-dependent centrality measure.
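The centrality computation of equation (1) can be sketched as follows, assuming the location PDF has already been derived from ground truth and is available as a 2-D array; the PDF values and region masks below are illustrative toy data, not from the patent.

```python
import numpy as np

def centrality(pdf, region_mask):
    """Centrality of a region: mean of the location PDF over the region's
    pixels, so every pixel contributes, not just the centroid."""
    n_r = int(region_mask.sum())
    return float(pdf[region_mask].sum() / n_r)

# Toy 5x5 location PDF peaked at the image center (hypothetical values;
# the patent derives the real PDF by summing ground-truth main-subject maps).
pdf = np.array([
    [0.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.2, 0.4, 0.2, 0.0],
    [0.1, 0.4, 1.0, 0.4, 0.1],
    [0.0, 0.2, 0.4, 0.2, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.0],
])
central_region = np.zeros((5, 5), dtype=bool)
central_region[1:4, 1:4] = True   # 3x3 block around the center
border_region = ~central_region   # everything else

# A centrally located region scores higher than one hugging the borders.
print(centrality(pdf, central_region) > centrality(pdf, border_region))  # → True
```

Because the measure averages the PDF over all pixels of the region, two regions with the same centroid but different extents generally receive different centrality values, which is exactly the point of paragraph 0037.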
0039. Another structural feature is borderness. Many background regions tend to contact one or more of the image borders. Therefore, a region that has a significant amount of its contour on the image borders tends to belong to the background. In accordance with the present teachings, two measures are used to characterize the borderness of a region: the percentage of its perimeter along the image border(s) and the number of image borders that the region intersects.
0040. When orientation is unknown, one borderness feature, borderness_1, places each region in one of six categories determined by the number and configuration of image borders the region is "in contact" with. A region is "in contact" with a border when at least one pixel in the region falls within a fixed distance of the border of the image. Distance is expressed as a fraction of the shorter dimension of the image. The six categories for borderness_1 are: none, one border, two touching borders, two facing borders, three borders, and four borders.
0041. Knowing the image orientation allows us to redefine the borderness feature to account for the fact that regions in contact with the top border are much more likely to be background than regions in contact with the bottom border. This results in 12 categories for borderness_1, determined by the number and configuration of image borders the region is "in contact" with, using the definition of "in contact" from above. The four borders of the image are labeled "Top", "Bottom", "Left", and "Right" according to their position when the image is oriented with objects in the scene standing upright.
0042. A second borderness feature, borderness_2, is defined to indicate what fraction of the region perimeter is on the image border. Because such a fraction cannot exceed 0.5, the following definition is used to normalize the feature value to [0, 1]:

borderness_2 = 2 × (number of region perimeter pixels on image border) / (number of region perimeter pixels)    (2)
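A minimal sketch of the two borderness measures is given below, assuming a simple set-based encoding of which borders a region touches. The category names follow paragraph 0040; the function names and encoding are hypothetical.

```python
def borderness_2(perimeter_pixels_on_border, perimeter_pixels):
    """Fraction of the region perimeter on the image border, scaled by 2
    to normalize into [0, 1] (the raw fraction cannot exceed 0.5)."""
    return 2.0 * perimeter_pixels_on_border / perimeter_pixels

def borderness_1_category(borders_touched):
    """Orientation-unknown borderness_1: six categories from the set of
    image borders ('top', 'bottom', 'left', 'right') a region touches."""
    n = len(borders_touched)
    if n == 0:
        return "none"
    if n == 1:
        return "one border"
    if n == 2:
        # 'two facing borders' means opposite sides of the image.
        facing = borders_touched in ({"top", "bottom"}, {"left", "right"})
        return "two facing borders" if facing else "two touching borders"
    return "three borders" if n == 3 else "four borders"

print(borderness_2(20, 80))                      # → 0.5
print(borderness_1_category({"top", "left"}))    # → two touching borders
```

In the orientation-known case of paragraph 0041, the same set encoding would simply be kept as-is (12 configurations) rather than collapsed into six categories.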
`
0043. Yet another structural feature may be depth. In general, the depth of all the objects in the scene is not available. However, if available, for example through a range finder, such a feature is valuable for differentiating the main subject from the background, because the main subject tends to be in the foreground and closer to the observer. Note, however, that objects in the foreground may not necessarily be the main subject.
0044. One semantic feature is skin. According to a study of a photographic image database of over 2,000 images, over 70% of the photographic images have people in them, and about the same number of images have sizable faces in them. Indeed, people are the single most important subject in photographs.
0045. The current skin detection algorithm utilizes color image segmentation and a pre-determined skin distribution in a specific chrominance space, P(skin | chrominance). It is known that the largest variation between different races is along the luminance direction, and the impact of illumination sources is also primarily in the luminance direction. The skin region classification is based on maximum probability according to the average color of a segmented region. The probabilities are mapped to a belief output via a sigmoid belief function.
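The sigmoid mapping from skin probability to belief can be sketched as below. The patent does not specify the sigmoid's parameters, so the midpoint and steepness values here are illustrative assumptions.

```python
import math

def skin_belief(p_skin, midpoint=0.5, steepness=10.0):
    """Map a skin probability in [0, 1] to a belief value via a sigmoid.
    `midpoint` and `steepness` are hypothetical parameters: probabilities
    above the midpoint are pushed toward belief 1, those below toward 0."""
    return 1.0 / (1.0 + math.exp(-steepness * (p_skin - midpoint)))

# Regions with high skin probability get belief near 1, low near 0.
print(round(skin_belief(0.9), 3))  # → 0.982
print(round(skin_belief(0.1), 3))  # → 0.018
```

The soft transition is the point of using a sigmoid here: a region whose average color is borderline skin receives an intermediate belief rather than a hard yes/no classification.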
0046. The task of main subject detection, therefore, is to determine the likelihood of a given region in the image being the main subject based on the posterior probability P(MSD | feature). Note that there is one Bayes net active for each region in the image. In other words, the reasoning is performed on a per-region basis (instead of per image).
0047. The output of the MSD operation is a list of segmented regions ranked in descending order of their likelihood (or belief) as potential main subjects for a generic or specific application. This list can be readily converted into a map in which the brightness of a region is proportional to the main subject belief of the region. Therefore, this map is called a main subject "belief map." This belief map is more than a binary map that only indicates the location of the determined main subject. The associated likelihood is also attached to each region, so that regions with large values correspond to regions with high confidence or belief of being part of the main subject.
0048. To some extent, this belief map reflects the inherent uncertainty for humans performing such a task as MSD, because different observers may disagree on certain subject matters while agreeing on others in terms of main subjects. However, a binary decision, when desired, can be readily obtained by applying an appropriate threshold to the belief map. Moreover, the belief information may be very useful for downstream applications. For example, different weighting factors can be assigned to different regions (subject matters) in determining the amount of emphasis on subject or background.
0049. For determination of subject and background, the present invention can also use the main subject belief map instead of a binarized version of the map, to avoid making a suboptimal decision about main subject and background that is visually incorrect. A binary decision on what to include and what not to include, once made, leaves little room for error. For example, even if portions of the main subject are not assigned the highest belief, with a gradual (as opposed to binary) emphasizing process, it is likely they would retain some importance. In other words, if an undesirable binary decision on what to include/exclude is made, there is no recourse to correct the mistake. Consequently, the accuracy of the retrieval becomes sensitive to the robustness of the automatic MSD method and the threshold used to obtain the binary decision. With a continuous-valued main subject belief map, every region or object is associated with a likelihood of being emphasized or de-emphasized. Moreover, secondary main subjects are indicated by intermediate belief values in the main subject belief map and can be somewhat emphasized according to a descending order of belief values, while the main subject of highest belief values is emphasized the most.
0050. After the main subject belief map is created, a multilevel belief map can be derived from the continuously valued main subject belief map by multi-level thresholding or clustering. This process creates a step-valued belief map, which characterizes a gradual but discrete belief transition from definite main subject, to most likely main subject, all the way down to definite background. Those skilled in the art may note that, within the scope of this invention, the number of discrete belief levels (N) can be any integer value between 2 (binary decision) and the original resolution of the continuous belief map. After the multi-level belief map is created, in order to allow image similarity computation based on main subject regions or background regions of the image, or a combination thereof, image features are computed for each of the N discrete levels of the belief map. Together with the original image, each level of the belief map acts as a mask that selects only those pixels that belong to that particular belief level from the original image, and perceptually significant features for the pixels that belong to that particular level are computed. Henceforth, an image masked for a particular belief level will be referred to as a "belief level image." According to the present invention, the preferred features for the representation of each belief level of an image are color and texture. Those skilled in the art should note that additional features such as shape can be used without departing from the scope of this invention.
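The multilevel thresholding and per-level masking described above can be sketched as follows, using uniform quantization as one simple stand-in for the thresholding or clustering step. The function names and the 2x2 toy arrays are hypothetical.

```python
import numpy as np

def quantize_belief_map(belief, n_levels):
    """Quantize a continuous-valued belief map in [0, 1] into n_levels
    discrete levels by uniform thresholding (clustering would also work)."""
    return np.minimum((belief * n_levels).astype(int), n_levels - 1)

def belief_level_image(image, levels, level):
    """Mask the image so only pixels of one belief level remain; features
    are then computed over this 'belief level image'."""
    mask = levels == level
    out = np.zeros_like(image)
    out[mask] = image[mask]
    return out, mask

belief = np.array([[0.95, 0.60],
                   [0.30, 0.05]])     # continuous main subject belief map
image = np.array([[200, 150],
                  [100, 50]])         # toy single-channel image

levels = quantize_belief_map(belief, 3)   # 0 = background … 2 = main subject
print(levels.tolist())                    # → [[2, 1], [0, 0]]

masked, mask = belief_level_image(image, levels, 2)
print(masked.tolist())                    # → [[200, 0], [0, 0]]
```

Perceptually significant color and texture features would then be computed once per level, on each of the N masked images, in line with paragraph 0050.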
0051. In accordance with the present teachings, the inventive method includes the steps of automatically providing perceptually significant features of the main subject or background of a first image; automatically pr