Query by Image and Video Content: The QBIC System

Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, and Peter Yanker
IBM Almaden Research Center

Picture yourself as a fashion designer needing images of fabrics with a particular mixture of colors, a museum cataloger looking for artifacts of a particular shape and textured pattern, or a movie producer needing a video clip of a red car-like object moving from right to left with the camera zooming. How do you find these images? Even though today's technology enables us to acquire, manipulate, transmit, and store vast on-line image and video collections, the search methodologies used to find pictorial information are still limited due to difficult research problems (see "Semantic versus nonsemantic" sidebar). Typically, these methodologies depend on file IDs, keywords, or text associated with the images. And, although powerful, they

• don't allow queries based directly on the visual properties of the images,
• are dependent on the particular vocabulary used, and
• don't provide queries for images similar to a given image.

Research on ways to extend and improve query methods for image databases is widespread, and results have been presented in workshops, conferences,1,2 and surveys.
We have developed the QBIC (Query by Image Content) system to explore content-based retrieval methods. QBIC allows queries on large image and video databases based on

• example images,
• user-constructed sketches and drawings,
• selected color and texture patterns,
• camera and object motion, and
• other graphical information.
QBIC* lets users find pictorial information in large image and video databases based on color, shape, texture, and sketches. QBIC technology is part of several IBM products.

*To run an interactive query, visit the QBIC Web server at http://wwwqbic.almaden.ibm.com.
Semantic versus nonsemantic information

At first glance, content-based querying appears deceptively simple because we humans seem to be so good at it. If a program can be written to extract semantically relevant text phrases from images, the problem may be solved by using currently available text-search technology. Unfortunately, in an unconstrained environment, the task of writing this program is beyond the reach of current technology in image understanding. At an artificial intelligence conference several years ago, a challenge was issued to the audience to write a program that would identify all the dogs pictured in a children's book, a task most 3-year-olds can easily accomplish. Nobody in the audience accepted the challenge, and this remains an open problem.

Perceptual organization—the process of grouping image features into meaningful objects and attaching semantic descriptions to scenes through model matching—is an unsolved problem in image understanding. Humans are much better than computers at extracting semantic descriptions from pictures. Computers, however, are better than humans at measuring properties and retaining these in long-term memory.

One of the guiding principles used by QBIC is to let computers do what they do best—quantifiable measurement—and let humans do what they do best—attaching semantic meaning. QBIC can find "fish-shaped objects," since shape is a measurable property that can be extracted. However, since fish occur in many shapes, the only fish that will be found will have a shape close to the drawn shape. This is not the same as the much harder semantical query of finding all the pictures of fish in a pictorial database.
Figure 1. QBIC query by drawn color. Drawn query specification on left; best 21 results sorted by similarity to the query on right. The results were selected from a 12,968-picture database.

Figure 2. QBIC database population (top) and query (bottom) architecture.
Figure 3. QBIC still image population interface. Entry for scene text at top. Tools in row are polygon outliner, rectangle outliner, ellipse outliner, paintbrush, eraser, line drawing, object translation, flood fill, and snake outliner.
Two key properties of QBIC are (1) its use of image and video content—computable properties of color, texture, shape, and motion of images, videos, and their objects—in the queries, and (2) its graphical query language in which queries are posed by drawing, selecting, and other graphical means. Related systems, such as MIT's Photobook3 and the Trademark and Art Museum applications from ETL,4 also address these common issues. This article describes the QBIC system and demonstrates its query capabilities.

QBIC SYSTEM OVERVIEW

Figure 1 illustrates a typical QBIC query.** The left side shows the query specification, where the user painted a large magenta circular area on a green background using standard drawing tools. Query results are shown on the right: an ordered list of "hits" similar to the query. The order of the results is top to bottom, then left to right, to support horizontal scrolling. In general, all queries follow this model in that the query is specified by using graphical means—drawing, selecting from a color wheel, selecting a sample image, and so on—and results are displayed as an ordered set of images.

To achieve this functionality, QBIC has two main components: database population (the process of creating an image database) and database query. During the population, images and videos are processed to extract features describing their content—colors, textures, shapes, and camera and object motion—and the features are stored in a database. During the query, the user composes a query graphically. Features are generated from the graphical query and then input to a matching engine that finds images or videos from the database with similar features. Figure 2 shows the system architecture.
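To make the two-part architecture concrete, here is a minimal Python sketch of the population/query split. The feature extractor and in-memory store are illustrative stand-ins (a coarse average-color feature and a linear scan), not QBIC's actual implementation.

```python
import numpy as np

def extract_features(image):
    """Placeholder for QBIC's color/texture/shape extractors: here we just
    use the mean RGB color of the image as a 3D feature vector."""
    return np.asarray(image, dtype=float).reshape(-1, 3).mean(axis=0)

class FeatureStore:
    """Toy stand-in for the feature database in Figure 2."""
    def __init__(self):
        self.ids, self.features = [], []

    def populate(self, image_id, image):
        # Database population: features are extracted once, ahead of query time.
        self.ids.append(image_id)
        self.features.append(extract_features(image))

    def query(self, query_image, k=10):
        # Query: features are computed from the query specification and
        # compared against stored features with a distance function.
        q = extract_features(query_image)
        dists = [np.linalg.norm(q - f) for f in self.features]
        order = np.argsort(dists)[:k]
        return [(self.ids[i], dists[i]) for i in order]
```

In this sketch the query result is an ordered list of (image id, distance) pairs, mirroring the ranked "hits" display described above.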
Data model

For both population and query, the QBIC data model has

• still images or scenes (full images) that contain objects (subsets of an image), and
• video shots that consist of sets of contiguous frames and contain motion objects.
For still images, the QBIC data model distinguishes between "scenes" (or images) and "objects." A scene is an image or single representative frame of video. An object is a part of a scene—for example, the fox in Figure 3—or a moving entity in a video. For still image database population, features are extracted from images and objects and stored in a database as shown in the top left part of Figure 2.

Videos are broken into clips called shots. Representative frames, or r-frames, are generated for each extracted shot. R-frames are treated as still images, and features are extracted and stored in the database. Further processing of shots generates motion objects—for example, a car moving across the screen.
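As an illustration only (the field names below are invented, not QBIC's internal schema), this scene/object/shot vocabulary could be captured with a few record types:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple
import numpy as np

@dataclass
class SceneObject:
    """An object: a subset of a scene represented by a binary mask."""
    mask: np.ndarray                       # boolean mask, same size as the scene
    text: Optional[str] = None             # e.g. "fox"
    features: dict = field(default_factory=dict)

@dataclass
class Scene:
    """A still image or a single representative frame (r-frame) of a video."""
    image_id: str
    text: Optional[str] = None             # e.g. "baby on beach"
    objects: List[SceneObject] = field(default_factory=list)
    features: dict = field(default_factory=dict)

@dataclass
class Shot:
    """A set of contiguous video frames with an r-frame and motion objects."""
    frame_range: Tuple[int, int]           # (first frame, last frame)
    r_frame: Scene
    motion_objects: List[SceneObject] = field(default_factory=list)
```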
Queries are allowed on objects ("Find images with a red, round object"), scenes ("Find images that have approximately 30-percent red and 15-percent blue colors"), shots ("Find all shots panning from left to right"), or any combination ("Find images that have 30 percent red and contain a blue textured object").

In QBIC, similarity queries are done against the database of pre-extracted features using distance functions between the features. These functions are intended to mimic human perception to approximate a perceptual ordering of the database. Figure 2 shows the match engine, the collection of all distance functions. The match engine interacts with a filtering/indexing module (see "Fast searching and indexing" sidebar, next page) to support fast searching methodologies such as indexing. Users interact with the query interface to generate a query specification, resulting in the features that define the query.
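One concrete example of such a distance, used for color histograms in the QBIC work, is the quadratic form d(x, y) = (x - y)^T A (x - y), where A[i][j] encodes how perceptually similar colors i and j are, so that mass falling into nearby bins is penalized less than mass in very different colors. The similarity matrix below is a simple assumed construction, not the exact one used by the system.

```python
import numpy as np

def color_similarity_matrix(palette):
    """palette: (n, 3) array of representative bin colors (e.g. RGB or Lab).
    Returns A with 1 on the diagonal and smaller values for unlike colors."""
    d = np.linalg.norm(palette[:, None, :] - palette[None, :, :], axis=-1)
    return 1.0 - d / max(d.max(), 1e-9)

def histogram_distance(x, y, A):
    """Quadratic-form distance between two (normalized) color histograms."""
    diff = x - y
    return float(diff @ A @ diff)
```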
DATABASE POPULATION

In still image database population, the images are reduced to a standard-sized icon called a thumbnail and annotated with any available text information. Object identification is an optional but key part of this step. It lets users manually, semiautomatically, or fully automatically identify interesting regions—which we call objects—in the images. Internally, each object is represented as a binary mask. There may be an arbitrary number of objects per image. Objects can overlap and can consist of multiple disconnected components like the set of dots on a polka-dot dress. Text, like "baby on beach," can be associated with an outlined object or with the scene as a whole.
** The scene image database used in the figures consists of about 7,450 images from the Mediasource Series of images and audio from Applied Optical Media Corp., 4,100 images from the PhotoDisc sampler CD, 950 images from the Corel Professional Photo CD collection, and 450 images from an IBM collection.
Object-outlining tools

Ideally, object identification would be automatic, but this is generally difficult. The alternative—manual identification—is tedious and can inhibit query-by-content applications.
Fast searching and indexing

Indexing tabular data for exact matching or range searches in traditional databases is a well-understood problem, and structures like B-trees provide efficient access mechanisms. In this scenario, indexing assures sublinear search while maintaining completeness; that is, all records satisfying the query are returned without the need for examining each record in the database. However, in the context of similarity matching for visual content, traditional indexing methods may not be appropriate. For queries in which similarity is defined as a distance metric in high-dimensional feature spaces (for example, color histogram queries), indexing involves clustering and indexable representations of the clusters. In the case of queries that combine similarity matching with spatial constraints on objects, the problem is more involved. Data structures for fast access of high-dimensional features for spatial relationships must be invented.

In a query, features from the database are compared to corresponding features from the query specification to determine which images are a good match. For a small database, sequential scanning of the features followed by straightforward similarity computations is adequate. But as the database grows, this combination can be too slow. To speed up the queries, we have investigated a variety of techniques. Two of the most promising follow.
Filtering

A computationally fast filter is applied to all data, and only items that pass through the filter are operated on by the second stage, which computes the true similarity metric. For example, in QBIC we have shown that color histogram matching, which is based on a 256-dimensional color histogram and requires a 256 matrix-vector multiply, can be made efficient by filtering. The filtering step employs a much faster computation in a 3D space with no loss in accuracy. Thus, for a query on a database of 10,000 elements, the fast filter is applied to produce the best 1,000 color histogram matches. These filtered histograms are subsequently passed to the slower complete matching operation to obtain, say, the best 200 matches to display to a user, with the guarantee that the global best 200 in the database have been found.
Indexing

For low-dimensional features such as average color and texture (each 3D), multidimensional indexing methods such as R*-trees can be used. For high-dimensional features—for example, our 20-dimensional moment-based shape feature vector—the dimensionality is reduced using the K-L, or principal component, transform. This produces a low-dimensional space, as low as two or three dimensions, which could be indexed by using R*-trees.
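The K-L (principal component) reduction can be sketched with a standard SVD-based projection; the reduced vectors would then go into a multidimensional index such as an R*-tree, which is omitted here.

```python
import numpy as np

def kl_reduce(features, k=3):
    """Project high-dimensional feature vectors (rows of `features`) onto
    their top-k principal components, a stand-in for the K-L transform.
    Returns the reduced vectors plus the mean and basis needed to project
    query vectors the same way."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Right singular vectors are the principal directions of the data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                    # (k, d) projection matrix
    reduced = centered @ basis.T      # (n, k) low-dimensional features
    return reduced, mean, basis

# Usage with the 20-dimensional shape features mentioned above:
#   reduced, mean, basis = kl_reduce(shape_features, k=3)
#   q_reduced = (query_vector - mean) @ basis.T   # then look up in the index
```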
As a result, we have devoted considerable effort to developing tools to aid in this step. In recent work, we have successfully used fully automatic unsupervised segmentation methods along with a foreground/background model to identify objects in a restricted class of images. The images, typical of museums and retail catalogs, have a small number of foreground objects on a generally separable background. Figure 4 shows example results. Even in this domain, robust algorithms are required because of the textured and variegated backgrounds.

Figure 4. Top row is the original image. Bottom row contains the automatically extracted objects using a foreground/background model. Heuristics encode the knowledge that objects tend to be in the center of the picture.
We also provide semiautomatic tools for identifying objects. One is an enhanced flood-fill technique. Flood-fill methods, found in most photo-editing programs, start from a single object pixel and repeatedly add adjacent pixels whose values are within some given threshold of the original pixel. Selecting the threshold, which must change from image to image and object to object, is tedious. We automatically calculate a dynamic threshold by having the user click on background as well as object points. For reasonably uniform objects that are distinct from the background, this operation allows fast object identification without manually adjusting a threshold. The example in Figure 3 shows an object, a fox, identified by using only a few clicks.
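A sketch of this dynamic-threshold flood fill on a grayscale image follows. The article does not spell out how the threshold is computed from the clicked points, so the midpoint rule below (half the gap between the mean object value and the mean background value) is an assumption.

```python
from collections import deque
import numpy as np

def flood_fill_dynamic(img, object_seeds, background_seeds):
    """Grow a region outward from the clicked object pixels, accepting
    neighbors whose values stay within a threshold of the object samples.
    The threshold is derived from both object and background clicks
    (assumed rule; see lead-in). Seeds are (row, col) tuples."""
    obj_val = np.mean([img[p] for p in object_seeds])
    bg_val = np.mean([img[p] for p in background_seeds])
    threshold = abs(obj_val - bg_val) / 2.0

    mask = np.zeros(img.shape, dtype=bool)
    queue = deque(object_seeds)
    while queue:
        r, c = queue.popleft()
        if mask[r, c]:
            continue
        mask[r, c] = True
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if (0 <= rr < img.shape[0] and 0 <= cc < img.shape[1]
                    and not mask[rr, cc]
                    and abs(float(img[rr, cc]) - obj_val) <= threshold):
                queue.append((rr, cc))
    return mask
```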
We designed another outlining tool to help users track object edges. This tool takes a user-drawn curve and automatically aligns it with nearby image edges. Based on the "snakes" concept developed in recent computer vision research, the tool finds the curve that maximizes the image gradient magnitude along the curve.

The spline snake formulation we use allows for smooth solutions to the resulting nonlinear minimization problem. The computation is done at interactive speeds so that, as the user draws a curve, it is "rubber-banded" to lie along object boundaries.
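The full spline-snake minimization is beyond a short example, but the core behavior, pulling a drawn curve toward strong edges, can be sketched as a greedy local search over the gradient-magnitude image. This is only an illustration; the actual formulation solves a smooth nonlinear minimization and also enforces curve smoothness.

```python
import numpy as np

def align_curve_to_edges(gradient_mag, curve, window=2, iterations=5):
    """Nudge each user-drawn curve point (row, col) toward the nearby pixel
    with the largest gradient magnitude, repeating a few times."""
    pts = [tuple(p) for p in curve]
    h, w = gradient_mag.shape
    for _ in range(iterations):
        new_pts = []
        for r, c in pts:
            best, best_val = (r, c), gradient_mag[r, c]
            for dr in range(-window, window + 1):
                for dc in range(-window, window + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and gradient_mag[rr, cc] > best_val:
                        best, best_val = (rr, cc), gradient_mag[rr, cc]
            new_pts.append(best)
        pts = new_pts
    return pts
```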
Video data

For video data, database population has three major components:

• shot detection,
• representative frame creation for each shot, and
• derivation of a layered representation of coherently moving structures/objects.
Shots are short sequences of contiguous frames that we use for annotation and querying. For instance, a video clip may consist of a shot smoothly panning over the skyline of San Francisco, switching to a panning shot of the Bay meeting the ocean, and then to one that zooms to the Golden Gate Bridge. In general, a set of contiguous frames may be grouped into a shot because they

• depict the same scene,
• signify a single camera operation,
• contain a distinct event or an action like a significant presence and persistence of an object, or
• are chosen as a single indexable entity by the user.

Our effort is to detect many shots automatically in a preprocessing step and provide an easy-to-use interface for the rest.
Figure 5. Scene cuts automatically extracted from a 1,148-frame sales demo from Energy Productions.

SHOT DETECTION. Gross scene changes or scene cuts are the first indicators of shot boundaries. Methods for detecting scene cuts proposed in the literature essentially fall into two classes: (1) those based on global representations like color/intensity histogram
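As a sketch of the first class of methods (global color/intensity histograms), a cut can be flagged whenever the histogram difference between consecutive frames exceeds a threshold. The difference measure, bin count, and threshold below are illustrative assumptions, not the system's actual settings.

```python
import numpy as np

def detect_cuts(frames, bins=64, threshold=0.4):
    """Return indices i such that a shot boundary is detected between
    frame i-1 and frame i, based on the L1 distance between normalized
    intensity histograms of consecutive frames."""
    cuts, prev_hist = [], None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=bins, range=(0, 255))
        hist = hist / max(hist.sum(), 1)
        if prev_hist is not None and np.abs(hist - prev_hist).sum() > threshold:
            cuts.append(i)
        prev_hist = hist
    return cuts
```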
