Content-Based Video Indexing and Retrieval

Stephen W. Smoliar and HongJiang Zhang
National University of Singapore
Current video management tools and techniques are based on pixels rather than perceived content. Thus, state-of-the-art video editing systems can easily manipulate such things as time codes and image frames, but they cannot "know," for example, what a basketball is. Our research addresses four areas of content-based video management.
`
Video has become an important element of multimedia computing and communication environments, with applications as varied as broadcasting, education, publishing, and military intelligence. However, video will only become an effective part of everyday computing environments when we can use it with the same facility that we currently use text. Computer literacy today entails the ability to set our ideas down spontaneously with a word processor, perhaps while examining other text documents to develop those ideas and even using editing operations to transfer some of that text into our own compositions. Similar composition using video remains far in the future, even though workstations now come equipped with built-in video cameras and microphones, not to mention ports for connecting our increasingly popular handheld video cameras.

Why is this move to communication incorporating video still beyond our grasp? The problem is that video technology has developed thus far as a technology of images. Little has been done to help us use those images effectively. Thus, we can buy a camera that "knows" all about how to focus itself properly and even how to compensate for the fact that we can rarely hold it steady without a tripod. But no camera knows "where the action is" during a basketball game or a family reunion. A camera can give us a clear shot of the ball going through the basket, but only if we find the ball for it.

The point is that we do not use images just because they are steady or clearly focused. We use them for their content. If we wish to compose with images in the same way that we compose with words, we must focus our attention on content. Video composition should not entail thinking about image "bits" (pixels), any more than text composition requires thinking about ASCII character codes. Video content objects include basketballs, athletes, and hoops. Unfortunately, state-of-the-art software for manipulating video does not "know" about such objects. At best, it "knows" about time codes, individual frames, and clips of video and sound. To compose a video document, or even just to incorporate video as part of a text document, we find ourselves thinking one way (with ideas) when we are working with text and another (with pixels) when we are working with video. The pieces do not fit together effectively, and video suffers for it.
Similarly, if we wish to incorporate other text material in a document, word processing offers a powerful repertoire of techniques for finding what we want. In video, about the only technique we have is our own memory coupled with some intuition about how to use the fast forward and fast reverse buttons while viewing.

The moral of all this is that the effective use of video is still beyond our grasp because the effective use of its content is still beyond our grasp. How can we remedy this situation? At the Institute of Systems Science of the National University of Singapore, the Video Classification project addresses this question. We are currently tackling problems in four areas:
`
• Defining an architecture that characterizes the tasks of managing video content.

• Developing software tools and techniques that identify and represent video content.

• Applying knowledge representation techniques to the development of index construction and retrieval tools.

• Developing an environment for interacting with video objects.
`
In this article, we discuss each of these problem areas in detail, then briefly review a recent case study concerned with content analysis of news videos. We conclude with a discussion of our plans to extend our work into the audio domain.
`
Architecture for video management
Our architecture is based on the assumption that video information will be maintained in a database.1 This assumption requires us to define tools for the construction of such databases and the insertion of new material into existing databases. We can characterize these tools in terms of a sequence of specific task requirements:
`
• Parsing, which segments the video stream into generic clips. These clips are the elemental index units in the database. Ideally, the system decomposes individual images into semantic primitives. On the basis of these primitives, a video clip can be indexed with a semantic description using existing knowledge-representation techniques.

• Indexing, which tags video clips when the system inserts them into the database. The tag includes information based on a knowledge model that guides the classification according to the semantic primitives of the images. Indexing is thus driven by the image itself and any semantic descriptors provided by the model.

• Retrieval and browsing, where users can access the database through queries based on text and/or visual examples, or browse it through interaction with displays of meaningful icons. Users can also browse the results of a retrieval query. It is important that both retrieval and browsing appeal to the user's visual intuition.

Figure 1 summarizes this task analysis as an architectural diagram. The heart of the system is a database management system containing the video and audio data from video source material that has been compressed wherever possible. The DBMS defines attributes and relations among these entities in terms of a frame-based approach to knowledge representation (described further under the subhead "A frame-based knowledge base" below). This representation approach, in turn, drives the indexing of entities as they are added to the database. Those entities are initially extracted by the tools that support the parsing task. In the opposite direction, the database contents are made available by tools that support the processing of both specific queries and the more general needs of casual browsing.

Figure 1. Diagram of video management architecture. (Raw video/audio data flows through a parsing toolbox into a DBMS of frame-based content attributes; a knowledge base and inference engine drive indexing, while browsing and retrieval applications draw on the database contents.)
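To make the division of labor among these tasks concrete, here is a minimal sketch of the data flow the architecture implies. The names (VideoClip, build_index, and the knowledge_model.describe interface) are illustrative assumptions for this article, not the interfaces of our system; parsing itself is discussed in the next section.

```python
from dataclasses import dataclass, field

@dataclass
class VideoClip:
    """One generic clip produced by the parsing task, such as a single camera shot."""
    start_frame: int
    end_frame: int
    primitives: dict = field(default_factory=dict)  # e.g. {"shot_type": "anchorperson"}

def build_index(clips, knowledge_model):
    """Tag each clip with the semantic descriptors the knowledge model derives from it."""
    index = {}
    for clip in clips:
        # knowledge_model.describe() is an assumed interface returning descriptor strings.
        for descriptor in knowledge_model.describe(clip.primitives):
            index.setdefault(descriptor, []).append(clip)
    return index

def retrieve(index, query_descriptor):
    """Answer a simple text query by looking up clips filed under a descriptor."""
    return index.get(query_descriptor, [])
```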
`
The next three sections discuss elements of this architecture in greater detail.

Video content parsing
Three tool sets address the parsing task. The first set segments the video source material into individual camera shots, which then serve as the basic units for indexing. The second set identifies different manifestations of camera technique in these clips. The third set applies content models to the identification of context-dependent semantic primitives.
`
Locating camera shot boundaries
We decided that the most viable segmentation criteria for motion video are those that detect boundaries between camera shots. Thus, the camera shot, consisting of one or more frames generated and recorded contiguously and representing a continuous action in time and space, becomes the smallest unit for indexing video. The simplest shot transition is a camera cut, where the boundary lies between two successive frames. More sophisticated transition techniques include dissolves, wipes, and fade-outs, all of which take place over a sequence of frames.

In any case, camera shots can always be distinguished by significant qualitative differences. If we can express those differences by a suitable quantitative measure, then we can declare a segment boundary whenever that measure exceeds a given threshold. The key issues in locating shot boundaries, therefore, are selecting suitable difference measures and thresholds, and applying them to the comparison of video frames. We now briefly review the segmentation techniques we currently employ. (For details, see Zhang et al.2)

The most suitable measures rely on comparisons between the pixel-intensity histograms of two frames. The principle behind this metric is that two frames with little change in the background and object content will also differ little in their overall intensity distributions. Further strengthening this approach, it is easy to define a histogram that effectively accounts for color information. We also developed an automatic approach to detect the segmentation threshold on the basis of statistics of frame difference values and a multipass technique that improves processing speed.2
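As a concrete illustration, the following sketch computes such a histogram difference for grayscale frames. The bin count and the cut threshold are arbitrary placeholder values, not the statistically derived thresholds just described.

```python
import numpy as np

def gray_histogram(frame: np.ndarray, bins: int = 64) -> np.ndarray:
    """Normalized intensity histogram of one grayscale frame (pixel values 0-255)."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / hist.sum()

def histogram_difference(frame_a: np.ndarray, frame_b: np.ndarray) -> float:
    """Sum of absolute bin-wise differences; large values suggest a shot change."""
    return float(np.abs(gray_histogram(frame_a) - gray_histogram(frame_b)).sum())

def is_camera_cut(frame_a, frame_b, threshold: float = 0.5) -> bool:
    """Declare a cut when consecutive frames differ by more than the threshold."""
    return histogram_difference(frame_a, frame_b) > threshold
```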
`
Figure 2 illustrates a typical sequence of difference values. The graph exhibits two high pulses corresponding to two camera breaks. It also illustrates a gradual transition occurring over a sequence of frames. In this case, the task is to identify the sequence start and end points. As the inset in Figure 2 shows, the difference values during such a transition are far less than across a camera break. Thus, a single threshold lacks the power to detect gradual transitions.

Figure 2. A sequence of frame-to-frame histogram differences obtained from a documentary video, where differences corresponding both to camera breaks and to transitions implemented by special effects can be observed.

A so-called twin-comparison approach solves this problem. The name refers to the use of two thresholds. First, a reduced threshold detects the potential starting frame of a transition sequence. Once that frame has been identified, it is compared against successive frames, thus measuring an accumulated difference instead of frame-to-frame differences. This accumulated difference must be monotonic. When it ceases to be monotonic, it is compared against a second, higher threshold. If this threshold is exceeded, we conclude that the monotonically increasing sequence of accumulated differences corresponds to a gradual transition. Experiments have shown this approach to be very effective.2
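A simplified sketch of the twin-comparison test follows, reusing the histogram_difference helper from the previous sketch. The two threshold values are illustrative, and refinements such as the multipass speedup and the handling of camera motion are omitted.

```python
def find_gradual_transition(frames, low: float = 0.1, high: float = 0.5):
    """Return (start, end) of a detected gradual transition, or None.

    Twin-comparison idea: a low threshold flags a candidate start frame; the
    difference accumulated against that start frame is tracked while it keeps
    growing; the high threshold then decides whether the total change amounts
    to a real transition.
    """
    start = None
    accumulated = 0.0
    for i in range(1, len(frames)):
        diff = histogram_difference(frames[i - 1], frames[i])
        if start is None:
            if diff > low:                 # candidate start of a transition
                start = i - 1
                accumulated = 0.0
        else:
            cumulative = histogram_difference(frames[start], frames[i])
            if cumulative >= accumulated:  # still growing monotonically
                accumulated = cumulative
            else:                          # growth stopped: test the total change
                if accumulated > high:
                    return start, i - 1
                start, accumulated = None, 0.0
    return None
```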
`
Shot classification
Before a system can parse content, it must first recognize and account for artifacts caused by camera movement. These movements include panning and tilting (horizontal or vertical rotation of the camera) and zooming (focal length change), in which the camera position does not change; and tracking and booming (horizontal and vertical transverse movement of the camera) and dollying (horizontal lateral movement of the camera), in which the camera position does change.4
These operations may also occur in combinations. They are most readily detected through motion field analysis, since each operation has its own characteristic pattern of motion vectors. For example, a zoom causes most of the motion vectors to point either toward or away from a focus center, while movement of the camera itself shows up as a modal value across the entire motion field.

The motion vectors can be computed by the block-matching algorithms used in motion compensation for video compression. Thus, a system can often retrieve the vectors from files of video compressed according to standards such as MPEG and H.261. The system could also compute them in real time by using chips that perform such compression in hardware.
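The sketch below illustrates one way such motion-vector patterns might be separated. The block-grid layout, the agreement scores, and the 0.7 threshold are assumptions made for the example; a real system would feed it vectors recovered from an MPEG or H.261 bitstream.

```python
import numpy as np

def classify_camera_operation(vectors: np.ndarray, threshold: float = 0.7) -> str:
    """Guess the dominant camera operation from a field of block motion vectors.

    `vectors` has shape (rows, cols, 2): one (dx, dy) vector per block.
    A pan or track shows up as one direction shared by most blocks; a zoom shows
    up as vectors pointing toward or away from a focus center (here, the frame center).
    """
    rows, cols, _ = vectors.shape
    flat = vectors.reshape(-1, 2).astype(float)
    magnitudes = np.linalg.norm(flat, axis=1) + 1e-9

    # Agreement with the mean direction suggests panning or tracking.
    mean_dir = flat.mean(axis=0)
    mean_dir /= np.linalg.norm(mean_dir) + 1e-9
    directional = float(np.mean((flat @ mean_dir) / magnitudes))

    # Agreement with the radial direction from the frame center suggests zooming.
    ys, xs = np.mgrid[0:rows, 0:cols]
    radial = np.stack([xs - (cols - 1) / 2, ys - (rows - 1) / 2], axis=-1).reshape(-1, 2)
    radial /= np.linalg.norm(radial, axis=1, keepdims=True) + 1e-9
    radiality = float(np.mean(np.abs(np.sum(flat * radial, axis=1)) / magnitudes))

    if radiality > threshold and radiality > directional:
        return "zoom"
    if directional > threshold:
        return "pan or track"
    return "irregular (object motion or combined operations)"
```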
`
Content models
Content parsing is most effective with an a priori model of a video's structure. Such a model can represent a strong spatial order within the individual frames of shots and/or a strong temporal order across a sequence of shots. News broadcasts usually provide simple examples of such models. For example, all shots of the anchorperson conform to a common spatial layout, and the temporal structure simply alternates between the anchorperson and more detailed footage (possibly including breaks for commercials).

Our approach to content parsing begins with identifying key features of the image data, which are then compared to domain models to identify objects inferred to be part of the domain. We then identify domain events as segments that include specific domain objects. Our initial experiments involve models for cut boundaries, typed shots, and episodes. The cut boundary model drives the segmentation process that locates camera shot boundaries. Once a shot has been isolated through segmentation, it can be compared against type models based both on features to be detected and on measures that determine acceptable similarity. Sequences of typed shots can then be similarly compared against episode models. We discuss this in more detail later, under "Case study of video content analysis."
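As a rough sketch of shot-type matching, the example below compares a shot's feature vector against stored type models. The feature values, the two type names, and the distance threshold are invented for illustration and are not the measures used in our experiments.

```python
import numpy as np

# Hypothetical shot-type models: each maps a type name to an expected feature
# vector (for instance, coarse color-layout statistics of a representative frame).
TYPE_MODELS = {
    "anchorperson": np.array([0.8, 0.1, 0.3]),
    "field_footage": np.array([0.2, 0.6, 0.5]),
}

def classify_shot(features: np.ndarray, max_distance: float = 0.4) -> str:
    """Assign the closest type model, or 'unknown' if nothing is similar enough."""
    best_type, best_dist = "unknown", float("inf")
    for name, model in TYPE_MODELS.items():
        dist = float(np.linalg.norm(features - model))
        if dist < best_dist:
            best_type, best_dist = name, dist
    return best_type if best_dist <= max_distance else "unknown"
```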
`
Index construction and retrieval tools
The fundamental task of any database system is to support retrieval, so we must consider how to build indexes that facilitate such retrieval services for video. We want to base the index on semantic properties, rather than lower level features. A knowledge model can support such semantic properties. The model for our system is a frame-based knowledge base. In the following discussion, the word "frame" refers to such a knowledge base object rather than a video image frame.
`
`
`
A frame-based knowledge base
An index based on semantic properties requires an organization that explicitly represents the various subject matter categories of the material being indexed. Such a representation is often realized as a semantic network, but text indexes tend to be structured as trees (as revealed by the indented representations of most book indexes). We decided that the more restricted tree form also suited our purposes.

Figure 3 gives an example of such a tree. It represents a selection of topical categories taken from a documentary video about the Faculty of Engineering at the National University of Singapore. The tree structure represents relations of specialization and generalization among these categories. Note, in particular, that categories correspond both to content material about student activities (Activity) and to classifications of different approaches to producing the video (Video_Types).

Users tend to classify material on the basis of the information they hope to extract. This particular set of categories reflects interest both in the faculty and in documentary techniques.
`
`
Figure 3. A tree structure of topical categories for a documentary video about engineering at the National University of Singapore. (The tree descends from Engineering through branches such as Activity, Person, and Video_Types to categories including Academic, Nonacademic, Classroom, Laboratory, Arts, Sports, External, Convocation, Demonstration, Scenery, and Headings.)
`
Like the records of a database, frames are structured as a collection of fields (usually called slots in frame-based systems). These slots provide different elements of descriptive information, and the elements distinguish the topical characteristics for each object represented by a frame.

It is important to recognize that we use frames to represent both classes (the categories) and instances (the elements categorized). As an example of a class frame, consider the Laboratory category in Figure 3. We might define the frame for it as shown in Figure 4a. Alternatively, we can define an instance of one of its subclasses in a slightly similar manner, as shown in Figure 4b.

Note that not all slots need to be filled in a class definition ("void" indicates an unfilled slot), while the slots of an instance are filled with specific values.
`
Figure 4. Examples of class frame Laboratory (top) and subclass instance Wave_Simulator (bottom).

    Name: Laboratory
    SuperClass: Academic
    SubClasses: #table[Computer_Lab Electronic_Lab Mechanical_Lab Civil_Lab Chemical_Lab]
    Instances: void
    Description: void
    Video: void
    Course: void
    Equipment: void

    Name: Wave_Simulator
    Class: Civil_Lab
    Description: "Monitoring pressure variation in breaking waves."
    Video: WaveBreaker_CoverFrame
    Course: Civil_Eng
    Equipment: #table[Computer Wave_Generator]
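To make the frame-and-slot structure concrete, the following sketch represents class and instance frames in code and answers a simple slot-based query. The Frame class and the retrieve helper are illustrative assumptions modeled on Figure 4, not our knowledge-base implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Frame:
    """A knowledge-base frame: a named collection of slots (not a video image frame)."""
    name: str
    parent: Optional[str] = None               # superclass for classes, class for instances
    is_instance: bool = False
    slots: dict = field(default_factory=dict)  # unfilled ("void") slots are simply absent

laboratory = Frame("Laboratory", parent="Academic",
                   slots={"SubClasses": ["Computer_Lab", "Electronic_Lab",
                                         "Mechanical_Lab", "Civil_Lab", "Chemical_Lab"]})

wave_simulator = Frame("Wave_Simulator", parent="Civil_Lab", is_instance=True,
                       slots={"Description": "Monitoring pressure variation in breaking waves.",
                              "Video": "WaveBreaker_CoverFrame",
                              "Course": "Civil_Eng",
                              "Equipment": ["Computer", "Wave_Generator"]})

def retrieve(frames, **query):
    """Return instance frames whose filled slots match every key/value pair in the query."""
    return [f for f in frames
            if f.is_instance and all(f.slots.get(k) == v for k, v in query.items())]

# Example: find material recorded for the civil-engineering course.
matches = retrieve([laboratory, wave_simulator], Course="Civil_Eng")
```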
`
`