(19) United States
(12) Patent Application Publication          (10) Pub. No.: US 2004/0024898 A1
     Wan                                     (43) Pub. Date: Feb. 5, 2004

(54) DELIVERING MULTIMEDIA DESCRIPTIONS

(76) Inventor: Ernest Yiu Cheong Wan, Carlingford, NSW (AU)

     Correspondence Address:
     FITZPATRICK CELLA HARPER & SCINTO
     30 ROCKEFELLER PLAZA
     NEW YORK, NY 10112 (US)

(21) Appl. No.: 10/296,162

(22) PCT Filed: Jul. 5, 2001

(86) PCT No.: PCT/AU01/00799

(30) Foreign Application Priority Data

     Jul. 10, 2000 (AU) .......................................... PQ 8677

Publication Classification

(51) Int. Cl.: G06F 15/16
(52) U.S. Cl.: 709/231; 709/246

(57) ABSTRACT
Disclosed is a method of processing a document (20) described in a mark-up language (eg. XML). Initially, a structure (21a) and a text content (21b) of the document are separated, and then the structure (22) is transmitted, for example by streaming, before the text content (23). Parsing of the received structure (22) is commenced before the text content (23) is received. Also disclosed is a method of forming a streamed presentation (37, 38) from at least one media object having content (31, 32) and description (33) components. A presentation description (35) is generated (36) from at least one component description of the media object and is then processed (34) to schedule delivery of component descriptions and content of the presentation, to generate elementary data streams associated with the component descriptions (38) and content (37). Another method of forming a streamed presentation of at least one media object having content and description components is also disclosed. A presentation template (53) is provided that defines a structure of a presentation description (56). The template is then applied (54) to at least one description component (52) of the associated media object to form the presentation description from each description component. The presentation description is then stream encoded with each associated media object (51) to form the streamed presentation (57, 58), whereby the media object is reproducible using the presentation description.
[Cover drawing: a scene description stream, audio and video streams, description streams and elementary streams]

Amazon v. Audio Pod
US Patent 10,805,111
Amazon EX-1067
Patent Application Publication    Feb. 5, 2004    Sheet 1 of 11    US 2004/0024898 A1

[FIG. 1A (Prior Art): an XML source document, the WBXML code pages, and the resulting encoded document]
[Sheet 2 of 11 — FIG. 1B (Prior Art): token-by-token description of the WBXML-encoded stream, including the string table length, the string table, tag tokens (NAME, TYPE, URL="http://"), inline strings and string table references]
[Sheet 5 of 11 — FIG. 4A (Prior Art): an MPEG-4 OCI description attached directly to the AV content; FIG. 4B: MPEG-4 OCI and MPEG-7 description fragments referenced by URI]
[Sheet 6 of 11 — FIG. 4C: a preferred division of a description stream]
[Sheet 7 of 11 — FIG. 5: the media-centric streaming arrangement, including the Composer]
[Sheet 8 of 11 — FIG. 6(a): Presentation Template; FIG. 6(b): Movie Description, both input to the Composer]

Fig. 6(a) Presentation Template:

<xsl:template match="/movie/title">
  ...
</xsl:template>
<xsl:template match="/movie/right">
  ...
</xsl:template>
<xsl:template match="/movie/scene">
  ...
</xsl:template>
<xsl:template match="/movie/scene/shot">
  ...
</xsl:template>

Fig. 6(b) Movie Description:

<movie ...="aMovie.mpg">
  <title>...</title>
  <right>...</right>
  <scene ...begin="0:2:0.0" dur="300s">
    <shot ...begin="0:0:30.0" dur="30s">
    </shot>
  </scene>
  <scene ...begin="1:0:0.0" dur="600s">
    <shot ...begin="0:0:15.0" dur="60s">
    </shot>
  </scene>
</movie>
[Sheet 10 of 11 — FIG. 7: schematic block diagram of a general-purpose computer system (700), including a video display, video interface, keyboard, storage device and a connection to a computer network]
[Sheet 11 of 11 — FIG. 8: the elementary streams of an MPEG-4 presentation, including a scene description stream and an OCI stream]
US 2004/0024898 A1                                              Feb. 5, 2004
DELIVERING MULTIMEDIA DESCRIPTIONS

TECHNICAL FIELD OF THE INVENTION

0001. The present invention relates generally to the distribution of multimedia and, in particular, to the delivery of multimedia descriptions in different types of applications. The present invention has particular application to, but is not limited to, the evolving MPEG-7 standard.

BACKGROUND ART

0002. Multimedia may be defined as the provision of, or access to, media, such as text, audio and images, in which an application can handle or manipulate a range of media types. Invariably, where access to a video is desired, the application must handle both audio and images. Often such media is accompanied by text that describes the content and may include references to other content. As such, multimedia may conveniently be referred to as being formed of content and descriptions. The description is typically formed by metadata, which is, practically speaking, data used to describe other data.
0003. The World Wide Web (WWW or, the "Web") uses a client/server paradigm. Traditional access to multimedia over the Web involves an individual client accessing a database available via a server. The client downloads the multimedia (content and description) to the local processing system where the multimedia may be utilised, typically by compiling and replaying the content with the aid of the description. The description is "static" in that usually the entire description must be available at the client in order for the content, or parts thereof, to be reproduced. Such traditional access is problematic in the delay between client request and actual reproduction, and in the sporadic load on both the server and any communications network linking the server and local processing system as media components are delivered. Real-time delivery and reproduction of multimedia in this fashion is typically unobtainable.
0004. The evolving MPEG-7 standard has identified a number of potential applications for MPEG-7 descriptions. The various MPEG-7 "pull", or retrieval, applications involve client access to databases and audio-visual archives. The "push" applications are related to content selection and filtering and are used in broadcasting, and in the emerging concept of "webcasting", in which media, traditionally broadcast over the airways by radio frequency propagation, is broadcast over the structured links of the Web. Webcasting, in its most fundamental form, requires a static description and streamed content. However, webcasting usually necessitates the downloading of the entire description before any content may be received. Desirably, webcasting requires streamed descriptions received with, or in association with, the content. Both types of applications benefit strongly from the use of metadata.
0005. The Web is likely to be the primary medium for most people to search and retrieve audio-visual (AV) content. Typically, when locating information, the client issues a query and a search engine searches its database and/or other remote databases for relevant content. MPEG-7 descriptions, which are constructed using XML documents, enable more efficient and effective searching because of the well-known semantics of the standardised descriptors and description schemes used in MPEG-7. Nevertheless, MPEG-7 descriptions are expected to form only a (small) portion of all content descriptions available on the Web. It is desirable for MPEG-7 descriptions to be searchable and retrievable (or downloadable) in the same manner as other XML documents on the Web, since users of the Web do not expect or want AV content to be downloaded with description. In some cases, the descriptions rather than the AV content are what may be required. In other cases, users will want to examine the description before deciding whether to download or stream the content.
0006. MPEG-7 descriptors and description schemes are only a sub-set of the set of (well-known) vocabulary used on the Web. Using the terminology of XML, the MPEG-7 descriptors and description schemes are elements and types defined in the MPEG-7 namespace. Further, Web users would expect that MPEG-7 elements and types could be used in conjunction with those of other namespaces. Excluding other widely used vocabularies and restricting all MPEG-7 descriptions to consist only of the standardised MPEG-7 descriptors and description schemes and their derivatives would make the MPEG-7 standard excessively rigid and unusable. A widely accepted approach is for a description to include vocabularies from multiple namespaces and to permit applications to process the elements (from any namespace, including MPEG-7) that the application understands, and ignore those elements that are not understood.
0007. To make downloading, and any consequential storing, of a multimedia (eg. MPEG-7) description more efficient, the descriptions can be compressed. A number of encoding formats have been proposed for XML, and include WBXML, derived from the Wireless Application Protocol (WAP). In WBXML, frequently used XML tags, attributes and values are assigned a fixed set of codes from a global code space. Application-specific tag names, attribute names and some attribute values that are repeated throughout document instances are assigned codes from some local code spaces. WBXML preserves the structure of XML documents. The content, as well as attribute values that are not defined in the Document Type Definition (DTD), can be stored in-line or in a string table. An example of encoding using WBXML is shown in FIGS. 1A and 1B. FIG. 1A depicts how an XML source document 10 is processed by an interpreter 14 according to various code spaces 12 defining encoding rules for WBXML. The interpreter 14 produces an encoded document 16 suitable for communication according to the WBXML standard. FIG. 1B provides a description of each token in the data stream formed by the document 16.

0008. While WBXML encodes XML tags and attributes into tokens, no compression is performed on any textual content of the XML description. Such compression may be achieved using a traditional text compression algorithm, preferably taking advantage of the schema and data-types of XML to enable better compression of attribute values that are of primitive data-types.
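The tokenisation described above can be sketched as follows. This is an illustrative Python sketch of the WBXML idea, not a conformant encoder: the tag codes in the local code space below are invented for illustration, while STR_I (0x03) and END (0x01) are the WBXML global tokens for an inline string and an element end, and 0x40 is the flag marking an element that has content.

```python
# Illustrative sketch of WBXML-style tokenisation: well-known tags map
# to one-byte codes; free text is carried inline, null-terminated.

TAG_CODES = {"CARD": 0x27, "DO": 0x28, "INPUT": 0x29}  # hypothetical local code space
STR_I = 0x03   # WBXML global token: inline string follows
END = 0x01     # WBXML global token: end of element

def encode(events):
    """events: list of ('start', tag), ('text', s) or ('end',) tuples."""
    out = bytearray()
    for ev in events:
        if ev[0] == "start":
            out.append(TAG_CODES[ev[1]] | 0x40)  # 0x40: element has content
        elif ev[0] == "text":
            out.append(STR_I)
            out += ev[1].encode("utf-8") + b"\x00"  # null-terminated inline string
        else:
            out.append(END)
    return bytes(out)

encoded = encode([("start", "CARD"), ("text", "Enter name:"), ("end",)])
```

As the sketch shows, the element structure survives as single-byte tokens, but the textual content ("Enter name:") passes through uncompressed, which is the limitation paragraph 0008 notes.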
SUMMARY OF THE INVENTION

0009. It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements, to support the streaming of multimedia descriptions.
0010. General aspects of the present invention provide for streaming descriptions, and for streaming descriptions with AV (audio-visual) content. When streaming descriptions with AV content, the streaming can be "description-centric" or "media-centric". The streaming can also be unicast with an upstream channel, or broadcast.
0011. According to a first aspect of the invention, there is provided a method of forming a streamed presentation from at least one media object having content and description components, said method comprising the steps of:

0012. generating a presentation description from at least one component description of said at least one media object; and

0013. processing said presentation description to schedule delivery of component descriptions and content of said presentation to generate elementary data streams associated with said component descriptions and content.
0014. According to another aspect of the present invention there is disclosed a method of forming a presentation description for streaming content with description, said method comprising the steps of:

0015. providing a presentation template that defines a structure of a presentation description; and

0016. applying said template to at least one description component of at least one associated media object to form said presentation description from each said description component, said presentation description defining a sequential relationship between description components desired for streamed reproduction and content components associated with said desired descriptions.
0017. According to another aspect of the present invention there is disclosed a streamed presentation comprising a plurality of content objects interspersed amongst a plurality of description objects, said description objects comprising references to multimedia content reproducible from said content objects.
0018. According to another aspect of the present invention there is disclosed a method of delivering an XML document, said method comprising the steps of:

0019. dividing the document to separate XML structure from XML text; and

0020. delivering said document in a plurality of data streams, at least one said stream comprising said XML structure and at least one other of said streams comprising said XML text.
0021. In accordance with another aspect of the present invention, there is disclosed a method of processing a document described in a mark-up language, said method comprising the steps of:

0022. separating a structure and a text content of said document;

0023. sending the structure before the text content; and

0024. commencing to parse the received structure before the text content is received.

0025. Other aspects of the present invention are also disclosed.
BRIEF DESCRIPTION OF THE DRAWINGS

0026. At least one embodiment of the present invention will now be described with reference to the drawings, in which:

0027. FIGS. 1A and 1B show an example of a prior art encoding of an XML document;

0028. FIG. 2 illustrates a first method of streaming an XML document;

0029. FIG. 3 illustrates a second method of "description-centric" streaming in which the streaming is driven by a presentation description;

0030. FIG. 4A illustrates a prior art stream;

0031. FIG. 4B shows a stream according to one implementation of the present disclosure;

0032. FIG. 4C shows a preferred division of a description stream;

0033. FIG. 5 illustrates a third method of "media-centric" streaming;

0034. FIG. 6 is an example of a composer application;

0035. FIG. 7 is a schematic block diagram of a general-purpose computer upon which the implementation of the present disclosure can be practiced; and

0036. FIG. 8 schematically represents an MPEG-4 stream.
DETAILED DESCRIPTION INCLUDING BEST MODE

0037. The implementations to be described are each founded upon the relevant multimedia descriptions being XML documents. XML documents are mostly stored and transmitted in their raw textual format. In some applications, XML documents are compressed using traditional text compression algorithms for storage or transmission, and decompressed back into XML before they are parsed and processed. Although compression may greatly reduce the size of an XML document, and thus reduce the time for reading or transmitting the document, an application still has to receive the entire XML document before the document can be parsed and processed. A traditional XML parser expects an XML document to be well-formed (ie. the document has matching and non-overlapping start-tag and end-tag pairs), and is unable to complete the parsing of the XML document until the whole XML document is received. Incremental parsing of a streamed XML document is unable to be performed using a traditional XML parser.
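The contrast can be illustrated with a short sketch (the sample document and the 30-byte cut-off are arbitrary): a conventional tree parser fails on a partially received document because the final end-tag has not arrived, whereas an event-driven pull parser can already surface every element that has been closed so far.

```python
# Sketch: tree parsing fails on a partial document, while a pull
# parser yields events incrementally as bytes arrive.
import xml.etree.ElementTree as ET

document = "<movie><title>A Movie</title><scene>...</scene></movie>"
half = document[:30]  # simulate a partially received stream

# Tree parsing of the fragment fails: the document is not yet well-formed.
try:
    ET.fromstring(half)
    complete = True
except ET.ParseError:
    complete = False

# A pull parser surfaces every element already closed in the fragment.
pull = ET.XMLPullParser(["end"])
pull.feed(half)
seen = [elem.tag for _, elem in pull.read_events()]
```

Here `complete` is False, yet `seen` already contains the closed `title` element, which is exactly the incremental behaviour the streamed arrangements below aim to enable.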
0038. Streaming an XML document permits parsing and processing to commence as soon as a sufficient portion of the XML document is received. Such a capability will be most useful in the case of a low bandwidth communication link and/or a device with very limited resources.
0039. One way of achieving incremental parsing of an XML document is to send the tree hierarchy of an XML document (such as the Document Object Model (DOM) representation of the document) in a breadth-first or depth-first manner. To make such a process more efficient, the XML (tree) structure of the document can be separated from the text components of the document, and encoded and sent before the text. The XML structure is critical in providing the context for interpreting the text. Separating the two components allows the decoder (parser) to parse the structure of the document more quickly, and to ignore elements that are not required or are unable to be interpreted. Such a decoder (parser) may optionally choose not to buffer any irrelevant text that arrives at a later stage. Whether the decoder converts the encoded document back into XML or not depends on the application.
0040. The XML structure is vital in the interpretation of the text. In addition, as different encoding schemes are usually used for the structure and the text and, in general, there is far less structural information than textual content, two (or more) separate streams may be used for delivering the structure and the text.
0041. FIG. 2 shows one method of streaming an XML document 20. Firstly, the document 20 is converted to a DOM representation 21, which is then streamed in a depth-first fashion. The structure of the document 20, depicted by the tree 21a of the DOM representation 21, and the text content 21b, are encoded as two separate streams 22 and 23 respectively. The structure stream 22 is headed by code tables 24. Each encoded node 25, representing a node of the DOM representation 21, has a size field that indicates its size, including the total size of corresponding descendant nodes. Where appropriate, encoded leaf nodes and attribute nodes contain pointers 26 to their corresponding encoded content 27 in the text stream 23. Each encoded string in the text stream is headed by a size field that indicates the size of the string.
0042. Not all multimedia (eg. MPEG-7) descriptions need be streamed with content or serve as a presentation. For instance, television and film archives store vast amounts of multimedia material in several different formats, including analogue tapes. It would not be possible to stream the description of a movie, in which the movie is recorded on analogue tapes, with the actual movie content. Similarly, treating the multimedia description of a patient's medical records as a multimedia presentation makes little sense. As an analogy, while Synchronized Multimedia Integration Language (SMIL) presentations are themselves XML documents, not all XML documents are SMIL presentations. Indeed, only a very small number of XML documents are SMIL presentations. SMIL can be used for creating a presentation script that enables a local processor to compile an output presentation from a number of local files or resources. SMIL specifies the timing and synchronisation model but does not have any built-in support for the streaming of content or description.
0043. FIG. 3 shows an arrangement 30 for streaming descriptions together with content. A number of multimedia resources are shown, including audio files 31 and video files 32. Associated with the resources 31 and 32 are descriptions 33, each typically formed of a number of descriptors and descriptor relationships. Significantly, there need not be a one-to-one relationship between the descriptions 33 and the content files 31 and 32. For example, a single description may relate to a number of files 31 and/or 32, or any one file 31 or 32 may have associated therewith more than one description.
0044. As seen in FIG. 3, a presentation description 35 is provided to describe the temporal behaviour of a multimedia presentation desired to be reproduced through a method of description-centric streaming. The presentation description 35 can be created manually, or interactively through the use of editing tools and a standardized presentation description scheme 36. The scheme 36 utilises elements and attributes to define the hyperlinks between the multimedia objects and the layout of the desired multimedia presentation. The presentation description 35 can be used to drive the streaming process. Preferably, the presentation description is an XML document that uses a SMIL-based description scheme.
0045. An encoder 34, with knowledge of the presentation description scheme 36, interprets the presentation description 35 to construct an internal time graph of the desired multimedia presentation. The time graph forms a model of the presentation schedule and the synchronization relationships between the various resources. Using the time graph, the encoder 34 schedules the delivery of the required components and then generates elementary data streams 37 and 38 that may be transmitted. Preferably, the encoder 34 splits the descriptions 33 of the content into multiple data streams 38. The encoder 34 preferably operates by constructing a URI table that maps the URI-references contained in the AV content 31, 32 and the descriptions 33 to a local address (eg. an offset) in the corresponding elementary (bit) streams 37 and 38. The streams 37 and 38, having been transmitted, are received into a decoder (not illustrated) that uses the URI table when attempting to decode any URI-reference.
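The URI table just described can be sketched minimally as below. The stream identifiers, URI and offsets are illustrative only: the point is simply that each URI-reference found in the content or descriptions maps to a (stream, local offset) pair so that the decoder resolves references within the delivered streams rather than fetching them externally.

```python
# Sketch of the encoder's URI table: URI-reference -> (stream id, offset).

class URITable:
    def __init__(self):
        self._table = {}

    def register(self, uri, stream_id, offset):
        """Record where the resource behind a URI lives in the elementary streams."""
        self._table[uri] = (stream_id, offset)

    def resolve(self, uri):
        # (None, None) signals that no local copy of the referent exists.
        return self._table.get(uri, (None, None))

table = URITable()
table.register("http://example.org/movie.mpg#scene1", "video-es-1", 2048)
stream, offset = table.resolve("http://example.org/movie.mpg#scene1")
```

The decoder consults the same table on receipt, turning what would be a remote URI dereference into a seek within an already delivered stream.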
0046. The presentation description scheme 36, in some implementations, may be based on SMIL. Current developments in MPEG-4 enable SMIL-based presentation descriptions to be processed into MPEG-4 streams.
0047. An MPEG-4 presentation is made up of scenes. An MPEG-4 scene follows a hierarchical structure called a scene graph. Each node of the scene graph is a compound or primitive media object. Compound media objects group primitive media objects together. Primitive media objects correspond to leaves in the scene graph and are AV media objects. The scene graph is not necessarily static. Node attributes (eg. positioning parameters) can be changed, and nodes can be added, replaced or removed. Hence, a scene description stream may be used for transmitting scene graphs, and updates to scene graphs.
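The scene-graph model above can be mirrored in a minimal sketch: compound nodes group children, primitive nodes sit at the leaves as AV media objects, and the graph can be updated after delivery. The node names and attributes below are illustrative only and do not correspond to the MPEG-4 BIFS node set.

```python
# Minimal scene-graph sketch: compound nodes have children, primitive
# (leaf) nodes are AV media objects; attributes and topology can change.

class Node:
    def __init__(self, name, children=None, **attrs):
        self.name, self.attrs = name, attrs
        self.children = children or []  # empty list => primitive (leaf) node

    def leaves(self):
        """All primitive media objects reachable from this node, in order."""
        if not self.children:
            return [self.name]
        return [leaf for c in self.children for leaf in c.leaves()]

scene = Node("scene", [
    Node("group", [Node("video", x=0, y=0), Node("audio")]),
])

# Updates of the kind a scene description stream carries:
scene.children[0].children.append(Node("subtitle"))  # node added
scene.children[0].children[0].attrs["x"] = 120       # attribute changed
```

The two update lines correspond to the two kinds of change the paragraph mentions: altering node attributes and adding (or removing) nodes after the initial graph has been sent.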
0048. An AV media object may rely on streaming data that is conveyed in one or more elementary streams (ES). All streams associated with one media object are identified by an object descriptor (OD). However, streams that represent different content must be referenced through distinct object descriptors. Additional auxiliary information can be attached to an object descriptor in textual form as an OCI (object content information) descriptor. It is also possible to attach an OCI stream to the object descriptor. The OCI stream conveys a set of OCI events that are qualified by their start time and duration. The elementary streams of an MPEG-4 presentation are schematically illustrated in FIG. 8.
0049. In MPEG-4, information about an AV object is stored and transmitted using the Object Content Information (OCI) descriptor or stream. The AV object contains a reference to the relevant OCI descriptor or stream. As seen in FIG. 4A, such an arrangement requires a specific temporal relationship between the description and the content, and a one-to-one relationship between AV objects and OCI.
0050. However, typically, multimedia (eg. MPEG-7) descriptions are not written for specific MPEG-4 AV objects or scene graphs and, indeed, are written without any specific knowledge of the MPEG-4 AV objects and scene graphs that make up the presentation. The descriptions usually provide a high-level view of the information of the AV content. Hence, the temporal scope of the descriptions might not align with those of the MPEG-4 AV objects and scene graphs. For instance, a video/audio segment described by an MPEG-7 description may not correspond to any MPEG-4 video/audio stream or scene description stream. The segment may describe the last portion of one video stream and the beginning part of the following one.
0051. The present disclosure presents a more flexible and consistent approach in which the multimedia description, or each fragment thereof, is treated as another class of AV object. That is, like other AV objects, each description will have its own temporal scope and object descriptor (OD). The scene graph is extended to support the new (eg. MPEG-7) description node. With such a configuration, it is possible to send a multimedia (eg. MPEG-7) description fragment, that has sub-fragments of different temporal scopes, as a single data stream or as separate streams, regardless of the temporal scopes of the other AV media objects. Such a task is performed by the encoder 34, and an example of such a structure, applied to the MPEG-4 example of FIG. 4A, is shown in FIG. 4B. In FIG. 4B, the OCI stream is also used to contain references to relevant description fragments and other AV object specific information as required.
0052. Treating MPEG-7 descriptions in the same way as other AV objects also means that both can be mapped to a media object element of the presentation description scheme 36 and subjected to the same timing and synchronisation model. Specifically, in the case of a SMIL-based presentation description scheme 36, a new media object element, such as an <mpeg7> tag, may be defined. Alternately, MPEG-7 descriptions can be treated as a specific type of text (eg. represented in italics). Note that a set of common media object elements <video>, <audio>, <animation>, <text>, etc. are pre-defined in SMIL. The description stream can potentially be further separated into a structure stream and a text stream.
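A SMIL-like presentation description of this kind might look as sketched below, where a description fragment appears as a media object alongside <video> and <audio>. The <mpeg7> tag is the hypothetical new element suggested by the text (not part of SMIL), and the file names and timing values are invented for illustration.

```python
# Sketch of a SMIL-like presentation description carrying an MPEG-7
# description fragment as a peer media object via a hypothetical
# <mpeg7> element; parsed here only to show the parallel structure.
import xml.etree.ElementTree as ET

presentation = """
<smil>
  <body>
    <par>
      <video src="aMovie.mpg" begin="0s" dur="300s"/>
      <audio src="aMovie.mpg#audio" begin="0s" dur="300s"/>
      <mpeg7 src="aMovie-desc.xml#scene1" begin="0s" dur="300s"/>
    </par>
  </body>
</smil>
"""

root = ET.fromstring(presentation)
par = root.find("body/par")
media_tags = [child.tag for child in par]
```

Because the description fragment sits inside the same <par> group as the content, it inherits the same begin/dur timing model as the video and audio it describes, which is the point of mapping descriptions onto media object elements.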
0053. In FIG. 4C, a multimedia stream 40 is shown which includes an audio stream 41 and a video stream 42. Also included is a high-level scene description stream 46 comprising (compound or primitive) nodes of media objects and having leaf nodes (which are primitive media objects) that point to object descriptors ODn that make up an object descriptor stream 47. A number of low-level description streams 43, 44 and 45 are also shown, each having components configured to be pointed to, or linked to, the object descriptor stream 47, as do the audio and video streams 41 and 42. With such object-oriented streaming, treating both content and description as media objects, the temporally irregular relationship between description and content may be accommodated through a temporal object description structured into the streams.
0054. The above approach to streaming descriptions with content is appropriate where the description has some temporal relationship with the content. An example of this is a description of a particular scene in a movie that provides for multiple camera angles to be viewed, thus permitting viewer access to multiple video streams of which only one video stream may, practically speaking, be viewed in the real-time running of the movie. This is to be contrasted with arbitrary descriptions which have no definable temporal relationship with the streamed content. An example of such may be a newspaper critic's text review of the movie. Such a review may make text reference, as opposed to temporal and spatial reference, to scenes and characters. Converting an arbitrary description into a presentation is a non-trivial (and often impossible) task. Most descriptions of AV content are not written with presentation in mind. They simply describe the content and its relationship with other objects at various levels of granularity and from different perspectives. Generating a presentation from a description that does not use the presentation description scheme 36 involves arbitrary decisions, best made by a user operating a specific application, as opposed to the systematic generation of the presentation description 35.
0055. FIG. 5 shows another arrangement 50 for streaming descriptions with content that the present inventor has termed "media-centric". AV content 51 and descriptions 52 of the content 51 are provided to a composer 54, which is also input with a presentation template 53 and has knowledge of a presentation description scheme 55. Although the content 51 is shown as a video and its audio track forming the initial AV media object, the initial AV object can actually be a multimedia presentation.
0056. In media-centric streaming, an AV media object provides the AV content 51 and the timeline of the final presentation. This is in contrast to description-centric streaming, where the presentation description provides the timeline of the presentation. Information relevant to the AV content is pulled in from a set of descriptions 52 of the content by the composer 54 and delivered with the content in a final presentation. The final presentation output from the composer 54 is in the form of elementary streams 57 and 58, as with the previous configuration of FIG. 3, or is a presentation description 56 of all the associated content.
0057. The presentation template 53 is used to specify the type of descriptive elements that are required, and those that should be omitted, for the final presentation. The template 53 may also contain instructions as to how the required descriptions should be incorporated into the presentation. An existing language such as XSL Transformations (XSLT) may be used for specifying the templates. The composer 54, which may be implemented as a software application, parses the set of required descriptions that describe the content, and extracts the required elements (and any associated sub-elements) to incorporate the elements into the timeline of the presentation. Required elements are preferably those elements that contain descriptive information about the AV content that is useful for the presentation. In addition, elements (from the same set of descriptions) that are referred to (by IDREFs or URI-references) by the selected elements are also included, and streamed before their corresponding referring elements (their "referrers"). It is possible that a selected element is in turn referenced (either directly or indirectly) by an element that it references. It is also possible that a selected element has a forward reference to another selected element. An appropriate heuristic may be used to determine the order in which such elements are streamed. The presentation template 53 can also be configured to avoid such situations.
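One plausible ordering heuristic of the kind mentioned above can be sketched as follows: walk the reference graph depth-first in post-order so that referenced elements are emitted before their referrers, breaking cycles (mutual references) by emitting whichever element of the cycle is reached first. The element identifiers below are invented for illustration.

```python
# Sketch of a streaming-order heuristic: emit referenced elements
# before the elements that refer to them (by IDREF or URI-reference).

def stream_order(references):
    """references: dict mapping element id -> list of ids it refers to."""
    order, visiting, done = [], set(), set()

    def visit(elem):
        if elem in done or elem in visiting:
            return  # already emitted, or a cycle: break it here
        visiting.add(elem)
        for target in references.get(elem, []):
            visit(target)  # referenced elements go out first
        visiting.discard(elem)
        done.add(elem)
        order.append(elem)

    for elem in references:
        visit(elem)
    return order

# A review refers to a scene, which in turn refers to a shot:
order = stream_order({"review": ["scene1"], "scene1": ["shot1"]})
```

For the forward-reference chain shown, the heuristic streams shot1, then scene1, then the review, so every reference can be resolved against an already delivered element; for mutual references it simply picks a consistent tie-break rather than deadlocking.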
0058. The composer 54 may generate the
