`US008782268B2
`
(12) United States Patent
Pyle et al.

(10) Patent No.: US 8,782,268 B2
(45) Date of Patent: Jul. 15, 2014
`
`(54) DYNAMIC COMPOSITION OF MEDIA
`
(75) Inventors: Harry Pyle, Bellevue, WA (US); Robert
Kilroy Hughes, Seattle, WA (US)
`
`(73) Assignee: Microsoft Corporation, Redmond, WA
`(US)
`
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 218 days.
`
`(21) Appl. No.: 12/938,747
`
(22) Filed: Nov. 3, 2010

(65) Prior Publication Data

US 2012/0023251 A1 Jan. 26, 2012
`
`Related U.S. Application Data
`
`(60) Provisional application No. 61/366,059, filed on Jul.
`20, 2010.
`
(51) Int. Cl.
G06F 15/16 (2006.01)
(52) U.S. Cl.
USPC ............ 709/231; 709/219; 709/246; 386/125
(58) Field of Classification Search
CPC ... H04L 65/4084; H04L 65/608; H04L 67/02;
H04N 21/234327; H04N 21/64322; H04N 21/8456
USPC ........................... 709/231, 219, 246; 386/125
See application file for complete search history.
`
(56) References Cited

U.S. PATENT DOCUMENTS

6,704,738 B1 *     3/2004  de Vries et al. .................... 1/1
7,489,707 B2       2/2009  Pung et al.
7,596,296 B2       9/2009  Hendrickson et al.
7,624,350 B2      11/2009  Garg et al.
2003/0140159 A1    7/2003  Campbell
2005/0025469 A1 *  2/2005  Geer et al. .................... 386/125
(Continued)

OTHER PUBLICATIONS

Zhigang Chen, "Video and Audio: Organization and Retrieval in the
WWW", Jun. 20, 2007; 13 Pages, http://choices.cs.uiuc.edu/Papers/
New/www5/www5.html.*

(Continued)

Primary Examiner - Michael C Lai
(74) Attorney, Agent, or Firm - Ben Tabor; Kate Drakos;
Micky Minhas

(57) ABSTRACT

The subject disclosure relates to dynamic composition
including the ability to create interoperable combinations of
content by the publisher, e.g., determined to be an optimal
combination, and offer such combinations to client devices in
an interoperable way to allow simple selection by devices
without complex programming, web pages, etc. specific to
each device. Compositions are dynamic in that new audio,
video, subtitle, etc. tracks can be added to a given composition
without changing any of the other tracks, e.g., by updating
the composition's extensible markup language (XML),
and new compositions can be created or removed at any time
without changing any audio or video files. Interoperable and
scalable "discovery" is also enabled whereby random devices
can contact a Web server, find and play a composition
matched to the given devices and users, e.g., optimal composition
for a given device and user. Using the content identification
and description format of compositions, devices can
search, sort, browse, display, etc. content that is available,
determine if it is compatible at the device, decode, and determine
digital rights management (DRM) level, and content
level.

16 Claims, 13 Drawing Sheets
`
`Netflix, Inc. and Hulu, LLC - Ex. 1004, Page 0001
`IPR2020-00648 (Netflix, Inc. and Hulu, LLC v. DivX, LLC)
`
`
`
(56) References Cited

U.S. PATENT DOCUMENTS

2005/0097008 A1 *   5/2005  Ehring et al. ................... 705/26
2006/0059481 A1 *   3/2006  Smith et al. ................... 717/173
2007/0005795 A1     1/2007  Gonzalez
2009/0006643 A1     1/2009  Lee et al.
2009/0210549 A1 *   8/2009  Hudson et al. ................ 709/231
2009/0259686 A1    10/2009  Figueroa et al.
2011/0173345 A1 *   7/2011  Knox et al. .................... 709/246
2011/0307581 A1 *  12/2011  Furbeck et al. ............... 709/219
2012/0023251 A1 *   1/2012  Pyle et al. ..................... 709/231

OTHER PUBLICATIONS

"Working with Time-Based Media", Published Date: Apr. 30, 2008;
6 pages; http://ditra.cs.umu.se/jmf2_0-guide-html/jmftbm.html. *
"Working with Time-Based Media", Published Date: Apr. 30, 2008;
6 pages; http://ditra.cs.umu.se/jmf2_0-guide-html/jmftbm.html.
Zhigang Chen, "Video and Audio: Organization and Retrieval in the
WWW", Published Date: Jun. 20, 2007; 13 Pages,
http://choices.cs.uiuc.edu/Papers/New/www5/www5.html.
`
`* cited by examiner
`
`
`
`
FIG. 1 (Sheet 1 of 13): logical hierarchy 100 with composition
layer 102, track set layer 104, and presentation layer 106.
`
`
`
FIG. 2 (Sheet 2 of 13): system 200 with manifest component 202,
manifests 204-1 through 204-N for content 206, composition
component 210, and data store 212.
`
`
`
FIG. 3 (Sheet 3 of 13): attributes 300 of object model 326 used by
composition component 210: codec 302, resolution 304, bitrate 306,
camera angle 308, language 310, audio channel 312, subtitle/text 314,
content type 316, DRM 318, rating 320, version 322, and other 324.
`
`
`
FIG. 4 (Sheet 4 of 13): system 400 with composition component,
manifest component 202, track set component 402, storage
component 424, and data store 212, handling a request for content 206
and representation 208 in segments. Track sets shown: audio, video,
text, and enhanced layer track sets, grouped into selectable track
sets and switchable track sets.
`
`
`
FIG. 5 (Sheet 5 of 13): system 500 with a component 502 (label only
partly legible), communications component 506, manifests 204, and
representations 208.
`
`
`
FIG. 6 (Sheet 6 of 13): system 600 exchanging a request, manifests,
and representations with a content server (system 400 or components
thereof), and including a selection component, communications
component 506, and a presentation component.
`
`
`
FIG. 7A and FIG. 7B (Sheet 7 of 13): FIG. 7A shows system 400 or
components thereof; FIG. 7B shows computing device or appliance 712
containing system 600 or components thereof.
`
`
`
FIG. 8 (Sheet 8 of 13): illustration 800; figure content not
recoverable.
`
`
`
FIG. 9 (Sheet 9 of 13): flow diagram 900. Step 902: maintain a
plurality of manifests for an individual item of content. Step 904:
configure the manifests for describing respective locations of ...
content. Step 906: classify the manifests based upon ... Continues
at connector B.
`
`
`
FIG. 10 (Sheet 10 of 13): flow diagram 1000. Step 1002: classify the
manifests according to an object model defining a set of attributes.
Step 1004: identify ... sets associated with respective
representations of the content. Step 1006: ... type associated
with ... Step 1008: specify a first subset of track sets that are
selected prior to delivery. Step 1010: ... a 2nd ... that are
seamlessly switchable at presentation.
`
`
`
FIG. 11 (Sheet 11 of 13): flow diagram 1100, entered at connector B.
Step 1102: ... or transmit a request ... suitable representation.
Step 1104: ... or receive a response to the ... Step 1106: define a
partial representation as a single track of content. Step 1108:
enable single track addressability for the plurality of
representations. Step 1110: enable independent access to any content
component type associated with the content. Step 1112: employ an
addressing scheme for operating as a function of various parameters.
`
`
`
FIG. 12 (Sheet 12 of 13): networked environment in which computing
devices, objects, a server object, and data store(s) 1230 communicate
over a communications network/bus.
`
`
`
FIG. 13 (Sheet 13 of 13): computing environment 1300 with processing
unit(s) 1320 (e.g., CPU, GPU), system memory 1330, input 1340,
output 1350 (e.g., display), network interface(s) 1360, and remote
computer(s).
`
`
`
`DYNAMIC COMPOSITION OF MEDIA
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`
This application claims priority to U.S. Provisional Application
Ser. No. 61/366,059, filed on Jul. 20, 2010, entitled
"DYNAMIC COMPOSITION OF MEDIA FOR DEVICES", the entirety of
which is incorporated herein by reference.
`
TECHNICAL FIELD

The subject disclosure relates to dynamic composition of
media for streaming to consuming devices.

BACKGROUND

Existing solutions for composing media and streaming to
devices either do not allow for independent combination of
media files at all because they require single files containing
a permanently fixed set of tracks, or require non-standardized
and proprietary methods to combine independent tracks
available on the server, which limit widespread industry
implementation and adoption due to the closed nature of such
systems.
For instance, with respect to digital versatile disks (DVD)
and Blu-ray track formats, the tracks are provided as a single
file containing many tracks, thus limiting flexibility and
usability since in order to decode any portion of the data
included in the DVD or Blu-ray, the entire file typically must
be present. A single monolithic file for content, while acceptable
in terms of delivery by way of physical discs, is not very
efficient in terms of streaming, and thus severely limits content
streaming solutions.
For another example, media presentation description
(MPD) and other adaptive streaming solutions that switch
entire files cannot independently switch tracks, so are, in
practicality, limited to switching only video bitrates, or a few
other attributes. This is so because of a "combinatorial complexity
problem." For instance, a content provider who
desires to make a feature length film available via adaptive
streaming generally must previously encode a different file
for all the combinations that will be utilized for the set of
clients. However, a typical movie may require multiple video
resolutions, camera angles, video bitrates, audio channels,
supported languages, descriptive audio tracks and closed captioning
languages. Every combination represents a separate
muxed version of the movie, leading to the aforementioned
combinatorial complexity problem.
For example, a movie with eight audio tracks to cover
different language and codecs, two caption streams, and two
video angles, would result in 8x2x2x8=256 separate, multiplexed
or "muxed" versions of the movie (e.g., 256 different
representations of the same content), each of which is to be
stored on content servers to allow subsequent streaming.
Moreover, this problem becomes greater when the content is
to be HTTP Live adaptive streaming with six quality levels
broken up into ten-second segments. A two hour movie
becomes 256x720=138,240 files (e.g., 120 minutes and 6
chunks per minute=720 chunks per movie). Further still, the
illustrated example provides only a few options for a client to
choose, in particular, a client can choose between eight languages,
two caption streams, and two video angles. In order to
give the client more options and/or to adapt to a wider set of
client devices, network conditions, and user preferences, the
number of fixed muxed representations of the content grows
exponentially.
The above-described deficiencies of today's techniques are
merely intended to provide an overview of some of the problems
of conventional systems, and are not intended to be
exhaustive. Other problems with conventional systems and
corresponding benefits of the various non-limiting embodiments
described herein may become further apparent upon
review of the following description.

SUMMARY

A simplified summary is provided herein to help enable a
basic or general understanding of various aspects of exemplary,
non-limiting embodiments that follow in the more
detailed description and the accompanying drawings. This
summary is not intended, however, as an extensive or exhaustive
overview. Instead, the sole purpose of this summary is to
present some concepts related to some exemplary non-limiting
embodiments in a simplified form as a prelude to the more
detailed description of the various embodiments that follow.
In one or more embodiments, dynamic composition is
enabled, which relates to the ability to create interoperable
combinations of content by the publisher, e.g., determined to
be an optimal combination, and offer such combinations to
client devices in an interoperable way to allow simple selection
by devices without complex programming, web pages,
etc. specific to each device. Compositions are dynamic in that
new audio, video, subtitle, etc. tracks can be added to a given
composition without changing any of the other tracks, e.g., by
updating the composition's extensible markup language
(XML), and new compositions can be created or removed at
any time without changing any audio or video files.
In various non-limiting embodiments, interoperable and
scalable "discovery" is also enabled whereby random devices
can contact a Web server, find and play a composition
matched to the given devices and users, e.g., optimal composition
for a given device and user. Using the content identification
and description format of compositions, the devices
can search, sort, browse, display, etc. content that is available,
determine if it is compatible at the device, decode and determine
digital rights management (DRM) level, and content
level (such as compatible language, optimum display resolution,
e.g., not 240 line video on a 1080 line screen, stereo for
headphones, multichannel for a surround sound system, etc.).
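The discovery step described above can be sketched as a simple matching routine. This is a minimal illustration rather than the disclosed implementation: the `Composition` and `Device` records, their field names, the `drm-a` label, and the ranking order (language, then audio layout, then resolution) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Composition:
    """Hypothetical summary of one published composition."""
    name: str
    codec: str
    video_lines: int
    audio_channels: str
    language: str
    drm: str


@dataclass(frozen=True)
class Device:
    """Hypothetical description of a client device and its user."""
    codecs: frozenset
    screen_lines: int
    audio_channels: str
    language: str
    drm_systems: frozenset


def is_compatible(comp: Composition, dev: Device) -> bool:
    # Playable only if the device can decode the codec, supports the
    # DRM system, and the video does not exceed the screen resolution.
    return (comp.codec in dev.codecs
            and comp.drm in dev.drm_systems
            and comp.video_lines <= dev.screen_lines)


def pick_composition(comps: Iterable[Composition],
                     dev: Device) -> Optional[Composition]:
    playable = [c for c in comps if is_compatible(c, dev)]
    if not playable:
        return None
    # Assumed ranking: user's language first, then matching audio
    # layout, then the highest resolution the screen can show.
    return max(playable, key=lambda c: (c.language == dev.language,
                                        c.audio_channels == dev.audio_channels,
                                        c.video_lines))


CATALOG = [
    Composition("mobile-en", "avc", 240, "2.0", "en", "drm-a"),
    Composition("hd-en-stereo", "avc", 1080, "2.0", "en", "drm-a"),
    Composition("hd-fr-surround", "avc", 1080, "5.1", "fr", "drm-a"),
    Composition("uhd-en", "avc", 2160, "5.1", "en", "drm-a"),
]

tv = Device(frozenset({"avc"}), 1080, "2.0", "en", frozenset({"drm-a"}))
best = pick_composition(CATALOG, tv)
print(best.name if best else "no compatible composition")
```

With this catalog, a 1080 line stereo device is steered away from the 240 line composition and toward the 1080 line English stereo one, which mirrors the "not 240 line video on a 1080 line screen" example.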
In one or more embodiments, presentations can be initialized
on devices based on track set information that provides
information on the set of tracks available for playback in a
selected composition.
Compositions enable flexible selection between multiple
wire formats for compatibility with more devices (e.g., some
devices are limited to MPEG-2 Transport Streams, MP4 multiplex
files, or protected interoperable file format (PIFF) Fragmented
international organization for standardization (ISO)
Base Media files, and so on).
In this regard, compositions enable selection of new
request protocols in combinations with some wire formats,
such as PIFF, that eliminate the need to frequently request
updates from the server. Time based Requests can use time
information in the video stream that is shared by all alternate
tracks in order to push information utilized for the next uniform
resource locator (URL) requests for an entire Switch
Group or Groups sharing a common timeline. Event information
can also be delivered in other ways, e.g., "sparse tracks"
in Smooth Streaming Transport Protocol (SSTP).
Track sets enable a standardized way for devices to switch
between tracks while streaming based on codec, resolution,
camera angle, language, audio channels (e.g., 2.0, 5.1, 7.1 ch,
etc.), subtitles, and content type (e.g., dialog, translation,
commentary, description for visually impaired, etc.). The track
sets are thus formed based on, e.g., optimized for, streaming
with the selected tracks downloaded, as opposed to being a
single file containing many tracks, such as in the case of DVD
and Blu-ray track formats.
These and other embodiments are described in more detail
below.
`
BRIEF DESCRIPTION OF THE DRAWINGS

The system and methods for representing synchronization
knowledge and/or partial knowledge for multiple nodes sharing
subsets of a set of information are further described with
reference to the accompanying drawings in which:
FIG. 1 illustrates a logical hierarchy of various layers
employed for adaptive streaming;
FIG. 2 illustrates a block diagram of an exemplary non-limiting
system that can facilitate hypertext transfer protocol
(HTTP) delivery of streaming media;
FIG. 3 illustrates various exemplary non-limiting attributes;
FIG. 4 is a block diagram of an exemplary system that illustrates
additional features or aspects in connection with HTTP
delivery of streaming content;
FIG. 5 is a block diagram of an exemplary system that can
facilitate presentation of streaming content delivered by way
of HTTP;
FIG. 6 is a block diagram of an exemplary system that can
provide additional features or aspects in connection with
presentation of HTTP streaming content;
FIG. 7A is a block diagram of an exemplary server embodiment
of the disclosed subject matter;
FIG. 7B is a block diagram of an exemplary device or appliance
embodiment of the disclosed subject matter;
FIG. 8 is a graphical illustration that depicts an exemplary
overview of Compositions and Track Set schema;
FIG. 9 is an exemplary non-limiting flow diagram for
facilitating hypertext transfer protocol (HTTP) delivery of
streaming content;
FIG. 10 is an exemplary non-limiting flow diagram for
employing track sets and/or object models in connection with
facilitating HTTP delivery of streaming content;
FIG. 11 is an exemplary non-limiting flow diagram for
providing additional features or aspects in connection with
facilitating HTTP delivery of streaming content;
FIG. 12 is a block diagram representing an exemplary
non-limiting networked environment in which the various
embodiments may be implemented; and
FIG. 13 is a block diagram representing an exemplary
non-limiting computing system or operating environment in
which the various embodiments may be implemented.

DETAILED DESCRIPTION

Overview

By way of an introduction, the subject matter disclosed
herein relates to various embodiments for hypertext transfer
protocol (HTTP) streaming of content. As such, the disclosed
subject matter can describe a next generation smooth streaming
transport protocol with a client manifest expressed in
terms of the Third Generation Partnership Project (3GPP)
Media Presentation Description (MPD) (TS 26.234), with
various enhancements. These enhancements can, e.g., solve
or mitigate the combinatorial complexity problem associated
with adaptive streaming of content as well as provide for, e.g.,
improved independent track storage and addressing, a common
encryption for Digital Rights Management (DRM)
interoperability, support for real-time/live trans-multiplexing
to alternate wire formats, or improved support for efficient
live streaming, to name but a few.
Mitigating the combinatorial complexity problem for any
adaptive streaming approach will ideally support storage of
tracks in separate files, permitting a client to combine tracks
without the service provider being required to produce a
separate muxed version of the content for each combination.
Rather, certain portions of the content can be reused for
composing the versions requested by the client. Likewise, the
standardized encoding format need not require all tracks to be
in the same file. Instead, such encoding format can support
independent track storage and combination.
Advantageously, subject matter disclosed herein can be
compatible with ISO/IEC 13818-1 and ISO/IEC 14496-12
and can therefore be fully compliant with the capabilities to
work with both Moving Picture Experts Group Transport
Stream (MPEG-2 TS) and fragmented MPEG-4 (MP4) files.
Various embodiments can employ a file system or structure
based on Protected Interoperable File Format (PIFF) code
point of the ISO base media file format (ISOFF), which
provides numerous inherent advantages that can be leveraged,
including, e.g., independent track storage and addressing
as well as DRM interoperability. By leveraging DRM-interoperable
encoding, a variety of improvements over other
MP4 encodings can be realized.
In more detail, it can be presumed that content protection or
DRM will continue to be utilized for some video content. For
a variety of business and technical reasons, different original
equipment manufacturers (OEMs) will show a preference for
different DRM systems. Perhaps more importantly, the
notion of a "DRM Standard" is misleading, since all DRM
systems require a license agreement, with compliance and
robustness rules, and a certificate infrastructure. In other
words, DRM systems are implemented as part of a business
proposition, and as such remain proprietary.
For these and other reasons, DRM interoperability is likely
to be employed for any adaptive streaming system which
intends to target a broad range of consumer electronic
devices. This type of interoperability is most easily accomplished
by establishing a common encryption mechanism,
along with a generic way of signaling multiple DRM support
for the file or stream. Advantageously, these features have
been implemented in PIFF. The common encryption and signaling
mechanism defined in PIFF has been adopted by the
Digital Entertainment Content Ecosystem (DECE), and has
been proposed for standardization in the ISO MPEG file
format working group, but as yet has not been adopted by
conventional streaming systems or providers.
Furthermore, it is to be understood that a difference exists
between a particular storage format and typical wireline formats
used for adaptive streaming. For example, many devices
are restricted to receive content in a specified wireline format.
If the native storage format of the adaptive streaming technology
lends itself to be used with multiple wireline formats,
by permitting real-time/live trans-muxing and trans-encapsulation
into alternative wireline format, then such can result in
a format that is much more practical and/or efficient in a
heterogeneous deployment environment, such as an adaptive
streaming environment.
Today, HTTP Streaming depends on the existence of a
manifest file to inform the client what media elements are
available. In the 3GPP specification, this manifest is called
the Media Presentation Description (MPD), and will be
referred to herein as either an MPD or manifest. For live
streaming to have low latency from real-time events, the MPD
information is constantly updated for the client to formulate
the correct uniform resource locator (URL) to obtain the
latest segment of content.
Thus, the 3GPP live streaming method requires periodic
requests (e.g., every few seconds for low latency) of the entire
MPD file. Moreover, since this MPD file is constantly changing,
it typically must have a short expiration time, which
reduces utilization of HTTP edge caching. In contrast, one of
the benefits of the disclosed subject matter is that Smooth
Streaming can reduce the MPD live update traffic. Such can
be accomplished by employing time-based segment addressing
along with segment start and duration time information in
the segment wire format.
Such duration information can be utilized to update the
MPD segment information for all tracks time-aligned to the
track being streamed without requiring a reread of the MPD
document. With this information in hand, the client can formulate
time-based segment request URLs.
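As a rough illustration of time-based segment addressing, the sketch below derives the next request URL for every time-aligned track from the start and duration carried in the segment just received, without fetching a manifest again. The URL template, the track names, and the tick values are hypothetical and chosen only for the example; the source does not specify a URL syntax at this point.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class SegmentTiming:
    """Start and duration read from the received segment (in ticks)."""
    start: int
    duration: int


def next_segment_start(timing: SegmentTiming) -> int:
    # The next segment begins where the current one ends.
    return timing.start + timing.duration


def segment_url(base_url: str, track: str, start: int) -> str:
    # Hypothetical URL template; a real deployment defines its own.
    return f"{base_url}/{track}/segment(t={start})"


def next_segment_urls(base_url: str,
                      aligned_tracks: Iterable[str],
                      timing: SegmentTiming) -> Dict[str, str]:
    # All tracks sharing the timeline use the same next start time,
    # so one received segment updates the addresses for all of them.
    start = next_segment_start(timing)
    return {t: segment_url(base_url, t, start) for t in aligned_tracks}


current = SegmentTiming(start=20_000_000, duration=20_000_000)
urls = next_segment_urls("http://example.com/movie",
                         ["video_3000k", "audio_en"], current)
for track, url in urls.items():
    print(track, url)
```

Because the next address is computed from data already in hand, the manifest itself can stay unchanged for longer, which is what allows it to be cached.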
At the logical level, Media Description for Adaptive
Streaming is a description of a set of media resources available
for HTTP Adaptive Streaming (HAS), which is composed
of three layers, as illustrated with reference to FIG. 1.
Referring initially to FIG. 1, illustration 100 depicts a
logical hierarchy of various layers employed for adaptive
streaming. At the top is a "composition" layer 102 for describing
arrangements of resources suitable for various endpoint
consumptions. For example, there may be one arrangement
suitable for portable devices and another for personal computers,
or another for a certain "wire format" (e.g., MPEG2-TS
or fragmented MP4), or another for a family friendly
version of the content. Some of these may be chosen automatically
by the terminal while others are driven by the user
interacting with player controls. It is the composition that
combines an appropriate set of audio, video, and text tracks.
Further detail is provided in connection with FIGS. 2 and 3,
infra.
Below the composition layer 102 is the "track set" layer
104 for describing the various media component tracks available
as alternatives both for adapting to network and client
conditions, and for user choice. These sets are separated into
those that are "seamlessly switchable" and those that are
"selectable" but can result in a non-seamless transition. Generally, a
given track set deals with a single media component type. For
more detail in connection with track sets, see FIG. 4.
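The distinction between seamlessly switchable and selectable track sets can be sketched as follows. This is an assumed model for illustration only: the `Track` and `TrackSet` records, the `seamless` flag, and the bandwidth rule (highest bitrate that fits, else the lowest available) are not taken from the source.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Track:
    name: str
    bitrate_bps: int


@dataclass(frozen=True)
class TrackSet:
    """One media component type, e.g. video, with alternative tracks."""
    component_type: str
    seamless: bool              # True: switchable mid-stream
    tracks: Tuple[Track, ...]


def choose_track(track_set: TrackSet, available_bps: int) -> Track:
    # Only seamlessly switchable sets are adapted automatically;
    # selectable sets change on explicit user choice instead.
    if not track_set.seamless:
        raise ValueError("selectable track sets are not auto-switched")
    affordable = [t for t in track_set.tracks
                  if t.bitrate_bps <= available_bps]
    if affordable:
        return max(affordable, key=lambda t: t.bitrate_bps)
    # Nothing fits: fall back to the lowest bitrate to keep playing.
    return min(track_set.tracks, key=lambda t: t.bitrate_bps)


video = TrackSet("video", True, (
    Track("video_500k", 500_000),
    Track("video_1500k", 1_500_000),
    Track("video_3000k", 3_000_000),
))
languages = TrackSet("audio", False, (
    Track("audio_en", 128_000),
    Track("audio_fr", 128_000),
))

print(choose_track(video, 2_000_000).name)
```

A player built this way re-evaluates `choose_track` at segment boundaries for switchable sets, while a change within `languages` would be driven by the user and may interrupt playback briefly.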
Next, below the track set layer 104 is the "presentation"
layer 106 for describing periods of contiguous representations
of media segments available as HTTP resources for
incremental streaming. It is underscored that the segmented
nature of these representations allows the streaming to be
adapted to varying network and client conditions. This layer
has descriptive information for all the tracks, including declaration
of which tracks are switchable. Additional detail with
respect to these features can be found with reference to FIGS.
5 and 6.
With respect to compositions and track sets, dynamic composition
can relate to the ability to create interoperable combinations
of content determined to be optimal by the publisher
and offer them to client devices in an interoperable way
to allow simple selection by devices without complex programming,
web pages, etc. specific to each device. Compositions
are dynamic in the sense that new audio, video, subtitle,
etc. tracks can be added without changing any of the
other tracks just by updating the Composition XML, and new
Compositions can be created or removed at any time without
changing any audio or video files.
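A minimal sketch of this kind of update is shown below, using Python's standard `xml.etree.ElementTree`. The element and attribute names (`Composition`, `Track`, `type`, `lang`, `src`) and the file names are invented for the example; the source does not give the Composition XML schema in this passage.

```python
import xml.etree.ElementTree as ET

# Hypothetical composition document referencing independent track files.
COMPOSITION_XML = """<Composition name="feature-en">
  <Track type="video" src="video_1080p.mp4"/>
  <Track type="audio" lang="en" src="audio_en.mp4"/>
</Composition>"""


def add_track(xml_text: str, **attrs: str) -> str:
    """Return a new composition document with one more Track entry.

    Only the XML changes; the media files named by the existing
    Track entries are neither opened nor rewritten.
    """
    root = ET.fromstring(xml_text)
    ET.SubElement(root, "Track", attrs)
    return ET.tostring(root, encoding="unicode")


updated = add_track(COMPOSITION_XML,
                    type="text", lang="fr", src="subs_fr.xml")
print(updated)
```

Adding French subtitles here touches one small document, whereas a muxed file would have to be rebuilt to carry the extra track.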
In contrast, existing solutions require some non-standardized
method to combine independent tracks available on the
server, or do not allow for independent combination because
they require single files containing a permanently fixed set of
tracks.
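The cost of a permanently fixed set of tracks can be made concrete with the figures from the Background. The sketch below is only arithmetic: it multiplies the option counts to get the number of muxed versions and, as an assumption made here, sums them to estimate how many independently stored tracks would cover the same choices. The fourth factor of 8 is taken from the Background's "8x2x2x8" as written, since the source does not name it. Note that 256 x 720 evaluates to 184,320, whereas the 138,240 quoted in the Background equals 192 x 720.

```python
from math import prod
from typing import Sequence


def muxed_versions(option_counts: Sequence[int]) -> int:
    # Every combination of options needs its own muxed file.
    return prod(option_counts)


def independent_tracks(option_counts: Sequence[int]) -> int:
    # Assumption: with separate storage, each alternative is one track.
    return sum(option_counts)


def chunks_per_movie(minutes: int, chunk_seconds: int) -> int:
    return (minutes * 60) // chunk_seconds


factors = [8, 2, 2, 8]                  # as written in the Background
versions = muxed_versions(factors)
chunks = chunks_per_movie(minutes=120, chunk_seconds=10)
muxed_files = versions * chunks
separate_files = independent_tracks(factors) * chunks

print(versions, chunks, muxed_files, separate_files)
```

The product grows with every added option while the sum grows by one track per alternative, which is the motivation for independent track storage.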
Terminology
In order to simplify explanation of the disclosed subject
matter and to further ensure conceptual understanding, the
following terms are provided exemplary definitions:
Adaptive bitrate streaming: A technique of dynamically
varying the video bit rate to provide continuous playback at
the highest quality that available bandwidth and client rendering
power will support.
AVC: Advanced Video Coding.
Chunk: A contiguous set of samples for one track.
Common Content Component: A content component
shared by different services, e.g. two services use the same
video tracks but different audio tracks, illustrating the shared
video tracks as common content components.
Composition Layer: Top layer describing arrangements of
resources suitable for endpoint consumption, selecting the
appropriate combination of audio, video and text tracks. See,
e.g., composition layer 102. Compare 'track set' layer and
'presentation' layer.
Conformance Point: The specific profile and level of a
content component that uniquely defines the decoding capabilities
employed.
Content: A set of content components, e.g. a movie, a song.
Content Component: Content of a single type or a subset
thereof. For example, a video track, an audio track, movie
subtitles, or an enhancement layer of video.
Delivery: Transport of content components to a defined end
point.
HAS: HTTP Adaptive Streaming.
ISO Base Media File: File format defined in reference
ISOFF (ISO 14496-12).
Media Component: Media components are an encoded
version of one individual media type, such as audio, video or
timed text. Media components are time-continuous across
boundaries of consecutive media segments within one representation.
MPD: Media Presentation Description.
MPEG-2 TS: Moving Picture Experts Group Transport
Stream.
Partial Representation: The disclosed subject matter modifies
the definition of partial representation from that defined
in 3GPP MPD. In 3GPP, a partial representation is permitted
to be more than one track. As used herein, a partial representation
can be one and only one track. See Representation,
Period.
Period: Consistent with the 3GPP MPD definition, each
period consists of one or more representations. See representation,
partial representation.
Presentation: Operation performed by one or more devices
that allows a consumer to experience the content, e.g. view a
movie or listen to a song.
Presentation Layer: Bottom layer describing periods of
contiguous representations of media segments. See, e.g.,
presentation layer 106.
Progressive download: Download where presentation is
started before the delivery is completed.
Random Access: Starting a presentation at an arbitrary
point.
Representation: The disclosed subject matter generally
defines a representation in the same way as the 3GPP MPD
does. A Representation is one of the alternative choices of the
media content, typically the encoding choice. A representation