ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC2/WG8
CODED REPRESENTATION OF PICTURE AND AUDIO INFORMATION

ISO/IEC JTC1/SC2/WG8
MPEG 89/216

October 1989

Source: Allen Simon, Intel (Chairman of MPEG Systems Ad-hoc group)
Title: Kurihama Meeting Report
Status: Draft

Kurihama Meeting Report
`
On October 20 two meetings of the Ad-hoc group on systems were held. The following people attended for at least part of the time:

Allen Simon             Intel           USA
Geoff Morrison          BTRL            UK
Bernard Szabo           DEC             USA
Jan Van de Meer         Philips         NL
Barrie Smith            BTRL            GB
Takuyo Kogure           Matsushita      Japan
Akiyoshi Tanaka         Matsushita      Japan
Hiroshi Watanabe        NTT             Japan
Jun Yonemitsu           Sony            Japan
Kohtaro Asai            Mitsubishi      Japan
Atsushi Nagata          Matsushita      Japan
Takeshi Murakami        Fujitsu         Japan
Didier Le Gall          Bellcore        USA
Leonardo Chiariglione   CSELT           Italy
John Morris             Philips         UK
Ikuo Inoue              Matsushita      Japan
Yogi Noguchi            Sharp           Japan
Yukitoshi Tsuboi        Hitachi         Japan
Saint Gerans            Thomson CE      USA
`
At the meeting we discussed the charter of this group and generated a proposed statement of our area of work and programme of work. We also spent approximately 10 minutes each discussing the issues described in MPEG 89/154, as well as several other issues that were raised during the course of the discussion, and came to whatever conclusion we could agree on within the time constraint for most of them. The results of this effort are summarized below, including some editing and additions by the chairman. The foils used for these discussions are also attached as an Annex, since my summary may differ in detail from the actual words agreed to by the group. Also, this meeting was held at a time when MPEG-audio and MHAG had their own meetings, so it is likely that any conclusions that addressed areas of their expertise will have to be re-examined when the appropriate experts are available.
`
Following this meeting there was a short joint meeting with MHAG. We discussed the need for more joint meetings and plan to schedule one for the next WG8 session. MPEG's focus is on the real-time implications of interleaving data streams for video and one or more audio streams. There is overlap in our concerns, since both groups are concerned with ancillary data such as that describing subtitles or event triggers. Further, MPEG has a need to settle these issues quickly, but this should be done in a way that will be ultimately compatible with the MHAG standard. This has the beneficial effect on MHAG of forcing it to consider these issues in a timely way.
`
There was also a short meeting with MPEG-audio. The purpose of this meeting was largely to communicate conclusions that had been reached by the systems ad-hoc group. As a result of the ensuing discussion, the decision that MPEG would be distribution only was revisited. In particular, there are applications that would benefit from the ability to do audio editing of the compressed audio bit stream, and MPEG should not make arbitrary decisions that rule out such functionality. This discussion is summarized below in the revised issue list and status report.
`
Statement of Work

To define the means of combining coded audio and video and control data on a storage medium or transmission system such that the combination can be used by an application.

Programme of Work

Analyze system capabilities of finalist algorithms and limitations of storage media.
Decide what operations and functionality on the MPEG bitstream must be supported.
Design protocol layer architecture.
Perform detailed design of each layer.
Specify device-dependent representation for significant delivery media.
Liaise with MPEG-audio, MPEG-video and MHAG.
`
List of work items

Definition of terms.
Summary of system capabilities of finalist algorithms.
Summary of limitations and capabilities for all significant storage media.
List and definition of supported operations on the MPEG bitstream.
Reference model for the MPEG bitstream architecture.
Detailed proposal for MPEG bitstream syntax.
Issue list and status report.
`
The following is a draft of the last work item, based on the discussions held. It consists of a numbered paragraph describing each issue and one or more following paragraphs summarizing the consensus of those present. These issues are in the order they were suggested rather than any logical order. It is anticipated that future drafts of the other work items will use this as source material and present their content in a more organized manner.
`
1. Is the MPEG bitstream for contribution or distribution? That is, is it in essentially final form (distribution), or is it anticipated that applications will wish to edit it, in which case more information may need to be available in, or pointed to by, the bitstream (contribution)?

Our scope is limited to the format of the distribution form of the data. There will need to be an associated contribution form used during the process of creating applications, but it is not necessary for us to consider the implications and requirements of this need at the present time. However, there are applications that will require the ability to perform audio editing using the coded bitstream. We need to consider the implications of this need.
`
2. Is there bit allocation between streams? We use "stream" to refer to the data for a particular kind of information, and "channel" to refer to the abstraction that allows us to think of each stream as a logically independent entity, and sometimes inaccurately use these words interchangeably. In other words, is the percentage of bits to be allocated for video and audio out of the available bandwidth something that will be fixed and specified by MPEG?

No. The application will have some freedom to control the allocation; precisely how much freedom needs further discussion. A related issue was discussed later in the meeting.
`
3. Is there interleaving of streams? For example, if A and V represent units of coded data in an audio and a video stream, does the MPEG bitstream consist of information such as, for example, AVAVAV... or perhaps AAVAAV...?

Yes, there must be interleaving. The size of the interleave units must be small enough that device and buffering limitations do not prevent the data for each stream from being effectively simultaneous. There are, however, many details that need to be worked out.
`
4. What is the video interleave unit?

There are many considerations here that need to be worked out. Two natural possibilities are a frame time and a unit of breaks in interframe dependency. It may be necessary to consider this jointly with the next issue. It may also be necessary to consider the implications of transcoding between different display rates (24/25/30). This issue was deferred for future discussion.
`
5. What is the audio interleave unit?

This issue was deferred for future discussion.
`
6. Can an MPEG bitstream be modified? This envisages changes at the multiplex level while preserving the coding for the individual audio and video bitstreams.

We distinguished two modes of change: by copying (creating a modified version someplace else on the disk) and in place. It was felt that changes by copying would present no technical challenge and hence should be supported. In-place changes were also felt to be desirable, but we need more technical work to determine the extent to which it is reasonable to support them.
`
7. Is there a storage-device-independent form of the MPEG bitstream? That is, how do we deal with the fact that some physical storage devices have limitations that prevent the use of some special playback modes, and that it is foolish to waste bits on modes that are unusable for a given application and storage device?

It was felt that it should be easy to change an MPEG bitstream that had been optimized for one storage device into one that is optimized for another. One suggestion was that the detailed syntax contain bits accessible by the application that describe which access modes are supported. Further work is needed to clarify the implications of device independence.
`
8. Is random access supported at the individual coded bitstream level, at the multiplex level, or at both? Suppose you wish to start playing an MPEG bitstream, say, one minute after its nominal start time. How is the information needed to determine precisely where to seek to, and which bit in the input data is the beginning of the data you wish to play, made available? This becomes a technical problem, since it is likely that both the audio and video bitstreams will use some form of entropy coding and hence have variable-length structures, and it is not clear how one resynchronizes, especially if you don't even know whether you are looking at video, audio, or multiplex control data.

There was sentiment that it had to be supported at the multiplex level, but that there might also be some need for it to be supported at the individual bitstream level as well. This was deferred for further study.
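The multiplex-level entry-point idea above can be sketched in a few lines. This is purely an illustration of the mechanism, not anything the group specified; in particular the three-byte sync pattern is an invented placeholder.

```python
# Illustrative sketch only: the report fixes no sync pattern; the
# three-byte 0x000001 prefix used here is an assumed placeholder.
SYNC = b"\x00\x00\x01"

def find_resync_points(data: bytes):
    """Return the byte offsets of every occurrence of the sync pattern."""
    points, pos = [], data.find(SYNC)
    while pos != -1:
        points.append(pos)
        pos = data.find(SYNC, pos + 1)
    return points

def seek_to(data: bytes, target_offset: int) -> int:
    """Resume decoding at the first sync point at or after target_offset,
    which is how a player could start one minute into the stream without
    parsing the variable-length coded data before that point."""
    for p in find_resync_points(data):
        if p >= target_offset:
            return p
    raise ValueError("no sync point after target offset")
```

The open question in the text remains visible in the sketch: a sync point only helps if the pattern cannot occur inside the entropy-coded audio, video, or control data.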
`
9. Where are error correction and detection handled?

At the previous meeting it was agreed that input data error correction was outside the scope of MPEG and would be handled as part of the storage device interface. However, there could be uncorrectable errors that report that an entire block of data is in error.
`
10. Must we make provision for recovery from unrecoverable input errors? In other words, if an entire block of data is lost and this fact is reported to the application, should we support the ability of the application to resume playing the MPEG bitstream, albeit with a visible and audible gap where data was lost?

Yes, we should design the MPEG bitstream syntax so that there is a reasonable recovery strategy available to the application.
`
11. How does one resynchronize, for example after skipping past a bad block of input data? The classical technique is to reserve a unique bit configuration within the entropy coding structure and, if you get lost, to search until you find this resynchronizing code. But this is complicated by the existence of both coded audio and coded video in the same MPEG bitstream.

Two architectural approaches were discussed for representing interleaved streams: interleave-unit and fixed-slot. The interleave-unit approach allows for variable amounts of physical audio and video data in each interleave unit, where the actual amounts are determined by some logical notion such as the number of video bits and audio bits needed to represent 1/30 of a second. In the fixed-slot architecture, the division between the interleave units for audio and video is fixed and constant for the entire MPEG bitstream and bears no relationship to the coded bitstream; for example, an entropy-coded value may be split across two interleave units. The former approach seems a natural one for modeling the notion of simultaneous events in different bitstreams. However, it requires some coordination between the entropy codings used so that the resynchronization code is unique. The latter approach has the advantage that one can tell a priori whether a given bit is audio, video, or control data, and hence the codings of the bitstreams can be independent of each other. This issue requires further study.
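To make the two architectures concrete, here is a hedged sketch of each. Everything in it is invented for illustration: the one-byte type tag, the two-byte length field, and the four-byte slot size are assumptions, not anything the group agreed to.

```python
import struct

# Hypothetical type tags for the two kinds of channel data.
AUDIO, VIDEO = 0x01, 0x02

def mux_interleave_unit(units):
    """Interleave-unit approach: each unit carries a (type, length) header,
    so unit boundaries follow the coded data (e.g. one frame time's worth)."""
    out = bytearray()
    for kind, payload in units:
        out += struct.pack(">BH", kind, len(payload)) + payload
    return bytes(out)

def demux_interleave_unit(stream):
    """Walk the headers to recover the (type, payload) units."""
    units, pos = [], 0
    while pos < len(stream):
        kind, length = struct.unpack_from(">BH", stream, pos)
        pos += 3
        units.append((kind, stream[pos:pos + length]))
        pos += length
    return units

def mux_fixed_slot(audio, video, slot=4):
    """Fixed-slot approach: constant-size slots alternate audio/video for
    the whole stream, so position alone tells you what a bit is, but a
    coded value may be split across two slots."""
    out = bytearray()
    a = v = 0
    while a < len(audio) or v < len(video):
        out += audio[a:a + slot].ljust(slot, b"\0"); a += slot
        out += video[v:v + slot].ljust(slot, b"\0"); v += slot
    return bytes(out)
```

The sketch shows the tradeoff described above: the first form needs headers (or unique resynchronization codes) to be parseable, while the second is parseable by position alone at the cost of padding and split coded values.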
`
12. It is possible that an audio decoder that has the speed to generate 128 kb/s audio could also use that speed to generate two 64 kb/s sound streams and mix them together. This functionality is useful for some applications.

This is not really a systems issue. It was decided to refer it to the MPEG-audio group as a desirable optional feature. If the final choice of audio algorithm lends itself to this kind of functionality, then it was felt it should be supported.
`
13. How often should there be audio editing points in the audio bitstream? Does the size of the smallest editable audio unit have to be related to the video interleave unit?

This issue was discussed during a joint meeting with the audio group, at which it was felt that audio editing could be an important functionality. There are a number of technical problems due to the high probability that the access unit for audio would not be simply related to the access unit for video, with further complications introduced by the NTSC/PAL need for two sizes of video access units. A suggested solution was the use of audio access units smaller than 33 ms that could be dithered to form an average audio unit that would allow a frame's worth of audio to be associated with each coded image. In order to provide editing at psycho-acoustically appropriate instants, finer resolution than 33 ms is needed. One way this might be handled without requiring a smaller audio access unit is to provide a header with each audio access unit, or each frame of multiplexed data, that identifies which smaller signal processing unit it starts and stops with. This requires some extra storage for parts of the audio data that are unheard, but since editing join points are rare, this can hopefully be neglected. Another issue raised was that psycho-acoustically desirable cut-points may occur anyplace within the frame, and the associated random walk in the number of audio bits could grow significantly.
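The dithering suggestion can be illustrated with a small accumulator that attaches a varying number of short audio units to each video frame so that the cumulative audio duration tracks the cumulative video duration. The 8 ms audio access unit below is an invented figure purely for illustration; only the roughly 33.37 ms NTSC frame period comes from the discussion.

```python
# Hedged sketch of the "dithered" audio/video pairing suggested in the
# joint meeting. AUDIO_UNIT_MS is an assumption, not a figure from the report.
AUDIO_UNIT_MS = 8.0
NTSC_FRAME_MS = 1001.0 / 30.0   # about 33.37 ms per coded image

def units_per_frame(n_frames, frame_ms=NTSC_FRAME_MS, unit_ms=AUDIO_UNIT_MS):
    """For each video frame, emit enough audio access units that the
    cumulative audio duration keeps pace with the cumulative video time."""
    counts, audio_ms = [], 0.0
    for f in range(1, n_frames + 1):
        n = 0
        # Add units while they still fit within the video time elapsed so far.
        while audio_ms + unit_ms <= f * frame_ms:
            audio_ms += unit_ms
            n += 1
        counts.append(n)
    return counts
```

Most frames get the same count, with an occasional extra unit absorbing the mismatch, which is the "average audio unit" effect described above.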
`
14. What is the relationship between the charters of MPEG-systems and MHAG?

MPEG needs to have a specific syntax defined early for inclusion in its draft proposal. In theory, MPEG should use MHAG syntax to describe the combined audio/video objects it deals with, but this syntax will not be available in time. As a practical matter, it is proposed that MPEG define a syntax for its needs and liaise with MHAG to assure that this syntax has a high probability of being adopted by MHAG as a special case of the more general syntax it will eventually formulate. This will require joint meetings between the ad-hoc systems group and MHAG.
`
15. What are the implications of compatibility with computer graphics? This issue originally captured the system cost advantages of using a hardware approach in which computer graphics overlayed on the video, for example subtitles, could be efficiently implemented.

This has no apparent effect on the MPEG-systems statement of work and will not be discussed further.
`
16. Do we need square pixels?

This is a computer graphics issue. Non-square pixels complicate things such as text font descriptions. However, the aspect ratio of pixels is really determined by the actual display device used and is outside the scope of MPEG control.
`
17. Will the MPEG bitstream support multiple interleaved video streams?

This was deemed desirable if not difficult. It will be re-evaluated when more details of the syntax have been discussed.
`
18. Must the bit allocation between audio and video be constant for the entire MPEG bitstream, or can it change part way through playback?

The ability to change it was felt to be desirable, since image quality can be significantly improved for difficult sequences by allowing the application to, for example, sacrifice some audio quality during non-music sections of the audio track and use the freed-up bits to improve video quality. This was not universally agreed to, and at least one open issue remained, namely: should there be a minimum rate at which the allocation can be changed? There are significant implications for the coding process if this degree of freedom is made available to the encoder, possibly requiring many iterations of the encoding process.
`
19. It is possible that the eventual MPEG standard will have a base level of hardware and optional features. How is this possibility reflected in the MPEG bitstream definition?

It is assumed that the operating system will provide some mechanism for the application to discover the level of hardware support it is currently running on. The MPEG syntax should contain data that allows an application to discover what hardware is needed to play the MPEG bitstream without needing to examine the entire bitstream. The details of this data are to be determined later.
`
20. What kinds of interleaving constraints exist?

Some issues remain to be resolved. In what sense is interleaving periodic? How is interleaving done to support modes such as fast forward, in which some of the desired data may be unavailable because it was skipped over? Does the different pipeline delay for audio and video decoding imply that the audio and video data should be allowed to be offset relative to each other? There are probably other issues that will surface when we examine the layered protocol architecture in more detail.
`
21. The audio test for fast forward assumes that all the audio blocks except for every fourth one are discarded. For devices such as CD-ROM this is not feasible, since skipping over unused data takes as long as reading and processing it.

This problem should be communicated to the audio group. It is possible it can be handled at the multiplex level by a suitable interleaving strategy, but this requires further study.
`
22. The audio coding group has defined fast forward as 4X normal speed, and there is some sentiment that any faster speed cannot produce intelligible speech. The video coding group has defined fast forward as 8X-10X and is reluctant to allow slower fast forward, since including more breaks in interframe dependencies tends to leave fewer bits available for generating good-quality images. How do we reconcile these incompatible definitions?

One suggestion was to define a 10X fast forward speed that was silent and a medium speed that had fast audio with it. Another was to leave the decision of what speed fast forward would be up to the application developer, including perhaps the option of not supporting it at all for bitstreams where the developer is willing to make the tradeoff of this feature against the possibility of better image quality. Removing all limitations on whether and how fast fast forward would be was objected to, because some users would have the expectation that all MPEG video could be browsed through using fast forward. This issue could not be resolved at the MPEG-systems group level, and so it was recommended that it be referred to the MPEG plenary.
`
23. We may decide to allow some of the special playback modes to be optional. That is, we may leave it up to the discretion of the application developer whether or not he wishes to add the bits needed to support them. If we do this, how does this affect the syntax of the MPEG bitstream?

Analogous to the device dependencies discussed in 19, there can be fields within the syntax that indicate such things as whether or not a feature such as normal reverse playback is supported by this MPEG bitstream. The details of this syntax are to be determined. This is information which is known to the encoder and should be preserved for the convenience of the application.
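One possible shape for such capability fields is a small bit-flag header that the application can test without scanning the stream. This is purely an illustration; the flag names and bit layout are invented, since the report explicitly leaves the details to be determined.

```python
# Hypothetical capability flags; none of these names or bit positions
# come from the report, which defers the actual syntax.
FAST_FORWARD  = 1 << 0
REVERSE_PLAY  = 1 << 1
RANDOM_ACCESS = 1 << 2

def encode_capabilities(*flags):
    """Encoder side: OR together the modes this bitstream supports."""
    bits = 0
    for f in flags:
        bits |= f
    return bits

def supports(header_bits, flag):
    """Application side: one header check, no scan of the whole bitstream."""
    return bool(header_bits & flag)
```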
`
24. There was discussion of the notion of master-copy/limited-copy. This is the notion that there may be a contribution-level representation of an MPEG object from which it is possible to generate a number of limited copies that have less information. The information is not lost, since one can always go back to the master copy. For example, consider an application which needs to work in five languages but will only use one at a time. Rather than interleave all five audio tracks and correspondingly reduce the amount of bandwidth for video, the developer may adopt a different strategy. On the CD-ROM he includes, for example, an MPEG bitstream with 1.1 Mb/s of video and five interleaved audio streams, each at .1 Mb/s. This MPEG bitstream is not playable on a CD-ROM, since the CD-ROM cannot be commanded to deliver data at 1.8 Mb/s. Nevertheless, the application can use this master copy to create, for example on a winchester disk, a limited object that only has one audio stream and can be played.

It was decided that this was largely an MHAG issue and it should be referred to them. That is, both the master and limited copies are MHAG objects, and MHAG will define the relationships between them.
`
`
`
25. The audio hardware should be able to transcode G.722 audio. This is the audio standard that any CCITT SG-XV compatible decoder must accept.

If the finalist video coding scheme can transcode the CCITT SG-XV video algorithm, then it becomes a REQUIREMENT that the audio decoder also be able to transcode G.722. The ability to transcode px64 without audio is virtually useless. This should be communicated to MPEG-audio.
`
26. Should ancillary data be provided at the multiplex level, at the coded bitstream level, or at both?

If the interleave-unit layering architecture is used, it is very natural and easy to put ancillary data at the multiplex level: one simply defines another type of channel that contains ancillary data. Hence the recommendation was to interleave ancillary data only at the multiplex level unless further study indicates a good reason to do otherwise. Two possible concerns were raised that might suggest putting this data inside the coded bitstream. One is that the ancillary data might need to be very closely synchronized to the coded bitstream data, and a timing resolution determined by the interleave unit may not be suitable. The other was the example of a video stream with five audio streams, each of which has an associated subtitling ancillary data stream. There needs to be some mechanism for associating each subtitling stream with its corresponding audio bitstream. Doing it at the multiplex level suggests a two-level grouping concept, which would be an extra complication. On the other hand, doing it at the coded bitstream level promises to be very inflexible in dealing with different needs for different amounts of ancillary data. Again, this needs further study, but for the time being we believe that ancillary data support at the multiplex level will be sufficient.
`
27. Is JPEG/CCITT transcoding a real-time requirement?

It was not clear whether the transcoding requirement could or must be met by a non-realtime process. The output of non-realtime transcoding would be an MPEG bitstream. However, if transcoding was expected to be done in real time, then there is a need to consider the possibility that a video bitstream embedded in an MPEG multiplex might be coded for CCITT, not MPEG.

Maybe. It is not clear whether this has any implications for the format of the MPEG multiplex. In any event, it should be referred to MPEG at large for a decision on the intended meaning of transcoding.
`
28. Are time stamps needed at the channel and multiplex layers?

This is a work item. Is video/audio timing in the bitstream as well as the multiplex? There is some degree of timestamping that is a direct consequence of the interleave-unit architecture, if we adopt it.
`
29. Why and when is redundancy needed?

This was not discussed.

30. Are data streams self-synchronizing?

This was not discussed.
`
`
`
`
[Annex: the foils used during these discussions. The handwritten foils did not survive scanning legibly and are omitted here.]